Getting KeyError #4095

Closed
AbhayGodbole opened this issue Aug 7, 2019 · 10 comments
Labels: feat / ner (Feature: Named Entity Recognizer), models (Issues related to the statistical models)

Comments

@AbhayGodbole

AbhayGodbole commented Aug 7, 2019

I am getting an error while executing "spacy_ner_custom_entities.py":
Loaded model 'en'

KeyError Traceback (most recent call last)
in
----> 1 main(model="en",output_dir="../Data/Out")

C:\Abhay\AI\Spacy-NER\Notebooks\spacy_ner_custom_entities.py in main(model, new_model_name, output_dir, n_iter)
68 texts, annotations = zip(*batch)
69 nlp.update(texts, annotations, sgd=optimizer, drop=0.35,
---> 70 losses=losses)
71 print('Losses', losses)
72

c:\abhay\ai\spacy-ner\vir-spacy\lib\site-packages\spacy\language.py in update(self, docs, golds, drop, sgd, losses, component_cfg)
450 kwargs = component_cfg.get(name, {})
451 kwargs.setdefault("drop", drop)
--> 452 proc.update(docs, golds, sgd=get_grads, losses=losses, **kwargs)
453 for key, (W, dW) in grads.items():
454 sgd(W, dW, key=key)

nn_parser.pyx in spacy.syntax.nn_parser.Parser.update()

nn_parser.pyx in spacy.syntax.nn_parser.Parser._init_gold_batch()

ner.pyx in spacy.syntax.ner.BiluoPushDown.preprocess_gold()

ner.pyx in spacy.syntax.ner.BiluoPushDown.lookup_transition()

KeyError: "[E022] Could not find a transition with the name 'U-Tag' in the NER model."
I am using the following custom labels:
LABEL = ['B-NewNom', 'I-NewNom', 'B-OldNom', 'I-OldNom']
NewNom = New Nominee
OldNom = Old Nominee

I am using the latest spaCy:
===== Info about spaCy =========
spaCy version 2.1.0
Platform Windows-10-10.0.16299-SP0
Python version 3.7.1
Models en

Environment

  • Operating System: Windows-10-10.0.16299-SP0
  • Python Version Used: 3.7.1
  • spaCy Version Used: 2.1.0
  • Environment Information:
@svlandeg added the feat / ner and models labels Aug 7, 2019
@AbhayGodbole
Author

Hi,
Please let me know what the issue is. I am stuck.

@pratapaprasanna

pratapaprasanna commented Aug 8, 2019

Hi @AbhayGodbole, there might be an issue with your data, so please check it.
One good way is to drop the batching and pass in a single text with its annotations.

That is:

text ===> the entire text
annotations ===> only the annotations for that text

    try:
        # nlp.update expects lists, so wrap the single example
        nlp.update([text], [annotations], sgd=optimizer, drop=0.35, losses=losses)
    except Exception:
        # Drop into the debugger on the record that fails
        import pdb; pdb.set_trace()

to find the exact record that causes the problem.
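
The same idea can be automated by feeding the examples one at a time and collecting the ones that raise. This is only a sketch, assuming the training data is a list of `(text, annotations)` pairs. `find_bad_records` and `update_one` are hypothetical names, not part of spaCy; `update_one` stands in for a single-example call such as `nlp.update([text], [annotations], sgd=optimizer, drop=0.35, losses=losses)`.

```python
def find_bad_records(train_data, update_one):
    """Return (index, text, error) for every example that makes update_one raise.

    train_data: list of (text, annotations) pairs.
    update_one: hypothetical callable taking (text, annotations); in a real
        script it would wrap a single-example nlp.update(...) call.
    """
    bad = []
    for index, (text, annotations) in enumerate(train_data):
        try:
            update_one(text, annotations)
        except Exception as err:  # broad on purpose: we only want to locate the record
            bad.append((index, text, err))
    return bad


if __name__ == "__main__":
    # Stand-in for nlp.update that rejects any label it has not been told about.
    known = {"NewNom", "OldNom"}

    def fake_update(text, annotations):
        for _start, _end, label in annotations["entities"]:
            if label not in known:
                raise KeyError(label)

    data = [
        ("Sudev Bhandary is the nominee", {"entities": [(0, 14, "NewNom")]}),
        ("Tag", {"entities": [(0, 3, "Tag")]}),
    ]
    for index, text, err in find_bad_records(data, fake_update):
        print(index, repr(text), repr(err))
```

Running the real update on every example this way also changes the model weights, so it is probably best done on a throwaway copy of the pipeline.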

Also, do you have something like


    for _, annotations in train_data:
        for ent in annotations.get('entities'):
            ner.add_label(ent[2])

where you add the entity labels to your NER pipeline?
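
A quick pre-flight check can confirm that every label in the training data has been registered before training starts. This is a minimal sketch, assuming the data is a list of `(text, {"entities": [(start, end, label), ...]})` pairs. `labels_in_data` and `missing_labels` are hypothetical helpers, not spaCy functions; in a spaCy 2.x script the registered labels would typically come from `ner.labels`.

```python
def labels_in_data(train_data):
    """Collect the set of entity labels used anywhere in the training data."""
    return {
        label
        for _text, annotations in train_data
        for _start, _end, label in annotations.get("entities", [])
    }


def missing_labels(train_data, registered):
    """Return labels that appear in the data but not in `registered`, sorted."""
    return sorted(labels_in_data(train_data) - set(registered))


if __name__ == "__main__":
    train_data = [
        ("Sudev Bhandary is the nominee", {"entities": [(0, 14, "B-NewNom")]}),
        ("Tag", {"entities": [(0, 3, "Tag")]}),
    ]
    # In a real script this would be: registered = ner.labels
    registered = ["B-NewNom", "I-NewNom", "B-OldNom", "I-OldNom"]
    print(missing_labels(train_data, registered))
```

Any label reported here is one that spaCy would look up as a transition (for example `U-<label>` for a single-token entity) and fail to find. A stray value such as a column header read as a label would show up in this list.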

We can help better if you paste the entire code here.

@AbhayGodbole
Author

Thanks Gautham,
I have tried with the ner_dataset downloaded from Kaggle and I get the same error. Please find the code attached:
spacy_ner_custom_entities.zip

Regards

@pratapaprasanna

pratapaprasanna commented Aug 9, 2019

Hi @AbhayGodbole

I don't think

{"content": "Can you please make the address change in your system and confirm ", "annotation": []}
{"content": "My new address is Flat A Sunrise Apartments No ", "annotation": []}
{"content": "10 Tank Bund Road Nungambakkam Chennai - 600034 Thanks Britto Subject : RE : Address change Hi I am moving to my new residence which I have hired recently ", "annotation": []}
{"content": "Can you please make the address change in your system and confirm ", "annotation": []}
{"content": "My new address is Flat A Prince Residency No ", "annotation": []}
{"content": "10 Tank Bund Road Kodambakkam Delhi - 500034 Thanks Britto Subject : RE : Address change Hi I am moving to my new residence which I have hired recently ", "annotation": []}
{"content": "Can you please make the address change in your system and confirm ", "annotation": []}
{"content": "My new address is Flat A Sunrise Apartments No ", "annotation": []}
{"content": "10 Tank Bund Road Nungambakkam Chennai - 600034 Thanks Britto '' Subject : RE : Address change Hi I am moving to my new residence which I have hired recently ", "annotation": []}

is the format you have to send in to spaCy (as far as I know).

Your data should be similar to the one in the Training Data section of this page: https://timkuhn.github.io/TextMining/spacy/ner/2018/01/24/spaCy_NER_Training.html

Or send the data in BILUO format.
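
If the source data is token-level BIO tags (such as `B-NewNom` / `I-NewNom`), it can be converted to the character-offset format instead. This is a minimal sketch, assuming tokens are joined by single spaces and the tags follow the plain BIO scheme. `bio_to_spacy_example` is a hypothetical helper, not a spaCy function.

```python
def bio_to_spacy_example(tokens, tags):
    """Convert parallel token / BIO-tag lists into (text, {"entities": [...]}).

    Assumes tokens are joined with single spaces, so the offsets only match
    a text built the same way.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    entities = []
    current = None  # [start, end, label] of the entity being built
    position = 0    # character offset of the current token

    for token, tag in zip(tokens, tags):
        start, end = position, position + len(token)
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and (current is None or current[2] != label)):
            # Start a new entity (a stray I- tag is treated like B-).
            if current is not None:
                entities.append(tuple(current))
            current = [start, end, label]
        elif prefix == "I":
            current[1] = end  # extend the running entity
        else:
            # "O" or anything else closes the running entity.
            if current is not None:
                entities.append(tuple(current))
            current = None
        position = end + 1  # skip the joining space

    if current is not None:
        entities.append(tuple(current))
    return " ".join(tokens), {"entities": entities}


if __name__ == "__main__":
    tokens = ["His", "name", "is", "Sudev", "Bhandary"]
    tags = ["O", "O", "O", "B-NewNom", "I-NewNom"]
    print(bio_to_spacy_example(tokens, tags))
```

Note that the resulting label is `NewNom`, without the `B-` / `I-` prefix, so `NewNom` is the label that would be passed to `ner.add_label`. spaCy's NER adds its own BILUO prefixes internally.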

Thanks

@AbhayGodbole
Author

Hi Gautham,
When I used the add_label code you suggested, the error was fixed, but I am not getting any entities. The following is the output:
Loaded model 'en'
Tag
B-NEWNOM
I-NEWNOM
B-NEWNOM
I-NEWNOM
B-NEWNOM
I-NEWNOM
B-OLDNOM
I-OLDNOM
B-NEWNOM
I-NEWNOM
B-NEWNOM
I-NEWNOM
B-NEWNOM
I-NEWNOM
B-OLDNOM
B-NEWNOM
B-OLDNOM
I-OLDNOM
B-NEWNOM
I-NEWNOM
B-OLDNOM
B-NEWNOM
B-OLDNOM
B-NEWNOM
B-OLDNOM
B-NEWNOM
B-NEWNOM
I-NEWNOM
B-NEWNOM
I-NEWNOM
B-OLDNOM
I-OLDNOM
B-NEWNOM
I-NEWNOM
B-NEWNOM
B-OLDNOM
I-OLDNOM
B-NEWNOM
B-NEWNOM
I-NEWNOM
B-OLDNOM
B-NEWNOM
Losses {'ner': 9262.345964670181}
Losses {'ner': 7998.943309083581}
Losses {'ner': 7308.93450319767}
Losses {'ner': 6865.295005142689}
Losses {'ner': 6465.674752284307}
Losses {'ner': 6089.345280632377}
Losses {'ner': 6018.423113524914}
Losses {'ner': 6126.822117805481}
Losses {'ner': 5845.172473907471}
Losses {'ner': 5784.758152004273}
Entities in 'Gianni Infantino is the president of FIFA.'
Saved model to ..\Data\Out
Loading from ..\Data\Out

Regarding the data: the .json I shared is in the same format that you mentioned. I converted this JSON to the format spaCy requires with the attached script.
json_to_spacy.zip

@AbhayGodbole
Author

AbhayGodbole commented Aug 9, 2019

Hi Gautham,
When I execute the function without passing a model, i.e. with "Created blank 'en' model", I get this output:
#test_text = "Apart from my permanent address in India I want you to change the Nomination in One of the Policy ( 76489323 ) I want make make my son as Nomini His name is Sudev Bhandary and his age is 12 year Please update and let me know With Regards Hari "
Loading from ..\Data\Out
<spacy.lang.en.English object at 0x000001FEBE8A4F98>
(Sudev, Bhandary)
B-NEWNOM Sudev
I-NEWNOM Bhandary

But the issue is that it is forgetting previously trained entities like "India".
Secondly, when I provide a different sentence (not from the training data) containing these entities, it is not able to extract them. For example:
test_text = "subject request for changes hi policy number 45677 need an nominee change please update the nominee as Sudev Bhandary with regards Hari"

@BreakBB
Contributor

BreakBB commented Aug 13, 2019

Forgetting entities is normal if you train a model heavily on new entities without including examples of the "already known" entities (e.g. India). Moreover, you have to include enough examples of your new entities, and representative ones, with the entities at different positions inside the sentences.
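
One way to follow this advice is to mix the new examples with "revision" examples whose annotations come from the original model's own predictions. This is a rough sketch under stated assumptions, not a recipe from this thread. `build_mixed_training_set` and `predict_entities` are hypothetical names; in a spaCy script, `predict_entities` could wrap the pretrained pipeline and return `[(ent.start_char, ent.end_char, ent.label_) for ent in nlp(text).ents]`.

```python
import random


def build_mixed_training_set(new_examples, revision_texts, predict_entities, seed=0):
    """Combine new-label examples with revision examples for already-known labels.

    new_examples: list of (text, {"entities": [...]}) pairs for the new labels.
    revision_texts: raw texts containing entities the original model already finds.
    predict_entities: hypothetical callable returning [(start, end, label), ...]
        for a text, e.g. built from the original model's predictions.
    """
    revision_examples = [
        (text, {"entities": list(predict_entities(text))}) for text in revision_texts
    ]
    mixed = list(new_examples) + revision_examples
    random.Random(seed).shuffle(mixed)  # interleave so batches contain both kinds
    return mixed


if __name__ == "__main__":
    new_examples = [
        ("Sudev Bhandary is the nominee", {"entities": [(0, 14, "NewNom")]}),
    ]

    # Stand-in for the original model: pretend it tags "India" as GPE.
    def fake_predict(text):
        start = text.find("India")
        return [(start, start + 5, "GPE")] if start >= 0 else []

    for example in build_mixed_training_set(
        new_examples, ["I live in India"], fake_predict
    ):
        print(example)
```

How many revision examples to include relative to new ones is a tuning choice that this sketch leaves open.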

@AbhayGodbole
Copy link
Author

Thanks BB. I got it.

@ines closed this as completed Sep 12, 2019
@zakarianamikaz

@AbhayGodbole can you tell us how you solved the issue?

@lock

lock bot commented Oct 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Oct 24, 2019