NER fine tuning keeps failing (sometimes weird results, sometimes malloc errors) #910
Comments
I haven't run this code yet, but I think I see the problem. I doubt that the labels are being added to the The workaround for now would be to readd the labels after loading. The better fix will be to add them to the cfg during
Might that also be the cause of the malloc error, or is that likely to be unrelated?
It would make sense. The model would produce a class that's out-of-bounds for the transition system. I doubt I'm checking for that, so it would cause a memory error.
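To make the failure mode above concrete, here is a toy sketch in plain Python. It is not spaCy code: `ToyTransitionSystem`, `action_for` and `check_demo` are hypothetical names invented for illustration. It assumes the model predicts a class index for a label that the transition system no longer knows about, and shows the kind of explicit bounds check that would turn a memory error into a readable exception.

```python
class ToyTransitionSystem:
    """Hypothetical stand-in for a transition system; not spaCy's API."""

    def __init__(self, labels):
        self.labels = list(labels)

    def action_for(self, class_id):
        # Explicit bounds check. In C or Cython an unchecked index here
        # would read past the end of the array instead of raising.
        if not 0 <= class_id < len(self.labels):
            raise IndexError(
                "class %d out of range for %d known labels"
                % (class_id, len(self.labels))
            )
        return self.labels[class_id]


def check_demo():
    # Assumption: the model was trained with 3 labels, but only 2 were
    # restored after loading, so class 2 has no matching action.
    system = ToyTransitionSystem(["PERSON", "ORG"])
    try:
        system.action_for(2)
    except IndexError as err:
        return str(err)
    return None
```

Calling `check_demo()` returns the error message rather than crashing, which is the behaviour one would want from the real library in this situation.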
Two further issues here, one with your code, one with the library
Thanks for letting me know. You should probably change your example code then (or remove it if it is not up to date): https://github.com/explosion/spaCy/blob/master/examples/training/train_ner.py#L26
Closing this because the specific bug has been fixed. Still need to fix the docs and the save/load process, but that's covered in other issues.
I'm getting exactly the same behaviour, even after upgrading to 1.7.3.
If you have added new labels during fine tuning, you need to add them again after loading the model from disk (if you are saving the model between training and use).
The rationale is to save nlp.entity to disk, updating it like the code below
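The re-adding step described in this comment can be illustrated with a toy sketch in plain Python. This is not the commenter's code and it does not use spaCy's API: `ToyTagger`, its `add_label`/`save`/`load` methods and `round_trip` are hypothetical names. It assumes a model whose weights are written to disk but whose label list is not, which is the situation discussed in this thread.

```python
import json
import os
import tempfile


class ToyTagger:
    """Hypothetical model whose weights are saved but whose labels are not."""

    def __init__(self):
        self.labels = []
        self.weights = {}

    def add_label(self, label):
        if label not in self.labels:
            self.labels.append(label)
            # Keep any weight already loaded for this label.
            self.weights.setdefault(label, 0.0)

    def save(self, path):
        # Only the weights are written; the label list is deliberately
        # dropped to mimic the behaviour described in this thread.
        with open(path, "w") as f:
            json.dump(self.weights, f)

    def load(self, path):
        with open(path) as f:
            self.weights = json.load(f)


def round_trip(new_labels):
    trained = ToyTagger()
    for label in new_labels:
        trained.add_label(label)
    path = os.path.join(tempfile.mkdtemp(), "model.json")
    trained.save(path)

    loaded = ToyTagger()
    loaded.load(path)
    labels_before = list(loaded.labels)  # empty: labels were not saved
    for label in new_labels:             # the workaround: re-add after loading
        loaded.add_label(label)
    return labels_before, loaded.labels
```

For example, `round_trip(["DISH", "CUISINE"])` shows an empty label list straight after loading and the full list once the labels have been added again.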
Even without adding any new entity type, it still messed up.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
There is no consistent behaviour when fine tuning a spaCy NER model ("en" in this case).
Sometimes the newly trained model annotates every word in the test sentence as an entity; sometimes "I am looking for a restaurant" is recognized as TIME. The attached minimal failing example shows the behaviour.
Info about spaCy
Code to reproduce:
Sometimes the result is a malloc error, sometimes I get
sometimes it's
and when unlucky I run into this