After Spacy 2.3 upgrade, command-line spacy train command keeps failing #5620
Comments
A minor correction:
So the only option that seems to allow me to continue using Spacy with vectors is to use one of my older (Spacy 2.2.3) models as the 'base' and keep re-training it. However, the message Spacy gives me is not very encouraging:
Thanks for the report! The train command error with -v is a bug.
The warning above is correct: v2.2 and v2.3 models aren't compatible, so you need to be sure the models you're using match the spacy version you're using.
This looks like an unintended side effect of #5374.
Thank you for the suggestion to revert to Spacy 2.2.4. That said, I would prefer NOT to have to revert to 2.2.4. I was able to re-train one of my 2.2.3 models (using it as the 'base'). My hope was that 2.3 had fixed the memory leak in beam search, and in 2.3 it does seem better, though it is still leaking - see bug ... (I need NER prediction confidence, and I really cannot re-load the model in the prediction server every 500 or so requests).
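For reference, the usual way to get NER prediction confidence in spaCy 2.x is the beam-parse recipe circulated in the issue tracker. A minimal sketch, assuming a loaded 2.x model; the example text and beam parameters are illustrative, not taken from this thread:

```python
from collections import defaultdict
import spacy

nlp = spacy.load("en_core_web_md")
ner = nlp.get_pipe("ner")

# Tokenize only; beam_parse runs the NER model itself.
doc = nlp.make_doc("Apple is opening an office in San Francisco.")
beams = ner.beam_parse([doc], beam_width=16, beam_density=0.0001)

# Sum the probabilities of the beam parses containing each span.
scores = defaultdict(float)
for beam in beams:
    for score, ents in ner.moves.get_beam_parses(beam):
        for start, end, label in ents:
            scores[(start, end, label)] += score

for (start, end, label), score in sorted(scores.items()):
    print(doc[start:end], label, round(score, 3))
```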
Since bug #3782 has been closed, should I submit it again? Or is it an enhancement request: "please allow adding new entities when re-training a base model"?
You shouldn't have to modify your CUDA/cupy installation when upgrading spacy; it should be relatively independent. You can uninstall [...]

The beam search memory leak should be fixed in 2.2.4. If you'd like to use 2.3.0, you can apply this very short patch and install spacy from source (or just edit this one file in your spacy install) to fix the -v bug.

If you'd like to use the train CLI [...]
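The advice on adding labels was lost in the archive above. For reference, a common spaCy 2.x pattern for adding a new entity type to an existing model and resuming training looks roughly like the sketch below; it is not necessarily the exact advice given, and the label name and training data are placeholders:

```python
import random
import spacy
from spacy.util import minibatch

# Placeholder training data for a hypothetical new entity type.
TRAIN_DATA = [
    ("Order AX-42 shipped today.", {"entities": [(6, 11, "PRODUCT_CODE")]}),
]

nlp = spacy.load("en_core_web_md")
ner = nlp.get_pipe("ner")
ner.add_label("PRODUCT_CODE")  # register the new label with the NER model

optimizer = nlp.resume_training()
for epoch in range(10):
    random.shuffle(TRAIN_DATA)
    for batch in minibatch(TRAIN_DATA, size=8):
        texts, annotations = zip(*batch)
        nlp.update(texts, annotations, sgd=optimizer, drop=0.35)
```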
Thank you. I will try to 'patch' the 2.3; the fix seems pretty simple/obvious.

I got the GPU working, but I hoped for a bigger performance boost (then again, I did not spend that much on the GPU either). Perhaps I can improve it with some parameter tuning. Right now Spacy reports about 30,000 GPU WPS, but the GPU utilization is very low, about 20% CUDA use on my GeForce GTX 1660 card. Looks like I will get 2x the speed...

And thank you very much for the advice on adding the labels. The 'train' CLI too complicated? I guess 'complicated' is a relative term. I may try it, though I do not trust my Python skills yet. Perhaps, almost 20 years ago, when my co-worker Mark Lutz was enthusiastically writing the first Python book, I should have joined him :-).
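On the GPU question, it may be worth double-checking that spaCy is actually running on the GPU; in 2.x this has to be requested before the model is loaded. A minimal sketch:

```python
import spacy

# require_gpu() raises an error if no GPU is available;
# prefer_gpu() returns a bool and silently falls back to CPU.
spacy.require_gpu()

# Load the model *after* activating the GPU so its weights are
# allocated there.
nlp = spacy.load("en_core_web_md")
```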
In terms of the memory usage, maybe #5083 is relevant? The vocab in the model is not static and the memory usage will grow to some extent as you use it on texts with tokens it hasn't seen before. If the memory usage doesn't look like it's explained by this, you can open a new issue with a minimal example that demonstrates the problem and we'll try to look into it. It could be related to the tee problem mentioned in #5083?

Most of the development has focused on efficient CPU implementations and the cupy/GPU implementation is not particularly efficient at this point. I think I normally see about a 2-4x difference between CPU and GPU, but it'll depend on your system. It's probably not going to be a huge difference.

With the train CLI there are so many possible combinations of options that it's hard to keep everything tested sufficiently. There are several configurations that we test thoroughly when training the provided models (we also use the train CLI directly for internal training), but some like this -v bug still end up falling through the cracks. The current train CLI is going to be replaced with training from config files in spacy v3, which should hopefully be easier to maintain.
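The vocab growth described here is easy to observe directly; a small sketch (the nonsense input is illustrative):

```python
import spacy

nlp = spacy.load("en_core_web_md")
print(len(nlp.vocab))  # lexemes cached so far

# Tokens the model has never seen are added to the vocab as new
# lexemes, so memory use grows with vocabulary size.
nlp("qwzrtk blorf inthe gnarble")
print(len(nlp.vocab))  # larger than before
```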
Thank you very much.
With regards to memory usage, #5083 is relevant. My data comes from OCR'd documents, and there is an infinite amount of 'garbage' (noise, unreadable words) and 'misspellings' (words such as "inthe"). So my vocabulary is almost infinite.
My idea is to replace the 'garbage' in the data with some special token, like '…', to indicate there is unknown text - though I doubt I know enough about Spacy and ML to choose the 'right' approach.
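A rough sketch of that replacement idea, with an entirely hypothetical noise heuristic (real OCR noise detection would need something more careful):

```python
import re

# Hypothetical heuristic: tokens mixing digits and letters, or long
# vowel-less runs, are treated as OCR noise.
NOISE = re.compile(
    r"\b(?=\w*\d)(?=\w*[A-Za-z])\w+\b"   # mixed digits/letters
    r"|\b[b-df-hj-np-tv-z]{4,}\b",       # 4+ consonants in a row
    re.IGNORECASE,
)

def scrub(text: str) -> str:
    # Replace each noise token with a single placeholder so the
    # junk never inflates the model's vocab.
    return NOISE.sub("…", text)

print(scrub("Th3 deed was f1led by Xkrtz Qwrtpp in Denver"))
```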
We also try 'fixing' the 'misspellings', but that has been an uphill battle (perhaps another AI/ML project ☺). Hence adding nlp.vocab.strings._reset_and_load(minimal_strings) makes sense, and I would expect it to be much faster than re-loading the entire model.
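For reference, a sketch of how that reset might be used in a long-running service. Note that _reset_and_load is a private spaCy 2.x API (spaCy's own Language.pipe cleanup uses it internally), so this is fragile across releases:

```python
import spacy

nlp = spacy.load("en_core_web_md")

# Snapshot the strings the freshly loaded model actually needs.
baseline_strings = list(nlp.vocab.strings)

def handle_batch(texts):
    results = [[(ent.text, ent.label_) for ent in nlp(t).ents] for t in texts]
    # Drop strings accumulated from unseen tokens. Only safe once no
    # previously created Doc objects are still in use, since their
    # hashes may point at the discarded strings.
    nlp.vocab.strings._reset_and_load(baseline_strings)
    return results
```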
I am not familiar with the itertools.tee() problem, but the prediction code using beam search is NOT thread-safe, so using it in a multi-threaded server is out of the question.
Further with regards to memory usage, I see a smaller but steady increase in memory utilization even when I repeatedly (1000 times) request prediction for the same phrase (using beam widths up to 16). I will try to reproduce it in a small unit test and submit it as a bug.
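A starting point for that unit test might look like the sketch below, which measures process RSS around repeated beam-parse calls. psutil is a third-party package, and the sample text and 1000-iteration figure mirror the description above:

```python
import os
import psutil  # pip install psutil
import spacy

nlp = spacy.load("en_core_web_md")
ner = nlp.get_pipe("ner")
proc = psutil.Process(os.getpid())
text = "The deed was recorded in Denver on June 24, 2020."

rss_before = proc.memory_info().rss
for _ in range(1000):
    doc = nlp.make_doc(text)  # same phrase every time
    ner.beam_parse([doc], beam_width=16, beam_density=0.0001)
rss_after = proc.memory_info().rss

# With a fixed input the vocab can't grow, so any steady increase
# here points at a leak rather than vocabulary growth.
print(f"RSS grew by {(rss_after - rss_before) / 1e6:.1f} MB")
```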
Finally, thanks again for the -v patch; it works like a champ. And switching from command-line options to config file(s) makes a lot of sense. That could include environment variables too. I have learned to use Java Properties (key=value pairs, i.e. a persistent dictionary) a LOT...
How to reproduce the behaviour
I had been using Spacy 2.2.3 and the command below to train my models for almost a month. I decided to add a GPU and updated my Spacy installation using:
pip install -U spacy[cuda101]
This updated my install to Spacy 2.3. (I then repeatedly uninstalled and re-installed with pip --no-cache-dir install -U spacy[cuda101], but that did not change the 'new' behavior.)
I also updated the Spacy models (see environment below) and ran pip install -U spacy-lookups-data.
My training command now fails:
When trying to train using an existing model (this never worked in 2.2.3), I get:
(the file is there, in C:\Program Files\Python\Lib\site-packages\en_core_web_md\meta.json)
Your Environment
Installation performed as Administrator, NO venv
py -m spacy validate
✔ Loaded compatibility table

====================== Installed models (spaCy v2.3.0) ======================
ℹ spaCy installation: C:\Program Files\Python\lib\site-packages\spacy
TYPE NAME MODEL VERSION
package en-core-web-sm en_core_web_sm 2.3.0 ✔
package en-core-web-md en_core_web_md 2.3.0 ✔
package en-core-web-lg en_core_web_lg 2.3.0 ✔