Is there a better way to save the model to the disk? #8
Comments
There are ways to save and load these kinds of models, but I have not gotten around to implementing this properly yet, since it will take significant time and I haven't had much spare time over the past few months.
As a workaround, you can use the standalone usage and simply save the sklearn classifier as a separate component.
Let me know if this makes sense and works for you. If so, it might be an easy hotfix to add support for passing such a classifier into the spaCy pipeline along with the embedding model.
I will give it a try in a couple of hours and let you know. Thanks a lot for your reply for now 😊
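Hi! Thank you again for your help. I finally solved the problem following your advice of using the standalone usage. I saved the entire classifier object with pickle.dump() and retrieved it afterward with pickle as well. It only takes a couple of seconds to load the model.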
Thanks for the shoutout ✌️. Awesome that it worked! Could you share your code? Then I can formalize it and potentially use it as a base for the spaCy save update.
… On 17 Jun 2022, at 17:01, fMercatili ***@***.***> wrote:
Hi! Thank you again for your help. I finally solved the problem following your advice of using the standalone usage. I saved the entire classifier object with pickle.dump() and retrieved afterward with pickle as well. It only takes a couple of seconds to load the model.
This library is pure gold btw 😍
Literally four lines of code 😁
and then to load the model and use it:
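For readers following along, here is a minimal sketch of the pickle approach described in this thread, not the code that was originally posted. A plain dictionary stands in for the trained classifier, which is an assumption: in practice the object would be the fitted classifier from the standalone usage. The helper names `save_model` and `load_model` and the file name `classifier.pkl` are hypothetical and are not part of the library.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Serialize an already-trained model object to disk with pickle."""
    with open(path, "wb") as f:
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path):
    """Load a pickled model from disk. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical stand-in for the fitted classifier; a real run would pass
# the trained standalone classifier object here instead of a dictionary.
trained_model = {
    "labels": ["furniture", "kitchen"],
    "weights": [[0.1, 0.9], [0.8, 0.2]],
}

# Save once after the slow training step...
model_path = Path(tempfile.mkdtemp()) / "classifier.pkl"
save_model(trained_model, model_path)

# ...and restore it later without repeating the training.
restored_model = load_model(model_path)
```

Because the whole fitted object is serialized, loading skips the embedding and training work, which would be consistent with the load time of a couple of seconds reported above. The pickle file should be loaded with the same library versions that were used to create it.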
Hello everyone. I am a novice in this field and I really would like to thank you all for this wonderful library that is making my work way easier than I expected!
For a project, I have to do few-shot classification with a relatively large number of samples (27 labels with roughly 500 entries each). Training takes about 15 minutes on my computer using the sentence-transformers embeddings. I wanted to save the resulting model to disk so that it can be used again in the future. I tried the instructions in the spaCy documentation (https://spacy.io/usage/saving-loading), but I observe a strange behavior: if I use the "to_disk" and "from_disk" methods, the model is saved to disk, but when it's time to load it, it seems to start from zero, taking 15 minutes again to load. I also tried pickle, but I get the following error at loading time: "AttributeError: [E047] Can't assign a value to unregistered extension attribute 'cats'. Did you forget to call the set_extension method?".
Does anybody know a way to save the model (or the spaCy object) to disk and retrieve it rapidly afterwards?