Question: Difference between spaCy WordVec and Gensim/Google WordVec #338
Comments
Google's word2vec generates word vectors from text. spaCy makes it easy to load these and other word vectors so that you can use them in your NLP tasks. By default, spaCy currently loads vectors produced by the Levy and Goldberg (2014) dependency-based word2vec model, but you can also load Google's word2vec or GloVe vectors. Please see this blog post for more details on how to do that:
Thanks Yasser
The easiest way to load GloVe vectors is now:
import spacy
nlp = spacy.load('en', vectors='en_glove_cc_300_1m')
This will load a subset of the GloVe Common Crawl vectors: it'll give you vectors for 1m words. This is a large vocabulary and you should get high coverage with it, without the crazy memory requirements of the original unpruned data. This function isn't well documented yet, because we've only recently stabilised the API. I'll fix the blog post.
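As a rough picture of what loading an external vector set amounts to, here is a minimal sketch that assumes the plain-text layout used by GloVe files (one word per line, followed by its space-separated components). It does not use spaCy or Gensim; the function names and the toy three-dimensional values are invented for illustration. A word2vec text file additionally starts with a "vocab_size dimensions" header line, which this sketch would not skip.

```python
import io
import math


def read_text_vectors(handle):
    """Parse 'word v1 v2 ...' lines into a dict of word -> list of floats."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(' ')
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy data standing in for a real vectors file (values are invented).
toy_file = io.StringIO(
    "king 0.9 0.1 0.3\n"
    "queen 0.8 0.2 0.4\n"
    "apple 0.1 0.9 0.0\n"
)
vectors = read_text_vectors(toy_file)
print(cosine(vectors['king'], vectors['queen']))
print(cosine(vectors['king'], vectors['apple']))
```

Whichever tool produced the vectors, the result is a mapping from words to arrays of floats, so the practical difference between vector sets lies in the training data and model rather than in how they are consumed.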
This doesn't work and throws an exception:
The reason is that the regex should be just '_', which works for both 'en' and 'en_glove_cc_300_1m', returning the desired 'en'. However, even after fixing the regex there is another exception:
Running "python -m spacy.en.download --force all" doesn't help. I'm running version 0.101.0.
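The regex point in the comment above can be illustrated with a small standalone sketch. The helper name `lang_from_model_name` is hypothetical and this is not spaCy's actual code; it only shows that splitting on '_' gives 'en' for both names mentioned.

```python
import re


def lang_from_model_name(name):
    """Return the part of a model name before the first underscore.

    Hypothetical helper for illustration only; not spaCy's implementation.
    """
    # Split on '_' at most once; a name without an underscore comes back whole.
    return re.split('_', name, maxsplit=1)[0]


for name in ('en', 'en_glove_cc_300_1m'):
    print(name, '->', lang_from_model_name(name))
```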
Ran into the same issue. Per @aie0's suggestion I switched
This should all be cleaned up in 1.0: the GloVe vectors are installed by default, and it's much easier to use different vectors.
I always get this error, even after installing 'en':
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Hi,
Thanks a lot for your fantastic tool, keep up the good work!
I want to ask what the difference is between the Google word vector library ( https://code.google.com/archive/p/word2vec/ ) and the one you use in spaCy.
Kind regards