word2vec pickle error help #860

Closed
Max-programmer opened this issue Sep 19, 2016 · 6 comments

@Max-programmer

Hello,
I am new to gensim and am trying to load an English word2vec model in my Python script model.py and test it:

import gensim.models.word2vec
model = gensim.models.Word2Vec.load("en.model")
model.similarity('woman', 'man')

This fails with the traceback below. After some searching, I found that the error is related to pickling. One suggestion is to use:
pickle.load(file_obj, encoding='latin1')

But how do I apply that suggestion? Or is there another way to solve the problem?

"C:\Program Files (x86)\Anaconda3\python.exe" "C:/Users/M/PycharmProjects/Twitter Sentiment Analysis/Word2Vec/Model.py"
Traceback (most recent call last):
  File "C:/Users/M/PycharmProjects/Twitter Sentiment Analysis/Word2Vec/Model.py", line 5, in <module>
    model = gensim.models.Word2Vec.load("german.model")
  File "C:\Program Files (x86)\Anaconda3\lib\site-packages\gensim\models\word2vec.py", line 1684, in load
    model = super(Word2Vec, cls).load(*args, **kwargs)
  File "C:\Program Files (x86)\Anaconda3\lib\site-packages\gensim\utils.py", line 248, in load
    obj = unpickle(fname)
  File "C:\Program Files (x86)\Anaconda3\lib\site-packages\gensim\utils.py", line 911, in unpickle
    return _pickle.loads(f.read())
_pickle.UnpicklingError: invalid load key, '6'.

Process finished with exit code 1

Max-programmer changed the title from "word2vec" to "word2vec pickle error help" on Sep 19, 2016
@gojomo
Collaborator

gojomo commented Sep 19, 2016

How was the model in file 'en.model' initially created and saved? (Was it created with the same versions of Python, gensim, OS, etc.?)

@Max-programmer
Author

I got the model from:
http://devmount.github.io/GermanWordEmbeddings/
I do not know which specific versions the author used, only that it was Python and gensim.
I am now using Python 3.5 on Windows 8 to load the model.

@gojomo
Collaborator

gojomo commented Sep 20, 2016

Looking at their code (https://github.com/devmount/GermanWordEmbeddings/blob/c2b603a07d968146995ee9dde54a25fd0aa8586a/training.py#L56), I see they've saved the model via save_word2vec_format(). That format is not a pickle, so Word2Vec.load() cannot read it; you'd need to use Word2Vec.load_word2vec_format() to have a chance of loading it.
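
A minimal sketch of what that change could look like. The helper name load_word2vec_file is hypothetical, and binary=True is an assumption about how the file was written (pass binary=False for the plain-text variant). In gensim 1.0 and later the loader lives on KeyedVectors rather than Word2Vec, so the sketch tries that first and falls back to the older location:

```python
def load_word2vec_file(path, binary=True):
    """Load vectors that were written with save_word2vec_format().

    `binary=True` is an assumption about the file; use binary=False
    if it was saved in the plain-text variant of the format.
    """
    try:
        # gensim >= 1.0: the loader is a classmethod of KeyedVectors.
        from gensim.models import KeyedVectors
        return KeyedVectors.load_word2vec_format(path, binary=binary)
    except ImportError:
        # Older gensim releases: the loader was a classmethod of Word2Vec.
        from gensim.models import Word2Vec
        return Word2Vec.load_word2vec_format(path, binary=binary)
```

With the file from the traceback this would be called as load_word2vec_file("german.model"), and the returned object provides similarity() for pairs of words that are in its vocabulary.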

I can also tell from the included notebook that Python 2.7.6 was used. (See the bottom of: https://raw.githubusercontent.com/devmount/GermanWordEmbeddings/master/code/training.ipynb). So if you still have problems after using load_word2vec_format(), you may want to try using Python 2.7.6.
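
Before switching Python versions, it may be worth checking which kind of file you actually have. The "invalid load key, '6'" message is consistent with a word2vec-format file: such a file starts with a plain-text header line giving the vocabulary size and the vector size, so its first byte is a digit rather than a pickle opcode. The sketch below uses only the standard library; the function name guess_model_format is hypothetical, and the pickle check assumes protocol 2 or later, where the stream begins with byte 0x80:

```python
import re

# Header of the word2vec format: "<vocab_size> <vector_size>".
_W2V_HEADER = re.compile(rb"^\d+ \d+\s*$")


def guess_model_format(path):
    """Return 'pickle', 'word2vec', or 'unknown' from the file's first bytes."""
    with open(path, "rb") as f:
        first_line = f.readline(64)  # at most 64 bytes, or up to the first newline
    if first_line[:1] == b"\x80":
        # Pickle protocol 2+ streams start with the PROTO opcode (0x80).
        return "pickle"
    if _W2V_HEADER.match(first_line.rstrip(b"\r\n")):
        # Two integers on the first line: looks like save_word2vec_format() output.
        return "word2vec"
    return "unknown"
```

If this returns 'word2vec' for your file, then load_word2vec_format() is the right loader and the pickle encoding suggestion does not apply.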

@tmylk
Contributor

tmylk commented Sep 22, 2016

@Max-programmer Did using Python 2 help? If yes, then I would like to close this issue.

@Max-programmer
Author

@tmylk: I got sick. I will try it out tomorrow or Saturday and post the results here. Thanks in advance!

@Max-programmer
Author

Thanks, you can close the issue now.

tmylk closed this as completed on Sep 25, 2016