convertvec output inconsistent with word2vec output #1

Closed
skbach opened this issue Dec 21, 2014 · 2 comments

skbach commented Dec 21, 2014

Hi,

I think I found an issue with the code.

1.) Run the word2vec/demo-word.sh script.

This generates vectors.bin.

2.) Run the following (a modified version of the command above, with text output instead of binary):

time ./word2vec -train text8 -output vectors.txt -cbow 1 -size 200 -window 8 -negative 25 -hs 0 -sample 1e-4 -threads 20 -binary 0 -iter 15

3.) Convert the original binary into text with convertvec:

./convertvec bin2txt vectors.bin vectors-converted.txt

4.) Diff vectors.txt against vectors-converted.txt: the files are not the same. They contain the same words, and each word has a 200-dimensional vector, but the values do not match those generated by word2vec in text mode.

Am I crazy? :)

marekrei (Owner) commented

Word2vec initialises the vectors randomly. So even if you just run the same command twice, you're not going to get the same vectors.
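
A fairer check of the converter is to compare vectors.bin against the text file converted from that same vectors.bin, rather than against a separate training run. Below is a minimal Python sketch of such a check. It assumes the usual word2vec layout (a header line with vocabulary size and dimension, then each word followed by a space and its values) and that the binary values are little-endian 32-bit floats. The function names are hypothetical and not part of convertvec or word2vec.

```python
import struct


def load_text_vectors(path):
    """Read a word2vec-style text file into {word: [floats]}."""
    vectors = {}
    with open(path, encoding="utf-8", errors="replace") as f:
        _vocab_size, dim = (int(x) for x in f.readline().split())
        for line in f:
            parts = line.split()
            if len(parts) == dim + 1:  # skip blank or malformed lines
                vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def load_binary_vectors(path):
    """Read a word2vec-style binary file into {word: [floats]}.

    Assumption: values are little-endian float32, as written on x86.
    """
    vectors = {}
    with open(path, "rb") as f:
        vocab_size, dim = (int(x) for x in f.readline().split())
        for _ in range(vocab_size):
            word = bytearray()
            while True:
                ch = f.read(1)
                if ch == b" " or ch == b"":
                    break
                if ch != b"\n":  # newline left over from the previous entry
                    word.extend(ch)
            data = f.read(4 * dim)
            if len(data) < 4 * dim:
                break  # truncated file
            key = word.decode("utf-8", errors="replace")
            vectors[key] = list(struct.unpack("<%df" % dim, data))
    return vectors


def max_abs_difference(a, b):
    """Largest absolute difference between two {word: vector} mappings."""
    if a.keys() != b.keys():
        raise ValueError("vocabularies differ")
    return max(
        (abs(x - y) for w in a for x, y in zip(a[w], b[w])),
        default=0.0,
    )
```

For the files in this issue, that would mean calling max_abs_difference(load_binary_vectors("vectors.bin"), load_text_vectors("vectors-converted.txt")). Small differences from decimal rounding in the text file are expected. Large ones would point to a real conversion problem.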

skbach (Author) commented Dec 22, 2014

Ah, that makes sense. Thanks!

skbach closed this as completed Dec 22, 2014