why skip-gram takes context word as input and predict word itself #2579

Closed
truythu169 opened this issue Aug 14, 2019 · 4 comments

truythu169 commented Aug 14, 2019

Problem description

I'm planning to use the Python skip-gram (sg) code in my research, but I noticed a difference between the implementation and the original formulation in Mikolov's paper.
The difference is detailed below.
Please let me know whether it is intentional or a bug.

Steps/code/corpus to reproduce

code in:

gensim/gensim/models/word2vec.py

https://github.com/RaRe-Technologies/gensim/blob/f97d0e793faa57877a2bbedc15c287835463eaa9/gensim/models/word2vec.py#L399-L414

We can see that the input word is treated as the output of the NN, while the context word is embedded via the matrix syn0 (the vectors matrix).

...

https://github.com/RaRe-Technologies/gensim/blob/f97d0e793faa57877a2bbedc15c287835463eaa9/gensim/models/word2vec.py#L443-L456

As a result, the code optimizes P(input | context), whereas the skip-gram architecture in the original paper optimizes P(context | input).
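
For reference, the two objectives being contrasted can be written as below. The notation is an assumption and is not taken from the gensim source: $w_1, \dots, w_T$ is the training corpus, $c$ is the window size, and the first line follows the usual statement of the skip-gram objective. The second line is only one reading of the linked code, with the roles of the two words swapped.

$$
\begin{aligned}
J_{\text{paper}} &= \frac{1}{T}\sum_{t=1}^{T}\sum_{\substack{-c \le j \le c \\ j \ne 0}} \log p(w_{t+j} \mid w_t) \\
J_{\text{swapped}} &= \frac{1}{T}\sum_{t=1}^{T}\sum_{\substack{-c \le j \le c \\ j \ne 0}} \log p(w_t \mid w_{t+j})
\end{aligned}
$$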

@AMR-KELEG
Contributor

@truythu169 I have a feeling you have found a bug but let's wait for a confirmation from the maintainers since this would be a major one conceptually 😅

Collaborator

mpenkov commented Sep 7, 2019

@piskvorky WDYT? Does this look like a bug? I'm not familiar with the implementation (TBH, I don't think at this stage anyone is).

@truythu169 Are you able to make a PR? A good start would be a unit test that fails because of the suspected bug.

mpenkov self-assigned this Sep 7, 2019
mpenkov added the bug label Sep 7, 2019

Owner

piskvorky commented Sep 7, 2019

No, zero chance there's a bug in the word2vec algo.

@AMR-KELEG The question of context-vs-target direction comes up a lot, check the mailing list. I remember @gojomo answered it repeatedly, although I cannot find his great answers now. @gojomo can you add it to the Gensim FAQ?

Collaborator

gojomo commented Sep 7, 2019

I recall answering this a few times, and though I can't find my answers at the moment, @piskvorky answered it first at: #300 (comment)

While this has come up a few times – like confusion about the proper handling of averaging/dividing CBOW vectors/gradients – it's still very insider, for people obsessing over the source – I wouldn't assign it a slot in the overall FAQ. Maybe a new "implementation details FAQ"?
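
A minimal sketch of the usual argument for why the direction may not matter in aggregate. It is not gensim's code: the function names and the toy sentence are hypothetical, and it assumes a fixed symmetric window, whereas the real implementation shrinks the window randomly per position, so the equivalence there would hold only on average. Under that assumption, enumerating "center predicts neighbor" pairs and "neighbor predicts center" pairs over a whole sentence gives the same multiset of (input, predicted) pairs, just visited in a different order.

```python
from collections import Counter


def center_to_context(tokens, window):
    """Count (input, predicted) pairs where the center word predicts each neighbor."""
    pairs = Counter()
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs[(center, tokens[j])] += 1
    return pairs


def context_to_center(tokens, window):
    """Count (input, predicted) pairs where each neighbor predicts the center word."""
    pairs = Counter()
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs[(tokens[j], center)] += 1
    return pairs


# Hypothetical toy sentence, used only for illustration.
tokens = "the quick brown fox jumps over the lazy dog".split()
forward = center_to_context(tokens, window=2)
swapped = context_to_center(tokens, window=2)

# With a fixed symmetric window, both directions yield the same training pairs.
print(forward == swapped)
```

The equality follows because the set of position pairs (i, j) with i != j and |i - j| <= window is symmetric: whenever j is in the window of i, i is also in the window of j.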
