
How to ensure that the negatively sampled words are not the target word? #9

Closed
jeffchy opened this issue Nov 20, 2018 · 1 comment

jeffchy commented Nov 20, 2018

First, thanks for your excellent code :)

In model.py, the following piece of code suggests that we may draw the positive (target) word when doing negative sampling, though the probability is very small:
nwords = t.multinomial(self.weights, batch_size * context_size * self.n_negs, replacement=True).view(batch_size, -1)
I'm wondering why you didn't perform an equality check. Is that because it doesn't affect the quality of the trained word vectors but would slow down training?
Or are there other reasons?
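For example, one cheap way I can imagine adding the check is to redraw only the colliding entries after the batched draw, something like the sketch below (assuming torch is imported as t as in model.py; `sample_negatives` and `max_retries` are just names I made up for illustration):

```python
import torch as t

def sample_negatives(weights, iword, context_size, n_negs, max_retries=3):
    """Draw negatives as in model.py, then resample any that collide with the target.

    `weights` is the unigram sampling distribution and `iword` is the
    (batch_size,) tensor of target word indices.
    """
    batch_size = iword.size(0)
    nwords = t.multinomial(weights, batch_size * context_size * n_negs,
                           replacement=True).view(batch_size, -1)
    for _ in range(max_retries):
        # True wherever a sampled negative equals its row's target word
        collisions = nwords.eq(iword.unsqueeze(1))
        if not collisions.any():
            break
        # redraw only the colliding entries instead of the whole batch
        nwords[collisions] = t.multinomial(weights, int(collisions.sum().item()),
                                           replacement=True)
    return nwords
```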

theeluwin (Owner) commented

Simply because of training speed, yes.
I once implemented the 'correct' sampling,

def sample(self, iword_b, owords_b):
but it was way too slow, whereas the faster sampling allows more training iterations in the same time.
Still, I believe that the correct sampling is the right way to go.
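For illustration, a rough reconstruction of what that per-example 'correct' sampling might look like (the body below is only a sketch, not the code that was actually removed; it assumes torch is imported as t and that owords_b is a (batch_size, context_size) tensor). The Python-level loop over the batch is what made this approach much slower than one batched multinomial call:

```python
def sample(self, iword_b, owords_b):
    # Rejection-sample per example so that no negative equals that
    # example's target word.
    context_size = owords_b.size(1)
    negs = []
    for iword in iword_b:
        row = t.multinomial(self.weights, context_size * self.n_negs, replacement=True)
        bad = row.eq(iword)
        while bad.any():
            # redraw only the entries that collided with the target word
            row[bad] = t.multinomial(self.weights, int(bad.sum().item()), replacement=True)
            bad = row.eq(iword)
        negs.append(row)
    return t.stack(negs)
```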
