Gensim doesn't allow changing negative sampling distribution parameter #2090
As pointed out in the following article, the negative sampling distribution parameter, which is fixed at 0.75 in Gensim, is worth tuning, especially for applications beyond NLP. So it'd be very helpful to make it a parameter of Word2Vec instead of hard-coding it.
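For readers unfamiliar with the parameter: in word2vec-style negative sampling, noise words are drawn with probability proportional to their corpus count raised to this exponent. Here is a minimal pure-Python sketch of that effect. The function name and toy counts are made up for illustration, and this is not Gensim's actual implementation.

```python
def negative_sampling_probs(counts, exponent=0.75):
    """Return {word: probability}, with P(w) proportional to count(w) ** exponent."""
    weights = {word: count ** exponent for word, count in counts.items()}
    total = sum(weights.values())
    return {word: weight / total for word, weight in weights.items()}


# Toy frequency table (hypothetical numbers).
counts = {"the": 1000, "cat": 100, "zygote": 1}

# exponent=1.0 reproduces the raw unigram distribution,
# exponent=0.0 gives a uniform distribution,
# values in between flatten the distribution toward rare words.
for exponent in (1.0, 0.75, 0.0):
    probs = negative_sampling_probs(counts, exponent)
    print(exponent, {word: round(p, 4) for word, p in probs.items()})
```

Lower exponents give rare items more weight as negatives, which is why the best value may differ for data whose frequency profile is unlike natural-language text.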
Hello, @gojomo. I'm preparing a PR, but I'm having trouble with some tests failing. The following tests failed:
Both had the same error:
I guess it's because of the new attribute I've placed in the Word2Vec. What am I supposed to do to make it compatible with old saved models?
Feel free to create the PR from your branch to gensim/develop even before it's fully ready - that'll make it easier for others to see the changes, and to see the unit-test results in the project's own continuous-integration setup. You can mark it "[WIP]" to be clear it's a Work-In-Progress.
In general, if a class gets a new (necessary) parameter, older saved (python pickled) objects of that type will need to be patched-up, upon
The various "if" fixups there either rebuild things that intentionally weren't saved (because they can be fully rebuilt from other saved state), or patch up missing state that newer source requires and older source didn't save. I'm not sure why missing this parameter creates exactly the error you've reported – maybe an earlier error is being silently suppressed – but one of the
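To illustrate the general patch-up-on-load pattern described above, here is a minimal sketch. It uses a hypothetical `TinyModel` class and a hypothetical attribute name `ns_exponent`, and it pickles only the instance's state dict to keep the example self-contained. It is not Gensim's actual load code.

```python
import pickle


class TinyModel:
    """Hypothetical stand-in for a model class; not Gensim's Word2Vec."""

    def __init__(self, size=100, ns_exponent=0.75):
        self.size = size
        self.ns_exponent = ns_exponent  # attribute added in the "newer" source

    def save_bytes(self):
        # Persist only the instance state (a plain dict).
        return pickle.dumps(self.__dict__)

    @classmethod
    def load_bytes(cls, data):
        model = cls.__new__(cls)  # bypass __init__, as unpickling does
        model.__dict__.update(pickle.loads(data))
        # Fixup: older saves never wrote this attribute, so fall back
        # to the value the older code had hard-coded.
        if not hasattr(model, "ns_exponent"):
            model.ns_exponent = 0.75
        return model


# Simulate a save written by an older version that lacked the attribute.
old_save = pickle.dumps({"size": 50})
model = TinyModel.load_bytes(old_save)
print(model.size, model.ns_exponent)
```

Defaulting to the previously hard-coded value keeps old models behaving as they did before the parameter existed.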
Without a full review yet, a few other thoughts: