Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sanity check for hs and negative in Word2Vec #3443

Merged
merged 1 commit into from Feb 17, 2023
Merged

Sanity check for hs and negative in Word2Vec #3443

merged 1 commit into from Feb 17, 2023

Conversation

gau-nernst
Copy link
Contributor

Fixes #1983

Changes:

  • Raise ValueError when both hs=0 and negative=0
  • Raise warning when hs and negative are nonzero
  • Add test for the above
  • Update description about hs and negative to avoid confusion

@gojomo Please let me know if you want any changes. Cheers!

@gojomo
Copy link
Collaborator

gojomo commented Feb 17, 2023

Looks very good to me!

@piskvorky piskvorky added this to the Next release milestone Feb 17, 2023
@piskvorky
Copy link
Owner

piskvorky commented Feb 17, 2023

Many thanks! Let's wait for @mpenkov to review and merge.

By the way, could other models in Gensim (doc2vec) benefit from a similar cleanup?

@gau-nernst
Copy link
Contributor Author

Many thanks! Let's wait for @mpenkov to review and merge.

By the way, could other models in Gensim (doc2vec) benefit from a similar cleanup?

It seems like doc2vec and fasttext in gensim have the same problem and would benefit from the same fix. I can add the changes to doc2vec and fasttext also, once you confirm it!

@piskvorky
Copy link
Owner

Yes please!

Copy link
Collaborator

@mpenkov mpenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@mpenkov
Copy link
Collaborator

mpenkov commented Feb 17, 2023

Merging. Let's handle the other models in a separate PR, to keep the ball rolling.

@mpenkov mpenkov merged commit f260d1e into piskvorky:develop Feb 17, 2023
@gau-nernst gau-nernst deleted the word2vec-warnings branch February 17, 2023 14:46
@gau-nernst
Copy link
Contributor Author

Merging. Let's handle the other models in a separate PR, to keep the ball rolling.

I did a quick check. Since FastText and Doc2Vec inherit Word2Vec class, they automatically have the sanity check for hs and negative after this commit. We can add tests for FastText and Doc2Vec but I think it is not so necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Word2Vec and Doc2Vec do not update word embeddings if negative keyword is set to 0
4 participants