
bug in gensim.summarization.mz_entropy.mz_keywords #2523


Open
bbaranow opened this issue Jun 10, 2019 · 1 comment
Labels
bug: Issue described a bug
difficulty easy: Easy issue: required small fix
good first issue: Issue for new contributors (not required gensim understanding + very simple)
Hacktoberfest: Issues marked for hacktoberfest
impact LOW: Low impact on affected users
reach LOW: Affects only niche use-case users

Comments

@bbaranow

Problem statement:

This appears to be a bug that occurs when the text is too short, so that the number of words is lower than blocksize. In my case, n_words was 232.0 and blocksize was 1024.

Log:

gensim\summarization\mz_entropy.py:127: RuntimeWarning: invalid value encountered in double_scalars
  - __log_combinations(n_words, blocksize)

Dirty solution:

Override the default blocksize of 1024 with a lower value, below the number of words in the text:

mz_keywords(text, blocksize=128)
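
As a rough sketch of how the workaround could be automated, the snippet below picks a block size from the word count instead of hard-coding 128. The helper name suggest_blocksize is hypothetical and not part of gensim, and counting words with str.split() is only an assumption; gensim's own tokenization may give a different n_words. The power-of-two rule is a heuristic chosen so that 232 words yields 128, as in the call above.

```python
def suggest_blocksize(n_words, default=1024):
    """Hypothetical helper (not a gensim function).

    Returns the largest power of two that does not exceed n_words,
    capped at the default block size of 1024 mentioned in this issue.
    """
    if n_words < 1:
        raise ValueError("need at least one word")
    # bit_length() - 1 is the exponent of the largest power of two <= n_words
    largest_power_of_two = 1 << (n_words.bit_length() - 1)
    return min(default, largest_power_of_two)


# Assumption: whitespace splitting approximates the word count gensim sees.
text = "word " * 232
n_words = len(text.split())
blocksize = suggest_blocksize(n_words)
print(n_words, blocksize)

# With gensim installed, the value could then be passed on as in the workaround:
# mz_keywords(text, blocksize=blocksize)
```

For long texts the helper simply returns the default, so only short inputs are affected.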

mpenkov added the bug label on Jun 21, 2019
mpenkov added the difficulty easy, good first issue, and Hacktoberfest labels on Sep 28, 2019
@csaranbalaji

I would like to work on this issue. Will submit a PR soon.

csaranbalaji added a commit to csaranbalaji/gensim that referenced this issue Oct 5, 2019
csaranbalaji added a commit to csaranbalaji/gensim that referenced this issue Oct 5, 2019
csaranbalaji added a commit to csaranbalaji/gensim that referenced this issue Oct 5, 2019
piskvorky added the reach LOW and impact LOW labels on Oct 8, 2019
4 participants