bug in gensim.summarization.mz_entropy.mz_keywords #2523
Open
Labels
Hacktoberfest, bug, difficulty easy, good first issue, impact LOW, reach LOW
Problem statement:
This seems to be a bug that occurs when the text is too short and the number of words is lower than `blocksize`. In my case the values were `n_words` (232.0) and `blocksize` (1024).
Log:
Dirty solution:
Override the `blocksize` value from the default `1024` to something lower: `mz_keywords(text, blocksize=128)`
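
A minimal sketch of how the workaround could be automated instead of hard-coding `128`. The helper name `pick_blocksize` is hypothetical and not part of gensim. It assumes that keeping `blocksize` at or below the word count is enough to avoid the failure, which matches the numbers reported here but has not been checked against the library source.

```python
def pick_blocksize(n_words, default=1024):
    """Hypothetical helper (not part of gensim).

    Return `default` when the text has at least that many words,
    otherwise the largest power of two that does not exceed `n_words`.
    """
    n_words = int(n_words)
    if n_words < 1:
        raise ValueError("text contains no words")
    if n_words >= default:
        return default
    # Largest power of two <= n_words.
    return 1 << (n_words.bit_length() - 1)


# Intended use, assuming a gensim version that still ships
# gensim.summarization (left commented out so this sketch runs without gensim):
#
# from gensim.summarization.mz_entropy import mz_keywords
# n_words = len(text.split())  # rough count; gensim tokenizes differently
# keywords = mz_keywords(text, blocksize=pick_blocksize(n_words))
```

For the 232-word text from this report the helper returns 128, the same value used in the manual override above.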