natural.TfIdf.addDocument accepts either a string or an array of pre-tokenized terms. When a document is added as an array of tokens, listTerms still runs the tokenizer on each individual document token when computing the tfidf score, resulting in a tfidf score of 0 even though the tf and idf scores are both > 0.
(natural version: ^5.1.11)
An example:
The second document should have a tfidf score of 0.3068... (1 * 0.3068...), but it is 0.
The fix is simple: update the listTerms(...) function to pass an array in the
tfidf: _this.tfidf(term, d)
call, i.e. change it to:
tfidf: _this.tfidf([term], d)
(line 174 here: https://github.com/NaturalNode/natural/blob/master/lib/natural/tfidf/tfidf.js ). Thanks.
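To make the mechanism concrete, here is a minimal self-contained sketch of the behaviour described above. It is not natural's source: MiniTfIdf, tokenize, the wrapInArray flag, the idf formula and the sample tokens are all assumptions made for illustration. It only models the one relevant detail, namely that a non-array argument to tfidf gets re-tokenized.

```javascript
// Simplified stand-in for the scoring path described in this issue.
// NOT natural's code: every name here is hypothetical.

// Hypothetical word tokenizer: lowercases and splits on non-alphanumerics.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

class MiniTfIdf {
  constructor() {
    this.documents = []; // one Map(term -> count) per document
  }

  // Strings are tokenized; arrays are treated as pre-tokenized and kept as-is.
  addDocument(doc) {
    const tokens = Array.isArray(doc) ? doc : tokenize(doc);
    const counts = new Map();
    for (const t of tokens) counts.set(t, (counts.get(t) || 0) + 1);
    this.documents.push(counts);
  }

  tf(term, d) {
    return this.documents[d].get(term) || 0;
  }

  // Assumed smoothed idf, chosen only so the numbers are easy to follow.
  idf(term) {
    const docsWithTerm = this.documents.filter(doc => doc.has(term)).length;
    return 1 + Math.log(this.documents.length / (1 + docsWithTerm));
  }

  // A non-array argument is re-tokenized; an array is used unchanged.
  tfidf(terms, d) {
    const list = Array.isArray(terms) ? terms : tokenize(terms);
    return list.reduce((sum, t) => sum + this.tf(t, d) * this.idf(t), 0);
  }

  // wrapInArray = false models the reported call, true models the proposed fix.
  listTerms(d, wrapInArray) {
    return [...this.documents[d].keys()].map(term => ({
      term,
      tf: this.tf(term, d),
      idf: this.idf(term),
      tfidf: this.tfidf(wrapInArray ? [term] : term, d)
    }));
  }
}

const model = new MiniTfIdf();
model.addDocument('plain text document');
model.addDocument(['pre-tokenized', 'C++']); // tokens the tokenizer would alter

const buggy = model.listTerms(1, false); // tfidf(term, d)
const fixed = model.listTerms(1, true);  // tfidf([term], d)
console.log(buggy, fixed);
```

In this sketch the stored token 'pre-tokenized' is split into 'pre' and 'tokenized' when passed as a string, neither of which exists in the document, so tfidf sums to 0 while tf and idf stay positive. Wrapping the term in an array skips the second tokenization and tfidf equals tf * idf.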