Although I could not find a statement clarifying the difference in the online help or in Voyant Tools info popups, it appears that "unique word forms count" is pre-stopwords processing and corpus terms count is post-stopwords processing. Is that correct?
There are info popups in the summary panel which explain this. Unique word forms count is the total number of words after discarding duplicate occurrences. So "the" would only be counted once even if it occurs 100 times in the corpus. It is unrelated to stopwords.
Yes, the info blocks are helpful. However, terms such as "the" are part of the default stopwords list, and the corpus terms count is lower than the unique word forms count. Is the corpus terms count the result after stopwords filtering, whereas the unique word forms count includes words in the stopwords list?
The default stopwords list is applied globally by default, so this includes the corpus terms panel.
Metadata is unaffected, however, e.g. the unique word forms statement in the summary panel or the stats in the documents panel.
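To make the distinction concrete, here is a minimal Python sketch of the two counts discussed in this thread. It is not Voyant's implementation: the tokenizer is a simplification, and `STOPWORDS`, `tokenize`, and `summarize` are hypothetical names with a made-up stopword list used only for illustration.

```python
import re

# Hypothetical stopword list for illustration; NOT Voyant's default list.
STOPWORDS = {"the", "a", "on", "and"}

def tokenize(text):
    """Lowercase the text and split it into runs of letters (a simplification)."""
    return re.findall(r"[a-z]+", text.lower())

def summarize(text, stopwords=STOPWORDS):
    tokens = tokenize(text)
    # Unique word forms: duplicates discarded, stopwords still included.
    unique_forms = set(tokens)
    # Terms left once the stopword list is applied, as in a filtered terms list.
    listed_terms = unique_forms - stopwords
    return {
        "total_words": len(tokens),
        "unique_word_forms": len(unique_forms),
        "terms_after_stopwords": len(listed_terms),
    }

stats = summarize("The cat sat on the mat and the dog sat on a log.")
print(stats)
# {'total_words': 13, 'unique_word_forms': 9, 'terms_after_stopwords': 5}
```

In this sketch "the" occurs three times but contributes one unique word form, and the filtered count is lower only because the stopwords are removed afterwards. That mirrors the answer above: the summary statistic ignores stopwords, while the terms list has them applied.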