Fix language specification in stemmer type #116

Merged
aviks merged 2 commits into JuliaText:master from nickto:fix-stemmer-language on Jan 11, 2019
Conversation

@nickto (Contributor) commented Jan 8, 2019

Currently, the stemmer's language is inferred using name(language(d)), where d is an ::AbstractDocument. This returns the name of the language in that language itself (e.g., "русский" for Russian). The Snowball stemmer, however, requires the name in English or as an ISO code:

[...] The algorithm may be selected using the english name of the
language, or using the 2 or 3 letter ISO 639 language codes. [...]

This PR fixes it by using english_name instead of name, thus producing, e.g., "russian" instead of "русский".
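
To illustrate the difference described above, here is a minimal sketch that compares the two accessors directly. It assumes the Languages.jl package is installed and that `Languages.Russian`, `Languages.name`, and `Languages.english_name` are available as the description implies. The `lowercase` call is an assumption made to match the lowercase strings quoted above, not something taken from the diff.

```julia
using Languages  # assumed to be installed; provides the language types

lang = Languages.Russian()

# Name of the language written in the language itself (not understood by Snowball)
native_label = lowercase(Languages.name(lang))

# English name of the language (the form Snowball accepts)
english_label = lowercase(Languages.english_name(lang))

println("name:         ", native_label)
println("english_name: ", english_label)
```

Only the second string is suitable for selecting a Snowball algorithm, which is why the PR switches the stemmer to english_name.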

(Tested only on Russian)

@aviks aviks merged commit 58cbff1 into JuliaText:master Jan 11, 2019
zgornel added a commit to zgornel/StringAnalysis.jl that referenced this pull request Jan 11, 2019