Added Daitch-Mokotoff Soundex Coding algorithm implementation and diacritics stripping utility #59

Merged
merged 6 commits into from Sep 11, 2012

2 participants
Contributor

maslennikov commented Sep 5, 2012

When working with words transliterated from Cyrillic, the traditional Soundex and Metaphone algorithms perform worse than Daitch-Mokotoff Soundex, which we use in our project.
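To illustrate the kind of preprocessing a diacritics-stripping utility performs before phonetic coding, here is a minimal sketch. It is not the implementation from this pull request: the function name `stripDiacritics` is hypothetical, and the approach (Unicode NFD decomposition followed by removal of combining marks) is an assumption chosen for brevity.

```javascript
// Hypothetical helper, not the utility added in this pull request.
// Assumes a runtime that supports String.prototype.normalize (ES2015+).
function stripDiacritics(text) {
  // NFD splits an accented letter into a base letter plus combining marks,
  // e.g. "é" becomes "e" followed by U+0301.
  // The regexp then removes the combining marks (U+0300 to U+036F).
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

console.log(stripDiacritics('Müller')); // Muller
```

Letters that have no canonical decomposition (for example "Ł") pass through unchanged, and some Cyrillic letters such as "й" do decompose, so a table-based mapping may be preferable when that matters.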

Owner

chrisumbel commented Sep 10, 2012

I'll check this out within the next week.

Contributor

maslennikov commented Sep 11, 2012

Changed the regexps in aggressive_tokenizer_ru because some non-alphabetic characters (e.g. the `*` character) were incorrectly passing through it.
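The kind of fix described above can be sketched as follows. This is not the actual code from aggressive_tokenizer_ru: the function name `tokenizeRu` and the exact character class are assumptions made for illustration. The idea is to split on every character that is not explicitly allowed, so that symbols such as `*` cannot end up inside a token.

```javascript
// Hypothetical sketch, not the regexps from aggressive_tokenizer_ru.
function tokenizeRu(text) {
  // Treat any run of characters other than Cyrillic letters
  // (including ё/Ё) and ASCII digits as a separator.
  return text
    .split(/[^а-яёА-ЯЁ0-9]+/)
    // Drop the empty strings left by leading or trailing separators.
    .filter(function (token) { return token.length > 0; });
}

console.log(tokenizeRu('привет, *мир*!')); // [ 'привет', 'мир' ]
```

An allow-list character class like this rejects unknown symbols by default, whereas listing the separators to split on lets unlisted symbols through.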

chrisumbel added a commit that referenced this pull request Sep 11, 2012

Merge pull request #59 from geeqie/master
Added Daitch-Mokotoff Soundex Coding algorithm implementation and diacritics stripping utility
dd66c5c

chrisumbel merged commit dd66c5c into NaturalNode:master Sep 11, 2012
