Normalizer ja #54

merged 11 commits into NaturalNode:master on Aug 21, 2012



gmarty commented Aug 20, 2012

Add a normalizer for Japanese:

  • Use normalize_ja() to get a consistent corpus before further processing.
  • Functions under the converters namespace can be used to perform several conversions (e.g. fullwidth to halfwidth characters, hiragana to katakana...).
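
To illustrate the kind of conversion described above, here is a minimal, self-contained sketch. It is not the code added by this PR: the function names `fullwidthToHalfwidthAscii` and `hiraganaToKatakana` are hypothetical, and the sketch only relies on the fixed code point offsets between the relevant Unicode blocks.

```javascript
// Illustrative sketch only; these helpers are hypothetical and are not
// the implementation merged in this pull request.

// Fullwidth ASCII variants occupy U+FF01..U+FF5E, which sits exactly
// 0xFEE0 above the printable ASCII range U+0021..U+007E.
function fullwidthToHalfwidthAscii(str) {
  return str
    .replace(/[\uFF01-\uFF5E]/g, function (ch) {
      return String.fromCharCode(ch.charCodeAt(0) - 0xFEE0);
    })
    .replace(/\u3000/g, ' '); // ideographic space -> ASCII space
}

// Hiragana U+3041..U+3096 maps onto katakana U+30A1..U+30F6 by adding 0x60.
function hiraganaToKatakana(str) {
  return str.replace(/[\u3041-\u3096]/g, function (ch) {
    return String.fromCharCode(ch.charCodeAt(0) + 0x60);
  });
}
```

Applying conversions like these before tokenizing or stemming means visually different spellings of the same text compare equal. Characters outside the listed ranges are left untouched.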

The normalizer and converters are thoroughly tested, but we should still write tests for the helper functions in lib/natural/util/utils.js.

chrisumbel added a commit that referenced this pull request Aug 21, 2012

@chrisumbel chrisumbel merged commit 60d1f9b into NaturalNode:master Aug 21, 2012




chrisumbel commented Aug 21, 2012

excellent. thanks!
