@wareya wareya released this Apr 6, 2017 · 19 commits to master since this release

  • Fixed a mistake where the part of speech filter wasn't catching proper names.
  • Added a second branch that works off the neologd dictionary instead of the kanaaccent one.

The neologd version uses an indev version of kuromoji.

If your corpus is reasonably large (hundreds of megabytes or larger), you want

If your corpus is small (tens of megabytes or smaller) and contains a high density of proper names (novels, VNs, etc.), you want


EDIT: There was an issue with analyzer.jar in this release. If you downloaded it within the first 10 minutes after the release and have any problems with the part of speech filter, please redownload it.