ngramix = N-gram + Mix: it indexes ideographic characters and ordinary alphabetic words together.
ngramix is a customized N-gram parser plugin for MySQL. It works well for simple Chinese full-text search on small and medium-sized sites.
- It is just a MySQL (5.7+) plugin that you install and enable. No other software needs to be installed or configured.
- It simply builds MySQL full-text indexes. Larger web sites usually choose more specialized solutions such as Lucene or other Chinese tokenizer/indexer combinations.
But if you run a start-up or a small web site, it could be one of the most suitable choices.
- It treats each alphabetic word as a single token to be indexed, rather than as a sequence of letters.
For example, with bigrams (N = 2):
"你好啊dave大帅哥" ==> "你好 好啊 啊dave dave大 大帅 帅哥", plus the word "dave"
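
The tokenization rule in the example can be sketched in a few lines of Python. This is only an illustration of the behavior described here, not the plugin's actual code. The function name `ngramix_tokens` is hypothetical, as is the assumption that a "word" means a run of ASCII letters.

```python
import re

# Assumption: a "unit" is either a whole run of ASCII letters (one word)
# or any single non-space character (e.g. one Chinese character).
_UNIT = re.compile(r"[A-Za-z]+|\S")
_WORD = re.compile(r"[A-Za-z]+")


def ngramix_tokens(text, n=2):
    """Return (n-grams over units, whole alphabetic words) for text."""
    units = _UNIT.findall(text)
    # Slide a window of n units, so a word counts as one position.
    grams = ["".join(units[i:i + n]) for i in range(len(units) - n + 1)]
    # Each alphabetic word is also emitted as a token of its own.
    words = [u for u in units if _WORD.fullmatch(u)]
    return grams, words


if __name__ == "__main__":
    grams, words = ngramix_tokens("你好啊dave大帅哥")
    print(" ".join(grams))
    print(words)
```

Because the window slides over units rather than characters, "dave" is never split into "da", "av", "ve", which is the difference from a plain character-level N-gram parser.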
This project is still under development.
- Pull requests with your improvements are welcome.
- To report a problem or discuss an idea, please open an issue.