NLP toolkit written in C++11
For details, see the wiki or the README file in each directory.
- enumerates the n-grams of a document
- accepts multiple documents
- outputs pairs of frequency and n-gram
- calculates from tf-file
- calculates Acc, Prec, Rec and F1-score
- answer and prediction files are written in an SVM-light-like format
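
As a rough illustration of the n-gram counting and evaluation items in the list above, here is a minimal C++11 sketch. It is not the toolkit's actual code: the names `count_ngrams`, `read_labels`, `evaluate`, and `Metrics` are hypothetical, n-grams are assumed to be over whitespace-separated words, and labels are assumed to be binary, with the label as the first token of each SVM-light style line (any positive value counts as the positive class).

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: count word n-grams across one or more documents.
// Assumes tokens are separated by whitespace.
std::map<std::string, int> count_ngrams(const std::vector<std::string>& docs,
                                        std::size_t n) {
    std::map<std::string, int> freq;
    if (n == 0) return freq;
    for (const std::string& doc : docs) {
        std::istringstream in(doc);
        std::vector<std::string> tokens;
        std::string tok;
        while (in >> tok) tokens.push_back(tok);
        // Slide a window of n tokens; n-grams never cross document boundaries.
        for (std::size_t i = 0; i + n <= tokens.size(); ++i) {
            std::string gram = tokens[i];
            for (std::size_t j = 1; j < n; ++j) gram += " " + tokens[i + j];
            ++freq[gram];
        }
    }
    return freq;
}

// Hypothetical helper: read the leading label of each line, e.g. "+1 1:0.5 3:1".
// Lines that do not start with a number are skipped.
std::vector<int> read_labels(std::istream& in) {
    std::vector<int> labels;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ls(line);
        double label;
        if (ls >> label) labels.push_back(label > 0 ? 1 : -1);
    }
    return labels;
}

struct Metrics {
    double acc, prec, rec, f1;
};

// Compare gold labels with predicted labels (binary case only).
// Ratios with a zero denominator are reported as 0.
Metrics evaluate(const std::vector<int>& answer, const std::vector<int>& predict) {
    std::size_t tp = 0, fp = 0, fn = 0, tn = 0;
    const std::size_t n = std::min(answer.size(), predict.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (answer[i] > 0) {
            if (predict[i] > 0) ++tp; else ++fn;
        } else {
            if (predict[i] > 0) ++fp; else ++tn;
        }
    }
    Metrics m;
    m.acc = n ? static_cast<double>(tp + tn) / n : 0.0;
    m.prec = (tp + fp) ? static_cast<double>(tp) / (tp + fp) : 0.0;
    m.rec = (tp + fn) ? static_cast<double>(tp) / (tp + fn) : 0.0;
    m.f1 = (m.prec + m.rec) > 0 ? 2.0 * m.prec * m.rec / (m.prec + m.rec) : 0.0;
    return m;
}
```

A caller would pass the document strings to `count_ngrams` and print each entry as a frequency followed by its n-gram, and would feed the answer and prediction files through `read_labels` before calling `evaluate`. Using `std::map` keeps the output sorted by n-gram; an unordered map would also work if ordering does not matter.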