A Chinese Natural Language Toolkit

A Chinese word processing toolkit

Chinese documentation


  • Although not the fastest, FoolNLTK is probably the most accurate open-source Chinese word segmenter available
  • Trained on a BiLSTM model
  • High accuracy in word segmentation, part-of-speech tagging, and entity recognition
  • Support for user-defined dictionaries
  • Ability to train your own models
  • Support for batch processing

Getting Started

To download and build FoolNLTK, type:

git clone [repository URL]
cd FoolNLTK/train

For detailed instructions, see the documentation in the train directory.

  • Tested only on Linux with Python 3.


pip install foolnltk

Usage Instructions

Word segmentation
import fool

text = "一个傻子在北京"
# Segment the sentence into words
print(fool.cut(text))
# ['一个', '傻子', '在', '北京']

To segment a file from the command line, run the module as shown below. Pass the -b parameter to increase the number of lines segmented in each batch.

python -m fool [filename]
User-defined dictionary

Each line of the dictionary holds a word followed by its weight, as in the sample below. The higher the weight and the longer the word, the more likely the word is to appear in the segmentation output. Weights should be greater than 1.

难受香菇 10
什么鬼 10
分词工具 10
北京 10
北京天安门 10
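
The format above can be written and checked with a short script. This is a minimal sketch, assuming one space-separated `word weight` entry per line with integer weights, as in the sample. The helper names `write_userdict` and `read_userdict` and the file name `user_dict.txt` are hypothetical and not part of FoolNLTK.

```python
import os
import tempfile


def write_userdict(entries, path):
    """Write (word, weight) pairs, one 'word weight' entry per line."""
    with open(path, "w", encoding="utf-8") as f:
        for word, weight in entries:
            # The README asks for weights greater than 1.
            if weight <= 1:
                raise ValueError(f"weight for {word!r} should be greater than 1")
            f.write(f"{word} {weight}\n")


def read_userdict(path):
    """Parse the file back into a {word: weight} dict (integer weights assumed)."""
    result = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            word, weight = line.rsplit(" ", 1)
            result[word] = int(weight)
    return result


entries = [("难受香菇", 10), ("什么鬼", 10), ("分词工具", 10), ("北京", 10), ("北京天安门", 10)]
path = os.path.join(tempfile.mkdtemp(), "user_dict.txt")  # hypothetical file name
write_userdict(entries, path)
user_dict = read_userdict(path)
print(user_dict)
```

Reading the file back before loading it is a cheap way to catch malformed lines or weights that are too low.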

To load the dictionary:

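import fool

path = "user_dict.txt"  # hypothetical path to a dictionary file in the format above
fool.load_userdict(path)
text = ["我在北京天安门看你难受香菇", "我在北京晒太阳你在非洲看雪"]
print(fool.cut(text))
#[['我', '在', '北京', '天安门', '看', '你', '难受', '香菇'],
# ['我', '在', '北京', '晒太阳', '你', '在', '非洲', '看', '雪']]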

To delete the dictionary:

fool.delete_userdict()

POS tagging
import fool

text = ["一个傻子在北京"]
# Each word is returned together with its part-of-speech tag
print(fool.pos_cut(text))
#[[('一个', 'm'), ('傻子', 'n'), ('在', 'p'), ('北京', 'ns')]]
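
The tagged output is a list of sentences, each a list of (word, tag) pairs, so it can be filtered with plain Python. A minimal sketch, using the documented output above as literal sample data rather than calling the library. The helper name `words_with_tag_prefix` is hypothetical, and treating every tag that starts with "n" as noun-like is an assumption about the tag set.

```python
# Sample data copied from the documented output; not produced by running FoolNLTK here.
tagged = [[("一个", "m"), ("傻子", "n"), ("在", "p"), ("北京", "ns")]]


def words_with_tag_prefix(tagged_sentences, prefix):
    """Keep, per sentence, the words whose tag starts with the given prefix."""
    return [
        [word for word, tag in sentence if tag.startswith(prefix)]
        for sentence in tagged_sentences
    ]


nouns = words_with_tag_prefix(tagged, "n")
print(nouns)  # [['傻子', '北京']]
```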
Entity Recognition
import fool

text = ["一个傻子在北京","你好啊"]
# analysis returns the segmented words and the recognized entities
words, ners = fool.analysis(text)
print(ners)
#[[(5, 8, 'location', '北京')]]
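
Each entity in the output above is a 4-tuple, which makes it easy to collect entities by type. A minimal sketch, using the documented output as literal sample data rather than calling the library. The helper name `group_entities` is hypothetical, and reading the four fields as start offset, end offset, entity type, and entity text is an assumption based on the sample.

```python
from collections import defaultdict

# Sample data copied from the documented output; not produced by running FoolNLTK here.
ners = [[(5, 8, "location", "北京")]]


def group_entities(ner_sentences):
    """Collect entity strings by entity type across all sentences."""
    grouped = defaultdict(list)
    for sentence in ner_sentences:
        # Assumed field order: start offset, end offset, entity type, entity text.
        for _start, _end, label, surface in sentence:
            grouped[label].append(surface)
    return dict(grouped)


entities_by_type = group_entities(ners)
print(entities_by_type)  # {'location': ['北京']}
```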

Versions in Other languages


  • If model files appear to be missing, look for them under sys.prefix, typically /usr/local/.