- Stanford CoreNLP (Java)
- spaCy (Python)
- lingo (Go)
Stemming and lemmatization
- Stemming and lemmatization
- Lemmatization Lists: datasets by MBM
- The UniMorph Project
- Traditional/Simplified Chinese conversion
- Regex tagger
- commonregex, a collection of common regular expressions for Go.
- xurls, a Go package of regular expressions for URLs.
getlang is much slower than franco
- getlang
- franco
- test scripts
- franco: Duration: 5.12s, 26.93%
- getlang: Duration: 11.58s, 59.54%