SciLK (pronounced "silk") is a natural language toolkit created and optimised specifically for text-mining applications in the natural sciences (primarily biology and chemistry). At present, the package is purely experimental and is likely to remain unstable for some time. Stable published models will be stored in separate stale branches until the master branch has matured. The list of such branches:
chemdner-pub - a text tokeniser and chemical named entity recognition model trained on the CHEMDNER corpus (publication pending). Update: we've added Windows support.