Welcome to NLTK-Trainer's documentation!
========================================
NLTK-Trainer is a set of Python command line scripts for natural language processing. With these scripts, you can do the following without writing a single line of code:

- train NLTK-based models
- evaluate pickled models against a corpus
- analyze a corpus
These scripts are compatible with Python 2 and 3 and work with NLTK 2.0.4 and higher.
The scripts can be downloaded from nltk-trainer on GitHub.
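The pickled models mentioned above are ordinary Python pickle files, so they can be loaded back with the standard ``pickle`` module. The following is a minimal sketch of that round trip, not NLTK-Trainer's own code: the ``model`` dictionary and the ``tag`` helper are hypothetical stand-ins for a real trained tagger, and an in-memory buffer takes the place of a ``.pickle`` file on disk.

```python
import io
import pickle

# Hypothetical stand-in for a trained model: a word -> tag lookup table.
# A real pickle produced by the training scripts would instead hold an
# NLTK tagger, chunker, or classifier object.
model = {"the": "DT", "dog": "NN", "barks": "VBZ"}


def tag(lookup, tokens, default="NN"):
    """Tag each token from the lookup table, falling back to a default tag."""
    return [(token, lookup.get(token, default)) for token in tokens]


# Serialize the model to an in-memory buffer (assumption: a real workflow
# would open a .pickle file in binary mode instead).
buffer = io.BytesIO()
pickle.dump(model, buffer)

# Rewind and load the model back, as a consumer of the pickle would.
buffer.seek(0)
restored = pickle.load(buffer)

print(tag(restored, ["the", "cat", "barks"]))
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.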
.. toctree::
   :maxdepth: 2

   train_classifier.rst
   train_tagger.rst
   train_chunker.rst
   analyze_tagged_corpus.rst
   analyze_tagger_coverage.rst

Python 3 Text Processing with NLTK 3 Cookbook contains many examples of training NLTK models with and without NLTK-Trainer.
- Chapter 4 covers part-of-speech tagging and :ref:`train_tagger.py <train_tagger>`.
- Chapter 5 shows how to train phrase chunkers and use :ref:`train_chunker.py <train_chunker>`.
- Chapter 7 demonstrates classifier training and :ref:`train_classifier.py <train_classifier>`.
- Training Binary Classifiers with NLTK Trainer
- Training Part of Speech Taggers with NLTK Trainer
- Analyzing Tagger Corpora and NLTK Part of Speech Taggers
- NLTK Default Tagger Coverage of treebank corpus
- NLTK Default Tagger Coverage of conll2000 corpus
Demos and APIs
--------------
Nearly all the models that power the text-processing.com NLTK demos and NLP APIs have been trained using NLTK-Trainer.