NLTK-Trainer is a set of Python command-line scripts for natural language processing. With these scripts, you can do the following things without writing a single line of code:
- train NLTK-based models
- evaluate pickled models against a corpus
- analyze a corpus
These scripts are Python 2 & 3 compatible and work with NLTK 2.0.4 and higher.
The scripts can be downloaded from nltk-trainer on GitHub.
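Although the scripts themselves need no code, a pickled model they produce can later be loaded from Python. The following is a minimal sketch using only the standard library; the helper name `load_pickled_model` and the example path in the comment are hypothetical and not part of NLTK-Trainer.

```python
import pickle
from pathlib import Path


def load_pickled_model(path):
    """Load a pickled model object from disk (hypothetical helper)."""
    # Only unpickle files you trust: pickle can run arbitrary code on load.
    with Path(path).expanduser().open("rb") as f:
        return pickle.load(f)


# Example usage, assuming a model was saved at this (hypothetical) path:
# model = load_pickled_model("~/my_models/my_classifier.pickle")
```

Unpickling a real model requires NLTK to be installed, because the pickle refers to NLTK classes. What the loaded object can do depends on the kind of model that was trained.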
Documentation for the individual scripts:
- train_classifier.rst
- train_tagger.rst
- train_chunker.rst
- analyze_tagged_corpus.rst
- analyze_tagger_coverage.rst
Python 3 Text Processing with NLTK 3 Cookbook contains many examples for training NLTK models with & without NLTK-Trainer.
- Chapter 4 covers part-of-speech tagging and train_tagger.py.
- Chapter 5 shows how to train phrase chunkers and use train_chunker.py.
- Chapter 7 demonstrates classifier training and train_classifier.py.
- Training Binary Classifiers with NLTK Trainer
- Training Part of Speech Taggers with NLTK Trainer
- Analyzing Tagger Corpora and NLTK Part of Speech Taggers
- NLTK Default Tagger Coverage of treebank corpus
- NLTK Default Tagger Coverage of conll2000 corpus
Nearly all the models that power the text-processing.com NLTK demos and NLP APIs have been trained using NLTK-Trainer.