Hidden Markov Model POS Tagger


This is a part-of-speech tagger written in Python. It models tagging as a Hidden Markov Model and finds the most likely tag sequence with the Viterbi algorithm. It uses the Natural Language Toolkit (NLTK) and trains on Penn Treebank-tagged text files. Accuracy statistics come from ten-fold cross-validation, which compares the tagger's output with the gold-standard tags.
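The approach described above can be illustrated with a small, self-contained sketch. It is not the repository's code: the function names (`train_hmm`, `log_prob`, `viterbi`, `cross_validate`), the add-one smoothing, the `<s>` start marker, and the interleaved fold split are all assumptions made for illustration. Input is assumed to be a list of sentences, each a list of `(word, tag)` pairs.

```python
import math
from collections import defaultdict

def train_hmm(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    transitions = defaultdict(lambda: defaultdict(int))
    emissions = defaultdict(lambda: defaultdict(int))
    for sentence in tagged_sentences:
        previous = "<s>"  # hypothetical sentence-start marker
        for word, tag in sentence:
            transitions[previous][tag] += 1
            emissions[tag][word.lower()] += 1
            previous = tag
    return transitions, emissions

def log_prob(counts, outcome, n_outcomes):
    """Add-one smoothed log probability (the smoothing choice is an assumption)."""
    total = sum(counts.values())
    return math.log((counts.get(outcome, 0) + 1) / (total + n_outcomes))

def viterbi(words, transitions, emissions):
    """Return the most probable tag sequence for a list of words."""
    if not words:
        return []
    tags = sorted(emissions)
    # One extra vocabulary slot is reserved for unseen words.
    vocab_size = len({w for counts in emissions.values() for w in counts}) + 1
    # best[i][tag] = (log probability of the best path ending in tag, previous tag)
    best = [{}]
    for tag in tags:
        score = (log_prob(transitions["<s>"], tag, len(tags))
                 + log_prob(emissions[tag], words[0].lower(), vocab_size))
        best[0][tag] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for tag in tags:
            emit = log_prob(emissions[tag], words[i].lower(), vocab_size)
            best[i][tag] = max(
                (best[i - 1][prev][0]
                 + log_prob(transitions[prev], tag, len(tags)) + emit, prev)
                for prev in tags
            )
    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return list(reversed(path))

def cross_validate(tagged_sentences, folds=10):
    """Return per-fold token accuracy, holding out every folds-th sentence in turn."""
    accuracies = []
    for k in range(folds):
        test = tagged_sentences[k::folds]
        train = [s for i, s in enumerate(tagged_sentences) if i % folds != k]
        if not test or not train:
            continue
        transitions, emissions = train_hmm(train)
        correct = total = 0
        for sentence in test:
            words = [w for w, _ in sentence]
            gold = [t for _, t in sentence]
            predicted = viterbi(words, transitions, emissions)
            correct += sum(p == g for p, g in zip(predicted, gold))
            total += len(gold)
        if total:
            accuracies.append(correct / total)
    return accuracies
```

Working in log space avoids numeric underflow on long sentences. Calling `cross_validate(sentences)` on a tagged corpus returns one accuracy value per fold, which can then be averaged.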


python [--clean]

Pass the --clean option to clean a Treebank file before running the tagger. Cleaning can be time-consuming, so omit the option on later runs once the file has been cleaned.
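The source does not show how the option is read, so the following is only a sketch of one common way to handle such a flag, assuming the standard-library `argparse` module. The function name `parse_args` is hypothetical.

```python
import argparse

def parse_args(argv=None):
    """Parse the command line; --clean is an optional boolean switch."""
    parser = argparse.ArgumentParser(description="Hidden Markov Model POS tagger")
    parser.add_argument(
        "--clean",
        action="store_true",  # False unless the flag is present
        help="clean the Treebank file before running the tagger",
    )
    return parser.parse_args(argv)
```

With `action="store_true"`, the cleaning step runs only when the flag is given, which matches the described behavior of skipping it on later runs.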
