Postag

Overview

Postag is a simple part-of-speech tagger with two modes: a naive baseline and a second-order Hidden Markov Model (HMM).
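For readers unfamiliar with the term, the objective of a second-order (trigram) HMM tagger is usually written as below. This is the textbook formulation, given here as an assumption; postag.py may differ in details such as smoothing, unknown-word handling, or sentence-boundary treatment. Here w_i is the i-th word, t_i its tag, and t_0 and t_{-1} are hypothetical start-of-sentence padding tags.

```latex
\hat{t}_{1:n} \;=\; \arg\max_{t_{1:n}} \; \prod_{i=1}^{n} P(t_i \mid t_{i-2},\, t_{i-1}) \; P(w_i \mid t_i)
```

"Second order" refers to each tag being conditioned on the two preceding tags rather than one.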

Usage

postag.py [mode] [train_data] [test_data]

mode is either baseline, which tags each word with its most common part of speech in the training data, or hmm, which uses a second-order Hidden Markov Model
train_data and test_data must be in the same format as the included train.txt and test.txt
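The baseline mode described above can be illustrated with a minimal sketch. This is not the code in postag.py: the function names train_baseline and tag_baseline, the in-memory (word, tag) pairs, the tag names, and the fallback for unseen words (the overall most frequent tag) are all assumptions made for illustration. The sketch deliberately does not parse train.txt, since its format is not described here.

```python
from collections import Counter, defaultdict

def train_baseline(tagged_words):
    """Build a most-frequent-tag lookup from (word, tag) pairs (hypothetical helper)."""
    per_word = defaultdict(Counter)   # word -> counts of each tag seen with it
    tag_totals = Counter()            # overall tag counts, used for unseen words
    for word, tag in tagged_words:
        per_word[word][tag] += 1
        tag_totals[tag] += 1
    most_common = {w: c.most_common(1)[0][0] for w, c in per_word.items()}
    # Assumption: unseen words fall back to the most frequent tag overall.
    default_tag = tag_totals.most_common(1)[0][0]
    return most_common, default_tag

def tag_baseline(words, model):
    """Tag each word with its most common training tag (hypothetical helper)."""
    most_common, default_tag = model
    return [(w, most_common.get(w, default_tag)) for w in words]

# Toy data with made-up tag names, not the SUSANNE tagset.
training = [
    ("the", "DET"), ("dog", "NOUN"), ("runs", "VERB"),
    ("the", "DET"), ("runs", "NOUN"), ("runs", "VERB"),
    ("cat", "NOUN"),
]
model = train_baseline(training)
print(tag_baseline(["the", "runs", "bird"], model))
```

In this toy data "runs" appears twice as VERB and once as NOUN, so it is tagged VERB, while the unseen word "bird" receives the fallback tag.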

Files

All data is from the SUSANNE corpus, compiled by Geoffrey Sampson et al.

train.txt: Sample training data
test.txt: Sample test data
tagset.txt: Explanation of the part-of-speech tags used
