JakeBurchard/postag

Postag

Overview

Postag is a simple part-of-speech tagger with two modes: a naive baseline mode and a second-order Hidden Markov Model (HMM) mode.
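As a rough illustration of the baseline idea, here is a minimal sketch of a most-frequent-tag tagger. It is not the code in postag.py: the function names, the in-memory `(word, tag)` sentence format, the toy tags, and the fallback for unseen words are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_baseline(tagged_sentences):
    """Learn each word's most frequent tag from (word, tag) sentences."""
    word_tag_counts = defaultdict(Counter)
    tag_totals = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            word_tag_counts[word][tag] += 1
            tag_totals[tag] += 1
    # most_common(1) returns a one-element list of (tag, count).
    best_tag = {w: c.most_common(1)[0][0] for w, c in word_tag_counts.items()}
    # Assumption: unseen words get the overall most frequent tag.
    default_tag = tag_totals.most_common(1)[0][0]
    return best_tag, default_tag


def tag_baseline(words, best_tag, default_tag):
    """Tag each word independently of its neighbours."""
    return [best_tag.get(word, default_tag) for word in words]


# Toy training data with made-up tags (not the SUSANNE tagset).
toy_train = [
    [("the", "DET"), ("dog", "NOUN"), ("can", "VERB"), ("run", "VERB")],
    [("the", "DET"), ("can", "NOUN"), ("fell", "VERB")],
    [("dogs", "NOUN"), ("can", "VERB"), ("bark", "VERB")],
]
best_tag, default_tag = train_baseline(toy_train)
print(tag_baseline(["the", "can", "fell"], best_tag, default_tag))
```

Because every word is tagged in isolation, "can" always receives the tag it carried most often in training, whatever precedes it. That limitation is what a context-aware model is meant to address.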

Usage

postag.py [mode] [train_data] [test_data]

  • mode is either baseline, which tags each word with its most common part of speech in the training data, or hmm, which uses a second-order Hidden Markov Model
  • train_data and test_data must be in the same format as the included train.txt and test.txt
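
For readers unfamiliar with the hmm mode: a second-order HMM conditions each tag on the two preceding tags and each word on its own tag, and decoding is usually done with the Viterbi algorithm. The sketch below shows one way this could look. It is not taken from postag.py: the padding symbols, add-one smoothing, function names, and in-memory `(word, tag)` format are assumptions for illustration only.

```python
import math
from collections import Counter

START, STOP = "<s>", "</s>"  # hypothetical padding symbols


def train_hmm(tagged_sentences):
    """Collect tag-trigram, tag-bigram, and (tag, word) emission counts."""
    trigrams, bigrams, emissions, tag_counts = Counter(), Counter(), Counter(), Counter()
    vocab = set()
    for sentence in tagged_sentences:
        tags = [START, START] + [tag for _, tag in sentence] + [STOP]
        for i in range(2, len(tags)):
            trigrams[(tags[i - 2], tags[i - 1], tags[i])] += 1
            bigrams[(tags[i - 2], tags[i - 1])] += 1
        for word, tag in sentence:
            emissions[(tag, word)] += 1
            tag_counts[tag] += 1
            vocab.add(word)
    return trigrams, bigrams, emissions, tag_counts, vocab


def viterbi(words, model):
    """Return the most probable tag sequence under the trigram model."""
    trigrams, bigrams, emissions, tag_counts, vocab = model
    tags = list(tag_counts)
    n_next = len(tags) + 1    # possible next symbols: every tag plus STOP
    n_words = len(vocab) + 1  # known words plus one slot for unknown words

    def log_trans(u, v, t):
        # Add-one smoothed log P(t | u, v); smoothing choice is an assumption.
        return math.log((trigrams[(u, v, t)] + 1) / (bigrams[(u, v)] + n_next))

    def log_emit(t, w):
        # Add-one smoothed log P(w | t).
        return math.log((emissions[(t, w)] + 1) / (tag_counts[t] + n_words))

    # best[(u, v)]: best log-probability of any path ending in tags u, v.
    best = {(START, START): 0.0}
    backpointers = []
    for word in words:
        new_best, pointers = {}, {}
        for (u, v), score in best.items():
            for t in tags:
                s = score + log_trans(u, v, t) + log_emit(t, word)
                if (v, t) not in new_best or s > new_best[(v, t)]:
                    new_best[(v, t)] = s
                    pointers[(v, t)] = u
        best = new_best
        backpointers.append(pointers)

    # Choose the final state, including the transition into STOP.
    u, v = max(best, key=lambda st: best[st] + log_trans(st[0], st[1], STOP))
    result = []
    for pointers in reversed(backpointers):
        result.append(v)
        u, v = pointers[(u, v)], u  # step back one position
    return result[::-1]


# Toy training data with made-up tags (not the SUSANNE tagset).
toy_train = [
    [("the", "DET"), ("dog", "NOUN"), ("can", "VERB"), ("run", "VERB")],
    [("the", "DET"), ("can", "NOUN"), ("fell", "VERB")],
    [("dogs", "NOUN"), ("can", "VERB"), ("bark", "VERB")],
]
model = train_hmm(toy_train)
print(viterbi(["the", "can", "fell"], model))
```

The dynamic-programming state is the pair of the two most recent tags, so the work per word grows with the cube of the tagset size. With a large tagset, a practical implementation would likely need pruning or a better smoothing scheme than the add-one estimate used here.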

Files

All data is from the SUSANNE corpus, compiled by Geoffrey Sampson et al.

  • train.txt: Sample training data
  • test.txt: Sample test data
  • tagset.txt: Explanation of the part-of-speech tags used
