A Part of Speech Tagger using a Hidden Markov Model

For the Sequence-Tagging final project, we implemented a Hidden Markov Model
(HMM) part-of-speech tagger. We implemented the algorithms from scratch and
have included the code for our system. For a description of the system, see
the report.
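
For readers unfamiliar with HMM tagging, here is a rough illustration of how
such a tagger can pick tags for a sentence. It is NOT taken from src/ and the
report remains the reference for what our system does. The sketch assumes a
bigram HMM decoded with the Viterbi algorithm, with log probabilities already
computed for one sentence. The class name ViterbiSketch, the method decode and
the array layouts are all hypothetical.

```java
// Hypothetical sketch: Viterbi decoding for a bigram HMM tagger.
// Assumes at least one tag and at least one word in the sentence.
class ViterbiSketch {

    /**
     * start[t]    = log P(tag t at the first word)
     * trans[p][t] = log P(tag t | previous tag p)
     * emit[t][i]  = log P(word at position i | tag t)
     * Returns the most probable tag index for each position.
     */
    static int[] decode(double[] start, double[][] trans, double[][] emit) {
        int numTags = start.length;
        int n = emit[0].length;
        double[][] best = new double[n][numTags]; // best log score ending in tag t at position i
        int[][] back = new int[n][numTags];       // previous tag that achieved that score

        for (int t = 0; t < numTags; t++) {
            best[0][t] = start[t] + emit[t][0];
        }
        for (int i = 1; i < n; i++) {
            for (int t = 0; t < numTags; t++) {
                best[i][t] = Double.NEGATIVE_INFINITY;
                for (int p = 0; p < numTags; p++) {
                    double score = best[i - 1][p] + trans[p][t] + emit[t][i];
                    if (score > best[i][t]) {
                        best[i][t] = score;
                        back[i][t] = p;
                    }
                }
            }
        }

        // Pick the best final tag, then follow the back pointers.
        int last = 0;
        for (int t = 1; t < numTags; t++) {
            if (best[n - 1][t] > best[n - 1][last]) {
                last = t;
            }
        }
        int[] path = new int[n];
        path[n - 1] = last;
        for (int i = n - 1; i > 0; i--) {
            path[i - 1] = back[i][path[i]];
        }
        return path;
    }

    public static void main(String[] args) {
        // Toy model with two tags (0 = noun, 1 = verb) and a two-word sentence.
        double[] start = {Math.log(0.8), Math.log(0.2)};
        double[][] trans = {
            {Math.log(0.3), Math.log(0.7)},
            {Math.log(0.6), Math.log(0.4)}
        };
        double[][] emit = {
            {Math.log(0.6), Math.log(0.2)},
            {Math.log(0.4), Math.log(0.8)}
        };
        System.out.println(java.util.Arrays.toString(decode(start, trans, emit)));
        // prints [0, 1]
    }
}
```

Working in log space avoids underflow on long sentences, since the scores are
sums of logs rather than products of small probabilities.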

To run the baseline system (the data must first be placed in the data folder
and be named train.pos and test.pos):
java Baseline
java Scorer
open scoring/score.html

To run the HMM system (the data must first be placed in the data folder
and be named train.pos and test.pos):
java HMM
java Scorer
open scoring/score.html

The file scoring/score.html contains the percent correct as well as a table of
tagging errors. Each row (left) is the correct tag from the test data and each
column (top) is the tag that was chosen, so each box holds the number of times
the left tag was guessed as the top tag. To see the words that contributed to
the number in a particular box, click the box to toggle the word list.
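
The counts in the table can be pictured with the short sketch below. It is an
illustration only and is not the Scorer code: the class name ConfusionSketch,
the methods tally and percentCorrect, and the use of plain tag lists are all
assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a correct-tag by guessed-tag count table.
class ConfusionSketch {

    // Returns counts.get(correctTag).get(guessedTag) = number of tokens with that pair.
    static Map<String, Map<String, Integer>> tally(List<String> correct, List<String> guessed) {
        if (correct.size() != guessed.size()) {
            throw new IllegalArgumentException("tag lists must have the same length");
        }
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (int i = 0; i < correct.size(); i++) {
            counts.computeIfAbsent(correct.get(i), k -> new TreeMap<>())
                  .merge(guessed.get(i), 1, Integer::sum);
        }
        return counts;
    }

    // Percentage of positions where the guessed tag equals the correct tag.
    static double percentCorrect(List<String> correct, List<String> guessed) {
        if (correct.size() != guessed.size()) {
            throw new IllegalArgumentException("tag lists must have the same length");
        }
        int right = 0;
        for (int i = 0; i < correct.size(); i++) {
            if (correct.get(i).equals(guessed.get(i))) {
                right++;
            }
        }
        return correct.isEmpty() ? 0.0 : 100.0 * right / correct.size();
    }

    public static void main(String[] args) {
        List<String> correct = Arrays.asList("DT", "NN", "VBZ", "NN");
        List<String> guessed = Arrays.asList("DT", "NN", "NN", "NN");
        System.out.println(tally(correct, guessed));          // {DT={DT=1}, NN={NN=2}, VBZ={NN=1}}
        System.out.println(percentCorrect(correct, guessed)); // 75.0
    }
}
```

In this toy example the VBZ row has a 1 in the NN column, meaning one VBZ token
was guessed as NN.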

Each score-*.html file is the score file for one particular configuration of
the HMM system. Refer to the report for a description of each configuration.