Part-of-Speech-tagging

In this project, I use the Pomegranate library to build a hidden Markov model (HMM) for part-of-speech tagging with a "universal" tagset. I achieved >96% tag accuracy with larger tagsets on realistic text corpora. The project has three steps.

1. Process the raw text.
2. Build a Most Frequent Class (MFC) tagger to use as a baseline.
3. Build an HMM part-of-speech tagger and compare it to the MFC baseline.
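To make the baseline in step 2 concrete, here is a minimal sketch of a Most Frequent Class tagger in plain Python. It is not the notebook's code: the toy corpus, the function names (`train_mfc`, `tag_mfc`, `accuracy`), and the `"NOUN"` fallback for unseen words are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Toy tagged corpus (hypothetical data, not the project's corpus).
train = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("run", "NOUN"), ("ends", "VERB")],
    [("dogs", "NOUN"), ("run", "VERB")],
    [("they", "PRON"), ("run", "VERB")],
]

def train_mfc(sentences):
    """Map each word to the tag it was seen with most often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

def tag_mfc(table, words, default="NOUN"):
    """Tag each word independently; unseen words get the assumed default tag."""
    return [table.get(word, default) for word in words]

def accuracy(table, sentences):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = total = 0
    for sentence in sentences:
        words = [word for word, _ in sentence]
        gold = [tag for _, tag in sentence]
        predicted = tag_mfc(table, words)
        correct += sum(p == g for p, g in zip(predicted, gold))
        total += len(gold)
    return correct / total

mfc_table = train_mfc(train)
```

Because the tagger ignores context, it always tags "run" as a verb in this toy corpus, even in "the run ends". That kind of error is what the HMM in step 3 is meant to reduce.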

All code is stored in the Jupyter notebook.
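For readers who want the idea behind the HMM tagger without opening the notebook, the sketch below estimates start, transition, and emission counts from a tagged corpus and decodes with the Viterbi algorithm. It deliberately uses plain Python instead of Pomegranate, so it does not reflect the notebook's actual API calls. The toy corpus, the function names, and the add-one smoothing are assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy tagged corpus (hypothetical data, not the project's corpus).
train = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("run", "NOUN"), ("ends", "VERB")],
    [("dogs", "NOUN"), ("run", "VERB")],
    [("they", "PRON"), ("run", "VERB")],
]

def train_hmm(sentences):
    """Count sentence-initial tags, tag-to-tag transitions, and tag-to-word emissions."""
    start, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
    vocab = set()
    for sentence in sentences:
        prev = None
        for word, tag in sentence:
            emit[tag][word] += 1
            vocab.add(word)
            if prev is None:
                start[tag] += 1
            else:
                trans[prev][tag] += 1
            prev = tag
    return start, trans, emit, vocab

def log_prob(counter, key, n_outcomes, alpha=1.0):
    """Add-alpha smoothed log probability (the smoothing is an assumption of this sketch)."""
    total = sum(counter.values())
    return math.log((counter[key] + alpha) / (total + alpha * n_outcomes))

def viterbi(words, model):
    """Return the most likely tag sequence for a list of words."""
    if not words:
        return []
    start, trans, emit, vocab = model
    tags = sorted(emit)
    n_tags, n_words = len(tags), len(vocab) + 1  # +1 reserves mass for unseen words
    best = {t: log_prob(start, t, n_tags) + log_prob(emit[t], words[0], n_words)
            for t in tags}
    back = []
    for word in words[1:]:
        new_best, pointers = {}, {}
        for t in tags:
            # Best previous tag for reaching tag t at this position.
            prev = max(tags, key=lambda p: best[p] + log_prob(trans[p], t, n_tags))
            new_best[t] = (best[prev] + log_prob(trans[prev], t, n_tags)
                           + log_prob(emit[t], word, n_words))
            pointers[t] = prev
        best = new_best
        back.append(pointers)
    # Follow back-pointers from the best final tag.
    path = [max(tags, key=best.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

hmm_model = train_hmm(train)
```

On this toy corpus, `viterbi(["the", "run", "ends"], hmm_model)` tags "run" as a noun because a noun is the most likely tag after a determiner, while a word-by-word most-frequent-class lookup would pick the verb reading.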