Part-of-Speech Tagging Using an HMM
A Hidden Markov Model (HMM) part-of-speech tagger for English, Chinese, and a surprise language (Hindi, per the results below).
The training data are provided tokenized and tagged; the test data are provided tokenized only, and the tagger adds the tags.
The assignment was graded on the tagger's performance, that is, how well it performs on unseen test data compared to a reference tagger.
For more details: http://ron.artstein.org/csci544-2018/coding-1.html
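The train-then-tag flow described above can be illustrated with a minimal sketch. This is not the code from this repository: it assumes a bigram HMM with add-one smoothing and Viterbi decoding, and the names `train`, `log_prob`, `viterbi`, `START`, and `END` are hypothetical choices made for the example.

```python
import math
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence-boundary states


def train(tagged_sentences):
    """Count tag transitions and word emissions from lists of (word, tag) pairs."""
    transitions = defaultdict(Counter)  # previous tag -> counts of next tag
    emissions = defaultdict(Counter)    # tag -> counts of emitted word
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            transitions[prev][tag] += 1
            emissions[tag][word] += 1
            prev = tag
        transitions[prev][END] += 1
    return transitions, emissions


def log_prob(counts, key, n_outcomes, alpha=1.0):
    """Add-alpha smoothed log probability (an assumed smoothing scheme)."""
    total = sum(counts.values())
    return math.log((counts[key] + alpha) / (total + alpha * n_outcomes))


def viterbi(words, transitions, emissions):
    """Return the most probable tag sequence for a tokenized sentence."""
    if not words:
        return []
    tags = list(emissions)
    n_tags = len(tags) + 1  # +1 so END is a possible next state
    # +1 reserves probability mass for words never seen in training
    n_words = len({w for c in emissions.values() for w in c}) + 1

    # best[i][tag] = (log prob of best path ending in tag at i, previous tag)
    best = [{}]
    for tag in tags:
        score = (log_prob(transitions[START], tag, n_tags)
                 + log_prob(emissions[tag], words[0], n_words))
        best[0][tag] = (score, None)

    for i in range(1, len(words)):
        best.append({})
        for tag in tags:
            emit = log_prob(emissions[tag], words[i], n_words)
            best[i][tag] = max(
                (best[i - 1][p][0] + log_prob(transitions[p], tag, n_tags) + emit, p)
                for p in tags
            )

    # Pick the best final tag, including the transition into END
    _, last = max(
        (best[-1][t][0] + log_prob(transitions[t], END, n_tags), t) for t in tags
    )
    # Follow back-pointers from the last position to the first
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return path[::-1]


training = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("dog", "NN"), ("sleeps", "VBZ")],
]
transitions, emissions = train(training)
print(viterbi(["the", "cat", "barks"], transitions, emissions))
```

Working in log space avoids numeric underflow on long sentences. A real tagger would likely need a better treatment of unknown words than the single smoothed slot assumed here.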
Results:
| Language | Baseline | Reference | My Accuracy |
|---|---|---|---|
| English | 0.842365317182 | 0.887910423972 | 0.884882052917 |
| Chinese | 0.838827838828 | 0.869547119547 | 0.873959373959 |
| Hindi | 0.85817104149 | 0.924188540785 | 0.920716906576 |
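The scoring script used for the assignment is not shown here. As a rough sketch, assuming the figures above are token-level accuracy (the fraction of tokens whose predicted tag matches the gold tag), the metric could be computed as follows; the function name `accuracy` and the input layout are hypothetical.

```python
def accuracy(gold, predicted):
    """Fraction of tokens whose predicted tag equals the gold tag.

    Both arguments are lists of sentences, each sentence a list of tags.
    Assumes gold and predicted are aligned sentence by sentence.
    """
    correct = total = 0
    for gold_sentence, predicted_sentence in zip(gold, predicted):
        for gold_tag, predicted_tag in zip(gold_sentence, predicted_sentence):
            correct += gold_tag == predicted_tag
            total += 1
    return correct / total if total else 0.0


gold_tags = [["DT", "NN", "VBZ"], ["DT", "NN"]]
predicted_tags = [["DT", "NN", "NN"], ["DT", "NN"]]
print(accuracy(gold_tags, predicted_tags))
```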