Part of Speech Tagging project from Udacity's NLP nanodegree.

CostaFernando/part-of-speech-tagging

Part of speech tagging

In corpus linguistics, part-of-speech tagging (POS tagging, PoS tagging, or POST), also called grammatical tagging, is the process of marking up each word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context. A simplified form of this is commonly taught to school-age children as the identification of words as nouns, verbs, adjectives, adverbs, etc.

This project uses the Pomegranate library to build a hidden Markov model (HMM) for part-of-speech tagging with a universal tagset. HMMs have achieved over 96% tag accuracy with larger tagsets on realistic text corpora. They have also been used for speech recognition and generation, machine translation, gene recognition in bioinformatics, and human gesture recognition in computer vision, among other applications.
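To illustrate how an HMM assigns tags, here is a minimal sketch in plain Python. It does not use Pomegranate and is not this project's implementation. It estimates start, transition, and emission probabilities from a tiny made-up corpus and decodes the most likely tag sequence with the Viterbi algorithm. The corpus, the function names `train` and `viterbi`, and the add-one smoothing are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus with universal-tagset-style labels (not the project's data).
CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]


def train(corpus, alpha=1.0):
    """Estimate smoothed log-probabilities for an HMM tagger (illustrative helper)."""
    tags = sorted({tag for sent in corpus for _, tag in sent})
    vocab = {word for sent in corpus for word, _ in sent}
    start, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
    for sent in corpus:
        prev = None
        for word, tag in sent:
            emit[tag][word] += 1
            if prev is None:
                start[tag] += 1
            else:
                trans[prev][tag] += 1
            prev = tag
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot for unseen words

    # Add-alpha smoothing keeps every probability non-zero (an assumption of this sketch).
    def log_start(tag):
        return math.log((start[tag] + alpha) / (len(corpus) + alpha * n_tags))

    def log_trans(prev, tag):
        total = sum(trans[prev].values())
        return math.log((trans[prev][tag] + alpha) / (total + alpha * n_tags))

    def log_emit(tag, word):
        total = sum(emit[tag].values())
        return math.log((emit[tag][word] + alpha) / (total + alpha * n_words))

    return tags, log_start, log_trans, log_emit


def viterbi(words, model):
    """Return the most likely tag sequence for `words` under the model."""
    if not words:
        return []
    tags, log_start, log_trans, log_emit = model
    # best[tag] is the log-probability of the best path ending in `tag`.
    best = {t: log_start(t) + log_emit(t, words[0]) for t in tags}
    backpointers = []
    for word in words[1:]:
        new_best, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[p] + log_trans(p, t))
            new_best[t] = best[prev] + log_trans(prev, t) + log_emit(t, word)
            pointers[t] = prev
        best = new_best
        backpointers.append(pointers)
    # Follow the backpointers from the best final tag to recover the path.
    path = [max(tags, key=lambda t: best[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


model = train(CORPUS)
print(viterbi(["the", "dog", "sleeps"], model))  # ['DET', 'NOUN', 'VERB']
```

Working in log space avoids numeric underflow on long sentences, and the smoothing lets the decoder still produce a tag for words it never saw in training.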
