kz-khan/POS-Tagging

POS-Tagging

Part-of-speech tagging of code-mixed Hindi-English text from social media (Facebook comments)

This project aims to achieve high accuracy for PoS tagging of code-mixed language. The data was taken from Amitava Das's website for ICON 2016 (http://www.amitavadas.com/Code-Mixing.html). Because the dataset has few samples and no word vectors were available for Hinglish text, only simple models could be applied.

This repo contains the dataset and the Python scripts I used to preprocess it.

Training and prediction were done using the Java-based MALLET (MAchine Learning for LanguagE Toolkit) library, available at http://mallet.cs.umass.edu/.
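
As a rough illustration of the kind of preprocessing involved, the sketch below converts tagged sentences into the plain-text layout read by MALLET's sequence tagger (`cc.mallet.fst.SimpleTagger`): one token per line as whitespace-separated features followed by the label, with a blank line between sentences. This is not the code from this repo. The function names, the feature set, and the example tags are all assumptions made for illustration, and it is also an assumption that SimpleTagger was the MALLET component used.

```python
def token_features(token):
    """Return simple surface features for one token.

    Hypothetical feature set, chosen only for illustration.
    """
    lower = token.lower()
    feats = ["w=" + lower, "suf3=" + lower[-3:]]
    if token[:1].isupper():
        feats.append("cap")        # token starts with an uppercase letter
    if any(ch.isdigit() for ch in token):
        feats.append("digit")      # token contains at least one digit
    if token.startswith(("@", "#")):
        feats.append("social")     # mention or hashtag
    return feats


def to_simple_tagger(sentences):
    """Format sentences as lines of 'feature ... feature label'.

    `sentences` is a list of sentences, each a list of (token, tag) pairs.
    Tokens are assumed to contain no whitespace. A blank line is written
    after every sentence to mark the sequence boundary.
    """
    lines = []
    for sentence in sentences:
        for token, tag in sentence:
            lines.append(" ".join(token_features(token) + [tag]))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example sentence and tags, not taken from the dataset.
    example = [[("Yeh", "PRON"), ("movie", "NOUN"), ("awesome", "ADJ")]]
    print(to_simple_tagger(example))
```

The resulting text can be saved to a file and passed to MALLET for training. Richer features (for example, a per-token language tag) would be added in `token_features` in the same way.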
