Language Identification and Named Entity Recognition in Hinglish Code Mixed Tweets

Kushagra Singh, Indira Sen, Ponnurangam Kumaraguru
ACL 2018, SRW
Link to paper

The repository contains:
(i) Seq2seq-based transliterator (Roman to Devanagari)
(ii) Language identification tool for Hindi-English code-switched text (English, Hindi, Rest)
(iii) CRF-based Named Entity Recognition tool for Hindi-English code-switched text (Person, Location, Organisation)

Check http://precog.iiitd.edu.in/resources.html for the annotated corpus.

  • Install the dependencies listed in requirements.txt inside a virtualenv.

  • Follow the setup instructions in the README in the transliteration directory.

  • Export the following environment variables before running the demo files:

export TRANSLITERATION_DIR={{path_to_parent_dir}}/hindi-english-code-mixing-lidf-ner/transliteration
export HINGLISH_ROOT_DIR={{path_to_parent_dir}}/hindi-english-code-mixing-lidf-ner
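
A missing or mistyped path in either variable is an easy mistake to make, so it can help to check both before launching a demo. The following is a minimal sketch, not part of the repository: the helper name `check_env` is hypothetical, and it assumes only that both variables should point at existing directories, as the export lines above suggest.

```python
import os
from pathlib import Path

# Variable names taken from the export lines above.
REQUIRED_VARS = ("TRANSLITERATION_DIR", "HINGLISH_ROOT_DIR")


def check_env(environ=None):
    """Return {variable: problem} for every required variable that is
    unset or does not point at a directory. Empty dict means all good.

    Hypothetical helper for illustration; the repository does not ship it.
    """
    environ = os.environ if environ is None else environ
    problems = {}
    for name in REQUIRED_VARS:
        value = environ.get(name)
        if not value:
            problems[name] = "not set"
        elif not Path(value).is_dir():
            problems[name] = f"not a directory: {value}"
    return problems


if __name__ == "__main__":
    # Print one line per problem; print nothing if both variables are fine.
    for name, problem in check_env().items():
        print(f"{name}: {problem}")
```

Run it in the same shell session as the exports; if it prints nothing, both variables are set and resolve to directories.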
