NIT_Agartala_NLP_Team at SemEval-2019 Task 6

System Submission for SemEval Task 6: OffensEval 2019 (https://competitions.codalab.org/competitions/20011)

Abstract: An ensemble (vote-based) classifier for offensive language detection, trained on the OLID dataset (https://scholar.harvard.edu/malmasi/olid). A simple LSTM network is also included to compare performance with deep learning (DL) methods.
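
To illustrate what "vote based" means here, the following is a minimal sketch of hard (majority) voting over the label predictions of several classifiers. It is not taken from proto.py; the function name `majority_vote` and the example labels are assumptions made for illustration only.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    `predictions` is a list of equal-length label lists, one per classifier
    (hypothetical input format). On a tie, the label that appears first in
    classifier order wins.
    """
    combined = []
    # zip(*predictions) groups the votes cast for each sample together.
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        combined.append(counts.most_common(1)[0][0])
    return combined


# Three hypothetical classifiers voting on three tweets.
votes = [
    ["OFF", "NOT", "OFF"],
    ["OFF", "OFF", "NOT"],
    ["NOT", "NOT", "OFF"],
]
final_labels = majority_vote(votes)
```

With an odd number of classifiers and two labels, as in the example, ties cannot occur; with more labels or an even number of voters, the tie-breaking rule matters and should be chosen deliberately.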

Files:

proto.py - Ensemble model approach
LSTM.ipynb - Deep Learning Approach (Rudimentary Model)

Resources Required:

CMU POS Tagger (http://www.cs.cmu.edu/~ark/TweetNLP/)
OLID Training Data (https://scholar.harvard.edu/malmasi/olid)
GloVe Embeddings for the LSTM (the current version uses the Twitter variant: 2B tweets, 27B tokens, 1.2M vocab, uncased, 25d, 50d, 100d, and 200d vectors) (https://nlp.stanford.edu/projects/glove/)
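
GloVe vectors are distributed as plain text, one token per line followed by its space-separated vector components. As a rough sketch of how such a file could be read before feeding an LSTM, assuming that format, the helper below parses lines into a dictionary. The name `load_glove` is hypothetical and is not the loader used in LSTM.ipynb.

```python
def load_glove(lines):
    """Parse GloVe text-format lines into a {word: [floats]} dictionary.

    `lines` can be any iterable of strings, such as an open file handle.
    """
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings


# Usage with a downloaded file (the path is an assumption; adjust as needed):
# with open("glove.twitter.27B.100d.txt", encoding="utf-8") as f:
#     vectors = load_glove(f)
```

Reading the whole vocabulary into memory is the simplest option; for the larger variants it may be worth keeping only the words that occur in the training data.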

Getting the code Ready:

Set the global variables filename and test_filename to your dataset paths
Download ark-tweet-nlp and extract it into the code location (https://bit.ly/33x2WJT). Also download the Python wrapper (https://github.com/ianozsvald/ark-tweet-nlp-python/blob/master/CMUTweetTagger.py), which is used as a library.

Changing Subtasks/Running the code:

Use the terminal command: python3 proto.py q
Replace q with a, b, or c to run the corresponding subtask
Change test_filename when generating predictions for submission
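
The command-line convention above can be sketched as follows. This is an illustrative assumption about how a script might map the subtask letter to a label column, not the actual logic in proto.py; the column names follow the OLID training file header, and `select_subtask` is a hypothetical helper.

```python
# Assumed mapping from the command-line letter to the OLID label column.
SUBTASK_COLUMNS = {"a": "subtask_a", "b": "subtask_b", "c": "subtask_c"}


def select_subtask(argv):
    """Return the label column for the subtask letter given in argv[1]."""
    if len(argv) != 2 or argv[1].lower() not in SUBTASK_COLUMNS:
        # Exit with a usage message if the argument is missing or invalid.
        raise SystemExit("usage: python3 proto.py {a|b|c}")
    return SUBTASK_COLUMNS[argv[1].lower()]


# In a script this would typically be called as select_subtask(sys.argv).
column = select_subtask(["proto.py", "b"])
```

Failing early on an unknown letter avoids training on the wrong label column by accident.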

Detailed System Description: https://www.aclweb.org/anthology/papers/S/S19/S19-2124/
