Sarcasm Detection in Text
Removed reply tweets, tweets containing links or non-ASCII characters, and tweets whose remaining length was < 3. Saved the divided positive and negative DBs as .npy files:
- posproc.npy
- negproc.npy
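The filtering rules above can be sketched roughly as follows. This is a minimal illustration, not the project's actual preprocessing code: the function names `keep_tweet` and `filter_tweets` are hypothetical, a reply is assumed to be any tweet starting with an @mention, and "length" is assumed to mean the number of whitespace-separated tokens (the notes do not say whether it is tokens or characters).

```python
import re

# Matches http(s) links and bare www. links.
URL_RE = re.compile(r"https?://\S+|www\.\S+")


def keep_tweet(text, min_len=3):
    """Return True if a tweet passes the filters described in the notes."""
    text = text.strip()
    # Assumption: a reply tweet is one that starts with an @mention.
    if text.startswith("@"):
        return False
    # Drop tweets that contain a link.
    if URL_RE.search(text):
        return False
    # Drop tweets that contain any non-ASCII character.
    if not text.isascii():
        return False
    # Assumption: length is counted in whitespace-separated tokens.
    return len(text.split()) >= min_len


def filter_tweets(tweets):
    """Return the stripped tweets that survive all filters, in order."""
    return [t.strip() for t in tweets if keep_tweet(t)]
```

The surviving positive and negative lists could then be written out with `numpy.save`, for example `numpy.save("posproc.npy", numpy.array(kept, dtype=str))`, where `kept` is the filtered list.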
- Pilot Model: Naive Bayes with TF-IDF feature vectors.
- BernoulliNB
- MultinomialNB
- Final Model: Used Latent Dirichlet Allocation (LDA) for topic modeling, together with other basic features such as bigrams, for classification.
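The pilot and final models listed above might look something like the sketch below in scikit-learn. It is an illustration under several assumptions rather than the original code: `build_pilot` and `build_final` are hypothetical names, the number of topics is arbitrary, and the notes do not name the classifier used on top of the LDA and bigram features, so logistic regression is swapped in here purely as a placeholder.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB, MultinomialNB
from sklearn.pipeline import FeatureUnion, make_pipeline


def build_pilot(kind="multinomial"):
    """Pilot model: TF-IDF vectors fed to a Naive Bayes classifier."""
    nb = MultinomialNB() if kind == "multinomial" else BernoulliNB()
    return make_pipeline(TfidfVectorizer(), nb)


def build_final(n_topics=10):
    """Final model sketch: LDA topic proportions plus bigram counts."""
    # LDA is fit on raw term counts and outputs per-document topic weights.
    topics = make_pipeline(
        CountVectorizer(),
        LatentDirichletAllocation(n_components=n_topics, random_state=0),
    )
    # Bigram counts as the "other basic features".
    bigrams = CountVectorizer(ngram_range=(2, 2))
    features = FeatureUnion([("topics", topics), ("bigrams", bigrams)])
    # Assumption: the classifier is not named in the notes; this is a stand-in.
    return make_pipeline(features, LogisticRegression(max_iter=1000))
```

Each builder returns an unfitted pipeline, so usage would be along the lines of `model = build_final(); model.fit(texts, labels)`, with `texts` a list of tweet strings and `labels` marking sarcastic versus non-sarcastic tweets.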