
Text Classification Competition: Twitter Sarcasm Detection

Repository Description

  • /BERT-base-uncase-code:
    Everything you need to test the BERT model
  • /CNN_LSTM_DNN:
    Everything you need to test the neural-network-based model
  • /ML:
    Everything you need to test the machine learning model
  • /data:
    Raw Data
  • /resource:
    Supportive documents for neural-network-based model
  • /CS 410 Text Information Systems Course Project Final Report.docx:
    Final Report
  • /answer.txt:
    The best prediction output we produced
  • /slides.pptx:
    Slides, a condensed version of the report

Important Notes (Read before testing the model):

We highly recommend contacting us for a live demo before testing the models on your own. Some prerequisites are hard to meet, which may lead to errors during testing, and the testing process is also highly time-consuming. To better present our results and save your time, please contact us via the following email address:

Or you can watch this live demo.

Thank you for your understanding!

How to test models on your PC?

We built three different models for this competition: a machine learning model, a BERT-based model, and a neural-network model. You can test whichever you like by following the corresponding instructions below.

Machine Learning Model

Pre-requisite:

  • numpy
  • pandas
  • matplotlib
  • seaborn
  • sklearn

Run the script:

  1. Clone the repository to your computer
  2. cd ClassificationCompetition/ML
  3. Run python TFIDF_RandomForests.py (a minimal sketch of the TF-IDF + random forest pipeline is shown after this list)
  4. The output prediction file (answer.txt) can be found in the ClassificationCompetition folder
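
For orientation, the sketch below shows the general shape of a TF-IDF + random forest pipeline like the one used here. The file path, column names, and hyperparameters are illustrative assumptions, not the exact values in TFIDF_RandomForests.py.

```python
# Minimal sketch of a TF-IDF + random forest text classifier.
# File paths, column names, and hyperparameters are assumptions;
# ML/TFIDF_RandomForests.py is the authoritative script.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Assumed JSON-lines training data with a tweet text column ("response")
# and a label column ("label").
train = pd.read_json("../data/train.jsonl", lines=True)
X_train, X_val, y_train, y_val = train_test_split(
    train["response"], train["label"], test_size=0.2, random_state=42
)

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(max_features=10000, ngram_range=(1, 2))),
    ("clf", RandomForestClassifier(n_estimators=300, random_state=42)),
])
pipeline.fit(X_train, y_train)
print("Validation accuracy:", pipeline.score(X_val, y_val))
```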

BERT-base-uncased Model

Pre-requisite:

Run the script:

  1. Clone the BERT-base-uncase-code folder to your local computer (the dataset is already prepared for this model)
  2. Run python BERT_Model.py (if the console prints "Running this sequence through the model will result in indexing errors", this is just a warning, NOT an actual error). A rough sketch of this kind of fine-tuning setup is shown after this list.
  3. The final model files (metrics.pt and model.pt) and the output prediction file (answer.txt) can be found in the result folder
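
As background, here is a rough sketch of how a BERT-base-uncased classifier is typically fine-tuned with PyTorch and the Hugging Face transformers library. The tokenization settings, optimizer, and training step shown are assumptions and may differ from BERT_Model.py.

```python
# Rough sketch of fine-tuning bert-base-uncased for binary sarcasm
# classification. Hyperparameters and data handling are assumptions;
# BERT_Model.py is the authoritative script.
import torch
from torch.optim import AdamW
from transformers import BertForSequenceClassification, BertTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
).to(device)

texts = ["example sarcastic tweet", "example literal tweet"]  # placeholder data
labels = torch.tensor([1, 0]).to(device)

# Tokenize with truncation; overly long inputs are what trigger the
# "indexing errors" warning mentioned above.
batch = tokenizer(texts, padding=True, truncation=True,
                  max_length=128, return_tensors="pt").to(device)

optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)  # one illustrative training step
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()

# Mirrors the model.pt artifact saved in the result folder.
torch.save(model.state_dict(), "result/model.pt")
```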

CNN+LSTM+DNN Model

Pre-requisite:

  • nltk (TweetTokenizer)
  • Keras
  • Tensorflow
  • numpy
  • scipy
  • gensim (if you are using word2vec)
  • itertools
  • sklearn

Run the script:

  1. Clone the repository
  2. Download the following files from this link - https://drive.google.com/drive/folders/0B7C_0ZfEBcpRbDZKelBZTFFsV0E?usp=sharing - into the directory ClassificationCompetition/resource/text_model/weights:
    • GoogleNews-vectors-negative300.bin
    • model.jsonhdf5
    • weights.05__.hdf5
  3. cd ClassificationCompetition/CNN_LSTM_DNN
  4. Run python sarcasm_detection_model_CNN_LSTM_DNN.py (an illustrative sketch of the CNN + LSTM + DNN architecture is shown after this list)
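
For reference, the sketch below illustrates the general CNN + LSTM + DNN architecture in Keras. The vocabulary size, sequence length, and layer sizes are illustrative assumptions rather than the exact configuration of sarcasm_detection_model_CNN_LSTM_DNN.py.

```python
# Illustrative CNN + LSTM + DNN architecture for tweet classification.
# Vocabulary size, sequence length, and layer sizes are assumptions;
# see sarcasm_detection_model_CNN_LSTM_DNN.py for the real configuration.
from tensorflow.keras.layers import (Conv1D, Dense, Dropout, Embedding,
                                     Input, LSTM, MaxPooling1D)
from tensorflow.keras.models import Sequential

vocab_size = 20000  # assumed vocabulary size
max_len = 30        # assumed max tokens per tweet
embed_dim = 300     # matches the GoogleNews word2vec dimensionality

model = Sequential([
    Input(shape=(max_len,)),
    # In the real model the embedding can be initialized from the
    # GoogleNews-vectors-negative300.bin word2vec weights.
    Embedding(vocab_size, embed_dim),
    Conv1D(filters=128, kernel_size=3, activation="relu"),  # CNN feature extractor
    MaxPooling1D(pool_size=2),
    LSTM(128),                       # sequence modelling
    Dense(64, activation="relu"),    # DNN head
    Dropout(0.5),
    Dense(1, activation="sigmoid"),  # binary sarcasm prediction
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```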
