Joint Modeling of Humor and Offense

Authors: Gabriel Brookman, Akshay Gugnani, Nicholas Samoray, Brian Zylich

This repo is for our 685 final project and SemEval-2021 Task 7 (HaHackathon).

Humor classification is one of the hardest problems in Natural Language Understanding. In this project, we explore different pre-trained models and ensembles of them to jointly detect humor and offense.
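As a rough illustration of what ensembling can look like, the sketch below averages per-example probabilities from several classifiers and thresholds the result. It is only an assumption about one simple ensembling scheme, not the method used in this project (the report describes that); the function name `average_ensemble`, the 0.5 threshold, and the toy probabilities are all hypothetical.

```python
import numpy as np


def average_ensemble(prob_list, threshold=0.5):
    """Average per-example probabilities from several models, then threshold.

    prob_list: list of 1-D arrays, one per model, all of the same length.
    Returns (mean probabilities, binary labels).
    """
    # Stack to shape (n_models, n_examples) and average over the model axis.
    probs = np.mean(np.stack(prob_list, axis=0), axis=0)
    labels = (probs >= threshold).astype(int)
    return probs, labels


# Toy "is humorous" probabilities from two hypothetical models.
model_a = np.array([0.9, 0.2, 0.6])
model_b = np.array([0.7, 0.4, 0.3])

ensemble_probs, ensemble_labels = average_ensemble([model_a, model_b])
print(ensemble_probs, ensemble_labels)
```

Simple averaging treats every model equally; weighting models by validation performance is a common variation.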

Our submissions perform strongly on all of the tasks (see the report for details). As of the end of the day on which we submitted the project report, our team ranked in the top 3 for each of the tasks and first for 3 of the 4 tasks, among 21 other competing teams. The leaderboard is available at https://competitions.codalab.org/competitions/27446#results

Folder structure

.
├── notebooks               # Contains all the colab .ipynb used for model training and analysis
├── pipeline                # Contains all the .py files used for data collection and analysis
├── data                    # The data files (.csv) used in our work
│   └── train_split         # Holds the 90-10 train-test split (.csv) files that we use for internal evaluation
├── docs                    # Documentation files - project report
│   └── related_work        # Contains some of the relevant related work
└── README.md
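The 90-10 train-test split stored in data/train_split could be reproduced along the following lines. This is a minimal sketch using pandas, not the script that produced the files in this repo: the helper name `split_train_test`, the random seed, and the column names in the toy data are assumptions for illustration.

```python
import pandas as pd


def split_train_test(df: pd.DataFrame, test_fraction: float = 0.1, seed: int = 42):
    """Shuffle the rows and hold out `test_fraction` of them for testing."""
    shuffled = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    n_test = int(round(len(shuffled) * test_fraction))
    test_df = shuffled.iloc[:n_test]
    train_df = shuffled.iloc[n_test:]
    return train_df, test_df


# Toy data standing in for a real .csv file; the column names are hypothetical.
toy = pd.DataFrame({
    "text": [f"example {i}" for i in range(20)],
    "is_humor": [i % 2 for i in range(20)],
})

train_df, test_df = split_train_test(toy)
print(len(train_df), len(test_df))
```

For a real file, the toy frame would be replaced with `pd.read_csv(...)` on one of the .csv files in data, and each part written back out with `DataFrame.to_csv(..., index=False)`.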

Datasets

HaHackathon dataset (https://competitions.codalab.org/competitions/27446)
200k short texts for humor detection (https://www.kaggle.com/moradnejad/200k-short-texts-for-humor-detection, binary humor/not humor labels)
Hate speech/offensive language/clean speech (https://github.com/t-davidson/hate-speech-and-offensive-language)
Offensive Tweets (https://sites.google.com/site/offensevalsharedtask/olid, labeled as offensive or not, as a targeted or untargeted insult, and by target: individual, group, or other)

Notebooks

The Jupyter notebooks are run on Google Colab; any extra packages they require are installed within the notebooks themselves.

Libraries

PyTorch
Transformers (https://github.com/huggingface/transformers)
spaCy
Pandas
NumPy
