rizavelioglu/hateful_memes-hate_detectron


Hateful Memes Challenge-Team HateDetectron Submissions


Check out the paper on arXiv and my thesis, which offers an in-depth analysis of the approach as well as an overview of multimodal research and its foundations.

This repository contains all the code used for the Hateful Memes Challenge by Facebook AI. There are two main Jupyter notebooks where all the work is done and documented:

  • The 'reproducing results' notebook --> Open In Colab
  • The 'end-to-end' notebook --> Open In Colab

The first notebook is only for reproducing the results of the Phase-2 submissions by team HateDetectron; in other words, it just loads the final models and generates predictions for the test set. See the end-to-end notebook for the whole approach in detail: how the models are trained, how the image features are extracted, which datasets are used, etc.


About the Competition

The Hateful Memes Challenge and Data Set is a competition and open source data set designed to measure progress in multimodal vision-and-language classification.


Competition Results:

We placed 3rd out of 3,173 participants in total!

See the official Leaderboard here!


Repository structure

The repository consists of the following folders:
hyperparameter_sweep/ : contains the scripts for the hyperparameter search.
  • get_27_models.py: iterates through the folders created during the hyperparameter search, collects the metrics (ROC-AUC, accuracy) on the 'dev_unseen' set, and stores them in a pd.DataFrame. Then it sorts the models by AUROC and moves the best 27 models into a generated folder, majority_voting_models/.
  • remove_unused_file.py: removes unused files, e.g. old checkpoints, to free disk space.
  • sweep.py: defines the hyperparameters and starts the search by calling /sweep.sh.
  • sweep.sh: the mmf CLI command that runs training with the defined dataset, parameters, etc.
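The selection step in get_27_models.py can be sketched roughly as follows. This is a minimal sketch, not the script's actual code: `select_top_models` and its dict input are hypothetical stand-ins, and the real script reads the metrics from the sweep output folders and physically moves the winning checkpoints into majority_voting_models/.

```python
def select_top_models(metrics, k=27):
    """Pick the k runs with the highest dev_unseen AUROC, best first.

    metrics: hypothetical mapping of run name -> AUROC score.
    """
    # Sort runs by AUROC in descending order and keep the top k names.
    ranked = sorted(metrics.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]
```

For example, `select_top_models({"run_a": 0.71, "run_b": 0.80, "run_c": 0.75}, k=2)` returns `["run_b", "run_c"]`.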
notebooks/ : where Jupyter notebooks are stored.
  • end2end_process.ipynb: presents the whole approach end-to-end: expanding the data, image feature extraction, hyperparameter search, fine-tuning, and majority voting.
  • reproduce_submissions.ipynb: loads our fine-tuned (final) models and generates predictions.
  • label_memotion.ipynb: uses /utils/label_memotion.py to label memes from Memotion and save them in an appropriate format.
  • simple_model.ipynb: includes a simple multimodal model implementation, also known as 'mid-level concat fusion'. We train the model and generate a submission for the challenge test set.
  • benchmarks.ipynb: reproduces the benchmark results.
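The majority voting used in the end-to-end notebook boils down to taking, for each test meme, the label predicted by most of the 27 selected models. A minimal sketch under that assumption (the function name and the list-of-lists input format are illustrative, not the notebook's actual code):

```python
from collections import Counter

def majority_vote(predictions):
    """predictions: one list of binary labels per model, all the same length.

    Returns the per-sample majority label. With an odd number of models
    (e.g. 27), a binary vote can never tie.
    """
    # zip(*predictions) groups the models' votes sample by sample.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]
```

For example, with three models voting on three samples, `majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])` returns `[1, 0, 1]`.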
utils/ : contains helper scripts, e.g. for labeling the Memotion dataset and merging the two datasets.
  • concat_memotion-hm.py: concatenates the labeled Memotion samples and the Hateful Memes samples and saves them in a new train.jsonl file.
  • generate_submission.sh: generates predictions for the 'test_unseen' set (the Phase-2 test set).
  • label_memotion.jsonl: contains the memes from the Memotion dataset that we labeled.
  • label_memotion.py: the script for labeling the Memotion dataset. It iterates over the samples in Memotion, and the labeler labels each sample by entering 1 or 0 on the keyboard. The labels and the sample metadata are saved at the end as label_memotion.jsonl.
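The merge performed by concat_memotion-hm.py amounts to concatenating two JSON-lines files into a single train.jsonl. A rough sketch under that assumption (`concat_jsonl` is a hypothetical helper, not the script itself):

```python
import json

def concat_jsonl(paths, out_path):
    """Append the records of every input .jsonl file into one output file.

    Returns the number of records written, e.g. for a quick sanity check
    that the merged train.jsonl has the expected size.
    """
    records = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            # Each non-empty line of a .jsonl file is one JSON record.
            records.extend(json.loads(line) for line in f if line.strip())
    with open(out_path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
    return len(records)
```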

Citation:

@article{velioglu2020hateful,
  author = {Velioglu, Riza and Rose, Jewgeni},
  title = {Detecting Hate Speech in Memes Using Multimodal Deep Learning Approaches: Prize-winning solution to Hateful Memes Challenge},
  doi = {https://doi.org/jhb3}, 
  publisher = {arXiv},
  year = {2020}, 
}

Please also consider citing my thesis:

@mastersthesis{velioglu2021detecting,
  title   = "Detecting Hate Speech In Multimodal Memes Using Vision-Language Models",
  author  = "Velioglu, Riza",
  school  = "Bielefeld University",
  year    = "2021",
  url     = "http://rizavelioglu.github.io/files/RizaVelioglu-MScThesis.pdf"
}

