HASOC

This repo contains the code for our solutions submitted to the Forum for Information Retrieval Evaluation (FIRE-2021). Our team, neuro-utmn-thales, participated in two tasks on binary and fine-grained classification of English tweets that contain hate, offensive, and profane content (English Subtasks A & B) and in one task on identifying problematic content in Marathi (Marathi Subtask A). For the English subtasks, we investigate how fine-tuning transformer models on additional hate speech detection corpora affects performance. We also apply a one-vs-rest approach based on Twitter-RoBERTa to discriminate between hate, profane, and offensive posts. Our models ranked third in English Subtask A with an F1-score of 81.99% and second in English Subtask B with an F1-score of 65.77%. For the Marathi task, we propose a system based on Language-Agnostic BERT Sentence Embedding (LaBSE). This model achieved the second-best result in Marathi Subtask A, with an F1-score of 88.08%.
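
The one-vs-rest idea mentioned above can be illustrated with a minimal sketch. It is not the code from this repo: the label names, the `make_keyword_scorer` helper, and the placeholder tokens are all hypothetical, and the toy keyword scorers only stand in for the fine-tuned binary Twitter-RoBERTa classifiers (one per class, each trained to separate its class from the rest). The sketch shows only the final decision step, assuming the class with the highest binary score wins.

```python
from typing import Callable, Dict, Set

# A scorer maps a text to a confidence that the text belongs to "its" class.
# In a real system each scorer would wrap a fine-tuned binary classifier;
# here we use toy keyword scorers (an assumption made for illustration only).
Scorer = Callable[[str], float]


def make_keyword_scorer(keywords: Set[str]) -> Scorer:
    """Build a hypothetical scorer: fraction of tokens found in `keywords`."""

    def score(text: str) -> float:
        tokens = text.lower().split()
        hits = sum(token in keywords for token in tokens)
        return hits / max(len(tokens), 1)

    return score


def one_vs_rest_predict(text: str, scorers: Dict[str, Scorer]) -> str:
    """Run every 'class vs. rest' scorer and return the highest-scoring label."""
    scores = {label: scorer(text) for label, scorer in scorers.items()}
    return max(scores, key=scores.get)


# Placeholder tokens are used instead of real abusive vocabulary.
toy_scorers: Dict[str, Scorer] = {
    "hate": make_keyword_scorer({"hate_token"}),
    "offensive": make_keyword_scorer({"offensive_token"}),
    "profane": make_keyword_scorer({"profane_token"}),
}

prediction = one_vs_rest_predict("this post has a profane_token inside", toy_scorers)
print(prediction)
```

With real models, each scorer would return the positive-class probability of its binary classifier, and the same `max` over scores would select the fine-grained label.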

Colab:

create a new notebook
make your changes
save
File->Save a copy to GitHub
prepend "src/" to the file name, e.g. "src/test.ipynb"

Please cite this paper if you use this method or code:

@article{glazkova2021fine,
  title={Fine-tuning of Pre-trained Transformers for Hate, Offensive, and Profane Content Detection in English and Marathi},
  author={Glazkova, Anna and Kadantsev, Michael and Glazkov, Maksim},
  journal={arXiv preprint arXiv:2110.12687},
  year={2021}
}
