Transformers and Ensemble methods: A solution for Hate Speech Detection in the Arabic language

Description

This repository contains the code for the paper Transformers and Ensemble methods: A solution for Hate Speech Detection in the Arabic language, in which we describe our participation in the CERIST NLP Challenge 2022. The paper will be published in the journal Revue de l'Information Scientifique et Technique. The implementation and the dataset are described in the paper.

Paper Abstract

This paper describes our participation in the shared task of hate speech detection, which is one of the subtasks of the CERIST NLP Challenge 2022. Our experiments evaluate the performance of six transformer models and their combination using two ensemble approaches. The best results on the training set, in a five-fold cross-validation scenario, were obtained with the ensemble approach based on majority voting. Evaluating this approach on the test set resulted in an F1-score of 0.60 and an accuracy of 0.86.
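
To illustrate the majority-vote idea mentioned in the abstract, here is a minimal sketch in Python. It is not the code from this repository: the function name `majority_vote`, the tie-breaking rule (the smallest label wins) and the toy predictions are assumptions made for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample.

    `predictions` holds one list of labels per model; all lists must
    have the same length. Ties are broken by picking the smallest
    label, which is an assumption of this sketch.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    # zip(*predictions) yields, for each sample, the labels given by every model
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        top = max(counts.values())
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


# Toy example: three hypothetical models, three samples (1 = hate speech)
model_outputs = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 0],
]
ensemble_labels = majority_vote(model_outputs)
```

With an even number of models, such as the six used in the paper, ties can occur, so any real implementation needs an explicit tie-breaking rule; the one above is only a placeholder.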

Cite

If you find this article or the code useful in your research, please cite us as:

@article{angelfmp@cerist2022,
  title={Transformers and Ensemble methods: A solution for Hate Speech Detection in Arabic languages},
  author={Magnoss{\~a}o de Paula, Angel Felipe and Bensalem, Imene and Rosso, Paolo and Zaghouani, Wajdi},
  journal={Revue de l'Information Scientifique et Technique},
  year={2023}
}

Credits

CERIST NLP challenge 2022 Organizers

CERIST NLP Challenge website: http://www.nlpchallenge.cerist.dz/

Contact: nlpchallenge@cerist.dz

