Arabic Hate Speech Dataset

This repository accompanies the paper "Hate and Offensive Speech Detection on Arabic Social Media". It contains a hate speech dataset of 5360 annotated Arabic tweets.

Our dataset is composed of two CSV files: a training set (AH-Train.csv) and a test set (AH-Test.csv). They contain the tweet IDs and the annotations described in our paper:

Tweet ID (column: ID)
Binary classification task (column: 2-Class): tweets are classified as Clean (C) vs Offensive/Hate (OH)
3-way classification task (column: 3-Class): tweets are classified as Clean (C) vs Offensive (O) vs Hate (H)
6-way classification task (column: 6-Class): tweets are classified as Clean (C) vs Offensive (O) vs GenderHate (GH) vs ReligiousHate (RH) vs NationalityHate (NH) vs EthnicityHate (EH)
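
As an illustration of how the columns above might be used, here is a minimal sketch that reads one of the CSV files and counts the labels for a chosen task. It assumes the header row uses exactly the column names listed above (ID, 2-Class, 3-Class, 6-Class), that the cells hold the abbreviations (C, O, H, ...), and that the file is comma-separated UTF-8. The helper names `read_rows` and `label_counts` are hypothetical and not part of this repository.

```python
import csv
from collections import Counter

# Label abbreviations as listed in this README.
LABELS = {
    "C": "Clean",
    "O": "Offensive",
    "H": "Hate",
    "OH": "Offensive/Hate",
    "GH": "GenderHate",
    "RH": "ReligiousHate",
    "NH": "NationalityHate",
    "EH": "EthnicityHate",
}

# Annotation columns; the exact header spelling is an assumption.
TASK_COLUMNS = ("2-Class", "3-Class", "6-Class")


def read_rows(path):
    """Read a dataset CSV into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def label_counts(rows, column):
    """Count how often each label occurs in one task column.

    Abbreviations are expanded to full names; unknown codes are kept as-is.
    """
    if column not in TASK_COLUMNS:
        raise ValueError(f"unknown task column: {column!r}")
    counts = Counter()
    for row in rows:
        code = row[column].strip()
        counts[LABELS.get(code, code)] += 1
    return counts
```

For example, `label_counts(read_rows("AH-Train.csv"), "3-Class")` would return a `Counter` keyed by Clean, Offensive, and Hate, provided the file matches the assumptions above.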
