
sbalsefri/ArabicHateSpeechDataset


Arabic Hate Speech Dataset

This repository accompanies the paper "Hate and Offensive Speech Detection on Arabic Social Media". It contains a hate speech dataset of 5360 annotated Arabic tweets.

The dataset consists of two CSV files: a training set (train.csv) and a test set (test.csv). Each file contains the tweet IDs and the annotations described in our paper:

  • Tweet ID (column: ID)
  • Binary classification task (column: 2-Class): tweets are classified as Clean (C) vs Offensive/Hate (OH)
  • 3-way classification task (column: 3-Class): tweets are classified as Clean (C) vs Offensive (O) vs Hate (H)
  • 6-way classification task (column: 6-Class): tweets are classified as Clean (C) vs Offensive (O) vs GenderHate (GH) vs ReligiousHate (RH) vs
    NationalityHate (NH) vs EthnicityHate (EH)
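
The column layout above can be explored with a short script. The following is a minimal sketch, not part of the repository. It assumes the CSV header uses exactly the column names listed above and that cells hold the abbreviated labels (C, OH, O, H, GH, RH, NH, EH). The mapping between label levels is inferred from the descriptions, and the helper names `read_annotations`, `label_counts`, and `inconsistent_rows` are hypothetical. The sample rows are synthetic and use placeholder IDs.

```python
import csv
import io
from collections import Counter

# Assumed mapping between label levels, inferred from the column descriptions.
SIX_TO_THREE = {"C": "C", "O": "O", "GH": "H", "RH": "H", "NH": "H", "EH": "H"}
THREE_TO_TWO = {"C": "C", "O": "OH", "H": "OH"}


def read_annotations(handle):
    """Read annotation rows from an open CSV file into a list of dicts."""
    return list(csv.DictReader(handle))


def label_counts(rows, column):
    """Count how often each label occurs in the given column."""
    return Counter(row[column] for row in rows)


def inconsistent_rows(rows):
    """Return rows whose label columns disagree under the assumed mapping."""
    bad = []
    for row in rows:
        three = SIX_TO_THREE.get(row["6-Class"])
        two = THREE_TO_TWO.get(row["3-Class"])
        if three != row["3-Class"] or two != row["2-Class"]:
            bad.append(row)
    return bad


# Synthetic rows with placeholder IDs, only to show the assumed layout.
sample = io.StringIO(
    "ID,2-Class,3-Class,6-Class\n"
    "1,C,C,C\n"
    "2,OH,O,O\n"
    "3,OH,H,RH\n"
)
rows = read_annotations(sample)
print(label_counts(rows, "3-Class"))
print(len(inconsistent_rows(rows)))

# With the real files (assuming they sit in the working directory):
# with open("train.csv", newline="", encoding="utf-8") as f:
#     train_rows = read_annotations(f)
```

If the actual files use different header names or label spellings, adjust the column keys and the two mapping dictionaries accordingly.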
