Hate and Abusive Speech on Twitter

Repository for the paper "Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior", published at ICWSM 2018. The full text of the paper can be found here. All updates to this public dataset are posted in this repository.

The dataset provided here includes an updated version of the original dataset, with ~100k tweets annotated using the CrowdFlower platform:

  • hatespeech_labels.csv: contains ~100k rows, where each row consists of a unique Tweet ID and its corresponding majority annotation
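
As a quick sanity check after downloading the file, the label distribution can be tallied with a few lines of Python. This is a minimal sketch, not part of the original release. It assumes the file is comma-separated and that the majority annotation sits in the second column. The function name `count_majority_labels` and the parameters `label_column` and `has_header` are hypothetical, so adjust them to the actual layout of the file.

```python
import csv
from collections import Counter
from typing import Iterable


def count_majority_labels(
    lines: Iterable[str],
    label_column: int = 1,    # assumption: the label is in the second column
    has_header: bool = False, # assumption: the file has no header row
) -> Counter:
    """Count how often each majority label occurs in CSV-formatted lines."""
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # discard the header row if there is one
    counts: Counter = Counter()
    for record in reader:
        if len(record) <= label_column:
            continue  # skip blank or truncated rows
        counts[record[label_column].strip()] += 1
    return counts


# Illustrative in-memory rows with made-up IDs, not taken from the dataset.
sample_rows = ["1001,normal", "1002,abusive", "1003,normal"]
for label, count in count_majority_labels(sample_rows).most_common():
    print(label, count)
```

To run it on the real file, pass an open file handle instead of the sample list, for example `open("hatespeech_labels.csv", newline="", encoding="utf-8")`. If the file turns out to have a header row, set `has_header=True` so the header is not counted as a label.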

UPDATE: We have learned that a number of the tweets are no longer available for download from Twitter. Therefore, upon request, we can provide an additional file with the full text of the 100k tweets and their associated majority labels. The tweets are shuffled so that there is no connection between tweet IDs and texts (to comply with Twitter's T&C). To obtain the file, contact the authors by email.

Please cite the paper in any published work that uses any of these resources.

@inproceedings{founta2018large,
    title={Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior},
    author={Founta, Antigoni-Maria and Djouvas, Constantinos and Chatzakou, Despoina and Leontiadis, Ilias and Blackburn, Jeremy and Stringhini, Gianluca and Vakali, Athena and Sirivianos, Michael and Kourtellis, Nicolas},
    booktitle={12th International Conference on Web and Social Media, ICWSM 2018},
    year={2018},
    organization={AAAI Press}
}

For any further questions contact a.m.founta at gmail dot com.
