GUIDELINES-IT.pdf
GUIDELINES.pdf
LICENSE.md
README.md
hate-speech-final.tsv


Italian Twitter Corpus of Hate Speech

Corpus description

This is a Twitter corpus built with the aim of representing and analyzing hate speech against some minority groups in Italy: immigrants in particular, but also Muslims and Roma.

This repository currently contains 1,827 annotated tweets, thoroughly revised by expert annotators. In the meantime, the corpus has also been expanded with new data, annotated partly by experts and partly by CrowdFlower contributors. However, because the resource will be used in an evaluation campaign, we do not plan to make the whole dataset freely available until Fall 2018.

Like the corpus provided by Waseem and Hovy (2016), the corpus released here contains only the tweet IDs and their annotations. The content of each tweet can thus be retrieved by querying the Twitter APIs with the corresponding ID.
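As a rough illustration of the retrieval step, the sketch below reads ID/annotation rows and groups the IDs into batches for a lookup request. It rests on several assumptions that this README does not confirm: that the tweet ID is the first tab-separated column of `hate-speech-final.tsv`, that the remaining columns hold the annotation, and that a batch size of 100 suits the API version in use. The helper names (`read_annotations`, `batches`, `lookup_params`) are hypothetical, and the HTTP call itself is left out because it depends on your API version and credentials.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List


def read_annotations(tsv_text: str) -> Dict[str, List[str]]:
    """Map each tweet ID to its annotation columns.

    Assumption: the ID is the first column; rows whose first
    field is not numeric (e.g. a header row) are skipped.
    """
    rows = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return {row[0]: row[1:] for row in rows if row and row[0].isdigit()}


def batches(ids: Iterable[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` IDs."""
    id_list = list(ids)
    for start in range(0, len(id_list), size):
        yield id_list[start:start + size]


def lookup_params(batch: List[str]) -> Dict[str, str]:
    """Build query parameters for one lookup request.

    Assumption: the endpoint accepts a comma-separated ID list;
    the parameter name "ids" is illustrative only.
    """
    return {"ids": ",".join(batch)}
```

To use it, read the file with `open("hate-speech-final.tsv", encoding="utf-8").read()`, pass the text to `read_annotations`, and send one request per batch produced by `batches`. Tweets that have been deleted or made private since annotation will not be returned, so the retrieved set may be smaller than the annotated one.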

The corpus development forms part of the Hate Speech Monitoring program coordinated by the Computer Science Department of the University of Turin (Italy).

References

If you use the resource, please cite:

@InProceedings{SanguinettiEtAlLREC2018,
  author    = {Manuela Sanguinetti and Fabio Poletto and Cristina Bosco and Viviana Patti and Marco Stranisci},
  title     = {An Italian Twitter Corpus of Hate Speech against Immigrants},
  booktitle = {Proceedings of the 11th Conference on Language Resources and Evaluation (LREC 2018)},
  month     = {May},
  year      = {2018},
  address   = {Miyazaki, Japan},
  publisher = {},
  pages     = {2798--2895},
  url       = {}
}

Other references:

Poletto F., Stranisci M., Sanguinetti M., Patti V., Bosco C. (2017) Hate speech annotation: Analysis of an Italian Twitter corpus. In: Proceedings of the 4th Italian Conference on Computational Linguistics (CLiC-it 2017), Rome, Italy.

Acknowledgements

The work is funded by Progetto di Ateneo/CSP 2016 (Immigrants, Hate and Prejudice in Social Media, project S1618_L2_BOSC_01) and by Fondazione CRT (Hate Speech and Social Media, project n. 2016.0688).
