Data set for LREC 2020 paper "I Feel Offended, Don't Be Abusive!"

JelenaMitrovic/AbuseEval

 
 

AbuseEval

Data set for LREC 2020 paper "I Feel Offended, Don't Be Abusive!"

The repository is structured as follows:

  • data/ : enriched versions of the OffensEval/OLID dataset, with explicit/implicit labels for offensive messages (./data/offenseval_explicit_implicit) and the newly proposed annotations of abusive messages (./data/abuseval_labels)
  • dictionary-based_experiments/ : the script to replicate the dictionary experiments reported in the paper (OffensEval sub-task A and AbuseEval binary classification)
  • keywords/ : the top 50 keywords per class (offensive and not offensive messages) from the OffensEval training and test data for sub-task A
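As an illustration of what a dictionary-based binary classifier can look like, here is a minimal sketch. It is not the script shipped in dictionary-based_experiments/. The lexicon, the example messages, the tokenizer, and the function names (`tokenize`, `predict`, `macro_f1`) are all hypothetical, and macro-averaged F1 is used here only as an example metric.

```python
import re

# Hypothetical toy lexicon for illustration only; it is NOT one of the
# dictionaries used in the paper or stored in this repository.
TOY_LEXICON = {"idiot", "stupid", "moron"}

TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Lowercase the message and split it into alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


def predict(text, lexicon=TOY_LEXICON):
    """Return 1 if any token of the message is in the lexicon, else 0."""
    return int(any(tok in lexicon for tok in tokenize(text)))


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores for labels 0 and 1."""
    scores = []
    for label in (0, 1):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Invented example messages with invented gold labels (1 = positive class).
messages = [
    "You are an idiot",
    "Have a nice day",
    "That plan is STUPID.",
    "People like you should not be allowed to vote",
]
gold = [1, 0, 1, 1]
pred = [predict(m) for m in messages]
score = macro_f1(gold, pred)
```

The last message contains no lexicon entry, so a lookup of this kind labels it 0 even though its invented gold label is 1. This is the kind of implicit message that a purely dictionary-based approach cannot detect.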

OLID/OffensEval Data: https://competitions.codalab.org/competitions/20011

Data Statement (Bender and Friedman, 2018)

The annotation of the explicit/implicit labels in OffensEval was conducted by two annotators, a man (38, Italian) and a woman (39, Serbian), both highly educated, with a background in computational linguistics, and familiar with Twitter.

The inter-annotator agreement study for AbuseEval was conducted by three annotators: one man (38, Italian) and two women (39, Serbian; 23, Russian), all highly educated, with a background in computational linguistics, and familiar with Twitter. The full annotation of AbuseEval was conducted by one annotator (23, Russian), highly educated and with a background in computational linguistics.

All ages refer to the time of annotation: 2019.

References

@inproceedings{zampierietal2019, 
    title={{Predicting the Type and Target of Offensive Posts in Social Media}}, 
    author={Zampieri, Marcos and Malmasi, Shervin and Nakov, Preslav and Rosenthal, Sara and Farra, Noura and Kumar, Ritesh}, 
    booktitle={Proceedings of NAACL}, 
    year={2019}
} 

@inproceedings{casellietal2020, 
    title={{I Feel Offended, Don’t Be Abusive! Implicit/Explicit Messages in Offensive and Abusive Language}}, 
    author={Caselli, Tommaso and Basile, Valerio and Mitrovi\'{c}, Jelena and Kartoziya, Inga and Granitzer, Michael}, 
    booktitle={Proceedings of LREC}, 
    year={2020}
} 

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
