AmitDasRup123/HateBiasNet

HateBiasNet

Overview

HateBiasNet is a dataset of 3003 tweets collected from Twitter. The tweets were annotated for hate speech by three speech-language pathology graduate students. The dataset is intended for researchers and practitioners working on hate speech detection and natural language processing.

Source: Twitter
Length: 3003 tweets
Annotators: Three speech-language pathology graduate students

The data was also annotated by four large language models (GPT 3.5, GPT 4o, Gemma, and Llama) for the following 11 biases: Asian, Black, Female, Mental Disability, Muslim, Physical Disability, No Disability, Not Asian, Not Black, Not Female, and Not Muslim.
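As a rough illustration of how the human and LLM annotations might be compared, the sketch below computes, for each (model, bias) pair, the fraction of tweets where the LLM label matches the human label. This README does not describe the file layout, so the record fields (`tweet_id`, `human`, `model`, `bias`, `label`) and the toy values are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict

# Hypothetical toy records: field names and values are assumptions for
# illustration only, not the actual HateBiasNet schema or data.
records = [
    {"tweet_id": 1, "human": 1, "model": "GPT 4o", "bias": "Asian", "label": 1},
    {"tweet_id": 2, "human": 0, "model": "GPT 4o", "bias": "Asian", "label": 0},
    {"tweet_id": 1, "human": 1, "model": "GPT 4o", "bias": "Not Asian", "label": 0},
    {"tweet_id": 2, "human": 0, "model": "GPT 4o", "bias": "Not Asian", "label": 0},
    {"tweet_id": 1, "human": 1, "model": "Llama", "bias": "Asian", "label": 1},
    {"tweet_id": 2, "human": 0, "model": "Llama", "bias": "Asian", "label": 1},
]


def agreement_by_bias(rows):
    """Return {(model, bias): share of rows where the LLM label equals the human label}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        key = (row["model"], row["bias"])
        totals[key] += 1
        hits[key] += int(row["label"] == row["human"])
    return {key: hits[key] / totals[key] for key in totals}


rates = agreement_by_bias(records)
for (model, bias), rate in sorted(rates.items()):
    print(f"{model:8s} {bias:10s} agreement={rate:.2f}")
```

Comparing the rates for a bias and its counterpart (for example, Asian versus Not Asian) within one model is one simple way to look for shifts in labeling. The associated paper may use a different analysis.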

Paper

For a detailed investigation of the HateBiasNet dataset, refer to the associated paper: Investigating Annotator Bias in Large Language Models for Hate Speech Detection.

Citation

If you use this dataset in your research, please cite the following paper:

@article{das2024investigating,
  title={Investigating Annotator Bias in Large Language Models for Hate Speech Detection},
  author={Das, Amit and Zhang, Zheng and Jamshidi, Fatemeh and Jain, Vinija and Chadha, Aman and Raychawdhary, Nilanjana and Sandage, Mary and Pope, Lauramarie and Dozier, Gerry and Seals, Cheryl},
  journal={arXiv preprint arXiv:2406.11109},
  year={2024}
}

License: CC BY

Contact Information: adas@una.edu
