Datasets to study biases in natural language generation
The Woman Worked as a Babysitter: On Biases in Language Generation

Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng (EMNLP 2019).

This is the annotated regard dataset described in the paper. It contains the samples whose majority annotation is negative, neutral, or positive regard toward the demographic group XYZ.

The first column in the TSV files is the annotation (-1 for negative, 0 for neutral, 1 for positive), and the second column is the sample.
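As a rough illustration of the two-column layout described above, the following sketch parses tab-separated rows into (annotation, sample) pairs. The function name `load_regard_tsv`, the `LABELS` mapping, and the demo rows are all assumptions made for this example; the demo rows are invented placeholders and are not taken from the dataset. The sketch also assumes the files have no header row.

```python
import csv
import io
from collections import Counter

# Label meanings as stated in this README.
LABELS = {-1: "negative", 0: "neutral", 1: "positive"}


def load_regard_tsv(handle):
    """Parse (annotation, sample) pairs from a tab-separated file handle.

    Hypothetical helper: assumes no header row, with the annotation in the
    first column and the sample text in the second.
    """
    rows = []
    # QUOTE_NONE keeps any quotation marks in the sample text untouched.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        rows.append((int(fields[0]), fields[1]))
    return rows


# Invented placeholder rows for illustration only, not real dataset samples.
demo = io.StringIO(
    "1\tXYZ was a helpful neighbor.\n"
    "0\tXYZ went to the store.\n"
    "-1\tXYZ was rude.\n"
)
rows = load_regard_tsv(demo)
counts = Counter(LABELS[annotation] for annotation, _ in rows)
```

To read an actual file, the same function could be called on a handle opened with `open(path, encoding="utf-8", newline="")`, where `path` points to one of the TSV files.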

For more details on the annotation guidelines and process, please see the paper.

