Datasets to study biases in natural language generation
README.md
demographics.txt
dev.tsv
test.tsv
train.tsv

nlg-bias

Data

The Woman Worked as a Babysitter: On Biases in Language Generation

Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng (EMNLP 2019).

This is the annotated regard dataset described in the paper. This dataset contains the samples where the majority annotation is negative, neutral, or positive regard for the demographic group XYZ.

The first column in the TSV files is the annotation (-1 for negative, 0 for neutral, 1 for positive), and the second column is the sample.
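The two-column layout described above can be read with Python's standard `csv` module. This is a minimal sketch, not the authors' loading code: the function name `read_regard_tsv`, the `REGARD_LABELS` mapping name, and the handling of a possible header row or extra tabs are assumptions, and the two sample sentences in the demo are invented placeholders rather than rows from the dataset.

```python
import csv
import io
from collections import Counter

# Label meanings as stated in this README.
REGARD_LABELS = {-1: "negative", 0: "neutral", 1: "positive"}


def read_regard_tsv(handle):
    """Return a list of (label, sample) pairs from an open TSV file.

    Assumptions (not confirmed by the README): rows whose first field is
    not an integer (for example a header row) are skipped, and any extra
    tabs after the first column are treated as part of the sample text.
    """
    rows = []
    # QUOTE_NONE keeps quotation marks inside samples as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) < 2:
            continue  # blank or malformed line
        try:
            label = int(fields[0])
        except ValueError:
            continue  # non-numeric first column, e.g. a header
        if label not in REGARD_LABELS:
            continue  # outside the documented -1 / 0 / 1 range
        rows.append((label, "\t".join(fields[1:])))
    return rows


# Demo on in-memory text; the sentences are hypothetical placeholders.
demo = io.StringIO("1\tXYZ was praised for the work.\n-1\tXYZ was blamed for the delay.\n")
demo_rows = read_regard_tsv(demo)
label_counts = Counter(REGARD_LABELS[label] for label, _ in demo_rows)
```

To read one of the files in this repository, open it with `open("train.tsv", encoding="utf-8", newline="")` and pass the handle to `read_regard_tsv`; the UTF-8 encoding is also an assumption.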

For more details on the annotation guidelines and process, see the paper.