
“Fifty Shades of Bias”: Normative Ratings of Gender Bias in GPT Generated English Text

Language serves as a powerful tool for the manifestation of societal belief systems. In doing so, it also perpetuates the prevalent biases in our society. Gender bias is one of the most pervasive biases in our society and is seen in online and offline discourses. With LLMs increasingly gaining human-like fluency in text generation, gaining a nuanced understanding of the biases these systems can generate is imperative. Prior work often treats gender bias as a binary classification task. However, acknowledging that bias must be perceived on a relative scale, we investigate the generation of, and the receptivity of manual annotators to, bias of varying degrees. Specifically, we create the first dataset of GPT-generated English text with normative ratings of gender bias. Ratings were obtained using Best–Worst Scaling, an efficient comparative annotation framework. Next, we systematically analyze the variation of themes of gender biases in the observed ranking and show that identity-attack is most closely related to gender bias. Finally, we show how existing automated models trained on related concepts perform on our dataset.

The paper, “Fifty Shades of Bias”: Normative Ratings of Gender Bias in GPT Generated English Text, can be found at https://aclanthology.org/2023.emnlp-main.115.

Our online talk at EMNLP’23 can be found here.

If you use our work, please cite us:

@inproceedings{hada-etal-2023-fifty,
    title = "{``}Fifty Shades of Bias{''}: Normative Ratings of Gender Bias in {GPT} Generated {E}nglish Text",
    author = "Hada, Rishav  and
      Seth, Agrima  and
      Diddee, Harshita  and
      Bali, Kalika",
    editor = "Bouamor, Houda  and
      Pino, Juan  and
      Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.115",
    doi = "10.18653/v1/2023.emnlp-main.115",
    pages = "1862--1876",
    abstract = "Language serves as a powerful tool for the manifestation of societal belief systems. In doing so, it also perpetuates the prevalent biases in our society. Gender bias is one of the most pervasive biases in our society and is seen in online and offline discourses. With LLMs increasingly gaining human-like fluency in text generation, gaining a nuanced understanding of the biases these systems can generate is imperative. Prior work often treats gender bias as a binary classification task. However, acknowledging that bias must be perceived at a relative scale; we investigate the generation and consequent receptivity of manual annotators to bias of varying degrees. Specifically, we create the first dataset of GPT-generated English text with normative ratings of gender bias. Ratings were obtained using Best{--}Worst Scaling {--} an efficient comparative annotation framework. Next, we systematically analyze the variation of themes of gender biases in the observed ranking and show that identity-attack is most closely related to gender bias. Finally, we show the performance of existing automated models trained on related concepts on our dataset.",
}

This repository contains the dataset “Fifty Shades of Bias” (FSB), along with the code for GPT generation, scoring, and reasoning. The repository is structured as follows:

.
├── CODE_OF_CONDUCT.md
├── LICENSE
├── README.md
├── SECURITY.md
├── SUPPORT.md
├── data
│   ├── FSB
│   │   ├── FSB_text.csv
│   │   ├── fsb-tuples-annotations.csv
│   │   └── fsb_final_scores.csv
│   ├── in_context_examples
│   │   ├── explicit_completion_ic.json
│   │   ├── explicit_conversion_ic.json
│   │   ├── implicit_completion_ic.json
│   │   ├── implicit_conversion_ic.json
│   │   ├── neutral_completion_ic.json
│   │   └── neutral_conversion_ic.json
│   └── seeds
│       ├── explict_bias_seed.txt
│       ├── implicit_bias_seed.txt
│       └── neutral_bias_seed.txt
├── index.html
├── requirements.txt
├── scripts
│   ├── generate_biased_sentences.py
│   ├── gpt_reasoning.py
│   ├── gpt_scoring.py
│   └── utils.py
└── FSB_Underline.pdf
  • data contains the FSB dataset, the aggregate scores and individual annotations, the seed sets, and the in-context examples. Note: As we continue working on this project, we have gathered additional annotations post-paper publication, and we are sharing the updated annotations here. A sketch of how aggregate scores can be derived from the per-tuple annotations follows this list.
  • scripts contains the code for GPT generation, scoring, and reasoning.
  • requirements.txt lists the dependencies for running the code in this repository. They can be installed with: pip install -r requirements.txt
  • index.html contains the code for the annotation task interface.
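
For reference, the standard Best–Worst Scaling counting procedure aggregates per-tuple best/worst judgments into real-valued scores: each item is scored as the fraction of times annotators chose it as most biased minus the fraction of times they chose it as least biased. A minimal sketch follows; the field names (items, best, worst) are illustrative and not the actual schema of fsb-tuples-annotations.csv:

# Best-Worst Scaling (BWS) counting:
#   score(item) = (#chosen best - #chosen worst) / #appearances in tuples.
# The annotation layout below is illustrative, NOT the schema of
# fsb-tuples-annotations.csv.
from collections import Counter

def bws_scores(annotations):
    """annotations: dicts with 'items' (the tuple shown to an annotator),
    'best' (item judged most biased), and 'worst' (least biased)."""
    best, worst, seen = Counter(), Counter(), Counter()
    for ann in annotations:
        seen.update(ann["items"])
        best[ann["best"]] += 1
        worst[ann["worst"]] += 1
    # Scores lie in [-1, 1]; higher means judged more biased more often.
    return {item: (best[item] - worst[item]) / seen[item] for item in seen}

# Toy usage with three annotated 4-tuples:
toy = [
    {"items": ["s1", "s2", "s3", "s4"], "best": "s1", "worst": "s4"},
    {"items": ["s1", "s2", "s3", "s5"], "best": "s2", "worst": "s5"},
    {"items": ["s1", "s3", "s4", "s5"], "best": "s1", "worst": "s4"},
]
print(bws_scores(toy))  # e.g. s1 -> (2-0)/3, s4 -> (0-2)/2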

Datasets Used

Running the scripts

Requirements:

  • GPT-3.5-Turbo access is required to run this code.
  • Place your API version, type, and base in the utils.py file (a hypothetical sketch of this configuration follows this list).
  • Place your GPT API key in a text file and pass its path as an argument when running the scripts.
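
As an illustration, the configuration expected in utils.py might look like the following, assuming the Azure-style settings of the legacy openai Python SDK; the endpoint and version values are placeholders, and the actual variable names in utils.py may differ:

# Hypothetical sketch of the API configuration in scripts/utils.py, assuming
# the legacy openai SDK's Azure settings; all values below are placeholders.
import openai

openai.api_type = "azure"                                    # API type
openai.api_base = "https://YOUR-RESOURCE.openai.azure.com/"  # API base
openai.api_version = "2023-05-15"                            # API version

def load_key(keypath):
    """Read the GPT API key from the plain-text file passed via --keypath.
    (This helper is illustrative; the repository's own loader may differ.)"""
    with open(keypath) as f:
        return f.read().strip()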

The scripts for generating biased sentences and for obtaining reasons or scores can be run as follows (a concrete example invocation is shown after the list):

  • generate_biased_sentences.py
python -m scripts.generate_biased_sentences --keypath PATH_TO_GPT_KEY --seed_dataset_name FILENAME_FOR_GENERATED_TEXT --task TYPE_OF_PROMPT --ic_file INCONTEXT_EXAMPLES_FILE --queries_file SEED_SENTENCE_FILE
  • gpt_reasoning.py
python -m scripts.gpt_reasoning --keypath PATH_TO_GPT_KEY --queries_file FILE_WITH_SENTENCE_AND_SCORE
  • gpt_scoring.py
python -m scripts.gpt_scoring --keypath PATH_TO_GPT_KEY --queries_file FILE_WITH_SENTENCE
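
For instance, a generation run over the implicit-bias seed set might look like the following. The in-context-example and seed files come from the repository tree above; the key-file name, output name, and --task value are illustrative, not values confirmed by the scripts:

python -m scripts.generate_biased_sentences \
    --keypath gpt_key.txt \
    --seed_dataset_name implicit_generations \
    --task completion \
    --ic_file data/in_context_examples/implicit_completion_ic.json \
    --queries_file data/seeds/implicit_bias_seed.txt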

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
