
“Fifty Shades of Bias”: Normative Ratings of Gender Bias in GPT Generated English Text

Language serves as a powerful tool for the manifestation of societal belief systems. In doing so, it also perpetuates the prevalent biases in our society. Gender bias is one of the most pervasive biases in our society and is seen in online and offline discourses. With LLMs increasingly gaining human-like fluency in text generation, gaining a nuanced understanding of the biases these systems can generate is imperative. Prior work often treats gender bias as a binary classification task. However, acknowledging that bias must be perceived on a relative scale, we investigate the generation of, and the receptivity of manual annotators to, bias of varying degrees. Specifically, we create the first dataset of GPT-generated English text with normative ratings of gender bias. Ratings were obtained using Best–Worst Scaling, an efficient comparative annotation framework. Next, we systematically analyze the variation of themes of gender biases in the observed ranking and show that identity-attack is most closely related to gender bias. Finally, we show how existing automated models trained on related concepts perform on our dataset.

The paper, “Fifty Shades of Bias”: Normative Ratings of Gender Bias in GPT Generated English Text, can be found at https://aclanthology.org/2023.emnlp-main.115.

Our online talk at EMNLP’23 can be found here.

If you use our work, please cite us:

@inproceedings{hada-etal-2023-fifty,
    title = "{``}Fifty Shades of Bias{''}: Normative Ratings of Gender Bias in {GPT} Generated {E}nglish Text",
    author = "Hada, Rishav  and
      Seth, Agrima  and
      Diddee, Harshita  and
      Bali, Kalika",
    editor = "Bouamor, Houda  and
      Pino, Juan  and
      Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.115",
    doi = "10.18653/v1/2023.emnlp-main.115",
    pages = "1862--1876",
    abstract = "Language serves as a powerful tool for the manifestation of societal belief systems. In doing so, it also perpetuates the prevalent biases in our society. Gender bias is one of the most pervasive biases in our society and is seen in online and offline discourses. With LLMs increasingly gaining human-like fluency in text generation, gaining a nuanced understanding of the biases these systems can generate is imperative. Prior work often treats gender bias as a binary classification task. However, acknowledging that bias must be perceived at a relative scale; we investigate the generation and consequent receptivity of manual annotators to bias of varying degrees. Specifically, we create the first dataset of GPT-generated English text with normative ratings of gender bias. Ratings were obtained using Best{--}Worst Scaling {--} an efficient comparative annotation framework. Next, we systematically analyze the variation of themes of gender biases in the observed ranking and show that identity-attack is most closely related to gender bias. Finally, we show the performance of existing automated models trained on related concepts on our dataset.",
}

This repository contains the dataset “Fifty Shades of Bias” (FSB), along with the code for GPT generation, scoring, and reasoning. The repository is structured as follows:

.
├── CODE_OF_CONDUCT.md
├── LICENSE
├── README.md
├── SECURITY.md
├── SUPPORT.md
├── data
│   ├── FSB
│   │   ├── FSB_text.csv
│   │   ├── fsb-tuples-annotations.csv
│   │   └── fsb_final_scores.csv
│   ├── in_context_examples
│   │   ├── explicit_completion_ic.json
│   │   ├── explicit_conversion_ic.json
│   │   ├── implicit_completion_ic.json
│   │   ├── implicit_conversion_ic.json
│   │   ├── neutral_completion_ic.json
│   │   └── neutral_conversion_ic.json
│   └── seeds
│       ├── explict_bias_seed.txt
│       ├── implicit_bias_seed.txt
│       └── neutral_bias_seed.txt
├── index.html
├── requirements.txt
├── scripts
│   ├── generate_biased_sentences.py
│   ├── gpt_reasoning.py
│   ├── gpt_scoring.py
│   └── utils.py
└── FSB_Underline.pdf
  • data contains the FSB dataset, the aggregate scores and individual annotations, the seed sets, and the in-context examples. Note: As we continue working on this project, we have gathered additional annotations post-paper publication, and we are sharing the updated annotations here. A sketch of how aggregate scores can be derived from the per-tuple annotations follows this list.
  • scripts contains the code for GPT generation, scoring, and reasoning.
  • requirements.txt lists the dependencies for running the code in this repository. They can be installed with: pip install -r requirements.txt
  • index.html contains the code for the annotation task interface.
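
For reference, the standard Best–Worst Scaling counting procedure aggregates per-tuple best/worst judgments into real-valued scores: each item is scored as the fraction of times annotators chose it as most biased minus the fraction of times they chose it as least biased. A minimal sketch follows; the field names (items, best, worst) are illustrative and not the actual schema of fsb-tuples-annotations.csv:

# Best-Worst Scaling (BWS) counting:
#   score(item) = (#chosen best - #chosen worst) / #appearances in tuples.
# The annotation layout below is illustrative, NOT the schema of
# fsb-tuples-annotations.csv.
from collections import Counter

def bws_scores(annotations):
    """annotations: dicts with 'items' (the tuple shown to an annotator),
    'best' (item judged most biased), and 'worst' (least biased)."""
    best, worst, seen = Counter(), Counter(), Counter()
    for ann in annotations:
        seen.update(ann["items"])
        best[ann["best"]] += 1
        worst[ann["worst"]] += 1
    # Scores lie in [-1, 1]; higher means judged more biased more often.
    return {item: (best[item] - worst[item]) / seen[item] for item in seen}

# Toy usage with three annotated 4-tuples:
toy = [
    {"items": ["s1", "s2", "s3", "s4"], "best": "s1", "worst": "s4"},
    {"items": ["s1", "s2", "s3", "s5"], "best": "s2", "worst": "s5"},
    {"items": ["s1", "s3", "s4", "s5"], "best": "s1", "worst": "s4"},
]
print(bws_scores(toy))  # e.g. s1 -> (2-0)/3, s4 -> (0-2)/2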

Datasets Used

Running the scripts

Requirements:

  • GPT-3.5-Turbo access is required to run this code.
  • Place your API version, type, and base in the utils.py file (a hypothetical sketch of this configuration follows this list).
  • Place your GPT API key in a text file and pass its path as an argument when running the scripts.
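
As an illustration, the configuration expected in utils.py might look like the following, assuming the Azure-style settings of the legacy openai Python SDK; the endpoint and version values are placeholders, and the actual variable names in utils.py may differ:

# Hypothetical sketch of the API configuration in scripts/utils.py, assuming
# the legacy openai SDK's Azure settings; all values below are placeholders.
import openai

openai.api_type = "azure"                                    # API type
openai.api_base = "https://YOUR-RESOURCE.openai.azure.com/"  # API base
openai.api_version = "2023-05-15"                            # API version

def load_key(keypath):
    """Read the GPT API key from the plain-text file passed via --keypath.
    (This helper is illustrative; the repository's own loader may differ.)"""
    with open(keypath) as f:
        return f.read().strip()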

The scripts for generating biased sentences and for obtaining reasons or scores can be run as follows (a concrete example invocation is shown after the list):

  • generate_biased_sentences.py
python -m scripts.generate_biased_sentences --keypath PATH_TO_GPT_KEY --seed_dataset_name FILENAME_FOR_GENERATED_TEXT --task TYPE_OF_PROMPT --ic_file INCONTEXT_EXAMPLES_FILE --queries_file SEED_SENTENCE_FILE
  • gpt_reasoning.py
python -m scripts.gpt_reasoning --keypath PATH_TO_GPT_KEY --queries_file FILE_WITH_SENTENCE_AND_SCORE
  • gpt_scoring.py
python -m scripts.gpt_scoring --keypath PATH_TO_GPT_KEY --queries_file FILE_WITH_SENTENCE
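
For instance, a generation run over the implicit-bias seed set might look like the following. The in-context-example and seed files come from the repository tree above; the key-file name, output name, and --task value are illustrative, not values confirmed by the scripts:

python -m scripts.generate_biased_sentences \
    --keypath gpt_key.txt \
    --seed_dataset_name implicit_generations \
    --task completion \
    --ic_file data/in_context_examples/implicit_completion_ic.json \
    --queries_file data/seeds/implicit_bias_seed.txt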

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
