FairPrism

This repository contains the FairPrism dataset, introduced in "FairPrism: Evaluating Fairness-Related Harms in Text Generation." The dataset consists of 5,000 examples of AI-generated text annotated for the harms that the text can cause.

Dataset Access

To request dataset access, please email fairprism@microsoft.com with your name, affiliation, and a brief description of your use case. Specifically, please address:

  • Will FairPrism be used as part of training data for mitigation methods? Directly using FairPrism to train classifiers for mitigating fairness-related harms prevents it from being useful as a measurement instrument.

  • Will FairPrism be used as a benchmark to be beaten? If AI systems are repeatedly trained to improve on any single aggregate metric calculated using FairPrism, this will result in overfitting to the dataset, which will make the dataset less useful for measurement and may also result in a false sense of complete coverage.

  • Are there specific applications you are considering? FairPrism contains examples of text generated in both reply scenarios and continuation scenarios. It will therefore be less effective for applications that are further removed from these scenarios and for applications that are highly specific. FairPrism is also less well suited to measuring harms that are unrelated to gender and sexuality, that are specific to countries other than the U.S. and Canada, or that occur in languages other than English.

Any additional details you could provide would be helpful.

Additional Information

For further information about privacy and this dataset, please see the Microsoft Privacy Statement.

To cite this dataset, please use:

@inproceedings{fleisig-etal-2023-fairprism,
  title = "{FairPrism}: Evaluating Fairness-Related Harms in Text Generation",
  author = "Fleisig, Eve and
    Amstutz, Audrie and
    Atalla, Chad and
    Blodgett, Su Lin and
    Daum{\'e} III, Hal and
    Olteanu, Alexandra and
    Sheng, Emily and
    Vann, Dan and
    Wallach, Hanna",
  booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics",
  month = jul,
  year = "2023",
  address = "Toronto, Canada",
  publisher = "Association for Computational Linguistics",
  url = "https://github.com/microsoft/FairPrism/blob/main/FairPrism_paper.pdf"
}
