
Generalized Fairness Metrics

This repository contains the source code for the paper:

Quantifying Social Biases in NLP: A Generalization and Empirical Comparison of Extrinsic Fairness Metrics
Paula Czarnowska, Yogarshi Vyas, Kashif Shah
Transactions of the Association for Computational Linguistics (TACL), 2021

Reproducing classification experiments:

  1. Change the MODELSDIR variable in get_predictions.sh and the OUTDIR variable in run_experiment.sh to the directory where your models are (or will be) saved. A condensed sketch of the full command sequence is given after these steps.

  2. Change the CUDA variable in setup.sh to match your installed CUDA version.

  3. Run setup.sh to:

    • fetch the required submodules
    • create and activate a new environment named btools from requirements.yml
    • download the SemEval valence classification data
  4. Train the models from the config files in the experiments directory:

    ./run_experiment.sh train=1 DATASET=semeval-2 exp=experiments/roberta.jsonnet
    ./run_experiment.sh train=1 DATASET=semeval-3 exp=experiments/roberta.jsonnet

  5. Create the test suites and test the models. The result plots are saved in the plots directory:

    conda activate btools
    python3 reproduce.py --classification --create-tests
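
For orientation, the whole classification workflow condenses to the commands below. This is a sketch, not part of the repository: the paths and the CUDA version are placeholders, and the sed calls assume that MODELSDIR, OUTDIR and CUDA are plain VAR=value assignments at the start of a line in the respective scripts; edit the files by hand if your copies differ.

    # Steps 1-2: point the scripts at local paths and the installed CUDA version
    # (all three values below are placeholders -- substitute your own).
    sed -i 's|^MODELSDIR=.*|MODELSDIR=$HOME/models|' get_predictions.sh
    sed -i 's|^OUTDIR=.*|OUTDIR=$HOME/outputs|' run_experiment.sh
    sed -i 's|^CUDA=.*|CUDA=11.1|' setup.sh

    # Step 3: fetch submodules, create the btools environment, download the SemEval data.
    ./setup.sh

    # Step 4: train the two valence classifiers.
    ./run_experiment.sh train=1 DATASET=semeval-2 exp=experiments/roberta.jsonnet
    ./run_experiment.sh train=1 DATASET=semeval-3 exp=experiments/roberta.jsonnet

    # Step 5: build the test suites, evaluate the models, and write plots to plots/.
    conda activate btools
    python3 reproduce.py --classification --create-tests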

Reproducing NER experiments:

  1. Run the setup steps (1 and 2 above).

  2. Get the CoNLL-2003 data (https://www.clips.uantwerpen.be/conll2003/ner/) and place the eng.train, eng.testa and eng.testb files in the datasets/conll2003/ner directory.

  3. Train the model:

    ./run_experiment.sh train=1 DATASET=conll2003 exp=experiments/ner-roberta.jsonnet

  4. Test the trained model (a condensed command sketch of the NER steps follows this list):

    python3 reproduce.py --ner

    or, if you haven't created the test suites yet:

    python3 reproduce.py --ner --create-tests
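
As above, the NER workflow condenses to the commands below. This sketch assumes the setup steps have already been run, that the btools environment exists, and that the three CoNLL-2003 files have been downloaded into the current directory.

    # Step 2: place the CoNLL-2003 splits where the configs expect them.
    mkdir -p datasets/conll2003/ner
    cp eng.train eng.testa eng.testb datasets/conll2003/ner/

    # Step 3: train the NER model.
    ./run_experiment.sh train=1 DATASET=conll2003 exp=experiments/ner-roberta.jsonnet

    # Step 4: build the test suites (first run only) and evaluate.
    conda activate btools
    python3 reproduce.py --ner --create-tests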

Metric implementations:

Implementations of all metrics can be found in expanded_checklist/checklist/tests.
The code for generalized metrics is located in expanded_checklist/checklist/tests/abstract_tests/generalized_metrics.py.

Acknowledgements

The code in the expanded_checklist directory is a restructured and expanded version of the repository

https://github.com/marcotcr/checklist

which contains the code for behavioral testing of NLP models, as described in the following paper:

Beyond Accuracy: Behavioral Testing of NLP models with CheckList
Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, Sameer Singh
Association for Computational Linguistics (ACL), 2020

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.