NOTE: This is not a complete recreation of the INLP algorithm, but only a toy example of applying a null projection to one layer of the encoder. The bias can still be relearned through the other preceding layers.
This repository contains code demonstrating a simple application of Nullspace Projection on an NLP attention-based Transformer.
The code is largely taken from pliang279's LM_bias repository and relies heavily on shauli-ravfogel's nullspace projection repository.
Requires Python 3.6.7.
- You will need to clone the nullspace projection repository into this directory before running.
This repo can be accessed primarily via nullspace_bert_demonstration.ipynb. This notebook contains cells that install dependencies and generate the appropriate resources. It also runs the code described below.
The module get_bias_sensitive_tokens.py uses our predefined gender-defining terms to construct a gender bias subspace using PCA. It then takes the highest-variance principal component and uses it to discover bias-sensitive words in our vocabulary. These words are printed to the console, and their corresponding embeddings are saved for later use.
You can run this script with the command:
python get_bias_sensitive_tokens.py
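The bias-subspace step above can be sketched roughly as follows. This is an illustrative toy, not the repository's actual code: the embeddings are random stand-ins for real BERT vectors, and names like `pairs` and `vocab` are invented for the example.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
dim = 4  # toy dimensionality; BERT embeddings are 768-d

# Toy embeddings for gender-defining word pairs (stand-ins for real vectors).
pairs = {
    ("he", "she"): (rng.normal(size=dim), rng.normal(size=dim)),
    ("man", "woman"): (rng.normal(size=dim), rng.normal(size=dim)),
    ("king", "queen"): (rng.normal(size=dim), rng.normal(size=dim)),
}

# Center each pair about its mean and stack the differences,
# following the Bolukbasi et al. construction of a bias subspace.
diffs = []
for a, b in pairs.values():
    center = (a + b) / 2
    diffs.append(a - center)
    diffs.append(b - center)

pca = PCA(n_components=1)
pca.fit(np.stack(diffs))
bias_direction = pca.components_[0]  # highest-variance principal component

# A word is "bias sensitive" if its embedding projects strongly
# onto this direction.
vocab = {w: rng.normal(size=dim) for w in ["nurse", "engineer", "table"]}
scores = {w: abs(v @ bias_direction) for w, v in vocab.items()}
print(sorted(scores, key=scores.get, reverse=True))
```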
The module context_nullspace_projection.py takes the previously discovered embeddings of bias-sensitive words and uses their bias direction as the label in a classification task. It iteratively trains several classifiers on the data, each time generating a projection, P_i, that removes the information the classifier's weights used to linearly separate the embeddings with respect to gender.
You can run this script with the command:
python context_nullspace_projection.py
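The iterative step above can be sketched in a few lines. This is a minimal INLP-style illustration on synthetic data, not the repository's script: the data, labels, and iteration count are invented, and the real code operates on BERT embeddings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def nullspace_projection(W):
    """P = I - W^+ W projects onto the nullspace of the rows of W."""
    rowspace = np.linalg.pinv(W) @ W      # projection onto W's row space
    return np.eye(W.shape[1]) - rowspace

rng = np.random.default_rng(0)
n, dim = 200, 10
X = rng.normal(size=(n, dim))
# Synthetic "gender" label carried almost entirely by dimension 0.
y = (X[:, 0] + 0.1 * rng.normal(size=n) > 0).astype(int)

P = np.eye(dim)
for _ in range(3):
    clf = LogisticRegression().fit(X, y)
    P_i = nullspace_projection(clf.coef_)  # guards the direction clf found
    P = P_i @ P                            # compose projections across iterations
    X = X @ P_i                            # remove that direction from the data

# After projection, a fresh classifier should be near chance accuracy.
print(LogisticRegression().fit(X, y).score(X, y))
```

Note that each P_i annihilates only the direction the current classifier found, which is why the procedure is iterated: a new classifier may still find a different separating direction in the remaining subspace.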
- Lipstick on a Pig: Debiasing Methods Cover Up Systematic Gender Biases in Word Embeddings But Do Not Remove Them. Hila Gonen and Yoav Goldberg.
- Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection. Shauli Ravfogel, Yanai Elazar, Hila Gonen, Michael Twiton and Yoav Goldberg.
- Towards Understanding and Mitigating Social Biases in Language Models. Paul Pu Liang, Chiyu Wu, Louis-Philippe Morency and Ruslan Salakhutdinov. ICML 2021.
- Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama and Adam Kalai.
- Investigating Gender Bias in BERT. Rishabh Bhardwaj, Navonil Majumder and Soujanya Poria.
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.