NOTE: This is not a complete recreation of the INLP algorithm, but only a toy example of applying a null projection to one layer of the encoder. The bias can still be relearned through the other preceding layers.
This repository contains code demonstrating a simple application of Nullspace Projection on an NLP attention-based Transformer.
The code is largely taken from pliang279's LM_bias repository and relies heavily on shauli-ravfogel's nullspace projection repository.
Requires Python 3.6.7.
- You will need to clone the nullspace projection repository into this directory before running.
This repo can be accessed primarily via nullspace_bert_demonstration.ipynb. This notebook contains cells that install dependencies and generate the appropriate resources. It also runs the code described below.
The module get_bias_sensitive_tokens.py uses our predefined gender-defining terms to construct a gender bias subspace using PCA. It then takes the highest-variance principal component and uses it to discover bias-sensitive words in our vocabulary. These words are printed to the console, and their corresponding embeddings are saved for later use.
You can run this script with the command:
python get_bias_sensitive_tokens.py
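The bias-subspace step above can be sketched roughly as follows. This is an illustrative toy, not the repository's actual code: the embeddings are random stand-ins for real BERT vectors, and names like `pairs` and `vocab` are invented for the example.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
dim = 4  # toy dimensionality; BERT embeddings are 768-d

# Toy embeddings for gender-defining word pairs (stand-ins for real vectors).
pairs = {
    ("he", "she"): (rng.normal(size=dim), rng.normal(size=dim)),
    ("man", "woman"): (rng.normal(size=dim), rng.normal(size=dim)),
    ("king", "queen"): (rng.normal(size=dim), rng.normal(size=dim)),
}

# Center each pair about its mean and stack the differences,
# following the Bolukbasi et al. construction of a bias subspace.
diffs = []
for a, b in pairs.values():
    center = (a + b) / 2
    diffs.append(a - center)
    diffs.append(b - center)

pca = PCA(n_components=1)
pca.fit(np.stack(diffs))
bias_direction = pca.components_[0]  # highest-variance principal component

# A word is "bias sensitive" if its embedding projects strongly
# onto this direction.
vocab = {w: rng.normal(size=dim) for w in ["nurse", "engineer", "table"]}
scores = {w: abs(v @ bias_direction) for w, v in vocab.items()}
print(sorted(scores, key=scores.get, reverse=True))
```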
The module context_nullspace_projection.py takes the previously discovered embeddings of bias-sensitive words and uses their bias direction as the label in a classification task. It iteratively trains several classifiers on the data, each time generating a projection, P_i, that removes the information the classifier's weights used to linearly separate the embeddings with respect to gender.
You can run this script with the command:
python context_nullspace_projection.py
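The iterative step above can be sketched in a few lines. This is a minimal INLP-style illustration on synthetic data, not the repository's script: the data, labels, and iteration count are invented, and the real code operates on BERT embeddings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def nullspace_projection(W):
    """P = I - W^+ W projects onto the nullspace of the rows of W."""
    rowspace = np.linalg.pinv(W) @ W      # projection onto W's row space
    return np.eye(W.shape[1]) - rowspace

rng = np.random.default_rng(0)
n, dim = 200, 10
X = rng.normal(size=(n, dim))
# Synthetic "gender" label carried almost entirely by dimension 0.
y = (X[:, 0] + 0.1 * rng.normal(size=n) > 0).astype(int)

P = np.eye(dim)
for _ in range(3):
    clf = LogisticRegression().fit(X, y)
    P_i = nullspace_projection(clf.coef_)  # guards the direction clf found
    P = P_i @ P                            # compose projections across iterations
    X = X @ P_i                            # remove that direction from the data

# After projection, a fresh classifier should be near chance accuracy.
print(LogisticRegression().fit(X, y).score(X, y))
```

Note that each P_i annihilates only the direction the current classifier found, which is why the procedure is iterated: a new classifier may still find a different separating direction in the remaining subspace.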
- Lipstick on a Pig: Debiasing Methods Cover Up Systematic Gender Biases in Word Embeddings But Do Not Remove Them. Hila Gonen and Yoav Goldberg.
- Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection. Shauli Ravfogel, Yanai Elazar, Hila Gonen, Michael Twiton and Yoav Goldberg.
- Towards Understanding and Mitigating Social Biases in Language Models. Paul Pu Liang, Chiyu Wu, Louis-Philippe Morency and Ruslan Salakhutdinov. ICML 2021.
- Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama and Adam Kalai.
- Investigating Gender Bias in BERT. Rishabh Bhardwaj, Navonil Majumder and Soujanya Poria.
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.