BenAjayiObe/pydata-bias-in-bert

An analysis of Societal Bias in SOTA NLP Transfer Learning

NOTE: This is not a complete recreation of the Iterative Nullspace Projection (INLP) algorithm, only a toy example of applying a nullspace projection to one layer of the encoder. The bias can still be relearnt through the other preceding layers.

This repository contains code demonstrating a simplistic application of nullspace projection to an attention-based NLP transformer.

The code is largely taken from pliang279's LM_bias repository and relies heavily on shauli-ravfogel's nullspace projection repository.

Instructions

Prerequisites:

The main entry point to this repo is nullspace_bert_demonstration.ipynb. This notebook contains cells that install the dependencies and generate the required resources. It also runs the code described below.


Discovering Gender Bias Sensitive Tokens

The module get_bias_sensitive_tokens.py uses our predefined gender-defining terms to construct a gender bias subspace using PCA. It then takes the highest-variance principal component and uses it to discover bias-sensitive words in our vocabulary. These words are printed to the console and their corresponding embeddings are saved for future use.

You can run this script with the command:

  • python get_bias_sensitive_tokens.py
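The idea behind this step can be sketched with NumPy alone. This is a minimal illustration on synthetic vectors, not the code in get_bias_sensitive_tokens.py: the function names (bias_direction, most_bias_sensitive), the pair-centring scheme, and the toy vocabulary are all assumptions made for the example.

```python
import numpy as np

def bias_direction(pairs):
    """Top principal component of pair-centred embeddings.

    pairs: array of shape (n_pairs, 2, dim), one row per gender-defining
    pair (hypothetical layout, chosen for this sketch).
    """
    pairs = np.asarray(pairs, dtype=float)
    # Centre each pair on its own midpoint so only the within-pair
    # difference remains.
    centred = pairs - pairs.mean(axis=1, keepdims=True)
    centred = centred.reshape(-1, pairs.shape[-1])
    # PCA via SVD: rows of vt are principal components, sorted by variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[0]

def most_bias_sensitive(vocab, embeddings, direction, k=2):
    """Rank words by the magnitude of their projection onto the direction."""
    scores = embeddings @ direction
    order = np.argsort(-np.abs(scores))[:k]
    return [(vocab[i], float(scores[i])) for i in order]

# Synthetic data: every pair differs only along axis 0.
rng = np.random.default_rng(0)
dim = 8
true_axis = np.zeros(dim)
true_axis[0] = 1.0
bases = rng.normal(size=(5, dim))
pairs = np.stack([bases + true_axis, bases - true_axis], axis=1)

direction = bias_direction(pairs)

# Toy vocabulary with placeholder words (not real model embeddings).
vocab = ["word_a", "word_b", "word_c"]
embeddings = np.zeros((3, dim))
embeddings[0, 0] = 0.9   # strongly aligned with the bias axis
embeddings[1, 0] = -0.7  # aligned in the opposite sense
embeddings[2, 1] = 0.8   # orthogonal to the bias axis

top = most_bias_sensitive(vocab, embeddings, direction, k=2)
print(top)
```

The sign of a principal component is arbitrary, so the ranking uses the absolute projection; the sign of each score only indicates which side of the axis a word falls on.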

Discovering a NullSpace Projection

The module context_nullspace_projection.py takes the previously discovered embeddings of bias-sensitive words and uses their bias direction as the label in a classification task.

It iteratively trains several classifiers on the data, each time generating a projection, P_i, that removes the information the classifier's weights use to linearly separate the embeddings with respect to gender.

You can run this script with the command:

  • python context_nullspace_projection.py
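The iterative procedure can be illustrated with a small NumPy sketch on synthetic data. It is not the implementation in context_nullspace_projection.py: a least-squares linear classifier stands in for whatever classifier the script trains, and every function name and parameter here is hypothetical.

```python
import numpy as np

def fit_linear_classifier(X, y):
    """Least-squares linear classifier (a stand-in); y takes values -1/+1."""
    Xc = X - X.mean(axis=0)
    w, *_ = np.linalg.lstsq(Xc, y - y.mean(), rcond=None)
    return w

def nullspace_projection(directions):
    """Projection onto the orthogonal complement of span(directions)."""
    W = np.atleast_2d(directions)      # (n_directions, dim)
    Q, _ = np.linalg.qr(W.T)           # orthonormal basis, (dim, n_directions)
    return np.eye(W.shape[1]) - Q @ Q.T

def iterative_nullspace_projection(X, y, n_iters=3):
    """Repeatedly fit a classifier, then project its weight direction out."""
    P = np.eye(X.shape[1])
    directions = []
    for _ in range(n_iters):
        w = fit_linear_classifier(X @ P, y)
        if np.linalg.norm(w) < 1e-8:   # nothing linear left to remove
            break
        directions.append(w)
        # Rebuild P from all directions found so far.
        P = nullspace_projection(np.array(directions))
    return P

def accuracy(X, y):
    """Training accuracy of a freshly fitted linear classifier."""
    w = fit_linear_classifier(X, y)
    scores = (X - X.mean(axis=0)) @ w
    return float(np.mean(np.sign(scores) == y))

# Synthetic embeddings: the label is linearly encoded along axis 0.
rng = np.random.default_rng(0)
y = np.repeat([-1.0, 1.0], 100)
X = rng.normal(size=(200, 6))
X[:, 0] += 2.0 * y

P = iterative_nullspace_projection(X, y, n_iters=3)
acc_before = accuracy(X, y)
acc_after = accuracy(X @ P, y)
print(f"accuracy before: {acc_before:.2f}, after: {acc_after:.2f}")
```

Building P from an orthonormal basis of all collected weight vectors, rather than multiplying the individual P_i together, keeps the result an exact orthogonal projection. As the note at the top of this README says, applying such a projection at a single layer does not stop a model from recovering the information elsewhere.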

Papers

About

This repo contains code for use in the PyData Global 2021 talk, An analysis of Societal Bias in SOTA NLP Transfer Learning.
