Measuring Bias in BERT Context Embeddings

Author: Samantha Dobesh

Date: Feb 7th. 2022

Based on: Unmasking Contextual Stereotypes

GitHub: https://github.com/marionbartl/gender-bias-BERT

Outline

Learning objectives

Learn how BERT internalizes unintended gender biases from training.
Learn how how we can fine tune a layer to correct these biases using Counterfactual Data Substitution (CDS) on the Gender Ambiguous Pronoun (GAP) corpus.
Consider the wider impact of biases on protected groups, like racial or gender minorities, elders, and people with disabilities, and how we can cultivate our datasets to support and protect these people.

Activities

Supplementary reading on embeddings and transformers. Unsure how much of this will be review for students.
Interactive python portion.
- Learn the individual steps of how we measure a single embeddings bias scores.
Batch evaluate biases.
Fine tune layer to address issues? NOTE: Possibly extra credit? This part is pretty involved. Suitable for a final project.

Supplementary Material

Context embedding vs static embedding: Why do we want context embeddings?

Static embeddings are how a model processes a word on its own. Context embeddings show how a model processes a string of words to generate context. This is very important for detecting bias as this is highly context dependent. Words like 'queer', 'disabled', or 'black' can have very different implications based on who is using them and in reference to what. Contextual embeddings help capture this relationship.

From Static Embedding to Contextualized Embedding

What is masked language modeling?

Masked language modeling is where we take some model trained on the English language and the "mask" out words and ask the model to guess them. It is like a fill in the blank question. We can learn a lot about these models based on what likelihoods it assigns different words.

Masked Language Modeling with BERT

How do transformers make an embedding?

Transformer Text Embeddings

How can we measure an embedding?

Cosine Method is a very popular method.
We will use a masked language modeling method for measuring bias.
1. Take a sentence with a pronoun/gender marker and a profession.
  - 'he is a computer scientist.'
2. Mask the pronoun/gender marker. This is called the target.
  - '[MASK] is a computer scientist.'
3. Get the probability that the masked word would appear in this sentence.
  - target_probability = bert('[MASK] is a computer scientist', targets='he')[0]['score']
4. Mask the profession, including compound words as multiple masks.
  - '[MASK] is a [MASK] [MASK]'
5. Get the probability that the target words appears if it didn't contain the biased words.
  - prior_probability = bert('[MASK] is a [MASK] [MASK]', targets='he')[0][0]['score']
6. Calculate the log score, indicating the relative change in likelihood.
  - math.log(target_probability/prior_probability)

Do this for two targets, ('he' and 'she'), and compare the relative disparities. If the relative differences are unequal this means that the attributes we masked had a biased effect on the targets, causing their relative likelihood to shift.

Assignment

Interactive portion

This portion of the assignment should be performed in a live python interpreter or within a python notebook.

Instantiate a pretrained model

The first argument to pipeline indicates the task. here are other tasks if you want to play around with them.

feature-extraction (get the vector representation of a text)
ner (named entity recognition)
question-answering
sentiment-analysis
summarization
text-generation
translation
zero-shot-classification

For more information: https://huggingface.co/docs/transformers/main_classes/pipelines

from transformers import pipeline
bert = pipeline('fill-mask', model='bert-base-uncased')

This gives us a fully pretrained BERT ready for masked language modeling.

Make predictions

pred = bert('This sentence has a [MASK] in it.')

This by default returns the top 5 probabilities in a list. This can be overridden with the optional argument top_k=X to show the top X results. Each result in the list is a dictionary containing...

{
	'score': float,    # associated probability
	'token': int,      # token index in bert vector
	'token_str': str,  # the guess
	'sequence': str    # the whole sentence with guess inserted
}

So for example, to get just the top guess and no other info you can use...

bert('This sentence has a [MASK] in it.')[0]['sequence']  # get the first guess

Find probability of a specific prediction

We can also query BERT for a specific words probability like this. The targets argument can take a single word or a list of words to target.

pred = bert('This sentence has a [MASK] in it.', targets='word')
# or to just view the lkikely hood of the target
bert('This sentence has a [MASK] in it.', targets='word')[0]['score']
# note that if the word has multiple masks the list will be one deeper

BEC-Pro

BEC-Pro is a data set containing these masks pre-made. It is stored in the file BEC-Pro_EN.tsv where each row contains.

[
	Sentence,                   # original sentence
 	Sent_TM, Sent_AM, Sent_TAM, # masked sentences
	Template,                   # mask template
	Person, Gender,             # target, associated binary gender
	Profession, Prof_Gender     # attribute, associated binary gender
]

Each sentence in the BEC-Pro data set follows one of these templates.

{} is a {}.
{} works as a {}.
{} applied for the position of {}.
{}, the {}, had a good day at work.
{} wants to become a {}.

Load the BEC-Pro English version with pandas like so.

import pandas as pd

df = pd.read_csv('./BEC-Pro_EN.tsv', delimiter='\t')

# to get column data
sentence_col = df.Sentence
# or
sentence_col = df['Sentence']

# to index specific col, row combinations
row = col = 0
df.iloc[row, col] # gets the sentence in first row
# select more than one by wrapping rows and cols in lists

Single measure with BEC-Pro

import math
import pandas as pd

df = pd.read_csv('./BEC-Pro_EN.tsv', delimiter='\t')

target_probability = bert(df[0,1], targets=df[0,5])[0]['score']

prior_probability = bert(df[0,3], targets=df[0,5])[0][0]['score']

math.log(target_probability/prior_probability)

Switch from interpreter to coding a script

Batch measure with BEC-Pro

import math
import pandas as pd

target_probabilities = []
prior_probabilities = []
scores = []

for row in df.iloc:  # remember, anything with an index is iterable
	target_probabilities.append(bert(row['Sent_TM'], targets=row['Person'])[0]['score'])
	prior_probabilities.append(bert(row['Sent_TAM'], targets=row['Person'])[0]['score'])
	scores.append(math.log(target_probabilities[-1]/prior_probabilities[-1]))

Now try correlating matching sentences and look for gender disparities.

Extra Credit(?)

Fine tune a layer using the Gender Ambiguous Pronouns (GAP) corpus.

NOTE: Perhaps left as a task for the interested student?

The gender swapped GAP is corpus where every sentence has professions and names swapped to the opposite gender. This counter balances the issue in the original BERT model, insuring gender will not impact these predictions.

git clone git@github.com:allenai/dont-stop-pretraining.git Use the mask_tokens() function and inverted GAP to create fine tuning material from inverted GAP.

Fine tune the model to reduce biases.

Import sentence data.
Tokenize the data.
Create mask array.
Create a PyTorch dataset and dataloader class.
For epoch in range(epochs): train!
Save a checkpoint.
Load checkpoint and measure BEC-Pro again.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
.mypy_cache/3.8		.mypy_cache/3.8
BEC-Pro		BEC-Pro
__pycache__		__pycache__
data		data
.gitignore		.gitignore
readme.md		readme.md
readme.md.bak		readme.md.bak
simple.py		simple.py

sjdobesh/gender-bias

Folders and files

Latest commit

History

Repository files navigation