# 🔭 Gender Bias in Word Embeddings 👫

This notebook focuses on gender stereotypes in datasets and how to mitigate them through clever NLP!

We'll train a word2vec model on the WikiBio dataset and look into the gender bias that exists in the learned vectors. Then, we'll apply the Counterfactual Data Augmentation (CDA) method to our data to reduce bias and train a second word2vec model on the augmented data.

We hope that the new learned vectors will present less gender bias 😎.

Of course, this is just one simple technique to try and mitigate (gender) bias in documents. There exist many more!

## 🛠️ Getting started

The cells below will configure everything that is required to get started with data loading using HuggingFace and training word2vec models using Gensim.

### Setup

In [None]:
!apt-get install -y python3-magic
!pip install -q gensim==4.1.2 datasets augly gender-bender

### Imports

In [1]:
import numpy as np
import multiprocessing
from scipy import spatial
from datetime import datetime

import pandas as pd
import augly.text as textaugs
import plotly.graph_objects as go
from gender_bender import gender_bend
from sklearn.decomposition import PCA
from datasets import concatenate_datasets, Dataset, load_from_disk

### Download dataset

We'll train our word vectors on the [Wiki-Bio](https://rlebret.github.io/wikipedia-biography-dataset/) dataset, which is a collection of various biography pages from Wikipedia. This seems like an ideal candidate for our gender-bias experiment.

We pre-downloaded the dataset from the GitHub page. If you are unable to do so, feel free to just load in our preprocessed dataset in the section **Load in the data**.

In [None]:
wiki_bio_sent = []
wiki_bio_nb = []
wiki_bio_title = []
wiki_bio_id = []

with open('wikipedia-biography-dataset/test/test.sent', 'r') as f:
    wiki_bio_sent = f.read().splitlines() 

with open('wikipedia-biography-dataset/test/test.nb', 'r') as f:
    wiki_bio_nb = f.read().splitlines()

with open('wikipedia-biography-dataset/test/test.title', 'r') as f:
    wiki_bio_title = f.read().splitlines() 

with open('wikipedia-biography-dataset/test/test.id', 'r') as f:
    wiki_bio_id = f.read().splitlines() 

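The dataset comes in a slightly unusual format: `test.sent` holds one sentence per line, and `test.nb` holds the number of sentences that belong to each biography. We use those counts to merge the sentences back into one text per biography.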

In [None]:
i = 0
wiki_bio_sent_grouped = []
for nr in wiki_bio_nb:
    nr_int = int(nr)
    wiki_sent_slice = wiki_bio_sent[i:i+nr_int]
    wiki_bio_sent_grouped.append(' '.join(wiki_sent_slice))
    i+=nr_int
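
To make the merge step concrete, here is a toy sketch that applies the same grouping idea to made-up data. The sentences, the counts, and the helper name `group_sentences` are invented for illustration. Only the idea (each count says how many consecutive sentences belong to one biography) comes from the loop above.

```python
from itertools import islice

# Toy stand-ins for test.sent and test.nb (invented data, not from Wiki-Bio)
toy_sentences = [
    "a was born in 1900 .",
    "a was a painter .",
    "b is a pilot .",
    "c was a judge .",
    "c retired in 1990 .",
    "c died in 2001 .",
]
toy_counts = ["2", "1", "3"]  # sentences per biography, stored as strings


def group_sentences(sentences, counts):
    """Join consecutive sentences into one text per biography."""
    it = iter(sentences)
    # islice consumes exactly int(n) sentences from the shared iterator
    return [" ".join(islice(it, int(n))) for n in counts]


toy_grouped = group_sentences(toy_sentences, toy_counts)
for text in toy_grouped:
    print(text)
```

Using a shared iterator avoids tracking the running index by hand, but the result should match what the explicit slicing loop produces.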

In [None]:
df_data = pd.DataFrame({
    'text': wiki_bio_sent_grouped,
})

In [None]:
# Create a HF dataset object
hf_data = Dataset.from_pandas(df_data)
hf_data.save_to_disk('./data/hf_data')

## Counterfactual Data Augmentation (CDA) 🧑 👩

To reduce gender bias in our dataset, we can use CDA, a technique that replaces every occurrence of a gendered word in the original corpus with its dual. For example:
- 'he' is replaced by 'she'
- 'actor' is replaced by 'actress'
- 'king' is replaced by 'queen'

Then, we concatenate the generated samples with the original ones to create our final dataset.
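
As a rough illustration of the idea, the sketch below swaps words using a tiny hand-written dictionary and then concatenates the swapped samples with the originals. The word list, the helper name `swap_gendered_words`, and the two example sentences are assumptions made up for this sketch. It also assumes lowercased text, as in Wiki-Bio, and is not how AugLy or GenderBender are implemented.

```python
import re

# Tiny, hypothetical word list; real libraries ship far larger ones
SWAPS = {"he": "she", "actor": "actress", "king": "queen"}
# Make the mapping symmetric so swaps work in both directions
SWAPS.update({v: k for k, v in list(SWAPS.items())})


def swap_gendered_words(text):
    """Replace each listed gendered word with its counterpart."""
    def repl(match):
        word = match.group(0)
        # Words that are not in the list are returned unchanged
        return SWAPS.get(word, word)
    # A single regex pass means a swapped word is never swapped back
    return re.sub(r"[a-z]+", repl, text)


corpus = ["he was an actor", "the queen met the king"]
# CDA: keep the originals and append their counterfactual versions
augmented = corpus + [swap_gendered_words(t) for t in corpus]

for text in augmented:
    print(text)
```

Because this sketch is purely list-based, it has no notion of context: a surname such as "king" would be swapped just like the royal title.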

### Library to use

We'll quickly look at two libraries that can do this augmentation:
- [AugLy](https://github.com/facebookresearch/AugLy): a data augmentation library that can swap gendered words.
- [GenderBender](https://github.com/Garrett-R/gender_bender): a slightly older but seemingly robust library.

Let's look at some examples of these methods in action:

In [21]:
gendered_text = "She has two sisters, but she always wanted a brother"
aug_augly = textaugs.SwapGenderedWords(aug_word_p=1.0)(gendered_text)
aug_genderbender = gender_bend(gendered_text)

print(f"AugLy: {aug_augly}")
print(f"GenderBender: {aug_genderbender}")





AugLy: He has two brothers, but he always wanted a sister
GenderBender: He has two brethren, but he always wanted a sister


In [4]:
gendered_text = "She is a waitress, but she is studying to pay for her college education."
aug_augly = textaugs.SwapGenderedWords(aug_word_p=1.0)(gendered_text)
aug_genderbender = gender_bend(gendered_text)

print(f"AugLy: {aug_augly}")
print(f"GenderBender: {aug_genderbender}")

AugLy: He is a waitress, but he is studying to pay for him college education.
GenderBender: He is a waiter, but he is studying to pay for his college education.


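It turns out both have problems with some words, as both methods rely heavily on lists of gendered words: GenderBender turns 'sisters' into 'brethren', while AugLy leaves 'waitress' unchanged and turns 'her college education' into 'him college education'.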

But for now, we'll continue to use GenderBender. So let's apply this function to our dataset.

### Apply to the data

This can take quite a LONG time to perform. So for speed purposes, we performed this operation in a distributed fashion on Google Cloud Dataflow, using Apache Beam, and wrote the result to a new text file.

If you want to skip this step, feel free to load in the HuggingFace datasets directly in the cell under the section **Load in the data**.

In [None]:
# swap gendered words in each sample of the dataset
# hf_data_cda = hf_data.map(lambda e: {'text': gender_bend(e['text'])})

In [None]:
wiki_bio_sent_cda = []

with open('wikipedia-biography-dataset/test_joined.txt', 'r') as f:
    wiki_bio_sent_cda = f.read().splitlines() 

df_data_cda = pd.DataFrame({
    'text': wiki_bio_sent_cda,
})

hf_data_cda = Dataset.from_pandas(df_data_cda)
hf_data_cda.save_to_disk('./data/hf_data_cda')

### Load in the data
Load in the pre-processed data directly:

In [None]:
!git clone https://github.com/ml6team/quick-tips
!mv quick-tips/nlp/gender_debiasing_cda/data ./data
!rm -rf quick-tips

In [3]:
hf_data_cda=load_from_disk('./data/hf_data_cda')
hf_data=load_from_disk('./data/hf_data')

Let's look at some examples:

In [4]:
hf_data[0]

{'text': 'leonard shenoff randle -lrb- born february 12 , 1949 -rrb- is a former major league baseball player . he was the first-round pick of the washington senators in the secondary phase of the june 1970 major league baseball draft , tenth overall .'}

In [5]:
hf_data_cda[0]

{'text': '"norman alexander mclarty , -lrb- february 18 , 1889 -- september 6 , 1945 -rrb- was a canadian politician . born in st. thomas , ontario , she was first elected to the canadian house of commons representing the riding of essex west in the 1935 federal election . a liberal , she was re-elected in 1940 . she was the postmaster general , minister of labour , and secretary of state of canada in the cabinet of mackenzie queen . she served as acting president of the national liberal federation in 1943 ."'}

## Measure Bias 📏

Finally, we want to measure the bias that exists in each embedding. There are various ways to measure the bias present in a learned embedding. Let's try some of them:
- Word cosine similarity
- Gender vector decomposition

### Word embeddings model

Both methods require a trained word embedding model, so training one is our first goal:

In [6]:
from gensim.models import Word2Vec,KeyedVectors
from gensim.utils import simple_preprocess

In [7]:
epochs=5
vector_size=300
window=5
min_count=5

Prepare the dataset for gensim processing:

In [8]:
def gensim_preprocess(text):
    return simple_preprocess(text)

In [None]:
hf_data_cda_split=hf_data_cda.map(lambda e: {'text': gensim_preprocess(e['text'])})
hf_data_split=hf_data.map(lambda e: {'text': gensim_preprocess(e['text'])})

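Create a new artificial dataset that combines the gender-swapped (CDA) texts with the original ones, to hopefully balance things out. For the baseline, the original data is concatenated with itself, so that both training sets contain the same number of documents: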

In [10]:
hf_data_merged=concatenate_datasets([hf_data_split, hf_data_split])
hf_data_cda_merged=concatenate_datasets([hf_data_cda_split, hf_data_split])
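
The toy sketch below illustrates what this merge does to the pronoun statistics. The token lists are invented. Reading the duplicated baseline as a way to keep the two corpora the same size is our interpretation of the cell above, not something the code states.

```python
from collections import Counter

# Invented, already tokenized documents
original = [
    ["he", "was", "a", "pilot"],
    ["he", "was", "a", "judge"],
    ["she", "was", "a", "singer"],
]
# The same documents after swapping the gendered words
swapped = [
    ["she", "was", "a", "pilot"],
    ["she", "was", "a", "judge"],
    ["he", "was", "a", "singer"],
]

baseline = original + original   # mirrors hf_data_merged
augmented = swapped + original   # mirrors hf_data_cda_merged


def pronoun_counts(docs):
    """Return the number of 'he' and 'she' tokens in a list of documents."""
    counts = Counter(tok for doc in docs for tok in doc)
    return counts["he"], counts["she"]


print(len(baseline), len(augmented))
print("baseline (he, she):", pronoun_counts(baseline))
print("augmented (he, she):", pronoun_counts(augmented))
```

Both corpora have the same number of documents, but only the augmented one sees every context equally often with 'he' and with 'she'.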

#### Train non-debiased word embedding model

In [None]:
model_wikibio = Word2Vec(
    vector_size=vector_size,
    window=window,
    min_count=min_count,
    workers=max(1, multiprocessing.cpu_count() - 1))  # keep at least one worker

model_wikibio.build_vocab(hf_data_merged['text'], progress_per=10000)

model_wikibio.train(
    hf_data_merged['text'],
    total_examples=model_wikibio.corpus_count,
    epochs=epochs,
    report_delay=10)

vec_wikibio = model_wikibio.wv

#### Train debiased word embedding model

In [None]:
model_unbiased_wikibio = Word2Vec(
    vector_size=vector_size,
    window=window,
    min_count=min_count,
    workers=max(1, multiprocessing.cpu_count() - 1))  # keep at least one worker

model_unbiased_wikibio.build_vocab(hf_data_cda_merged['text'], progress_per=10000)

model_unbiased_wikibio.train(
    hf_data_cda_merged['text'],
    total_examples=model_unbiased_wikibio.corpus_count,
    epochs=epochs,
    report_delay=10)

vec_unbiased_wikibio = model_unbiased_wikibio.wv

### Cosine similarity

To check how close two vectors are, we can use cosine similarity. The similarity of a single pair of words does not reveal much about the bias. However, we can compare the similarity of a word with several gendered words. For example, the word 'doctor' should have equal similarity with the words 'man' and 'woman', since a doctor can be either a man or a woman.

For completeness' sake, we use a number of gendered pairs and average the similarity differences.
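
Before running this on the trained models, a minimal sketch with invented 2-d vectors may help to show what the measure computes. The vectors and the helper names `cosine_similarity` and `similarity_gap` are assumptions for illustration only. The real embeddings have 300 dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Invented toy embeddings
toy_vectors = {
    "he": [1.0, 0.0],
    "she": [0.0, 1.0],
    "pilot": [2.0, 1.0],   # leans towards "he"
    "judge": [1.0, 1.0],   # equally close to both
}


def similarity_gap(word, pairs, vectors):
    """Average of (similarity to male word - similarity to female word)."""
    gaps = [
        cosine_similarity(vectors[word], vectors[male])
        - cosine_similarity(vectors[word], vectors[female])
        for female, male in pairs
    ]
    return sum(gaps) / len(gaps)


toy_pairs = [("she", "he")]
pilot_gap = similarity_gap("pilot", toy_pairs, toy_vectors)
judge_gap = similarity_gap("judge", toy_pairs, toy_vectors)
print(f"pilot: {pilot_gap:.3f}, judge: {judge_gap:.3f}")
```

A gap above zero means the word sits closer to the male word, and a gap near zero means it is equally close to both.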

In [13]:
import plotly.express as px
import plotly.graph_objects as go
import numpy as np

professions = ['pilot', 'engineer', 'professor', 'judge']

gender_pairs = [
      ("she", "he"),
      ("girl", "boy"),
      ("woman", "man")]

before_avg = []
after_avg =[]

for word in professions:
    
    before = []
    after = []

    for female, male in gender_pairs:
        before_m = 1 - spatial.distance.cosine(vec_wikibio[word], vec_wikibio[male])
        before_f = 1 - spatial.distance.cosine(vec_wikibio[word], vec_wikibio[female])
        after_m = 1 - spatial.distance.cosine(vec_unbiased_wikibio[word], vec_unbiased_wikibio[male])
        after_f = 1 - spatial.distance.cosine(vec_unbiased_wikibio[word], vec_unbiased_wikibio[female])

        before.append(before_m - before_f)
        after.append(after_m - after_f)
    
    before_avg.append(sum(before)/len(before))
    after_avg.append(sum(after)/len(after))

In [14]:
fig = go.Figure(data=[
    go.Bar(name='Raw dataset', x=professions, y=before_avg),
    go.Bar(name='Debiased dataset', x=professions, y=after_avg)
])
# Change the bar mode
fig.update_layout(
    title="Cosine similarity bias measure",
    xaxis_title="Professions",
    yaxis_title="Similarity difference",
    barmode='group')
fig.show()

The chart above shows, for each profession, the average difference between its cosine similarity to the male words and its cosine similarity to the female words. The larger the absolute difference, the more gender bias is present in the word embeddings, and thus in the dataset. A positive value indicates a stronger association with the male words.

Using this measure, the bias indeed seems to be reduced!

### Finding the gender vector

Our goal here is to find a "gender" dimension in the data. This is done by subtracting the vectors of words that are known to be male from the vectors of their female equivalents. In each of these pairs, the words are nearly identical in all ways except for the gender they refer to. As such, each difference should result in a vector that mostly represents the idea of “gender”, and the first principal component of these differences gives us a single gender direction.
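
The following sketch shows the idea on invented 3-d vectors, where the first axis plays the role of gender by construction. It uses a plain NumPy SVD instead of scikit-learn's `PCA`. For difference vectors that are symmetric around zero the two should give the same direction, but treat this as an illustrative stand-in rather than the exact computation used in this notebook.

```python
import numpy as np

# Invented toy embeddings: axis 0 encodes "gender" by construction
toy = {
    "he": np.array([1.0, 0.2, 0.1]),
    "she": np.array([-1.0, 0.2, 0.1]),
    "man": np.array([0.9, 0.5, -0.3]),
    "woman": np.array([-1.1, 0.5, -0.3]),
}
pairs = [("she", "he"), ("woman", "man")]

# Each difference cancels what the pair shares and keeps the gender part
diffs = np.array([toy[f] - toy[m] for f, m in pairs])

# The first right-singular vector is the dominant direction of the differences
_, _, vt = np.linalg.svd(diffs, full_matrices=False)
gender_direction = vt[0]

# Projecting a word onto the direction gives a single "gender" coordinate
proj_she = float(toy["she"] @ gender_direction)
proj_he = float(toy["he"] @ gender_direction)
print(gender_direction, proj_she, proj_he)
```

The sign of the direction is arbitrary, so only the relative position of the projections is meaningful: 'she' and 'he' end up on opposite sides.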

In [15]:
gender_pairs = [
      ("girl", "boy"),
      ("she", "he"),
      ("woman", "man")]

In [16]:
gender_vectors = []
for (female_word, male_word) in gender_pairs:
    gender_vectors.append(vec_wikibio[female_word] - vec_wikibio[male_word])
    gender_vectors.append(vec_wikibio[male_word] - vec_wikibio[female_word])

pca = PCA(n_components=1)
pca.fit(np.array(gender_vectors))

female_vector = np.mean(
  pca.transform(np.array([vec_wikibio[pair[0]] for pair in gender_pairs]))
)
male_vector = np.mean(
  pca.transform(np.array([vec_wikibio[pair[1]] for pair in gender_pairs]))
)
mean_projection = (male_vector + female_vector) / 2

In [17]:
gender_vectors_unbiased = []
for (female_word, male_word) in gender_pairs:
    gender_vectors_unbiased.append(vec_unbiased_wikibio[female_word] - vec_unbiased_wikibio[male_word])
    gender_vectors_unbiased.append(vec_unbiased_wikibio[male_word] - vec_unbiased_wikibio[female_word])

pca_unbiased = PCA(n_components=1)
pca_unbiased.fit(np.array(gender_vectors_unbiased))

female_vector_unbiased = np.mean(
  pca_unbiased.transform(np.array([vec_unbiased_wikibio[pair[0]] for pair in gender_pairs])) 
)
male_vector_unbiased = np.mean(
  pca_unbiased.transform(np.array([vec_unbiased_wikibio[pair[1]] for pair in gender_pairs]))
)
mean_projection_unbiased = (male_vector_unbiased + female_vector_unbiased) / 2

In [18]:
jobs = ["singer", "teacher", "doctor", "pilot", "developer",  "lawyer",  "coach", "engineer", 'scientist']

Now, for each job, let's project its vector onto the gender direction and compute where it falls between the average 'male' and the average 'female' projection.

In [19]:
vec_before = []
vec_after = []

for word in jobs:
    word_biased = pca.transform(np.array([vec_wikibio[word]]))[0][0]
    word_unbiased = pca_unbiased.transform(np.array([vec_unbiased_wikibio[word]]))[0][0]
    # scale the score so > 0 means female bias, < 0 means male bias
    biased = 2 * (word_biased - mean_projection) / (female_vector - male_vector)
    unbiased = 2 * (word_unbiased - mean_projection_unbiased) / (female_vector_unbiased - male_vector_unbiased)
    vec_before.append(biased)
    vec_after.append(unbiased)
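
The scaling used above can be checked on a few hand-picked numbers. The helper name `scaled_gender_score` and the projection values are invented for this sketch. The formula itself is the one from the cell above.

```python
def scaled_gender_score(projection, female_proj, male_proj):
    """Map a projection so the female mean lands on +1 and the male mean on -1."""
    midpoint = (female_proj + male_proj) / 2
    return 2 * (projection - midpoint) / (female_proj - male_proj)


# Invented average projections of the female and male words
female_proj, male_proj = 3.0, -1.0

score_female = scaled_gender_score(female_proj, female_proj, male_proj)
score_male = scaled_gender_score(male_proj, female_proj, male_proj)
score_neutral = scaled_gender_score(1.0, female_proj, male_proj)  # the midpoint
score_leaning = scaled_gender_score(2.0, female_proj, male_proj)  # halfway to female

print(score_female, score_male, score_neutral, score_leaning)
```

Dividing by the difference between the female and male projections also removes the arbitrary sign of the PCA component, so a positive score always points towards the female words.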

In [20]:
fig = go.Figure(data=[
    go.Bar(name='Raw dataset', x=jobs, y=vec_before),
    go.Bar(name='Debiased dataset', x=jobs, y=vec_after)
])
# Change the bar mode
fig.update_layout(
    barmode='group',
    title="Gender vector bias measure",
    xaxis_title="Professions",
    yaxis_title="Scaled gender projection (> 0 female, < 0 male)")
fig.show()

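We observe that in many cases CDA manages to reduce the bias, or even flip it slightly. Note that the sign convention is the opposite of the one in the cosine similarity chart: here, a positive score means a female association.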

## Take-aways 🤓

You've reached the finish line! 👏 Let's summarize some of the findings.

- We applied the CDA technique to a dataset to reduce the impact of gender bias.
- Then, we trained word2vec models on both the original and the CDA-augmented datasets.
- We measured the bias present in the vectors using two methods: cosine similarity and projection onto a gender vector.
- We observed that there are indeed indications that the CDA method works to reduce gender bias. Of course, there isn't a single unified method to measure bias!

## Further reading 📖

If, like us, you find this area super exciting, feel free to have a look at these interesting papers:
- [Gender Bias in Neural Natural Language Processing](https://arxiv.org/pdf/1807.11714.pdf)
- [Counterfactual Data Augmentation for Mitigating Gender Stereotypes in Languages with Rich Morphology](https://aclanthology.org/P19-1161v2.pdf)
- [It’s All in the Name: Mitigating Gender Bias with Name-Based Counterfactual Data Substitution](https://aclanthology.org/D19-1530v2.pdf)