In [1]:
import pickle
import zipfile
import utils
import hard_debias as hd
import double_hard_debias as dhd
import data_embeddings as data  # this takes a while: the files are retrieved from Wang et al.'s GitHub repository

successfully loaded hard_debias
successfully loaded double_hard_debias
successfully loaded utils
Original vocab size:  322636
Restricted vocab size:  47974
Neutral vocab size:  47597
successfully loaded data_embeddings


# Debiasing GloVe embeddings
Following the concepts described in the paper by [Wang et al. (2020)](https://arxiv.org/abs/2005.00965), we set out to replicate the proposed Double-Hard Debias algorithm. The code in the [GitHub repository of the original authors](https://github.com/uvavision/Double-Hard-Debias) is in parts hard to follow, in parts (in our view) slightly different from what the paper proposes, and in parts not executable because some of the required files were not uploaded. We therefore present a new, full implementation of the algorithm as described in Wang et al. (2020).

## Hard Debias
The authors make use of the Hard Debias algorithm proposed by [Bolukbasi et al. (2016)](https://arxiv.org/pdf/1607.06520.pdf), and we followed the original paper to re-implement it. The paper describes two steps: first, the gender subspace is identified; second, Hard Debias neutralizes and equalizes the word embeddings.
### Step 1: Identify gender subspace
Inputs: word set $W$, defining sets $D_1, D_2, \ldots, D_n \subset W$, embedding $\{\vec{w}\in\mathbb{R}^d\}_{w\in W}$, and integer parameter $k \geq 1$.   
Let $$\mu_i := \sum_{w\in D_i}\vec{w}/|D_i|$$ be the means of the defining sets. Let the bias subspace $B$ be the first $k$ rows of SVD($C$), where $$C:=\sum_{i=1}^n \sum_{w\in D_i}(\vec{w}-\mu_i)^T(\vec{w}-\mu_i)/|D_i|.$$
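
The definition above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code in `hard_debias.py`: the function name `bias_subspace` and the dictionary-based `embedding` argument are hypothetical and differ from the signature of `hd.idtfy_gender_subspace`.

```python
import numpy as np

def bias_subspace(embedding, defining_sets, k=1):
    """Sketch of the bias subspace B (hypothetical helper).

    embedding: dict mapping each word to a 1-D NumPy vector of length d
    defining_sets: list of word lists D_1, ..., D_n
    Returns an array of shape (k, d) whose rows span B.
    """
    d = len(next(iter(embedding.values())))
    C = np.zeros((d, d))
    for D in defining_sets:
        vecs = np.array([embedding[w] for w in D])   # shape (|D_i|, d)
        centered = vecs - vecs.mean(axis=0)          # subtract mu_i from each row
        C += centered.T @ centered / len(D)          # sum of outer products / |D_i|
    # C is symmetric positive semi-definite, so the rows of Vt are its
    # principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(C)
    return Vt[:k]
```

With `k=1`, the single returned row can be read as a gender direction; its sign is arbitrary.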

In [2]:
gender_subspace = hd.idtfy_gender_subspace(data.embedding, data.vocab, data.w2id, data.definitional_pairs)

### Step 2: Hard de-biasing (neutralize and equalize)
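Hard Debias then neutralizes the word embeddings: every word $w$ in the set of gender-neutral words $N$ is transformed so that it has zero projection onto the gender subspace. With $\vec{w}_B$ denoting the projection of $\vec{w}$ onto $B$, we re-embed each $w\in N$ as $$\vec{w}:=\vec{w}-\vec{w}_B$$
The equalize step then adjusts the words in each equality set so that every neutral word is equidistant to all words in that set.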
Please see `hard_debias.py` for the full implementation of Hard Debias.
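
As a minimal sketch of the neutralize step only (the equalize step is not shown), assuming the rows of `B` are orthonormal, the projection can be removed as follows. The function name `neutralize` and its arguments are hypothetical and need not match `hard_debias.py`.

```python
import numpy as np

def neutralize(vectors, B):
    """Remove the component of each row of `vectors` that lies in the
    subspace spanned by the orthonormal rows of `B` (hypothetical helper).

    vectors: array of shape (n_words, d), embeddings of the neutral words
    B: array of shape (k, d), orthonormal basis of the bias subspace
    """
    projection = vectors @ B.T @ B   # w_B for every row
    return vectors - projection      # w := w - w_B
```

After this step, `neutralize(vectors, B) @ B.T` is zero up to floating-point error, which is the "zero projection" property described above.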

## Double-Hard Debias
We followed the pseudocode from Wang et al. (2020) strictly:   
![](dhd_pseudocode.png "Double-Hard Debias")

In [3]:
# first, the male and female biased word sets need to be obtained:
index_f, index_m = dhd.most_biased(data.embedding, gender_subspace)

female_most_biased = [data.vocab[i] for i in index_f]
male_most_biased = [data.vocab[i] for i in index_m]
print("first 10 female most biased words: ", female_most_biased[:10])
print("first 10 male most biased words: ", male_most_biased[:10])

first 10 female most biased words:  ['actress', 'pregnant', 'louise', 'therese', 'abbess', 'sister', 'chairwoman', 'alumna', 'princess', 'ballerina']
first 10 male most biased words:  ['john', 'himself', 'his', 'brother', 'led', 'son', 'colonel', 'successor', 'nephew', 'footballing']
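
One plausible way to obtain such word lists is to rank all words by their projection onto a gender direction. The sketch below is an assumption about how a ranking like this could work, not the implementation of `dhd.most_biased`: the name `most_biased_sketch`, the parameter `n`, and the sign convention are all hypothetical.

```python
import numpy as np

def most_biased_sketch(embedding, direction, n=500):
    """Return the indices of the n words with the lowest and the n words
    with the highest projection onto `direction` (hypothetical helper).

    embedding: array of shape (n_words, d)
    direction: 1-D array of length d, e.g. the first row of the gender subspace
    Which end corresponds to "female" or "male" depends on the arbitrary
    sign of `direction`.
    """
    direction = direction / np.linalg.norm(direction)
    scores = embedding @ direction          # signed projection of every word
    order = np.argsort(scores)              # ascending order of projection
    return order[:n], order[::-1][:n]
```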


Please refer to `double_hard_debias.py` for the full implementation. From the paper alone we could not infer whether Wang et al. (2020) used `equalize_pairs`, the original set by Bolukbasi et al. (2016), or `female_male_pairs` for the equalizing step in Hard Debias, as both were uploaded to the GitHub repository. We therefore implemented both (the two sets contain similar words, but `female_male_pairs` contains more of them). As we observed no difference in the results between the two word sets, we decided to continue with `equalize_pairs` to stay consistent with the original Bolukbasi et al. (2016) paper.

In [5]:
# result using equalize_pairs for equalizing
debiased_1 = dhd.double_hard_debias(data.embedding, data.w2id, data.embedding_neutral, data.id_neutral, data.equalize_pairs, index_m, index_f, gender_subspace)

# pickle the debiased embeddings, then compress the pickle file into a zip archive
with open('debiased.p', 'wb') as file_1:
    pickle.dump(debiased_1, file_1)

with zipfile.ZipFile('debiased.zip', 'w', compression=zipfile.ZIP_DEFLATED) as folder:
    folder.write('debiased.p')

smallest PC:  3
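
To reuse the stored embeddings later, the pickle can be read back directly from the archive without extracting it first. This is a small sketch; the helper name `load_pickle_from_zip` is hypothetical.

```python
import pickle
import zipfile

def load_pickle_from_zip(zip_path, member):
    """Load a pickled object stored as `member` inside the zip archive
    at `zip_path` (hypothetical helper)."""
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as f:
            return pickle.load(f)
```

For the archive written above, the call would be `load_pickle_from_zip('debiased.zip', 'debiased.p')`. As with any pickle, only load files from a source you trust.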


In [6]:
# result using female_male_pairs for equalizing
debiased_2 = dhd.double_hard_debias(data.embedding, data.w2id, data.embedding_neutral, data.id_neutral, data.female_male_pairs, index_m, index_f, gender_subspace)

smallest PC:  3


In [7]:
print("The resulting embeddings are identical:", (debiased_1 == debiased_2).all())

The resulting embeddings are identical: True
