In this homework, you will debias word embeddings using the method from [Bolukbasi et al. 2016](https://arxiv.org/abs/1607.06520) and interpreted through [Vargas and Cotterell 2020](https://arxiv.org/abs/2009.09435). 

In [1]:
import re
from gensim.models import KeyedVectors
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics.pairwise import cosine_similarity

In [2]:
glove = KeyedVectors.load_word2vec_format("../data/glove.6B.100d.100K.w2v.txt", binary=False)

In [3]:
# let's print one sample vector just to see what it looks like
print(glove["man"].shape)
print(glove["man"])

(100,)
[ 3.7293e-01  3.8503e-01  7.1086e-01 -6.5911e-01 -1.0128e-03  9.2715e-01
  2.7615e-01 -5.6203e-02 -2.4294e-01  2.4632e-01 -1.8449e-01  3.1398e-01
  4.8983e-01  9.2560e-02  3.2958e-01  1.5056e-01  5.7317e-01 -1.8529e-01
 -5.2277e-01  4.6191e-01  9.2038e-01  3.1001e-02 -1.6246e-01 -4.0567e-01
  7.8621e-01  5.7722e-01 -5.3501e-01 -6.8228e-01  1.6987e-01  3.6310e-01
 -7.1773e-02  4.7233e-01  2.7806e-02 -1.4951e-01  1.7543e-01 -3.7573e-01
 -7.8517e-01  5.8171e-01  8.6859e-01  3.1445e-02 -4.5897e-01 -4.0917e-02
  9.5897e-01 -1.6975e-01  1.3045e-01  2.7434e-01 -6.9485e-02  2.2402e-02
  2.4977e-01 -2.1536e-01 -3.2406e-01 -3.9867e-01  6.8613e-01  1.7923e+00
 -3.7848e-01 -2.2477e+00 -7.7025e-01  4.6582e-01  1.2411e+00  5.7756e-01
  4.1151e-01  8.4328e-01 -5.4259e-01 -1.6715e-01  7.3927e-01 -9.3477e-02
  9.0278e-01  5.0889e-01 -5.0031e-01  2.6451e-01  1.5443e-01 -2.9432e-01
  1.0906e-01 -2.6667e-01  3.5438e-01  4.9079e-02  1.8018e-01 -5.8590e-01
 -5.5542e-01 -2.8987e-01  7.4278e-01  3.4530

Now let's calculate the cosine similarity of that vector ("man") with a set of other vectors ("king" and "cabbage").  This returns two cosine similarities, the first cos(man, king) and the second cos(man, cabbage).

In [4]:
glove.cosine_similarities(glove["man"], [glove["king"], glove["cabbage"]])

array([0.5118682 , 0.04780922], dtype=float32)

Let's use that machinery to find the differences between "man" and "woman" and a set of target terms.

In [5]:
targets=["doctor", "nurse", "actor", "actress", "mechanic", "librarian", "architect", "magician", "cook", "chef"]
diffs={}
for term in targets:
    
    m,w=glove.cosine_similarities(glove[term], [glove["man"], glove["woman"]])
    diffs[term]=m-w

for k, v in sorted(diffs.items(), key=lambda item: item[1], reverse=True):
    print("%.3f\t%s" % (v,k))

0.109	magician
0.095	mechanic
0.082	architect
0.046	actor
0.035	cook
0.012	chef
-0.024	doctor
-0.110	librarian
-0.154	actress
-0.158	nurse


We can see a gender difference here, where "man" is more aligned "magician" and "mechanic" and "woman" is more aligned with "actress" and "nurse".

**Q1.** Let's debias those embeddings, using the method from [Bolukbasi et al. 2016](https://arxiv.org/abs/1607.06520) and interpreted through [Vargas and Cotterell 2020](https://arxiv.org/abs/2009.09435).  Debiasing embeddings requires two steps: finding the gender subspace and then subtracting the orthogonal projection onto that subspace from the original embedding.  Let's start with the first step: creating "defining sets" that capture the variation:

$$
D_1 = \{man, woman\}\\
D_2 = \{mr., mrs.\}
$$

Following Vargas and Cotterell, we can find the gender subspace by constructing a new matrix $D'$ by substracting the embedding for a word in a defining set from the average of embeddings in that set. Using $e_{word}$ to denote the embedding for a word, this process would results in the following for the defining sets above:

$$
\begin{bmatrix}
e_{man} - \textrm{mean}(e_{man},e_{woman}) \\
e_{woman} - \textrm{mean}(e_{man},e_{woman})\\
e_{mr.} - \textrm{mean}(e_{mr.},e_{mrs.})\\
e_{mrs.} - \textrm{mean}(e_{mr.},e_{mrs.})\\
\end{bmatrix}
$$

If the original embeddings (e.g., for $e_{man}$) are 100 dimensions (and so the mean over any set of embeddings is also 100 dimensions), then the resulting matrix $D'$ should be $4 \times 100$.  Create this matrix $D'$ and name it `subspace_matrix`.

In [6]:
D1=["man", "woman"]
D2=["mr.", "mrs."]

# TODO
e_man = glove["man"]
e_woman = glove["woman"]
e_mr = glove["mr"]
e_mrs = glove["mrs"]

subspace_matrix = np.vstack(((e_man - np.mean([e_man, e_woman],axis=0)).reshape(1,100), 
                             (e_woman - np.mean([e_man, e_woman],axis=0)).reshape(1,100), 
                             (e_mr - np.mean([e_mr, e_mrs],axis=0)).reshape(1,100), 
                             (e_mrs - np.mean([e_mr, e_mrs],axis=0)).reshape(1,100)))

In [7]:
# This should be (4,100)
print(subspace_matrix.shape)

(4, 100)


Step two is to run [PCA](https://en.wikipedia.org/wiki/Principal_component_analysis) over that `subspace_matrix` matrix.  The gender subspace in this example is the first principle component of that process. Here's how you run PCA on a random matrix to get the first principle component.

In [8]:
fake_matrix=np.random.rand(3,3)
print("fake matrix:")
print(fake_matrix)

# We only need one principle component, so we'll set n_components=1
pca=PCA(n_components=1).fit(fake_matrix)
subspace=pca.components_[0]

print("first principle component:")
print(subspace)

fake matrix:
[[0.09100511 0.82725414 0.53150732]
 [0.73859949 0.12534586 0.28251   ]
 [0.75730885 0.36063383 0.40465668]]
first principle component:
[-0.70878339  0.66941289  0.222514  ]


In [9]:
# You'll see that this subspace is already normalized to unit length:
print(subspace)
print(subspace/np.sqrt(np.dot(subspace, subspace)))

[-0.70878339  0.66941289  0.222514  ]
[-0.70878339  0.66941289  0.222514  ]


**Q2.** Run PCA on that subspace matrix to get the subspace axis.

In [10]:
# To do

pca2=PCA(n_components=1).fit(subspace_matrix)
subspace2=pca2.components_[0]

print("first principle component:")
subspace2

first principle component:


array([-0.02956821, -0.02041973,  0.09069765, -0.15537918, -0.00504958,
       -0.11262711,  0.02766803, -0.08187196, -0.1678377 ,  0.0475144 ,
        0.10393932,  0.03615956,  0.06527   , -0.14132844,  0.03927333,
        0.04549308, -0.06324527, -0.03409998, -0.05121502, -0.01143592,
        0.20441476, -0.03848183, -0.08002796, -0.24430752, -0.0583269 ,
        0.06293844,  0.09939633,  0.23133436,  0.03535545,  0.0124524 ,
       -0.08124342,  0.0319851 , -0.19318594,  0.01551938,  0.02306063,
        0.06862629, -0.03819092,  0.09366054, -0.0873498 , -0.16789356,
        0.06211457, -0.04531497,  0.23381914, -0.06133813, -0.06068353,
       -0.06630228,  0.04470916, -0.09983315, -0.02139786, -0.08311246,
       -0.01235847,  0.00183215, -0.07296807,  0.0010722 , -0.04861382,
       -0.12487253,  0.02284342,  0.14009078,  0.13543136,  0.01898321,
        0.05710227,  0.00638756, -0.15550055, -0.03771891,  0.00845029,
        0.00582173,  0.02150673,  0.12136365, -0.10560768,  0.03

That subspace is the gender axis. You'll remember from class that we find the orthogoal projection of any unit-normalized vector $w$ onto a subspace $b$ by:

$$
w_b = \textrm{dot}(w,b) \; b
$$

If $b$ and $x$ are 100 dimensions, $w_b$ is 100 dimensions too.  The debiased vector $w_d$ is then simply $w - w_b$.  

**Q3.** Debias the vectors for "man", "woman", and the targets used above ("doctor", "nurse", "actor", "actress", "mechanic", "librarian", "architect", "magician", "cook", "chef") and see if debiasing changes the differences between these terms and "man"/"woman" as noted above.  Glove embeddings are not normalized ahead of time, so be sure to normalize them before carrying out your projection (i.e., dividing vector v by $\sqrt{\textrm{dot}(v,v)}$).


In [11]:
targets=["doctor", "nurse", "actor", "actress", "mechanic", "librarian", "architect", "magician", "cook", "chef"]
diffs2={}
for term in targets:
    
    w_b = np.dot(glove[term] / np.sqrt(np.dot(glove["man"], glove["man"])), subspace2) * subspace2
    w_d = glove[term] - w_b
    m,w=glove.cosine_similarities(w_d, [glove["man"], glove["woman"]])
    diffs2[term]=m-w

for k, v in sorted(diffs2.items(), key=lambda item: item[1], reverse=True):
    print("%.3f\t%s" % (v,k))

0.103	magician
0.084	mechanic
0.069	architect
0.041	actor
0.031	cook
0.008	chef
-0.024	doctor
-0.087	librarian
-0.129	actress
-0.133	nurse


In [12]:
# Before de-biasing

for k, v in sorted(diffs.items(), key=lambda item: item[1], reverse=True):
    print("%.3f\t%s" % (v,k))

0.109	magician
0.095	mechanic
0.082	architect
0.046	actor
0.035	cook
0.012	chef
-0.024	doctor
-0.110	librarian
-0.154	actress
-0.158	nurse


**check-plus**. Reflect in 100 words on the differences between this gender axis construction and the axis construction in SemAxis.  How are they different?

The SemAxis is simply the remainder after substracting the mean of all negative vectors from the mean of all positive vectors. It only captures the mean difference between positive vectors and negative vectors on each of the dimensions. \
The gender axis is constructed as a vector that captures the most the variance in the difference between each vector to their postive-negative pair. This axis minimized the total residuals in all dimensions. \
In term of the overall variance explained of the gender difference, gender axis is better than SemAxis.