In this homework, you will debias word embeddings using the method from [Bolukbasi et al. 2016](https://arxiv.org/abs/1607.06520) and interpreted through [Vargas and Cotterell 2020](https://arxiv.org/abs/2009.09435). 

In [12]:
import re
from gensim.models import KeyedVectors
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics.pairwise import cosine_similarity

In [13]:
glove = KeyedVectors.load_word2vec_format("../data/glove.6B.100d.100K.w2v.txt", binary=False)

In [14]:
# let's print one sample vector just to see what it looks like
print(glove["man"].shape)
print(glove["man"])

(100,)
[ 3.7293e-01  3.8503e-01  7.1086e-01 -6.5911e-01 -1.0128e-03  9.2715e-01
  2.7615e-01 -5.6203e-02 -2.4294e-01  2.4632e-01 -1.8449e-01  3.1398e-01
  4.8983e-01  9.2560e-02  3.2958e-01  1.5056e-01  5.7317e-01 -1.8529e-01
 -5.2277e-01  4.6191e-01  9.2038e-01  3.1001e-02 -1.6246e-01 -4.0567e-01
  7.8621e-01  5.7722e-01 -5.3501e-01 -6.8228e-01  1.6987e-01  3.6310e-01
 -7.1773e-02  4.7233e-01  2.7806e-02 -1.4951e-01  1.7543e-01 -3.7573e-01
 -7.8517e-01  5.8171e-01  8.6859e-01  3.1445e-02 -4.5897e-01 -4.0917e-02
  9.5897e-01 -1.6975e-01  1.3045e-01  2.7434e-01 -6.9485e-02  2.2402e-02
  2.4977e-01 -2.1536e-01 -3.2406e-01 -3.9867e-01  6.8613e-01  1.7923e+00
 -3.7848e-01 -2.2477e+00 -7.7025e-01  4.6582e-01  1.2411e+00  5.7756e-01
  4.1151e-01  8.4328e-01 -5.4259e-01 -1.6715e-01  7.3927e-01 -9.3477e-02
  9.0278e-01  5.0889e-01 -5.0031e-01  2.6451e-01  1.5443e-01 -2.9432e-01
  1.0906e-01 -2.6667e-01  3.5438e-01  4.9079e-02  1.8018e-01 -5.8590e-01
 -5.5542e-01 -2.8987e-01  7.4278e-01  3.4530

Now let's calculate the cosine similarity of that vector ("man") with a set of other vectors ("king" and "cabbage").  This returns two cosine similarities, the first cos(man, king) and the second cos(man, cabbage).

In [15]:
glove.cosine_similarities(glove["man"], [glove["king"], glove["cabbage"]])

array([0.5118681 , 0.04780922], dtype=float32)

Let's use that machinery to find the differences between "man" and "woman" and a set of target terms.

In [16]:
targets=["doctor", "nurse", "actor", "actress", "mechanic", "librarian", "architect", "magician", "cook", "chef"]
diffs={}
for term in targets:
    
    m,w=glove.cosine_similarities(glove[term], [glove["man"], glove["woman"]])
    diffs[term]=m-w

for k, v in sorted(diffs.items(), key=lambda item: item[1], reverse=True):
    print("%.3f\t%s" % (v,k))

0.109	magician
0.095	mechanic
0.082	architect
0.046	actor
0.035	cook
0.012	chef
-0.024	doctor
-0.110	librarian
-0.154	actress
-0.158	nurse


We can see a gender difference here, where "man" is more aligned "magician" and "mechanic" and "woman" is more aligned with "actress" and "nurse".

**Q1.** Let's debias those embeddings, using the method from [Bolukbasi et al. 2016](https://arxiv.org/abs/1607.06520) and interpreted through [Vargas and Cotterell 2020](https://arxiv.org/abs/2009.09435).  Debiasing embeddings requires two steps: finding the gender subspace and then subtracting the orthogonal projection onto that subspace from the original embedding.  Let's start with the first step: creating "defining sets" that capture the variation:

$$
D_1 = \{man, woman\}\\
D_2 = \{mr., mrs.\}
$$

Following Vargas and Cotterell, we can find the gender subspace by constructing a new matrix $D'$ by substracting the embedding for a word in a defining set from the average of embeddings in that set. Using $e_{word}$ to denote the embedding for a word, this process would results in the following for the defining sets above:

$$
\begin{bmatrix}
e_{man} - \textrm{mean}(e_{man},e_{woman}) \\
e_{woman} - \textrm{mean}(e_{man},e_{woman})\\
e_{mr.} - \textrm{mean}(e_{mr.},e_{mrs.})\\
e_{mrs.} - \textrm{mean}(e_{mr.},e_{mrs.})\\
\end{bmatrix}
$$

If the original embeddings (e.g., for $e_{man}$) are 100 dimensions (and so the mean over any set of embeddings is also 100 dimensions), then the resulting matrix $D'$ should be $4 \times 100$.  Create this matrix $D'$ and name it `subspace_matrix`.

In [17]:
D1=["man", "woman"]
D2=["mr.", "mrs."]

#I will calculate each individual row of the matrix (vector) and them combine it into the subspace_matrix
M1_1 = glove["man"]-np.mean(glove[D1])
M1_2 = glove["woman"]-np.mean(glove[D1])
M2_1 = glove["mr."]-np.mean(glove[D2])
M2_2 = glove["mrs."]-np.mean(glove[D2])

subspace_matrix = np.vstack((M1_1, M1_2, M2_1, M2_2))
print(subspace_matrix)

[[ 0.31345838  0.3255584   0.6513884  -0.7185816  -0.06048441  0.8676784
   0.21667838 -0.11567461 -0.30241162  0.18684839 -0.2439616   0.2545084
   0.43035838  0.03308839  0.2701084   0.0910884   0.5136984  -0.2447616
  -0.5822416   0.4024384   0.8609084  -0.02847061 -0.2219316  -0.4651416
   0.7267384   0.5177484  -0.5944816  -0.7417516   0.1103984   0.30362839
  -0.1312446   0.4128584  -0.03166561 -0.2089816   0.11595839 -0.43520162
  -0.8446416   0.5222384   0.8091184  -0.02802661 -0.5184416  -0.10038861
   0.8994984  -0.22922161  0.07097839  0.2148684  -0.12895662 -0.03706961
   0.1902984  -0.2748316  -0.3835316  -0.4581416   0.6266584   1.7328284
  -0.4379516  -2.3071716  -0.8297216   0.4063484   1.1816283   0.5180884
   0.35203838  0.7838084  -0.6020616  -0.22662161  0.67979836 -0.15294862
   0.8433084   0.44941837 -0.5597816   0.2050384   0.09495839 -0.3537916
   0.04958839 -0.3261416   0.2949084  -0.01039261  0.12070839 -0.6453716
  -0.6148916  -0.3493416   0.6833084   0.28582

In [18]:
# This should be (4,100)
print(subspace_matrix.shape)

(4, 100)


Step two is to run [PCA](https://en.wikipedia.org/wiki/Principal_component_analysis) over that `subspace_matrix` matrix.  The gender subspace in this example is the first principle component of that process. Here's how you run PCA on a random matrix to get the first principle component.

In [19]:
fake_matrix=np.random.rand(3,3)
print("fake matrix:")
print(fake_matrix)

# We only need one principle component, so we'll set n_components=1
pca=PCA(n_components=1).fit(fake_matrix)
subspace_fake=pca.components_[0]

print("first principle component:")
print(subspace_fake)

fake matrix:
[[0.41346828 0.52525543 0.13353812]
 [0.10217291 0.52276599 0.37071568]
 [0.94241008 0.02444956 0.08855541]]
first principle component:
[ 0.81403561 -0.52846866 -0.24097074]


**Q2.** Run PCA on that subspace matrix to get the subspace axis.

In [20]:
# We only need one principle component, so we'll set n_components=1
pca=PCA(n_components=1).fit(subspace_matrix)
subspace=pca.components_[0]

print("Gender Subspace:")
print(subspace)

Gender Subspace:
[ 0.11603842  0.11445931  0.17490464 -0.06605324  0.04787533  0.20222813
  0.00225227 -0.03337409  0.03640716  0.21475543 -0.18589891  0.07010151
  0.04989893 -0.02383162  0.09524161  0.0563787   0.09899762 -0.09684595
  0.02625981 -0.09062164  0.1835587   0.0806286  -0.03016761  0.10647134
  0.09072191 -0.01883493 -0.06899775 -0.08525849  0.03143402  0.03756694
 -0.14139953 -0.12825754 -0.09846704 -0.11455748  0.1241527  -0.26089197
 -0.2440844   0.05834092  0.13330106  0.0914202  -0.06864227  0.09111322
 -0.03110604 -0.08132536  0.06989511 -0.00154357 -0.00926636  0.09458916
  0.04539823 -0.00710394 -0.14097895 -0.11680316 -0.04091208  0.25102094
 -0.06056746 -0.07806938 -0.0971934   0.05455751  0.1751944   0.14475709
  0.11376267  0.032123    0.03172037 -0.03226599 -0.03736976 -0.00476188
 -0.05274808 -0.09554956 -0.08746713 -0.07122406  0.02319685 -0.06534515
  0.03929776  0.06747835  0.08820648 -0.07466307 -0.03399871 -0.1008185
 -0.05528405 -0.13328253 -0.0259573

In [21]:
# You'll see that this subspace is already normalized to unit length:
print(subspace)
print(subspace/np.sqrt(np.dot(subspace, subspace)))

[ 0.11603842  0.11445931  0.17490464 -0.06605324  0.04787533  0.20222813
  0.00225227 -0.03337409  0.03640716  0.21475543 -0.18589891  0.07010151
  0.04989893 -0.02383162  0.09524161  0.0563787   0.09899762 -0.09684595
  0.02625981 -0.09062164  0.1835587   0.0806286  -0.03016761  0.10647134
  0.09072191 -0.01883493 -0.06899775 -0.08525849  0.03143402  0.03756694
 -0.14139953 -0.12825754 -0.09846704 -0.11455748  0.1241527  -0.26089197
 -0.2440844   0.05834092  0.13330106  0.0914202  -0.06864227  0.09111322
 -0.03110604 -0.08132536  0.06989511 -0.00154357 -0.00926636  0.09458916
  0.04539823 -0.00710394 -0.14097895 -0.11680316 -0.04091208  0.25102094
 -0.06056746 -0.07806938 -0.0971934   0.05455751  0.1751944   0.14475709
  0.11376267  0.032123    0.03172037 -0.03226599 -0.03736976 -0.00476188
 -0.05274808 -0.09554956 -0.08746713 -0.07122406  0.02319685 -0.06534515
  0.03929776  0.06747835  0.08820648 -0.07466307 -0.03399871 -0.1008185
 -0.05528405 -0.13328253 -0.02595735  0.03361828  0.

That subspace is the gender axis. You'll remember from class that we find the orthogoal projection of any unit-normalized vector $w$ onto a subspace $b$ by:

$$
w_b = \textrm{dot}(w,b) \; b
$$

If $b$ and $x$ are 100 dimensions, $w_b$ is 100 dimensions too.  The debiased vector $w_d$ is then simply $w - w_b$.  

**Q3.** Debias the vectors for "man", "woman", and the targets used above ("doctor", "nurse", "actor", "actress", "mechanic", "librarian", "architect", "magician", "cook", "chef") and see if debiasing changes the differences between these terms and "man"/"woman" as noted above.  Glove embeddings are not normalized ahead of time, so be sure to normalize them before carrying out your projection (i.e., dividing vector v by $\sqrt{\textrm{dot}(v,v)}$).


In [22]:
glove_man = glove["man"]
glove_woman = glove["woman"]
glove_doctor = glove["doctor"]
glove_nurse = glove["nurse"]
glove_actor = glove["actor"]
glove_actress = glove["actress"]
glove_librarian = glove["librarian"]
glove_architect = glove["architect"]


deb_man = (glove_man/np.sqrt(np.dot(glove_man, glove_man))) - subspace
deb_woman = (glove_woman/np.sqrt(np.dot(glove_woman, glove_woman))) - subspace
deb_doctor = (glove_doctor/np.sqrt(np.dot(glove_doctor, glove_doctor))) - subspace
deb_nurse = (glove_nurse/np.sqrt(np.dot(glove_nurse, glove_nurse))) - subspace
deb_actor = (glove_actor/np.sqrt(np.dot(glove_actor, glove_actor))) - subspace
deb_actress = (glove_actress/np.sqrt(np.dot(glove_actress, glove_actress))) - subspace
deb_librarian = (glove_librarian/np.sqrt(np.dot(glove_librarian, glove_librarian))) - subspace
deb_architect = (glove_architect/np.sqrt(np.dot(glove_architect, glove_architect))) - subspace

print(deb_man)
print(deb_woman)
print(deb_nurse)
print(deb_actor)
print(deb_actress)
print(deb_librarian)
print(deb_architect)


[-0.04936789 -0.0456256  -0.0478207  -0.05177911 -0.04805639 -0.03647695
  0.04711643  0.0233264  -0.07983874 -0.1707196   0.15291673 -0.01396976
  0.03767039  0.04037903 -0.03632097 -0.02946235  0.00347081  0.06372075
 -0.11971798  0.17319956 -0.01901783 -0.07508639  0.00112383 -0.17899495
  0.04983272  0.1220274  -0.02664863 -0.03671607 -0.00106552  0.02734624
  0.12856832  0.21269831  0.10343806  0.08782884 -0.09279022  0.19372088
  0.10371569  0.04565424  0.02198105 -0.08579863 -0.01341006 -0.09842815
  0.20254584  0.05097831 -0.04657392  0.05058868 -0.00315582 -0.09058424
 -0.00074562 -0.03139703  0.08304515  0.04553097  0.16357493  0.06939736
 -0.00709527 -0.323763   -0.04050799  0.02871943  0.04668316 -0.04150385
 -0.040195    0.11863433 -0.12872186  0.00238376  0.16953269 -0.01194946
  0.2141425   0.18652633 -0.00197576  0.11851181  0.00441136  0.01272812
 -0.01980057 -0.11515225 -0.02485222  0.08343716  0.06621037 -0.00392573
 -0.04401112  0.08146104  0.15874779  0.0281127  -0

**check-plus**. Reflect in 100 words on the differences between this gender axis construction and the axis construction in SemAxis.  How are they different?

**ANSWER:** I would argue that the main difference between the gender axis construction applied in this exercise and the SemAxis, is that the latter takes into consideration specific semantic domains as input to characterize the words (An, Kwak and Ahn, 2018, p. 1-2). Consequently, depending on the selection of that domain the outcome of the analysis can be very distinct. For instance, words such as "doctor", through the SemAxis analysis, could lean much more towards a masculine gender within the “Medical” domain, like we saw during this exercise, but could also lean more towards a feminine gender in a different domain, such as “Academia”, which we were not necessarily able to observe only using the method presented by Vargas and Cotterell (2020).