<a href="https://colab.research.google.com/github/axel-sirota/implement-nlp-word-embedding/blob/main/module4/Module4_Demo3_Debiase_Word_Embeddings.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Debiasing word embeddings

## Load data and libraries

In [1]:
!pip install -U gensim



In [2]:
%%writefile get_data.sh
if [ ! -f glove.6B.zip ]; then
  wget -O glove.6B.zip https://nlp.stanford.edu/data/glove.6B.zip
  unzip glove.6B.zip
fi

Overwriting get_data.sh


In [3]:
!bash get_data.sh


## Testing the bias of the embedding

In [4]:
import warnings
warnings.filterwarnings('ignore')
from gensim.scripts.glove2word2vec import glove2word2vec
glove2word2vec(glove_input_file="glove.6B.50d.txt", word2vec_output_file="emb_word2vec_format.txt")

  


(400000, 50)

In [5]:
import gensim
model = gensim.models.KeyedVectors.load_word2vec_format('emb_word2vec_format.txt')

In [6]:
from scipy import spatial
pairs = [('father', 'mother'), ('man', 'computer'), ('tennis', 'couch')]
for word_tuple in pairs:
  print(f'Words: {word_tuple} and similarity is {1-spatial.distance.cosine(model[word_tuple[0]], model[word_tuple[1]])}')


Words: ('father', 'mother') and similarity is 0.8909038305282593
Words: ('man', 'computer') and similarity is 0.4015541970729828
Words: ('tennis', 'couch') and similarity is 0.2106797844171524


In [23]:
import numpy as np
not_normalized_gender_vector = model['man'] - model['woman']
gender_vector = not_normalized_gender_vector/np.sqrt(np.dot(not_normalized_gender_vector, not_normalized_gender_vector))
for word in ['sports', 'politics', 'economy', 'digits', 'love', 'policewoman', 'equality', 'technology', 'receptionist']:
  print(f'Gender_vector similarity to {word} is {1-spatial.distance.cosine(gender_vector, model[word])}')

Gender_vector similarity to sports is 0.17934592068195343
Gender_vector similarity to politics is 0.012288778088986874
Gender_vector similarity to economy is 0.0778021365404129
Gender_vector similarity to digits is 0.02250431478023529
Gender_vector similarity to love is 0.006848746445029974
Gender_vector similarity to policewoman is -0.26551297307014465
Gender_vector similarity to equality is -0.17352420091629028
Gender_vector similarity to technology is 0.13193733990192413
Gender_vector similarity to receptionist is -0.33077940344810486


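The gender vector points from *woman* to *man*, so a positive similarity means a word leans toward *man* and a negative one means it leans toward *woman*. Here *sports* (0.18) and *technology* (0.13) lean toward *man*, while *receptionist* (-0.33), *policewoman* (-0.27), and *equality* (-0.17) lean toward *woman*; *politics*, *digits*, and *love* are close to neutral. The skew for *policewoman* is expected because the word is gendered by definition. For the other words, the skew is the bias we want to remove.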
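To make the sign convention concrete, here is a minimal sketch on invented 3-dimensional vectors. The `toy` dictionary, its values, and the `cosine_similarity` helper are hypothetical and not taken from GloVe. It builds a unit-length direction the same way as the cell above and checks which side of that direction each word falls on.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1 = same direction, -1 = opposite)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical toy "embeddings", invented purely for illustration.
toy = {
    "man": np.array([1.0, 0.2, 0.0]),
    "woman": np.array([-1.0, 0.2, 0.0]),
    "word_a": np.array([0.5, 1.0, 0.3]),   # has a component toward "man"
    "word_b": np.array([-0.5, 1.0, 0.3]),  # has a component toward "woman"
    "word_c": np.array([0.0, 1.0, 0.3]),   # no component along the direction
}

# Same construction as gender_vector above: difference of the pair, scaled to unit length.
direction = toy["man"] - toy["woman"]
direction = direction / np.linalg.norm(direction)

scores = {w: cosine_similarity(direction, toy[w]) for w in ("word_a", "word_b", "word_c")}
for word, score in scores.items():
    print(f"{word}: {score:+.3f}")
```

In this toy setup `word_a` scores positive, `word_b` scores negative by the same amount, and `word_c` scores zero, which is the pattern to look for in the real output.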

## Debias non-gender words

In [25]:
def neutralize(word, gender_vector, model):
  word_projection_in_gender = np.dot(model[word], gender_vector)*gender_vector
  return model[word] - word_projection_in_gender

In [26]:
for word in ['sports', 'politics', 'economy', 'digits', 'love', 'policewoman', 'equality', 'technology', 'receptionist']:
  print(f'Gender_vector similarity to neutralized {word} is {1-spatial.distance.cosine(gender_vector, neutralize(word, gender_vector, model))}')

Gender_vector similarity to neutralized sports is 1.3393989917176441e-08
Gender_vector similarity to neutralized politics is -1.424718121256774e-10
Gender_vector similarity to neutralized economy is 1.4764868438987833e-08
Gender_vector similarity to neutralized digits is 6.64072086209444e-09
Gender_vector similarity to neutralized love is -2.858495662394489e-10
Gender_vector similarity to neutralized policewoman is -4.062933811610492e-08
Gender_vector similarity to neutralized equality is 5.002243064211598e-09
Gender_vector similarity to neutralized technology is 2.2143172628830143e-08
Gender_vector similarity to neutralized receptionist is 1.0329419097843129e-08
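
The similarities above are on the order of 1e-8, which is consistent with floating-point rounding rather than a leftover gender component. The sketch below illustrates why: subtracting a vector's projection onto a unit-length direction leaves a result orthogonal to that direction. The helper name `neutralize_vector` and the toy values are assumptions made for illustration; it mirrors `neutralize` above but takes raw arrays instead of a word and a model.

```python
import numpy as np

def neutralize_vector(vector, direction):
    """Remove the component of `vector` along `direction` (assumed unit length)."""
    return vector - np.dot(vector, direction) * direction

# Hypothetical values, chosen so the arithmetic is easy to follow.
direction = np.array([0.6, 0.8, 0.0])   # unit length: 0.36 + 0.64 = 1
vector = np.array([1.0, 2.0, 3.0])

neutral = neutralize_vector(vector, direction)
residual = float(np.dot(neutral, direction))   # what is left along the direction

# Neutralizing a second time should change nothing, since the component is already gone.
neutral_twice = neutralize_vector(neutral, direction)

print(neutral)
print(residual)
```

Note that the step only works as intended when the direction has unit length, which is why `gender_vector` is normalized before use.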


## Equalize gender-specific words

In [27]:
def equalize(word1, word2, gender_vector, model):    
    word_rep_1 = model[word1]
    word_rep_2 = model[word2]

    average_rep = (word_rep_1 + word_rep_2) / 2

    average_in_gender_vector = np.dot(average_rep, gender_vector)* gender_vector
    average_in_non_gender_vector = average_rep - average_in_gender_vector

    word_1_in_gender_vector = np.dot(word_rep_1, gender_vector) * gender_vector
    word_2_in_gender_vector = np.dot(word_rep_2, gender_vector) * gender_vector
        
    corrected_word_1_in_gender_vector = np.sqrt(np.abs(1 - np.sum(average_in_non_gender_vector * average_in_non_gender_vector))) * (word_1_in_gender_vector - average_in_gender_vector) / np.linalg.norm(word_rep_1 - average_in_non_gender_vector - average_in_gender_vector)
    corrected_word_2_in_gender_vector = np.sqrt(np.abs(1 - np.sum(average_in_non_gender_vector * average_in_non_gender_vector))) * (word_2_in_gender_vector - average_in_gender_vector) / np.linalg.norm(word_rep_2 - average_in_non_gender_vector - average_in_gender_vector)

    e1 = corrected_word_1_in_gender_vector + average_in_non_gender_vector
    e2 = corrected_word_2_in_gender_vector + average_in_non_gender_vector

    return e1, e2

In [29]:
for (word1, word2) in [('man', 'woman'), ('policeman', 'policewoman'), ('actor', 'actress')]:
  e1, e2 = equalize(word1, word2, gender_vector, model)
  print(f'Gender_vector similarity to equalized pair {(word1)} is {1-spatial.distance.cosine(gender_vector, e1)}')
  print(f'Gender_vector similarity to equalized pair {(word2)} is {1-spatial.distance.cosine(gender_vector, e2)}')

Gender_vector similarity to equalized pair man is 0.7004363536834717
Gender_vector similarity to equalized pair woman is -0.7004364132881165
Gender_vector similarity to equalized pair policeman is 0.22585387527942657
Gender_vector similarity to equalized pair policewoman is -0.22585390508174896
Gender_vector similarity to equalized pair actor is 0.6083327531814575
Gender_vector similarity to equalized pair actress is -0.6083329916000366
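
Each pair above ends up with similarities of equal magnitude and opposite sign. The sketch below reproduces that behaviour on invented vectors, following the same steps as `equalize` but on raw arrays. The name `equalize_vectors` and the toy values are hypothetical. One assumption worth flagging: the `sqrt(|1 - ...|)` scaling appears designed for vectors of roughly unit length, so the toy vectors are kept short.

```python
import numpy as np

def equalize_vectors(w1, w2, direction):
    """Same steps as equalize() above, applied to raw arrays; `direction` is unit length."""
    mu = (w1 + w2) / 2
    mu_b = np.dot(mu, direction) * direction        # part of the mean along the direction
    mu_orth = mu - mu_b                             # part of the mean orthogonal to it
    w1_b = np.dot(w1, direction) * direction
    w2_b = np.dot(w2, direction) * direction
    scale = np.sqrt(np.abs(1 - np.sum(mu_orth * mu_orth)))
    c1 = scale * (w1_b - mu_b) / np.linalg.norm(w1 - mu_orth - mu_b)
    c2 = scale * (w2_b - mu_b) / np.linalg.norm(w2 - mu_orth - mu_b)
    return c1 + mu_orth, c2 + mu_orth

# Hypothetical toy pair and direction, for illustration only.
g = np.array([1.0, 0.0, 0.0])
w1 = np.array([0.6, 0.3, 0.1])
w2 = np.array([-0.2, 0.5, 0.1])

e1, e2 = equalize_vectors(w1, w2, g)
along_1, along_2 = float(np.dot(e1, g)), float(np.dot(e2, g))
rest_1, rest_2 = e1 - along_1 * g, e2 - along_2 * g   # components orthogonal to g
print(along_1, along_2)
print(rest_1, rest_2)
```

The two outputs share the same orthogonal component and differ only by the sign of their component along the direction, so they sit at the same distance on either side of it.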


## Mixing it all together

In [30]:
print(f'Checking the new similarity of equalized man/woman with original neutralized words')
e1, e2 = equalize('man', 'woman', gender_vector, model)
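# Each word is compared twice: first with the equalized 'man' vector (e1),
# then with the equalized 'woman' vector (e2).
# The printed label says "Gender_vector", but the value is the cosine similarity
# between e1 (or e2) and the neutralized word, not a similarity to gender_vector.
for word in ['sports', 'politics', 'economy', 'digits', 'love', 'policewoman', 'equality', 'technology', 'receptionist']:
  print(f'Gender_vector similarity to neutralized {word} is {1-spatial.distance.cosine(e1, neutralize(word, gender_vector, model))}')
  print(f'Gender_vector similarity to neutralized {word} is {1-spatial.distance.cosine(e2, neutralize(word, gender_vector, model))}')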

Checking the new similarity of equalized man/woman with original neutralized words
Gender_vector similarity to neutralized sports is 0.2870727479457855
Gender_vector similarity to neutralized sports is 0.28707271814346313
Gender_vector similarity to neutralized politics is 0.3767041563987732
Gender_vector similarity to neutralized politics is 0.3767041563987732
Gender_vector similarity to neutralized economy is 0.23900271952152252
Gender_vector similarity to neutralized economy is 0.23900270462036133
Gender_vector similarity to neutralized digits is 0.06872960180044174
Gender_vector similarity to neutralized digits is 0.06872957199811935
Gender_vector similarity to neutralized love is 0.525387704372406
Gender_vector similarity to neutralized love is 0.5253877639770508
Gender_vector similarity to neutralized policewoman is 0.27720245718955994
Gender_vector similarity to neutralized policewoman is 0.2772025465965271
Gender_vector similarity to neutralized equality is 0.16693219542503357


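In the output above, each word is printed twice: first compared with the equalized *man* vector, then with the equalized *woman* vector. The two values agree up to floating-point rounding, so after neutralizing and equalizing, each of these words is equally similar to *man* and *woman* along this gender direction. The same neutralize-and-equalize procedure can be applied to other kinds of bias, provided a direction that captures the bias can be defined.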
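
The geometric reason for the matching values can be sketched in a few lines. Two equalized vectors share one component and differ only by the sign of their component along the bias direction, while a neutralized word has no component along that direction at all. The vectors below are invented for illustration, and `cosine_similarity` is a hypothetical helper.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between u and v."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical toy values, for illustration only.
g = np.array([1.0, 0.0, 0.0])               # unit bias direction
shared = np.array([0.0, 0.4, 0.1])          # component both equalized vectors share
e1 = shared + 0.9 * g                       # equalized vector on one side
e2 = shared - 0.9 * g                       # its mirror image on the other side
neutral_word = np.array([0.0, 0.7, -0.2])   # neutralized: orthogonal to g

# The dot products only see `shared`, and e1 and e2 have the same norm,
# so both cosine similarities come out the same.
sim_1 = cosine_similarity(e1, neutral_word)
sim_2 = cosine_similarity(e2, neutral_word)
print(sim_1, sim_2)
```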