### Quantifying and Reducing Gender Stereotypes in Word Embeddings

Ensuring fairness in algorithmically driven decision-making is important to avoid inadvertent bias and the perpetuation of harmful stereotypes. However, modern natural language processing techniques, which learn model parameters from data, may pick up implicit biases present in that data and make undesirable stereotypical associations. This danger arises with word embeddings, a popular framework for representing text as vectors that is used in many machine learning and natural language processing tasks. Recent results ([1](https://arxiv.org/abs/1607.06520), [2](https://arxiv.org/abs/1608.07187)) show that even word embeddings trained on Google News articles exhibit female/male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases.

In the following, we provide step-by-step instructions to demonstrate and quantify the biases in word embeddings.



In [1]:
# Setup:
# Clone the code repository from https://github.com/tolga-b/debiaswe.git
# mkdir debiaswe_tutorial
# cd debiaswe_tutorial
# git clone https://github.com/tolga-b/debiaswe.git

# To reduce download time, we provide a subset of GoogleNews-vectors at the following location:
# https://drive.google.com/file/d/1NH6jcrg8SXbnhpIXRIXF_-KUE7wGxGaG/view?usp=sharing

# For full embeddings:
# Download the embeddings from https://github.com/tolga-b/debiaswe and put them in the following directory:
# embeddings/GoogleNews-vectors-negative300-hard-debiased.bin
# embeddings/GoogleNews-vectors-negative300.bin

In [18]:
from __future__ import print_function, division
%matplotlib inline
from matplotlib import pyplot as plt
import json
import random
import numpy as np

import debiaswe as dwe
import debiaswe.we as we
from debiaswe.we import WordEmbedding
from debiaswe.data import load_professions

## Part 1: Gender Bias in Word Embedding


### Step 1: Load data
We first load the word embedding trained on a corpus of Google News texts. The full embedding covers 3 million English words and terms; the subset loaded here contains 26,423 of them, as the output below shows. The embedding maps each word to a 300-dimensional vector.

In [19]:
# load google news word2vec
E = WordEmbedding('w2v_gnews_small.txt')

# load professions
professions = load_professions()
profession_words = [p[0] for p in professions]

*** Reading data from w2v_gnews_small.txt
(26423, 300)
26423 words of dimension 300 : in, for, that, is, ..., Jay, Leroy, Brad, Jermaine
Loaded professions
Format:
word,
definitional female -1.0 -> definitional male 1.0
stereotypical female -1.0 -> stereotypical male 1.0


### Step 2: Define gender direction

We define the gender direction as the direction of she - he, because these two words are frequent and have fewer alternative word senses (e.g., man can also refer to mankind). In the paper, we discuss alternative approaches for defining the gender direction (e.g., using PCA).
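
To make the PCA alternative more concrete, here is a minimal sketch on toy vectors. It assumes that each definitional pair is centered at its midpoint and that the top principal component of the centered vectors is taken as the direction. The helper `pca_direction` and the toy dictionary are hypothetical and are not part of `debiaswe`.

```python
import numpy as np

def pca_direction(vectors, pairs):
    """Top principal component of pair members, each pair centered at its midpoint.

    `vectors` maps word -> 1-D numpy array; `pairs` is a list of (female, male) words.
    This is an illustrative helper, not the debiaswe implementation.
    """
    rows = []
    for f, m in pairs:
        center = (vectors[f] + vectors[m]) / 2.0
        rows.append(vectors[f] - center)
        rows.append(vectors[m] - center)
    matrix = np.array(rows)
    # The right singular vectors of the centered matrix are its principal axes.
    _, _, vt = np.linalg.svd(matrix, full_matrices=False)
    direction = vt[0]
    return direction / np.linalg.norm(direction)

# Toy 3-D vectors in which gender is planted along the first axis.
toy = {
    "she": np.array([1.0, 0.2, 0.1]),
    "he": np.array([-1.0, 0.2, 0.1]),
    "woman": np.array([0.9, -0.3, 0.4]),
    "man": np.array([-0.9, -0.3, 0.5]),
}
g = pca_direction(toy, [("she", "he"), ("woman", "man")])
print(np.round(np.abs(g), 3))
```

The sign of a principal component is arbitrary, so the sketch prints absolute values; in practice the direction would be oriented by checking which side a known word such as she falls on.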

In [20]:
# gender direction
v_gender = E.diff('she', 'he')

### Step 3: Generating analogies of "Man: x :: Woman : y"

We show that the word embedding model generates gender-stereotypical analogy pairs.
To generate the analogy pairs, we use the analogy score defined in our paper. This score finds word pairs that are well aligned with gender direction as well as within a short distance from each other to preserve topic consistency. 
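
The two criteria in this score (alignment with the gender direction and a small distance between the two words) can be sketched as follows. This is a toy illustration, not the code behind `best_analogies_dist_thresh`: the function name `analogy_score` and the threshold `delta=1.0` are assumptions chosen for the example.

```python
import numpy as np

def analogy_score(a, b, direction, delta=1.0):
    """Cosine between (a - b) and `direction`, or 0 if a and b are farther apart than delta."""
    diff = a - b
    dist = np.linalg.norm(diff)
    if dist == 0.0 or dist > delta:
        return 0.0
    return float(diff.dot(direction) / (dist * np.linalg.norm(direction)))

g = np.array([1.0, 0.0])        # toy gender direction
near_a = np.array([0.8, 0.6])   # unit vector
near_b = np.array([0.0, 1.0])   # close to near_a, differs mostly along g
far_b = np.array([-0.8, -0.6])  # opposite side of the unit circle, too far from near_a

score_near = analogy_score(near_a, near_b, g)
score_far = analogy_score(near_a, far_b, g)
print(score_near, score_far)
```

The distance cutoff is what keeps the pairs topically related: without it, any two words on opposite sides of the gender direction would score highly regardless of meaning.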


In [21]:
# analogies gender
a_gender = E.best_analogies_dist_thresh(v_gender)

for (a,b,c) in a_gender:
    print(a+"-"+b)

Computing neighbors
Mean: 10.219732808538016
Median: 7.0
she-he
herself-himself
her-his
woman-man
daughter-son
businesswoman-businessman
girl-boy
actress-actor
chairwoman-chairman
heroine-hero
mother-father
spokeswoman-spokesman
sister-brother
girls-boys
sisters-brothers
queen-king
niece-nephew
councilwoman-councilman
motherhood-fatherhood
women-men
petite-lanky
ovarian_cancer-prostate_cancer
Anne-John
schoolgirl-schoolboy
granddaughter-grandson
aunt-uncle
matriarch-patriarch
twin_sister-twin_brother
mom-dad
lesbian-gay
husband-younger_brother
gal-dude
lady-gentleman
sorority-fraternity
mothers-fathers
grandmother-grandfather
blouse-shirt
soprano-baritone
queens-kings
Jill-Greg
daughters-sons
grandma-grandpa
volleyball-football
diva-superstar
mommy-kid
Sarah-Matthew
hairdresser-barber
softball-baseball
goddess-god
Aisha-Jamal
waitress-waiter
princess-prince
filly-colt
mare-gelding
ladies-gentlemen
childhood-boyhood
interior_designer-architect
nun-priest
wig-beard
granddaughters-grandso

### "She" occupations and the corresponding "he" occupations that signal the bias

In [22]:
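# Note: the embedding spells multi-word terms with underscores (e.g. 'interior_designer'),
# so 'interior designer' as written below does not match any analogy pair.
she_occupation = ['homemaker', 'nurse', 'receptionist', 'librarian', 'socialite', 'hairdresser', 
'nanny', 'bookkeeper', 'stylist', 'housekeeper', 'interior designer', 'sewing']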

for (a,b,c) in a_gender:
    if a in she_occupation:
        print(a+"-"+b+"-"+str(c))

hairdresser-barber-0.4368279
nurse-surgeon-0.37444478
sewing-carpentry-0.35139665
nanny-chauffeur-0.30954373
librarian-curator-0.2307742
housekeeper-janitor-0.22288004
bookkeeper-treasurer-0.20113158


### Step 4: Analyzing gender bias in word vectors associated with professions

Next, we show that many occupations are unintentionally associated with either male or female by projecting their word vectors onto the gender direction.

The script will output the profession words sorted with respect to the projection score in the direction of gender.
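
As a small self-contained illustration of this projection, the sketch below ranks a few toy vectors by their dot product with a unit-length she - he direction. The vectors, the placeholder words, and the helper `rank_by_projection` are made up for the example and do not come from the embedding.

```python
import numpy as np

def rank_by_projection(vectors, words, direction):
    """Return (score, word) pairs sorted from most negative to most positive projection."""
    return sorted((float(vectors[w].dot(direction)), w) for w in words)

toy = {
    "she": np.array([0.6, 0.8]),
    "he": np.array([-0.6, 0.8]),
    "word_a": np.array([0.3, 0.95]),
    "word_b": np.array([-0.2, 0.98]),
    "word_c": np.array([0.0, 1.0]),
}
diff = toy["she"] - toy["he"]
g = diff / np.linalg.norm(diff)  # unit vector pointing from "he" towards "she"

ranked = rank_by_projection(toy, ["word_a", "word_b", "word_c"], g)
for score, word in ranked:
    print(f"{word}: {score:+.2f}")
```

Because the direction points from he towards she, negative scores lean towards he and positive scores towards she, which is the sign convention used in the rest of this section.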

In [23]:
# profession analysis gender
sp = sorted([(E.v(w).dot(v_gender), w) for w in profession_words])

sp[0:20], sp[-20:]

([(-0.23798442, 'maestro'),
  (-0.21665451, 'statesman'),
  (-0.2075867, 'skipper'),
  (-0.20267202, 'protege'),
  (-0.2020676, 'businessman'),
  (-0.19492392, 'sportsman'),
  (-0.18836352, 'philosopher'),
  (-0.18073659, 'marksman'),
  (-0.1728986, 'captain'),
  (-0.16785558, 'architect'),
  (-0.16702037, 'financier'),
  (-0.16313638, 'warrior'),
  (-0.15280862, 'major_leaguer'),
  (-0.15001443, 'trumpeter'),
  (-0.14718868, 'broadcaster'),
  (-0.14637241, 'magician'),
  (-0.14401694, 'fighter_pilot'),
  (-0.13782284, 'boss'),
  (-0.137182, 'industrialist'),
  (-0.13684885, 'pundit')],
 [(0.19714224, 'interior_designer'),
  (0.2083344, 'housekeeper'),
  (0.21560377, 'stylist'),
  (0.2236317, 'bookkeeper'),
  (0.23776126, 'maid'),
  (0.24125955, 'nun'),
  (0.2478258, 'nanny'),
  (0.24929331, 'hairdresser'),
  (0.24946159, 'paralegal'),
  (0.25276464, 'ballerina'),
  (0.25718823, 'socialite'),
  (0.26647124, 'librarian'),
  (0.27317625, 'receptionist'),
  (0.2754029, 'waitress'),
  (0.2

### Gender Bias in Profession

"He" professions are on the negative side of the x-axis; "she" professions are on the positive side.

In [24]:
# Import only the plotting helper used below instead of a wildcard import
from plots import plot_words_extreme

In [25]:
plot_words_extreme(sp[:20], sp[-20:], 40, x_title='Word Extremes', y_title='Similarity', 
title='Gender Bias In Profession')

## DEBIASING

In [26]:
from debiaswe.debias import debias, soft_debias

In [27]:
# Let's load some gender-related word lists to help us with debiasing
with open('./data/definitional_pairs.json', "r") as f:
    defs = json.load(f)
print("definitional", defs)

with open('./data/equalize_pairs.json', "r") as f:
    equalize_pairs = json.load(f)

with open('./data/gender_specific_seed.json', "r") as f:
    gender_specific_words = json.load(f)
print("gender specific", len(gender_specific_words), gender_specific_words[:10])

definitional [['woman', 'man'], ['girl', 'boy'], ['she', 'he'], ['mother', 'father'], ['daughter', 'son'], ['gal', 'guy'], ['female', 'male'], ['her', 'his'], ['herself', 'himself']]
gender specific 218 ['actress', 'actresses', 'aunt', 'aunts', 'bachelor', 'ballerina', 'barbershop', 'baritone', 'beard', 'beards']


In [28]:
debias(E, gender_specific_words, defs, equalize_pairs)

26423 words of dimension 300 : in, for, that, is, ..., Jay, Leroy, Brad, Jermaine
{('Male', 'Female'), ('Schoolboy', 'Schoolgirl'), ('Spokesman', 'Spokeswoman'), ('HE', 'SHE'), ('Grandpa', 'Grandma'), ('Fatherhood', 'Motherhood'), ('dad', 'mom'), ('councilman', 'councilwoman'), ('congressman', 'congresswoman'), ('brothers', 'sisters'), ('SON', 'DAUGHTER'), ('EX_GIRLFRIEND', 'EX_BOYFRIEND'), ('FELLA', 'GRANNY'), ('COLT', 'FILLY'), ('GRANDFATHER', 'GRANDMOTHER'), ('Males', 'Females'), ('Men', 'Women'), ('males', 'females'), ('fraternity', 'sorority'), ('Fella', 'Granny'), ('spokesman', 'spokeswoman'), ('grandsons', 'granddaughters'), ('SONS', 'DAUGHTERS'), ('Councilman', 'Councilwoman'), ('MEN', 'WOMEN'), ('Monastery', 'Convent'), ('Prostate_Cancer', 'Ovarian_Cancer'), ('fathers', 'mothers'), ('GRANDSON', 'GRANDDAUGHTER'), ('FATHERS', 'MOTHERS'), ('grandfather', 'grandmother'), ('MAN', 'WOMAN'), ('Boys', 'Girls'), ('BROTHERS', 'SISTERS'), ('man', 'woman'), ('FATHERHOOD', 'MOTHERHOOD'), (
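
The internals of `debias` are not shown in this notebook. As a rough guide to what hard debiasing does to gender-neutral words, the sketch below illustrates the neutralize step as described in [1]: the component of a word vector along the gender direction is removed and the result is rescaled to unit length. The function name and toy vectors are hypothetical, and the equalize step applied to the pairs printed above is not covered.

```python
import numpy as np

def neutralize(vec, direction):
    """Remove the component of `vec` along `direction`, then rescale to unit length."""
    unit = direction / np.linalg.norm(direction)
    residual = vec - vec.dot(unit) * unit
    return residual / np.linalg.norm(residual)

g = np.array([1.0, 0.0, 0.0])      # toy gender direction
word = np.array([0.3, 0.9, 0.3])   # toy vector with a gender component
word = word / np.linalg.norm(word)

before = word.dot(g)
after = neutralize(word, g).dot(g)
print(f"projection before: {before:.3f}, after: {after:.3f}")
```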

In [29]:
# profession analysis gender
sp_debiased = sorted([(E.v(w).dot(v_gender), w) for w in profession_words])

sp_debiased[0:20], sp_debiased[-20:]

([(-0.4154882, 'congressman'),
  (-0.4115872, 'businessman'),
  (-0.3297558, 'councilman'),
  (-0.2997815, 'dad'),
  (-0.21665451, 'statesman'),
  (-0.11345412, 'salesman'),
  (-0.073004864, 'monk'),
  (-0.07216395, 'handyman'),
  (-0.041478347, 'commander'),
  (-0.040511727, 'minister'),
  (-0.037369374, 'skipper'),
  (-0.036916208, 'commissioner'),
  (-0.033430077, 'observer'),
  (-0.032828875, 'manager'),
  (-0.032105125, 'firebrand'),
  (-0.031202454, 'surgeon'),
  (-0.03115207, 'citizen'),
  (-0.031070886, 'archbishop'),
  (-0.029434487, 'bishop'),
  (-0.029371599, 'captain')],
 [(0.033466958, 'drummer'),
  (0.03364377, 'student'),
  (0.03403572, 'illustrator'),
  (0.034525257, 'patrolman'),
  (0.034747515, 'hairdresser'),
  (0.037651714, 'foreman'),
  (0.03772233, 'carpenter'),
  (0.03777121, 'pastor'),
  (0.038419854, 'nanny'),
  (0.038888875, 'teenager'),
  (0.040513154, 'janitor'),
  (0.043088168, 'firefighter'),
  (0.047822032, 'wrestler'),
  (0.23776129, 'maid'),
  (0.241259

In [30]:
# analogies gender
a_gender_debiased = E.best_analogies_dist_thresh(v_gender)

for (a,b,c) in a_gender_debiased:
    print(a+"-"+b)

Computing neighbors
Mean: 10.216326685084963
Median: 7.0
grandma-grandpa
aunt-uncle
niece-nephew
moms-dads
schoolgirl-schoolboy
granddaughters-grandsons
convent-monastery
female-male
granddaughter-grandson
twin_sister-twin_brother
queens-kings
woman-man
motherhood-fatherhood
ladies-gentlemen
mare-gelding
filly-colt
ovarian_cancer-prostate_cancer
women-men
queen-king
herself-himself
congresswoman-congressman
councilwoman-councilman
daughter-son
sorority-fraternity
mom-dad
husbands-wives
sisters-brothers
gals-dudes
ex_boyfriend-ex_girlfriend
daughters-sons
chairwoman-chairman
sister-brother
girls-boys
females-males
estrogen-testosterone
mother-father
her-his
grandmother-grandfather
businesswoman-businessman
mothers-fathers
princess-prince
she-he
spokeswoman-spokesman
girl-boy
actress-actor
lesbian-gay
compatriot-countryman
husband-younger_brother
gal-dude
hers-theirs
heroine-hero
feminism-feminist
actresses-actors
childhood-boyhood
kid-guy
me-him
waitress-waiter
mommy-daddy
aunts-uncles


In [31]:
she_occupation = ['carpenter', 'nurse', 'receptionist', 'librarian', 'socialite', 'hairdresser', 
'maid', 'bookkeeper', 'stylist', 'housekeeper', 'interior designer', 'sewing']
for (a,b,c) in a_gender_debiased:
    if a in she_occupation:
        print(a+"-"+b+"-"+str(c))

maid-housekeeper-0.2925283
carpenter-handyman-0.13277006


In [32]:
plot_words_extreme(sp_debiased[:20], sp_debiased[-20:], 40, x_title='Word Extremes', y_title='Similarity', 
title='Debiased Gender - Profession')

### SOFT DEBIASING

In [33]:
# Create a new word embedding for testing soft debiasing
E_soft = WordEmbedding('w2v_gnews_small.txt')


*** Reading data from w2v_gnews_small.txt
(26423, 300)
26423 words of dimension 300 : in, for, that, is, ..., Jay, Leroy, Brad, Jermaine


In [34]:
soft_debias(E_soft, gender_specific_words, defs, equalize_pairs)

Optimization Completed, normalizing vector transform
26423 words of dimension 300 : in, for, that, is, ..., Jay, Leroy, Brad, Jermaine


In [35]:
# profession analysis gender
sp_softdebiased = sorted([(E_soft.v(w).dot(v_gender), w) for w in profession_words])

sp_softdebiased[0:20], sp_softdebiased[-20:]

([(-0.1777548, 'drug_addict'),
  (-0.16677907, 'plastic_surgeon'),
  (-0.16512035, 'actor'),
  (-0.16350327, 'alter_ego'),
  (-0.16339687, 'screenwriter'),
  (-0.15790638, 'comic'),
  (-0.15600741, 'confesses'),
  (-0.15448184, 'psychiatrist'),
  (-0.15214662, 'soft_spoken'),
  (-0.14799502, 'teenager'),
  (-0.14312887, 'neurosurgeon'),
  (-0.13936411, 'lawmaker'),
  (-0.13859978, 'comedian'),
  (-0.13564049, 'singer_songwriter'),
  (-0.13466011, 'cartoonist'),
  (-0.13441814, 'poet'),
  (-0.13341786, 'entrepreneur'),
  (-0.13248195, 'parliamentarian'),
  (-0.13184093, 'naturalist'),
  (-0.12937966, 'homemaker')],
 [(0.023444435, 'ambassador'),
  (0.025120577, 'ballerina'),
  (0.025296357, 'assistant_professor'),
  (0.025751486, 'rabbi'),
  (0.029306762, 'infielder'),
  (0.029671054, 'hooker'),
  (0.029877175, 'professor'),
  (0.030477457, 'adjunct_professor'),
  (0.03172747, 'realtor'),
  (0.040930294, 'paralegal'),
  (0.041478716, 'president'),
  (0.045913365, 'attorney'),
  (0.05157

In [36]:
# analogies gender
a_gender_softdebiased = E_soft.best_analogies_dist_thresh(v_gender)

for (a,b,c) in a_gender_softdebiased:
    print(a+"-"+b)

Computing neighbors
Mean: 7.023426560193771
Median: 5.0
transporters-transporter
suit_alleges-indictment_alleges
pour-inject
claims-claimed
sisters-twins
flooding-flash_flooding
peso-euro
warmer-calmer
city-county
apples-pumpkin
pilots-pilot
weakens-loses
rains-thunderstorms
quieted-subsided
vigorously-intensively
exceedingly-remarkably
dunk-dribble
underwhelming-unremarkable
flood-flash_flood
stipulated-envisaged
broke-smashed
comes-takes
migrant_workers-illegal_immigrants
indignant-bemused
hateful-profane
withstood-survived
imposition-impose
intellectuals-poets
superheroes-superhero
payout-jackpot
counterinsurgency-counter_terrorism
nationalization-nationalized
gays-bisexual
topped-eclipsed
songwriters-songwriter
condo-bungalow
energized-rejuvenated
strived-aspired
warriors-warrior
hard_liners-hardline
push-propel
missionaries-missionary
vampires-vampire
antibiotic-injectable
feminists-feminist
utmost_importance-ensure
invalidate-disqualify
contends-insists
argues-reckons
alleged-sus

In [35]:
from sklearn.metrics.pairwise import cosine_similarity

def compute_cosine_similarity(embedding, word1, word2):
    vec1 = embedding.v(word1).reshape(1, -1)
    vec2 = embedding.v(word2).reshape(1, -1)
    return cosine_similarity(vec1, vec2)[0][0]

# Example usage
word1 = 'dog'
word2 = 'tree'
similarity = compute_cosine_similarity(E, word1, word2)
print(f"Cosine similarity between '{word1}' and '{word2}': {similarity:.4f}")


Cosine similarity between 'dog' and 'tree': 0.2898


In [37]:
for (a,b,c) in a_gender_softdebiased:
    if a in she_occupation:
        print(a+"-"+b+"-"+str(c))

In [38]:
plot_words_extreme(sp_softdebiased[:20], sp_softdebiased[-20:], 40, x_title='Word Extremes', y_title='Similarity', 
title='Soft Debiased Gender - Profession')