# Workshop // Exploring Gender Bias in Word Embedding

## https://learn.responsibly.ai/word-embedding

### Powered by [`responsibly`](https://docs.responsibly.ai/) - Toolkit for Auditing and Mitigating Bias and Fairness of Machine Learning Systems 🔎🤖🧰

![](images/banner.png)

### Legend:
# 💎 Important
# ⚡ Be Aware - Debated issue / interpret carefully / simplicity over precision
# 🛠️ Setup/Technical (a.k.a "the code is not important, just run it!")
# 🧪 Methodological Issue
# 💻 Hands-On - Your turn! NO programming background required
# ⌨️ ... Some programming background (in Python) is required
# 🦄 Out of Scope

![](images/banner.png)

# Part One: Setup

## 1.1 - 🛠️ Install `responsibly`

In [None]:
!pip install --user git+https://github.com/ResponsiblyAI/responsibly.git@dev
# !pip install --user responsibly

## 1.2 - 🛠️ Validate Installation of `responsibly`

In [None]:
import responsibly

# You should get '0.1.2'
responsibly.__version__

---

# ⚠️


If you get an error of **`ModuleNotFoundError: No module named 'responsibly'`** after the installation, and you work on either **Colab** or **Binder** - this is **normal**.
<br/> <br/>
**Restart** the Kernel/Runtime (use the menu on top or the button in the notebook), **skip** the installation cell (`!pip install --user responsibly`) and **run** the previous cell again (`import responsibly`).

###  Now it should all work fine!

![](images/banner.png)

# Part Two: Examples of Bias in Language Technology

## 2.1 - Translation

![](images/example-translate.jpg)

Source: [Google Blog](https://www.blog.google/products/translate/reducing-gender-bias-google-translate/), [Google AI Blog](https://ai.googleblog.com/2018/12/providing-gender-specific-translations.html)

## 2.2 - Automated Speech Recognition (ASR)

![](images/asr-wer.jpg)

WER = Average Word Error Rate

`(substitutions + deletions + insertions) / total number of words`
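As a rough illustration of the formula above, the sketch below computes the WER of a single reference/hypothesis pair using a word-level edit distance. The function name `word_error_rate` and the example sentences are made up for this illustration, and the denominator is assumed to be the number of words in the reference transcript.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()

    # distance[i][j] = minimal number of edits turning ref[:i] into hyp[:j]
    distance = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        distance[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        distance[0][j] = j  # j insertions

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution_cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            distance[i][j] = min(distance[i - 1][j] + 1,  # deletion
                                 distance[i][j - 1] + 1,  # insertion
                                 distance[i - 1][j - 1] + substitution_cost)

    return distance[len(ref)][len(hyp)] / len(ref)


# one substitution (sat -> sit) and one deletion (the) over 6 reference words
wer = word_error_rate('the cat sat on the mat', 'the cat sit on mat')
print(wer)
```

The figure above reports the *average* of this quantity over many audio snippets.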

Koenecke, Allison, Andrew Nam, Emily Lake, Joe Nudell, Minnie Quartey, Zion Mengesha, Connor Toups, John R. Rickford, Dan Jurafsky, and Sharad Goel. "[Racial disparities in automated speech recognition](https://www.pnas.org/content/117/14/7684)." Proceedings of the National Academy of Sciences 117, no. 14 (2020): 7684-7689.

[Stanford News](https://news.stanford.edu/2020/03/23/automated-speech-recognition-less-accurate-blacks/)

## 2.3 - Recruiting tool

### "Amazon scraps secret AI recruiting tool that showed bias against women" ([Reuters](https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G))

"But by 2015, the company realized its new system was not rating candidates for software developer jobs and other technical posts in a gender-neutral way."

![](images/banner.png)

# Part Three: Motivation - Why Use Word Embeddings?

## 3.1 - [NLP (Natural Language Processing)](https://en.wikipedia.org/wiki/Natural_language_processing) - **Very partial** list of tasks


### 1. Classification
- Fake news classification
- Toxic comment classification
- Review rating (sentiment analysis)
- Hiring decision making by CV
- Automated essay scoring

### 2. Machine Translation

### 3. Information Retrieval
- Search engine
- Plagiarism detection

### 4. Conversation chatbot

### 5. Coreference Resolution
![](images/corefexample.png)
<small>Source: [Stanford Natural Language Processing Group](https://nlp.stanford.edu/projects/coref.shtml)</small>

<br><br><br><br>

## 3.2 - Machine Learning (NLP) Pipeline
<br>
<div style="border: 1px solid; padding: 50px; margin: 10px">
 <h2>
 
Data → Representation → (Structured) Inference → Prediction   

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;↑

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Auxiliary Corpus/Model
 </h2>
</div>
<br>

<small>Source: [Kai-Wei Chang (UCLA) - What It Takes to Control Societal Bias in Natural Language Processing](https://www.youtube.com/watch?v=RgcXD_1Cu18)</small>

![](images/banner.png)

<center>
<h2> 3.3 - Essential Question - How to represent language to a machine?</h2>
</center>

## We need some kind of *dictionary* 📖 to transform/encode
## ... from a human representation (words) 🗣 🔡
## ... to a machine representation (numbers) 🤖 🔢

<br><br><br><br>

## First Try

### Idea: Bag of Words (for a document)
![](images/bow.png)
<small>Source: Zheng, A. & Casari, A. (2018). Feature Engineering for Machine Learning. O'Reilly Media.</small>

In [None]:
from sklearn.feature_extraction.text import CountVectorizer

vocabulary = ['it', 'they', 'puppy', 'and', 'cat', 'aardvark', 'cute', 'extremely', 'not']

vectorizer = CountVectorizer(vocabulary=vocabulary)

In [None]:
sentence = 'it is puppy and it extremely cute'

Bag of words

In [None]:
vectorizer.fit_transform([sentence]).toarray()

In [None]:
vectorizer.fit_transform(['it is not puppy and it extremely cute']).toarray()

In [None]:
vectorizer.fit_transform(['it is puppy and it extremely not cute']).toarray()

One-hot representation

In [None]:
[vectorizer.fit_transform([word]).toarray()
 for word in sentence.split()]

### One-Hot Representation - The Issue with Text

![](images/audio-image-text.png)
<small>Source: [Tensorflow Documentation](https://www.tensorflow.org/tutorials/representation/word2vec)</small>

[Color Picker](https://www.google.com/search?q=color+picker)
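To make the issue concrete, here is a small sketch in plain Python (the three-word vocabulary is a toy chosen for this illustration). It shows that one-hot vectors carry no notion of similarity: every pair of distinct words is equally far apart.

```python
from math import sqrt

toy_vocabulary = ['cat', 'dog', 'graduated']  # hypothetical toy vocabulary


def one_hot(word):
    """A vector with 1 at the index of the word and 0 elsewhere."""
    return [1 if word == other else 0 for other in toy_vocabulary]


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


# 'cat' is exactly as (dis)similar to 'dog' as it is to 'graduated'
cat_dog = cosine_similarity(one_hot('cat'), one_hot('dog'))
cat_graduated = cosine_similarity(one_hot('cat'), one_hot('graduated'))

print(cat_dog, cat_graduated)
```

Colors, by contrast, live in a space where nearby points look alike, which is what the color picker demonstrates.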

<br><br><br><br>

## 3.4 - 💎 Idea: Embedding a word in an n-dimensional space

### Distributional Hypothesis
> "a word is characterized by the company it keeps" - [John Rupert Firth](https://en.wikipedia.org/wiki/John_Rupert_Firth)

#### 🦄 Training: using *word-context* relationships from a corpus. See: [The Illustrated Word2vec by Jay Alammar](http://jalammar.github.io/illustrated-word2vec/)

### Distance ~ Meaning Similarity
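The sketch below illustrates the distributional hypothesis with plain co-occurrence counts. This is a deliberately simpler technique than Word2Vec, which learns dense vectors rather than counting. The corpus, the window size and the helper names are all made up for this illustration.

```python
from collections import Counter
from math import sqrt

# hypothetical toy corpus, made up for this illustration
toy_corpus = [
    'the cat drinks milk',
    'the dog drinks water',
    'the cat chases the mouse',
    'the dog chases the ball',
    'the student reads a book',
]


def context_counts(target, window=2):
    """Count the words appearing within `window` positions of `target`."""
    counts = Counter()
    for sentence in toy_corpus:
        words = sentence.split()
        for i, word in enumerate(words):
            if word == target:
                neighbors = words[max(0, i - window):i] + words[i + 1:i + 1 + window]
                counts.update(neighbors)
    return counts


def cosine_similarity(u, v):
    # a Counter returns 0 for missing keys, so unseen context words add nothing
    dot = sum(u[key] * v[key] for key in u)
    norm_u = sqrt(sum(value ** 2 for value in u.values()))
    norm_v = sqrt(sum(value ** 2 for value in v.values()))
    return dot / (norm_u * norm_v)


cat_dog = cosine_similarity(context_counts('cat'), context_counts('dog'))
cat_student = cosine_similarity(context_counts('cat'), context_counts('student'))

print('cat ~ dog     =', cat_dog)
print('cat ~ student =', cat_student)
```

In this toy corpus, *cat* and *dog* keep similar company (*drinks*, *chases*), so their count vectors end up closer to each other than to the vector of *student*.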

### 🦄 Examples (algorithms and pre-trained models)
- [Word2Vec](https://code.google.com/archive/p/word2vec/)
- [GloVe](https://nlp.stanford.edu/projects/glove/)
- [fastText](https://fasttext.cc/)
- [ELMo](https://allennlp.org/elmo) (contextualized)

#### 🦄 State of the Art
[The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning)](http://jalammar.github.io/illustrated-bert/)

# Part Four: Let's play with Word2Vec word embedding...!

[Word2Vec](https://code.google.com/archive/p/word2vec/) - Google News - 100B tokens, 3M vocab, cased, 300d vectors - only lowercase vocab extracted

Loaded using the [`responsibly`](http://docs.responsibly.ai) package: the function `responsibly.we.load_w2v_small` returns a [`gensim`](https://radimrehurek.com/gensim/) [`KeyedVectors`](https://radimrehurek.com/gensim/models/keyedvectors.html#gensim.models.keyedvectors.KeyedVectors) object.


## 4.1 - Basic Properties

In [None]:
# 🛠️⚡ ignore warnings
# generally, you shouldn't do that, but for this tutorial we'll do so for the sake of simplicity

import warnings
warnings.filterwarnings('ignore')

In [None]:
from responsibly.we import load_w2v_small

w2v_small = load_w2v_small()

In [None]:
# vocabulary size

len(w2v_small.vocab)

In [None]:
# get the vector of the word "home"

print('home =', w2v_small['home'])

In [None]:
# the word embedding dimension, in this case, is 300

len(w2v_small['home'])

In [None]:
# all the words are normalized (=have norm equal to one as vectors)

from numpy.linalg import norm

norm(w2v_small['home'])

In [None]:
# 🛠️ make sure that all the vectors are normalized!

from numpy.testing import assert_almost_equal

length_vectors = norm(w2v_small.vectors, axis=1)

assert_almost_equal(actual=length_vectors,
                    desired=1,
                    decimal=5)

## 4.2 - 💎 Demo - Measuring Distance between Words

![](https://upload.wikimedia.org/wikipedia/commons/thumb/7/7e/Sphere_wireframe_10deg_6r.svg/480px-Sphere_wireframe_10deg_6r.svg.png)

### Measure of Similarity: [Cosine Similarity](https://en.wikipedia.org/wiki/Cosine_similarity)
#### Measures the cosine of the angle between two vectors.
#### Ranges from 1 (same vector) to -1 (opposite/antipode vector)

#### In Python, for normalized vectors (NumPy arrays), use the `@` (at) operator!
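Why is a plain dot product enough? Cosine similarity divides the dot product by the lengths of both vectors, and for normalized vectors those lengths are 1. A minimal sketch in plain Python, with toy 2D vectors that are not taken from the model:

```python
from math import sqrt, isclose


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    length = sqrt(dot(u, u))
    return [a / length for a in u]


def cosine_similarity(u, v):
    return dot(u, v) / (sqrt(dot(u, u)) * sqrt(dot(v, v)))


# hypothetical toy vectors, not taken from the Word2Vec model
u, v = [3.0, 4.0], [4.0, 3.0]

general = cosine_similarity(u, v)            # works for any non-zero vectors
shortcut = dot(normalize(u), normalize(v))   # dot product of unit vectors

print(general, shortcut, isclose(general, shortcut))
```

Since every vector in `w2v_small` has norm one, `a @ b` plays the role of `shortcut` in the cells that follow.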


In [None]:
w2v_small['cat'] @ w2v_small['cat']

In [None]:
w2v_small['cat'] @ w2v_small['cats']

In [None]:
from math import acos, degrees

degrees(acos(w2v_small['cat'] @ w2v_small['cats']))

In [None]:
w2v_small['cat'] @ w2v_small['dog']

In [None]:
degrees(acos(w2v_small['cat'] @ w2v_small['dog']))

In [None]:
w2v_small['cat'] @ w2v_small['cow']

In [None]:
degrees(acos(w2v_small['cat'] @ w2v_small['cow']))

In [None]:
w2v_small['cat'] @ w2v_small['graduated']

In [None]:
degrees(acos(w2v_small['cat'] @ w2v_small['graduated']))

# 💎 In general, using Word Embedding to encode words as an input for NLP systems (*) improves their performance compared to one-hot representation.

* Sometimes the embedding is learned as part of the NLP system.

## 4.3 - 🛠️ Demo - Visualizing Word Embedding in 2D using T-SNE

<small>Source: [Google's Seedbank](https://research.google.com/seedbank/seed/pretrained_word_embeddings)</small>

In [None]:
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# take the words ranked 200 to 600 by frequency in the corpus
words = w2v_small.index2word[200:600]

# convert the words to vectors
embeddings = [w2v_small[word] for word in words]

# perform T-SNE
words_embedded = TSNE(n_components=2).fit_transform(embeddings)

# ... and visualize!
plt.figure(figsize=(20, 20))
for i, label in enumerate(words):
    x, y = words_embedded[i, :]
    plt.scatter(x, y)
    plt.annotate(label, xy=(x, y), xytext=(5, 2), textcoords='offset points',
                 ha='right', va='bottom', size=11)
plt.show()

## 4.4 - Demo - [Tensorflow Embedding Projector](http://projector.tensorflow.org)

⚡ Be cautious: It is easy to see "patterns".

## 4.5 - Demo - Most Similar

What are the most similar (=closest) words to a given word?

In [None]:
w2v_small.most_similar('cat')

## 4.6 - [EXTRA] Demo - Doesn't Match

Given a list of words, which one doesn't match?

The word furthest away from the mean of all the words.
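A minimal sketch of the idea in plain Python: normalize the vectors, average them, and return the word least similar to that average. The 2D vectors are made up for this illustration, and the code is not `gensim`'s actual implementation.

```python
from math import sqrt

# hypothetical 2D toy vectors, made up for this illustration
toy_vectors = {
    'breakfast': [0.9, 0.1],
    'lunch': [0.8, 0.2],
    'dinner': [0.85, 0.15],
    'cereal': [0.2, 0.9],
}


def normalize(u):
    length = sqrt(sum(a * a for a in u))
    return [a / length for a in u]


def doesnt_match(words):
    unit = {word: normalize(toy_vectors[word]) for word in words}
    dimensions = range(len(next(iter(unit.values()))))
    mean = [sum(unit[word][d] for word in words) / len(words) for d in dimensions]
    mean = normalize(mean)
    # the odd one out has the lowest cosine similarity to the mean vector
    return min(words, key=lambda word: sum(a * b for a, b in zip(unit[word], mean)))


odd_one_out = doesnt_match(['breakfast', 'cereal', 'dinner', 'lunch'])
print(odd_one_out)
```

With these toy values, `cereal` points in a different direction from the three meals and is therefore returned.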

In [None]:
w2v_small.doesnt_match('breakfast cereal dinner lunch'.split())

## 4.7 - Demo - Vector Arithmetic

![](images/vector-addition.png)

<small>Source: [Wikipedia](https://commons.wikimedia.org/wiki/File:Vector_add_scale.svg)</small>

In [None]:
# nature + science = ?

w2v_small.most_similar(positive=['nature', 'science'])

## 4.8 - 💎 More Vector Arithmetic

![](https://www.tensorflow.org/images/linear-relationships.png)
<small>Source: [Tensorflow Documentation](https://www.tensorflow.org/tutorials/representation/word2vec)</small>

## 4.9 - Demo - Vector Analogy

In [None]:
# man:king :: woman:?
# king - man + woman = ?

w2v_small.most_similar(positive=['king', 'woman'],
                       negative=['man'])

In [None]:
w2v_small.most_similar(positive=['big', 'smaller'],
                       negative=['small'])

## 4.10 - Think about a DIRECTION in word embedding as a RELATION

# $\overrightarrow{she} - \overrightarrow{he}$
# $\overrightarrow{smaller} - \overrightarrow{small}$
# $\overrightarrow{Spain} - \overrightarrow{Madrid}$


### ⚡ Direction is not a word vector by itself!

### ⚡ But it doesn't work all the time...

In [None]:
w2v_small.most_similar(positive=['forward', 'up'],
                       negative=['down'])

It might be because the phrase "looking forward", which is associated with "excitement", appears in the data.

### ⚡🦄 Keep in mind that the word embedding was generated by learning the co-occurrence of words, so the fact that it *empirically* exhibits "concept arithmetic" doesn't necessarily mean it learned it! In fact, it seems it didn't.
See: [king - man + woman is queen; but why? by Piotr Migdał](https://p.migdal.pl/2017/01/06/king-man-woman-queen-why.html)

### 🦄 [Demo - Word Analogies Visualizer by Julia Bazińska](https://lamyiowce.github.io/word2viz/)

### ⚡🦄 In fact, `w2v_small.most_similar` finds the closest word which *is not one* of the given ones. This is a real methodological issue. Nowadays, it is not common practice to evaluate word embeddings with analogies.

You can use `from responsibly.we import most_similar` for the unrestricted version.
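The sketch below shows, on made-up 2D vectors, how excluding the given words can change the answer to an analogy. The vectors were constructed so that the unrestricted nearest neighbor of `king - man + woman` is `king` itself. The `analogy` helper is hypothetical, and unlike `gensim` it adds the raw toy vectors without normalizing them first.

```python
from math import sqrt

# hypothetical 2D toy vectors: first axis ~ "royalty", second axis ~ "gender"
toy_vectors = {
    'man': [0.1, -0.3],
    'woman': [0.1, 0.3],
    'king': [1.0, -0.3],
    'queen': [1.0, 1.5],
    'apple': [-1.0, 0.0],
}


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def analogy(positive, negative, restricted):
    """Closest word to sum(positive) - sum(negative)."""
    target = [sum(toy_vectors[w][d] for w in positive)
              - sum(toy_vectors[w][d] for w in negative)
              for d in range(2)]
    given = set(positive) | set(negative)
    # the restricted version never returns one of the given words
    candidates = [w for w in toy_vectors if not (restricted and w in given)]
    return max(candidates, key=lambda w: cosine_similarity(toy_vectors[w], target))


restricted_answer = analogy(['king', 'woman'], ['man'], restricted=True)
unrestricted_answer = analogy(['king', 'woman'], ['man'], restricted=False)
print(restricted_answer, unrestricted_answer)
```

Here the restricted version answers `queen` only because `king` was removed from the candidates.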

![](images/banner.png)

# Part Five: Gender Bias

### ⚡ We use the word *bias* merely as a technical term, without judgement of "good" or "bad". Later on we will put the bias into *human contexts* to evaluate it.

Keep in mind that the data comes from Google News, whose writers are professional journalists.

### Bolukbasi, Tolga, Kai-Wei Chang, James Y. Zou, Venkatesh Saligrama, and Adam T. Kalai. [Man is to computer programmer as woman is to homemaker? debiasing word embeddings](https://arxiv.org/abs/1607.06520). NIPS 2016.

## 5.1 - Gender appropriate he-she analogies

In [None]:
# she:sister :: he:?
# sister - she + he = ?

w2v_small.most_similar(positive=['sister', 'he'],
                       negative=['she'])

```
queen-king
waitress-waiter
sister-brother
mother-father
ovarian_cancer-prostate_cancer
convent-monastery
```

## 5.2 - Gender stereotype he-she analogies

In [None]:
w2v_small.most_similar(positive=['nurse', 'he'],
                       negative=['she'])

```
sewing-carpentry
nurse-doctor
blond-burly
giggle-chuckle
sassy-snappy
volleyball-football
registered_nurse-physician
interior_designer-architect
feminism-conservatism
vocalist-guitarist
diva-superstar
cupcakes-pizzas
housewife-shopkeeper
softball-baseball
cosmetics-pharmaceuticals
petite-lanky
charming-affable
hairdresser-barber
```

### But with the unrestricted version...

In [None]:
from responsibly.we import most_similar

In [None]:
most_similar(w2v_small,
             positive=['nurse', 'he'],
             negative=['she'])

## ⚡ Be Aware: According to a recent paper, it seems that the method of generating analogies itself forces gender-stereotyped results!

Nissim, M., van Noord, R., van der Goot, R. (2019). [Fair is Better than Sensational: Man is to Doctor as Woman is to Doctor](https://arxiv.org/abs/1905.09866).

... and a [Twitter thread](https://twitter.com/adamfungi/status/1133865428663635968) between the authors of the two papers.

## My takeaway (shared by other researchers): Analogies are not an appropriate method to observe bias in word embedding.

## 🧪 What if our methodology introduces a bias?

## 5.3 - 💎 What can we take from analogies? Gender Direction!

# $\overrightarrow{she} - \overrightarrow{he}$

In [None]:
gender_direction = w2v_small['she'] - w2v_small['he']

gender_direction /= norm(gender_direction)

In [None]:
gender_direction @ w2v_small['architect']

In [None]:
gender_direction @ w2v_small['interior_designer']

### ⚡ Interpret carefully: The word *architect* appears in more contexts with *he* than with *she*, and vice versa for *interior designer*.

🦄 In practice, we calculate the gender direction using multiple definitional pairs of words for better estimation (words may have more than one meaning):

- woman - man
- girl - boy
- she - he
- mother - father
- daughter - son
- gal - guy
- female - male
- her - his
- herself - himself
- Mary - John
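A rough sketch of combining several pairs into one direction. Bolukbasi et al. use PCA over the pair differences; the sketch swaps in a simpler technique, a plain average of the normalized differences, to keep the code short. The 3D vectors and the helper names are made up for this illustration.

```python
from math import sqrt

# hypothetical 3D toy vectors standing in for real embeddings
toy_vectors = {
    'she': [0.2, 0.9, 0.1], 'he': [0.2, -0.9, 0.1],
    'woman': [0.5, 0.7, 0.3], 'man': [0.5, -0.8, 0.2],
    'girl': [0.1, 0.6, 0.8], 'boy': [0.0, -0.6, 0.8],
}
definitional_pairs = [('she', 'he'), ('woman', 'man'), ('girl', 'boy')]


def normalize(u):
    length = sqrt(sum(a * a for a in u))
    return [a / length for a in u]


def pair_direction(female_word, male_word):
    difference = [a - b for a, b in zip(toy_vectors[female_word], toy_vectors[male_word])]
    return normalize(difference)


directions = [pair_direction(female, male) for female, male in definitional_pairs]

# average the unit-length pair directions, then normalize the result again
averaged_direction = normalize([sum(component) / len(directions)
                                for component in zip(*directions)])
print(averaged_direction)
```

Each pair contributes its own noise (the small first and third components here), and combining the pairs dampens it.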

## 5.4 - 💻 Try some words by yourself
⚡ Keep in mind: You are performing exploratory data analysis, not a systematic evaluation!

In [None]:
gender_direction @ w2v_small['word']

## 5.5 - 💎 So What?

### Downstream Application - Putting a system into a human context

### Toy Example - Search Engine Ranking

- "MIT computer science PhD student"
- "doctoral candidate" ~ "PhD student"
- John:computer programmer :: Mary:homemaker

### Universal Embeddings
- Pre-trained on a large corpus
- Plugged into downstream task models (sentiment analysis, classification, translation …)
- Improved performance

## 5.6 - Measuring Bias in Word Embedding

# Think-Pair-Share

```


















```
# Basic Idea: Use gender-neutral words!
```


















```

# Neutral Professions!

## 5.7 - Projections

In [None]:
from responsibly.we import GenderBiasWE

w2v_small_gender_bias = GenderBiasWE(w2v_small, only_lower=True)

In [None]:
w2v_small_gender_bias.positive_end, w2v_small_gender_bias.negative_end

In [None]:
# gender direction
w2v_small_gender_bias.direction[:10]

In [None]:
from responsibly.we.data import BOLUKBASI_DATA

neutral_profession_names = BOLUKBASI_DATA['gender']['neutral_profession_names']

In [None]:
neutral_profession_names[:8]

Note: Why is `actor` in the neutral profession names list while `actress` is not?
1. Due to the statistical nature of the method used to find the gender-specific and gender-neutral words
2. It might be because `actor` is nowadays much more gender-neutral than, say, waiter-waitress (see [Wikipedia - The term Actress](https://en.wikipedia.org/wiki/Actor#The_term_actress))

In [None]:
len(neutral_profession_names)

In [None]:
# the same as using the @ operator on the bias direction

w2v_small_gender_bias.project_on_direction(neutral_profession_names[0])

#### Let's visualize the projections of professions (neutral and specific by the orthography) on the gender direction

In [None]:
import matplotlib.pyplot as plt

f, ax = plt.subplots(1, figsize=(10, 10))

w2v_small_gender_bias.plot_projection_scores(n_extreme=20, ax=ax);

#### Demo - Visualizing gender bias with [Word Clouds](http://wordbias.umiacs.umd.edu/)

## 5.8 - Are the projections of occupation words on the gender direction related to the real world?

Let's take the percentage of women in various occupations from the Labor Force Statistics of the 2017 Current Population Survey.

Taken from: https://arxiv.org/abs/1804.06876

In [None]:
from operator import itemgetter  # 🛠️ For idiomatic sorting in Python

from responsibly.we.data import OCCUPATION_FEMALE_PRECENTAGE

sorted(OCCUPATION_FEMALE_PRECENTAGE.items(), key=itemgetter(1))

In [None]:
f, ax = plt.subplots(1, figsize=(10, 8))

w2v_small_gender_bias.plot_factual_association(ax=ax);

### Also: Word embeddings quantify 100 years of gender stereotypes

Garg, N., Schiebinger, L., Jurafsky, D., & Zou, J. (2018). [Word embeddings quantify 100 years of gender and ethnic stereotypes](https://www.pnas.org/content/pnas/115/16/E3635.full.pdf). Proceedings of the National Academy of Sciences, 115(16), E3635-E3644.

![](images/gender-bias-over-decades.png)

<small>Data: Google Books/Corpus of Historical American English (COHA)</small>

Word embedding is sometimes used to analyze a collection of text in **digital humanities** - putting a system into a human context.

#### 🧪 Quite a strong and interesting observation! We used "external" data which wasn't used directly to create the word embedding.
#### It leads us to think about the *data generation process* - in both cases it is the "world", but it will be difficult to argue for causality:
##### 1. Text in newspapers
##### 2. Employment by gender

## 5.9 - Direct Bias Measure

1. Project each **neutral profession names** on the gender direction
2. Calculate the absolute value of each projection
3. Average it all
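Written as a formula, the three steps read as follows. The symbols are chosen here for illustration: $N$ is the list of neutral profession names, $\overrightarrow{w}$ is a normalized word vector, and $\overrightarrow{g}$ is the normalized gender direction.

$$\text{DirectBias} = \frac{1}{|N|} \sum_{w \in N} \left| \overrightarrow{w} \cdot \overrightarrow{g} \right|$$

Bolukbasi et al. additionally raise each term to a "strictness" power $c$; the steps above correspond to $c = 1$.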

In [None]:
# using responsibly

w2v_small_gender_bias.calc_direct_bias()

In [None]:
# what responsibly does:

neutral_profession_projections = [w2v_small[word] @ w2v_small_gender_bias.direction
                                  for word in neutral_profession_names]

abs_neutral_profession_projections = [abs(proj) for proj in neutral_profession_projections]

sum(abs_neutral_profession_projections) / len(abs_neutral_profession_projections)

#### 🧪 What are the assumptions of the direct bias measure? How does the choice of neutral words affect the definition of the bias?

## 5.10 - [EXTRA] Indirect Bias Measure
Similarity due to shared "gender direction" projection

In [None]:
w2v_small_gender_bias.generate_closest_words_indirect_bias('softball',
                                                           'football')

# Part Six: Mitigating Bias

> We intentionally do not reference the resulting embeddings as "debiased" or free from all gender bias, and
prefer the term "mitigating bias" rather than "debiasing," to guard against the misconception that the resulting
embeddings are entirely "safe" and need not be critically evaluated for bias in downstream tasks. - <small>James-Sorenson, H., & Alvarez-Melis, D. (2019). [Probabilistic Bias Mitigation in Word Embeddings](https://arxiv.org/pdf/1910.14497.pdf). arXiv preprint arXiv:1910.14497.</small>


## 6.1 - Neutralize

In this case, we will remove the gender projection from all the words, except the gender-specific ones, and then normalize.

🦄 We need to "learn" which words in the vocabulary are gender-specific, starting from a seed set of gender-specific words (by semi-automatic use of [WordNet](https://en.wikipedia.org/wiki/WordNet))
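The neutralize step for a single word can be sketched in a few lines of plain Python: subtract the component along the gender direction, then rescale to unit length. The vectors and the helper names are made up for this illustration, and this is not `responsibly`'s actual code.

```python
from math import sqrt, isclose


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    length = sqrt(dot(u, u))
    return [a / length for a in u]


def neutralize(word_vector, direction):
    """Remove the component along `direction`, then rescale to unit length."""
    projection = dot(word_vector, direction)
    without_gender = [a - projection * g for a, g in zip(word_vector, direction)]
    return normalize(without_gender)


# hypothetical toy values, not taken from the Word2Vec model
toy_gender_direction = normalize([0.0, 1.0, 0.0])
toy_word = normalize([0.6, 0.3, 0.7])

before = dot(toy_word, toy_gender_direction)
after = dot(neutralize(toy_word, toy_gender_direction), toy_gender_direction)
print('before =', before, 'after =', after)
```

After neutralizing, the projection on the gender direction is zero, which is what the `before`/`after` cells below check on the real model.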

In [None]:
w2v_small_gender_debias = w2v_small_gender_bias.debias(method='neutralize', inplace=False)

In [None]:
print('home:',
      'before =', w2v_small_gender_bias.model['home'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['home'] @ w2v_small_gender_debias.direction)

In [None]:
print('man:',
      'before =', w2v_small_gender_bias.model['man'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['man'] @ w2v_small_gender_debias.direction)

In [None]:
print('woman:',
      'before =', w2v_small_gender_bias.model['woman'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['woman'] @ w2v_small_gender_debias.direction)

In [None]:
w2v_small_gender_debias.calc_direct_bias()

In [None]:
f, ax = plt.subplots(1, figsize=(10, 10))

w2v_small_gender_debias.plot_projection_scores(n_extreme=20, ax=ax);

In [None]:
f, ax = plt.subplots(1, figsize=(10, 8))

w2v_small_gender_debias.plot_factual_association(ax=ax);

## 6.2 [EXTRA] Equalize

- Do you see that `man` and `woman` have a different projection on the gender direction? 

- This might lead to a different similarity (distance) to neutral words, such as `kitchen`
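A sketch of what equalizing a pair can look like, assuming the formulation in Bolukbasi et al. restricted to a single gender direction. Both words keep the same gender-free part (their mean with the gender component removed) and get gender components of equal size and opposite sign. All values and helper names are made up for this illustration; this is not `responsibly`'s implementation.

```python
from math import sqrt, isclose


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    length = sqrt(dot(u, u))
    return [a / length for a in u]


def project(u, direction):
    """Component of `u` along the unit-length `direction`."""
    scale = dot(u, direction)
    return [scale * g for g in direction]


def equalize(first, second, direction):
    """Make a pair differ only along `direction`, symmetrically."""
    mean = [(a + b) / 2 for a, b in zip(first, second)]
    mean_bias = project(mean, direction)
    shared = [m - mb for m, mb in zip(mean, mean_bias)]   # gender-free part
    scale = sqrt(1 - dot(shared, shared))                 # keeps unit length

    equalized = []
    for word_vector in (first, second):
        bias_part = [wb - mb for wb, mb in zip(project(word_vector, direction), mean_bias)]
        bias_part = normalize(bias_part)
        equalized.append([s + scale * b for s, b in zip(shared, bias_part)])
    return equalized


# hypothetical toy values, not taken from the Word2Vec model
toy_direction = [0.0, 1.0, 0.0]
toy_woman = normalize([0.5, 0.7, 0.3])
toy_man = normalize([0.6, -0.4, 0.2])
toy_kitchen = normalize([0.7, 0.0, 0.7])   # already neutralized: no gender component

new_woman, new_man = equalize(toy_woman, toy_man, toy_direction)
print(dot(new_woman, toy_kitchen), dot(new_man, toy_kitchen))
```

After equalizing, the two toy vectors are equally similar to any word that has no gender component, such as `toy_kitchen`.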

In [None]:
w2v_small_gender_debias.model['man'] @ w2v_small_gender_debias.model['kitchen']

In [None]:
w2v_small_gender_debias.model['woman'] @ w2v_small_gender_debias.model['kitchen']

In [None]:
BOLUKBASI_DATA['gender']['equalize_pairs'][:10]

## 6.3 - Hard Debias = Neutralize + Equalize

In [None]:
w2v_small_gender_debias = w2v_small_gender_bias.debias(method='hard', inplace=False)

In [None]:
print('home:',
      'before =', w2v_small_gender_bias.model['home'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['home'] @ w2v_small_gender_debias.direction)

In [None]:
print('man:',
      'before =', w2v_small_gender_bias.model['man'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['man'] @ w2v_small_gender_debias.direction)

In [None]:
print('woman:',
      'before =', w2v_small_gender_bias.model['woman'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['woman'] @ w2v_small_gender_debias.direction)

In [None]:
w2v_small_gender_debias.calc_direct_bias()

In [None]:
w2v_small_gender_debias.model['man'] @ w2v_small_gender_debias.model['kitchen']

In [None]:
w2v_small_gender_debias.model['woman'] @ w2v_small_gender_debias.model['kitchen']

In [None]:
f, ax = plt.subplots(1, figsize=(10, 10))

w2v_small_gender_debias.plot_projection_scores(n_extreme=20, ax=ax);

## 6.4 - Compare Performance

After debiasing, the performance of the word embedding, using standard benchmarks, gets only slightly worse!

### ⚠️ It might take a few minutes to run!

In [None]:
w2v_small_gender_bias.evaluate_word_embedding()

In [None]:
w2v_small_gender_debias.evaluate_word_embedding()

![](images/banner.png)

# 💎 Part Seven: So What?

We removed the gender bias, **as we defined it**, in a word embedding - Is there any impact on a downstream application?


### Zhao, J., Wang, T., Yatskar, M., Ordonez, V., & Chang, K. W. (2018). [Gender bias in coreference resolution: Evaluation and debiasing methods](https://par.nsf.gov/servlets/purl/10084252). NAACL-HLT 2018.


#### WinoBias Dataset
![](images/coref-example.png)


#### Stereotypical Occupations (the source of `responsibly.we.data.OCCUPATION_FEMALE_PRECENTAGE`)
![](images/coref-occupations.png)

#### Results on *UW End-to-end Neural Coreference Resolution System*

##### No Intervention - Baseline

| Word Embedding | OntoNotes | Type 1 - Pro-stereotypical | Type 1 - Anti-stereotypical |  Avg |  Diff |
|:--------------:|:--------:|:--------------------------:|:---------------------------:|:----:|:-----:|
|    Original    |   67.7   |            76.0            |             49.4            | 62.7 | 26.6* |

##### Intervention: Named-entity anonymization

| Word Embedding | OntoNotes | Type 1 - Pro-stereotypical | Type 1 - Anti-stereotypical |  Avg |  Diff |
|:--------------:|:--------:|:--------------------------:|:---------------------------:|:----:|:-----:|
|    Original    |   66.4   |            73.5            |             51.2            | 62.6 | 21.3* |
|  Hard Debiased |   66.5   |            67.2            |             59.3            | 63.2 |  7.9* |

##### Interventions: Named-entity anonymization + Gender swapping

| Word Embedding | OntoNotes | Type 1 - Pro-stereotypical | Type 1 - Anti-stereotypical |  Avg |  Diff |
|:--------------:|:--------:|:--------------------------:|:---------------------------:|:----:|:-----:|
|    Original    |   66.2   |            65.1            |             59.2            | 62.2 |  5.9* |
|  Hard Debiased |   66.3   |            63.9            |             62.8            | 63.4 |  1.1  |

### Zhao, J., Zhou, Y., Li, Z., Wang, W., & Chang, K. W. (2018). [Learning gender-neutral word embeddings](https://arxiv.org/pdf/1809.01496.pdf). EMNLP 2018.

#### Another bias mitigation method (tailor-made for GloVe training process)

![](images/gn-glove-results.png)

![](images/banner.png)

# 💎💎 Part Eight: Meta "So What?" - I

## How should we define "bias" in word embedding?

### 1. Intrinsic (e.g., direct bias)

### 2. External - Downstream application (e.g., coreference resolution, classification)

![](images/banner.png)

# 💎 Part Nine: Have we really removed the bias?

Let's look at another metric, called **WEAT** (Word Embedding Association Test), which is inspired by the **IAT** (Implicit-Association Test) from psychology.

### Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). [Semantics derived automatically from language corpora contain human-like biases.](http://www.cs.bath.ac.uk/~jjb/ftp/CaliskanEtAl-authors-full.pdf) Science, 356(6334), 183-186.


## 9.1 - Ingredients

1. Target words (e.g., Science vs. Arts)

2. Attribute words (e.g., Male vs. Female)

In [None]:
from copy import deepcopy  # 🛠️ For copying a nested data structure in Python

from responsibly.we.weat import WEAT_DATA


# B. A. Nosek, M. R. Banaji, A. G. Greenwald, Math=male, me=female, therefore math≠me.,
# Journal of Personality and Social Psychology 83, 44 (2002).
weat_gender_science_arts = deepcopy(WEAT_DATA[7])

In [None]:
# 🛠️ filter words from the original IAT experiment that are not present in the reduced Word2Vec model

from responsibly.we.weat import _filter_by_model_weat_stimuli

_filter_by_model_weat_stimuli(weat_gender_science_arts, w2v_small)

In [None]:
weat_gender_science_arts['first_attribute']

In [None]:
weat_gender_science_arts['second_attribute']

In [None]:
weat_gender_science_arts['first_target']

In [None]:
weat_gender_science_arts['second_target']

## 9.2 - Recipe

➕ Male x Science

➖ Male x Arts

➖ Female x Science

➕ Female x Arts

In [None]:
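def calc_combination_similiarity(model, attribute, target):
    """Sum the similarity over all (attribute word, target word) pairs."""
    score = 0

    for attribute_word in attribute['words']:

        for target_word in target['words']:

            # use the `model` argument rather than the global `w2v_small`
            score += model.similarity(attribute_word,
                                      target_word)

    return score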

In [None]:
male_science_score = calc_combination_similiarity(w2v_small,
                                                  weat_gender_science_arts['first_attribute'],
                                                  weat_gender_science_arts['first_target'])

male_science_score

In [None]:
male_arts_score = calc_combination_similiarity(w2v_small,
                                               weat_gender_science_arts['first_attribute'],
                                               weat_gender_science_arts['second_target'])

male_arts_score

In [None]:
female_science_score = calc_combination_similiarity(w2v_small,
                                                    weat_gender_science_arts['second_attribute'],
                                                    weat_gender_science_arts['first_target'])

female_science_score

In [None]:
female_arts_score = calc_combination_similiarity(w2v_small,
                                                 weat_gender_science_arts['second_attribute'],
                                                 weat_gender_science_arts['second_target'])

female_arts_score

In [None]:
male_science_score - male_arts_score - female_science_score + female_arts_score

In [None]:
len(weat_gender_science_arts['first_attribute']['words'])

In [None]:
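# divide by the number of attribute words (computed in the previous cell)
n_attribute_words = len(weat_gender_science_arts['first_attribute']['words'])

(male_science_score - male_arts_score - female_science_score + female_arts_score) / n_attribute_words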

## 9.3 - All WEAT Tests

In [None]:
from responsibly.we import calc_all_weat

calc_all_weat(w2v_small, [weat_gender_science_arts])

### ⚡ Important Note: Our results are a bit different because we use a reduced Word2Vec.


### Results from the Paper (computed on the complete Word2Vec):

![](images/weat-w2v.png)


### ⚡Caveats regarding comparing WEAT to the IAT

- Individuals (IAT) vs. Words (WEAT)
- Therefore, the meanings of the effect size and p-value are totally different!

### ⚡🦄 The definition of the WEAT score is structured differently (but it is computationally equivalent). The original formulation matters to compute the p-value. Refer to the paper for details.
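For reference, a sketch of the formulation in the paper, using target sets $X$ (science) and $Y$ (arts) and attribute sets $A$ (male) and $B$ (female). The mapping of these symbols to the notebook's variables is an assumption made here for illustration.

$$s(w, A, B) = \frac{1}{|A|} \sum_{a \in A} \cos(\overrightarrow{w}, \overrightarrow{a}) - \frac{1}{|B|} \sum_{b \in B} \cos(\overrightarrow{w}, \overrightarrow{b})$$

$$s(X, Y, A, B) = \sum_{x \in X} s(x, A, B) - \sum_{y \in Y} s(y, A, B)$$

When $|A| = |B|$, expanding the sums gives the four signed terms of the recipe in section 9.2 (➕ male x science, ➖ male x arts, ➖ female x science, ➕ female x arts), divided by $|A|$.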

### 🧪 With the effect size, we can "compare" a human bias to a machine one. This raises the question of whether the baseline for measuring the bias/fairness of a machine should be human bias. If so, a well-performing machine would not necessarily have to be unbiased, only less biased than humans (think about autonomous cars, or semi-structured vs. unstructured interviews).

## 9.4 - Let's go back to our question - did we remove the bias?


### Gonen, H., & Goldberg, Y. (2019, June). [Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them](https://arxiv.org/pdf/1903.03862.pdf). In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 609-614).

They used multiple methods; we'll show only two:
1. WEAT
2. Neutral words clustering

In [None]:
w2v_small_gender_bias.calc_direct_bias()

In [None]:
w2v_small_gender_debias.calc_direct_bias()

### 9.4.1 -  WEAT - before and after

![](images/weat-experiment.png)

See the `responsibly` [demo page on word embedding](https://docs.responsibly.ai/notebooks/demo-word-embedding-bias.html#first-experiment-weat-before-and-after-debias) for a complete example.

### 9.4.2 - Clustering Neutral Gender Words

In [None]:
w2v_vocab = {word for word in w2v_small_gender_bias.model.vocab.keys()}

# 🦄 how we got these words - read Bolukbasi et al.'s paper for details
all_gender_specific_words = set(BOLUKBASI_DATA['gender']['specific_full_with_definitional_equalize'])

all_gender_neutral_words = w2v_vocab - all_gender_specific_words

print('#vocab =', len(w2v_vocab),
      '#specific =', len(all_gender_specific_words),
      '#neutral =', len(all_gender_neutral_words))

In [None]:
neutral_words_gender_projections = [(w2v_small_gender_bias.project_on_direction(word), word)
                                    for word in all_gender_neutral_words]

neutral_words_gender_projections.sort()

In [None]:
# the 20 largest projections, in descending order
neutral_words_gender_projections[:-21:-1]

In [None]:
neutral_words_gender_projections[:20]

In [None]:
# Neutral words: top 500 male-biased and top 500 female-biased words

GenderBiasWE.plot_most_biased_clustering(w2v_small_gender_bias, w2v_small_gender_debias);

Note: In the paper, they got a stronger result: 92.5% accuracy for the debiased model.
However, they performed the clustering on all the words from the reduced word embedding, both gender-neutral and gender-specific, and applied slightly different pre-processing.
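The idea behind the clustering test can be sketched on synthetic data: remove the component along one direction, cluster the remaining vectors into two groups, and check how well the clusters align with the original bias labels. The code below is a toy illustration under stated assumptions, not the internals of `plot_most_biased_clustering`. The helpers `two_means` and `alignment_accuracy` and the random vectors are hypothetical, and in practice a library k-means implementation would be the usual choice.

```python
import numpy as np


def two_means(points, n_iter=20):
    """Lloyd's k-means with k=2 and a deterministic initialization."""
    # Start from the first point and the point farthest away from it.
    far = np.argmax(np.linalg.norm(points - points[0], axis=1))
    centers = np.stack([points[0], points[far]]).astype(float)
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Move each center to the mean of its points (skip empty clusters).
        for k in range(2):
            if np.any(labels == k):
                centers[k] = points[labels == k].mean(axis=0)
    return labels


def alignment_accuracy(cluster_labels, bias_labels):
    """Share of points whose cluster matches their label, up to cluster naming."""
    agreement = np.mean(cluster_labels == bias_labels)
    return float(max(agreement, 1 - agreement))


# Synthetic "embedding": the two groups differ along axis 1, not only axis 0.
rng = np.random.default_rng(0)
n_per_group, dim = 50, 5
offset = np.zeros(dim)
offset[1] = 1.0
group_a = rng.normal(loc=offset, scale=0.2, size=(n_per_group, dim))
group_b = rng.normal(loc=-offset, scale=0.2, size=(n_per_group, dim))
toy_vectors = np.vstack([group_a, group_b])
toy_vectors[:, 0] = 0.0  # toy "debiasing": zero out the component along axis 0
original_bias_labels = np.array([0] * n_per_group + [1] * n_per_group)

accuracy = alignment_accuracy(two_means(toy_vectors), original_bias_labels)
print(f"cluster/bias alignment: {accuracy:.2f}")
```

If the groups remain separable after the direction is removed, the alignment stays high, which is the kind of evidence the paper reports for the debiased embedding.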

### 9.4.3 - 💎 Strong words from the paper (emphasis mine):

> The experiments ...
reveal a **systematic bias** found in the embeddings,
which is **independent of the gender direction**.


> The implications are alarming: while suggested
debiasing methods work well at removing the gender direction, the **debiasing is mostly superficial**.
The bias stemming from world stereotypes and
learned from the corpus is **ingrained much more
deeply** in the embeddings space.


> .. real concern from biased representations is **not the association** of a concept with
words such as “he”, “she”, “boy”, “girl” **nor** being
able to perform **gender-stereotypical word analogies**... algorithmic discrimination is more likely to happen by associating one **implicitly gendered** term with
other implicitly gendered terms, or picking up on
**gender-specific regularities** in the corpus by learning to condition on gender-biased words, and generalizing to other gender-biased words.


![](images/banner.png)

# 💎💎 Part Ten: Meta "So What?" - II

## Can we debias a word embedding at all?

## In some downstream use cases, might the bias in the word embedding be desirable?

![](images/banner.png)

# ⌨️ Part Eleven: Your Turn!

## Explore bias in word embedding by other groups (such as race and religion)

**Task 1.** Let's explore racial bias using Tolga's approach (Bolukbasi et al.). We'll use the [`responsibly.we.BiasWordEmbedding`](http://docs.responsibly.ai/word-embedding-bias.html#ethically.we.bias.BiasWordEmbedding) class. `GenderBiasWE` is a sub-class of `BiasWordEmbedding`.

In [None]:
from responsibly.we import BiasWordEmbedding

w2v_small_racial_bias = BiasWordEmbedding(w2v_small, only_lower=True)

#### 💎💎💎 Identify the racial direction using the `sum` method

In [None]:
white_common_names = ['Emily', 'Anne', 'Jill', 'Allison', 'Laurie', 'Sarah', 'Meredith', 'Carrie',
                      'Kristen', 'Todd', 'Neil', 'Geoffrey', 'Brett', 'Brendan', 'Greg', 'Matthew',
                      'Jay', 'Brad']

black_common_names = ['Aisha', 'Keisha', 'Tamika', 'Lakisha', 'Tanisha', 'Latoya', 'Kenya', 'Latonya',
                      'Ebony', 'Rasheed', 'Tremayne', 'Kareem', 'Darnell', 'Tyrone', 'Hakim', 'Jamal',
                      'Leroy', 'Jermaine']

w2v_small_racial_bias._identify_direction('Whites', 'Blacks',
                                          definitional=(white_common_names, black_common_names),
                                          method='sum')
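To see what a "sum"-style direction could look like, here is a minimal NumPy sketch: sum the vectors of each group, normalize each sum, and take the normalized difference. This is one plausible reading of the `sum` method, not a copy of the library's code; whether and where `responsibly` normalizes is an assumption, and `unit`, `sum_direction`, and the toy 2-D vectors are hypothetical.

```python
import numpy as np


def unit(v):
    """Scale a vector to length 1."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)


def sum_direction(group1_vectors, group2_vectors):
    """Direction pointing from the group-2 sum toward the group-1 sum."""
    g1 = unit(np.sum(group1_vectors, axis=0))
    g2 = unit(np.sum(group2_vectors, axis=0))
    return unit(g1 - g2)


# Toy 2-D "name" vectors (made up, not real embeddings).
toy_group1 = np.array([[1.0, 0.2], [1.0, -0.2]])  # sums to [2, 0]
toy_group2 = np.array([[0.0, 1.0], [0.0, 3.0]])   # sums to [0, 4]

toy_direction = sum_direction(toy_group1, toy_group2)
print(toy_direction)

# Projections on the direction: positive leans toward group 1, negative toward group 2.
print(toy_group1 @ toy_direction)
print(toy_group2 @ toy_direction)
```

With a direction like this, projecting a profession name onto it gives a signed score, which is what the projection plot in the next step visualizes.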

#### Use the neutral profession names to measure the racial bias

In [None]:
neutral_profession_names = BOLUKBASI_DATA['gender']['neutral_profession_names']

In [None]:
neutral_profession_names[:10]

In [None]:
f, ax = plt.subplots(1, figsize=(10, 10))

w2v_small_racial_bias.plot_projection_scores(neutral_profession_names, n_extreme=20, ax=ax);

#### Calculate the direct bias measure

In [None]:
# Your Code Here...

#### Keep exploring the racial bias

In [None]:
# Your Code Here...

**Task 2.** Open the [word embedding demo page in `responsibly` documentation](http://docs.responsibly.ai/notebooks/demo-word-embedding-bias.html#it-is-possible-also-to-expirements-with-new-target-word-sets-as-in-this-example-citizen-immigrant), and look at how the function `calc_weat_pleasant_unpleasant_attribute` is used. What was that experiment trying to test? What was the result? Can you come up with other experiments?

In [None]:
from responsibly.we import calc_weat_pleasant_unpleasant_attribute

In [None]:
# Your Code Here...

![](images/banner.png)

# Part Twelve: Examples of Representation Bias in the Context of Gender

![](images/examples-gender-bias-nlp.png)

<small>Source: Sun, T., Gaut, A., Tang, S., Huang, Y., ElSherief, M., Zhao, J., ... & Wang, W. Y. (2019). [Mitigating Gender Bias in Natural Language Processing: Literature Review](https://arxiv.org/pdf/1906.08976.pdf). arXiv preprint arXiv:1906.08976.</small>


![](images/banner.png)

# 💎💎 Part Thirteen: Takeaways

1. **Downstream application** - putting a system into a human context

2. **Measurements** (a.k.a "what is a *good* system?")

3. **Data** (generation process, corpus building, selection bias, train vs. validation vs. test datasets)

4. **Impact** of a system on individuals, groups, society, and humanity, both in the **long term** and at **scale**

![](images/banner.png)

# Resources

### [Doing Data Science Responsibly - Resources](https://handbook.responsibly.ai/appendices/resources.html)


## Non-Technical Overview with More Downstream Application Examples
- [Google - Text Embedding Models Contain Bias. Here's Why That Matters.](https://developers.googleblog.com/2018/04/text-embedding-models-contain-bias.html)
- [Kai-Wei Chang (UCLA) - What It Takes to Control Societal Bias in Natural Language Processing](https://www.youtube.com/watch?v=RgcXD_1Cu18)
- Sun, T., Gaut, A., Tang, S., Huang, Y., ElSherief, M., Zhao, J., ... & Wang, W. Y. (2019). [Mitigating Gender Bias in Natural Language Processing: Literature Review](https://arxiv.org/pdf/1906.08976.pdf). arXiv preprint arXiv:1906.08976.

## Additional Related Work

- **Understanding Bias**
  - Ethayarajh, K., Duvenaud, D., & Hirst, G. (2019, July). [Understanding Undesirable Word Embedding Associations](https://arxiv.org/pdf/1908.06361.pdf). In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 1696-1705). - **Including critical analysis of the current metrics and debiasing methods (quite technical)**

  - Brunet, M. E., Alkalay-Houlihan, C., Anderson, A., & Zemel, R. (2019, May). [Understanding the Origins of Bias in Word Embeddings](https://arxiv.org/pdf/1810.03611.pdf). In International Conference on Machine Learning (pp. 803-811).


- **Discovering Biases**
  - Swinger, N., De-Arteaga, M., Heffernan IV, N. T., Leiserson, M. D., & Kalai, A. T. (2019, January). [What are the biases in my word embedding?](https://arxiv.org/pdf/1812.08769.pdf). In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society (pp. 305-311). ACM.
  
  - Chaloner, K., & Maldonado, A. (2019, August). [Measuring Gender Bias in Word Embeddings across Domains and Discovering New Gender Bias Word Categories](https://www.aclweb.org/anthology/W19-3804). In Proceedings of the First Workshop on Gender Bias in Natural Language Processing (pp. 25-32).


- **Fairness in Classification**
  - Prost, F., Thain, N., & Bolukbasi, T. (2019, August). [Debiasing Embeddings for Reduced Gender Bias in Text Classification](https://arxiv.org/pdf/1908.02810.pdf). In Proceedings of the First Workshop on Gender Bias in Natural Language Processing (pp. 69-75).
  
  - Romanov, A., De-Arteaga, M., Wallach, H., Chayes, J., Borgs, C., Chouldechova, A., ... & Kalai, A. (2019, June). [What's in a Name? Reducing Bias in Bios without Access to Protected Attributes](https://arxiv.org/pdf/1904.05233.pdf). In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4187-4195).


- **Other**
  
  - Zhao, J., Wang, T., Yatskar, M., Cotterell, R., Ordonez, V., & Chang, K. W. (2019, June). [Gender Bias in Contextualized Word Embeddings](https://arxiv.org/pdf/1904.03310.pdf). In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 629-634). [slides](https://jyzhao.net/files/naacl19.pdf)

  - Zhou, P., Shi, W., Zhao, J., Huang, K. H., Chen, M., & Chang, K. W. [Analyzing and Mitigating Gender Bias in Languages with Grammatical Gender and Bilingual Word Embeddings](https://aiforsocialgood.github.io/icml2019/accepted/track1/pdfs/47_aisg_icml2019.pdf). ICML 2019 - AI for Social Good. [Poster](https://aiforsocialgood.github.io/icml2019/accepted/track1/posters/47_aisg_icml2019.pdf)


##### Complete example of using `responsibly` with Word2Vec, GloVe and fastText: http://docs.responsibly.ai/notebooks/demo-gender-bias-words-embedding.html


## Bias in NLP

Only around a dozen papers were published in this field up to 2019, but nowadays plenty of work is being done.
- [1st ACL Workshop on Gender Bias for Natural Language Processing](https://genderbiasnlp.talp.cat/)
- [NAACL 2019](https://naacl2019.org/)

![](images/banner.png)

<center><h1>THE END!</h1></center>