# Workshop // Exploring Gender Bias in Word Embedding

## https://learn.responsibly.ai/word-embedding

Powered by [`responsibly`](https://docs.responsibly.ai/) - Toolkit for auditing and mitigating bias and fairness of machine learning systems 🔎🤖🧰

# Part Five: Gender Bias

**⚡ We use the word *bias* merely as a technical term, without judgment of "good" or "bad". Later on we will put the bias into *human contexts* to evaluate it.**

Keep in mind that the data comes from Google News, so the writers are professional journalists.

Bolukbasi, T., Chang, K. W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). [Man is to computer programmer as woman is to homemaker? Debiasing word embeddings](https://arxiv.org/abs/1607.06520). NIPS 2016.

## 5.1 - Gender appropriate he-she analogies

In [None]:
# she:sister :: he:?
# sister - she + he = ?

w2v_small.most_similar(positive=['sister', 'he'],
                       negative=['she'])

```
queen-king
waitress-waiter
sister-brother
mother-father
ovarian_cancer-prostate_cancer
convent-monastery
```

## 5.2 - Gender stereotype he-she analogies

In [None]:
w2v_small.most_similar(positive=['nurse', 'he'],
                       negative=['she'])

```
sewing-carpentry
nurse-doctor
blond-burly
giggle-chuckle
sassy-snappy
volleyball-football
registered_nurse-physician
interior_designer-architect
feminism-conservatism
vocalist-guitarist
diva-superstar
cupcakes-pizzas
housewife-shopkeeper
softball-baseball
cosmetics-pharmaceuticals
petite-lanky
charming-affable
hairdresser-barber
```

### Methodological Issue: The unrestricted version of analogy generation

In [None]:
from responsibly.we import most_similar

In [None]:
most_similar(w2v_small,
             positive=['nurse', 'he'],
             negative=['she'])

⚡ Be aware: according to a recent paper, the method of generating analogies itself seems to push towards gender-stereotyped results, because the standard method never returns one of the query words!

Nissim, M., van Noord, R., van der Goot, R. (2019). [Fair is Better than Sensational: Man is to Doctor as Woman is to Doctor](https://arxiv.org/abs/1905.09866).
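
To see why the restriction matters, here is a minimal sketch on hand-made 3-dimensional toy vectors (hypothetical values, not taken from `w2v_small`). The helper `analogy` is an illustrative stand-in, not the gensim or `responsibly` implementation: with `restricted=True` it drops the query words from the candidates, and with `restricted=False` it keeps them.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in vec))
    return [x / length for x in vec]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embedding; axes are roughly (gender, medicine, other).
toy_vectors = {
    'she': normalize([0.2, 0.0, 1.0]),
    'he': normalize([-0.2, 0.0, 1.0]),
    'nurse': normalize([0.3, 1.0, 0.0]),
    'doctor': normalize([-0.1, 0.8, 0.6]),
    'pizza': normalize([0.0, 0.1, -1.0]),
}


def analogy(vectors, positive, negative, restricted=True):
    """Rank words by cosine similarity to sum(positive) - sum(negative)."""
    dim = len(next(iter(vectors.values())))
    query = [0.0] * dim
    for word in positive:
        query = [q + x for q, x in zip(query, vectors[word])]
    for word in negative:
        query = [q - x for q, x in zip(query, vectors[word])]

    # The restricted variant is not allowed to answer with a query word.
    excluded = set(positive) | set(negative) if restricted else set()
    scores = {word: cosine(query, vec)
              for word, vec in vectors.items() if word not in excluded}
    return sorted(scores, key=scores.get, reverse=True)


# nurse - she + he = ?
restricted_answer = analogy(toy_vectors, ['nurse', 'he'], ['she'])[0]
unrestricted_answer = analogy(toy_vectors, ['nurse', 'he'], ['she'],
                              restricted=False)[0]
print('restricted:', restricted_answer)
print('unrestricted:', unrestricted_answer)
```

In this toy space the unrestricted query returns `nurse` itself, while the restricted one is forced to pick a different word and returns `doctor`.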

... and a [Twitter thread](https://twitter.com/adamfungi/status/1133865428663635968) between the authors of the two papers.

My takeaway (shared by other researchers): analogies are not an appropriate method for observing bias in word embedding.

🧪 What if our methodology introduces a bias?

## 5.3 - 💎 What can we take from analogies? Gender Direction!

# $\overrightarrow{she} - \overrightarrow{he}$

In [None]:
from numpy.linalg import norm  # needed for the normalization below

gender_direction = w2v_small['she'] - w2v_small['he']

gender_direction /= norm(gender_direction)

In [None]:
gender_direction @ w2v_small['architect']

In [None]:
gender_direction @ w2v_small['interior_designer']

**⚡ Interpret carefully: The word *architect* appears in more contexts with *he* than with *she*, and vice versa for *interior designer*.**

🦄 In practice, we calculate the gender direction using multiple definitional pairs of words for a better estimate (words may have more than one meaning):

- woman - man
- girl - boy
- she - he
- mother - father
- daughter - son
- gal - guy
- female - male
- her - his
- herself - himself
- Mary - John
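
A minimal sketch of estimating one direction from several definitional pairs, on hypothetical 3-dimensional toy vectors. Note the simplification: this sketch averages the normalized difference vectors, whereas Bolukbasi et al. take the top principal component of the centered pairs. The names `toy_pairs` and `estimate_direction` are invented here for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in vec))
    return [x / length for x in vec]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical (female_vector, male_vector) toy pairs;
# the first axis carries most of the gender signal.
toy_pairs = {
    ('she', 'he'): ([0.9, 0.1, 0.3], [-0.8, 0.2, 0.3]),
    ('woman', 'man'): ([0.7, 0.5, 0.1], [-0.9, 0.4, 0.2]),
    ('girl', 'boy'): ([0.8, -0.2, 0.5], [-0.7, -0.1, 0.4]),
}


def estimate_direction(pairs):
    """Average the unit-length difference vectors, then renormalize."""
    diffs = [normalize([f - m for f, m in zip(female, male)])
             for female, male in pairs.values()]
    mean_diff = [sum(column) / len(diffs) for column in zip(*diffs)]
    return normalize(mean_diff)


toy_direction = estimate_direction(toy_pairs)

# A positive projection leans towards the "she" end,
# a negative one towards the "he" end.
toy_word = normalize([-0.4, 0.9, 0.1])
print(dot(toy_direction, toy_word))
```

Using several pairs dampens the quirks of any single pair, such as a second meaning of one of its words.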

## 5.4 - 💻 Try some words by yourself
⚡ Keep in mind: You are performing exploratory data analysis, not a systematic evaluation!

In [None]:
gender_direction @ w2v_small['word']

## 5.5 - 💎 So What?

Downstream Application - Putting a system into a human context

### Toy Example - Search Engine Ranking

- "MIT computer science PhD student"
- "doctoral candidate" ~ "PhD student"
- John:computer programmer :: Mary:homemaker
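
A minimal sketch of how this could play out, assuming a hypothetical ranker that scores each page by the cosine similarity between the averaged word vectors of the query and of the page. All vectors are hand-made toy values (axes roughly: gender, computer science, other), and the two pages differ only in the first name.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(words, vectors):
    """Average the vectors of the given words."""
    columns = zip(*(vectors[word] for word in words))
    return [sum(column) / len(words) for column in columns]


# Hypothetical toy embedding, not taken from a real model.
toy_vectors = {
    'computer_science': [-0.2, 1.0, 0.1],  # assumed slight lean towards "he"
    'doctoral': [-0.1, 0.9, 0.3],
    'candidate': [0.0, 0.8, 0.3],
    'john': [-1.0, 0.0, 0.2],
    'mary': [1.0, 0.0, 0.2],
}

query = ['computer_science']
pages = {
    "John's page": ['john', 'doctoral', 'candidate'],
    "Mary's page": ['mary', 'doctoral', 'candidate'],
}

query_vector = mean_vector(query, toy_vectors)
scores = {name: cosine(query_vector, mean_vector(words, toy_vectors))
          for name, words in pages.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy setup John's page ranks first although the two pages are otherwise identical, because the query vector shares a small gender component with the name.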

### Universal Embeddings
- Pre-trained on a large corpus
- Plugged into downstream task models (sentiment analysis, classification, translation …)
- Improved performance

## 5.6 - Measuring Bias in Word Embedding

# Think-Pair-Share

**Basic idea: use gender-neutral words!**

**Neutral Professions!**

## 5.7 - Projections

In [None]:
from responsibly.we import GenderBiasWE

w2v_small_gender_bias = GenderBiasWE(w2v_small, only_lower=True)

In [None]:
w2v_small_gender_bias.positive_end, w2v_small_gender_bias.negative_end

In [None]:
# gender direction
w2v_small_gender_bias.direction[:10]

In [None]:
from responsibly.we.data import BOLUKBASI_DATA

neutral_profession_names = BOLUKBASI_DATA['gender']['neutral_profession_names']

In [None]:
neutral_profession_names[:8]

Note: Why is `actor` in the neutral profession names list while `actress` is not?
1. Due to the statistical nature of the method used to find the gender-specific and neutral words
2. Possibly because `actor` is nowadays used in a much more gender-neutral way than `waiter` (compare waiter-waitress; see [Wikipedia - The term Actress](https://en.wikipedia.org/wiki/Actor#The_term_actress))

In [None]:
len(neutral_profession_names)

In [None]:
# the same as using the @ operator on the bias direction

w2v_small_gender_bias.project_on_direction(neutral_profession_names[0])

**Let's visualize the projections of professions (neutral and specific by the orthography) on the gender direction**

In [None]:
import matplotlib.pyplot as plt

f, ax = plt.subplots(1, figsize=(10, 10))

w2v_small_gender_bias.plot_projection_scores(n_extreme=20, ax=ax);

EXTRA: Demo - Visualizing gender bias with [Word Clouds](http://wordbias.umiacs.umd.edu/)

## 5.8 - Are the projections of occupation words on the gender direction related to the real world?

Let's take the percentage of women in various occupations from the Labor Force Statistics of the 2017 Current Population Survey.

Taken from: https://arxiv.org/abs/1804.06876

In [None]:
from operator import itemgetter  # 🛠️ For idiomatic sorting in Python

from responsibly.we.data import OCCUPATION_FEMALE_PRECENTAGE

sorted(OCCUPATION_FEMALE_PRECENTAGE.items(), key=itemgetter(1))

In [None]:
f, ax = plt.subplots(1, figsize=(10, 8))

w2v_small_gender_bias.plot_factual_association(ax=ax);

### Also: Word embeddings quantify 100 years of gender stereotypes

Garg, N., Schiebinger, L., Jurafsky, D., & Zou, J. (2018). [Word embeddings quantify 100 years of gender and ethnic stereotypes](https://www.pnas.org/content/pnas/115/16/E3635.full.pdf). Proceedings of the National Academy of Sciences, 115(16), E3635-E3644.

![](../images/gender-bias-over-decades.png)

<small>Data: Google Books/Corpus of Historical American English (COHA)</small>

Word embedding is sometimes used to analyze a collection of text in **digital humanities** - putting a system into a human context.

🧪 Quite a strong and interesting observation! We used "external" data which wasn't used directly to create the word embedding.

This leads us to think about the *data generation process* - in both cases it is the "world", but it would be difficult to argue for causality in only one direction:
1. Text in newspapers
2. Employment by gender

## 5.9 - Direct Bias Measure

1. Project each **neutral profession names** on the gender direction
2. Calculate the absolute value of each projection
3. Average it all
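
Written as a formula, with $N$ the set of neutral profession names, $\vec{w}$ the vector of word $w$, and $\hat{g}$ the unit-length gender direction, the three steps above amount to:

$$
\text{DirectBias} = \frac{1}{|N|} \sum_{w \in N} \left| \vec{w} \cdot \hat{g} \right|
$$

Bolukbasi et al. state the measure with cosine similarity and a strictness exponent $c$, as $\frac{1}{|N|} \sum_{w \in N} \left| \cos(\vec{w}, g) \right|^{c}$; for unit-length word vectors and $c = 1$ the two expressions coincide.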

In [None]:
# using responsibly

w2v_small_gender_bias.calc_direct_bias()

In [None]:
# what responsibly does:

neutral_profession_projections = [w2v_small[word] @ w2v_small_gender_bias.direction
                                  for word in neutral_profession_names]

abs_neutral_profession_projections = [abs(proj) for proj in neutral_profession_projections]

sum(abs_neutral_profession_projections) / len(abs_neutral_profession_projections)

🧪 What are the assumptions of the direct bias measure? How does the choice of neutral words affect the definition of the bias?

## 5.10 - [EXTRA] Indirect Bias Measure
Similarity due to shared "gender direction" projection

In [None]:
w2v_small_gender_bias.generate_closest_words_indirect_bias('softball',
                                                           'football')
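
A minimal sketch of the indirect bias between two words, following the definition in Bolukbasi et al.: the share of the similarity between two words that disappears once the gender component is removed from both. The vectors are hypothetical unit-length toy vectors, and `indirect_bias` is a name chosen here, not the `responsibly` implementation.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(vec):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(vec, vec))
    return [x / length for x in vec]


def reject(vec, direction):
    """Remove from vec its component along the unit-length direction."""
    projection = dot(vec, direction)
    return [x - projection * d for x, d in zip(vec, direction)]


def indirect_bias(w, v, direction):
    """Share of the w-v similarity that is explained by the direction."""
    w_perp = reject(w, direction)
    v_perp = reject(v, direction)
    similarity = dot(w, v)
    similarity_without_gender = dot(w_perp, v_perp) / (
        math.sqrt(dot(w_perp, w_perp)) * math.sqrt(dot(v_perp, v_perp)))
    return (similarity - similarity_without_gender) / similarity


toy_direction = [1.0, 0.0, 0.0]

# Two words that are similar only through their shared gender component.
word_a = normalize([0.6, 0.8, 0.0])
word_b = normalize([0.6, 0.0, 0.8])
beta_gendered = indirect_bias(word_a, word_b, toy_direction)

# Two words with no gender component at all.
word_c = normalize([0.0, 1.0, 0.0])
word_d = normalize([0.0, 1.0, 1.0])
beta_neutral = indirect_bias(word_c, word_d, toy_direction)

print(beta_gendered, beta_neutral)
```

A value near 1 means the similarity comes almost entirely from the gender direction, and a value near 0 means the gender direction contributes nothing.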

# Part Six: Mitigating Bias

> We intentionally do not reference the resulting embeddings as "debiased" or free from all gender bias, and
prefer the term "mitigating bias" rather than "debiasing," to guard against the misconception that the resulting
embeddings are entirely "safe" and need not be critically evaluated for bias in downstream tasks. <small>James-Sorenson, H., & Alvarez-Melis, D. (2019). [Probabilistic Bias Mitigation in Word Embeddings](https://arxiv.org/pdf/1910.14497.pdf). arXiv preprint arXiv:1910.14497.</small>


## 6.1 - Neutralize

In this case, we will remove the gender projection from all the words, except the gender-specific ones, and then normalize.

🦄 We need to "learn" which words in the vocabulary are gender-specific, starting from a seed set of gender-specific words (by semi-automatic use of [WordNet](https://en.wikipedia.org/wiki/WordNet))
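
The neutralize step itself is a projection followed by renormalization. A minimal sketch on hypothetical toy vectors (the function name `neutralize` and all numbers are illustrative, not the `responsibly` internals):

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(vec):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(vec, vec))
    return [x / length for x in vec]


def neutralize(vec, direction):
    """Remove the component along the unit-length direction, then renormalize."""
    projection = dot(vec, direction)
    without_gender = [x - projection * d for x, d in zip(vec, direction)]
    return normalize(without_gender)


# Hypothetical toy values standing in for the gender direction and for "home".
toy_direction = normalize([1.0, 0.2, 0.0])
toy_home = normalize([0.3, 0.8, 0.5])

toy_home_neutral = neutralize(toy_home, toy_direction)

print('before =', dot(toy_home, toy_direction))
print('after  =', dot(toy_home_neutral, toy_direction))
```

After the step the projection on the direction is zero up to floating-point error, and the vector has unit length again.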

In [None]:
w2v_small_gender_debias = w2v_small_gender_bias.debias(method='neutralize', inplace=False)

In [None]:
print('home:',
      'before =', w2v_small_gender_bias.model['home'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['home'] @ w2v_small_gender_debias.direction)

In [None]:
print('man:',
      'before =', w2v_small_gender_bias.model['man'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['man'] @ w2v_small_gender_debias.direction)

In [None]:
print('woman:',
      'before =', w2v_small_gender_bias.model['woman'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['woman'] @ w2v_small_gender_debias.direction)

In [None]:
w2v_small_gender_debias.calc_direct_bias()

In [None]:
f, ax = plt.subplots(1, figsize=(10, 10))

w2v_small_gender_debias.plot_projection_scores(n_extreme=20, ax=ax);

In [None]:
f, ax = plt.subplots(1, figsize=(10, 8))

w2v_small_gender_debias.plot_factual_association(ax=ax);

## 6.2 - [EXTRA] Equalize

- Do you see that `man` and `woman` have different projections on the gender direction?

- This might lead to a different similarity (distance) to neutral words, such as `kitchen`

In [None]:
w2v_small_gender_debias.model['man'] @ w2v_small_gender_debias.model['kitchen']

In [None]:
w2v_small_gender_debias.model['woman'] @ w2v_small_gender_debias.model['kitchen']

In [None]:
BOLUKBASI_DATA['gender']['equalize_pairs'][:10]
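
A minimal sketch of the equalize step for a single pair, following the formulas in Bolukbasi et al., on hypothetical toy vectors. After equalizing, both words share the same gender-free part and sit at opposite, equal-sized offsets along the gender direction, so any neutralized word is equally similar to both. `equalize_pair` is a name chosen here; the sketch assumes the two words have different projections on the direction.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(vec):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(vec, vec))
    return [x / length for x in vec]


def equalize_pair(a, b, direction):
    """Equalize two unit vectors with respect to a unit-length direction."""
    mean = [(x + y) / 2 for x, y in zip(a, b)]
    mean_proj = dot(mean, direction)
    # Part of the pair's mean that is orthogonal to the direction.
    shared = [m - mean_proj * d for m, d in zip(mean, direction)]
    # Offset along the direction that restores unit length.
    offset = math.sqrt(max(0.0, 1.0 - dot(shared, shared)))

    equalized = []
    for vec in (a, b):
        # Keep each word on its original side of the pair's mean.
        sign = 1.0 if dot(vec, direction) > mean_proj else -1.0
        equalized.append([s + sign * offset * d
                          for s, d in zip(shared, direction)])
    return equalized


toy_direction = [1.0, 0.0, 0.0]
toy_woman = normalize([0.5, 0.7, 0.2])
toy_man = normalize([-0.3, 0.6, 0.4])
toy_kitchen = normalize([0.0, 0.9, 0.1])  # already neutralized

woman_eq, man_eq = equalize_pair(toy_woman, toy_man, toy_direction)

print('before:', dot(toy_woman, toy_kitchen), dot(toy_man, toy_kitchen))
print('after: ', dot(woman_eq, toy_kitchen), dot(man_eq, toy_kitchen))
```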

## 6.3 - Hard Debias = Neutralize + Equalize

In [None]:
w2v_small_gender_debias = w2v_small_gender_bias.debias(method='hard', inplace=False)

In [None]:
print('home:',
      'before =', w2v_small_gender_bias.model['home'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['home'] @ w2v_small_gender_debias.direction)

In [None]:
print('man:',
      'before =', w2v_small_gender_bias.model['man'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['man'] @ w2v_small_gender_debias.direction)

In [None]:
print('woman:',
      'before =', w2v_small_gender_bias.model['woman'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['woman'] @ w2v_small_gender_debias.direction)

In [None]:
w2v_small_gender_debias.calc_direct_bias()

In [None]:
w2v_small_gender_debias.model['man'] @ w2v_small_gender_debias.model['kitchen']

In [None]:
w2v_small_gender_debias.model['woman'] @ w2v_small_gender_debias.model['kitchen']

In [None]:
f, ax = plt.subplots(1, figsize=(10, 10))

w2v_small_gender_debias.plot_projection_scores(n_extreme=20, ax=ax);

## 6.4 - Compare Performance

After debiasing, the performance of the word embedding on standard benchmarks gets only slightly worse!

**⚠️ It might take a few minutes to run!**

In [None]:
w2v_small_gender_bias.evaluate_word_embedding()

In [None]:
w2v_small_gender_debias.evaluate_word_embedding()

# 💎 Part Seven: So What?

We removed the gender bias, **as we defined it**, in a word embedding - Is there any impact on a downstream application?

## First example: coreference resolution

Zhao, J., Wang, T., Yatskar, M., Ordonez, V., & Chang, K. W. (2018). [Gender bias in coreference resolution: Evaluation and debiasing methods](https://par.nsf.gov/servlets/purl/10084252). NAACL-HLT 2018.


### WinoBias Dataset
![](../images/coref-example.png)


### Stereotypical Occupations (the source of `responsibly.we.data.OCCUPATION_FEMALE_PRECENTAGE`)
![](../images/coref-occupations.png)

### Results on *UW End-to-end Neural Coreference Resolution System*

#### No Intervention - Baseline

| Word Embedding | OntoNotes | Type 1 - Pro-stereotypical | Type 1 - Anti-stereotypical |  Avg |  Diff |
|:--------------:|:--------:|:--------------------------:|:---------------------------:|:----:|:-----:|
|    Original    |   67.7   |            76.0            |             49.4            | 62.7 | 26.6* |

#### Intervention: Named-entity anonymization

| Word Embedding | OntoNotes | Type 1 - Pro-stereotypical | Type 1 - Anti-stereotypical |  Avg |  Diff |
|:--------------:|:--------:|:--------------------------:|:---------------------------:|:----:|:-----:|
|    Original    |   66.4   |            73.5            |             51.2            | 62.6 | 21.3* |
|  Hard Debiased |   66.5   |            67.2            |             59.3            | 63.2 |  7.9* |

#### Interventions: Named-entity anonymization + Gender swapping

| Word Embedding | OntoNotes | Type 1 - Pro-stereotypical | Type 1 - Anti-stereotypical |  Avg |  Diff |
|:--------------:|:--------:|:--------------------------:|:---------------------------:|:----:|:-----:|
|    Original    |   66.2   |            65.1            |             59.2            | 62.2 |  5.9* |
|  Hard Debiased |   66.3   |            63.9            |             62.8            | 63.4 |  1.1  |

## Second example: another bias mitigation method

Zhao, J., Zhou, Y., Li, Z., Wang, W., & Chang, K. W. (2018). [Learning gender-neutral word embeddings](https://arxiv.org/pdf/1809.01496.pdf). EMNLP 2018.

The mitigation method is tailor-made for the GloVe training process.

![](../images/gn-glove-results.png)

# 💎💎 Part Eight: Meta "So What?" - I

## How should we define "bias" in word embedding?

### 1. Intrinsic (e.g., direct bias)

### 2. External - Downstream application (e.g., coreference resolution, classification)