# Tutorial // Exploring Gender Bias in Word Embedding

## https://learn.responsibly.ai/word-embedding

Powered by [`responsibly`](https://docs.responsibly.ai/) - Toolkit for auditing and mitigating bias and fairness of machine learning systems 🔎🤖🧰

# Overview

## Learning Objectives:

1. Gaining an intuitive technical understanding of bias in machine learning systems.

2. Exploring the interplay between data, algorithms, application, workflow, and human context when considering responsible AI.

## Audience
Everyone, really. No previous knowledge is assumed. If you have a background, you will be able to understand the topic more deeply.


## Method

1. Dive into one family of machine learning models (a building block) as a scaffold.

2. Focus on one example of bias: by gender.

## Disclaimers

1. Word embeddings are not very important by themselves in the context of responsible AI, but bias can be demonstrated with them in an intuitive way.

2. We focus on gender bias and treat gender as binary for simplicity. Nevertheless, gender is a complex social construct, and we should keep this in mind when we go back from a learning context to the real world.

3. We don't aim to give a comprehensive overview of bias in machine learning, fairness, or responsible AI.

4. If you need to work on one of these topics, this workshop is far from enough, but it can serve as a starting point for your learning path.

5. On top of that, this is an active area of research.

6. And there is much more to say about this topic, especially (but not only) from a Science and Technology Studies (STS) point of view.

7. Therefore, we will provide good learning resources at the end.


## Legend
💎 Important

⚡ Be Aware - Debated issue / interpret carefully / simplicity over precision

🛠️ Setup/Technical (a.k.a "the code is not important, just run it!")

🧪 Methodological Issue

💻 Hands-On - Your turn! NO programming background

⌨️ ... Some programming background (in Python) is required

🦄 Out of Scope


<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part One: Setup

## 1.1 - 🛠️ Install `responsibly`

In [None]:
%pip install --user responsibly

---

### You might need to restart your notebook
<big>⚠️</big>

If you get an error of **`ModuleNotFoundError: No module named 'responsibly'`** after the `import responsibly` in the next cell, and you work on either **Colab** or **Binder** - this is **normal**.
<br/> <br/>
**Restart** the Kernel/Runtime (use the menu on top or the button in the notebook), **skip** the installation cell (`%pip install --user responsibly`) and **run** the next cell (`import responsibly`) again.

Now it should all work fine!

---

## 1.2 - Validate Installation of `responsibly`
<big>🛠️</big>

In [None]:
import responsibly

# You should get '0.1.3'
responsibly.__version__

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part Two: Examples of Bias in Language Technology

## 2.1 - Translation

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/example-translate.jpg?raw=1"/>

Source: [Google Blog](https://www.blog.google/products/translate/reducing-gender-bias-google-translate/), [Google AI Blog](https://ai.googleblog.com/2018/12/providing-gender-specific-translations.html)

## 2.2 - Automated Speech Recognition (ASR) 

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/asr-wer.jpg?raw=1" />

WER = Average Word Error Rate

`(substitutions + deletions + insertions) / total number of words`
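
As a rough illustration, the formula above can be computed with a word-level edit distance between a reference transcript and the system's output. This is a minimal sketch, not the evaluation code of the cited paper; the function name `word_error_rate` and the example sentences are our own assumptions, and "total number of words" is taken to mean the number of words in the reference.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()

    # distances[i][j] = minimal number of edits turning ref[:i] into hyp[:j]
    distances = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        distances[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        distances[0][j] = j  # j insertions

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution_cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            distances[i][j] = min(
                distances[i - 1][j] + 1,                      # deletion
                distances[i][j - 1] + 1,                      # insertion
                distances[i - 1][j - 1] + substitution_cost,  # match or substitution
            )

    return distances[len(ref)][len(hyp)] / len(ref)


# one deleted word out of nine reference words
wer = word_error_rate('it is a puppy and it is extremely cute',
                      'it is a puppy and it is cute')
print(wer)
```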

Koenecke, Allison, Andrew Nam, Emily Lake, Joe Nudell, Minnie Quartey, Zion Mengesha, Connor Toups, John R. Rickford, Dan Jurafsky, and Sharad Goel. "[Racial disparities in automated speech recognition](https://www.pnas.org/content/117/14/7684)." Proceedings of the National Academy of Sciences 117, no. 14 (2020): 7684-7689.

[Stanford News](https://news.stanford.edu/2020/03/23/automated-speech-recognition-less-accurate-blacks/)

## 2.3 - Recruiting tool

"Amazon scraps secret AI recruiting tool that showed bias against women" ([Reuters](https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G))

"But by 2015, the company realized its new system was not rating candidates for software developer jobs and other technical posts in a gender-neutral way."


## 2.4 - Natural Language Generation  (based on language models)


### WARNING: The following demonstration contains examples which are offensive in nature.


[Write With Transformer](https://transformer.huggingface.co/doc/gpt2-large) (OpenAI GPT-2)

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/nlg-prompt.png?raw=1" width="400"/>

1. Sheng, E., Chang, K. W., Natarajan, P., & Peng, N. (2019). [The woman worked as a babysitter: On biases in language generation](https://arxiv.org/pdf/1909.01326.pdf). arXiv preprint arXiv:1909.01326.
2. [StereoSet](https://stereoset.mit.edu/)
Nadeem, M., Bethke, A., & Reddy, S. (2020). [StereoSet: Measuring stereotypical bias in pretrained language models](https://arxiv.org/pdf/2004.09456.pdf). arXiv preprint arXiv:2004.09456.

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part Three: Motivation - Why Use Word Embeddings?

## 3.1 - [NLP (Natural Language Processing)](https://en.wikipedia.org/wiki/Natural_language_processing)
**Very partial** list of tasks


### 1. Classification
- Fake news classification
- Toxic comment classification
- Review rating (sentiment analysis)
- Hiring decision making by CV
- Automated essay scoring

### 2. Machine Translation

### 3. Information Retrieval
- Search engine
- Plagiarism detection

### 4. Conversation Chatbot

### 5. Coreference Resolution
<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/corefexample.png?raw=1" />

<small>Source: [Stanford Natural Language Processing Group](https://nlp.stanford.edu/projects/coref.shtml)</small>

## 3.2 - Machine Learning (NLP) Pipeline

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/nlp-pipeline.png?raw=1" />

<small>Source: [Kai-Wei Chang (UCLA) - What It Takes to Control Societal Bias in Natural Language Processing](https://www.youtube.com/watch?v=RgcXD_1Cu18)</small>

## 3.3 - Essential Question - How to represent language to a machine?

We need some kind of *dictionary* 📖 to transform/encode

→ from a human representation (words) 🗣 🔡

→ to a machine representation (numbers) 🤖 🔢

### First Attempt

### Idea: Bag of Words (for a document)
<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/bow.png?raw=1" width="300" />
<small>Source: Zheng, A. & Casari, A. (2018). Feature Engineering for Machine Learning. O'Reilly Media.</small>

In [None]:
from sklearn.feature_extraction.text import CountVectorizer

vocabulary = ['it', 'they', 'puppy', 'and', 'cat', 'aardvark', 'cute', 'extremely', 'not']

vectorizer = CountVectorizer(vocabulary=vocabulary)

In [None]:
sentence = 'it is a puppy and it is extremely cute'

### Bag of words

In [None]:
vectorizer.fit_transform([sentence]).toarray()

In [None]:
vectorizer.fit_transform(['it is not a puppy and it is extremely cute']).toarray()

In [None]:
vectorizer.fit_transform(['it is a puppy and it is extremely not cute']).toarray()

🦄 Read more about scikit-learn's text feature extraction [here](https://scikit-learn.org/stable/modules/feature_extraction.html#text-feature-extraction).
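
The two sentences with *not* in the cells above mean different things, yet a bag of words cannot tell them apart because it ignores word order. The following is a minimal pure-Python sketch of the same counting idea; the helper `bag_of_words` is hypothetical and splits on whitespace only, which is simpler than what `CountVectorizer` does.

```python
from collections import Counter

vocabulary = ['it', 'they', 'puppy', 'and', 'cat', 'aardvark', 'cute', 'extremely', 'not']


def bag_of_words(sentence, vocabulary):
    """Count how many times each vocabulary word appears in the sentence."""
    counts = Counter(sentence.split())
    return [counts[word] for word in vocabulary]


not_a_puppy = bag_of_words('it is not a puppy and it is extremely cute', vocabulary)
not_cute = bag_of_words('it is a puppy and it is extremely not cute', vocabulary)

print(not_a_puppy)              # [2, 0, 1, 1, 0, 0, 1, 1, 1]
print(not_a_puppy == not_cute)  # True: the position of "not" is lost
```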

### One-hot representation

In [None]:
[vectorizer.fit_transform([word]).toarray()
 for word in sentence.split()
 if word in vocabulary]

### The problem with one-hot representation

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/audio-image-text.png?raw=1" />

<small>Source: [Tensorflow Documentation](https://www.tensorflow.org/tutorials/representation/word2vec)</small>

[Color Picker](https://www.google.com/search?q=color+picker)
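
One way to see the problem: every pair of distinct one-hot vectors has a dot product of zero, so the representation carries no notion of one word being "closer" to another. The sketch below uses hypothetical helpers (`one_hot`, `dot`) and the toy vocabulary from the cells above.

```python
vocabulary = ['it', 'they', 'puppy', 'and', 'cat', 'aardvark', 'cute', 'extremely', 'not']


def one_hot(word, vocabulary):
    """A vector with a single 1 at the position of the word in the vocabulary."""
    return [1 if word == entry else 0 for entry in vocabulary]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


puppy_puppy = dot(one_hot('puppy', vocabulary), one_hot('puppy', vocabulary))
puppy_cat = dot(one_hot('puppy', vocabulary), one_hot('cat', vocabulary))
puppy_aardvark = dot(one_hot('puppy', vocabulary), one_hot('aardvark', vocabulary))

# "puppy" is exactly as (dis)similar to "cat" as it is to "aardvark"
print(puppy_puppy, puppy_cat, puppy_aardvark)  # 1 0 0
```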

<br><br><br><br>

## 3.4 - Idea: Embedding a word in an n-dimensional space
<big>💎</big>

### Distributional Hypothesis
> "a word is characterized by the company it keeps" - [John Rupert Firth](https://en.wikipedia.org/wiki/John_Rupert_Firth)

**Distance ~ Meaning Similarity**


### Examples (algorithms and pre-trained models)
<big>🦄</big>
- [Word2Vec](https://code.google.com/archive/p/word2vec/)
- [GloVe](https://nlp.stanford.edu/projects/glove/)
- [fastText](https://fasttext.cc/)
- [ELMo](https://allennlp.org/elmo) (contextualized)

#### Training: using *word-context* relationships from a corpus.
<big>🦄</big>

See: [The Illustrated Word2vec by Jay Alammar](http://jalammar.github.io/illustrated-word2vec/)
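
To make "word-context relationships" concrete, here is a simplified sketch of how (word, context word) pairs could be collected from a sentence with a sliding window. The helper `word_context_pairs` and the window size are assumptions for illustration only; this is not the actual Word2Vec training code, which also involves sampling and a learned model.

```python
def word_context_pairs(tokens, window=2):
    """Pair every word with the words at most `window` positions away from it."""
    pairs = []
    for i, word in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((word, tokens[j]))
    return pairs


tokens = 'it is a puppy and it is extremely cute'.split()
pairs = word_context_pairs(tokens, window=1)

print(pairs[:3])  # [('it', 'is'), ('is', 'it'), ('is', 'a')]
```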

#### State of the Art - Contextual Word Embedding → Language Models
<big>🦄</big>
- [The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) by Jay Alammar](http://jalammar.github.io/illustrated-bert/)
- Microsoft - [NLP Best Practices](https://github.com/microsoft/nlp-recipes)
- [Tracking Progress in Natural Language Processing](https://nlpprogress.com/)

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part Four: Playing with Word2Vec word embedding!

[Word2Vec](https://code.google.com/archive/p/word2vec/) - Google News - 100B tokens, 3M vocab, cased, 300d vectors - only lowercase vocab extracted

Loaded using the [`responsibly`](http://docs.responsibly.ai) package: the function [`responsibly.we.load_w2v_small`]() returns a [`gensim`](https://radimrehurek.com/gensim/) [`KeyedVectors`](https://radimrehurek.com/gensim/models/keyedvectors.html#gensim.models.keyedvectors.KeyedVectors) object.

## 4.1 - Basic Properties

In [None]:
# 🛠️⚡ ignore warnings
# generally, you shouldn't do that, but for this tutorial we'll do so for the sake of simplicity

import warnings
warnings.filterwarnings('ignore')

In [None]:
from responsibly.we import load_w2v_small

w2v_small = load_w2v_small()

In [None]:
# vocabulary size

len(w2v_small.vocab)

In [None]:
# get the vector of the word "home"

print('home =', w2v_small['home'])

In [None]:
# the word embedding dimension, in this case, is 300

len(w2v_small['home'])

In [None]:
# all the words are normalized (=have norm equal to one as vectors)

from numpy.linalg import norm

norm(w2v_small['home'])

In [None]:
# 🛠️ make sure that all the vectors are normalized!

from numpy.testing import assert_almost_equal

length_vectors = norm(w2v_small.vectors, axis=1)

assert_almost_equal(actual=length_vectors,
                    desired=1,
                    decimal=5)

## 4.2 - Measuring Distance between Words
<big>💎</big>

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/sphere.png?raw=1" width="300"/>

<small>Source: [Wikipedia](https://en.wikipedia.org/wiki/File:Sphere_wireframe_10deg_6r.svg)</small>

### Measure of Similarity: [Cosine Similarity](https://en.wikipedia.org/wiki/Cosine_similarity)
- Measures the cosine of the angle between two vectors.
- Ranges from 1 (same direction) to -1 (opposite direction/antipode).
- In Python, for normalized vectors (NumPy arrays), use the `@` (at) operator!
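
For vectors that are not normalized, the dot product has to be divided by both lengths. The sketch below spells this out in pure Python with hypothetical helper names; it also clamps the cosine before calling `acos`, since floating-point rounding can push a value slightly above 1 or below -1.

```python
from math import acos, degrees, sqrt


def cosine_similarity(u, v):
    """Dot product divided by the product of the two vector lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angle_in_degrees(u, v):
    # clamp to [-1, 1] so that rounding errors cannot leave the domain of acos
    cosine = max(-1.0, min(1.0, cosine_similarity(u, v)))
    return degrees(acos(cosine))


same_direction = cosine_similarity([3.0, 4.0], [6.0, 8.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 0.0], [-2.0, 0.0])
right_angle = angle_in_degrees([1.0, 0.0], [0.0, 1.0])
```

For unit-length vectors both norms equal 1, which is why the plain `@` operator is enough in the cells that follow.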

In [None]:
w2v_small['cat'] @ w2v_small['cat']

In [None]:
w2v_small['cat'] @ w2v_small['cats']

In [None]:
from math import acos, degrees

degrees(acos(w2v_small['cat'] @ w2v_small['cats']))

In [None]:
w2v_small['cat'] @ w2v_small['dog']

In [None]:
degrees(acos(w2v_small['cat'] @ w2v_small['dog']))

In [None]:
w2v_small['cat'] @ w2v_small['cow']

In [None]:
degrees(acos(w2v_small['cat'] @ w2v_small['cow']))

In [None]:
w2v_small['cat'] @ w2v_small['graduated']

In [None]:
degrees(acos(w2v_small['cat'] @ w2v_small['graduated']))

💎 In general, using a word embedding to encode words as the input of NLP systems (*) improves their performance compared to a one-hot representation.

\* Sometimes the embedding is learned as part of the NLP system.

## 4.3 - Visualizing Word Embedding in 2D using T-SNE 
<big>🛠️</big>

<small>Source: [Google's Seedbank](https://research.google.com/seedbank/seed/pretrained_word_embeddings)</small>

In [None]:
from sklearn.manifold import TSNE
from matplotlib import pylab as plt

# take the most common words in the corpus between 200 and 600
words = [word for word in w2v_small.index2word[200:600]]

# convert the words to vectors
embeddings = [w2v_small[word] for word in words]

# perform T-SNE
words_embedded = TSNE(n_components=2).fit_transform(embeddings)

# ... and visualize!
plt.figure(figsize=(20, 20))
for i, label in enumerate(words):
    x, y = words_embedded[i, :]
    plt.scatter(x, y)
    plt.annotate(label, xy=(x, y), xytext=(5, 2), textcoords='offset points',
                 ha='right', va='bottom', size=11)
plt.show()

### Extra: [Tensorflow Embedding Projector](http://projector.tensorflow.org)
⚡ Be cautious: It is easy to see "patterns".

## 4.4 - Most Similar

What are the most similar (=closest) words to a given word?

In [None]:
w2v_small.most_similar('cat')

### EXTRA: Doesn't Match

Given a list of words, which one doesn't match?

The word furthest away from the mean of all the words.

In [None]:
w2v_small.doesnt_match('breakfast cereal dinner lunch'.split())
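
The idea of "furthest from the mean" can be sketched in a few lines. The 2D vectors below are made up for illustration (they are not taken from the Word2Vec model), and `odd_one_out` is a hypothetical helper that only approximates what `doesnt_match` does.

```python
from math import sqrt


def normalize(v):
    length = sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def odd_one_out(vectors_by_word):
    """Return the word whose unit vector is least similar to the mean unit vector."""
    unit = {word: normalize(vec) for word, vec in vectors_by_word.items()}
    dimension = len(next(iter(unit.values())))
    mean = normalize([sum(vec[i] for vec in unit.values()) / len(unit)
                      for i in range(dimension)])
    similarity_to_mean = {word: sum(a * b for a, b in zip(vec, mean))
                          for word, vec in unit.items()}
    return min(similarity_to_mean, key=similarity_to_mean.get)


# made-up toy vectors: the three meals point in a similar direction
toy_vectors = {
    'breakfast': [0.9, 0.1],
    'lunch': [0.8, 0.2],
    'dinner': [0.85, 0.15],
    'cereal': [0.2, 0.9],
}

odd_word = odd_one_out(toy_vectors)
print(odd_word)  # cereal
```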

## 4.5 - Vector Arithmetic

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/vector-addition.png?raw=1" width="400"/>

<small>Source: [Wikipedia](https://commons.wikimedia.org/wiki/File:Vector_add_scale.svg)</small>

In [None]:
# nature + science = ?

w2v_small.most_similar(positive=['nature', 'science'])

## 4.6 - Vector Analogy
<big>💎</big>

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/linear-relationships.png?raw=1" />

<small>Source: [Tensorflow Documentation](https://www.tensorflow.org/tutorials/representation/word2vec)</small>

In [None]:
# man:king :: woman:?
# king - man + woman = ?

w2v_small.most_similar(positive=['king', 'woman'],
                       negative=['man'])

In [None]:
w2v_small.most_similar(positive=['big', 'smaller'],
                       negative=['small'])

## 4.7 - Think about a DIRECTION in word embedding as a RELATION

## $\overrightarrow{she} - \overrightarrow{he}$
## $\overrightarrow{smaller} - \overrightarrow{small}$
## $\overrightarrow{Spain} - \overrightarrow{Madrid}$


**⚡ Direction is not a word vector by itself!**

### ⚡ But it doesn't work all the time...

In [None]:
w2v_small.most_similar(positive=['forward', 'up'],
                       negative=['down'])

It might be because the data contains the phrase "looking forward", which is associated with "excitement".

⚡🦄 Keep in mind that the word embedding was generated by learning the co-occurrence of words, so the fact that it *empirically* exhibits "concept arithmetic" doesn't necessarily mean it learned it! In fact, it seems it didn't.
See: [king - man + woman is queen; but why? by Piotr Migdał](https://p.migdal.pl/2017/01/06/king-man-woman-queen-why.html)

🦄 EXTRA: [Demo - Word Analogies Visualizer by Julia Bazińska](https://lamyiowce.github.io/word2viz/)

⚡🦄 In fact, `w2v_small.most_similar` finds the closest word which *is not one* of the given ones. This is a real methodological issue. Nowadays, it is no longer common practice to evaluate word embeddings with analogies.

You can use [`responsibly.we.most_similar`](https://docs.responsibly.ai/word-embedding-bias.html#responsibly.we.utils.most_similar) for the unrestricted version.
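
The difference between the restricted and the unrestricted search can be seen on a toy example. The 3D vectors below are made up so that the effect is visible, and `analogy` is a hypothetical helper that only loosely mirrors the real functions: it sums the normalized input vectors and returns the candidate with the highest cosine similarity.

```python
from math import sqrt


def normalize(v):
    length = sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def cosine(u, v):
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def analogy(vectors, positive, negative, restricted=True):
    """Closest word to sum(positive) - sum(negative), optionally excluding the inputs."""
    dimension = len(next(iter(vectors.values())))
    query = [0.0] * dimension
    for word in positive:
        query = [q + x for q, x in zip(query, normalize(vectors[word]))]
    for word in negative:
        query = [q - x for q, x in zip(query, normalize(vectors[word]))]
    given = set(positive) | set(negative)
    candidates = [word for word in vectors if not (restricted and word in given)]
    return max(candidates, key=lambda word: cosine(vectors[word], query))


# made-up toy vectors, NOT values from the Word2Vec model
toy_vectors = {
    'she': [1.0, 0.0, 0.1],
    'he': [1.0, 0.0, -0.1],
    'nurse': [0.0, 1.0, 0.1],
    'doctor': [0.0, 1.0, -0.5],
    'home': [0.5, 0.2, 0.0],
}

restricted_answer = analogy(toy_vectors, ['nurse', 'he'], ['she'], restricted=True)
unrestricted_answer = analogy(toy_vectors, ['nurse', 'he'], ['she'], restricted=False)
print(restricted_answer, unrestricted_answer)  # doctor nurse
```

In this toy setting the closest word to the query is the input word itself, and a different word only appears once the inputs are excluded.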

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part Five: Gender Bias

**⚡ We use the word *bias* merely as a technical term, without judgment of "good" or "bad". Later on we will put the bias into *human contexts* to evaluate it.**

Keep in mind that the data is from Google News: the writers are professional journalists.

Bolukbasi, Tolga, Kai-Wei Chang, James Y. Zou, Venkatesh Saligrama, and Adam T. Kalai. [Man is to computer programmer as woman is to homemaker? Debiasing word embeddings](https://arxiv.org/abs/1607.06520). NIPS 2016.

In [None]:
from responsibly.we import load_w2v_small

w2v_small = load_w2v_small()

## 5.1 - Gender appropriate he-she analogies

In [None]:
# she:sister :: he:?
# sister - she + he = ?

w2v_small.most_similar(positive=['sister', 'he'],
                       negative=['she'])

```
queen-king
waitress-waiter
sister-brother
mother-father
ovarian_cancer-prostate_cancer
convent-monastery
```

## 5.2 - Gender stereotype he-she analogies

In [None]:
w2v_small.most_similar(positive=['nurse', 'he'],
                       negative=['she'])

```
sewing-carpentry
nurse-doctor
blond-burly
giggle-chuckle
sassy-snappy
volleyball-football
registered_nurse-physician
interior_designer-architect
feminism-conservatism
vocalist-guitarist
diva-superstar
cupcakes-pizzas
housewife-shopkeeper
softball-baseball
cosmetics-pharmaceuticals
petite-lanky
charming-affable
hairdresser-barber
```

### Methodological Issue: The unrestricted version of analogy generation

In [None]:
from responsibly.we import most_similar

In [None]:
most_similar(w2v_small,
             positive=['nurse', 'he'],
             negative=['she'])

⚡ Be Aware: According to a recent paper, it seems that the method of generating analogies itself forces the production of gender-stereotyped ones!

Nissim, M., van Noord, R., van der Goot, R. (2019). [Fair is Better than Sensational: Man is to Doctor as Woman is to Doctor](https://arxiv.org/abs/1905.09866).

... and a [Twitter thread](https://twitter.com/adamfungi/status/1133865428663635968) between the authors of the two papers.

My takeaway (and that of other researchers as well): Analogies are not an appropriate method to observe bias in word embedding.

🧪 What if our methodology introduces a bias?

## 5.3 - What can we take from analogies? Gender Direction!
<big>💎</big>

### $\overrightarrow{she} - \overrightarrow{he}$

In [None]:
from numpy.linalg import norm

gender_direction = w2v_small['she'] - w2v_small['he']

gender_direction /= norm(gender_direction)

In [None]:
gender_direction @ w2v_small['architect']

In [None]:
gender_direction @ w2v_small['interior_designer']

**⚡ Interpret carefully: The word *architect* appears in more contexts with *he* than with *she*, and vice versa for *interior designer*.**

🦄 In practice, we calculate the gender direction using multiple definitional pairs of words for a better estimate (words may have more than one meaning):

- woman - man
- girl - boy
- she - he
- mother - father
- daughter - son
- gal - guy
- female - male
- her - his
- herself - himself
- Mary - John
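
One simple way to combine several pairs is to average their difference vectors and normalize the result. This is only a sketch under that assumption: Bolukbasi et al. use PCA over the pairs rather than a plain mean, the helper `mean_difference_direction` is hypothetical, and the 2D vectors are made up.

```python
from math import sqrt


def normalize(v):
    length = sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def mean_difference_direction(vectors, pairs):
    """Normalized average of (first - second) over all definitional pairs."""
    dimension = len(next(iter(vectors.values())))
    total = [0.0] * dimension
    for first, second in pairs:
        difference = [a - b for a, b in zip(vectors[first], vectors[second])]
        total = [t + d for t, d in zip(total, difference)]
    return normalize([t / len(pairs) for t in total])


# made-up toy vectors: the second coordinate plays the role of "gender"
toy_vectors = {
    'she': [0.2, 0.9], 'he': [0.2, -0.9],
    'woman': [0.5, 0.8], 'man': [0.4, -0.8],
    'mother': [0.7, 0.6], 'father': [0.8, -0.6],
}
pairs = [('she', 'he'), ('woman', 'man'), ('mother', 'father')]

direction = mean_difference_direction(toy_vectors, pairs)
print(direction)
```

Averaging lets the noise of individual pairs (here, the small differences in the first coordinate) cancel out, so the estimate points almost exactly along the second coordinate.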

## 5.4 - Try some words by yourself
<big>💻</big>
⚡ Keep in mind: You are performing exploratory data analysis, not a systematic evaluation!

In [None]:
gender_direction @ w2v_small['house']

## 5.5 - So What?
<big>💎</big>

Downstream Application - Putting a system into a human context

### Toy Example - Search Engine Ranking

- "BU computer science PhD student"
- "doctoral candidate" ~ "PhD student"
- John:computer programmer :: Mary:homemaker

### Universal Embeddings
- Pre-trained on a large corpus
- Plugged into downstream task models (sentiment analysis, classification, translation …)
- Improved performance

## 5.6 - Measuring Bias in Word Embedding

# Think-Pair-Share

**Basic Idea: Use gender-neutral words!**

**Neutral Professions!**

## 5.7 - Projections

In [None]:
from responsibly.we import GenderBiasWE

w2v_small_gender_bias = GenderBiasWE(w2v_small, only_lower=True)

In [None]:
w2v_small_gender_bias.positive_end, w2v_small_gender_bias.negative_end

In [None]:
# gender direction
w2v_small_gender_bias.direction[:10]

In [None]:
from responsibly.we.data import BOLUKBASI_DATA

neutral_profession_names = BOLUKBASI_DATA['gender']['neutral_profession_names']

In [None]:
neutral_profession_names[:8]

Note: Why is `actor` in the neutral profession names list while `actress` is not?
1. Due to the statistical nature of the method that is used to find the gender-specific and gender-neutral words
2. That might be because `actor` nowadays is much more gender-neutral, compared to waiter-waitress (see [Wikipedia - The term Actress](https://en.wikipedia.org/wiki/Actor#The_term_actress))

In [None]:
len(neutral_profession_names)

In [None]:
# the same as using the @ operator on the bias direction

w2v_small_gender_bias.project_on_direction(neutral_profession_names[0])

**Let's visualize the projections of professions (neutral and specific by the orthography) on the gender direction**

In [None]:
import matplotlib.pylab as plt

f, ax = plt.subplots(1, figsize=(10, 10))

w2v_small_gender_bias.plot_projection_scores(n_extreme=20, ax=ax);

EXTRA: Demo - Visualizing gender bias with [Word Clouds](http://wordbias.umiacs.umd.edu/)

## 5.8 - Are the projections of occupation words on the gender direction related to the real world?

Let's take the percentage of women in various occupations from the Labor Force Statistics of the 2017 Population Survey.

Taken from: https://arxiv.org/abs/1804.06876

In [None]:
from operator import itemgetter  # 🛠️ For idiomatic sorting in Python

from responsibly.we.data import OCCUPATION_FEMALE_PRECENTAGE

sorted(OCCUPATION_FEMALE_PRECENTAGE.items(), key=itemgetter(1))

In [None]:
f, ax = plt.subplots(1, figsize=(10, 8))

w2v_small_gender_bias.plot_factual_association(ax=ax);

### Also: Word embeddings quantify 100 years of gender stereotypes

Garg, N., Schiebinger, L., Jurafsky, D., & Zou, J. (2018). [Word embeddings quantify 100 years of gender and ethnic stereotypes](https://www.pnas.org/content/pnas/115/16/E3635.full.pdf). Proceedings of the National Academy of Sciences, 115(16), E3635-E3644.

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/gender-bias-over-decades.png?raw=1" />

<small>Data: Google Books/Corpus of Historical American English (COHA)</small>

Word embedding is sometimes used to analyze a collection of text in **digital humanities** - putting a system into a human context.

🧪 Quite a strong and interesting observation! We used "external" data which wasn't used directly to create the word embedding.

It leads us to think about the *data generation process* - in both cases it is the "world", but it will be difficult to argue for causality in only one direction:
1. Text in newspapers
2. Employment by gender

## 5.9 - Direct Bias Measure

1. Project each **neutral profession name** on the gender direction
2. Calculate the absolute value of each projection
3. Average it all

In [None]:
# using responsibly

w2v_small_gender_bias.calc_direct_bias()

In [None]:
# what responsibly does:

neutral_profession_projections = [w2v_small[word] @ w2v_small_gender_bias.direction
                                  for word in neutral_profession_names]

abs_neutral_profession_projections = [abs(proj) for proj in neutral_profession_projections]

sum(abs_neutral_profession_projections) / len(abs_neutral_profession_projections)

🧪 What are the assumptions of the direct bias measure? How does the choice of neutral words affect the definition of the bias?

## 5.10 - [EXTRA] Indirect Bias Measure
Similarity due to shared "gender direction" projection

In [None]:
w2v_small_gender_bias.generate_closest_words_indirect_bias('softball',
                                                           'football')

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part Six: Mitigating Bias

> We intentionally do not reference the resulting embeddings as "debiased" or free from all gender bias, and
prefer the term "mitigating bias" rather than "debiasing," to guard against the misconception that the resulting
embeddings are entirely "safe" and need not be critically evaluated for bias in downstream tasks. <small>James-Sorenson, H., & Alvarez-Melis, D. (2019). [Probabilistic Bias Mitigation in Word Embeddings](https://arxiv.org/pdf/1910.14497.pdf). arXiv preprint arXiv:1910.14497.</small>

## 6.1 - Neutralize

In this case, we will remove the gender projection from all the words, except the gender-specific ones, and then normalize.

🦄 We need to "learn" which words in the vocabulary are gender-specific from a seed set of gender-specific words (by semi-automatic use of [WordNet](https://en.wikipedia.org/wiki/WordNet))
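
Removing the projection on a direction and renormalizing can be sketched for a single vector as follows. This is a minimal illustration with made-up 2D vectors and a hypothetical `neutralize` helper; it is not the `responsibly` implementation, and it assumes the direction has unit length.

```python
from math import sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    length = sqrt(dot(v, v))
    return [x / length for x in v]


def neutralize(vector, direction):
    """Remove the component along `direction` (assumed unit length), then renormalize.

    Note: a vector exactly parallel to `direction` would become the zero
    vector and cannot be renormalized; this sketch does not handle that case.
    """
    projection = dot(vector, direction)
    without_direction = [x - projection * d for x, d in zip(vector, direction)]
    return normalize(without_direction)


direction = [0.0, 1.0]            # made-up unit-length "gender direction"
home = normalize([0.8, 0.3])      # made-up word vector
neutral_home = neutralize(home, direction)

print('before =', dot(home, direction), 'after =', dot(neutral_home, direction))
```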

In [None]:
w2v_small_gender_debias = w2v_small_gender_bias.debias(method='neutralize', inplace=False)

In [None]:
print('home:',
      'before =', w2v_small_gender_bias.model['home'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['home'] @ w2v_small_gender_debias.direction)

In [None]:
print('man:',
      'before =', w2v_small_gender_bias.model['man'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['man'] @ w2v_small_gender_debias.direction)

In [None]:
print('woman:',
      'before =', w2v_small_gender_bias.model['woman'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['woman'] @ w2v_small_gender_debias.direction)

In [None]:
w2v_small_gender_debias.calc_direct_bias()

In [None]:
f, ax = plt.subplots(1, figsize=(10, 10))

w2v_small_gender_debias.plot_projection_scores(n_extreme=20, ax=ax);

In [None]:
f, ax = plt.subplots(1, figsize=(10, 8))

w2v_small_gender_debias.plot_factual_association(ax=ax);

## 6.2 - [EXTRA] Equalize

- Do you see that `man` and `woman` have different projections on the gender direction?

- This might lead to different similarities (distances) to neutral words, such as `babysitter`

In [None]:
w2v_small_gender_debias.model['grandfather'] @ w2v_small_gender_debias.model['babysitter']

In [None]:
w2v_small_gender_debias.model['grandmother'] @ w2v_small_gender_debias.model['babysitter']

In [None]:
BOLUKBASI_DATA['gender']['equalize_pairs'][:10]
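
The equalize step for one pair can be sketched as below, following the formula in Bolukbasi et al. (2016): both words keep their shared part orthogonal to the gender direction and receive opposite components of equal size along it. The sketch assumes unit-length inputs and a unit-length direction; the helper `equalize` and the toy vectors are hypothetical, and this is not the `responsibly` implementation.

```python
from math import sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    length = sqrt(dot(v, v))
    return [x / length for x in v]


def equalize(first, second, direction):
    """Make a pair symmetric around the part of their mean orthogonal to `direction`."""
    mean = [(a + b) / 2 for a, b in zip(first, second)]
    mean_along = [dot(mean, direction) * d for d in direction]
    mean_orthogonal = [m - p for m, p in zip(mean, mean_along)]
    # chosen so that the equalized vectors have unit length again
    scaling = sqrt(1 - dot(mean_orthogonal, mean_orthogonal))

    equalized = []
    for vector in (first, second):
        along = [dot(vector, direction) * d for d in direction]
        offset = [a - p for a, p in zip(along, mean_along)]
        offset_norm = sqrt(dot(offset, offset))
        equalized.append([n + scaling * o / offset_norm
                          for n, o in zip(mean_orthogonal, offset)])
    return equalized


# made-up toy vectors; the third coordinate plays the role of "gender"
direction = [0.0, 0.0, 1.0]
grandmother = normalize([0.6, 0.2, 0.7])
grandfather = normalize([0.5, 0.4, -0.3])
babysitter = [1.0, 0.0, 0.0]  # a neutralized word: no component along the direction

equal_grandmother, equal_grandfather = equalize(grandmother, grandfather, direction)

print(dot(grandmother, babysitter), dot(grandfather, babysitter))              # different
print(dot(equal_grandmother, babysitter), dot(equal_grandfather, babysitter))  # equal
```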

## 6.3 - Hard Debias = Neutralize + Equalize

In [None]:
w2v_small_gender_debias = w2v_small_gender_bias.debias(method='hard', inplace=False)

In [None]:
print('home:',
      'before =', w2v_small_gender_bias.model['home'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['home'] @ w2v_small_gender_debias.direction)

In [None]:
print('man:',
      'before =', w2v_small_gender_bias.model['man'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['man'] @ w2v_small_gender_debias.direction)

In [None]:
print('woman:',
      'before =', w2v_small_gender_bias.model['woman'] @ w2v_small_gender_bias.direction,
      'after = ', w2v_small_gender_debias.model['woman'] @ w2v_small_gender_debias.direction)

In [None]:
w2v_small_gender_debias.calc_direct_bias()

In [None]:
w2v_small_gender_debias.model['grandfather'] @ w2v_small_gender_debias.model['babysitter']

In [None]:
w2v_small_gender_debias.model['grandmother'] @ w2v_small_gender_debias.model['babysitter']

In [None]:
f, ax = plt.subplots(1, figsize=(10, 10))

w2v_small_gender_debias.plot_projection_scores(n_extreme=20, ax=ax);

The disadvantage of equalization is that it might remove meaningful associations, such as the verb meaning of "grandfather", e.g. "to grandfather a regulation". Equalization removes this distinction.

## 6.4 - Compare Performance

After debiasing, the performance of the word embedding, using standard benchmarks, gets only slightly worse!

**⚠️ It might take a few minutes to run!**

In [None]:
w2v_small_gender_bias.evaluate_word_embedding()

In [None]:
w2v_small_gender_debias.evaluate_word_embedding()

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part Seven: So What?
<big>💎</big>

We removed the gender bias, **as we defined it**, in a word embedding - Is there any impact on a downstream application?

## First example: coreference resolution

Zhao, J., Wang, T., Yatskar, M., Ordonez, V., & Chang, K. W. (2018). [Gender bias in coreference resolution: Evaluation and debiasing methods](https://par.nsf.gov/servlets/purl/10084252). NAACL-HLT 2018.


### WinoBias Dataset
<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/coref-example.png?raw=1" width="400"/>


### Stereotypical Occupations (the source of `responsibly.we.data.OCCUPATION_FEMALE_PRECENTAGE`)
<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/coref-occupations.png?raw=1" width="400"/>

### Results on *UW End-to-end Neural Coreference Resolution System*

#### No Intervention - Baseline

| Word Embedding | OntoNotes | Type 1 - Pro-stereotypical | Type 1 - Anti-stereotypical |  Avg |  Diff |
|:--------------:|:--------:|:--------------------------:|:---------------------------:|:----:|:-----:|
|    Original    |   67.7   |            76.0            |             49.4            | 62.7 | 26.6* |

#### Intervention: Named-entity anonymization

| Word Embedding | OntoNotes | Type 1 - Pro-stereotypical | Type 1 - Anti-stereotypical |  Avg |  Diff |
|:--------------:|:--------:|:--------------------------:|:---------------------------:|:----:|:-----:|
|    Original    |   66.4   |            73.5            |             51.2            | 62.6 | 21.3* |
|  Hard Debiased |   66.5   |            67.2            |             59.3            | 63.2 |  7.9* |

#### Interventions: Named-entity anonymization + Gender swapping

| Word Embedding | OntoNotes | Type 1 - Pro-stereotypical | Type 1 - Anti-stereotypical |  Avg |  Diff |
|:--------------:|:--------:|:--------------------------:|:---------------------------:|:----:|:-----:|
|    Original    |   66.2   |            65.1            |             59.2            | 62.2 |  5.9* |
|  Hard Debiased |   66.3   |            63.9            |             62.8            | 63.4 |  1.1  |

## Second example: another bias mitigation method

Zhao, J., Zhou, Y., Li, Z., Wang, W., & Chang, K. W. (2018). [Learning gender-neutral word embeddings](https://arxiv.org/pdf/1809.01496.pdf). EMNLP 2018.

The mitigation method is tailor-made for GloVe training process.

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/gn-glove-results.png?raw=1" width="400">

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part Eight: Meta "So What?" - I
<big>💎💎</big>

## How should we define "bias" in word embedding?

### 1. Intrinsic (e.g., direct bias)

### 2. External - Downstream application (e.g., coreference resolution, classification)

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part Nine: Have we really removed the bias?
<big>💎</big>

Let's look at another metric, called **WEAT** (Word Embedding Association Test), which is inspired by the **IAT** (Implicit-Association Test) from psychology.

Try IAT by yourself: https://implicit.harvard.edu/implicit/

**Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). [Semantics derived automatically from language corpora contain human-like biases.](http://www.cs.bath.ac.uk/~jjb/ftp/CaliskanEtAl-authors-full.pdf) Science, 356(6334), 183-186.**


## 9.1 - Ingredients

1. Attribute words (e.g., Male vs. Female)

2. Target words (e.g., Math vs. Arts)

In [None]:
from responsibly.we import load_w2v_small

w2v_small = load_w2v_small()

In [None]:
# 🛠️ For copying a nested data structure in Python
from copy import deepcopy

from responsibly.we.weat import WEAT_DATA

# B. A. Nosek, M. R. Banaji, A. G. Greenwald, Math=male, me=female, therefore math≠me.,
# Journal of Personality and Social Psychology 83, 44 (2002).
weat_gender_science_arts = deepcopy(WEAT_DATA[7])

In [None]:
# 🛠️ filter words from the original IAT experiment that are not present in the reduced Word2Vec model

from responsibly.we.weat import _filter_by_model_weat_stimuli

_filter_by_model_weat_stimuli(weat_gender_science_arts, w2v_small)

In [None]:
weat_gender_science_arts['first_attribute']

In [None]:
weat_gender_science_arts['second_attribute']

In [None]:
weat_gender_science_arts['first_target']

In [None]:
weat_gender_science_arts['second_target']

## 9.2 - Recipe

➕ Male x Science

➖ Male x Arts

➖ Female x Science

➕ Female x Arts

In [None]:
def calc_combination_similiarity(model, attribute, target):
    score = 0

    for attribute_word in attribute['words']:

        for target_word in target['words']:

            # Use the `model` argument rather than the global `w2v_small`
            score += model.similarity(attribute_word,
                                      target_word)

    return score

In [None]:
male_science_score = calc_combination_similiarity(w2v_small,
                                                  weat_gender_science_arts['first_attribute'],
                                                  weat_gender_science_arts['first_target'])

male_science_score

In [None]:
male_arts_score = calc_combination_similiarity(w2v_small,
                                               weat_gender_science_arts['first_attribute'],
                                               weat_gender_science_arts['second_target'])

male_arts_score

In [None]:
female_science_score = calc_combination_similiarity(w2v_small,
                                                    weat_gender_science_arts['second_attribute'],
                                                    weat_gender_science_arts['first_target'])

female_science_score

In [None]:
female_arts_score = calc_combination_similiarity(w2v_small,
                                                 weat_gender_science_arts['second_attribute'],
                                                 weat_gender_science_arts['second_target'])

female_arts_score

In [None]:
male_science_score - male_arts_score - female_science_score + female_arts_score

In [None]:
len(weat_gender_science_arts['first_attribute']['words'])

In [None]:
(male_science_score - male_arts_score - female_science_score + female_arts_score) / 8

## 9.3 - All WEAT Tests

In [None]:
from responsibly.we import calc_all_weat

calc_all_weat(w2v_small, [weat_gender_science_arts])

### ⚡ Important Note: Our results are a bit different because we use a reduced Word2Vec.


### Results from the Paper (computed on the complete Word2Vec):

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/weat-w2v.png?raw=1" width="600"/>


### ⚡Caveats

#### Comparing WEAT to the IAT

- Individuals (IAT) vs. Words (WEAT)
- Therefore, the meanings of the effect size and p-value are totally different!

#### ⚡🦄 WEAT score definition

The definition of the WEAT score in the paper is structured differently from the recipe above (but it is computationally equivalent). The original formulation matters for computing the p-value. Refer to the paper for details.
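
To make the equivalence concrete, here is a minimal sketch on a made-up 2-d "embedding" (the vectors and word lists are hypothetical, not taken from Word2Vec). It assumes the paper's test statistic is the sum of per-target-word associations, where each association is the mean similarity to the first attribute set minus the mean similarity to the second, and that both attribute sets have the same size.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical 2-d vectors, made up for illustration only
toy = {
    'he': [1.0, 0.1], 'man': [0.9, 0.2],
    'she': [-1.0, 0.1], 'woman': [-0.9, 0.3],
    'physics': [0.6, 0.8], 'math': [0.5, 0.9],
    'poetry': [-0.4, 0.9], 'dance': [-0.6, 0.7],
}
male, female = ['he', 'man'], ['she', 'woman']
science, arts = ['physics', 'math'], ['poetry', 'dance']


def combination_similarity(attribute, target):
    """The recipe: sum of similarities over all attribute-target pairs."""
    return sum(cosine(toy[a], toy[t]) for a in attribute for t in target)


def association(word, attr_a, attr_b):
    """s(w, A, B): mean similarity to A minus mean similarity to B."""
    mean_a = sum(cosine(toy[word], toy[a]) for a in attr_a) / len(attr_a)
    mean_b = sum(cosine(toy[word], toy[b]) for b in attr_b) / len(attr_b)
    return mean_a - mean_b


# Recipe from 9.2, divided by the number of attribute words per group
recipe_score = (combination_similarity(male, science)
                - combination_similarity(male, arts)
                - combination_similarity(female, science)
                + combination_similarity(female, arts)) / len(male)

# Formulation structured around per-word associations
paper_score = (sum(association(x, male, female) for x in science)
               - sum(association(y, male, female) for y in arts))

print(recipe_score, paper_score)
```

The two numbers agree up to floating-point error. The per-word associations are the pieces that get reshuffled when computing the p-value, which is why the second structure is the convenient one for that purpose.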

### 🧪 Effect size comparison between human and machine bias

With the effect size, we can "compare" a human bias to a machine one. This raises a question: should human bias be the baseline for measuring the bias/fairness of a machine? If so, a well-performing machine need not be unbiased, only less biased than humans (think about autonomous cars, or semi-structured vs. unstructured interviews).
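
For readers who want to see what sits behind these two numbers, here is a rough sketch. It assumes the effect size is the difference between the mean associations of the two target sets divided by the standard deviation over all target words, and that the p-value is the share of equal-size re-partitions of the target words whose test statistic exceeds the observed one. The association values are made up, and the choice of the sample standard deviation is an assumption (implementations may differ).

```python
from itertools import combinations
from statistics import mean, stdev

# Hypothetical per-word associations s(w, A, B), made up for illustration
science_assoc = {'physics': 0.9, 'math': 0.8, 'chemistry': 0.7, 'nasa': 0.5}
arts_assoc = {'poetry': -0.6, 'dance': -0.8, 'art': -0.3, 'novel': -0.5}
all_assoc = {**science_assoc, **arts_assoc}


def weat_statistic(first_words, second_words, assoc):
    """Sum of associations of the first set minus that of the second set."""
    return (sum(assoc[w] for w in first_words)
            - sum(assoc[w] for w in second_words))


x_words, y_words = list(science_assoc), list(arts_assoc)

# Effect size: standardized difference of the mean associations
effect_size = ((mean(list(science_assoc.values()))
                - mean(list(arts_assoc.values())))
               / stdev(list(all_assoc.values())))

# Permutation test over all equal-size partitions of the target words
observed = weat_statistic(x_words, y_words, all_assoc)
all_words = x_words + y_words
n_partitions = 0
n_exceeding = 0
for first in combinations(all_words, len(x_words)):
    second = [w for w in all_words if w not in first]
    n_partitions += 1
    if weat_statistic(first, second, all_assoc) > observed:
        n_exceeding += 1

p_value = n_exceeding / n_partitions
print(effect_size, p_value, n_partitions)
```

Note that both quantities are computed over *words*, whereas in the IAT they are computed over *individuals*, which is the reason for the caveat above.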

## 9.4 - WEAT vs. IAT

Lewis, M., & Lupyan, G. [What are we learning from language? Associations between gender biases and distributional semantics in 25 languages](https://mollylewis.shinyapps.io/iatlang_SI/).

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/iat-weat.png?raw=1" />


1. Implicit male-career association (adjusted for participant age, gender, and congruent/incongruent block order) as a function of the linguistic male-career association derived from word embeddings (*r*(23) = 0.48 [0.11, 0.74]; *p* = 0.01; *n* = 25; Study 1b). Each point corresponds to a language. The size of the point is proportional to the number of participants who come from the country in which the language is dominant (total = 656,636 participants). Linguistic associations are estimated from models trained on text in each language from the Wikipedia corpus. Larger values indicate a greater tendency to associate men with the concept of career and women with the concept of family.

2. Difference (UK minus US) in implicit association versus linguistic association for 31 IAT types (*N* = 27,045 participants). Error bands indicate standard error of the linear model estimate.


## 9.5 - Let's go back to our question - did we remove the bias?


**Gonen, H., & Goldberg, Y. (2019, June). [Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them](https://arxiv.org/pdf/1903.03862.pdf). In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 609-614).**

They used multiple methods; we'll show only two:
1. WEAT
2. Neutral words clustering

In [None]:
from responsibly.we import GenderBiasWE

w2v_small_gender_bias = GenderBiasWE(w2v_small, only_lower=True)

w2v_small_gender_debias = w2v_small_gender_bias.debias(method='neutralize', inplace=False)

In [None]:
w2v_small_gender_bias.calc_direct_bias()

In [None]:
w2v_small_gender_debias.calc_direct_bias()

### 9.5.1 -  WEAT - before and after

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/weat-experiment.png?raw=1" />

See the `responsibly` [demo page on word embedding](https://docs.responsibly.ai/notebooks/demo-word-embedding-bias.html#first-experiment-weat-before-and-after-debias) for a complete example.

### 9.5.2 - Clustering Neutral Gender Words

In [None]:
from responsibly.we.data import BOLUKBASI_DATA

w2v_vocab = {word for word in w2v_small_gender_bias.model.vocab.keys()}

# 🦄 how we got these words - read the Bolukbasi's paper for details
all_gender_specific_words = set(BOLUKBASI_DATA['gender']['specific_full_with_definitional_equalize'])

all_gender_neutral_words = w2v_vocab - all_gender_specific_words

print('#vocab =', len(w2v_vocab),
      '#specific =', len(all_gender_specific_words),
      '#neutral =', len(all_gender_neutral_words))

In [None]:
neutral_words_gender_projections = [(w2v_small_gender_bias.project_on_direction(word), word)
                                    for word in all_gender_neutral_words]

neutral_words_gender_projections.sort()

In [None]:
neutral_words_gender_projections[:-20:-1]

In [None]:
neutral_words_gender_projections[:20]

In [None]:
# Neutral words: top 500 male-biased and top 500 female-biased words

GenderBiasWE.plot_most_biased_clustering(w2v_small_gender_bias, w2v_small_gender_debias);

Note: In the paper, they got a stronger result, 92.5% accuracy for the debiased model.
However, they performed the clustering on all the words from the reduced word embedding, both gender-neutral and gender-specific, and applied slightly different pre-processing.
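
The "accuracy" of a clustering needs a word of explanation, because cluster ids are arbitrary: cluster 0 may correspond to either the male-biased or the female-biased words. A common way to handle this with two clusters, sketched below under that assumption, is to take the better of the two possible alignments. The cluster ids and labels here are made up, and the function name is hypothetical.

```python
def cluster_alignment_accuracy(cluster_ids, is_male_biased):
    """Accuracy of a 2-cluster assignment against binary labels.

    Cluster ids (0/1) carry no meaning by themselves, so we take the
    better of the two possible id-to-label alignments.
    """
    agree = sum(1 for cluster_id, label in zip(cluster_ids, is_male_biased)
                if bool(cluster_id) == bool(label))
    accuracy = agree / len(cluster_ids)
    return max(accuracy, 1 - accuracy)


# Hypothetical output of a 2-means clustering over 8 words
toy_cluster_ids = [0, 0, 0, 1, 1, 1, 1, 0]
# Hypothetical labels: was the word male-biased in the ORIGINAL embedding?
toy_is_male_biased = [True, True, True, False, False, False, True, False]

toy_accuracy = cluster_alignment_accuracy(toy_cluster_ids, toy_is_male_biased)
print(toy_accuracy)
```

With this convention, 50% means the clusters tell us nothing about the original bias, and values close to 100% mean the bias is still recoverable from the geometry of the "debiased" embedding.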

### 9.5.3 - Strong words from the paper (emphasis mine):
<big>💎</big>

> The experiments ...
reveal a **systematic bias** found in the embeddings,
which is **independent of the gender direction**.


> The implications are alarming: while suggested
debiasing methods work well at removing the gender direction, the **debiasing is mostly superficial**.
The bias stemming from world stereotypes and
learned from the corpus is **ingrained much more
deeply** in the embeddings space.


> .. real concern from biased representations is **not the association** of a concept with
words such as “he”, “she”, “boy”, “girl” **nor** being
able to perform **gender-stereotypical word analogies**... algorithmic discrimination is more likely to happen by associating one **implicitly gendered** term with
other implicitly gendered terms, or picking up on
**gender-specific regularities** in the corpus by learning to condition on gender-biased words, and generalizing to other gender-biased words.


<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part Ten: Meta "So What?" - II
<big>💎💎</big>

## Can we debias a word embedding at all?

## In some downstream applications, might the bias in the word embedding be desirable?

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part Eleven: Your Turn!
<big>⌨️</big>

Note: The first two tasks require a basic background in Python programming. For the last task, you need some experience with Machine Learning and Natural Language Processing (NLP) as well.

In [None]:
from responsibly.we import load_w2v_small

w2v_small = load_w2v_small()

## Task 1 - Racial bias

Let's explore racial bias using Tolga's approach. We will use the [`responsibly.we.BiasWordEmbedding`](http://docs.responsibly.ai/word-embedding-bias.html#ethically.we.bias.BiasWordEmbedding) class. `GenderBiasWE` is a sub-class of `BiasWordEmbedding`.

In [None]:
from responsibly.we import BiasWordEmbedding

w2v_small_racial_bias = BiasWordEmbedding(w2v_small, only_lower=True)

💎💎💎 Identify the racial direction using the `sum` method

In [None]:
white_common_names = ['Emily', 'Anne', 'Jill', 'Allison', 'Laurie', 'Sarah', 'Meredith', 'Carrie',
                      'Kristen', 'Todd', 'Neil', 'Geoffrey', 'Brett', 'Brendan', 'Greg', 'Matthew',
                      'Jay', 'Brad']

black_common_names = ['Aisha', 'Keisha', 'Tamika', 'Lakisha', 'Tanisha', 'Latoya', 'Kenya', 'Latonya',
                      'Ebony', 'Rasheed', 'Tremayne', 'Kareem', 'Darnell', 'Tyrone', 'Hakim', 'Jamal',
                      'Leroy', 'Jermaine']

w2v_small_racial_bias._identify_direction('Whites', 'Blacks',
                                          definitional=(white_common_names, black_common_names),
                                          method='sum')

Use the neutral profession names to measure the racial bias

In [None]:
from responsibly.we.data import BOLUKBASI_DATA

neutral_profession_names = BOLUKBASI_DATA['gender']['neutral_profession_names']

In [None]:
neutral_profession_names[:10]

In [None]:
import matplotlib.pyplot as plt

f, ax = plt.subplots(1, figsize=(10, 10))

w2v_small_racial_bias.plot_projection_scores(neutral_profession_names, n_extreme=20, ax=ax);

Calculate the direct bias measure

In [None]:
# Your Code Here...

Keep exploring the racial bias

In [None]:
# Your Code Here...

## Task 2 - Your WEAT test

Open the [word embedding demo page in `responsibly` documentation](http://docs.responsibly.ai/notebooks/demo-word-embedding-bias.html#it-is-possible-also-to-expirements-with-new-target-word-sets-as-in-this-example-citizen-immigrant), and look at how the function `calc_weat_pleasant_unpleasant_attribute` is used. What was that experiment trying to test? What was the result? Can you come up with other experiments?

In [None]:
from responsibly.we import calc_weat_pleasant_unpleasant_attribute

In [None]:
# Your Code Here...

## Task 3 - Sentiment Analysis

#### Notes:
1. This task requires some background in NLP, particularly in training a text classifier in Python.
2. Our goal is to learn how word embeddings might affect a downstream application from a gender bias perspective, so the focus is on learning. We won't follow the best practices in NLP or use the most advanced techniques.

One way to examine bias in word embeddings is through a downstream application. Here we will use a sentiment analysis classifier of tweets; given a tweet, the system infers the *valence* of the sentiment expressed in it. The valence is expressed as a real number between 0 and 1, where 0 is the negative end and 1 is the positive end.

The system is going to be rather simple and consists of three components:

1. Preprocessing (e.g., removing stopwords and punctuation, [tokenization](https://en.wikipedia.org/wiki/Text_segmentation#Word_segmentation))
2. Transforming the tweets' tokens into a single 300-dimensional vector.
3. Applying logistic regression to predict the valence.

Our goal is to assess how the word embedding, in its original version and in its neutralize-"debiased" version, affects the system's bias. We are going to build two versions of that system, each using one of the two word embeddings, and compare their performance on the [Equity Evaluation Corpus (EEC)](http://saifmohammad.com/WebPages/Biases-SA.html), which is designed to assess gender bias in sentiment analysis systems.

**Reference:**
Kiritchenko, S., & Mohammad, S. M. (2018). [Examining gender and race bias in two hundred sentiment analysis systems](https://arxiv.org/pdf/1805.04508.pdf). arXiv preprint arXiv:1805.04508.

### Data

First, let's download and load the "Affect in Tweets" datasets taken from the [SemEval 2018](https://competitions.codalab.org/competitions/17751#learn_the_details-datasets) competition. We have training, development, and test datasets. We will only use the first and the last datasets, but feel free to use the development dataset to select models and tune hyperparameters with cross-validation.

There are three columns:

1. `Tweet` - The tweet itself as a string, the input.
2. `Intensity Score` - The sentiment's valence of the tweet in the range [0, 1], the output.
3. `Affect Dimension` - You can ignore it. It is `'valence'` for all of the data points.


In [None]:
%%bash

wget https://learn.responsibly.ai/word-embedding/data/SemEval2018-Task1-all-data.zip \
     -O SemEval2018-Task1-all-data.zip -q

unzip -qq -o SemEval2018-Task1-all-data.zip -d ./data

In [None]:
import pandas as pd


train_df = pd.read_csv('./data/SemEval2018-Task1-all-data/English/V-reg/2018-Valence-reg-En-train.txt',
                       sep='\t', index_col=0)
dev_df = pd.read_csv('./data/SemEval2018-Task1-all-data/English/V-reg/2018-Valence-reg-En-dev.txt',
                       sep='\t', index_col=0)
test_df = pd.read_csv('./data/SemEval2018-Task1-all-data/English/V-reg/2018-Valence-reg-En-test-gold.txt',
                       sep='\t', index_col=0)

In [None]:
# A few examples

train_df.head()

In [None]:
# Convert all the labels from real numbers into boolean values,
# setting the threshold at 0.5, and creating a new column named
# `label`

train_df['label'] = train_df['Intensity Score'] > 0.5
dev_df['label'] = dev_df['Intensity Score'] > 0.5
test_df['label'] = test_df['Intensity Score'] > 0.5

Now, let's download the **complete** word2vec word embedding (which is not filtered to lowercased words only), and load it using the `gensim` Python package.

In [None]:
%%bash

wget https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz \
     -O GoogleNews-vectors-negative300.bin.gz -q

In [None]:
from gensim.models import KeyedVectors

# Limit vocabulary to top-500K most frequent words
VOCAB_SIZE = 500000

# Load the word2vec
w2v_model = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin.gz',
                                              binary=True,
                                              limit=VOCAB_SIZE)

In [None]:
# Get the vector embedding for a word
w2v_model['home']

In [None]:
# Check whether there is an embedding for a word
'bazinga' in w2v_model

### Preprocessing & feature extraction

Before we transform a tweet into a vector of 300 dimensions, it should be broken into tokens ("words") and be cleaned. You can do that with various Python packages for NLP, such as [NLTK](https://www.nltk.org/) and 
[spaCy](https://spacy.io/). Feel free to use them if you would like to! We will use the basic preprocessing functionality that comes with [`gensim`](https://radimrehurek.com/gensim/parsing/preprocessing.html).

In [None]:
from gensim.parsing.preprocessing import (preprocess_string,
                                          strip_tags,
                                          strip_punctuation,
                                          strip_multiple_whitespaces,
                                          strip_numeric,
                                          remove_stopwords)


# We pick a subset of the default filters,
# in particular, we do not take
# strip_short() and stem_text().
FILTERS = [strip_punctuation,
           strip_tags,
           strip_multiple_whitespaces,
           strip_numeric,
           remove_stopwords]

# See how the sentence is transformed into tokens (words)
preprocess_string('This is a "short" text!', FILTERS)

After preprocessing all the tweets, we get tokens. We transform each token into a 300d vector using the word embedding and then compute the *average* vector. It will have 300 dimensions as well. This vector serves as the values of the features for each tweet. 

Note these two possible pitfalls:

1. Make sure that the token exists in the word embedding.
2. Sometimes, there are tweets without any token found in the word embedding. Discard these tweets from the data. Keep in mind that you should discard the labels as well.
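
To illustrate the averaging and both pitfalls without giving away the exercise, here is a small sketch on a made-up 3-d embedding stored in a plain `dict` and on already-tokenized tweets. Everything in it (`toy_w2v`, `average_vector`, the vectors, the tweets) is hypothetical. Your own solution should work with the 300-d word2vec model, the `gensim` preprocessing shown above, and NumPy arrays.

```python
# Hypothetical 3-d embedding, standing in for the 300-d word2vec
toy_w2v = {
    'good': [0.9, 0.1, 0.0],
    'day': [0.1, 0.8, 0.1],
    'bad': [-0.9, 0.1, 0.0],
}


def average_vector(tokens, embedding):
    """Average the vectors of the tokens that exist in `embedding`.

    Returns None when no token is found (pitfall 2).
    """
    # Pitfall 1: skip tokens that are missing from the embedding
    vectors = [embedding[token] for token in tokens if token in embedding]
    if not vectors:
        return None
    # Average each dimension across the found vectors
    return [sum(values) / len(vectors) for values in zip(*vectors)]


toy_tweets = [['good', 'day', 'bazinga'], ['bazinga'], ['bad', 'day']]
toy_labels = [True, False, False]

# Discard tweets without features TOGETHER with their labels
features, kept_labels = [], []
for tokens, label in zip(toy_tweets, toy_labels):
    vector = average_vector(tokens, toy_w2v)
    if vector is not None:
        features.append(vector)
        kept_labels.append(label)

print(features, kept_labels)
```

The second tweet has no token in the embedding, so both its features and its label are dropped, which keeps the features and the labels aligned.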

Write the function `generate_text_features(text, w2v)` that gets a string `text` and a word embedding `w2v` and produces the features of this text according to the method described above. The function should return a NumPy array of length 300.

In [None]:
def generate_text_features(text, w2v):
    pass  # Your Code Here...

Now, use this function to produce the features for all three datasets (training, development, test).

In [None]:
# Your Code Here...

### Training a classifier

The next step is straightforward: train a logistic regression model on the training dataset. Report the accuracy on the training and the test datasets.

We recommend using [`sklearn.linear_model.LogisticRegression`](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html).

In [None]:
# Your Code Here...

### Evaluate gender bias in the downstream application

The **Equity Evaluation Corpus (EEC)** consists of 8,640 English sentences carefully chosen to tease out biases towards certain races and genders.

We focus on the sentences related to gender. Every sentence is built out of three elements:

1. Person (e.g., `he`, `this woman`, `my uncle`, `my mother`)
2. Emotion word (e.g., `anger`, `happy`, `gloomy`, `amazing`)
3. Template (e.g., `<person subject> feels <emotion word>`).

that are mixed together to form a sentence, for example:
* he feels anger
* she feels anger
* this woman feels happy
* this man feels happy

Thanks to this systematic construction from templates, the sentences are paired by gender, i.e., the EEC data is built of pairs of sentences that are identical except for a gendered noun phrase (e.g., `she`-`he`, `my mother`-`my father`). From a sentiment analysis point of view, there is no reason for a system to assign different predictions to the paired sentences! So if we find a difference, it could point to a potential gender bias in the downstream application.
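
The pairing can be reproduced in a few lines. The sketch below is only an illustration of the construction, not the code used to build the EEC; the Python-style placeholders in `toy_template` and the short word lists are assumptions made for this example.

```python
# Hypothetical mini-version of the EEC construction
toy_template = '{person} feels {emotion}'
toy_person_pairs = [('she', 'he'),
                    ('this woman', 'this man'),
                    ('my mother', 'my father')]
toy_emotions = ['anger', 'happy']

# Each pair of sentences differs only in the gendered person phrase
sentence_pairs = [(toy_template.format(person=female, emotion=emotion),
                   toy_template.format(person=male, emotion=emotion))
                  for female, male in toy_person_pairs
                  for emotion in toy_emotions]

for female_sentence, male_sentence in sentence_pairs:
    print(female_sentence, '|', male_sentence)
```

Because everything except the person phrase is held fixed within a pair, any difference in the classifier's output between the two sentences can be attributed to the gendered words.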

#### Keep in mind, this is only one operationalization of gender bias in a sentiment analysis system. All the issues with a single, concrete measurement arise here as well! We should always take into account the human context in which the system is deployed!

First, let's download the EEC data.

In [None]:
%%bash

wget https://learn.responsibly.ai/word-embedding/data/Equity-Evaluation-Corpus.zip \
     -O Equity-Evaluation-Corpus.zip -q

wget https://learn.responsibly.ai/word-embedding/data/SemEval2018-Task1-all-data.zip \
     -O SemEval2018-Task1-all-data.zip -q

unzip -qq -o Equity-Evaluation-Corpus.zip -d ./data

unzip -qq -o SemEval2018-Task1-all-data.zip -d ./data

In [None]:
# 🛠 Prepare the EEC data, no need to dig into this cell

eec_df = pd.read_csv('./data/Equity-Evaluation-Corpus/Equity-Evaluation-Corpus.csv')

# Remove the sentences for evaluating racial bias
gender_eec_df = eec_df[eec_df['Race'].isna()].copy()

# Create an identifier to match sentence pairs
# The EEC data comes without this matching
MALE_PERSONS = ('he', 'this man', 'this boy', 'my brother', 'my son', 'my husband',
                'my boyfriend', 'my father', 'my uncle', 'my dad', 'him')

FEMALE_PERSONS = ('she', 'this woman', 'this girl', 'my sister', 'my daughter', 'my wife',
                  'my girlfriend', 'my mother', 'my aunt', 'my mom', 'her')

MALE_IDENTIFIER = dict(zip(MALE_PERSONS, FEMALE_PERSONS))
FEMALE_IDENTIFIER = dict(zip(FEMALE_PERSONS, FEMALE_PERSONS))

PERSON_MATCH_WORDS = {**MALE_IDENTIFIER,
                      **FEMALE_IDENTIFIER}

gender_eec_df['PersonIdentifier'] = gender_eec_df['Person'].map(PERSON_MATCH_WORDS)

gender_eec_df = gender_eec_df.sort_values(['Gender', 'Template', 'Emotion word', 'PersonIdentifier'])

gender_split_index = len(gender_eec_df) // 2

# Create two DataFrames, one for each gender
female_eec_df = gender_eec_df[:gender_split_index].reset_index(False)
male_eec_df = gender_eec_df[gender_split_index:].reset_index(False)

In [None]:
female_eec_df.head()

In [None]:
male_eec_df.head()

Note that the two DataFrames are paired by index. If we take the *i*-th row in each one of them, they differ only in the matched person word:

In [None]:
k = 543  # change my value and run the cell again!
female_eec_df.iloc[k]['Sentence'], male_eec_df.iloc[k]['Sentence']

Compute the probability estimations of the classifier for the female and male parts in the EEC data. If you use `sklearn`, then the classifier's method `predict_proba` is your friend for that!

In [None]:
# Your Code Here...

### Do the same for the neutralize-"debiased" word2vec

Perform all the previous steps for the neutralize-"debiased" word2vec to produce the probability estimations of the EEC data for the classifier using that word embedding.

#### Neutralize-"debias" the word embedding

Hints:
1. Use [`responsibly.we.GenderBiasWE`](https://docs.responsibly.ai/word-embedding-bias.html). 
2. Look for the method `debias`.
3. Set the `method` argument to `'neutralize'`. 
4. Make sure that you set `inplace=True` to save memory. Note that you won't be able to work with the original word embedding after that.
5. Validate the neutralize-"debias" was applied by computing the direct bias measure with the method `calc_direct_bias`.
6. After the bias mitigation, the word embedding itself (as a `gensim` `KeyedVectors`) is accessible through the attribute `model`.
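
As a reminder of what these hints rely on, here is a minimal sketch of the two underlying ideas on made-up 3-d vectors. It assumes the definitions from the Bolukbasi et al. paper: neutralizing removes a word vector's component along the bias direction and re-normalizes, and the direct bias is the mean of |cos(w, g)|^c over neutral words. The sketch does not call `responsibly`; the helper names, the vectors, and the direction are hypothetical, and the library's defaults may differ.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def neutralize(word_vec, direction):
    """Remove the component along `direction` (unit length), re-normalize."""
    projection = dot(word_vec, direction)
    return normalize([w - projection * d
                      for w, d in zip(word_vec, direction)])


def direct_bias(vectors, direction, c=1):
    """Mean of |cos(w, g)|**c over the given word vectors."""
    return sum(abs(dot(normalize(v), direction)) ** c
               for v in vectors) / len(vectors)


# Hypothetical bias direction and "neutral" word vectors
toy_direction = normalize([2.0, 1.0, 0.0])
toy_neutral_vectors = {
    'nurse': [-0.5, 0.7, 0.2],
    'engineer': [0.6, 0.6, 0.3],
    'teacher': [-0.1, 0.8, 0.5],
}

before = direct_bias(list(toy_neutral_vectors.values()), toy_direction)
after = direct_bias([neutralize(v, toy_direction)
                     for v in toy_neutral_vectors.values()],
                    toy_direction)
print(before, after)
```

After neutralizing, the direct bias drops to (numerically) zero by construction. This is exactly why a zero direct bias alone does not show that the bias is gone, as discussed in Part Nine.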

In [None]:
# Your Code Here...

#### Generate features with the "debiased" word embedding and train a new classifier

Check the classifier's accuracy on the training and the test data - did the "debiasing" of the word embeddings hurt the classifier performance?

In [None]:
# Your Code Here...

#### Compute the probability estimations for the male and female sentences in the EEC data with the new classifier

In [None]:
# Your Code Here...

### Gender bias analysis

Now we are ready to bring it all together. You have two classifiers, both trained on the same dataset, but each with a different word embedding. The first used the original word2vec, and the other used the word2vec that has undergone the neutralize-"debias" process. We computed the probability estimates for the EEC data twice, once with each classifier.


**Think about how to evaluate the impact of replacing the word embedding concerning gender bias. Keep in mind that the female and male EEC data is paired!**

#### Your analysis can take two points of view (there are more, but you can start with these):
1. Analyze the difference between the female and male probability estimations for each system *separately* and compare the results.
2. Analyze the difference of differences; start with the difference of probability estimations between the paired female and male sentences for each system, and then compare the two differences.


#### A few possible ideas of what to do:
1. Plot distributions ([`seaborn.displot`](https://seaborn.pydata.org/generated/seaborn.displot.html#seaborn.displot))
2. Compute the [effect size](https://en.wikipedia.org/wiki/Effect_size#Cohen's_d)
3. Perform statistical hypothesis testing to check whether the means are equal, using the paired t-test (`scipy.stats.ttest_rel`)
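
As a starting point for ideas 2 and 3, here is a sketch of the paired computations using only the Python standard library. All the probability values are made up, and the paired variant of Cohen's d (mean difference divided by the standard deviation of the differences) is one common choice among several. For a p-value, use `scipy.stats.ttest_rel` on your real data.

```python
import math
from statistics import mean, stdev

# Hypothetical probability estimates; same index = same sentence pair
female_original = [0.62, 0.35, 0.80, 0.41, 0.55, 0.70]
male_original = [0.58, 0.30, 0.79, 0.36, 0.50, 0.66]
female_debiased = [0.60, 0.33, 0.80, 0.39, 0.53, 0.68]
male_debiased = [0.59, 0.32, 0.79, 0.38, 0.52, 0.68]


def paired_summary(female_probs, male_probs):
    """Mean difference, paired Cohen's d, and paired t statistic."""
    diffs = [f - m for f, m in zip(female_probs, male_probs)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample standard deviation of the differences
    return {
        'diffs': diffs,
        'mean_diff': mean_diff,
        'cohens_d': mean_diff / sd_diff,
        't_statistic': mean_diff / (sd_diff / math.sqrt(len(diffs))),
    }


# Point of view 1: each system separately
original = paired_summary(female_original, male_original)
debiased = paired_summary(female_debiased, male_debiased)

# Point of view 2: difference of differences, pair by pair
diff_of_diffs = [o - d for o, d in zip(original['diffs'], debiased['diffs'])]

print(original['mean_diff'], debiased['mean_diff'], mean(diff_of_diffs))
```

Working with the per-pair differences, rather than comparing the two group means directly, is what makes use of the paired structure of the EEC data.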

In [None]:
# Your Code Here...

#### What is your conclusion? What would be your next steps?

Consider:

1. Group the analysis by the EEC columns (e.g., by emotion).
2. Try another classifier (e.g., `sklearn.ensemble.RandomForestClassifier`).
3. Change the bias mitigation method to *hard* instead of *neutralize*.
4. Analyze the training data from a gender perspective.



Refer to this paper for some ideas:
[Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems](http://saifmohammad.com/WebDocs/EEC/ethics-StarSem-final_with_appendix.pdf). Svetlana Kiritchenko and Saif M. Mohammad. In Proceedings of *Sem, New Orleans, LA, USA, June 2018.

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part Twelve: Examples of Representation Bias in the Context of Gender

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/examples-gender-bias-nlp.png?raw=1" />

<small>Source: Sun, T., Gaut, A., Tang, S., Huang, Y., ElSherief, M., Zhao, J., ... & Wang, W. Y. (2019). [Mitigating Gender Bias in Natural Language Processing: Literature Review](https://arxiv.org/pdf/1906.08976.pdf). arXiv preprint arXiv:1906.08976.</small>

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part Thirteen: Takeaways
<big>💎💎</big>

1. **Downstream application** - putting a system into a human context, be aware of the [abstraction error](http://friedler.net/papers/sts_fat2019.pdf)

2. **Measurements** (a.k.a "what is a *good* system?")

3. **Data** (generation process, corpus building, collection selection bias, features (measurement and operationalization), train vs. validation vs. test datasets)

4. **Impact** of a system on individuals, groups, society, and humanity; **long-term**, **scale-up** and **automated**

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

# Part Fourteen: Resources
<big>💎💎</big>

## [Doing Data Science Responsibly - Resources](https://handbook.responsibly.ai/appendices/resources.html)

In particular:

- Timnit Gebru and Emily Denton - CVPR 2020 - [FATE Tutorial](https://youtu.be/-xGvcDzvi7Q) [Video]

- Rachel Thomas - fast.ai - [Algorithmic Bias (NLP video 16)](https://youtu.be/pThqge9QDn8) [Video]

- Solon Barocas, Moritz Hardt, Arvind Narayanan - [Fairness and machine learning - Limitations and Opportunities](https://fairmlbook.org/) [Textbook]

## [Course // Responsible AI, Law, Ethics & Society](https://learn.responsibly.ai/)

## Non-Technical Overview with More Downstream Application Examples
- [Google - Text Embedding Models Contain Bias. Here's Why That Matters.](https://developers.googleblog.com/2018/04/text-embedding-models-contain-bias.html)
- [Kai-Wei Chang (UCLA) - What It Takes to Control Societal Bias in Natural Language Processing](https://www.youtube.com/watch?v=RgcXD_1Cu18)
- Sun, T., Gaut, A., Tang, S., Huang, Y., ElSherief, M., Zhao, J., ... & Wang, W. Y. (2019). [Mitigating Gender Bias in Natural Language Processing: Literature Review](https://arxiv.org/pdf/1906.08976.pdf). arXiv preprint arXiv:1906.08976.

## Critical Perspective on Bias in NLP

Blodgett, S. L., Barocas, S., Daumé III, H., & Wallach, H. (2020). [Language (Technology) is Power: A Critical Survey of "Bias" in NLP](https://arxiv.org/pdf/2005.14050.pdf). arXiv preprint arXiv:2005.14050.

## Additional Related Work

- **Software Framework for Word Embedding Bias**
  - [WEFE: The Word Embeddings Fairness Evaluation Framework](https://wefe.readthedocs.io/en/latest/)

- **Understanding Bias**
  - Ethayarajh, K., Duvenaud, D., & Hirst, G. (2019, July). [Understanding Undesirable Word Embedding Associations](https://arxiv.org/pdf/1908.06361.pdf). In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 1696-1705). - **Including critical analysis of the current metrics and debiasing methods (quite technical)**

  - Brunet, M. E., Alkalay-Houlihan, C., Anderson, A., & Zemel, R. (2019, May). [Understanding the Origins of Bias in Word Embeddings](https://arxiv.org/pdf/1810.03611.pdf). In International Conference on Machine Learning (pp. 803-811).

- **Discovering Bias**
  - Swinger, N., De-Arteaga, M., Heffernan IV, N. T., Leiserson, M. D., & Kalai, A. T. (2019, January). [What are the biases in my word embedding?](https://arxiv.org/pdf/1812.08769.pdf). In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society (pp. 305-311). ACM.
  
  - Chaloner, K., & Maldonado, A. (2019, August). [Measuring Gender Bias in Word Embeddings across Domains and Discovering New Gender Bias Word Categories](https://www.aclweb.org/anthology/W19-3804). In Proceedings of the First Workshop on Gender Bias in Natural Language Processing (pp. 25-32).

- **Mitigating Bias**
  - Maudslay, R. H., Gonen, H., Cotterell, R., & Teufel, S. (2019). [It's All in the Name: Mitigating Gender Bias with Name-Based Counterfactual Data Substitution](https://arxiv.org/pdf/1909.00871.pdf). arXiv preprint arXiv:1909.00871.
  
  - Shin, S., Song, K., Jang, J., Kim, H., Joo, W., & Moon, I. C. (2020). [Neutralizing Gender Bias in Word Embedding with Latent Disentanglement and Counterfactual Generation](https://arxiv.org/pdf/2004.03133.pdf). arXiv preprint arXiv:2004.03133.
  
  - Zhang, B. H., Lemoine, B., & Mitchell, M. (2018, December). [Mitigating unwanted biases with adversarial learning](https://dl.acm.org/doi/pdf/10.1145/3278721.3278779?casa_token=yd1KGvVDBGwAAAAA:YzUT7d8Fq4bOV2b5M-CB43NLqIReW7wx2EaZj0omJ0ncbZF_pkPFoyV6WHWIBnG_HKIRqiG7FWFjsA). In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society (pp. 335-340). [Demo](https://colab.research.google.com/notebooks/ml_fairness/adversarial_debiasing.ipynb)
  
- **Fairness in Classification**
  - Prost, F., Thain, N., & Bolukbasi, T. (2019, August). [Debiasing Embeddings for Reduced Gender Bias in Text Classification](https://arxiv.org/pdf/1908.02810.pdf). In Proceedings of the First Workshop on Gender Bias in Natural Language Processing (pp. 69-75).
  
  - Romanov, A., De-Arteaga, M., Wallach, H., Chayes, J., Borgs, C., Chouldechova, A., ... & Kalai, A. (2019, June). [What's in a Name? Reducing Bias in Bios without Access to Protected Attributes](https://arxiv.org/pdf/1904.05233.pdf). In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4187-4195).

- **Grammatical Gender**
  - Zhou, P., Shi, W., Zhao, J., Huang, K. H., Chen, M., & Chang, K. W. [Analyzing and Mitigating Gender Bias in Languages with Grammatical Gender and Bilingual Word Embeddings](https://aiforsocialgood.github.io/icml2019/accepted/track1/pdfs/47_aisg_icml2019.pdf). ICML 2019 - AI for Social Good. [Poster](https://aiforsocialgood.github.io/icml2019/accepted/track1/posters/47_aisg_icml2019.pdf)

  - Zhao, J., Mukherjee, S., Hosseini, S., Chang, K. W., & Awadallah, A. [Gender Bias in Multilingual Embeddings](https://www.researchgate.net/profile/Subhabrata_Mukherjee/publication/340660062_Gender_Bias_in_Multilingual_Embeddings/links/5e97428692851c2f52a6200a/Gender-Bias-in-Multilingual-Embeddings.pdf).

  - Gonen, H., Kementchedjhieva, Y., & Goldberg, Y. (2019). [How does Grammatical Gender Affect Noun Representations in Gender-Marking Languages?](https://arxiv.org/pdf/1910.14161.pdf). arXiv preprint arXiv:1910.14161.

- **Other**  
  - Zhao, J., Wang, T., Yatskar, M., Cotterell, R., Ordonez, V., & Chang, K. W. (2019, June). [Gender Bias in Contextualized Word Embeddings](https://arxiv.org/pdf/1904.03310.pdf). In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 629-634). [slides](https://jyzhao.net/files/naacl19.pdf)


##### Complete example of using `responsibly` with Word2Vec, GloVe and fastText: http://docs.responsibly.ai/notebooks/demo-gender-bias-words-embedding.html


## Bias in NLP

There were only around a dozen papers in this field until 2019, but nowadays plenty of work is being done. Two venues from back then:
- [1st ACL Workshop on Gender Bias for Natural Language Processing](https://genderbiasnlp.talp.cat/)
- [NAACL 2019](https://naacl2019.org/)

<img src="https://github.com/ResponsiblyAI/word-embedding/blob/master/images/banner.png?raw=1" />

<center><h1>THE END!</h1></center>