# Workshop // Exploring Gender Bias in Word Embedding

## https://learn.responsibly.ai/word-embedding

Powered by [`responsibly`](https://docs.responsibly.ai/) - Toolkit for auditing and mitigating bias and fairness of machine learning systems 🔎🤖🧰

# Part Eleven: Your Turn!
<big>⌨️</big>

Note: The first two tasks require a basic background in Python programming. For the last task, you also need some experience with Machine Learning and Natural Language Processing (NLP).

In [None]:
from responsibly.we import load_w2v_small

w2v_small = load_w2v_small()

## Task 1: Racial bias

Let's explore racial bias using Tolga's approach. We will use the [`responsibly.we.BiasWordEmbedding`](http://docs.responsibly.ai/word-embedding-bias.html#ethically.we.bias.BiasWordEmbedding) class. `GenderBiasWE` is a subclass of `BiasWordEmbedding`.

In [None]:
from responsibly.we import BiasWordEmbedding

w2v_small_racial_bias = BiasWordEmbedding(w2v_small, only_lower=True)

💎💎💎 Identify the racial direction using the `sum` method

In [None]:
white_common_names = ['Emily', 'Anne', 'Jill', 'Allison', 'Laurie', 'Sarah', 'Meredith', 'Carrie',
                      'Kristen', 'Todd', 'Neil', 'Geoffrey', 'Brett', 'Brendan', 'Greg', 'Matthew',
                      'Jay', 'Brad']

black_common_names = ['Aisha', 'Keisha', 'Tamika', 'Lakisha', 'Tanisha', 'Latoya', 'Kenya', 'Latonya',
                      'Ebony', 'Rasheed', 'Tremayne', 'Kareem', 'Darnell', 'Tyrone', 'Hakim', 'Jamal',
                      'Leroy', 'Jermaine']

w2v_small_racial_bias._identify_direction('Whites', 'Blacks',
                                          definitional=(white_common_names, black_common_names),
                                          method='sum')
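
To build intuition for what a "sum"-style direction can look like, here is a minimal NumPy sketch on toy vectors. It assumes one plausible recipe (sum each group's vectors, normalize both sums, subtract, and normalize the difference); the helper names `unit` and `sum_direction` are hypothetical, and the library's own implementation may differ in details.

```python
import numpy as np


def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)


def sum_direction(group1_vectors, group2_vectors):
    """Direction between two groups of word vectors (assumed 'sum' recipe).

    Sum each group, normalize both sums, subtract, and normalize again.
    """
    sum1 = np.sum(group1_vectors, axis=0)
    sum2 = np.sum(group2_vectors, axis=0)
    return unit(unit(sum1) - unit(sum2))


# Toy 2-d "embeddings", purely illustrative (not real word vectors).
group1 = np.array([[1.0, 0.2], [0.9, 0.1]])
group2 = np.array([[0.1, 1.0], [0.2, 0.8]])

direction = sum_direction(group1, group2)
print(direction)
```

The resulting vector has unit length and points from the second group toward the first, so words closer to the first group get positive projections.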

Use the neutral profession names to measure the racial bias

In [None]:
from responsibly.we.data import BOLUKBASI_DATA

neutral_profession_names = BOLUKBASI_DATA['gender']['neutral_profession_names']

In [None]:
neutral_profession_names[:10]

In [None]:
import matplotlib.pyplot as plt

f, ax = plt.subplots(1, figsize=(10, 10))

w2v_small_racial_bias.plot_projection_scores(neutral_profession_names, n_extreme=20, ax=ax);

Calculate the direct bias measure

In [None]:
# Your Code Here...
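
As a reference point, the direct bias measure from Bolukbasi et al. is the average of $|\cos(w, g)|^c$ over a set of neutral words $w$, where $g$ is the bias direction and $c$ controls strictness. The sketch below computes it with plain NumPy on toy vectors; the function name `direct_bias` and the data are illustrative assumptions, not the library's API.

```python
import numpy as np


def direct_bias(word_vectors, direction, c=1):
    """Mean of |cos(w, direction)|**c over the rows of word_vectors."""
    direction = direction / np.linalg.norm(direction)
    norms = np.linalg.norm(word_vectors, axis=1)
    cosines = word_vectors @ direction / norms
    return float(np.mean(np.abs(cosines) ** c))


# Toy data: a direction along the first axis and three "neutral" words.
toy_direction = np.array([1.0, 0.0])
toy_neutral_vectors = np.array([[1.0, 0.0],   # fully aligned with the direction
                                [0.0, 2.0],   # orthogonal to the direction
                                [3.0, 4.0]])  # partially aligned

score = direct_bias(toy_neutral_vectors, toy_direction)
print(score)
```

A score near 0 means the neutral words are almost orthogonal to the direction; a score near 1 means they lie almost entirely along it.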

Keep exploring the racial bias

In [None]:
# Your Code Here...

## Task 2 - Your WEAT test

Open the [word embedding demo page in the `responsibly` documentation](http://docs.responsibly.ai/notebooks/demo-word-embedding-bias.html#it-is-possible-also-to-expirements-with-new-target-word-sets-as-in-this-example-citizen-immigrant), and look at how the function `calc_weat_pleasant_unpleasant_attribute` is used. What was that experiment trying to test? What was the result? Can you come up with other experiments?
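
For orientation, the core WEAT statistic can be sketched in a few lines of NumPy. The sketch assumes the usual definition: each target word's association is its mean cosine similarity to attribute set A minus its mean cosine similarity to attribute set B, and the effect size is the difference of the two target sets' mean associations divided by the standard deviation over all target words. The function names, the toy vectors, and the choice of the sample standard deviation (`ddof=1`) are assumptions of this sketch, not the library's code.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def association(w, attrs_a, attrs_b):
    """Mean similarity of w to set A minus mean similarity of w to set B."""
    return (np.mean([cosine(w, a) for a in attrs_a])
            - np.mean([cosine(w, b) for b in attrs_b]))


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Standardized difference between the associations of X and Y."""
    s_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    pooled_std = np.std(s_x + s_y, ddof=1)  # assumption: sample std-dev
    return float((np.mean(s_x) - np.mean(s_y)) / pooled_std)


# Toy 2-d vectors: X lies near attribute A, Y lies near attribute B.
X = [np.array([1.0, 0.1]), np.array([0.9, 0.2])]
Y = [np.array([0.1, 1.0]), np.array([0.2, 0.9])]
A = [np.array([1.0, 0.0])]
B = [np.array([0.0, 1.0])]

effect = weat_effect_size(X, Y, A, B)
print(effect)
```

A positive effect size says the first target set is more associated with A (and the second with B); swapping the target sets flips the sign.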

In [None]:
from responsibly.we import calc_weat_pleasant_unpleasant_attribute

In [None]:
# Your Code Here...

## Task 3 - Sentiment Analysis

For this task, you will need some background in NLP, and in particular in training a text classifier in Python.

One way to examine bias in word embeddings is through a downstream application. Here we will use a sentiment analysis classifier for tweets: given a tweet, the system infers the *valence*/*intensity* of the sentiment it expresses. The valence is expressed as a real number between 0 and 1, where 0 represents the negative end and 1 the positive end.

The system is going to be rather simple, and consists of three components:

1. Preprocessing (e.g., removing stopwords and punctuation, [tokenization](https://en.wikipedia.org/wiki/Text_segmentation#Word_segmentation))
2. Transforming the tokens of a tweet into a single 300-dimensional vector.
3. Applying logistic regression to predict the valence.

Our goal is to assess whether 

You are going to build two versions of that system: (1) with the original word2vec; and (2) 

Kiritchenko, S., & Mohammad, S. M. (2018). [Examining gender and race bias in two hundred sentiment analysis systems](https://arxiv.org/pdf/1805.04508.pdf). arXiv preprint arXiv:1805.04508.

[Equity Evaluation Corpus (EEC)](http://saifmohammad.com/WebPages/Biases-SA.html)



### Data

First, let's load the "Affect in Tweets" datasets, taken from the [SemEval 2018](https://competitions.codalab.org/competitions/17751#learn_the_details-datasets) competition. We have training, development and test datasets. We will use only the first and the last, but feel free to use the development dataset to select models and tune hyperparameters.

We have three columns:

1. `Tweet` - The tweet itself as a string (the input).
2. `Intensity Score` - The sentiment intensity of the tweet, in the range [0, 1].
3. `Affect Dimension` - You can ignore it. It is `'valence'` for all of the datapoints.


In [None]:
import pandas as pd

train_df = pd.read_csv('./SemEval2018-Task1-all-data/English/V-reg/2018-Valence-reg-En-train.txt',
                       sep='\t', index_col=0)
dev_df = pd.read_csv('./SemEval2018-Task1-all-data/English/V-reg/2018-Valence-reg-En-dev.txt',
                       sep='\t', index_col=0)
test_df = pd.read_csv('./SemEval2018-Task1-all-data/English/V-reg/2018-Valence-reg-En-test-gold.txt',
                       sep='\t', index_col=0)

In [None]:
# A few examples

train_df.head()

In [None]:
# Convert all the labels from real numbers into boolean values,
# setting the threshold at 0.5, and creating a new column named
# `label`

train_df['label'] = train_df['Intensity Score'] > 0.5
dev_df['label'] = dev_df['Intensity Score'] > 0.5
test_df['label'] = test_df['Intensity Score'] > 0.5

Now, let's download the **complete** word2vec word embedding (not filtered down to lowercase words only), and load it using `gensim`.

In [None]:
!wget https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz

In [None]:
from gensim.models import KeyedVectors

# Load the word2vec
w2v_model = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin.gz',
                                              binary=True)

In [None]:
# Get the vector embedding for a word
w2v_model['home']

In [None]:
# Check whether there is an embedding for a word
'bazinga' in w2v_model

### Preprocessing & feature extraction

Before we transform a tweet into a 300-dimensional vector, it should be broken into tokens ("words") and cleaned. You can do so with various Python packages for NLP, such as [NLTK](https://www.nltk.org/) and [spaCy](https://spacy.io/). Feel free to use them if you would like to! We will use the preprocessing functionality that comes with [`gensim`](https://radimrehurek.com/gensim/parsing/preprocessing.html).

In [None]:
from gensim.parsing.preprocessing import (preprocess_string,
                                          strip_tags,
                                          strip_punctuation,
                                          strip_multiple_whitespaces,
                                          strip_numeric,
                                          remove_stopwords)

# We pick a subset of the default filters,
# in particular, we do not take
# strip_short() and stem_text().
FILTERS = [strip_punctuation,
           strip_tags,
           strip_multiple_whitespaces,
           strip_numeric,
           remove_stopwords]

# Example: tokenize and clean a short string with the chosen filters
preprocess_string('This is a "short" text!', FILTERS)

After preprocessing all the tweets, we get tokens. We transform each token into a 300d vector using the word embedding, and then compute the *average* vector, which is 300d as well. This vector serves as the feature values for each tweet.

Note two possible pitfalls here:

1. Make sure that the token exists in the word embedding.
2. Sometimes, there are tweets without any token found in the word embedding. Discard these tweets from the data. Keep in mind that you should discard the labels as well.
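
The averaging step and the two pitfalls above can be sketched on already-tokenized input. This is a minimal illustration, not the reference solution: the helper name `average_token_vectors`, the convention of returning `None` when no token is known, and the toy 3-d dictionary standing in for the 300d embedding are all assumptions of this sketch. The `token in embedding` check works the same way for a plain dict and for the `w2v_model` loaded earlier.

```python
import numpy as np


def average_token_vectors(tokens, embedding):
    """Return the mean vector of the tokens found in `embedding`.

    Tokens missing from the embedding are skipped (pitfall 1).
    If no token is found at all, return None so the caller can
    discard the tweet together with its label (pitfall 2).
    """
    vectors = [embedding[token] for token in tokens if token in embedding]
    if not vectors:
        return None
    return np.mean(vectors, axis=0)


# Toy 3-d "embedding", purely illustrative.
toy_embedding = {
    'good': np.array([1.0, 0.0, 1.0]),
    'day': np.array([0.0, 2.0, 1.0]),
}

features = average_token_vectors(['good', 'day', 'zzz'], toy_embedding)
missing = average_token_vectors(['zzz'], toy_embedding)
print(features, missing)
```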

Write the function `generate_text_features(text, w2v)` that gets a string `text` and a word embedding `w2v` and produces the features of this text according to the method described above.

In [None]:
def generate_text_features(text, w2v):
    pass  # Your Code Here...

Now, use this function to produce the features for all three datasets (training, development, test).

In [None]:
# Your Code Here...
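
One way to keep features and labels aligned when some tweets are discarded is to filter them together. The sketch below assumes a feature function that returns `None` for tweets with no token in the embedding (a convention chosen here, not prescribed by the workshop); `build_dataset` is a hypothetical helper name and the data is a toy example.

```python
import numpy as np


def build_dataset(feature_rows, labels):
    """Stack the non-None feature vectors and keep the matching labels.

    Assumes at least one row has features; np.vstack fails otherwise.
    """
    kept = [(features, label)
            for features, label in zip(feature_rows, labels)
            if features is not None]
    X = np.vstack([features for features, _ in kept])
    y = np.array([label for _, label in kept])
    return X, y


# Toy example: the second "tweet" had no known tokens.
toy_rows = [np.array([0.1, 0.2]), None, np.array([0.3, 0.4])]
toy_labels = [True, False, False]

X_toy, y_toy = build_dataset(toy_rows, toy_labels)
print(X_toy.shape, y_toy)
```

Filtering features and labels in a single pass avoids the easy mistake of dropping a row from one but not the other.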

### Training a classifier

The next step is straightforward: train a logistic regression model on the training dataset. Report the accuracy on both the training and the test datasets.

We recommend using [`sklearn.linear_model.LogisticRegression`](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html).
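
If you have not used this class before, the fit/predict/score pattern looks roughly as follows. The sketch runs on synthetic data generated with NumPy, not on the tweet features, so the numbers it prints say nothing about the actual task; `max_iter=1000` is an arbitrary choice made here to reduce the chance of convergence warnings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the tweet features: the label depends
# only on the sign of the first feature.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 5))
y_train = X_train[:, 0] > 0
X_test = rng.normal(size=(100, 5))
y_test = X_test[:, 0] > 0

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

train_accuracy = accuracy_score(y_train, clf.predict(X_train))
test_accuracy = accuracy_score(y_test, clf.predict(X_test))
print(train_accuracy, test_accuracy)
```

Comparing the two accuracies gives a first, rough indication of overfitting: a large gap between training and test accuracy is a warning sign.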

In [None]:
# Your Code Here...

### Evaluate gender bias in the downstream application