# Machine Learning Fairness

_**Note**: The goal of this part of the assignment is to understand the kinds of biases that commonly sneak into machine-learned systems, along with a handful of techniques that improve on standard modeling. While we hope you find this instructive, we recognize that these research results may negatively affect some students. If you have serious concerns, please reach out to the teaching staff about alternate arrangements._

From simple count-based models to the most complex neural architectures, machine learning models are ultimately nothing more than the product of the signals and labels in the training set.  That these tools can so effectively mimic and generalize from the training set distribution is the key to why they are so useful in so many applications.

This powerful ability to fit data is a double-edged sword.  Unfortunately, the real world is filled with inequality, unfairness, and stereotypes.  When the signals and labels systematically capture these aspects of the world, the powerful ability to generalize goes by another name: bias.  This bias can take many forms: a minority group may be underrepresented in the training set (the loss function is incentivized to produce a model that works better on the majority at the expense of the minority), or predictions may be systematically biased against a protected group (i.e., the model learns to predict the protected label and, from that, the actual prediction rather than learning the prediction directly).

In this part of the assignment, we will take a look at a few nice analyses that discuss this bias. Below are a few questions about these papers.

- [How to make a racist AI without really trying](http://blog.conceptnet.io/posts/2017/how-to-make-a-racist-ai-without-really-trying/)
- [Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings](https://arxiv.org/pdf/1607.06520.pdf)
- [Data Decisions and Theoretical Implications when Adversarially Learning Fair Representations](https://arxiv.org/pdf/1707.00075.pdf)

## Questions about the Racist AI

1.  In [Step 5](http://blog.conceptnet.io/posts/2017/how-to-make-a-racist-ai-without-really-trying/#Step-5:-Behold-the-monstrosity-that-we-have-created), the author shows that substituting a type of cuisine into a fixed sentence significantly changes the overall sentiment score of their model.  What is the difference in sentiment score between the words ```Italian``` and ```Mexican``` (not the difference in the whole sentence!), assuming that embeddings for all words in the sentence are found in GloVe?

2. Rank ConceptNet Numberbatch, GloVe and Word2Vec by ethnic bias as defined by the author.

3. What technique does the author apply to achieve that lower bias?

```python
# 1
# text_to_sentiment("Let's go get Italian food")
italian_score = 2.0429166109408983
# text_to_sentiment("Let's go get Chinese food")
chinese_score = 1.4094033658140972
# text_to_sentiment("Let's go get Mexican food")
mexican_score = 0.38801985560121732
# Each score is the mean sentiment score of all the words in the sentence.
# Using the "Chinese" sentence as a baseline, compute how far the "Mexican"
# and "Italian" sentences sit from it, then add the two distances to get the
# gap between the "Italian" and "Mexican" sentence scores.
mexican_diff = chinese_score - mexican_score
italian_diff = italian_score - chinese_score
ital_vs_mex = italian_diff + mexican_diff
print('Difference in sentiment score between "Italian" and "Mexican" = {0}'.format(ital_vs_mex))

# 2
print('\nEthnic Bias as defined by the author (most to least)')
print("1. Word2Vec")
print("2. GloVe")
print("3. ConceptNet Numberbatch")

# 3
print('\nDebiasing Word Embeddings')
```

```
Difference in sentiment score between "Italian" and "Mexican" = 1.654896755339681

Ethnic Bias as defined by the author (most to least)
1. Word2Vec
2. GloVe
3. ConceptNet Numberbatch

Debiasing Word Embeddings
```
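
The value printed above is the gap between two sentence-level scores. Because each sentence score is the mean over its tokens, and the two sentences differ in a single token, the word-level gap can be recovered by multiplying by the token count. The sketch below assumes five scored tokens per sentence. That count is an assumption, not something the cell above confirms: it depends on how the tokenizer splits "Let's" and on which tokens are found in the embedding vocabulary. The helper name `word_level_gap` is also hypothetical.

```python
# Sentence-level scores copied from the cell above.
italian_sentence = 2.0429166109408983
mexican_sentence = 0.38801985560121732


def word_level_gap(score_a, score_b, n_tokens):
    """Gap between the two differing words, given two sentence means.

    Assumes the sentences share every token except one, so that
    mean_a - mean_b == (word_a - word_b) / n_tokens.
    """
    return (score_a - score_b) * n_tokens


N_TOKENS = 5  # assumption: let's / go / get / <cuisine> / food
gap = word_level_gap(italian_sentence, mexican_sentence, N_TOKENS)
print(f"Word-level gap assuming {N_TOKENS} tokens: {gap:.3f}")
```

If the tokenizer actually yields a different number of scored tokens, only `N_TOKENS` needs to change.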


## Questions about Debiasing Word Embeddings

Word embeddings are commonly used in deep neural networks to solve analogy tasks (see the corresponding sections in both [Word2Vec](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf) and [GloVe](https://nlp.stanford.edu/pubs/glove.pdf)).  This paper quickly reintroduces that task, then continues to explore the analogy task with additional tuples that illustrate the bias that these vectors have picked up.

1.  What evidence from the previous analysis makes the scatter plot of Figure 4 not surprise you?

*The Racist AI analysis demonstrated that ordinary descriptive words like Italian and Mexican have vastly different sentiment scores, which shows that embeddings absorb stereotypes from their training text. Based on this, it is not surprising that she-he words exhibit the same kind of bias.*

2.  Why are the results of Table 1 important?

*The results of Table 1 show that the authors' debiasing techniques neither seriously hurt nor help accuracy on standard benchmarks. This is important because it removes a key argument against using debiasing techniques: if the techniques provide large benefits with no accuracy-related drawbacks, then why not use them?*

3.  What are the two stages of debiasing?

*(1) Identify the gender subspace: find the component of the embedding vectors that encodes gender bias.*

*(2) Neutralize and Equalize, or Soften: both reduce the amount of bias in the embeddings, either by (a) eliminating it entirely, which works but loses other senses such as "to grandfather in a cell phone plan", or by (b) repositioning the word vectors so that those fringe senses are preserved while the vectors become more gender-neutral.*

4.  Once the subspace is found, one of the options to update vectors is called?

*Neutralize and Equalize... this is a weird question*
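
The two stages described in the answers above can be sketched on toy vectors. This is a minimal illustration, not the paper's implementation: the 3-d embeddings are made up, the gender subspace is reduced to a single direction taken from one he/she difference (the paper derives it from several definitional pairs), and all helper names are hypothetical. Plain Python lists are used so the block runs without dependencies.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def scale(v, c):
    return [c * a for a in v]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


def unit(v):
    return scale(v, 1.0 / math.sqrt(dot(v, v)))


def project(v, g):
    """Component of v along the unit direction g."""
    return scale(g, dot(v, g))


def neutralize(w, g):
    """Drop the component of w along g, then rescale to unit length."""
    return unit(sub(w, project(w, g)))


def equalize(a, b, g):
    """Re-place a definitional pair so both words have unit length,
    share the same off-axis part, and differ only along g."""
    mu = scale(add(a, b), 0.5)
    mu_b = project(mu, g)
    nu = sub(mu, mu_b)
    radius = math.sqrt(max(0.0, 1.0 - dot(nu, nu)))
    return [add(nu, scale(unit(sub(project(w, g), mu_b)), radius))
            for w in (a, b)]


# Toy unit-length embeddings (invented for illustration).
he = unit([1.0, 0.2, 0.1])
she = unit([-1.0, 0.2, 0.1])
programmer = unit([0.4, 0.8, 0.3])

# Stage 1: identify the gender direction.
g = unit(sub(he, she))

# Stage 2: neutralize a gender-neutral word, equalize the definitional pair.
programmer_n = neutralize(programmer, g)
he_e, she_e = equalize(he, she, g)

print("gender component of 'programmer':", dot(programmer_n, g))
print("similarity to he / she:", dot(programmer_n, he_e), dot(programmer_n, she_e))
```

After both steps, the neutralized word has no component along the gender direction and is equally similar to both members of the equalized pair.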

## Questions about Adversarial Learning

1.  What is the intuition behind the parity gap measure?  (Don't give us the formula, give us <= TWO sentences.)

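*The parity gap measures how unevenly the model hands out positive predictions across the two groups of the protected attribute. It compares the rate at which each group is predicted positive, regardless of the true labels.*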

2.  What is the intuition behind the equality gap measure?  (Don't give us the formula, give us <= TWO sentences.)

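*The equality gap measures how unevenly the model treats the two groups among examples that share the same true label. It compares the rate at which each group is classified correctly for that label.*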
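
Both measures can be computed from a list of predictions. The sketch below follows the usual demographic-parity and equality-of-opportunity readings of the two gaps; the function names and toy data are hypothetical, and the exact normalization in the paper may differ in detail.

```python
def rate(flags):
    """Fraction of entries that are 1 (or True)."""
    return sum(flags) / len(flags)


def parity_gap(y_pred, z):
    """|P(pred = 1 | z = 1) - P(pred = 1 | z = 0)|; true labels are ignored."""
    group1 = [p for p, g in zip(y_pred, z) if g == 1]
    group0 = [p for p, g in zip(y_pred, z) if g == 0]
    return abs(rate(group1) - rate(group0))


def equality_gap(y_true, y_pred, z, label=1):
    """Gap in per-group accuracy among examples whose true label is `label`."""
    def correct_rate(group):
        hits = [int(p == label)
                for t, p, g in zip(y_true, y_pred, z)
                if g == group and t == label]
        return rate(hits)
    return abs(correct_rate(1) - correct_rate(0))


# Toy data: z is the protected attribute, y_true the label, y_pred the prediction.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
z      = [1, 1, 1, 1, 0, 0, 0, 0]

print("parity gap:", parity_gap(y_pred, z))
print("equality gap (label 1):", equality_gap(y_true, y_pred, z, label=1))
print("equality gap (label 0):", equality_gap(y_true, y_pred, z, label=0))
```

A gap of zero means the two groups are treated identically under that measure; the toy data is deliberately skewed so that every gap is nonzero.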

3.  What is the intuition behind $J_{\lambda}$?  (Don't give us the formula, give us <= TWO sentences.)

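*Without $J_{\lambda}$, training the adversary $a()$ in the traditional way would also push $g(X)$ to encode $Z$, but we want the opposite. $J_{\lambda}$ passes $g(X)$ through unchanged on the forward pass and flips the sign of the gradient (scaled by $\lambda$) on the backward pass, so $g$ is trained to predict $Y$ while making it difficult for $a()$ to predict $Z$.*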
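
The effect of a gradient-reversing layer can be seen in a scalar toy model. This is a hand-rolled sketch under stated assumptions: the representation is a single weight `w` times the input, the adversary is a single weight `v`, and the class `GradientReversal` is a hypothetical stand-in for what would be a custom autograd operation in a real framework.

```python
class GradientReversal:
    """Identity on the forward pass; multiplies the gradient by -lam going back."""

    def __init__(self, lam):
        self.lam = lam

    def forward(self, h):
        return h

    def backward(self, grad_h):
        return -self.lam * grad_h


def adversary_loss(w, v, x, z):
    """Squared error of a linear adversary v * h, where h = w * x."""
    return (v * (w * x) - z) ** 2


x, z = 1.0, 2.0   # one input and its protected attribute (toy values)
w, v = 1.0, 1.0   # representation weight and adversary weight
layer = GradientReversal(lam=1.0)

# Forward pass: the adversary sees the representation unchanged.
h = layer.forward(w * x)

# Backward pass: the gradient is reversed before it reaches w.
grad_h = 2.0 * (v * h - z) * v
grad_w = layer.backward(grad_h) * x

# An ordinary gradient-descent step on w now *increases* the adversary's loss.
learning_rate = 0.1
w_new = w - learning_rate * grad_w

loss_before = adversary_loss(w, v, x, z)
loss_after = adversary_loss(w_new, v, x, z)
print("adversary loss before:", loss_before)
print("adversary loss after: ", loss_after)
```

The adversary's own weight `v` would still be updated with the unreversed gradient, which is what keeps the two parts of the model in competition.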