# Machine Learning Fairness

_**Note**: The goal of this part of the assignment is to understand the kinds of biases that commonly sneak into machine-learned systems and a handful of techniques for improving standard modeling. While we hope you find this instructive, we recognize that these research results may negatively affect some students. Please reach out to the teaching staff if you have serious concerns and would like alternate arrangements._

From simple count-based models to the most complex neural architectures, machine learning models are ultimately nothing more than the product of the signals and labels in the training set.  That these tools can so effectively mimic and generalize from the training set distribution is the key to why they are so useful in so many applications.

This powerful ability to fit data is a double-edged sword.  Unfortunately, the real world is filled with inequality, unfairness, and stereotypes.  When the signals and labels systematically capture these aspects of the world, the powerful ability to generalize goes by another name: bias.  This bias can take many forms.  A minority group of entries in the training set may be underrepresented, so the loss function is incentivized to produce a model that works better on the majority at the expense of the minority.  Alternatively, predictions may be systematically biased against a protected group: the model learns to predict the protected label and derives the actual prediction from it, rather than learning the prediction directly.

In this part of the assignment, we will take a look at a few nice analyses that discuss this bias. The readings are listed below, followed by a few questions about each.

- [How to make a racist AI without really trying](http://blog.conceptnet.io/posts/2017/how-to-make-a-racist-ai-without-really-trying/)
- [Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings](https://arxiv.org/pdf/1607.06520.pdf)
- [Data Decisions and Theoretical Implications when Adversarially Learning Fair Representations](https://arxiv.org/pdf/1707.00075.pdf)

## Questions about the Racist AI

1.  In [Step 5](http://blog.conceptnet.io/posts/2017/how-to-make-a-racist-ai-without-really-trying/#Step-5:-Behold-the-monstrosity-that-we-have-created), the author shows that substituting a type of cuisine into a fixed sentence significantly changes the overall sentiment score of their model.  What is the difference in sentiment score between the words ```Italian``` and ```Mexican``` (not the difference in the whole sentence!), assuming that embeddings for all words in the sentence are found in GloVe?

2. Rank ConceptNet Numberbatch, GloVe, and Word2Vec by ethnic bias as defined by the author.

3. What technique does the author apply to achieve the lower bias?

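1. The sentence-level difference is 2.0429166109408983 - 0.38801985560121732 ≈ 1.6549. Because the model scores a sentence as the mean of its per-word scores, the word-level difference is this value multiplied by the number of tokens in the sentence.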

2. From most to least biased: Word2Vec (15.573), then GloVe (13.041597745167659), then ConceptNet Numberbatch (3.805).

3. The author switches to ConceptNet Numberbatch, a set of word embeddings built with the ConceptNet knowledge graph. Its build process includes a step that adjusts the embeddings to identify and remove some sources of algorithmic racism and sexism, an idea based on the Debiasing Word Embeddings paper.
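
The arithmetic in answer 1 can be checked with a short sketch. It assumes that the blog's model scores a sentence as the plain mean of its per-word scores and that only one word differs between the two sentences. The helper `word_gap` and its `n_tokens` parameter are hypothetical names introduced here; the token count depends on the tokenizer and is not stated in this document, so it is left as an input.

```python
# Sentence-level sentiment scores quoted in answer 1.
italian_sentence = 2.0429166109408983
mexican_sentence = 0.38801985560121732

# Difference between the two whole-sentence scores.
sentence_gap = italian_sentence - mexican_sentence


def word_gap(sentence_gap: float, n_tokens: int) -> float:
    """Recover the single-word gap from the sentence gap.

    Assumption: the sentence score is the mean of n_tokens per-word
    scores, and the two sentences differ in exactly one word.
    """
    return sentence_gap * n_tokens


print(round(sentence_gap, 4))  # 1.6549
```

Calling `word_gap(sentence_gap, n)` with the token count `n` produced by the tokenizer then gives the word-level difference the question asks for.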

## Questions about Debiasing Word Embeddings

Word embeddings are commonly used in deep neural networks to solve analogy tasks (see the corresponding sections in both [Word2Vec](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf) and [GloVe](https://nlp.stanford.edu/pubs/glove.pdf)).  This paper quickly reintroduces that task, then continues to explore the analogy task with additional tuples that illustrate the bias that these vectors have picked up.

1.  What evidence from the previous analysis makes the scatter plot of Figure 4 not surprise you?
2.  Why are the results of Table 1 important?
3.  What are the two stages of debiasing?
4.  Once the subspace is found, one of the options to update vectors is called?

1. The earlier analysis used the w2vNEWS embedding to show that gender stereotypes are prevalent. Figure 4 is therefore unsurprising: it shows that the stereotypes are prevalent in both the GloVe and w2vNEWS embeddings, so they are not an artifact of a particular training corpus or of the word2vec methodology.

2. The results of Table 1 are important because they show that performance does not degrade after debiasing, which is the basis for the experiments in the latter part of the paper.

3. The first stage, Identify Gender Subspace, finds a direction of the embedding that captures the bias. The second stage offers two options: Neutralize and Equalize, or Soften. Neutralize ensures that gender-neutral words are zero in the gender subspace. Equalize perfectly equalizes sets of words outside the subspace and thereby enforces the property that any neutral word is equidistant to all words in each equality set.

4. Once the subspace is found, the vectors can be updated with either hard debiasing (Neutralize and Equalize) or soft bias correction (Soften).
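
The Neutralize and Equalize steps described above can be sketched in a few lines. This is a minimal illustration, not the paper's code: it assumes a one-dimensional gender subspace given by a unit vector `g`, unit-length word vectors, and an equality set of exactly two words. The toy three-dimensional vectors and all helper names are made up for the example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def scale(u, c):
    return [c * a for a in u]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


def unit(u):
    return scale(u, 1.0 / norm(u))


def project(u, g):
    """Component of u along the unit direction g."""
    return scale(g, dot(u, g))


def neutralize(w, g):
    """Remove the gender component of w, then renormalize."""
    return unit(sub(w, project(w, g)))


def equalize(pair, g):
    """Make both words share the same off-subspace part and differ only along g."""
    mu = scale(add(pair[0], pair[1]), 0.5)
    mu_b = project(mu, g)
    nu = sub(mu, mu_b)  # shared part outside the subspace
    coef = math.sqrt(max(0.0, 1.0 - dot(nu, nu)))
    return [add(nu, scale(unit(sub(project(w, g), mu_b)), coef)) for w in pair]


# Toy vectors (hypothetical values, for illustration only).
g = unit([1.0, 0.0, 0.0])
he = unit([0.8, 0.5, 0.2])
she = unit([-0.6, 0.6, 0.3])
nurse = unit([-0.4, 0.3, 0.8])

neutral = neutralize(nurse, g)
he_eq, she_eq = equalize([he, she], g)

# After both steps the neutral word is equally similar to each word of the pair.
print(abs(dot(neutral, he_eq) - dot(neutral, she_eq)) < 1e-9)  # True
```

The final check mirrors the property stated in answer 3: once a word is neutralized and a pair is equalized, the neutral word is equidistant to both members of the pair.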

## Questions about Adversarial Learning

1.  What is the intuition behind the parity gap measure?  (Don't give us the formula, give us <= TWO sentences.)
2.  What is the intuition behind the equality gap measure?  (Don't give us the formula, give us <= TWO sentences.)
3.  What is the intuition behind $J_{\lambda}$?  (Don't give us the formula, give us <= TWO sentences.)

1. The parity gap measures how far the prediction $\hat{Y}$ is from being independent of the sensitive attribute $Z$; the lower, the better.

2. The equality gap measures whether the prediction $\hat{Y}$ is independent of $Z$ given $Y=1$; again, lower is better.

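3. $J_{\lambda}$ is an identity function with a negative gradient: the forward pass leaves its input unchanged, while the backward pass multiplies the gradient by $-\lambda$. This pushes $g()$ to maximize the classification error for $Z$.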
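
The three answers above can be made concrete with a small sketch. It assumes binary predictions, labels, and sensitive attributes; it reads the parity gap as the absolute difference in positive-prediction rates between the two groups, and the equality gap (for $Y=1$) as the absolute difference in true-positive rates. The function names and the toy data are hypothetical, and the two `grad_reversal_*` helpers only imitate by hand what an autodiff framework would do for $J_{\lambda}$.

```python
def positive_rate(y_pred, mask):
    """Fraction of positive predictions among the rows selected by mask."""
    selected = [p for p, keep in zip(y_pred, mask) if keep]
    return sum(selected) / len(selected)


def parity_gap(y_pred, z):
    """|P(pred=1 | Z=1) - P(pred=1 | Z=0)|."""
    return abs(positive_rate(y_pred, [zi == 1 for zi in z])
               - positive_rate(y_pred, [zi == 0 for zi in z]))


def equality_gap(y_true, y_pred, z):
    """Same difference, restricted to rows whose true label is 1."""
    group1 = [zi == 1 and yi == 1 for zi, yi in zip(z, y_true)]
    group0 = [zi == 0 and yi == 1 for zi, yi in zip(z, y_true)]
    return abs(positive_rate(y_pred, group1) - positive_rate(y_pred, group0))


def grad_reversal_forward(h):
    """Forward pass of J_lambda: the identity."""
    return list(h)


def grad_reversal_backward(upstream_grad, lam):
    """Backward pass of J_lambda: flip and scale the gradient by -lam."""
    return [-lam * gi for gi in upstream_grad]


# Toy data (hypothetical values, for illustration only).
z = [1, 1, 1, 1, 0, 0, 0, 0]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

print(parity_gap(y_pred, z))            # 0.5
print(equality_gap(y_true, y_pred, z))  # 0.5
```

In this toy data the group with $Z=1$ receives positive predictions three times out of four against one in four for $Z=0$, and its true-positive rate is 1.0 against 0.5, so both gaps come out to 0.5.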

