<a href="https://colab.research.google.com/github/venkatacrc/Notes/blob/master/ML_papers/WhyIsMyClassifierDiscriminatory.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Why Is My Classifier Discriminatory?

Source: [Why Is My Classifier Discriminatory?](https://youtu.be/kLUc1bG2rv4)

### Motivation
It is **surprisingly easy** to make a discriminatory algorithm even accidentally.

In the paper's example, a logistic regression model showed differences in accuracy by race, with especially high error rates for Asian patients.

The paper's contributions:
* Finds the sources of unfairness to guide resource allocation.
* Decomposes unfairness into bias, variance, and noise.
* Demonstrates methods to guide feature augmentation and training data collection to fix unfairness.

### What are some of the reasons for unfairness?
* Some groups are underrepresented, resulting in higher errors.
  This error is due to **variance** and can be reduced by collecting more data.
* The model class is ill-suited to a particular group. The resulting error is due to **bias** and can be reduced by changing the model class.
* One group is harder to predict than another, even with the best model and infinite data. This error is due to **noise**, and the recommended fix is to collect more features.

### Method
Fairness definition:
Unfairness is defined as the difference in a loss metric between two groups, where the **loss function** can be the zero-one loss, false positive rate, false negative rate, etc.

For example, the zero-one loss (error rate) of prediction $\hat Y$ on data $D$ for group $a$:

$\gamma_a(\hat Y, Y, D) := P_D(\hat Y \neq Y | A = a)$

Formalizing **unfairness as group differences**

$\Gamma(\hat Y) := |\gamma_1(\hat Y) - \gamma_0(\hat Y)|$

Note: This method relies on accurate $Y$ labels and focuses on algorithmic error.
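The group-specific error rate and the discrimination level defined above can be computed directly from predictions. The following is a minimal sketch in plain Python; the function names `group_error_rates` and `discrimination_level` and the toy data are made up for illustration and do not come from the paper.

```python
from collections import defaultdict


def group_error_rates(y_true, y_pred, groups):
    """Return {group: fraction of that group's examples with y_pred != y_true}."""
    errors = defaultdict(int)
    counts = defaultdict(int)
    for y, y_hat, a in zip(y_true, y_pred, groups):
        counts[a] += 1
        errors[a] += int(y_hat != y)
    return {a: errors[a] / counts[a] for a in counts}


def discrimination_level(y_true, y_pred, groups, a0=0, a1=1):
    """Absolute difference in zero-one loss between groups a1 and a0."""
    gamma = group_error_rates(y_true, y_pred, groups)
    return abs(gamma[a1] - gamma[a0])


# Toy data (hypothetical): four examples per group.
y_true = [1, 0, 1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]

gamma = group_error_rates(y_true, y_pred, groups)
level = discrimination_level(y_true, y_pred, groups)
print(gamma, level)
```

Here group 0 has one error in four examples and group 1 has two in four, so the discrimination level is the gap between 0.25 and 0.5. Other losses, such as the false positive rate, could be swapped in by changing which examples are counted.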

**Theorem 1:** The expected error of a predictor $\hat Y$ on group $a$ decomposes as:

$\bar\gamma_a(\hat Y) = \bar B_a(\hat Y) + \bar V_a(\hat Y) + \bar N_a$

Note that $\bar N_a$ denotes the expectation of $N_a$ over $X$ and the data $D$.

Accordingly, the expected discrimination level $\bar \Gamma := |\bar\gamma_1 - \bar\gamma_0|$ can be decomposed into differences in bias, differences in variance, and differences in noise:

$\bar\Gamma = |(\bar B_1 - \bar B_0) + (\bar V_1 - \bar V_0) + (\bar N_1 - \bar N_0)|$
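Because Theorem 1 writes each group's expected loss as a sum of three terms, the gap between the two groups is the absolute value of the summed component differences, and differences of opposite sign can partly cancel. The sketch below illustrates this arithmetic; the component values are hypothetical numbers chosen for the example, not results from the paper, and `discrimination_from_components` is an illustrative helper name.

```python
def discrimination_from_components(group0, group1):
    """Return per-component differences (group1 - group0) and the
    expected discrimination level |sum of those differences|."""
    keys = ("bias", "variance", "noise")
    diffs = {k: group1[k] - group0[k] for k in keys}
    return diffs, abs(sum(diffs.values()))


# Hypothetical decomposition for two groups (not from the paper).
g0 = {"bias": 0.10, "variance": 0.02, "noise": 0.05}
g1 = {"bias": 0.06, "variance": 0.08, "noise": 0.05}

diffs, gamma_bar = discrimination_from_components(g0, g1)

# Sanity check: the same value follows from the two total losses.
total0 = sum(g0.values())
total1 = sum(g1.values())
gap_from_totals = abs(total1 - total0)

# Summing absolute differences instead gives only an upper bound.
upper_bound = sum(abs(d) for d in diffs.values())
print(diffs, gamma_bar, gap_from_totals, upper_bound)
```

In this made-up case group 1 has lower bias but higher variance, so the discrimination level is about 0.02 even though the component-wise absolute differences add up to about 0.10. Looking at the signed differences shows which remedy (more data, a different model class, or more features) is likely to help.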

From MIMIC-III clinical data:
* Found statistically significant racial differences in zero-one loss.
* By subsampling the data, fitted an inverse power law to the learning curve to estimate the benefit of collecting more data and reducing variance.
* Using topic modeling, identified subpopulations for which gathering more features would reduce noise.
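The learning-curve step can be sketched as follows. This assumes the curve for one group has the form $\alpha n^{-\beta} + \delta$, where $n$ is the training set size and $\delta$ is the error remaining with unlimited data; the functional form, the grid-search fitting procedure, and the synthetic data points are all assumptions for illustration, not the authors' code.

```python
def fit_inverse_power_law(sizes, errors, betas=None):
    """Fit errors ~ alpha * n**(-beta) + delta by grid search over beta.

    For a fixed beta the model is linear in (alpha, delta), so those two
    parameters come from closed-form simple linear regression.
    """
    if betas is None:
        betas = [i / 100 for i in range(1, 301)]  # candidate betas 0.01 .. 3.00
    y_mean = sum(errors) / len(errors)
    best = None
    for beta in betas:
        x = [n ** (-beta) for n in sizes]
        x_mean = sum(x) / len(x)
        sxx = sum((xi - x_mean) ** 2 for xi in x)
        sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, errors))
        alpha = sxy / sxx
        delta = y_mean - alpha * x_mean
        sse = sum((alpha * xi + delta - yi) ** 2 for xi, yi in zip(x, errors))
        if best is None or sse < best[0]:
            best = (sse, alpha, beta, delta)
    _, alpha, beta, delta = best
    return alpha, beta, delta


# Synthetic, noise-free learning curve for one group (made-up parameters).
true_alpha, true_beta, true_delta = 2.0, 0.5, 0.15
sizes = [100, 200, 500, 1000, 2000, 5000, 10000]
errors = [true_alpha * n ** (-true_beta) + true_delta for n in sizes]

alpha, beta, delta = fit_inverse_power_law(sizes, errors)
print(alpha, beta, delta)

# Extrapolate: estimated error if the group had 50,000 training examples.
predicted = alpha * 50000 ** (-beta) + delta
```

In practice the `errors` values would come from retraining the model on random subsamples of each size and measuring the loss on a held-out set for the group of interest. Comparing the fitted curves across groups indicates whether collecting more data is likely to narrow the gap.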

### Conclusion
1. For **accurate and fair models** deployed in real world applications, both the **data** and **model** should be considered.
1. Easily implementable fairness checks can attribute a model's unfairness to bias, variance, or noise, which guides further efforts to reduce it.


Group-specific Loss

> $\bar \gamma_a = \underbrace{\bar B_a(\hat Y)}_{\text{Bias}} + \underbrace{\bar V_a(\hat Y)}_{\text{Variance}} + \underbrace{\bar N_a}_{\text{Noise}}$

Discrimination Level

> $\bar\Gamma = |(\bar B_1 - \bar B_0) + (\bar V_1 - \bar V_0) + (\bar N_1 - \bar N_0)|$