# How do different fairness notions affect the performance of machine learning models?
In this section we will explore how different fairness notions affect the performance of a machine learning model. We will cover the following notions:
* **Statistical parity**
* **Individual fairness**
* **Equalized odds**
* **Equal opportunity**

Barocas et al. (2017) classify many more fairness notions.

First, we motivate the relevance of this research question by placing it in the context of contemporary trends in machine learning. We then discuss each notion separately, briefly look at the theoretical background of these notions, and predict their expected effect on performance. We conclude by putting these notions to the test.

## Motivation
Machine learning is increasingly being applied in industry. Legislators, insurers and banks are playing catch-up to integrate this technology into their decision-making processes. Supervised learning often relies on historical data, so any bias present in the data is transferred to the model. Perpetuating this bias is not only unfair, but often unlawful or contrary to company policy. Chouldechova & Roth (2018) identify three causes of unfairness:
* **Bias in training data**: historical data that has human bias embedded in it. A classic example is the disproportionate amount of crime committed by some marginalized and ostracized communities. However, this can often be explained by considering the socio-economic situation. Also, these areas might be policed at a higher rate, which further skews crime-prediction models.
* **Minimizing average error**: a majority group will be more accurately represented in a model than a minority group. This follows from the majority group's larger share of the data: when the error of each individual carries the same weight, reducing the majority's errors lowers the average error the most.
* **Related to exploration**: online learning models, which are updated with new information while in use, can benefit greatly from the information gained by taking suboptimal decisions. Doing so can be unethical (e.g. for medical procedures) or can benefit or disadvantage certain groups.

Models with a low VC dimension (and thus high bias) will amplify these undesired effects.
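The second cause, minimizing average error, can be illustrated with a deliberately extreme toy case. The sketch below assumes a hypothetical population of 90 majority and 10 minority individuals and the lowest-capacity model possible, a single constant prediction; the group sizes and target values are made up for illustration and do not come from the cited sources.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def mse(targets, prediction):
    """Mean squared error of one constant prediction against the targets."""
    return mean([(t - prediction) ** 2 for t in targets])


# Hypothetical data: the two groups have different true target values.
majority_targets = [1.0] * 90
minority_targets = [0.0] * 10
all_targets = majority_targets + minority_targets

# The constant that minimizes the average squared error is the pooled mean,
# which is dominated by the larger group.
constant_prediction = mean(all_targets)

majority_error = mse(majority_targets, constant_prediction)
minority_error = mse(minority_targets, constant_prediction)
overall_error = mse(all_targets, constant_prediction)

print(f"prediction={constant_prediction:.2f}")
print(f"majority MSE={majority_error:.2f}, minority MSE={minority_error:.2f}")
print(f"overall MSE={overall_error:.2f}")
```

Under these assumptions the average error looks small, while almost all of it is carried by the minority group, whose per-group error is far larger than that of the majority.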

## Fairness notions
We will now introduce the different fairness notions that are going to be applied.

### Group fairness
**Statistical parity** or **demographic parity** is a group fairness notion. It is satisfied for a given (sensitive) group attribute when the rate of *positive* classifications within each group is identical to that of the entire population (Barocas et al., 2017). This means that predictions need to be statistically independent of the group attribute. For a group attribute $G$ and predicted attribute $R$ (with "$+$" denoting a *positive* prediction), it holds that:
$$\forall a,b \in G: \mathbb{P}(R = + \mid G = a) = \mathbb{P}(R = + \mid G = b)$$
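As a minimal sketch of how this condition could be checked on a set of predictions, the snippet below estimates $\mathbb{P}(R = + \mid G = g)$ per group and reports the largest difference between any two groups. The function names `positive_rates` and `statistical_parity_gap` and the toy data are illustrative assumptions, not part of the cited works; a gap of zero corresponds to exact statistical parity on the sample.

```python
from collections import defaultdict


def positive_rates(groups, predictions, positive="+"):
    """Estimate P(R = positive | G = g) for every group value g."""
    counts = defaultdict(int)
    positives = defaultdict(int)
    for g, r in zip(groups, predictions):
        counts[g] += 1
        if r == positive:
            positives[g] += 1
    return {g: positives[g] / counts[g] for g in counts}


def statistical_parity_gap(groups, predictions, positive="+"):
    """Largest difference in positive rate between any two groups."""
    rates = positive_rates(groups, predictions, positive)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: group membership and model prediction per individual.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
predictions = ["+", "+", "-", "-", "+", "-", "-", "-"]

rates = positive_rates(groups, predictions)
gap = statistical_parity_gap(groups, predictions)
print(rates, gap)
```

In practice exact equality rarely holds on finite samples, so one would typically compare the gap against a small tolerance rather than against zero; the choice of tolerance is left open here.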

### Individual fairness

### Equalized odds


### Equal opportunity

## Sources
* Chouldechova, A., & Roth, A. (2018). The frontiers of fairness in machine learning. *arXiv preprint arXiv:1810.08810*.
* Barocas, S., Hardt, M., & Narayanan, A. (2017). Fairness in machine learning. *NIPS tutorial*.
* Dwork, C., Hardt, M., Pitassi, T., Reingold, O., & Zemel, R. (2012, January). Fairness through awareness. In *Proceedings of the 3rd innovations in theoretical computer science conference* (pp. 214-226).