# A survey on datasets for fairness-aware machine learning

[Paper](https://wires.onlinelibrary.wiley.com/doi/10.1002/widm.1452)

[Slides](https://danjacobellis.github.io/FTML/survey_of_datasets.slides.html)


<script>
    document.querySelector('head').innerHTML += '<style>.slides { zoom: 1.75 !important; }</style>';
</script>

<center> <h1>
A survey on datasets for fairness-aware machine learning
</h1> </center>

## Three ways to intervene in the name of fairness

* Interventions in the original data
  * Class modification
  * Sampling
* Change the learning algorithm
  * Regularization
  * Incorporate fairness into objective function
* Post-processing of the model outputs
  * Move decision boundary
  * Cascade fair classifier with black box model

## Caveats

* We will only explore tabular data
* We will use a **Bayesian network (BN)** to explore the relationships between features
* All numerical features will be discretized to make them categorical
  * Most BN algorithms cannot efficiently handle numeric features
* We will examine relationships between specific, categorical features. Examples:
    * $A_1 = \text{sex} \in \{M,F\}$
    * $A_2 = \text{race} \in \{\text{white},\text{nonwhite}\}$
    * $A_3 = \text{race} \in \{\text{white}, \text{black}, \text{asian-pac-islander}, \text{amer-indian}, \text{other}\}$
    * $A_4 = \text{age} \in \{ 17, 18, \dots, 90 \}$
    * $A_5 = \text{age} \in \{ 0, \dots, 255 \}$
    * $A_6 = \text{age} \in \{ 25-60, <25, >60 \}$
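The discretization step can be sketched as follows. This is a minimal illustration of mapping a numeric age onto the three bins of $A_6$. The function name `discretize_age` and the sample ages are hypothetical, and the bin edges are read directly from the labels above (25 and 60 fall in the middle bin).

```python
# Bin labels mirroring A_6 = {25-60, <25, >60} from the list above.
AGE_BINS = ("<25", "25-60", ">60")


def discretize_age(age):
    """Map a numeric age to one of the three categorical bins of A_6."""
    if age < 25:
        return AGE_BINS[0]
    if age <= 60:
        return AGE_BINS[1]
    return AGE_BINS[2]


# Made-up ages chosen to hit both bin edges.
ages = [17, 24, 25, 42, 60, 61, 90]
binned = [discretize_age(a) for a in ages]
print(binned)
```

After this step the feature takes three values instead of 74 (for $A_4$), which keeps the conditional probability tables of a Bayesian network small.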

## Ricci v. DeStefano

* Firefighter promotions determined by result of exam
  * Mostly whites passed exam
  * Few black firefighters passed exam
  * The Supreme Court ruled that *ignoring the exam* violated the Civil Rights Act of 1964
  
&nbsp;

| Attribute | Values               | Description                      |
|-----------|----------------------|----------------------------------|
| Position  | {Lieutenant,Captain} | Desired promotion                |
| Race      | {White, Non-White}   | Self identified race             |
| Written   | [46-95]              | Written exam score               |
| Oral      | [40.83-92.8]         | Oral exam score                  |
| Combined  | [45.93-92.8]         | 0.6\*written + 0.4\*oral         |
| Promoted  | {True,False}         | Whether a promotion was obtained |
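As a small worked example of the `Combined` row, the weighted score can be computed as below. The function name `combined_score` and the candidate's scores are hypothetical; only the 0.6 and 0.4 weights come from the table.

```python
def combined_score(written, oral, w_written=0.6, w_oral=0.4):
    """Weighted exam score as in the table: 0.6 * written + 0.4 * oral."""
    return w_written * written + w_oral * oral


# Hypothetical candidate, not taken from the dataset.
score = combined_score(written=80.0, oral=70.0)
print(round(score, 2))  # 0.6 * 80 + 0.4 * 70 = 76.0
```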

## Data balance

<p style="text-align:center;">
<img src="_images/data_balance.png" width=600 height=600 class="center">
</p>

![](img/data_balance.png)

## Statistical parity score

$$\begin{align} \text{SP} &= P(\hat{y}=+|S=\bar s) \\ &- P(\hat{y}=+|S= s)\end{align}$$

* $s$ is the protected group and $\bar s$ is the unprotected group
* $\text{SP}=0$ occurs when there is no discrimination
* $\text{SP}\in (0,1]$ occurs when the protected group is discriminated against
* $\text{SP}\in [-1,0)$ occurs when the unprotected group is discriminated against
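The definition above can be sketched in a few lines of Python. This is an illustrative implementation rather than the code used for the slides. The function name `statistical_parity` and the toy predictions are assumptions.

```python
def statistical_parity(y_pred, groups, protected, positive="+"):
    """SP = P(y_hat = + | S = s_bar) - P(y_hat = + | S = s).

    `protected` is the value of S that identifies the protected group s;
    every other value is treated as the unprotected group s_bar.
    """
    prot = [p for p, g in zip(y_pred, groups) if g == protected]
    unprot = [p for p, g in zip(y_pred, groups) if g != protected]
    if not prot or not unprot:
        raise ValueError("both groups need at least one sample")
    rate_prot = sum(p == positive for p in prot) / len(prot)
    rate_unprot = sum(p == positive for p in unprot) / len(unprot)
    return rate_unprot - rate_prot


# Toy predictions, made up for illustration.
y_pred = ["+", "+", "+", "-", "+", "-", "-", "-"]
groups = ["M", "M", "M", "M", "F", "F", "F", "F"]
sp = statistical_parity(y_pred, groups, protected="F")
print(sp)  # 0.75 - 0.25 = 0.5
```

Here the unprotected group receives the positive outcome three times as often as the protected group, so $\text{SP}$ lands in $(0,1]$.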

Use a logistic regression model to get baseline fairness scores for each dataset.

![](img/ricci_logistic.png)

<p style="text-align:center;">
<img src="_images/ricci_logistic.png" width=650 height=650 class="center">
</p>

## Bayesian network structure learning

* The structure of the network $\scr{M}$ should maximize the likelihood of generating the dataset $\cal{D}$
* Regularize the parameters of the network $\widehat{\scr{M}}$ (i.e. the edges of the graph)
* Ensure that the class label $y$ is a leaf node

$$\max_{\scr{M}}{\left\{ P({\cal D} \mid \scr{M}) - \gamma \widehat{\scr{M}} \right\}}$$
$$ \text{subject to } y\in \scr{L}$$

* Optimization completed using the [pomegranate](https://pomegranate.readthedocs.io/en/latest/BayesianNetwork.html) software
  * Exact and approximate algorithms are available
  * All of the datasets used are small enough to use exact algorithms
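To make the objective concrete, the sketch below scores a candidate structure on discretized data using only the Python standard library. It does not use the pomegranate API. Several choices are assumptions made for illustration: the regularizer is taken to be the number of edges, the parameters are maximum-likelihood estimates from counts, and the names `penalized_score` and `is_leaf`, the column names, and the toy rows are all hypothetical.

```python
import math
from collections import Counter


def is_leaf(structure, node):
    """A node is a leaf if it is not a parent of any other node."""
    return all(node not in parents for parents in structure.values())


def penalized_score(rows, structure, gamma=1.0):
    """Log-likelihood of `rows` under MLE parameters minus gamma * (#edges).

    rows      : list of dicts mapping feature name -> categorical value
    structure : dict mapping each node -> tuple of its parent names
    """
    log_lik = 0.0
    for node, parents in structure.items():
        # Counts of (parent configuration, node value) and of parent configuration.
        joint = Counter((tuple(r[p] for p in parents), r[node]) for r in rows)
        parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
        for (config, _), n in joint.items():
            log_lik += n * math.log(n / parent_counts[config])
    n_edges = sum(len(parents) for parents in structure.values())
    return log_lik - gamma * n_edges


# Toy, made-up rows in which `promoted` is fully determined by `combined`.
rows = [
    {"race": "W", "combined": "high", "promoted": True},
    {"race": "W", "combined": "low", "promoted": False},
    {"race": "NW", "combined": "high", "promoted": True},
    {"race": "NW", "combined": "low", "promoted": False},
]
independent = {"race": (), "combined": (), "promoted": ()}
dependent = {"race": (), "combined": (), "promoted": ("combined",)}

score_indep = penalized_score(rows, independent)
score_dep = penalized_score(rows, dependent)
print(score_indep, score_dep, is_leaf(dependent, "promoted"))
```

An exact search would evaluate a score of this kind over all structures that satisfy the leaf constraint and keep the best one. In this toy case the single edge raises the likelihood by more than the penalty costs, so the structure with `combined` as a parent of `promoted` wins.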

![](img/ricci_BN2.png)

<p style="text-align:center;">
<img src="_images/ricci_BN2.png" width=650 height=650 class="center">
</p>

## Finance (KDD)

-balance -bayes

## Criminology (Compas)

## Balanced Accuracy

## Equalized odds

## ABROCA

## Healthcare (Diabetes)

## Education (Math)

## Summary: accuracy, balance, fairness

| Dataset           | Protected Attribute | Group Distribution (%) | Accuracy | Balanced Accuracy | Statistical Parity | Equalized Odds | ABROCA |
|-------------------|---------------------|-----------------------------------------|----------|-------------------|--------------------|----------------|--------|
| Ricci             | Race                | [12.7, 29.7, 34.7, 22.9]                | N/A      | N/A               | 0.1714             | N/A            | N/A    |
| COMPAS Recid.     | Race                | [31.5, 28.7, 15.5, 24.3]                | 0.6414   | 0.6299            | -0.3398            | 0.6452         | 0.0675 |
| KDD Census-Income | Sex                 | [1.3, 50.7, 4.8, 43.2]                  | 0.9474   | 0.6031            | 0.0198             | 0.0403         | 0.0074 |
| Diabetes          | Gender              | [11.1, 34.1, 13.1, 41.7]                | 0.7584   | 0.5               | N/A                | N/A            | 0.0189 |
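The gap between accuracy and balanced accuracy in the table (for example 0.9474 versus 0.6031 for KDD Census-Income) can be illustrated with a short sketch. The definitions used here are assumptions, since the slides do not spell them out: balanced accuracy is taken as the mean of the true positive and true negative rates, and equalized odds as the sum over both true labels of the absolute between-group difference in the positive prediction rate. All function names and the toy data are hypothetical.

```python
def positive_rate(y_true, y_pred, label, positive=1):
    """Estimate P(y_hat = positive | y = label) from paired lists."""
    preds = [p for t, p in zip(y_true, y_pred) if t == label]
    if not preds:
        raise ValueError(f"no samples with true label {label!r}")
    return sum(p == positive for p in preds) / len(preds)


def balanced_accuracy(y_true, y_pred, positive=1, negative=0):
    """Mean of the true positive rate and the true negative rate."""
    tpr = positive_rate(y_true, y_pred, positive, positive)
    tnr = 1.0 - positive_rate(y_true, y_pred, negative, positive)
    return (tpr + tnr) / 2


def equalized_odds(y_true, y_pred, groups, protected, positive=1, negative=0):
    """Sum of |TPR gap| and |FPR gap| between the two groups (assumed form)."""
    def subset(keep_protected):
        pairs = [(t, p) for t, p, g in zip(y_true, y_pred, groups)
                 if (g == protected) == keep_protected]
        return [t for t, _ in pairs], [p for _, p in pairs]

    t_s, p_s = subset(True)
    t_u, p_u = subset(False)
    return sum(
        abs(positive_rate(t_s, p_s, y, positive) - positive_rate(t_u, p_u, y, positive))
        for y in (positive, negative)
    )


# Made-up, imbalanced toy data: 2 positives and 8 negatives.
y_true = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
groups = ["a"] * 5 + ["b"] * 5

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
eq_odds = equalized_odds(y_true, y_pred, groups, protected="b")
print(accuracy, bal_acc, eq_odds)  # 0.8 0.6875 1.25
```

With few positives, a model can score high accuracy while missing half of them, which is the pattern balanced accuracy exposes.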

## How can a model discriminate against both classes simultaneously?

| Dataset           | Protected Attribute | Group Distribution (%) | Accuracy | Balanced Accuracy | Statistical Parity | Equalized Odds | ABROCA |
|-------------------|---------------------|-----------------------------------------|----------|-------------------|--------------------|----------------|--------|
| COMPAS Recid.     | Race                | [31.5, 28.7, 15.5, 24.3]                | 0.6414   | 0.6299            | **-0.3398**        | 0.6452         | 0.0675 |

A: Data are heavily imbalanced towards the protected class

<p style="text-align:center;">
<img src="_images/ABROCA_COMPAS.png" width=600 height=600 class="center">
</p>


![](img/ABROCA_COMPAS.png)
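For readers who want to reproduce a plot like this, the sketch below computes ABROCA under the assumption that it is the area between the ROC curves of the two groups, i.e. the integral of the absolute TPR difference over FPR from 0 to 1. It is a standard-library approximation on a fixed grid, not the code behind the figure. The function names, the grid size, and the toy scores are hypothetical.

```python
def roc_points(y_true, scores):
    """ROC curve as (FPR, TPR) points, one per distinct score threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    ranked = sorted(zip(scores, y_true), reverse=True)
    points, tp, fp = [(0.0, 0.0)], 0, 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last sample sharing this score.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def tpr_at(points, fpr):
    """Upper value of the piecewise-linear ROC curve at a given FPR."""
    best = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= fpr <= x1:
            y = y1 if x1 == x0 else y0 + (y1 - y0) * (fpr - x0) / (x1 - x0)
            best = max(best, y)
    return best


def abroca(roc_a, roc_b, steps=1000):
    """Trapezoidal estimate of the area between two ROC curves."""
    grid = [i / steps for i in range(steps + 1)]
    gaps = [abs(tpr_at(roc_a, f) - tpr_at(roc_b, f)) for f in grid]
    return sum((gaps[i] + gaps[i + 1]) / 2 for i in range(steps)) / steps


# Made-up extreme case: scores rank group A perfectly and group B in reverse.
labels = [1, 1, 0, 0]
roc_a = roc_points(labels, [0.9, 0.8, 0.2, 0.1])
roc_b = roc_points(labels, [0.1, 0.2, 0.8, 0.9])
same = abroca(roc_a, roc_a)
extreme = abroca(roc_a, roc_b)
print(same, extreme)
```

Identical curves give 0, and the extreme case approaches 1, so values such as 0.0675 for COMPAS indicate curves that are close but not coincident.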