# Analysis of Demographic Parity and Equalized Odds

This is a notebook I created while reading [Equality of Opportunity in Supervised Learning](https://arxiv.org/abs/1610.02413). 

I am not sure whether the code the authors used to produce their plots and results is available. After some googling, however, I found the [Fairlearn](https://fairlearn.github.io/v0.5.0/index.html#) package, which, in its own words, _empowers developers of artificial intelligence (AI) systems to assess their system's fairness and mitigate any observed unfairness issues_. According to its documentation, it implements, among others, the methodology described in the paper above.

I thought it would be good to play around with this tool to assimilate the theoretical concepts introduced by the paper. What you find below is a minimal overview of the practical use of this methodology, compared with another standard definition of fairness called _Demographic Parity_ (also implemented in Fairlearn).

In [1]:
import numpy as np
import pandas as pd
from sklearn.datasets import fetch_openml

RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)

In [2]:
data = fetch_openml(data_id=1590, as_frame=True)
X = pd.get_dummies(data.data)
y_true = (data.target == '>50K') * 1
sex = data.data['sex']
sex.value_counts()

Male      32650
Female    16192
Name: sex, dtype: int64

In [3]:
data.data.head()

| | age | workclass | fnlwgt | education | education-num | marital-status | occupation | relationship | race | sex | capital-gain | capital-loss | hours-per-week | native-country |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 25.0 | Private | 226802.0 | 11th | 7.0 | Never-married | Machine-op-inspct | Own-child | Black | Male | 0.0 | 0.0 | 40.0 | United-States |
| 1 | 38.0 | Private | 89814.0 | HS-grad | 9.0 | Married-civ-spouse | Farming-fishing | Husband | White | Male | 0.0 | 0.0 | 50.0 | United-States |
| 2 | 28.0 | Local-gov | 336951.0 | Assoc-acdm | 12.0 | Married-civ-spouse | Protective-serv | Husband | White | Male | 0.0 | 0.0 | 40.0 | United-States |
| 3 | 44.0 | Private | 160323.0 | Some-college | 10.0 | Married-civ-spouse | Machine-op-inspct | Husband | Black | Male | 7688.0 | 0.0 | 40.0 | United-States |
| 4 | 18.0 | NaN | 103497.0 | Some-college | 10.0 | Never-married | NaN | Own-child | White | Female | 0.0 | 0.0 | 30.0 | United-States |


In [4]:
from fairlearn.metrics import MetricFrame
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

In [5]:
classifier = DecisionTreeClassifier(min_samples_leaf=10, max_depth=4)
classifier.fit(X, y_true)
y_pred = classifier.predict(X)

### No fairness measure

In [6]:
gm = MetricFrame(accuracy_score, y_true, y_pred, sensitive_features=sex)
print(f"Total accuracy score: {gm.overall}\n")
print(f"Accuracy score by: {gm.by_group}\n")

Total accuracy score: 0.8443552680070431

Accuracy score by: sex
Female    0.925148
Male      0.804288
Name: accuracy_score, dtype: object
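
To make the overall versus per-group split above concrete, here is a minimal pure-Python sketch of the same idea: compute a metric once on all samples, then once per value of the sensitive feature. The helper names (`accuracy`, `accuracy_by_group`) and the toy data are made up for illustration; this is not how Fairlearn implements `MetricFrame`, only a hand-rolled approximation of what it reports.

```python
from collections import defaultdict


def accuracy(labels, preds):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(labels, preds)) / len(labels)


def accuracy_by_group(labels, preds, groups):
    """Return (overall accuracy, {group: accuracy}) for parallel sequences."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(labels, preds, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    per_group = {g: accuracy(ts, ps) for g, (ts, ps) in buckets.items()}
    return accuracy(labels, preds), per_group


# Toy data, invented for illustration only (not the Adult dataset).
toy_true = [1, 0, 1, 0, 1, 0]
toy_pred = [1, 0, 1, 0, 1, 1]
toy_groups = ["Female", "Female", "Female", "Male", "Male", "Male"]

toy_overall, toy_per_group = accuracy_by_group(toy_true, toy_pred, toy_groups)
print(toy_overall)    # 5 of 6 predictions are correct
print(toy_per_group)  # Female: 3 of 3 correct, Male: 2 of 3 correct
```

As in the real output above, a single overall score can hide a noticeable gap between groups.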



### Fairness metrics

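When working on fairness, before we can start _fairifying_ models we need to establish fairness metrics to optimise. Here we consider two definitions: _Demographic Parity_ and _Equalized Odds_. In both, `A` denotes the sensitive attribute (here `sex`), `Y` the true label and `Y_pred` the model's prediction.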

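* **Demographic Parity**: defined as `Pr(Y_pred=1 | A=1) = Pr(Y_pred=1 | A=0)`, i.e. both groups receive positive predictions at the same rate.
* **Equalized Odds**: defined as `Pr(Y_pred=1 | A=1, Y=y) = Pr(Y_pred=1 | A=0, Y=y)` for both `y=0` and `y=1`, i.e. both groups have the same false positive rate and the same true positive rate.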

Since in both cases we won't get perfect equality in practice, a good way to assess fairness is to look at the difference between the two sides of each equation, and this is exactly what is done in `fairlearn`:

For _Demographic Parity_ we have:
```
demographic_parity_difference = |Pr(Y_pred=1 | A=1) - Pr(Y_pred=1 | A=0)|
```

For _Equalized Odds_ we have:
```
equalized_odds_difference = max(
    |Pr(Y_pred=1 | A=1, Y=0) - Pr(Y_pred=1 | A=0, Y=0)|,
    |Pr(Y_pred=1 | A=1, Y=1) - Pr(Y_pred=1 | A=0, Y=1)|
)
```
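
The two formulas above can be checked by hand on a tiny example. The sketch below is a pure-Python illustration restricted to the two-group case written above (`A` in `{0, 1}`); the function names and the toy data are hypothetical, and it is not Fairlearn's implementation, just the same arithmetic spelled out.

```python
def selection_rate(preds):
    """Estimate Pr(Y_pred=1) as the fraction of positive predictions."""
    return sum(preds) / len(preds)


def rate_given_label(labels, preds, label):
    """Estimate Pr(Y_pred=1 | Y=label)."""
    selected = [p for t, p in zip(labels, preds) if t == label]
    return sum(selected) / len(selected)


def split_group(labels, preds, groups, group):
    """Keep only the labels and predictions of samples with A == group."""
    pairs = [(t, p) for t, p, g in zip(labels, preds, groups) if g == group]
    return [t for t, _ in pairs], [p for _, p in pairs]


def dp_difference_sketch(labels, preds, groups):
    """|Pr(Y_pred=1 | A=1) - Pr(Y_pred=1 | A=0)|"""
    _, p1 = split_group(labels, preds, groups, 1)
    _, p0 = split_group(labels, preds, groups, 0)
    return abs(selection_rate(p1) - selection_rate(p0))


def eo_difference_sketch(labels, preds, groups):
    """Largest gap in Pr(Y_pred=1 | A=a, Y=y) between groups, over y in {0, 1}."""
    t1, p1 = split_group(labels, preds, groups, 1)
    t0, p0 = split_group(labels, preds, groups, 0)
    return max(
        abs(rate_given_label(t1, p1, y) - rate_given_label(t0, p0, y))
        for y in (0, 1)
    )


# Toy data, invented for illustration only.
toy_a = [1, 1, 1, 1, 0, 0, 0, 0]
toy_y = [1, 1, 0, 0, 1, 1, 0, 0]
toy_y_pred = [1, 1, 1, 0, 1, 0, 1, 0]

toy_dp = dp_difference_sketch(toy_y, toy_y_pred, toy_a)
toy_eo = eo_difference_sketch(toy_y, toy_y_pred, toy_a)
print(toy_dp)  # 0.25: selection rates are 3/4 (A=1) and 2/4 (A=0)
print(toy_eo)  # 0.5: true positive rates are 1.0 vs 0.5, false positive rates are equal
```

Note that the two metrics can disagree: in this toy example the false positive rates match exactly, yet the gap in true positive rates makes the equalized odds difference larger than the demographic parity difference.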

In [7]:
from fairlearn.metrics import demographic_parity_difference, equalized_odds_difference

dp_difference = demographic_parity_difference(y_true, y_pred, sensitive_features=sex)
print(f"Demographic parity difference: {dp_difference}")

eo_difference = equalized_odds_difference(y_true, y_pred, sensitive_features=sex)
print(f"Equalized odds difference: {eo_difference}")

Demographic parity difference: 0.15004887369937472
Equalized odds difference: 0.0811655575720911


### Demographic Parity

In [None]:
from fairlearn.reductions import ExponentiatedGradient, DemographicParity

mitigator = ExponentiatedGradient(classifier, DemographicParity())
mitigator.fit(X, y_true, sensitive_features=sex)
y_pred_mitigated = mitigator.predict(X)

In [None]:
gm_dp = MetricFrame(accuracy_score, y_true, y_pred_mitigated, sensitive_features=sex)
print(f"Total accuracy score {gm_dp.overall}")
print(gm_dp.by_group)

In [None]:
dp_difference = demographic_parity_difference(y_true, y_pred_mitigated, sensitive_features=sex)
print(f"Demographic parity difference: {dp_difference}")

### Equalized Odds

In [8]:
from fairlearn.postprocessing import ThresholdOptimizer

mitigator = ThresholdOptimizer(estimator=classifier, constraints='equalized_odds')
mitigator.fit(X, y_true, sensitive_features=sex)
y_pred_mitigated = mitigator.predict(X, sensitive_features=sex)

In [9]:
gm_eo = MetricFrame(accuracy_score, y_true, y_pred_mitigated, sensitive_features=sex)
print(f"Total accuracy score {gm_eo.overall}")
print(gm_eo.by_group)

Total accuracy score 0.8219155644731992
sex
Female    0.882102
Male      0.792067
Name: accuracy_score, dtype: object


In [10]:
eo_difference = equalized_odds_difference(y_true, y_pred_mitigated, sensitive_features=sex)
print(f"Equalized odds difference: {eo_difference}")

Equalized odds difference: 0.0019057914241039087
