# FairLearn - a reductions approach 


[Paper](https://arxiv.org/pdf/1803.02453.pdf): _A Reductions Approach to Fair Classification_, 2018

> We present a systematic approach for achieving fairness in a binary classification setting. While we focus on two well-known quantitative definitions of fairness, our approach encompasses many other previously studied definitions as special cases. The key idea is to __reduce fair classification__ to a __sequence of cost-sensitive__ classification problems, whose solutions yield a randomized classifier with the __lowest (empirical) error__ subject to the __desired constraints__. We introduce two reductions that work for any representation of the cost-sensitive classifier and compare favorably to prior baselines on a variety of data sets, while overcoming several of their disadvantages.

[FairLearn Documentation](https://fairlearn.github.io/user_guide/mitigation.html#id17)
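
The reduction described in the abstract can be written compactly. The block below is a sketch in notation along the lines of the paper's: $Q$ is a distribution over classifiers (the randomized classifier), $\hat{\mu}(Q)$ is a vector of empirical conditional moments, $M$ and $\hat{c}$ encode the linear fairness constraints (including the allowed slack), $\lambda$ is the vector of Lagrange multipliers, and $B$ bounds its norm. The symbols are stated here as assumptions for readability and are not defined elsewhere in this notebook.

```latex
% Constrained problem: lowest empirical error subject to linear fairness constraints
\min_{Q \in \Delta} \ \widehat{\mathrm{err}}(Q)
\quad \text{subject to} \quad
M \hat{\mu}(Q) \le \hat{c}

% Lagrangian and the saddle-point problem that the reduction solves
L(Q, \lambda) = \widehat{\mathrm{err}}(Q) + \lambda^{\top} \bigl( M \hat{\mu}(Q) - \hat{c} \bigr),
\qquad
\min_{Q \in \Delta} \ \max_{\lambda \ge 0,\ \|\lambda\|_1 \le B} L(Q, \lambda)
```

For a fixed $\lambda$, minimizing $L$ over classifiers is a cost-sensitive classification problem, which is where the base estimator comes in.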



## TL;DR

- This approach poses fair learning as a constrained optimization problem: minimize the empirical error subject to linear fairness constraints (e.g., a bound on the TPR difference between groups, or demographic parity).
- The constrained problem is solved as a sequence of __cost-sensitive__ classification problems.
- The result is a __randomized classifier__: a weighted mixture of the base estimators fitted along the way, so several base estimators are trained rather than one.
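
To make the idea of a randomized classifier concrete, here is a minimal sketch, assuming the classifier is a set of base predictors with mixture weights and that each row is predicted by one predictor drawn according to those weights. The function `randomized_predict`, the two threshold rules, and the weights are hypothetical and only stand in for fitted base estimators; this is not fairlearn's implementation.

```python
import numpy as np


def randomized_predict(classifiers, weights, X, rng):
    """For each row of X, draw one base classifier according to `weights`
    and return that classifier's prediction."""
    weights = np.asarray(weights, dtype=float)
    # One column of predictions per base classifier: (n_samples, n_classifiers)
    all_preds = np.column_stack([clf(X) for clf in classifiers])
    n_samples = all_preds.shape[0]
    # Independently pick a classifier index for every row
    chosen = rng.choice(len(classifiers), size=n_samples, p=weights)
    return all_preds[np.arange(n_samples), chosen]


# Two hypothetical threshold rules standing in for fitted base estimators
classifiers = [
    lambda X: (X[:, 0] > 0.3).astype(int),
    lambda X: (X[:, 0] > 0.7).astype(int),
]
weights = [0.25, 0.75]  # mixture weights, must sum to 1

rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(1000, 1))
y_demo = randomized_predict(classifiers, weights, X_demo, rng)
```

Rows where both rules agree always get the same label; only rows between the two thresholds are affected by the random draw.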

In [3]:
!pip install aequitas
!pip install fairlearn
import yaml
import os
import pandas as pd
import numpy as np
import seaborn as sns
from aequitas.group import Group
from aequitas.bias import Bias
from aequitas.fairness import Fairness
import aequitas.plot as ap
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
import fairlearn
DATAPATH = 'https://github.com/dssg/fairness_tutorial/raw/master/data/'

In [12]:
# Import the reduction algorithms and the fairness constraints
from fairlearn.reductions import ExponentiatedGradient, GridSearch, DemographicParity, TruePositiveRateDifference
from fairlearn.metrics import selection_rate_group_summary

In [53]:
traindf = pd.read_csv(DATAPATH + 'train_20120501_20120801.csv.gz', compression='gzip')
testdf = pd.read_csv(DATAPATH + 'test_20121201_20130201.csv.gz', compression='gzip')
train_attrdf = pd.read_csv(DATAPATH + 'train_20120501_20120801_protected.csv.gz', compression='gzip')
test_attrdf = pd.read_csv(DATAPATH + 'test_20121201_20130201_protected.csv.gz', compression='gzip')

In [54]:
label_col = 'quickstart_label'
date_col = 'as_of_date'
id_col = 'entity_id'
attr_col = 'poverty_level'
exclude_cols = [label_col, date_col, id_col]

X_train, y_train, A_train = traindf[[c for c in traindf.columns if c not in exclude_cols]].values, traindf[label_col].values, train_attrdf[[attr_col]]
X_test,   y_test,   A_test   = testdf[[c for c in testdf.columns if c not in exclude_cols]].values,   testdf[label_col].values  , test_attrdf[[attr_col]]


### Exponentiated Gradient

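The exponentiated gradient algorithm solves the constrained problem as a sequence of cost-sensitive classification problems and returns a randomized classifier built from the base estimators it fits along the way.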



Its hyperparameters are: 
- `estimator`: an estimator that implements the methods `fit(X, y, sample_weight)` and `predict(X)`.
- `constraints`: disparity constraints.
- `eps: float`: fairness threshold, i.e., the maximum constraint violation allowed (defaults to 0.01).
- `T: int`: maximum number of iterations (defaults to 50).
- `nu: float`: convergence threshold for duality gap (defaults to None).
- `eta_0: float`: initial learning rate (defaults to 2).
- `run_linprog_step: bool`: whether to apply saddle point optimization to the convex hull of classifiers obtained so far, after each exponentiated gradient step (defaults to True).
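
The role of the learning rate can be illustrated with a minimal sketch of one multiplier update, along the lines of the exponentiated gradient step described in the paper: an unconstrained vector `theta` is moved in the direction of the constraint violation, then mapped to non-negative multipliers whose sum stays below a bound `B`. The names `lambda_from_theta`, `eg_step`, `B`, `eta`, and the violation values are illustrative assumptions, not fairlearn API, and the library's implementation may differ in detail.

```python
import numpy as np


def lambda_from_theta(theta, B):
    """Map theta to multipliers with lambda >= 0 and sum(lambda) < B:
    lambda_k = B * exp(theta_k) / (1 + sum_j exp(theta_j))."""
    theta = np.asarray(theta, dtype=float)
    shift = max(0.0, theta.max())  # subtract a shift for numerical stability
    expd = np.exp(theta - shift)
    return B * expd / (np.exp(-shift) + expd.sum())


def eg_step(theta, violation, eta):
    """One update: theta grows where a constraint is violated (violation > 0)
    and shrinks where it is satisfied with slack (violation < 0)."""
    return np.asarray(theta, dtype=float) + eta * np.asarray(violation, dtype=float)


B = 10.0    # assumed bound on the sum of the multipliers
eta = 2.0   # learning rate
theta = np.zeros(2)

# Hypothetical violations of two constraints by the current best-response classifier
violation = np.array([0.05, -0.02])

lam_before = lambda_from_theta(theta, B)
theta = eg_step(theta, violation, eta)
lam_after = lambda_from_theta(theta, B)
```

The multiplier of the violated constraint goes up, so the next cost-sensitive problem penalizes that violation more heavily.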

In [55]:
# NOTE: Exponentiated Gradient has a stochastic component
np.random.seed(0)

In [58]:
# Step 1. Define the constraint
constraint = TruePositiveRateDifference()

# Step 2. Define the base estimator (any estimator providing 'fit' and 'predict')
# Note: we could have used another algorithm, such as logistic regression or random forest
from sklearn.tree import DecisionTreeClassifier
base_estimator = DecisionTreeClassifier(max_depth=20, min_samples_leaf=10)

# Step 3. Define the bias reducer algorithm you want to apply
bias_reducer = ExponentiatedGradient(base_estimator, constraint, T=5)

# Step 4. Fit the data (and provide the sensitive attributes)
bias_reducer.fit(X_train, y_train, sensitive_features=A_train)

In [59]:
# Step 5. Use the mitigator to make predictions 
y_pred = bias_reducer.predict(X_test)
y_pred

array([0, 0, 1, ..., 1, 0, 1])
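
Since the constraint used here targets the true positive rate, a quick sanity check is to compute the TPR per group directly from labels, predictions, and the sensitive attribute. The sketch below does this with plain pandas on toy arrays; the helper `tpr_by_group` and the group names are hypothetical and are not part of fairlearn or Aequitas.

```python
import numpy as np
import pandas as pd


def tpr_by_group(y_true, y_pred, groups):
    """True positive rate per group: share of actual positives predicted as 1."""
    df = pd.DataFrame({"y_true": y_true, "y_pred": y_pred, "group": groups})
    positives = df[df["y_true"] == 1]
    return positives.groupby("group")["y_pred"].mean()


# Toy data standing in for the test labels, predictions, and sensitive attribute
y_true_toy = np.array([1, 1, 1, 1, 0, 1, 1, 0])
y_pred_toy = np.array([1, 1, 0, 1, 0, 1, 0, 1])
groups_toy = np.array(["group_a"] * 5 + ["group_b"] * 3)

tpr = tpr_by_group(y_true_toy, y_pred_toy, groups_toy)
tpr_gap = tpr.max() - tpr.min()  # largest TPR difference between groups
print(tpr)
print("TPR gap:", tpr_gap)
```

With the notebook's variables this could be called as `tpr_by_group(y_test, y_pred, A_test[attr_col].values)`.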

In [48]:
new_preds = testdf[['entity_id','as_of_date','quickstart_label']].copy()
new_preds['score'] = y_pred


In [50]:
new_preds['score'].value_counts()

0    10602
1     7075
Name: score, dtype: int64

In [51]:
# Join the predictions with the protected attributes on entity and date
df = pd.merge(new_preds, test_attrdf, how='left', on=['entity_id','as_of_date'])
df = df.rename(columns = {'quickstart_label':'label_value'})
g = Group()
xtab, _ = g.get_crosstabs(df[['score','label_value','poverty_level','metro_type', 'teacher_sex']].copy())

model_id, score_thresholds 0 {'rank_abs': [7075]}


In [52]:
b = Bias()
bdf = b.get_disparity_predefined_groups(xtab, original_df=df, ref_groups_dict={'poverty_level':'lower', 'metro_type':'suburban_rural', 'teacher_sex':'male'})
metrics = ['tpr']
ap.disparities(bdf, metrics, 'poverty_level', fairness_threshold = 1.3)

get_disparity_predefined_group()
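
The disparity plotted above is a ratio: each group's metric divided by the reference group's metric. A minimal sketch of that calculation follows, assuming a group counts as within the threshold when its ratio lies between `1 / threshold` and `threshold`. The helper `disparity_vs_reference`, the TPR values, and the group names other than `lower` are made up for illustration and do not come from the notebook's data.

```python
import pandas as pd


def disparity_vs_reference(metric_by_group, ref_group, threshold):
    """Ratio of each group's metric to the reference group's metric,
    plus a flag for whether the ratio lies in [1/threshold, threshold]."""
    ratios = metric_by_group / metric_by_group[ref_group]
    within = ratios.between(1.0 / threshold, threshold)
    return pd.DataFrame({"disparity": ratios, "within_threshold": within})


# Hypothetical TPR values per poverty level; 'lower' is the reference group
tpr_toy = pd.Series({"lower": 0.60, "higher": 0.40, "highest": 0.55})
result = disparity_vs_reference(tpr_toy, ref_group="lower", threshold=1.3)
print(result)
```

The reference group always has a disparity of 1, and a group whose ratio falls outside the band is the kind of case the mitigation is meant to reduce.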
