# **Milestone 3: Sources, Forms, and Quantification of Bias and Discrimination in Supervised Learning**
## **PRACTICE NOTEBOOK 2 - Evaluate model bias "manually"**


In this part of the course, we will look for bias using a practical example. A company is looking to hire a new employee and uses a machine learning algorithm to select the top candidates. Each candidate is assigned 0 if not selected or 1 if selected.

There are 4 practice notebooks in total (current one in red):
1. Explore given data to: detect potential bias early & check for proxies 
2. <font color='red'> **Evaluate model bias "manually"**</font>
3. Evaluate model bias using **holisticai** library
4. Example code to get confidence intervals for a metric (nothing to do)

Instructions to complete in each part are in bold. Intermediate results are given so that you can continue the exercise.

This is notebook number 2. 

Up to this point, we haven't used any model yet and have simply analysed the dataset used for training. We haven't used any fairness metrics either, but we can already see a few warning signs. The first one is that the average success rate within the training data differs across ethnicities (Asian: 0.32, Black: 0.35, Hispanic: 0.38, White: 0.39). If there were no proxies at all for ethnicity in the data, it would be fine to train with this data, as ethnicity would have zero influence on the result. Unfortunately, we have seen that this is not the case. We can therefore expect some bias in a model trained on this data.

Note that in a more thorough experiment, we would test fairness using cross-validation or bootstrapping methods (see section 10.2 from this book), giving us more trustworthy average figures for each metric (for accuracy and fairness). In this course, we do the analysis on a single random split of the data, which yields only one figure per metric. Example code to obtain a confidence interval is provided in practice notebook 4.
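As a rough illustration of the bootstrap idea mentioned above (this is not the code from practice notebook 4), the sketch below resamples a test set with replacement and reports a percentile interval for a metric. The function names `bootstrap_ci` and `accuracy`, their parameters, and the toy data are assumptions made for this example only.

```python
import numpy as np

def bootstrap_ci(y_true, y_pred, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for metric(y_true, y_pred) (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n = len(y_true)
    scores = np.empty(n_boot)
    for b in range(n_boot):
        # draw n test examples with replacement and recompute the metric
        idx = rng.integers(0, n, size=n)
        scores[b] = metric(y_true[idx], y_pred[idx])
    lower, upper = np.quantile(scores, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)

def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))

# Toy data (made up): labels with roughly 30% of the predictions flipped
toy_rng = np.random.default_rng(1)
y_true_toy = toy_rng.integers(0, 2, size=200)
flipped = toy_rng.random(200) < 0.3
y_pred_toy = np.where(flipped, 1 - y_true_toy, y_true_toy)

low, high = bootstrap_ci(y_true_toy, y_pred_toy, accuracy)
print("accuracy = %.2f, 95%% interval = [%.2f, %.2f]"
      % (accuracy(y_true_toy, y_pred_toy), low, high))
```

The same helper could be applied to a fairness metric by passing a different `metric` function, provided the group membership is resampled together with the labels and predictions.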

We evaluate two types of metrics:

- Equality of outcome metrics: These measure the distribution of positive outcomes with respect to the protected characteristic. We focus on this when we would like our model to predict an equal proportion of positive outcomes for the protected group compared with the rest of the population. We will estimate the two following metrics: Statistical Parity and Disparate Impact. 

- Equalized Odds and Equality of Opportunity metrics: These measure the distribution of model errors with respect to the protected characteristic. These are useful when Statistical Parity is not an appropriate goal, for example where there are legitimate reasons for a protected group to have a different rate of positive outcome to the rest of the population, but where we would like to make sure the model makes the same volume and types of errors for both groups. We will estimate the following metric: Equal Opportunity Difference.

In this notebook we will:
- Split data into train/test sets, define and train a model, get overall accuracy
- Calculate the Equality of Outcome metrics *Statistical Parity* and *Disparate Impact* on the original test set
- Calculate *Statistical Parity* and *Disparate Impact* on a rebalanced test set
- Calculate the *Equal Opportunity Difference*

## **0. Import modules, load data and define useful functions**

In [1]:
#imports 
import pickle
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import accuracy_score, recall_score, precision_score, precision_recall_fscore_support
from sklearn.metrics import confusion_matrix
from imblearn import under_sampling, over_sampling
import imblearn

In [2]:
# Load data
from sklearn.datasets import fetch_openml
bunch = fetch_openml(data_id=44270)
raw_data = bunch['frame']
raw_data

Unnamed: 0,Label,Gender,Ethnicity,0,1,2,3,4,5,6,...,10,11,12,13,14,15,16,17,18,19
0,0.0,Female,Hispanic,-0.876827,-0.541625,-1.968103,-0.983961,-0.844169,-1.501946,0.690312,...,-1.688365,-0.650548,0.836874,-0.368150,-0.581680,-1.331884,0.650493,-1.568461,-0.049631,1.652508
1,0.0,Male,Hispanic,0.289426,-0.456504,-1.631056,1.412955,-0.024019,0.996284,-2.097603,...,-0.086540,-0.191318,-1.355319,0.132149,0.066033,0.593083,0.586004,0.505305,1.912276,0.894592
2,0.0,Male,White,-1.586135,-1.380555,0.295020,-0.216107,-0.741104,-0.261113,-1.455017,...,0.910211,1.549717,-0.854247,-0.821632,-0.407780,0.279067,-1.077535,0.637090,-1.492060,0.663372
3,1.0,Female,Hispanic,1.848717,-0.366034,0.139653,-1.608005,0.919614,0.012383,-0.940962,...,0.183061,2.232990,-1.446175,-1.086500,1.233636,0.574872,-1.421603,0.784939,-0.019479,-1.510623
4,1.0,Male,Hispanic,1.146786,-0.813799,-0.945187,-1.291322,-0.198067,0.124782,-0.906823,...,-0.321193,0.587776,-1.824633,-1.606074,-1.598096,0.205599,0.079420,-0.440719,-2.475596,-1.792374
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9995,1.0,Male,Hispanic,-1.238414,1.121718,1.269631,-0.093551,-0.892135,-0.497069,0.424476,...,-0.308610,-1.147748,0.097719,1.165068,-1.349054,-0.547598,2.273229,-2.048750,0.983288,-0.034799
9996,0.0,Female,White,-0.838622,-0.150014,-0.813486,-1.623939,0.935371,0.518092,-0.892630,...,-1.032027,0.649060,-1.123258,-1.966932,-1.093828,-0.781517,0.829586,0.000056,0.135866,-1.007733
9997,0.0,Female,Asian,1.230613,-0.022335,-0.928522,0.141622,1.601439,-0.592846,0.333232,...,0.763746,-0.357190,-0.608629,-0.654341,2.113140,-0.528662,-0.710054,0.706443,-0.537377,0.952972
9998,0.0,Male,Hispanic,-0.796576,-0.913765,1.229233,0.723160,-0.747698,-1.486035,-0.848035,...,-0.312473,0.474229,-1.425753,0.018631,0.640391,0.229921,0.363351,1.078940,-0.279152,0.658600


In [None]:
# remove all nans --> we will use the variable data (without nans) for the remainder of this notebook
data = raw_data.dropna()

In [None]:
def plot_cm(y_true,y_pred,labels = [1,0],display_labels = [1,0], ax = None):
  """Plot the confusion matrix as a heatmap and return it as an array."""
  # rows = true labels, columns = predicted labels, both ordered as in `labels`
  cm = confusion_matrix(y_true,y_pred,labels = labels)
  if ax is None:
    fig, ax = plt.subplots()
  else:
    fig = ax.figure
  sns.heatmap(cm, annot=True, ax = ax, cmap='viridis',fmt='g')

  ax.set(xticklabels=display_labels,
          yticklabels=display_labels,
          ylabel="True label",
          xlabel="Predicted label")
  return cm

def split_data_from_df(data):
  """Return the features X, the labels y and the demographic columns of data."""
  y = data['Label'].values
  # g = data['Gender'].values
  # e = data['Ethnicity'].values
  # feature columns are named '0', '1', ...
  X = data[np.arange(50).astype(str)].values
  # demographic columns, including any one-hot encoded versions
  filter_col = ['Ethnicity','Gender'] + [col for col in data if str(col).startswith('Ethnicity_')] + [col for col in data if str(col).startswith('Gender_')] 
  dem = data[filter_col].copy()
  return X,y,dem

def encode(df):
  """Label-encode Gender and Ethnicity in place; return df and the fitted encoders."""
  g_enc = LabelEncoder()
  e_enc = LabelEncoder()
  df['Gender'] = g_enc.fit_transform(df['Gender'])
  df['Ethnicity'] = e_enc.fit_transform(df['Ethnicity'])
  return df, g_enc,e_enc

def resample_equal(df,cat):
  """Over-sample df so that every (cat, Label) combination is equally frequent."""
  # one id per (category, label) pair; note that this adds a 'uid' column to df
  df['uid'] = df[cat] + df['Label'].astype(str)
  enc = LabelEncoder()
  df['uid'] = enc.fit_transform(df['uid'])
  # Resample: treat each uid as a class and over-sample up to the largest class
  uid = df['uid'].values
  res = imblearn.over_sampling.RandomOverSampler(random_state=6)
  df_res,euid = res.fit_resample(df,uid)
  df_res = pd.DataFrame(df_res,columns = df.columns)
  # shuffle the rows
  df_res = df_res.sample(frac=1).reset_index(drop=True)
  df_res['Label'] = df_res['Label'].astype(float)
  return df_res

## **1. Train the model & evaluate overall accuracy**

In the following code, we split the data into a train and test set, then train the model and evaluate overall accuracy. 

In [None]:
# split into train/test
data_train, data_test = train_test_split(data,test_size = 0.3,random_state=4)
# get X,y,demographics for each
X_train,y_train,dem_train = split_data_from_df(data_train)
X_test,y_test,dem_test = split_data_from_df(data_test)
# define model and train
model = RidgeClassifier(random_state=42)
model.fit(X_train,y_train)
# evaluate on test
y_pred_test = model.predict(X_test)
acc = accuracy_score(y_test,y_pred_test)
print("Accuracy is %.2f"%acc)
# add predictions to data_test for simplicity of analysis
data_test = data_test.copy()
data_test['Pred'] = y_pred_test

Accuracy is 0.71


We have an overall accuracy of 0.71. Let's now calculate some of the fairness metrics described in part I. We will use:

- Statistical Parity and Disparate Impact
- Equalized Odds and Equal Opportunity

## **2. Calculate Statistical Parity and Disparate Impact on original test set**

**Questions:**
- **Calculate the success rate for Male, Female and White, Black, Asian, Hispanic.** Note that, since outcomes are either 0 or 1, this is the same as the mean predicted value (column `Pred`) for each of these subsets.

- **Calculate the Statistical Parity and Disparate Impact for:** 
  - **Female vs Male**
  - **Black vs White**
  - **Asian vs White**
  - **Hispanic vs White**

As a reminder, here are the definitions:
  - Statistical Parity. Difference between the success rate of the minority group and that of the majority group. A value below zero indicates a bias against the minority group.
  - Disparate Impact. Ratio of the success rate of the minority group to that of the majority group. A value below 1 indicates a bias against the minority group. Fairness is often considered achieved for values between 0.8 and 1.2.
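To make the two definitions concrete, here is a minimal sketch on a made-up set of eight predictions. It does not use the course dataset, and the helper names (`success_rate`, `statistical_parity`, `disparate_impact`) are hypothetical.

```python
import numpy as np

def success_rate(pred, mask):
    """Share of positive predictions among the rows selected by mask."""
    return float(np.mean(pred[mask]))

def statistical_parity(pred, minority, majority):
    """Success rate of the minority group minus that of the majority group."""
    return success_rate(pred, minority) - success_rate(pred, majority)

def disparate_impact(pred, minority, majority):
    """Success rate of the minority group divided by that of the majority group."""
    return success_rate(pred, minority) / success_rate(pred, majority)

# Toy predictions (made up): the first four rows belong to the minority group
pred_toy = np.array([1, 0, 0, 0, 1, 1, 0, 0])
minority_toy = np.array([True, True, True, True, False, False, False, False])
majority_toy = ~minority_toy

sp_toy = statistical_parity(pred_toy, minority_toy, majority_toy)
di_toy = disparate_impact(pred_toy, minority_toy, majority_toy)
print("Statistical Parity: %.2f" % sp_toy)  # 0.25 - 0.50 = -0.25
print("Disparate Impact: %.2f" % di_toy)    # 0.25 / 0.50 = 0.50
```

In this toy case both metrics point to a bias against the minority group: the statistical parity is below zero and the disparate impact is below 0.8.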



You should get the following results.

| Tested | Statistical Parity | Disparate Impact |
| --- | --- | ---|
| Female vs Male | -0.07 | 0.81 |
| Black vs White | -0.11 | 0.67 |
| Asian vs White | -0.05 | 0.86 |
| Hispanic vs White | -0.03 | 0.90 |


**Questions:**
- **Who experiences the most bias? Is this in line with what you predicted in 2.2? Why could it be different?**

## **3. Calculate Statistical Parity and Disparate Impact on a rebalanced test set**

We know from the dataset analysis that the actual success rate (calculated from the ground-truth label) is lower for the *Black/Asian/Hispanic* groups than for the White group. As the test set is a random subset of the dataset, this is also likely to be true in the test set. Hence even a perfectly trained model should predict a lower success rate for those groups. To test the bias on a more balanced dataset, we resample the test set so that the success rate is equal across all groups. Note that this might take a few seconds to run.


In [None]:
# resample test set so success rate equal across all groups
data_test_res = resample_equal(data_test,'Ethnicity')
X_test_res,y_test_res,dem_test_res = split_data_from_df(data_test_res)
# check success rates
print("New success rates for rebalanced test set:")
# select 'Label' before averaging so that non-numeric columns are not averaged
pred_e_mean_res  = data_test_res.groupby('Ethnicity')['Label'].mean()
for e in pred_e_mean_res.index:
  print('   ',e,'-> %.3f'%pred_e_mean_res[e])
## evaluate model on rebalanced test set
y_pred_test_res = model.predict(X_test_res)
acc_res = accuracy_score(y_test_res,y_pred_test_res)
print("Accuracy is %.2f"%acc_res)
## add predictions to data_test for simplicity of analysis
data_test_res = data_test_res.copy()
data_test_res['Pred'] = y_pred_test_res



New success rates for rebalanced test set:
    Asian -> 0.500
    Black -> 0.500
    Hispanic -> 0.500
    White -> 0.500
Accuracy is 0.67
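The reason every group ends up with a success rate of 0.500 can be seen on a toy table. The sketch below mimics the idea behind the rebalancing step with pandas only: each (group, label) combination is over-sampled to the size of the largest one. It is a simplified stand-in rather than the `resample_equal` function itself, and `oversample_cells` and the toy data are hypothetical.

```python
import pandas as pd

def oversample_cells(df, group_col, label_col, seed=0):
    """Over-sample so every (group, label) cell reaches the size of the largest cell."""
    cells = [cell for _, cell in df.groupby([group_col, label_col])]
    target = max(len(cell) for cell in cells)
    parts = []
    for cell in cells:
        # keep the original rows and add randomly repeated rows up to the target size
        extra = cell.sample(n=target - len(cell), replace=True, random_state=seed)
        parts.append(pd.concat([cell, extra]))
    return pd.concat(parts, ignore_index=True)

# Toy data (made up): group A has a 0.75 success rate, group B has 0.25
toy = pd.DataFrame({
    "Ethnicity": ["A"] * 4 + ["B"] * 4,
    "Label": [1, 1, 1, 0, 1, 0, 0, 0],
})
toy_res = oversample_cells(toy, "Ethnicity", "Label")
rates = toy_res.groupby("Ethnicity")["Label"].mean()
print(rates)
```

Since each group then contains as many positive as negative examples, the ground-truth success rate is 0.5 in every group, and any remaining gap in predicted success rates comes from the model.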


**Questions:**
- **Repeat the steps from above to calculate the Statistical Parity and Disparate Impact on the rebalanced test set for:** 
  - **Black vs White**
  - **Asian vs White**
  - **Hispanic vs White**

You should get the following results.

| Tested | Statistical Parity | Disparate Impact |
| --- | --- | ---|
| Black vs White | -0.12 | 0.68 |
| Asian vs White | -0.02 | 0.95 |
| Hispanic vs White | -0.04 | 0.89 |

Hence, even on a balanced test set, there is still a bias. The disparate impact value for Black vs White (0.68) is below the lower "fair" limit of 0.8.

## **4. Calculate the Equal Opportunity Difference**

In this section, we focus on the *Equal Opportunity Difference* metric. This metric is part of a wider category of metrics (Equalized Odds and Equality of Opportunity) which compare the performance/accuracy of predictions for different groups instead of success rates.

Equalized Odds states that the classifier should have the same True Positive Rate (TPR) and True Negative Rate (TNR) across all groups. The true positive (negative) rate is the proportion of candidates that should be successful (unsuccessful) according to the ground truth and that have been correctly classified as successful (unsuccessful) by the algorithm. Mathematically, this means that the following probabilities should be equal for all groups:

$$P_{group1}(pred = k \mid Y = k) = P_{group2}(pred = k \mid Y = k), \quad k \in \{0, 1\}$$


The figure below illustrates the definition of TPR and TNR using a confusion matrix and their relationship with the above probabilities.


<center><img src="https://drive.google.com/uc?id=1bQxfIhcrNym7UC7_yzbdaUq12qCtJ9Dr"></center>

The version of this criterion that considers true positive rates only, i.e. only looking at $P(pred = 1 | Y = 1)$, is called *Equal Opportunity*. The corresponding metric, the *Equal Opportunity Difference*, measures the difference in True Positive Rates between the unprivileged and privileged groups. The ideal value is 0, and values between -0.1 and 0.1 are often considered fair. In this part, we calculate the Equal Opportunity Difference for different groups.
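The definition can be checked by hand on a small made-up example. In the sketch below, the helper names (`true_positive_rate`, `equal_opportunity_difference`) and the toy arrays are assumptions for illustration and are not part of the course code.

```python
import numpy as np

def true_positive_rate(y_true, y_pred):
    """Share of truly positive examples that are predicted positive."""
    positives = y_true == 1
    return float(np.mean(y_pred[positives] == 1))

def equal_opportunity_difference(y_true, y_pred, unprivileged, privileged):
    """TPR of the unprivileged group minus TPR of the privileged group."""
    tpr_unprivileged = true_positive_rate(y_true[unprivileged], y_pred[unprivileged])
    tpr_privileged = true_positive_rate(y_true[privileged], y_pred[privileged])
    return tpr_unprivileged - tpr_privileged

# Toy data (made up): the first six rows form the unprivileged group
y_true_toy = np.array([1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0])
y_pred_toy = np.array([1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0])
unprivileged_toy = np.arange(12) < 6
privileged_toy = ~unprivileged_toy

eod_toy = equal_opportunity_difference(
    y_true_toy, y_pred_toy, unprivileged_toy, privileged_toy
)
print("Equal Opportunity Difference: %.2f" % eod_toy)  # 2/4 - 3/4 = -0.25
```

Here the unprivileged group has 2 of its 4 deserving candidates selected, against 3 of 4 for the privileged group, so the value falls outside the -0.1 to 0.1 range.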


The following code repeats the split and training from section 1, so that the metrics in this section are computed on the original (non-rebalanced) test set.

In [None]:
# split into train/test
data_train, data_test = train_test_split(data,test_size = 0.3,random_state=4)
# get X,y,demographics for each
X_train,y_train,dem_train = split_data_from_df(data_train)
X_test,y_test,dem_test = split_data_from_df(data_test)
# define model and train
model = RidgeClassifier(random_state=42)
model.fit(X_train,y_train)
# evaluate on test
y_pred_test = model.predict(X_test)
acc = accuracy_score(y_test,y_pred_test)
print("Accuracy is %.2f"%acc)
# add predictions to data_test for simplicity of analysis
data_test = data_test.copy()
data_test['Pred'] = y_pred_test

Accuracy is 0.71


**Questions:**
- **Compute and plot the confusion matrix for Male/Female and White/Black.** Use the pre-defined function *plot_cm(y_true,y_pred)*.
- **Compute the true positive rates for Male/Female and White/Black.** 
- **Compute the Equal Opportunity Difference as the difference between the true positive rate of the unprivileged group and that of the privileged group:**
$EOD = TPR_{unprivileged}-TPR_{privileged}$.

You should find the following:

| Tested | Equal Opportunity Difference |
| --- | ---|
| Female vs Male | -0.09 |
| Black vs White | -0.15 | 
| Asian vs White | -0.08 | 
| Hispanic vs White | -0.03 | 

**Questions:**
- **According to the Equal Opportunity Difference metric, which group is subject to the most bias?**
- **Are these results the same as for the Disparate Impact metric? Is this necessarily always the case?**

    Answer: