## Step 8 - Fairness 

Goal of this step: assess the **fairness** of your own model with respect to the protected attribute. 
Use a **statistical test** for each of the following two fairness definitions: **Statistical Parity** and **Conditional Statistical Parity**.

- **Statistical parity:** a classifier satisfies statistical parity if subjects in both protected and unprotected groups have equal probability of being assigned to the positive predicted class. In other words, students should have the same chance of being predicted to obtain a good test score, regardless of their gender or ethnic group. We should have:
$$P(d=1|Gender=M) = P(d=1|Gender=F)$$

- **Conditional statistical parity:** a classifier satisfies conditional statistical parity if subjects in both protected and unprotected groups have equal probability of being assigned to the positive predicted class, after controlling for a set of legitimate factors (or features) $L \in X$. For example:
$$P(d=1|Gender=M, L=I) = P(d=1|Gender=F, L=I)$$

The independence assumption for **Conditional statistical parity** is :  $H_0 : (\hat{Y}⫫G)|X $.
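
To make the two definitions concrete, the sketch below estimates the positive-prediction rate per gender, first overall and then within each level of a legitimate factor. The records, the two-level factor (`low` / `high`) and the helper `positive_rates` are hypothetical and only serve as an illustration; they are not taken from the dataset used in this notebook.

```python
from collections import defaultdict

# Hypothetical records: (gender, legitimate_factor, predicted_class).
records = [
    ("M", "low", 0), ("M", "low", 0), ("M", "high", 1), ("M", "high", 1),
    ("F", "low", 0), ("F", "low", 1), ("F", "high", 1), ("F", "high", 1),
]

def positive_rates(rows, key):
    """Estimate P(d=1 | key) as the share of positive predictions per key."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        k = key(row)
        totals[k] += 1
        positives[k] += row[2]
    return {k: positives[k] / totals[k] for k in totals}

# Statistical parity compares P(d=1 | Gender).
marginal = positive_rates(records, key=lambda r: r[0])

# Conditional statistical parity compares P(d=1 | Gender, L) within each level of L.
conditional = positive_rates(records, key=lambda r: (r[0], r[1]))

print(marginal)
print(conditional)
```

A gap between these estimated rates is only descriptive; the statistical test is what tells us whether the gap is larger than what sampling noise alone would produce.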

**How to build groups?** In credit scoring applications, for example, groups gather applicants with similar risk profiles. These groups are determined either through unsupervised clustering methods (K-Means) or through an exogenous classification (Basel classification). The same approach could be applied to students.

**How to choose the number of groups?** Keep in mind that the larger the number of sub-groups:
- The more homogeneous the sub‐groups are (cleaner test)
- The more likely at least one sub‐group is found to be unfair
- The smaller the number of individuals in each sub‐group
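
The second point can be illustrated with a short calculation. As a simplifying assumption (not stated in the text above), suppose each sub-group is tested independently at the same significance level `alpha` and that every sub-group is in fact fair. The probability that at least one test rejects is then $1 - (1 - \alpha)^K$ for $K$ sub-groups. The function name below is hypothetical.

```python
# Probability of at least one false rejection among K independent tests,
# each run at significance level alpha, when every sub-group is actually fair.
def prob_at_least_one_rejection(n_groups, alpha=0.05):
    return 1 - (1 - alpha) ** n_groups

for k in (1, 5, 10, 20):
    print(k, round(prob_at_least_one_rejection(k), 4))
```

Under these assumptions, with 10 sub-groups tested at the 5% level, the chance of flagging at least one fair sub-group as unfair is already about 40%.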

In [28]:
# Import statements 
import joblib
import numpy as np
import pandas as pd 
from scipy.stats import chi2
from sklearn.model_selection import train_test_split

In [29]:
# Load dataset
df = pd.read_csv('../Dataset/df_processed.csv')
X = df.drop('Grade', axis=1).copy()
y = df['Grade'].copy()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# Load the model from the file
model_filename = '../Models/blackbox_model.pkl'
blackbox_model = joblib.load(model_filename)
y_pred_blackbox_test = blackbox_model.predict(X_test)
y_pred_blackbox_train = blackbox_model.predict(X_train)

# Add prediction to features dataframe
df_test_results = X_test.copy()
df_test_results['y_pred'] = y_pred_blackbox_test
df_test_results.head()

| Unnamed: 0 | Gender | EthnicGroup | ParentEduc | LunchType | TestPrep | ParentMaritalStatus | PracticeSport | IsFirstChild | NrSiblings | School_Bus | WklyStudyHours | y_pred |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 25876 | male | group B | some college | free/reduced | 0 | married | sometimes | 1 | 1 | 0 | Between 5-10 hours | 0 |
| 26259 | female | group C | some high school | free/reduced | 0 | divorced | sometimes | 0 | 1 | 0 | Less than 5 hours | 0 |
| 11278 | male | group B | some high school | free/reduced | 0 | single | sometimes | 1 | 1 | 1 | More than 10 hours | 0 |
| 26508 | female | group A | high school | standard | 0 | single | never | 1 | 1 | 1 | Between 5-10 hours | 0 |
| 23734 | female | group C | some college | standard | 0 | married | regularly | 0 | 1 | 0 | More than 10 hours | 0 |


### 1. Assess model fairness using statistical parity definition of fairness 

The independence assumption for **statistical parity** is: $H_0 : \hat{Y}⫫G$, that is, the predicted class $\hat{Y}$ is independent of the group $G$.
The corresponding chi-squared test statistic is:

$$\chi^2_{SP} = \sum_j\sum_k\frac{(E(n_{+jk})-n_{+jk})^2}{E(n_{+jk})}$$

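Under the null hypothesis, $\chi^2_{SP}$ follows a chi-squared distribution with one degree of freedom, since the contingency table crossing the predicted class and gender is $2 \times 2$.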
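
As a minimal sketch of how the statistic above turns into a decision, the snippet below computes the Pearson chi-squared statistic and its p-value for a $2 \times 2$ table using only the standard library. The counts and the function name `chi2_independence_2x2` are hypothetical. The p-value relies on the fact that, with one degree of freedom, $P(\chi^2 > x) = \operatorname{erfc}(\sqrt{x/2})$.

```python
import math

def chi2_independence_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table of counts.

    table[j][k] is the observed count for predicted class j and group k.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    stat = 0.0
    for j in range(2):
        for k in range(2):
            # Expected count under independence: row total * column total / n
            expected = row_totals[j] * col_totals[k] / n
            stat += (expected - table[j][k]) ** 2 / expected
    # With one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: rows = predicted class (1, 0), columns = group (A, B).
observed = [[30, 20], [70, 80]]
stat, p_value = chi2_independence_2x2(observed)
print(f"chi2 = {stat:.4f}, p-value = {p_value:.4f}")
print("reject H0 at 5%:", p_value < 0.05)
```

Rejecting $H_0$ means the predicted class and the group are statistically dependent, so statistical parity is violated at the chosen significance level.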

In [42]:
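def count_gender_type(df):
    # Split by gender, then count predictions within each subset.
    # Each mask is built from the subset itself so that its index matches.
    df_male = df[df['Gender'] == 'male']
    df_female = df[df['Gender'] == 'female']
    n1m = df_male[df_male['y_pred']==1].shape[0]
    n0m = df_male[df_male['y_pred']==0].shape[0]
    n1f = df_female[df_female['y_pred']==1].shape[0]
    n0f = df_female[df_female['y_pred']==0].shape[0]
    return n1f, n0f, n1m, n0m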

def compute_expectations(df):
    n1f, n0f, n1m, n0m = count_gender_type(df)
    n = df.shape[0]
    # Marginal totals by gender and by predicted class
    nf = n1f+n0f
    nm = n1m+n0m
    n1 = n1f+n1m
    n0 = n0f+n0m
    # Expected count under independence: (gender total * class total) / n
    E1f = nf*n1/n
    E0f = nf*n0/n
    E1m = nm*n1/n
    E0m = nm*n0/n
    return E1f, E0f, E1m, E0m

def compute_test_statistic(df): 
    n1f, n0f, n1m, n0m = count_gender_type(df)
    E1f, E0f, E1m, E0m = compute_expectations(df)
    # Named chi2_stat so it does not shadow scipy.stats.chi2 imported above
    chi2_stat = (n1f - E1f)**2 / E1f + (n0f - E0f)**2 / E0f + (n1m - E1m)**2 / E1m + (n0m - E0m)**2 / E0m 
    return chi2_stat

n1f, n0f, n1m, n0m = count_gender_type(df_test_results)
E1f, E0f, E1m, E0m = compute_expectations(df_test_results)
chi2_stat = compute_test_statistic(df_test_results)

print('total students : ', df_test_results.shape[0])
print("Total 1 y_pred : ", n1f+n1m)
print("Total 0 y_pred : ", n0f+n0m)
print("Total women : ", n1f+n0f)
print("Total male : ", n1m+n0m)
print('\n')

print("n1f:", n1f)
print("n0f:", n0f)
print("n1m:", n1m)
print("n0m:", n0m)
print('\n')

print("E1f :", np.round(E1f, 2))
print("E0f :", np.round(E0f, 2))
print("E1m :", np.round(E1m, 2))
print("E0m :", np.round(E0m, 2))

print('\n')
print("Test statistic value : ", np.round(chi2_stat, 2))

620 * 858 / 9193
#(620 - 57.87)**2/57.87

total students :  9193
Total 1 y_pred :  858
Total 0 y_pred :  8335
Total women :  4612
Total male :  4581


n1f: 620
n0f: 3992
n1m: 238
n0m: 4343


E1f : 430.45
E0f : 4181.55
E1m : 427.55
E0m : 4153.45


Test statistic value :  184.75



57.865767431741546