<a href="https://colab.research.google.com/github/RayaneZen05/Fairness-in-AI/blob/main/TD1_intro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# TD 1: Fairness notion examples

In this first TD, we manipulate some data and observe how the different fairness metrics behave.

## Installation of the environment

We highly recommend following these steps so that every student works in an environment as close as possible to the one used during testing.

### Colab Settings
  The next code cell should be executed only once per Colab environment.


#### Python env creation

        ```
        ! python -m pip install numpy fairlearn plotly nbformat ipykernel aif360["inFairness"] aif360['AdversarialDebiasing'] causal-learn BlackBoxAuditing cvxpy dice-ml lime shapkit
        ```
### Local Settings

#### 1. Uv installation


        https://docs.astral.sh/uv/getting-started/installation/


        `curl -LsSf https://astral.sh/uv/install.sh | sh`

        Python version 3.12 installation (highly recommended)
        `uv python install 3.12`


#### 2. Python env creation

        ```
        mkdir TD_bias_mitigation
        cd TD_bias_mitigation
        uv python pin 3.12
        uv init
        uv venv
        uv pip install numpy fairlearn plotly nbformat ipykernel "aif360[inFairness]" "aif360[AdversarialDebiasing]" causal-learn BlackBoxAuditing cvxpy dice-ml lime shapkit
        ```


In [1]:
    ! python -m pip install numpy fairlearn plotly nbformat ipykernel aif360["inFairness"] aif360['AdversarialDebiasing'] causal-learn BlackBoxAuditing cvxpy dice-ml lime shapkit

Collecting fairlearn
  Downloading fairlearn-0.13.0-py3-none-any.whl.metadata (7.3 kB)
Collecting causal-learn
  Downloading causal_learn-0.1.4.4-py3-none-any.whl.metadata (4.6 kB)
Collecting BlackBoxAuditing
  Downloading BlackBoxAuditing-0.1.54.tar.gz (2.6 MB)
  Preparing metadata (setup.py) ... done
Collecting dice-ml
  Downloading dice_ml-0.12-py3-none-any.whl.metadata (20 kB)
Collecting lime
  Downloading lime-0.2.0.1.tar.gz (275 kB)
  Preparing metadata (setup.py) ... done
Collecting shapkit
  Downloading shapkit-0.0.4-py3-none-any.whl.metadata (7.2 kB)
Collecting aif360[inFairness]
  Downloading aif360-0.6.1-py3-none-any.whl.metadata (5.0 kB)
Collecting scipy<1.16.0,>=1.9.3 (from fairlearn)
  Downloading sci


## Objectives


 1. Study the data: the distribution of each feature and its relation to the target.

 2. Highlight some biases present in the data.

 3. Train a basic machine learning model (a decision tree classifier).

 4. Compute the confusion matrix and different fairness metrics.

## Dataset: Diabetes 130-Hospitals


https://fairlearn.org/main/api_reference/generated/fairlearn.datasets.fetch_diabetes_hospital.html

This dataset contains 101,766 rows, each describing a patient hospitalized for diabetes for a stay of 1 to 14 days. The data were collected over 10 years in 130 different hospitals. Each record has 25 features covering medical as well as demographic information. Finally, the 'readmitted' column indicates whether the patient was readmitted and, if so, whether it happened within 30 days or later. This column is binarized into two others: 'readmit_30_days' (True if readmitted within 30 days, False otherwise) and 'readmit_binary' (True if readmitted at all, False otherwise).

We use the 'readmit_30_days' column as the label (ground truth).

To keep things simple, we only consider a subset of 14 of the provided features:
age, gender, race, time_in_hospital, num_lab_procedures, num_procedures, num_medications, number_diagnoses, max_glu_serum, A1Cresult, insulin, had_emergency, had_inpatient_days, had_outpatient_days
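
The binarization of the readmission column described above can be sketched on a toy column. This is only an illustration: it assumes the raw `readmitted` column takes the three values `NO`, `<30` and `>30`, and the name `readmitted_any` is a hypothetical stand-in for the "readmitted at all" flag.

```python
import pandas as pd

# Toy stand-in for the raw three-valued column (values are an assumption)
raw = pd.DataFrame({"readmitted": ["NO", "<30", ">30", "NO", "<30"]})

# True only when the patient came back within 30 days
raw["readmit_30_days"] = raw["readmitted"] == "<30"
# True for any readmission, early or late (hypothetical column name)
raw["readmitted_any"] = raw["readmitted"] != "NO"

print(raw)
```

Only the first derived column is used as the label in this TD, so late readmissions count as negatives.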





In [None]:
# To execute only in Colab
! python -m pip install numpy fairlearn plotly nbformat ipykernel aif360["inFairness"] aif360['AdversarialDebiasing'] causal-learn BlackBoxAuditing cvxpy dice-ml lime shapkit

In [2]:
# Code to compute fairness metrics using aif360

from aif360.sklearn.metrics import (
    average_odds_difference,
    base_rate,
    conditional_demographic_disparity,
    df_bias_amplification,
    disparate_impact_ratio,
    equal_opportunity_difference,
    smoothed_edf,
    statistical_parity_difference,
)
from sklearn.metrics import balanced_accuracy_score


# This function accepts lists or np.arrays
def get_metrics(
    y_true,  # list or np.array of truth values
    y_pred=None,  # list or np.array of predictions
    prot_attr=None,  # list or np.array of protected/sensitive attribute values
    priv_group=1,  # value taken by the privileged group
    pos_label=1,  # value taken by the positive truth/prediction
    sample_weight=None,  # list or np.array of sample weights
):
    group_metrics = {}
    # Metrics that are also computed when only the labels are given (y_pred=None)
    group_metrics["base_rate_truth"] = base_rate(
        y_true=y_true, pos_label=pos_label, sample_weight=sample_weight
    )
    group_metrics["statistical_parity_difference"] = statistical_parity_difference(
        y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, priv_group=priv_group, pos_label=pos_label, sample_weight=sample_weight
    )
    group_metrics["disparate_impact_ratio"] = disparate_impact_ratio(
        y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, priv_group=priv_group, pos_label=pos_label, sample_weight=sample_weight
    )
    # Metrics that need predictions
    if y_pred is not None:
        group_metrics["base_rate_preds"] = base_rate(
            y_true=y_pred, pos_label=pos_label, sample_weight=sample_weight
        )
        group_metrics["equal_opportunity_difference"] = equal_opportunity_difference(
            y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, priv_group=priv_group, pos_label=pos_label, sample_weight=sample_weight
        )
        group_metrics["average_odds_difference"] = average_odds_difference(
            y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, priv_group=priv_group, pos_label=pos_label, sample_weight=sample_weight
        )
        # Only computed when the predictions contain more than one distinct value
        if len(set(y_pred)) > 1:
            group_metrics["conditional_demographic_disparity"] = conditional_demographic_disparity(
                y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, pos_label=pos_label, sample_weight=sample_weight
            )
        else:
            group_metrics["conditional_demographic_disparity"] = None
        group_metrics["smoothed_edf"] = smoothed_edf(
            y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, pos_label=pos_label, sample_weight=sample_weight
        )
        group_metrics["df_bias_amplification"] = df_bias_amplification(
            y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, pos_label=pos_label, sample_weight=sample_weight
        )
        group_metrics["balanced_accuracy_score"] = balanced_accuracy_score(
            y_true=y_true, y_pred=y_pred, sample_weight=sample_weight
        )
    return group_metrics
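
To make the first two group metrics concrete, here is a hand computation with NumPy on made-up arrays. It follows the usual definitions (selection rate of the unprivileged group minus, or divided by, that of the privileged group). It is meant to mirror what `statistical_parity_difference` and `disparate_impact_ratio` report, but the exact sign and ratio conventions should be checked against the aif360 documentation.

```python
import numpy as np

# Toy predictions and a binary protected attribute (1 = privileged group)
y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0])
prot_attr = np.array([1, 1, 1, 1, 0, 0, 0, 0])

# Selection rate = share of positive predictions inside each group
rate_priv = y_pred[prot_attr == 1].mean()    # 3 positives out of 4
rate_unpriv = y_pred[prot_attr == 0].mean()  # 1 positive out of 4

spd = rate_unpriv - rate_priv        # 0 would mean parity
di_ratio = rate_unpriv / rate_priv   # 1 would mean parity
print(spd, di_ratio)
```

A negative difference, or a ratio below 1, indicates that the unprivileged group receives positive predictions less often in this toy example.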



## Download and simplify the dataset

In [3]:
import numpy as np
import fairlearn
np.__version__, fairlearn.__version__

('2.0.2', '0.13.0')

In [4]:
from fairlearn.datasets import fetch_diabetes_hospital
dataset = fetch_diabetes_hospital()

In [5]:
selection = [
    "age",
    "gender",
    "race",
    "time_in_hospital",
    "num_lab_procedures",
    "num_procedures",
    "num_medications",
    "number_diagnoses",
    "max_glu_serum",
    "A1Cresult",
    "insulin",
    "had_emergency",
    "had_inpatient_days",
    "had_outpatient_days"]
df = dataset.data[selection].copy(deep=True)
label = 'readmit_30_days'
df[label] = dataset.target
# Convert the "True"/"False" flags into integers: "False" => 0, "True" => 1
df.had_emergency = df.had_emergency.replace({"True":1, "False":0})
df.had_inpatient_days = df.had_inpatient_days.replace({"True":1, "False":0})
df.had_outpatient_days = df.had_outpatient_days.replace({"True":1, "False":0})
df



Unnamed: 0,age,gender,race,time_in_hospital,num_lab_procedures,num_procedures,num_medications,number_diagnoses,max_glu_serum,A1Cresult,insulin,had_emergency,had_inpatient_days,had_outpatient_days,readmit_30_days
0,'30 years or younger',Female,Caucasian,1,41,0,1,1,,,No,0,0,0,0
1,'30 years or younger',Female,Caucasian,3,59,0,18,9,,,Up,0,0,0,0
2,'30 years or younger',Female,AfricanAmerican,2,11,5,13,6,,,No,0,1,1,0
3,'30-60 years',Male,Caucasian,2,44,1,16,7,,,Up,0,0,0,0
4,'30-60 years',Male,Caucasian,1,51,0,8,5,,,Steady,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
101761,'Over 60 years',Male,AfricanAmerican,3,51,0,16,9,,>8,Down,0,0,0,0
101762,'Over 60 years',Female,AfricanAmerican,5,33,3,18,9,,,Steady,0,1,0,0
101763,'Over 60 years',Male,Caucasian,1,53,0,9,13,,,Down,0,0,1,0
101764,'Over 60 years',Female,Caucasian,10,45,2,21,9,,,Up,0,1,0,0


## Part 1: Data Analysis

### Question 1: Count the number of positive and negative labels

In [7]:
df.head()

Unnamed: 0,age,gender,race,time_in_hospital,num_lab_procedures,num_procedures,num_medications,number_diagnoses,max_glu_serum,A1Cresult,insulin,had_emergency,had_inpatient_days,had_outpatient_days,readmit_30_days
0,'30 years or younger',Female,Caucasian,1,41,0,1,1,,,No,0,0,0,0
1,'30 years or younger',Female,Caucasian,3,59,0,18,9,,,Up,0,0,0,0
2,'30 years or younger',Female,AfricanAmerican,2,11,5,13,6,,,No,0,1,1,0
3,'30-60 years',Male,Caucasian,2,44,1,16,7,,,Up,0,0,0,0
4,'30-60 years',Male,Caucasian,1,51,0,8,5,,,Steady,0,0,0,0


In [11]:
df.groupby(["readmit_30_days"]).count()

readmit_30_days,age,gender,race,time_in_hospital,num_lab_procedures,num_procedures,num_medications,number_diagnoses,max_glu_serum,A1Cresult,insulin,had_emergency,had_inpatient_days,had_outpatient_days
0,90409,90409,90409,90409,90409,90409,90409,90409,90409,90409,90409,90409,90409,90409
1,11357,11357,11357,11357,11357,11357,11357,11357,11357,11357,11357,11357,11357,11357
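
The two counts above already show how imbalanced the label is. A short arithmetic check, using only the numbers from this table:

```python
# Counts read from the groupby output above
n_negative = 90409
n_positive = 11357

n_total = n_negative + n_positive
positive_rate = n_positive / n_total
print(f"{n_total} rows, {positive_rate:.1%} positive")
```

With roughly one positive for every eight negatives, a model that always predicts "not readmitted" would already reach a high plain accuracy, which is worth keeping in mind when reading the metrics later on.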


Now we look at the different features, starting with the numerical ones.

### Question 2: Display the distribution of the numerical features and compute their correlation with the target

In [None]:
def Compute_correlation(cola, colb):
  return np.corrcoef(df[cola].values, df[colb].values)[0][1]

In [None]:
print("TODO")
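
One possible way to approach the correlation part is sketched below on synthetic data, so that it runs without the real dataset. The toy DataFrame and its label are made up for illustration; with the real data one would use `df`, the numerical columns from `selection`, and `label`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500
toy = pd.DataFrame({
    "time_in_hospital": rng.integers(1, 15, size=n),
    "num_medications": rng.integers(1, 40, size=n),
})
# Synthetic label loosely driven by time_in_hospital, for illustration only
toy["readmit_30_days"] = (toy["time_in_hospital"] + rng.normal(0, 4, size=n) > 9).astype(int)

numerical = ["time_in_hospital", "num_medications"]
# Pearson correlation of every numerical column with the label
correlations = toy[numerical].corrwith(toy["readmit_30_days"]).sort_values(ascending=False)
print(correlations)
```

Pearson correlation only captures linear dependence, so a value close to zero does not rule out a non-linear relation with the target.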

Then we consider the categorical features.

### Question 3: Display a histogram of the label distribution for each categorical feature

In [None]:
import plotly.express as px

def Display_categorical_hist(cat_feature, target):
  fig = px.histogram(df, x=cat_feature, color=target)
  fig.show()

def Display_categorical_hist_percent(cat_feature, target):
  # Count rows per (target, category) pair; "time_in_hospital" only serves as a row counter
  df_summarized = df.groupby([target,cat_feature]).agg("count").reset_index()
  # Percentage of each target value within a given category
  df_summarized[f"percent of {cat_feature}"] = df_summarized[[cat_feature,"time_in_hospital"]].apply(
    lambda x: 100*x.iloc[1]/df_summarized[df_summarized[cat_feature]==x.iloc[0]]["time_in_hospital"].sum(), axis=1
  )
  # Cast the target to str so that plotly treats it as a discrete color
  df_summarized[target] = df_summarized[target].astype(str)
  fig = px.bar(df_summarized, x=f"{cat_feature}", y=f"percent of {cat_feature}", color=target)
  fig.show()

In [None]:
print("TODO")
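
The per-category percentages can also be inspected as a plain table before plotting. The sketch below uses `pd.crosstab` on a made-up DataFrame; it is an alternative view, not a replacement for the plotting helpers above.

```python
import pandas as pd

toy = pd.DataFrame({
    "gender": ["Female", "Female", "Female", "Male", "Male", "Male", "Male", "Female"],
    "readmit_30_days": [0, 1, 0, 0, 0, 1, 1, 0],
})

# Row-normalised contingency table: each row (category) sums to 100
percent = pd.crosstab(toy["gender"], toy["readmit_30_days"], normalize="index") * 100
print(percent)
```

Normalising per row makes categories of very different sizes comparable, which raw counts do not.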

### Question 4: Which biases does the data analysis highlight?


The correlation analysis shows that the numerical variables are only weakly linearly correlated with readmission.
Likewise, the analysis of the categorical variables shows no significant difference in readmission across their values.
No bias is highlighted here.

In [None]:
print("TO WRITE")

## Part 2: Train a Decision Tree and study the fairness of its output

### Question 5: Use one-hot encoding to turn each categorical column with N categories into N binary columns

In [None]:
print("TODO: Create df_X, with numerical features and one hot encoded categorical features")
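
A minimal sketch of one-hot encoding with `pd.get_dummies`, on a made-up three-row DataFrame. The list of categorical columns is chosen for illustration; on the real data it would contain every non-numerical column of `df`.

```python
import pandas as pd

toy = pd.DataFrame({
    "num_medications": [1, 18, 13],
    "insulin": ["No", "Up", "Steady"],
    "gender": ["Female", "Male", "Female"],
})

categorical = ["insulin", "gender"]
# Each categorical column with N categories becomes N 0/1 columns; numerical columns pass through
toy_X = pd.get_dummies(toy, columns=categorical, dtype=int)
print(toy_X.columns.tolist())
```

The label column should be left out of the encoded feature matrix so that it does not leak into the model inputs.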

### Question 6: Split data into train and test sets



In [None]:
print("TODO: Create X_train, X_test, y_train, y_test, from df_X and the label")
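
A possible split is sketched below on toy arrays. The 80/20 ratio, the stratification and the random seed are illustrative choices, not requirements of this TD; stratifying keeps the share of positives the same in both sets, which helps with an imbalanced label.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy features and an imbalanced label (10% positives), for illustration only
X = np.arange(200).reshape(100, 2)
y = np.array([1] * 10 + [0] * 90)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
print(y_train.mean(), y_test.mean())
```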

### Question 7: Train a DecisionTreeClassifier (https://scikit-learn.org/stable/modules/tree.html#classification)

In [None]:
print("TODO")
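
The sketch below fits a `DecisionTreeClassifier` on synthetic data. The hyperparameters (`max_depth=3`, `class_weight="balanced"`) are assumptions made for the example, not values prescribed by this TD.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
# Synthetic label that depends on the first feature only
y = (X[:, 0] > 0.5).astype(int)

# max_depth and class_weight are illustrative choices
clf = DecisionTreeClassifier(max_depth=3, class_weight="balanced", random_state=0)
clf.fit(X[:200], y[:200])

test_accuracy = clf.score(X[200:], y[200:])
print(test_accuracy)
```

Limiting the depth keeps the tree small enough to inspect, and balanced class weights are one common way to keep a tree from ignoring a rare positive class.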

### Question 8: Compute the Confusion Matrix

In [None]:
print("TODO")
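
As a reminder of the layout returned by scikit-learn, here is `confusion_matrix` applied to two short made-up lists. With labels ordered `[0, 1]`, rows are true classes and columns are predicted classes.

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 0, 1, 0]

# Rows are true classes, columns are predicted classes; ravel() yields tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(tn, fp, fn, tp)
```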

### Question 9: Compute base rate metrics for a sensitive binary attribute (gender, race, etc.)

In [None]:
print("TODO compute disparate impact")
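
Sensitive attributes such as `race` take more than two values, so they have to be binarized before being used as a protected attribute. The sketch below does this on a toy DataFrame and computes the base rate of each group; treating "Caucasian" as the privileged group is an assumption made for the example.

```python
import pandas as pd

toy = pd.DataFrame({
    "race": ["Caucasian", "AfricanAmerican", "Caucasian", "Asian", "Caucasian", "AfricanAmerican"],
    "readmit_30_days": [1, 0, 0, 1, 1, 0],
})

# 1 for the group treated as privileged, 0 for everyone else (illustrative choice)
prot_attr = (toy["race"] == "Caucasian").astype(int)

# Base rate = share of positive labels inside each group
base_rates = toy["readmit_30_days"].groupby(prot_attr).mean()
print(base_rates)
```

An attribute binarized this way, together with the labels, has the shape expected by the `prot_attr` and `y_true` arguments of `get_metrics`, with `priv_group=1`.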

### Question 10: Compute model performance for a sensitive binary attribute (gender, race, etc.)

In [None]:
print("TODO compute model performance")
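
Per-group performance can be obtained by masking the arrays with the protected attribute. The sketch below compares the true positive rate of two groups on made-up arrays; the gap is written as unprivileged minus privileged, which is the usual convention for an equal opportunity difference but should be checked against the library used.

```python
import numpy as np
from sklearn.metrics import recall_score

y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 1, 0, 1, 1, 0, 0, 0])
prot_attr = np.array([1, 1, 1, 1, 0, 0, 0, 0])

tpr = {}
for group in (0, 1):
    mask = prot_attr == group
    # Recall on the positive class is the true positive rate of that group
    tpr[group] = recall_score(y_true[mask], y_pred[mask])

# Gap in true positive rate, unprivileged (0) minus privileged (1)
tpr_gap = tpr[0] - tpr[1]
print(tpr, tpr_gap)
```

The same masking pattern works for any other score, for instance `balanced_accuracy_score`.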

### Question 11: Compute model calibration according to a sensitive binary attribute (gender, race, etc.)

In [None]:
print("TODO")
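
A coarse way to look at calibration per group is to compare the mean predicted probability with the observed positive rate inside each group. The sketch below does this on made-up arrays; in practice `y_prob` would come from the classifier's `predict_proba`, and a finer analysis would bin the probabilities, for example with `sklearn.calibration.calibration_curve`.

```python
import numpy as np

# Toy predicted probabilities, true labels and group membership
y_prob = np.array([0.9, 0.8, 0.2, 0.1, 0.6, 0.6, 0.4, 0.4])
y_true = np.array([1, 1, 0, 0, 1, 0, 0, 0])
prot_attr = np.array([1, 1, 1, 1, 0, 0, 0, 0])

calibration_gap = {}
for group in (0, 1):
    mask = prot_attr == group
    mean_predicted = y_prob[mask].mean()
    observed_rate = y_true[mask].mean()
    # A well calibrated model has mean_predicted close to observed_rate in every group
    calibration_gap[group] = mean_predicted - observed_rate

print(calibration_gap)
```

In this toy example the scores are calibrated on average for group 1 but overestimate the risk for group 0.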