<a href="https://colab.research.google.com/github/Pitou11/fairness2/blob/main/UPP26/TD5_inprocessing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# TD 5: Bias mitigation with an in-processing method: Prejudice Remover

The aim of this notebook is to use the Prejudice Remover in-processing approach and analyse its impact on the model output.
In terms of machine learning, we will go a bit further into the train/validation/test paradigm.

The model is learned on the train dataset, its parameters are then optimized on the validation dataset, and finally its performance is evaluated on the test dataset.
No choice or decision may depend on the test dataset, as this could result in overfitting to the test dataset.
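As a rough illustration of this paradigm, the sketch below shuffles row indices and cuts them at cumulative fractions 0.5 and 0.8, which gives a 50/30/20 train/validation/test split. The helper name `three_way_split` and the seed are assumptions made for this example; it is not the AIF360 implementation.

```python
import numpy as np


def three_way_split(n_rows, cuts=(0.5, 0.8), seed=0):
    """Shuffle row indices and cut them at the given cumulative fractions."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_rows)
    # Convert cumulative fractions into integer cut positions.
    positions = [int(round(fraction * n_rows)) for fraction in cuts]
    train, valid, test = np.split(indices, positions)
    return train, valid, test


train_idx, valid_idx, test_idx = three_way_split(1000)
print(len(train_idx), len(valid_idx), len(test_idx))  # 500 300 200
```

The three index sets are disjoint, so any decision taken on the validation rows never sees the test rows.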

Here you will:
- use the Prejudice Remover approach as a black box
- train the Prejudice Remover using the train/validation paradigm to choose the 'best' threshold
- combine Prejudice Remover with Reweighing

As a reminder of the pre-processing approach, we encourage you to:
- analyse the impact of Reweighing on different models (Logistic Regression, Decision Tree, Random Forest, etc.)


## Installation of the environment

We highly recommend following these steps, so that every student works in an environment as similar as possible to the one used during testing.

### Colab Settings ---- for Colab Users ONLY
  The next code cells should be executed only once per Colab environment.


#### 1. Python env creation (Colab only)

        ```
        ! python -m pip install numpy fairlearn plotly nbformat ipykernel aif360["inFairness"] aif360['AdversarialDebiasing'] causal-learn BlackBoxAuditing cvxpy dice-ml lime shapkit
        ```
#### 2. Download the MEPS dataset (for part 2); this can take several minutes (Colab only)

        ```
        ! Rscript /usr/local/lib/python3.12/dist-packages/aif360/data/raw/meps/generate_data.R
        ! mv h181.csv /usr/local/lib/python3.12/dist-packages/aif360/data/raw/meps/
        ! mv h192.csv /usr/local/lib/python3.12/dist-packages/aif360/data/raw/meps/
        ```

### Local Settings ---- for installation on local computer ONLY

If you already have an env from TD2, TD3 or TD4, you can simply reuse it.


#### 1. Uv installation (local only, no need to redo if already done)


        https://docs.astral.sh/uv/getting-started/installation/


        `curl -LsSf https://astral.sh/uv/install.sh | sh`

        Python version 3.12 installation (highly recommended)
        `uv python install 3.12`

#### 2. R installation *NEW* (local only)

        If the command `Rscript` returns 'command not found':

        `sudo apt install r-base-core`

#### 3. Python env creation (local only, no need to redo if already done)

        ```
        mkdir TD_bias_mitigation
        cd TD_bias_mitigation
        uv python pin 3.12
        uv init
        uv venv
        uv add numpy fairlearn plotly nbformat ipykernel aif360["inFairness"] aif360['AdversarialDebiasing'] causal-learn BlackBoxAuditing cvxpy dice-ml lime shapkit
        uv add pandas==2.2.2
        ```

#### 4. Download the MEPS dataset; this can take several minutes *NEW* (local only)

        ```
        cd TD_bias_mitigation/.venv/lib/python3.12/site-packages/aif360/data/raw/meps/
        Rscript generate_data.R
        ```

In [1]:
# To execute only in Colab
! python -m pip install numpy fairlearn plotly nbformat ipykernel aif360["inFairness"] aif360['AdversarialDebiasing'] causal-learn BlackBoxAuditing cvxpy dice-ml lime shapkit

Collecting fairlearn
  Downloading fairlearn-0.13.0-py3-none-any.whl.metadata (7.3 kB)
Collecting causal-learn
  Downloading causal_learn-0.1.4.4-py3-none-any.whl.metadata (4.6 kB)
Collecting BlackBoxAuditing
  Downloading BlackBoxAuditing-0.1.54.tar.gz (2.6 MB)
  Preparing metadata (setup.py) ... done
Collecting dice-ml
  Downloading dice_ml-0.12-py3-none-any.whl.metadata (20 kB)
Collecting lime
  Downloading lime-0.2.0.1.tar.gz (275 kB)
  Preparing metadata (setup.py) ... done
Collecting shapkit
  Downloading shapkit-0.0.4-py3-none-any.whl.metadata (7.2 kB)
Collecting aif360[inFairness]
  Downloading aif360-0.6.1-py3-none-any.whl.metadata (5.0 kB)
Collecting scipy<1.16.0,>=1.9.3 (from fairlearn)
  Downloading sci

In [2]:
# To execute only in Colab
! Rscript /usr/local/lib/python3.12/dist-packages/aif360/data/raw/meps/generate_data.R
! mv h181.csv /usr/local/lib/python3.12/dist-packages/aif360/data/raw/meps/
! mv h192.csv /usr/local/lib/python3.12/dist-packages/aif360/data/raw/meps/


By using this script you acknowledge the responsibility for reading and
abiding by any copyright/usage rules and restrictions as stated on the
MEPS web site (https://meps.ahrq.gov/data_stats/data_use.jsp).

Continue [y/n]? > y
Loading required package: foreign
trying URL 'https://meps.ahrq.gov/mepsweb/data_files/pufs/h181ssp.zip'
Content type 'application/zip' length 13303652 bytes (12.7 MB)
downloaded 12.7 MB

Loading dataframe from file: h181.ssp
Exporting dataframe to file: h181.csv
trying URL 'https://meps.ahrq.gov/mepsweb/data_files/pufs/h192ssp.zip'
Content type 'application/zip' length 15505898 bytes (14.8 MB)
downloaded 14.8 MB

Loading dataframe from file: h192.ssp
Exporting dataframe to file: h192.csv


## 1. Import and load the dataset

In [3]:
# imports
import numpy as np
import pandas as pd
import plotly.express as px
import warnings

warnings.simplefilter(action="ignore", category=FutureWarning)
warnings.simplefilter(action="ignore", append=True, category=UserWarning)
# Datasets
from aif360.datasets import MEPSDataset19

# Fairness metrics
from sklearn.metrics import accuracy_score, balanced_accuracy_score
from sklearn.preprocessing import StandardScaler

# Load the dataset once and reuse it for the split
MEPSDataset19_data = MEPSDataset19()
(dataset_orig_panel19_train, dataset_orig_panel19_val, dataset_orig_panel19_test) = (
    MEPSDataset19_data.split([0.5, 0.8], shuffle=True)
)

In [4]:
len(dataset_orig_panel19_train.instance_weights), len(
    dataset_orig_panel19_val.instance_weights
), len(dataset_orig_panel19_test.instance_weights)

(7915, 4749, 3166)

In [5]:
instance_weights = MEPSDataset19_data.instance_weights
instance_weights

array([21854.981705, 18169.604822, 17191.832515, ...,  3896.116219,
        4883.851005,  6630.588948])

In [6]:
f"Dataset size {len(instance_weights)}, total dataset weight {instance_weights.sum()}."

'Dataset size 15830, total dataset weight 141367240.546316.'

In [7]:
from aif360.sklearn.metrics import (
    average_odds_difference,
    base_rate,
    conditional_demographic_disparity,
    df_bias_amplification,
    disparate_impact_ratio,
    equal_opportunity_difference,
    smoothed_edf,
    statistical_parity_difference,
)
from sklearn.metrics import balanced_accuracy_score


# This function takes lists or NumPy arrays
def get_metrics(
    y_true,  # list or np.array of truth values
    y_pred=None,  # list or np.array of predictions
    prot_attr=None,  # list or np.array of protected/sensitive attribute values
    priv_group=1,  # value taken by the privileged group
    pos_label=1,  # value taken by the positive truth/prediction
    sample_weight=None,  # list or np.array of weight values
):
    group_metrics = {}
    group_metrics["base_rate_truth"] = base_rate(
        y_true=y_true, pos_label=pos_label, sample_weight=sample_weight
    )
    group_metrics["statistical_parity_difference"] = statistical_parity_difference(
        y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, priv_group=priv_group, pos_label=pos_label, sample_weight=sample_weight
    )
    group_metrics["disparate_impact_ratio"] = disparate_impact_ratio(
        y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, priv_group=priv_group, pos_label=pos_label, sample_weight=sample_weight
    )
    if y_pred is not None:
        group_metrics["base_rate_preds"] = base_rate(
            y_true=y_pred, pos_label=pos_label, sample_weight=sample_weight
        )
        group_metrics["equal_opportunity_difference"] = equal_opportunity_difference(
            y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, priv_group=priv_group, pos_label=pos_label, sample_weight=sample_weight
        )
        group_metrics["average_odds_difference"] = average_odds_difference(
            y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, priv_group=priv_group, pos_label=pos_label, sample_weight=sample_weight
        )
        if len(set(y_pred))>1:
            group_metrics["conditional_demographic_disparity"] = conditional_demographic_disparity(
                y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, pos_label=pos_label, sample_weight=sample_weight
            )
        else:
            group_metrics["conditional_demographic_disparity"] = None
        group_metrics["smoothed_edf"] = smoothed_edf(
            y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, pos_label=pos_label, sample_weight=sample_weight
        )
        group_metrics["df_bias_amplification"] = df_bias_amplification(
            y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, pos_label=pos_label, sample_weight=sample_weight
        )
        group_metrics["balanced_accuracy_score"] = balanced_accuracy_score(
            y_true=y_true, y_pred=y_pred, sample_weight=sample_weight
        )
    return group_metrics
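
To make two of these metrics concrete, here is a simplified NumPy sketch of the weighted statistical parity difference and disparate impact ratio computed on predictions. It is a hand-written illustration on made-up values, not the AIF360 code, and the helper name `weighted_selection_rate` is an assumption.

```python
import numpy as np


def weighted_selection_rate(y_pred, mask, sample_weight, pos_label=1):
    """Weighted share of positive predictions among the rows selected by mask."""
    weights = sample_weight[mask]
    return np.sum(weights * (y_pred[mask] == pos_label)) / np.sum(weights)


# Toy values, made up for illustration only.
y_pred = np.array([1, 0, 1, 1, 0, 0])
prot_attr = np.array([1, 1, 1, 0, 0, 0])  # 1 = privileged group
weights = np.array([1.0, 1.0, 2.0, 1.0, 1.0, 2.0])

rate_priv = weighted_selection_rate(y_pred, prot_attr == 1, weights)
rate_unpriv = weighted_selection_rate(y_pred, prot_attr == 0, weights)

spd = rate_unpriv - rate_priv  # statistical parity difference (0 means parity)
di = rate_unpriv / rate_priv   # disparate impact ratio (1 means parity)
print(spd, di)
```

A negative difference, or a ratio below 1, means the unprivileged group receives the positive prediction less often than the privileged group.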

## Learning a Prejudice Remover model on the training dataset, and choosing the best parameters with the validation dataset

In [8]:
# Bias mitigation techniques
from aif360.algorithms.preprocessing import Reweighing
from aif360.algorithms.inprocessing import PrejudiceRemover

### Question 1: Fit a Standard Scaler on the training dataset features; its output will be used as input to the learned model

In [12]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
train_scale = scaler.fit_transform(dataset_orig_panel19_train.features)
print(train_scale)

[[ 1.18960685  1.33092394  0.59754104 ...  0.9452728  -0.74601531
  -0.35930667]
 [ 0.43824652  1.33092394  1.03995285 ...  0.9452728  -0.74601531
  -0.35930667]
 [ 1.36639751  1.33092394 -1.14944794 ...  0.9452728  -0.74601531
  -0.35930667]
 ...
 [-0.35731147  1.33092394 -1.14944794 ... -1.05789567  1.34045507
  -0.35930667]
 [ 0.30565352 -0.75135774  1.16322906 ...  0.9452728  -0.74601531
  -0.35930667]
 [-0.66669513  1.33092394  1.11484027 ... -1.05789567 -0.74601531
   2.78313786]]
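
The scaling statistics should come from the training features only, and the validation and test features are then transformed with those same statistics. The NumPy sketch below mimics this on made-up arrays (per-column mean and population standard deviation); `X_train` and `X_val` are placeholder names, not notebook variables.

```python
import numpy as np

# Made-up feature matrices for illustration only.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_val = np.array([[2.0, 40.0]])

# Statistics are estimated on the training data only.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)  # population standard deviation (ddof=0)

X_train_scaled = (X_train - mean) / std
# Validation data reuses the training statistics; nothing is re-estimated here.
X_val_scaled = (X_val - mean) / std
print(X_val_scaled)
```

With scikit-learn this corresponds to calling `fit_transform` on the training features and `transform` on the validation and test features.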


### Question 2: Create a method that learns a Prejudice Remover on the train dataset and returns the learned model
Execute the method with the parameter eta arbitrarily set to 25.0



In [19]:
pr = PrejudiceRemover(eta = 25.0,sensitive_attr="RACE", class_attr="")
train_pr = pr.fit(dataset_orig_panel19_train)
print(train_pr)

<aif360.algorithms.inprocessing.prejudice_remover.PrejudiceRemover object at 0x7829f77344d0>


The Prejudice Remover outputs a single score for each instance. A threshold, arbitrarily set to 0.5 by default, turns this score into a prediction of 1 or 0.
If the score is greater than the threshold the prediction is 1, otherwise it is 0.
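
The thresholding rule can be sketched in a few lines of NumPy. The scores below are made up, and `predict_with_threshold` is a hypothetical helper name used only for this illustration.

```python
import numpy as np


def predict_with_threshold(scores, threshold=0.5):
    """Return 1.0 where the score is strictly above the threshold, else 0.0."""
    return (scores > threshold).astype(float)


scores = np.array([0.10, 0.45, 0.50, 0.80])  # made-up model scores
default_preds = predict_with_threshold(scores)
lower_preds = predict_with_threshold(scores, threshold=0.3)
print(default_preds)  # [0. 0. 0. 1.]
print(lower_preds)    # [0. 1. 1. 1.]
```

Lowering the threshold can only turn predictions from 0 to 1, which is why it changes both accuracy and the group-level selection rates.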

### Validating: Choose the best parameters

There are two parameters here:
- eta: fairness penalty parameter of the PR model
- threshold: the threshold of the binary classification

The threshold is used to obtain predictions from the model output.
The eta is used during training.

### Question 3: Create a method that loops over 50 thresholds between 0 (excluded) and 0.5 and 5 values of eta in [1.0, 100.0], and outputs the metrics
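
As a small sketch of the requested grid, `np.linspace` can generate both ranges. Starting the thresholds at 0.01 to exclude 0 is an assumption for this example. Because eta acts during training while the threshold only acts on the scores, the number of models to fit equals the number of eta values, not the number of (eta, threshold) pairs.

```python
import numpy as np

thresholds = np.linspace(0.01, 0.5, 50)  # 50 thresholds, 0 excluded
etas = np.linspace(1.0, 100.0, 5)        # 5 fairness penalties

n_pairs = len(etas) * len(thresholds)    # metric rows to compute
n_fits = len(etas)                       # models to train
print(etas.tolist())    # [1.0, 25.75, 50.5, 75.25, 100.0]
print(n_pairs, n_fits)  # 250 5
```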

In [22]:
threshold = np.linspace(0.01, 0.5, 10) # Reduced to 10 values for faster execution, starting from 0.01
eta = np.linspace(1.0, 100.0, 3) # Reduced to 3 values for faster execution

# Define sensitive and class attributes
sensitive_attribute = 'RACE'
class_attribute = dataset_orig_panel19_train.label_names[0] # 'UTILIZATION'
privileged_group = 1.0 # For 'RACE' == 1
positive_label = 1.0 # For 'UTILIZATION' == 1

results = []

print(f"Starting evaluation for {len(eta)} eta values and {len(threshold)} thresholds.")

for i, current_eta in enumerate(eta):
  # 1. Train one Prejudice Remover model per eta. The threshold only changes how
  #    scores are turned into predictions, so the model is not refitted for each threshold.
  pr = PrejudiceRemover(
      eta=current_eta,
      sensitive_attr=sensitive_attribute,
      class_attr=class_attribute,
  )
  # fit() trains `pr` in place (it also returns the fitted object).
  pr.fit(dataset_orig_panel19_train)

  # 2. Get scores on the validation dataset.
  # predict() returns a dataset whose `scores` attribute holds the model scores.
  val_scores = pr.predict(dataset_orig_panel19_val).scores
  prot_attr_val = dataset_orig_panel19_val.protected_attributes[
      :, dataset_orig_panel19_val.protected_attribute_names.index(sensitive_attribute)
  ]

  for j, current_threshold in enumerate(threshold):
    print(f"  Evaluating eta={current_eta:.2f} ({i+1}/{len(eta)}), threshold={current_threshold:.2f} ({j+1}/{len(threshold)})...")
    # 3. Make binary predictions based on the threshold
    val_pred = (val_scores > current_threshold).astype(float)

    # 4. Calculate metrics
    metrics = get_metrics(
        y_true=dataset_orig_panel19_val.labels[:, 0],
        y_pred=val_pred[:, 0],
        prot_attr=prot_attr_val,
        priv_group=privileged_group,
        pos_label=positive_label,
        sample_weight=dataset_orig_panel19_val.instance_weights,
    )
    result_row = {
        "eta": current_eta,
        "threshold": current_threshold,
    }
    result_row.update(metrics)
    results.append(result_row)

# Convert results to a DataFrame for easier analysis
evaluation_results = pd.DataFrame(results)

print(evaluation_results.head())

Starting evaluation for 3 eta values and 10 thresholds.
  Evaluating eta=1.00 (1/3), threshold=0.01 (1/10)...
  Evaluating eta=1.00 (1/3), threshold=0.06 (2/10)...
  Evaluating eta=1.00 (1/3), threshold=0.12 (3/10)...
  Evaluating eta=1.00 (1/3), threshold=0.17 (4/10)...
  Evaluating eta=1.00 (1/3), threshold=0.23 (5/10)...
  Evaluating eta=1.00 (1/3), threshold=0.28 (6/10)...
  Evaluating eta=1.00 (1/3), threshold=0.34 (7/10)...
  Evaluating eta=1.00 (1/3), threshold=0.39 (8/10)...
  Evaluating eta=1.00 (1/3), threshold=0.45 (9/10)...
  Evaluating eta=1.00 (1/3), threshold=0.50 (10/10)...
  Evaluating eta=50.50 (2/3), threshold=0.01 (1/10)...
  Evaluating eta=50.50 (2/3), threshold=0.06 (2/10)...
  Evaluating eta=50.50 (2/3), threshold=0.12 (3/10)...
  Evaluating eta=50.50 (2/3), threshold=0.17 (4/10)...
  Evaluating eta=50.50 (2/3), threshold=0.23 (5/10)...
  Evaluating eta=50.50 (2/3), threshold=0.28 (6/10)...
  Evaluating eta=50.50 (2/3), threshold=0.34 (7/10)...
  Evaluating eta=5

### Question 4: Make plots to choose the best set of parameters
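
One possible selection rule, sketched below on made-up numbers: keep the rows whose disparate impact ratio is close enough to 1, then take the one with the highest balanced accuracy. The tolerance `max_di_gap`, the helper name `select_best` and the rows themselves are assumptions for illustration, not results from this notebook.

```python
# Made-up metric rows, in the same shape as the rows collected during validation.
rows = [
    {"eta": 1.0, "threshold": 0.2, "balanced_accuracy_score": 0.76, "disparate_impact_ratio": 0.55},
    {"eta": 50.5, "threshold": 0.2, "balanced_accuracy_score": 0.72, "disparate_impact_ratio": 0.85},
    {"eta": 100.0, "threshold": 0.3, "balanced_accuracy_score": 0.70, "disparate_impact_ratio": 0.95},
]


def select_best(rows, max_di_gap=0.2):
    """Pick the most accurate row among those that are fair enough."""
    fair_enough = [r for r in rows if abs(1.0 - r["disparate_impact_ratio"]) <= max_di_gap]
    # Fall back to all rows if no row satisfies the fairness tolerance.
    candidates = fair_enough or rows
    return max(candidates, key=lambda r: r["balanced_accuracy_score"])


best = select_best(rows)
print(best["eta"], best["threshold"])  # 50.5 0.2
```

Plotting balanced accuracy and a fairness metric against the threshold, one curve per eta, shows the same trade-off visually and helps justify the tolerance.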

In [None]:
print("TODO")

### Question 5: Evaluate: compute the metrics on the test dataset using the model learned with the selected parameters


In [None]:
print("TODO")

## Combine pre-processing and in-processing
### Question 6: Redo the Prejudice Remover approach, applying the Reweighing pre-processing first

In [24]:
rw = Reweighing(unprivileged_groups=[{'RACE': 0.0}], privileged_groups=[{'RACE': 1.0}])
train_rw = rw.fit(dataset_orig_panel19_train)
print(train_rw)

<aif360.algorithms.preprocessing.reweighing.Reweighing object at 0x7829f72eac60>
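
As the printed output shows, `fit` returns the fitted `Reweighing` object rather than a dataset; the reweighted dataset is obtained afterwards with `transform` (or directly with `fit_transform`). To illustrate the idea behind the weights, the sketch below gives each (group, label) cell the ratio of its expected size under independence to its observed size. It uses made-up data and unweighted counts, and the helper name `reweighing_weights` is an assumption; it is not the AIF360 code.

```python
import numpy as np


def reweighing_weights(group, label):
    """Weight each row by expected / observed size of its (group, label) cell."""
    n = len(label)
    weights = np.empty(n, dtype=float)
    for g in np.unique(group):
        for y in np.unique(label):
            cell = (group == g) & (label == y)
            if cell.any():
                # Cell size expected if group and label were independent.
                expected = (group == g).sum() * (label == y).sum() / n
                weights[cell] = expected / cell.sum()
    return weights


# Toy data: the privileged group (1) has more positive labels.
group = np.array([1, 1, 1, 1, 0, 0, 0, 0])
label = np.array([1, 1, 1, 0, 1, 0, 0, 0])
w = reweighing_weights(group, label)

# Weighted positive rate per group after reweighing.
rate_priv = np.sum(w[(group == 1) & (label == 1)]) / np.sum(w[group == 1])
rate_unpriv = np.sum(w[(group == 0) & (label == 1)]) / np.sum(w[group == 0])
print(rate_priv, rate_unpriv)
```

Both weighted base rates come out equal, which is the effect Reweighing aims for on the training labels.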


## Adversarial Debiasing

Adversarial debiasing [1] is an in-processing technique that learns a classifier to maximize prediction accuracy and simultaneously reduce an adversary's ability to determine the protected attribute from the predictions.

See [AIF360 tuto](https://github.com/Trusted-AI/AIF360/blob/main/examples/demo_adversarial_debiasing.ipynb)
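
To give an intuition of the "maximize accuracy while hindering the adversary" idea, the sketch below shows one common formulation of the classifier update direction: the prediction-loss gradient, minus its projection on the adversary-loss gradient, minus a scaled adversary-loss gradient. The function name `debiased_update`, the parameter `alpha` and the toy gradients are assumptions for illustration; this is not a copy of AIF360's internals.

```python
import numpy as np


def debiased_update(grad_pred, grad_adv, alpha=1.0, eps=1e-8):
    """Combine prediction-loss and adversary-loss gradients for the classifier."""
    unit_adv = grad_adv / (np.linalg.norm(grad_adv) + eps)
    # Component of the prediction gradient that lies along the adversary gradient.
    projection = np.dot(grad_pred, unit_adv) * unit_adv
    return grad_pred - projection - alpha * grad_adv


grad_pred = np.array([1.0, 1.0])  # made-up gradient of the prediction loss
grad_adv = np.array([1.0, 0.0])   # made-up gradient of the adversary loss
update = debiased_update(grad_pred, grad_adv, alpha=0.5)
print(update)
```

With `debias=False`, only the prediction-loss gradient is used, which gives a plain classifier to compare against.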

Here we show how to learn an Adversarial Debiasing model with the argument debias set to False

In [25]:
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()
from aif360.algorithms.inprocessing.adversarial_debiasing import AdversarialDebiasing

sess = tf.Session()

plain_model = AdversarialDebiasing(
    unprivileged_groups=[{'RACE': 0.0}],
    privileged_groups=[{'RACE': 1.0}],
    scope_name='plain_classifier',
    debias=False,
    sess=sess)

plain_model.fit(dataset_orig_panel19_train)

Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


epoch 0; iter: 0; batch classifier loss: 0.961256
epoch 1; iter: 0; batch classifier loss: 0.784680
epoch 2; iter: 0; batch classifier loss: 0.616059
epoch 3; iter: 0; batch classifier loss: 0.507672
epoch 4; iter: 0; batch classifier loss: 0.382953
epoch 5; iter: 0; batch classifier loss: 0.326099
epoch 6; iter: 0; batch classifier loss: 0.259677
epoch 7; iter: 0; batch classifier loss: 0.219285
epoch 8; iter: 0; batch classifier loss: 0.395851
epoch 9; iter: 0; batch classifier loss: 0.341150
epoch 10; iter: 0; batch classifier loss: 0.299453
epoch 11; iter: 0; batch classifier loss: 0.302512
epoch 12; iter: 0; batch classifier loss: 0.295940
epoch 13; iter: 0; batch classifier loss: 0.314817
epoch 14; iter: 0; batch classifier loss: 0.299420
epoch 15; iter: 0; batch classifier loss: 0.262912
epoch 16; iter: 0; batch classifier loss: 0.338670
epoch 17; iter: 0; batch classifier loss: 0.273303
epoch 18; iter: 0; batch classifier loss: 0.276795
epoch 19; iter: 0; batch classifier loss:

<aif360.algorithms.inprocessing.adversarial_debiasing.AdversarialDebiasing at 0x7829f741cb30>

In [26]:
# Apply the plain model to train and val data
dataset_nodebiasing_train = plain_model.predict(dataset_orig_panel19_train)
dataset_nodebiasing_val = plain_model.predict(dataset_orig_panel19_val)

In [27]:
get_metrics(
    y_true = dataset_orig_panel19_train.labels[:,0],
    y_pred= dataset_nodebiasing_train.labels[:,0],
    prot_attr= dataset_orig_panel19_train.protected_attributes[:,0],
    sample_weight= dataset_orig_panel19_train.instance_weights
)

{'base_rate_truth': np.float64(0.2175580473224301),
 'statistical_parity_difference': np.float64(-0.1460238687255686),
 'disparate_impact_ratio': 0.3558503689905921,
 'base_rate_preds': np.float64(0.1686109053620597),
 'equal_opportunity_difference': -0.17394215877176566,
 'average_odds_difference': -0.11286350521421812,
 'conditional_demographic_disparity': np.float64(-0.05102689124979527),
 'smoothed_edf': np.float64(1.0332447921975951),
 'df_bias_amplification': np.float64(0.3348973987581134),
 'balanced_accuracy_score': np.float64(0.774212852295711)}

In [28]:
get_metrics(
    y_true = dataset_orig_panel19_val.labels[:,0],
    y_pred= dataset_nodebiasing_val.labels[:,0],
    prot_attr= dataset_orig_panel19_val.protected_attributes[:,0],
    sample_weight= dataset_orig_panel19_val.instance_weights
)

{'base_rate_truth': np.float64(0.2174745734278294),
 'statistical_parity_difference': np.float64(-0.14202380676146711),
 'disparate_impact_ratio': 0.33962520969373916,
 'base_rate_preds': np.float64(0.15697349832188962),
 'equal_opportunity_difference': -0.16485180145804762,
 'average_odds_difference': -0.12116322898032325,
 'conditional_demographic_disparity': np.float64(-0.047200169741383725),
 'smoothed_edf': np.float64(1.0799123105241468),
 'df_bias_amplification': np.float64(0.4043317012542853),
 'balanced_accuracy_score': np.float64(0.69217492865539)}

In [29]:
sess.close()
tf.reset_default_graph()


### Question 7: Redo the same (learn an Adversarial Debiasing model) with the argument debias set to True

Compare the resulting metrics

In [None]:
print("TODO")

### Question 8: Combine the Reweighing with the Adversarial Debiasing

In [None]:
print("TODO")

This in-processing approach does not seem compatible with Reweighing, as the df_bias_amplification is high and the disparate impact ratio is not improved by using Reweighing as pre-processing.
Although very effective on the fairness metrics of the dataset, Reweighing is not suited to every kind of machine learning algorithm.



## Analysis of the influence of Reweighing

### Question 9: To go further, study the impact of Reweighing on different models, in particular decision trees

In [None]:
print("TODO")