# Model comparison with a validation set

We follow exactly the same procedure as before, except that this time we split the training set into two distinct sets: a training set and a validation set.

The goal is to train the unconstrained classifier and the DLP on the training set only, and to use the validation set when fitting the thresholding model.

This prevents the thresholding model from simply learning the test set (overfitting the fairness constraints), and thus tests whether the method proposed by the authors generalizes.

In [1]:
# Add the project root to sys.path so that the project's modules can be imported.
import sys
from pathlib import Path

root_path = Path().resolve().parent  
sys.path.append(str(root_path))

In [2]:
import pandas as pd
from data.preprocessing import prepare_data

X_train, X_test, y_train, y_test, protected_train, protected_test = prepare_data()

In [3]:
from sklearn.model_selection import train_test_split

stratify_col = y_train.astype(str) + "_" + protected_train.astype(str)

X_train, X_val, y_train, y_val, protected_train, protected_val = train_test_split(
    X_train,
    y_train,
    protected_train,
    test_size=0.3,       # 30% held out for validation
    random_state=42,
    stratify=stratify_col
)
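
To make the effect of `stratify=stratify_col` concrete, here is a minimal standard-library sketch of a split stratified on a combined "label_group" key. It is an illustration only, not scikit-learn's implementation; the function `stratified_split` and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_split(keys, test_size=0.3, seed=42):
    """Split indices so that each key keeps roughly the same share in both parts."""
    by_key = defaultdict(list)
    for idx, key in enumerate(keys):
        by_key[key].append(idx)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for indices in by_key.values():
        rng.shuffle(indices)
        n_val = round(len(indices) * test_size)  # validation share taken per stratum
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy data (hypothetical): 100 labels and a two-valued protected attribute.
labels = [0] * 60 + [1] * 40
groups = ["A"] * 40 + ["B"] * 20 + ["A"] * 30 + ["B"] * 10
keys = [f"{y}_{g}" for y, g in zip(labels, groups)]

train_idx, val_idx = stratified_split(keys)
val_counts = Counter(keys[i] for i in val_idx)
print(val_counts)
```

Each (label, group) combination contributes 30% of its rows to the validation part, so small groups are not accidentally left out of either set.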


## Standard logistic regression

In [4]:
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(max_iter=1000)

clf.fit(X_train,y_train)
y_pred = clf.predict(X_test)

Computation of the main metrics: accuracy, demographic parity, and p%-rule.
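
As a point of reference, the two fairness metrics can be sketched as follows. This is an assumption about what they measure, not the project's `fairness.fair_metrics` code: demographic parity is taken as the gap between the highest and lowest group positive-prediction rates, and the p%-rule as the ratio of the lowest to the highest rate. The function names are hypothetical; the return shape `(value, (most_advantaged, most_disadvantaged))` mirrors what `print_metric_result` expects.

```python
from collections import defaultdict


def positive_rates(y_pred, groups):
    """Share of positive predictions within each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for pred, group in zip(y_pred, groups):
        counts[group][0] += int(pred == 1)
        counts[group][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}


def demographic_parity_difference(y_pred, groups):
    """Largest gap in positive-prediction rate between any two groups."""
    rates = positive_rates(y_pred, groups)
    best = max(rates, key=rates.get)
    worst = min(rates, key=rates.get)
    return rates[best] - rates[worst], (best, worst)


def p_percent_rule(y_pred, groups):
    """Ratio of the lowest to the highest group positive-prediction rate."""
    rates = positive_rates(y_pred, groups)
    best = max(rates, key=rates.get)
    worst = min(rates, key=rates.get)
    # If no group receives any positive prediction, all rates are equal.
    ratio = rates[worst] / rates[best] if rates[best] > 0 else 1.0
    return ratio, (best, worst)


# Toy predictions (hypothetical): group A is accepted twice as often as group B.
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

dp_value, (dp_best, dp_worst) = demographic_parity_difference(y_pred, groups)
pp_value, _ = p_percent_rule(y_pred, groups)
print(dp_value, pp_value, dp_best, dp_worst)
```

Under these definitions, a perfectly fair classifier has a demographic parity difference of 0 and a p%-rule of 1.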

In [5]:
from sklearn.metrics import classification_report

report = classification_report(y_test,y_pred)
print(report)

              precision    recall  f1-score   support

           0       0.88      0.93      0.90     11543
           1       0.73      0.61      0.66      3772

    accuracy                           0.85     15315
   macro avg       0.80      0.77      0.78     15315
weighted avg       0.84      0.85      0.84     15315



In [6]:
def print_metric_result(result):
    print(f'value: {result[0]:.2f} \n' \
          f'Most advantaged_group: {result[1][0]} \n' \
          f'Most disadvantaged_group: {result[1][1]}')

In [7]:
from fairness.fair_metrics import intersectional_demographic_parity, intersectional_p_percent
idp = intersectional_demographic_parity(y_pred,protected_test)
print('Demographic Parity:')
print_metric_result(idp)

Demographic Parity:
value: 0.18 
Most advantaged_group: Asian-Pac-Islander 
Most disadvantaged_group: Amer-Indian-Eskimo


In [8]:
p_per = intersectional_p_percent(y_pred,protected_test)
print('P%-Rule')
print_metric_result(p_per)

P%-Rule
value: 0.34 
Most advantaged_group: Asian-Pac-Islander 
Most disadvantaged_group: Amer-Indian-Eskimo


## DLP with Fairlearn

Fairlearn natively provides an implementation of a Disparate Learning Process that penalizes the loss function with a fairness constraint. Here we choose demographic parity.
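
For reference, the constrained problem targeted by this kind of reduction can be sketched as below. The notation is introduced here and does not come from the notebook: $Q$ is a (possibly randomized) classifier, $A$ the protected attribute, and $\varepsilon$ the tolerance, assumed to correspond to the `eps` argument.

```latex
\min_{Q} \; \Pr_{h \sim Q}\bigl[ h(X) \neq Y \bigr]
\quad \text{s.t.} \quad
\Bigl| \mathbb{E}_{h \sim Q}\bigl[ h(X) \mid A = a \bigr]
     - \mathbb{E}_{h \sim Q}\bigl[ h(X) \bigr] \Bigr| \le \varepsilon
\quad \text{for every group } a
```

A smaller $\varepsilon$ enforces positive-prediction rates that are closer across groups, typically at some cost in accuracy.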

In [9]:
from fairlearn.reductions import ExponentiatedGradient, DemographicParity

constraint = DemographicParity()

clf_fair = ExponentiatedGradient(
    estimator=LogisticRegression(solver="liblinear"),
    constraints=constraint,
    eps=0.02  # fairness tolerance (to be explored)
)

clf_fair.fit(
    X_train,
    y_train,
    sensitive_features=protected_train
)

y_pred_fair = clf_fair.predict(X_test)

The same metrics are computed for comparison.

In [10]:
fair_report = classification_report(y_test,y_pred_fair)
print(fair_report)

              precision    recall  f1-score   support

           0       0.80      0.94      0.87     11543
           1       0.61      0.28      0.38      3772

    accuracy                           0.78     15315
   macro avg       0.71      0.61      0.63     15315
weighted avg       0.75      0.78      0.75     15315



In [11]:
fair_idp = intersectional_demographic_parity(y_pred_fair,protected_test)
print('Demographic Parity (fair):')
print_metric_result(fair_idp)

Demographic Parity (fair):
value: 0.09 
Most advantaged_group: Amer-Indian-Eskimo 
Most disadvantaged_group: Black


In [12]:
fair_p_per = intersectional_p_percent(y_pred_fair,protected_test)
print('P%-rule (fair):')
print_metric_result(fair_p_per)

P%-rule (fair):
value: 0.56 
Most advantaged_group: Amer-Indian-Eskimo 
Most disadvantaged_group: Black


## Treatment disparity via thresholding

**Main addition:** we train the base model on the training set and run the threshold-computation algorithm on the validation set.
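
The idea of fitting per-group thresholds on one set and reusing them on another can be sketched as follows. This is a simplified stand-in, not the algorithm of `MulticlassThresholdOptimizer` (whose internals and `gamma` parameter are not shown here): it simply picks, for each group, the score cutoff that accepts a common target share of that group. All function names and data are hypothetical.

```python
def fit_group_thresholds(scores, groups, target_rate):
    """Pick one cutoff per group so that about target_rate of the group is accepted."""
    by_group = {}
    for score, group in zip(scores, groups):
        by_group.setdefault(group, []).append(score)

    thresholds = {}
    for group, group_scores in by_group.items():
        ranked = sorted(group_scores, reverse=True)
        k = round(target_rate * len(ranked))  # number of members to accept
        # The k-th highest score becomes the cutoff (ties may accept a few more).
        thresholds[group] = ranked[k - 1] if k > 0 else float("inf")
    return thresholds


def predict_with_thresholds(scores, groups, thresholds):
    """Apply the fitted group-specific cutoffs to new scores."""
    return [int(s >= thresholds[g]) for s, g in zip(scores, groups)]


def group_rates(preds, groups):
    """Share of positive predictions within each group."""
    rates = {}
    for g in sorted(set(groups)):
        members = [p for p, gg in zip(preds, groups) if gg == g]
        rates[g] = sum(members) / len(members)
    return rates


groups = ["A"] * 4 + ["B"] * 4

# Hypothetical validation scores: thresholds are fitted here.
val_scores = [0.9, 0.7, 0.4, 0.2, 0.5, 0.3, 0.2, 0.1]
thresholds = fit_group_thresholds(val_scores, groups, target_rate=0.5)
val_pred = predict_with_thresholds(val_scores, groups, thresholds)

# Hypothetical test scores: thresholds are reused without refitting.
test_scores = [0.8, 0.6, 0.75, 0.1, 0.35, 0.25, 0.6, 0.31]
test_pred = predict_with_thresholds(test_scores, groups, thresholds)

print(thresholds)
print(group_rates(val_pred, groups), group_rates(test_pred, groups))
```

On the toy validation data both groups end up with the same acceptance rate by construction, while on the toy test data the rates drift apart, which is exactly the generalization question this notebook examines.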

In [14]:
from fairness.treatment_disparity import MulticlassThresholdOptimizer

best_fair_clf = MulticlassThresholdOptimizer(protected_val)

val_y_pred = best_fair_clf.fit_transform(X_train,y_train,X_val,gamma=0.001)

Results on the validation set:

In [17]:
report = classification_report(y_val,val_y_pred)
print(report)

              precision    recall  f1-score   support

           0       0.90      0.87      0.89      6922
           1       0.65      0.69      0.67      2294

    accuracy                           0.83      9216
   macro avg       0.77      0.78      0.78      9216
weighted avg       0.83      0.83      0.83      9216



In [19]:
opti_fair_idp = intersectional_demographic_parity(val_y_pred,protected_val)
print('Difference in Demographic Parity (optimal fair, validation):')
print_metric_result(opti_fair_idp)

Difference in Demographic Parity (optimal fair, validation):
value: 0.00 
Most advantaged_group: Amer-Indian-Eskimo 
Most disadvantaged_group: White


In [None]:
opti_fair_p_per = intersectional_p_percent(val_y_pred,protected_val)
print('P%-rule (optimal fair, validation):')
print_metric_result(opti_fair_p_per)

P%-rule (optimal fair, validation):
value: 1.00 
Most advantaged_group: Amer-Indian-Eskimo 
Most disadvantaged_group: White


#### **Step 2:** we check whether the results generalize to the test set

In [21]:
y_pred_test = best_fair_clf.predict(X_test,protected_test)

In [28]:
from pprint import pprint
print('Computed Thresholds:')
pprint(best_fair_clf.thresholds)

Computed Thresholds:
{'Amer-Indian-Eskimo': np.float64(0.19467357331060448),
 'Asian-Pac-Islander': np.float64(0.3669684063702495),
 'Black': np.float64(0.15902737344321807),
 'Other': np.float64(0.09270798597027412),
 'White': np.float64(0.39284338892737236)}


In [29]:
report = classification_report(y_test,y_pred_test)
print(report)

              precision    recall  f1-score   support

           0       0.91      0.87      0.89     11543
           1       0.65      0.72      0.68      3772

    accuracy                           0.83     15315
   macro avg       0.78      0.80      0.79     15315
weighted avg       0.84      0.83      0.84     15315



In [None]:
opti_fair_idp = intersectional_demographic_parity(y_pred_test,protected_test)
print('Difference in Demographic Parity (optimal fair, test):')
print_metric_result(opti_fair_idp)

Difference in Demographic Parity (optimal fair, test):
value: 0.10 
Most advantaged_group: Other 
Most disadvantaged_group: Black


In [None]:
opti_fair_p_per = intersectional_p_percent(y_pred_test,protected_test)
print('P%-rule (optimal fair, test):')
print_metric_result(opti_fair_p_per)

P%-rule (optimal fair, test):
value: 0.72 
Most advantaged_group: Other 
Most disadvantaged_group: Black
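
To compare the three approaches side by side, the figures printed above can be gathered in one place. The sketch below only copies the values reported in this notebook's outputs (rounded as printed); the dictionary layout and the metric keys are introduced here for illustration.

```python
# Test-set values copied from the outputs reported in this notebook.
test_results = {
    "Logistic regression": {"accuracy": 0.85, "dp_diff": 0.18, "p_rule": 0.34},
    "Fairlearn DLP": {"accuracy": 0.78, "dp_diff": 0.09, "p_rule": 0.56},
    "Thresholding": {"accuracy": 0.83, "dp_diff": 0.10, "p_rule": 0.72},
}

# Validation-set values reported for the thresholding model.
thresholding_val = {"accuracy": 0.83, "dp_diff": 0.00, "p_rule": 1.00}

print(f"{'model':<22}{'accuracy':>10}{'dp_diff':>10}{'p_rule':>10}")
for name, metrics in test_results.items():
    print(f"{name:<22}{metrics['accuracy']:>10.2f}"
          f"{metrics['dp_diff']:>10.2f}{metrics['p_rule']:>10.2f}")

# Change from validation to test for the thresholding model.
gap = {
    key: round(test_results["Thresholding"][key] - thresholding_val[key], 2)
    for key in thresholding_val
}
print("validation -> test change:", gap)
```

Read this way, the thresholding model keeps its accuracy from validation to test, but its fairness metrics degrade (demographic parity difference from 0.00 to 0.10, p%-rule from 1.00 to 0.72). On the test set it is close to the Fairlearn DLP on demographic parity difference and better on the p%-rule and accuracy.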
