In this notebook we **rebalance** the COMPAS data and study how resampling affects fairness metrics.

## Import libraries and data

In [1]:
%matplotlib inline

from IPython.display import clear_output


# import matplotlib.pyplot as plt 
import numpy as np 
import pandas as pd

# import shap

pd.set_option('display.max_columns', None)

from sklearn.metrics import precision_score
from sklearn.model_selection import train_test_split

from xgboost import XGBClassifier

from imblearn.over_sampling import SMOTE, RandomOverSampler
from imblearn.under_sampling import RandomUnderSampler



In [17]:
from sklearn.preprocessing import LabelEncoder

In [2]:
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix, ConfusionMatrixDisplay, mean_squared_error

# Bias in COMPAS

In this section we study whether the COMPAS model is biased by comparing its scores with the actual recidivism rate. In other words, given two people who share every characteristic except race, we examine whether the model systematically assigns a higher score to one race.

COMPAS works by evaluating a range of factors, including age, sex, personality traits, measures of social isolation, criminal history, family criminality, geography and employment status. Northpointe obtains part of this information from criminal records and the rest from a questionnaire that asks defendants to answer questions such as "How many of your friends/acquaintances use illegal drugs?" and to agree or disagree with statements such as "A hungry person has a right to steal".

COMPAS returns a decile score from 1 to 10 that indicates the risk of recidivism. To simplify comparisons, the decile score is converted into a binary label: high risk (5-10) or low risk (1-4).
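The thresholding can be sketched in a few lines. The helper names below are hypothetical, and the equivalence with the text label assumes that only deciles 1-4 are labelled "Low", which matches the rows displayed in this notebook (decile 4 is "Low", decile 8 is "High").

```python
def decile_to_high_risk(decile_score: int) -> int:
    """Return 1 for high risk (deciles 5-10) and 0 for low risk (deciles 1-4)."""
    if not 1 <= decile_score <= 10:
        raise ValueError("decile scores are expected to lie in 1..10")
    return int(decile_score >= 5)


def score_text_to_high_risk(score_text: str) -> int:
    """Binarize the text label: 'Low' -> 0, anything else (Medium/High) -> 1."""
    return int(score_text != "Low")


# (decile_score, score_text) pairs taken from the displayed sample rows
examples = [(1, "Low"), (3, "Low"), (4, "Low"), (8, "High")]
labels = [decile_to_high_risk(decile) for decile, _ in examples]
print(labels)
```

Under that assumption both helpers yield the same binary label, which is why the later cells can binarize `score_text` instead of `decile_score`.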

In [3]:
url = 'https://raw.githubusercontent.com/propublica/compas-analysis/master/compas-scores-two-years.csv'
df = pd.read_csv(url)
df['high_risk'] = (df['decile_score'] >= 5).astype(int)

In [4]:
df.head()

Unnamed: 0,id,name,first,last,compas_screening_date,sex,dob,age,age_cat,race,juv_fel_count,decile_score,juv_misd_count,juv_other_count,priors_count,days_b_screening_arrest,c_jail_in,c_jail_out,c_case_number,c_offense_date,c_arrest_date,c_days_from_compas,c_charge_degree,c_charge_desc,is_recid,r_case_number,r_charge_degree,r_days_from_arrest,r_offense_date,r_charge_desc,r_jail_in,r_jail_out,violent_recid,is_violent_recid,vr_case_number,vr_charge_degree,vr_offense_date,vr_charge_desc,type_of_assessment,decile_score.1,score_text,screening_date,v_type_of_assessment,v_decile_score,v_score_text,v_screening_date,in_custody,out_custody,priors_count.1,start,end,event,two_year_recid,high_risk
0,1,miguel hernandez,miguel,hernandez,2013-08-14,Male,1947-04-18,69,Greater than 45,Other,0,1,0,0,0,-1.0,2013-08-13 06:03:42,2013-08-14 05:41:20,13011352CF10A,2013-08-13,,1.0,F,Aggravated Assault w/Firearm,0,,,,,,,,,0,,,,,Risk of Recidivism,1,Low,2013-08-14,Risk of Violence,1,Low,2013-08-14,2014-07-07,2014-07-14,0,0,327,0,0,0
1,3,kevon dixon,kevon,dixon,2013-01-27,Male,1982-01-22,34,25 - 45,African-American,0,3,0,0,0,-1.0,2013-01-26 03:45:27,2013-02-05 05:36:53,13001275CF10A,2013-01-26,,1.0,F,Felony Battery w/Prior Convict,1,13009779CF10A,(F3),,2013-07-05,Felony Battery (Dom Strang),,,,1,13009779CF10A,(F3),2013-07-05,Felony Battery (Dom Strang),Risk of Recidivism,3,Low,2013-01-27,Risk of Violence,1,Low,2013-01-27,2013-01-26,2013-02-05,0,9,159,1,1,0
2,4,ed philo,ed,philo,2013-04-14,Male,1991-05-14,24,Less than 25,African-American,0,4,0,1,4,-1.0,2013-04-13 04:58:34,2013-04-14 07:02:04,13005330CF10A,2013-04-13,,1.0,F,Possession of Cocaine,1,13011511MM10A,(M1),0.0,2013-06-16,Driving Under The Influence,2013-06-16,2013-06-16,,0,,,,,Risk of Recidivism,4,Low,2013-04-14,Risk of Violence,3,Low,2013-04-14,2013-06-16,2013-06-16,4,0,63,0,1,0
3,5,marcu brown,marcu,brown,2013-01-13,Male,1993-01-21,23,Less than 25,African-American,0,8,1,0,1,,,,13000570CF10A,2013-01-12,,1.0,F,Possession of Cannabis,0,,,,,,,,,0,,,,,Risk of Recidivism,8,High,2013-01-13,Risk of Violence,6,Medium,2013-01-13,,,1,0,1174,0,0,1
4,6,bouthy pierrelouis,bouthy,pierrelouis,2013-03-26,Male,1973-01-22,43,25 - 45,Other,0,1,0,0,2,,,,12014130CF10A,,2013-01-09,76.0,F,arrest case no charge,0,,,,,,,,,0,,,,,Risk of Recidivism,1,Low,2013-03-26,Risk of Violence,1,Low,2013-03-26,,,2,0,1102,0,0,0


## Experiments

Since we do not have the input features needed to replicate the COMPAS model, we train a classifier to predict the COMPAS score from sex, race, age, number of prior offenses and the crime factor. We evaluate the model with several fairness metrics and study how different data rebalancing methods affect those metrics.


SMOTE/Undersampling/Oversampling -> Train -> Evaluate different metrics.
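One way to read the resampling step of this pipeline: race and the recidivism outcome are combined into a single four-valued label ('00', '01', '10', '11'), and the sampler balances those four groups. The sketch below mimics random oversampling with the standard library only. `oversample_joint` and the toy rows are hypothetical and are not the imbalanced-learn API used in the notebook.

```python
import random
from collections import Counter


def oversample_joint(rows, race, recid, seed=42):
    """Randomly duplicate rows until every (race, recid) group has the same size."""
    rng = random.Random(seed)
    # Joint label: first character is race, second is the recidivism outcome
    joint = [f"{r}{c}" for r, c in zip(race, recid)]
    groups = {}
    for row, label in zip(rows, joint):
        groups.setdefault(label, []).append(row)
    target = max(len(members) for members in groups.values())

    out_rows, out_labels = [], []
    for label, members in sorted(groups.items()):
        # Draw extra rows with replacement to reach the size of the largest group
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)

    # Decompose the joint label back into its two components
    new_race = [int(label[0]) for label in out_labels]
    new_recid = [int(label[1]) for label in out_labels]
    return out_rows, new_race, new_recid


# Toy data: 3x '00', 1x '01', 1x '10', 2x '11'
rows = [{"age": a} for a in (23, 34, 43, 51, 29, 38, 60)]
race = [0, 0, 0, 0, 1, 1, 1]
recid = [0, 0, 0, 1, 0, 1, 1]

res_rows, res_race, res_recid = oversample_joint(rows, race, recid)
counts = Counter(zip(res_race, res_recid))
print(counts)
```

Balancing on the joint label equalizes both the group sizes and the base rate of recidivism within each group, which is a different goal from balancing the target class alone.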

### Metrics to evaluate:

Castelnovo, A., Crupi, R., Greco, G., & Regoli, D. (2021). The zoo of Fairness metrics in Machine Learning. arXiv preprint arXiv:2106.00467.


INDEPENDENCE

- **Demographic parity**: positive prediction rate, compared between the two races.
- **Demographic parity conditioned on priors?**

SEPARATION

- **Predictive equality** -> FPR
- **Equality of opportunity** -> TPR (equivalently, FNR)

SUFFICIENCY

- **Predictive parity** -> Precision
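The independence and sufficiency criteria in this list can be illustrated with a dependency-free sketch. The function names and toy predictions are hypothetical, and the gap is taken as an absolute difference between groups, which is an assumption (a ratio between groups is an equally common convention).

```python
def positive_rate(y_pred):
    """Share of individuals predicted positive (high risk)."""
    return sum(y_pred) / len(y_pred)


def precision(y_true, y_pred):
    """TP / (TP + FP); returns None when nothing is predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    predicted_pos = sum(y_pred)
    return tp / predicted_pos if predicted_pos else None


# Hypothetical toy labels and predictions for two groups
y_true_a, y_pred_a = [1, 0, 1, 0], [1, 1, 1, 0]
y_true_b, y_pred_b = [1, 0, 0, 0], [1, 0, 0, 0]

# Demographic parity: compare positive prediction rates, ignoring the true labels
demographic_parity_gap = abs(positive_rate(y_pred_a) - positive_rate(y_pred_b))

# Predictive parity: compare precision, i.e. how often a positive prediction is right
predictive_parity_gap = abs(
    precision(y_true_a, y_pred_a) - precision(y_true_b, y_pred_b)
)

print(demographic_parity_gap, predictive_parity_gap)
```

A gap of 0 means the criterion is satisfied exactly; the two gaps can differ widely for the same predictions because only the second one depends on the true labels.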

In [5]:
def eval_fairness(y_pred, y_true, black_mask, white_mask):
    """Compute per-group TPR/FPR, their absolute gaps and the overall accuracy."""
    y_pred_black = y_pred[black_mask]
    y_true_black = y_true[black_mask]
    y_pred_white = y_pred[white_mask]
    y_true_white = y_true[white_mask]
    # False positive rate: FPR = FP / (FP + TN)
    fpr_black = np.sum((y_pred_black == 1) & (y_true_black == 0)) / np.sum(y_true_black == 0)
    fpr_white = np.sum((y_pred_white == 1) & (y_true_white == 0)) / np.sum(y_true_white == 0)
    # True positive rate: TPR = TP / (TP + FN)
    tpr_black = np.sum((y_pred_black == 1) & (y_true_black == 1)) / np.sum(y_true_black == 1)
    tpr_white = np.sum((y_pred_white == 1) & (y_true_white == 1)) / np.sum(y_true_white == 1)
    # Precision per group (predictive parity); computed but not included in the returned dict
    precision_black = precision_score(y_true_black, y_pred_black)
    precision_white = precision_score(y_true_white, y_pred_white)

    # Gaps are absolute differences between the two groups; 0 means parity
    data = {}
    data['TPR_w'] = tpr_white
    data['TPR_b'] = tpr_black
    data['FPR_w'] = fpr_white
    data['FPR_b'] = fpr_black
    data['Eq. Oportunity'] = abs(tpr_white-tpr_black)
    data['Pred. Equality'] = abs(fpr_white-fpr_black)
    data['Eq. odds'] = abs(tpr_white-tpr_black) + abs(fpr_white-fpr_black)
    data['Accuracy'] = np.mean(y_pred == y_true)

    return data
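As a worked check of the separation metrics, the sketch below derives the same three gaps directly from confusion-matrix counts. The helper `rates` and all counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def rates(tp, fp, tn, fn):
    """Return (TPR, FPR) from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # share of actual positives predicted positive
    fpr = fp / (fp + tn)  # share of actual negatives predicted positive
    return tpr, fpr


# Hypothetical counts for the two groups
tpr_w, fpr_w = rates(tp=40, fp=10, tn=90, fn=60)
tpr_b, fpr_b = rates(tp=65, fp=30, tn=70, fn=35)

eq_opportunity = abs(tpr_w - tpr_b)        # TPR gap
pred_equality = abs(fpr_w - fpr_b)         # FPR gap
eq_odds = eq_opportunity + pred_equality   # sum of both gaps

print(eq_opportunity, pred_equality, eq_odds)
```

Because the equalized-odds value is a sum of two absolute gaps, it can only reach 0 when both the TPR and the FPR match across groups.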

### SMOTE/Oversampling/Undersampling

In [None]:
# def eval_resampler(df, sampler=None, resample_test=False):

#     # Prepare the data
#     df_temp = df[(df['race'] == 'African-American') | (df['race'] == 'Caucasian')]
#     cols = ['age', 'sex', 'race', 'priors_count', 'score_text']
#     X, recid = df_temp[cols], df_temp['two_year_recid']
#     X['score_text'] = [0 if y_i == 'Low' else 1 for y_i in X['score_text']]
#     X = pd.get_dummies(X, drop_first=True)
#     X_train, X_test, recid_train, recid_test = train_test_split(X, recid.values, test_size=0.2, random_state=42)

#     ##############################
#     # RESAMPLE THE TRAINING SET  #
#     ##############################

#     # Build target variable combining both the race and whether the person has recidivated or not
#     #   - '00': Black, Non-recidivist
#     #   - '01': Black, Recidivist
#     #   - '10': White, Non-recidivist
#     #   - '11': White, Recidivist
#     if sampler:
#         # get the race value
#         y_race = X_train['race_Caucasian'].values
#         # build the target variable
#         y_sampler = np.array([str(a) + str(b) for a, b in zip(y_race, recid_train)])

#         print("TRAINING SET:")
#         print("Before Sampling: \n\tBlack, Non-recidivist: {}\n\tBlack, Recidivist: {}\
#             \n\tWhite, Non-recidivist: {}\n\tWhite, Recidivist: {}".format(np.sum(y_sampler == '00'), \
#             np.sum(y_sampler == '01'), np.sum(y_sampler == '10'), np.sum(y_sampler == '11')))

#         # Sample the dataset according to the race and the recidivism rates
#         X_train, y_sampler = sampler.fit_resample(X_train, y_sampler)

#         print("After Sampling: \n\tBlack, Non-recidivist: {}\n\tBlack, Recidivist: {}\
#             \n\tWhite, Non-recidivist: {}\n\tWhite, Recidivist: {}".format(np.sum(y_sampler == '00'), \
#             np.sum(y_sampler == '01'), np.sum(y_sampler == '10'), np.sum(y_sampler == '11')))

#         # Undo the label, i.e. get the race and the real recidivism rate
#         race, recid_train = np.array([int(y_i[0]) for y_i in y_sampler]), np.array([int(y_i[1]) for y_i in y_sampler])
#         X_train['race_Caucasian'] = race 
        
#     X_train, y_train = X_train.drop(columns='score_text'), X_train['score_text']

#     ####################################
#     # RESAMPLE THE TEST SET (OPTIONAL) #
#     ####################################

#     if resample_test and sampler:
#     # get the race value
#         y_race = X_test['race_Caucasian'].values
#         # build the target variable
#         y_sampler = np.array([str(a) + str(b) for a, b in zip(y_race, recid_test)])

#         print("TEST SET:")
#         print("Before Sampling: \n\tBlack, Non-recidivist: {}\n\tBlack, Recidivist: {}\
#             \n\tWhite, Non-recidivist: {}\n\tWhite, Recidivist: {}".format(np.sum(y_sampler == '00'), \
#             np.sum(y_sampler == '01'), np.sum(y_sampler == '10'), np.sum(y_sampler == '11')))

#         # Sample the dataset according to the race and the recidivism rates
#         X_test, y_sampler = sampler.fit_resample(X_test, y_sampler)

#         print("After Sampling: \n\tBlack, Non-recidivist: {}\n\tBlack, Recidivist: {}\
#             \n\tWhite, Non-recidivist: {}\n\tWhite, Recidivist: {}".format(np.sum(y_sampler == '00'), \
#             np.sum(y_sampler == '01'), np.sum(y_sampler == '10'), np.sum(y_sampler == '11')))

#         # Undo the label, i.e. get the race and the real recidivism rate
#         race, recid_test = np.array([int(y_i[0]) for y_i in y_sampler]), np.array([int(y_i[1]) for y_i in y_sampler])
#         X_test['race_Caucasian'] = race 

#     X_test, y_test = X_test.drop(columns='score_text'), X_test['score_text']

#     # Train the model

#     clf = XGBClassifier(use_label_encoder=False, eval_metric='logloss')
#     clf.fit(X_train, y_train)

#     # Predict
#     y_pred = clf.predict(X_test)

#     black_mask = X_test['race_Caucasian'] == 0
#     white_mask = X_test['race_Caucasian'] == 1

#     # Evaluate fairness metrics
#     data = eval_fairness(y_pred, recid_test, black_mask, white_mask)
#     return data

In [16]:
def eval_resampler(df, sampler=None, resample_test=False):
    """Optionally resample, train an XGBoost classifier on two-year recidivism
    (with the binarized COMPAS score as a feature) and return fairness metrics."""

    # --------------------------------------------------------
    # 1. FILTER DATA: African-American and Caucasian only
    # --------------------------------------------------------
    df_temp = df[(df["race"] == "African-American") | (df["race"] == "Caucasian")].copy()

    # --------------------------------------------------------
    # 2. FEATURE PREPARATION
    # --------------------------------------------------------
    cols = ["age", "sex", "priors_count", "score_text"]
    X = df_temp[cols].copy()

    # score_text -> 0 = Low, 1 = Medium/High
    X["score_text"] = (X["score_text"] != "Low").astype(int)

    # Encode sex as 0/1 (1 = Male)
    X["sex"] = (X["sex"] == "Male").astype(int)

    # Encode race as 0/1 for the fairness masks (1 = Caucasian)
    df_temp["race_bin"] = (df_temp["race"] == "Caucasian").astype(int)

    # Target: two-year recidivism (cast to int)
    recid = df_temp["two_year_recid"].astype(int).values

    # Add the binary race variable as a feature
    X["race_bin"] = df_temp["race_bin"].values

    # One-hot encode any remaining categorical columns (none are left at this point)
    X = pd.get_dummies(X, drop_first=True)

    # --------------------------------------------------------
    # 3. SPLIT DATA
    # --------------------------------------------------------
    X_train, X_test, recid_train, recid_test = train_test_split(
        X, recid, test_size=0.2, random_state=42
    )

    race_train = X_train["race_bin"].values
    race_test  = X_test["race_bin"].values

    # --------------------------------------------------------
    # 4. COMBINE RACE + RECID INTO A JOINT TARGET FOR THE SAMPLER (if any)
    # --------------------------------------------------------
    if sampler:

        # joint target for fairness -> 00, 01, 10, 11
        y_joint = np.array([f"{r}{c}" for r, c in zip(race_train, recid_train)])

        # print("\nTRAIN SET BEFORE SAMPLING:")
        # print("Black Non-rec:", np.sum(y_joint == "00"))
        # print("Black Recid:  ", np.sum(y_joint == "01"))
        # print("White Non-rec:", np.sum(y_joint == "10"))
        # print("White Recid:  ", np.sum(y_joint == "11"))

        # Resampling
        X_train_res, y_joint_res = sampler.fit_resample(X_train, y_joint)

        # print("\nTRAIN SET AFTER SAMPLING:")
        # print("Black Non-rec:", np.sum(y_joint_res == "00"))
        # print("Black Recid:  ", np.sum(y_joint_res == "01"))
        # print("White Non-rec:", np.sum(y_joint_res == "10"))
        # print("White Recid:  ", np.sum(y_joint_res == "11"))

        # SPLIT the joint label back into race and recidivism
        race_train = np.array([int(y[0]) for y in y_joint_res])
        recid_train = np.array([int(y[1]) for y in y_joint_res])

        X_train_res["race_bin"] = race_train
        X_train = X_train_res.copy()

    # --------------------------------------------------------
    # 5. RESAMPLE TEST SET (OPTIONAL)
    # --------------------------------------------------------
    if resample_test and sampler:

        y_joint_test = np.array([f"{r}{c}" for r, c in zip(race_test, recid_test)])

        # print("\nTEST SET BEFORE SAMPLING:")
        # print("Black Non-rec:", np.sum(y_joint_test == "00"))
        # print("Black Recid:  ", np.sum(y_joint_test == "01"))
        # print("White Non-rec:", np.sum(y_joint_test == "10"))
        # print("White Recid:  ", np.sum(y_joint_test == "11"))

        X_test_res, y_joint_test_res = sampler.fit_resample(X_test, y_joint_test)

        # print("\nTEST SET AFTER SAMPLING:")
        # print("Black Non-rec:", np.sum(y_joint_test_res == "00"))
        # print("Black Recid:  ", np.sum(y_joint_test_res == "01"))
        # print("White Non-rec:", np.sum(y_joint_test_res == "10"))
        # print("White Recid:  ", np.sum(y_joint_test_res == "11"))

        race_test  = np.array([int(y[0]) for y in y_joint_test_res])
        recid_test = np.array([int(y[1]) for y in y_joint_test_res])

        X_test = X_test_res.copy()
        X_test["race_bin"] = race_test

    # --------------------------------------------------------
    # 6. TRAIN MODEL
    # --------------------------------------------------------
    clf = XGBClassifier(eval_metric="logloss")
    clf.fit(X_train, recid_train)

    # --------------------------------------------------------
    # 7. PREDICT
    # --------------------------------------------------------
    y_pred = clf.predict(X_test)

    # --------------------------------------------------------
    # 8. FAIRNESS MASKS
    # --------------------------------------------------------
    black_mask = (race_test == 0)
    white_mask = (race_test == 1)

    # --------------------------------------------------------
    # 9. FAIRNESS METRICS
    # --------------------------------------------------------
    data = eval_fairness(y_pred, recid_test, black_mask, white_mask)

    return data


In [None]:
def eval_resampler(df, sampler=None, resample_test=False):
    """Variant of eval_resampler that returns the prepared (optionally resampled)
    splits instead of the fairness metrics."""

    # --------------------------------------------------------
    # 1. FILTER DATA: African-American and Caucasian only
    # --------------------------------------------------------
    df_temp = df[(df["race"] == "African-American") | (df["race"] == "Caucasian")].copy()

    # --------------------------------------------------------
    # 2. FEATURE PREPARATION
    # --------------------------------------------------------
    cols = ["age", "sex", "priors_count", "score_text"]
    X = df_temp[cols].copy()

    # score_text -> 0 = Low, 1 = Medium/High
    X["score_text"] = (X["score_text"] != "Low").astype(int)

    # Encode sex as 0/1 (1 = Male)
    X["sex"] = (X["sex"] == "Male").astype(int)

    # Encode race as 0/1 for the fairness masks (1 = Caucasian)
    df_temp["race_bin"] = (df_temp["race"] == "Caucasian").astype(int)

    # Target: two-year recidivism (cast to int)
    recid = df_temp["two_year_recid"].astype(int).values

    # Add the binary race variable as a feature
    X["race_bin"] = df_temp["race_bin"].values

    # One-hot encode any remaining categorical columns (none are left at this point)
    X = pd.get_dummies(X, drop_first=True)

    # --------------------------------------------------------
    # 3. SPLIT DATA
    # --------------------------------------------------------
    X_train, X_test, recid_train, recid_test = train_test_split(
        X, recid, test_size=0.2, random_state=42
    )

    race_train = X_train["race_bin"].values
    race_test  = X_test["race_bin"].values

    # --------------------------------------------------------
    # 4. COMBINE RACE + RECID INTO A JOINT TARGET FOR THE SAMPLER (if any)
    # --------------------------------------------------------
    if sampler:

        # joint target for fairness -> 00, 01, 10, 11
        y_joint = np.array([f"{r}{c}" for r, c in zip(race_train, recid_train)])

        # print("\nTRAIN SET BEFORE SAMPLING:")
        # print("Black Non-rec:", np.sum(y_joint == "00"))
        # print("Black Recid:  ", np.sum(y_joint == "01"))
        # print("White Non-rec:", np.sum(y_joint == "10"))
        # print("White Recid:  ", np.sum(y_joint == "11"))

        # Resampling
        X_train_res, y_joint_res = sampler.fit_resample(X_train, y_joint)

        # print("\nTRAIN SET AFTER SAMPLING:")
        # print("Black Non-rec:", np.sum(y_joint_res == "00"))
        # print("Black Recid:  ", np.sum(y_joint_res == "01"))
        # print("White Non-rec:", np.sum(y_joint_res == "10"))
        # print("White Recid:  ", np.sum(y_joint_res == "11"))

        # SPLIT the joint label back into race and recidivism
        race_train = np.array([int(y[0]) for y in y_joint_res])
        recid_train = np.array([int(y[1]) for y in y_joint_res])

        X_train_res["race_bin"] = race_train
        X_train = X_train_res.copy()

    # --------------------------------------------------------
    # 5. RESAMPLE TEST SET (OPTIONAL)
    # --------------------------------------------------------
    if resample_test and sampler:

        y_joint_test = np.array([f"{r}{c}" for r, c in zip(race_test, recid_test)])

        # print("\nTEST SET BEFORE SAMPLING:")
        # print("Black Non-rec:", np.sum(y_joint_test == "00"))
        # print("Black Recid:  ", np.sum(y_joint_test == "01"))
        # print("White Non-rec:", np.sum(y_joint_test == "10"))
        # print("White Recid:  ", np.sum(y_joint_test == "11"))

        X_test_res, y_joint_test_res = sampler.fit_resample(X_test, y_joint_test)

        # print("\nTEST SET AFTER SAMPLING:")
        # print("Black Non-rec:", np.sum(y_joint_test_res == "00"))
        # print("Black Recid:  ", np.sum(y_joint_test_res == "01"))
        # print("White Non-rec:", np.sum(y_joint_test_res == "10"))
        # print("White Recid:  ", np.sum(y_joint_test_res == "11"))

        race_test  = np.array([int(y[0]) for y in y_joint_test_res])
        recid_test = np.array([int(y[1]) for y in y_joint_test_res])

        X_test = X_test_res.copy()
        X_test["race_bin"] = race_test
    return X_train, X_test, recid_train, race_test, recid_test

In [None]:
def train_test_model(X_train, X_test, recid_train, race_test, recid_test):
    """Train an XGBoost classifier on the given split and return fairness metrics."""
    # --------------------------------------------------------
    # 6. TRAIN MODEL
    # --------------------------------------------------------
    clf = XGBClassifier(eval_metric="logloss")
    clf.fit(X_train, recid_train)

    # --------------------------------------------------------
    # 7. PREDICT
    # --------------------------------------------------------
    y_pred = clf.predict(X_test)

    # --------------------------------------------------------
    # 8. FAIRNESS MASKS
    # --------------------------------------------------------
    black_mask = (race_test == 0)
    white_mask = (race_test == 1)

    # --------------------------------------------------------
    # 9. FAIRNESS METRICS
    # --------------------------------------------------------
    data = eval_fairness(y_pred, recid_test, black_mask, white_mask)

    return data

In [None]:
eval_resampler(df, sampler=RandomUnderSampler(random_state=42))

In [18]:
data = []
index= []

index.append("Original Training - Original Test")
ev_0 = eval_resampler(df)
# print(ev_0)
data.append(ev_0)
index.append("SMOTE Training - Original Test")
ev_1 = eval_resampler(df, sampler=SMOTE(random_state=42))
# print(ev_1)
data.append(ev_1)
index.append("SMOTE Training - SMOTE Test")
data.append(eval_resampler(df, sampler=SMOTE(random_state=42), resample_test=True))
index.append("Oversampling Training - Original Test")
data.append(eval_resampler(df, sampler=RandomOverSampler(random_state=42)))
index.append("Oversampling Training - Oversampling Test")
data.append(eval_resampler(df, sampler=RandomOverSampler(random_state=42), resample_test=True))
index.append("Undersampling Training - Original Test")
data.append(eval_resampler(df, sampler=RandomUnderSampler(random_state=42)))
index.append("Undersampling Training - Undersampling Test")
data.append(eval_resampler(df, sampler=RandomUnderSampler(random_state=42), resample_test=True))


# clear_output(wait=True)



In [19]:
pd.DataFrame(data, index=index)

Unnamed: 0,TPR_w,TPR_b,FPR_w,FPR_b,Eq. Oportunity,Pred. Equality,Eq. odds,Accuracy
Original Training - Original Test,0.391753,0.648438,0.122112,0.332378,0.256685,0.210266,0.466951,0.669919
SMOTE Training - Original Test,0.536082,0.630208,0.267327,0.312321,0.094126,0.044994,0.13912,0.656911
SMOTE Training - SMOTE Test,0.46875,0.630208,0.242188,0.309896,0.161458,0.067708,0.229167,0.636719
Oversampling Training - Original Test,0.541237,0.630208,0.270627,0.332378,0.088971,0.061751,0.150722,0.65122
Oversampling Training - Oversampling Test,0.526042,0.630208,0.270833,0.34375,0.104167,0.072917,0.177083,0.635417
Undersampling Training - Original Test,0.561856,0.645833,0.254125,0.303725,0.083978,0.0496,0.133577,0.671545
Undersampling Training - Undersampling Test,0.561856,0.613402,0.268041,0.324742,0.051546,0.056701,0.108247,0.645619
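
As a sanity check on how the columns relate, the sketch below copies three rows from the table above and verifies that `Eq. odds` is the sum of the two gap columns, then expresses each gap relative to the unresampled baseline. The dictionary layout and variable names are illustrative only.

```python
# (Eq. Oportunity, Pred. Equality, Eq. odds) copied from the results table
rows = {
    "Original Training - Original Test": (0.256685, 0.210266, 0.466951),
    "SMOTE Training - Original Test": (0.094126, 0.044994, 0.13912),
    "Undersampling Training - Original Test": (0.083978, 0.0496, 0.133577),
}

# Largest deviation between the reported Eq. odds and the sum of the two gaps
max_error = max(
    abs(eq_opp + pred_eq - eq_odds) for eq_opp, pred_eq, eq_odds in rows.values()
)

# Relative reduction of the Eq. odds gap with respect to the baseline row
baseline = rows["Original Training - Original Test"][2]
relative_reduction = {name: 1 - vals[2] / baseline for name, vals in rows.items()}

for name, value in relative_reduction.items():
    print(f"{name}: {value:.1%} lower Eq. odds gap than the baseline")
```

Any residual in `max_error` comes from the six-decimal rounding of the displayed values.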


### Training a different classifier for each race

In [20]:
df_temp = df[(df['race'] == 'African-American') | (df['race'] == 'Caucasian')]
cols = ['age', 'sex', 'race', 'priors_count', 'score_text']
# .copy() avoids pandas' SettingWithCopyWarning on the assignment below
X, recid = df_temp[cols].copy(), df_temp['two_year_recid']
X['score_text'] = [0 if y_i == 'Low' else 1 for y_i in X['score_text']]
X = pd.get_dummies(X, drop_first=True)
X_train, X_test, recid_train, recid_test = train_test_split(X, recid.values, test_size=0.2, random_state=42)

# Split the training set by race
X_train_black, recid_train_black = X_train[X_train['race_Caucasian'] == 0], recid_train[X_train['race_Caucasian'] == 0]
X_train_white, recid_train_white = X_train[X_train['race_Caucasian'] == 1], recid_train[X_train['race_Caucasian'] == 1]
# Separate the binarized COMPAS score (score_text), which is the training target
X_train_black, y_train_black = X_train_black.drop(columns='score_text'), X_train_black['score_text']
X_train_white, y_train_white = X_train_white.drop(columns='score_text'), X_train_white['score_text']

clf_black = XGBClassifier(eval_metric='logloss')
clf_white = XGBClassifier(eval_metric='logloss')

# Fit one model per race
clf_black.fit(X_train_black, y_train_black)
clf_white.fit(X_train_white, y_train_white)

# Split the test set by race
X_test_black, recid_test_black = X_test[X_test['race_Caucasian'] == 0], recid_test[X_test['race_Caucasian'] == 0]
X_test_white, recid_test_white = X_test[X_test['race_Caucasian'] == 1], recid_test[X_test['race_Caucasian'] == 1]
# Drop score_text from the test features (kept aside as y_test_*)
X_test_black, y_test_black = X_test_black.drop(columns='score_text'), X_test_black['score_text']
X_test_white, y_test_white = X_test_white.drop(columns='score_text'), X_test_white['score_text']

# Predict with each model and concatenate the results: black rows first, then white rows
y_pred_black = clf_black.predict(X_test_black)
y_pred_white = clf_white.predict(X_test_white)
y_pred = np.concatenate((y_pred_black, y_pred_white))
recid_test = np.concatenate((recid_test_black, recid_test_white))
black_mask = np.array([True]*len(y_pred_black) + [False]*len(y_pred_white))
white_mask = np.array([False]*len(y_pred_black) + [True]*len(y_pred_white))

index.append("Split by race")
data.append(eval_fairness(y_pred, recid_test, black_mask, white_mask))



### Removing race attribute

In [21]:
# Train without the race variable
df_temp = df[(df['race'] == 'African-American') | (df['race'] == 'Caucasian')]
cols = ['age', 'sex', 'race', 'priors_count', 'score_text']
# .copy() avoids pandas' SettingWithCopyWarning on the assignment below
X, recid = df_temp[cols].copy(), df_temp['two_year_recid']
X['score_text'] = [0 if y_i == 'Low' else 1 for y_i in X['score_text']]
X = pd.get_dummies(X, drop_first=True)
X_train, X_test, recid_train, recid_test = train_test_split(X, recid.values, test_size=0.2, random_state=42)

# Drop race from the features; the target is the binarized COMPAS score
X_train, y_train = X_train.drop(columns=['race_Caucasian', 'score_text']), X_train['score_text']
# Train the model without race
clf = XGBClassifier(eval_metric='logloss')
clf.fit(X_train, y_train)

# Predict on the test set with the same columns removed
y_pred = clf.predict(X_test.drop(columns=['race_Caucasian', 'score_text']))
# Race is still used to build the group masks for evaluation
black_mask = X_test['race_Caucasian'] == 0
white_mask = X_test['race_Caucasian'] == 1

index.append("Remove race attribute")
data.append(eval_fairness(y_pred, recid_test, black_mask, white_mask))



In [28]:
df_results = pd.DataFrame(data, index=index)

In [30]:
df_results

Unnamed: 0,TPR_w,TPR_b,FPR_w,FPR_b,Eq. Oportunity,Pred. Equality,Eq. odds,Accuracy
Original Training - Original Test,0.391753,0.648438,0.122112,0.332378,0.256685,0.210266,0.466951,0.669919
SMOTE Training - Original Test,0.536082,0.630208,0.267327,0.312321,0.094126,0.044994,0.13912,0.656911
SMOTE Training - SMOTE Test,0.46875,0.630208,0.242188,0.309896,0.161458,0.067708,0.229167,0.636719
Oversampling Training - Original Test,0.541237,0.630208,0.270627,0.332378,0.088971,0.061751,0.150722,0.65122
Oversampling Training - Oversampling Test,0.526042,0.630208,0.270833,0.34375,0.104167,0.072917,0.177083,0.635417
Undersampling Training - Original Test,0.561856,0.645833,0.254125,0.303725,0.083978,0.0496,0.133577,0.671545
Undersampling Training - Undersampling Test,0.561856,0.613402,0.268041,0.324742,0.051546,0.056701,0.108247,0.645619
Split by race,0.35567,0.703125,0.188119,0.389685,0.347455,0.201566,0.549021,0.64878
Remove race attribute,0.407216,0.674479,0.181518,0.355301,0.267263,0.173783,0.441045,0.65935
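
To make the table easier to scan, the sketch below ranks the strategies evaluated on the original test set by their `Eq. odds` gap, smallest first. The values are copied from the table above; the dictionary layout is illustrative, and the ranking is a reading aid rather than a statistical comparison.

```python
# (Eq. odds, Accuracy) copied from the results table, original test set only
results = {
    "Original Training - Original Test": (0.466951, 0.669919),
    "SMOTE Training - Original Test": (0.13912, 0.656911),
    "Oversampling Training - Original Test": (0.150722, 0.65122),
    "Undersampling Training - Original Test": (0.133577, 0.671545),
    "Split by race": (0.549021, 0.64878),
    "Remove race attribute": (0.441045, 0.65935),
}

# Sort strategy names by their equalized-odds gap (lower is fairer)
ranking = sorted(results, key=lambda name: results[name][0])

for name in ranking:
    eq_odds, accuracy = results[name]
    print(f"{name:<40} Eq. odds = {eq_odds:.3f}  Accuracy = {accuracy:.3f}")
```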


In [31]:
df_results.style \
    .format("{:.2f}") \
    .background_gradient(
        cmap="coolwarm",  # blue = low, red = high
        axis=None         # compute the color range over the whole table
    )

Unnamed: 0,TPR_w,TPR_b,FPR_w,FPR_b,Eq. Oportunity,Pred. Equality,Eq. odds,Accuracy
Original Training - Original Test,0.39,0.65,0.12,0.33,0.26,0.21,0.47,0.67
SMOTE Training - Original Test,0.54,0.63,0.27,0.31,0.09,0.04,0.14,0.66
SMOTE Training - SMOTE Test,0.47,0.63,0.24,0.31,0.16,0.07,0.23,0.64
Oversampling Training - Original Test,0.54,0.63,0.27,0.33,0.09,0.06,0.15,0.65
Oversampling Training - Oversampling Test,0.53,0.63,0.27,0.34,0.1,0.07,0.18,0.64
Undersampling Training - Original Test,0.56,0.65,0.25,0.3,0.08,0.05,0.13,0.67
Undersampling Training - Undersampling Test,0.56,0.61,0.27,0.32,0.05,0.06,0.11,0.65
Split by race,0.36,0.7,0.19,0.39,0.35,0.2,0.55,0.65
Remove race attribute,0.41,0.67,0.18,0.36,0.27,0.17,0.44,0.66


## SHAP