# Exploring models trained on the COMPAS Recidivism Racial Bias dataset

<a href="https://colab.research.google.com/drive/1JvK4wuB7aAgNSOfY6fJRp677ivsoQNgO" target="_blank">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab">
</a>

Return to the [castle](https://github.com/Nkluge-correa/TeenyTinyCastle).

Fairness and interpretability are two important considerations in machine learning, and they are closely related. Fairness refers to the absence of bias or discrimination in the ML models and their outputs, while interpretability refers to the ability to understand and explain how the models make their predictions.

In some cases, improving a model's interpretability can help identify and correct sources of bias or discrimination, improving the model's fairness. For example, suppose a model is biased against a particular group of people. In that case, it may be necessary to examine the features the model uses to make its predictions and determine whether any of these features unfairly influence the model's output. Hence, by improving the interpretability of the model, it may be possible to identify these problematic features and make changes to reduce bias and discrimination. Conversely, fairness can also be used as a metric to evaluate the interpretability of a model, given that a more interpretable model may be easier to assess for bias and discrimination, as it is easier to understand how the model is making its predictions.

Therefore, fairness and interpretability are closely linked, and it is important to consider both aspects when developing ML models. We will work with these two principles in this notebook using the `COMPAS` dataset.

<img src="https://static.propublica.org/projects/algorithmic-bias/assets/img/generated/opener-b-crop-1200*675-00796e.jpg" width="600"/>

Source: [ProPublica](https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing).

**Correctional Offender Management Profiling for Alternative Sanctions** (`COMPAS`) is a case management and decision support tool developed and owned by Northpointe (now [Equivant](https://www.equivant.com/)) used by USA courts to assess the likelihood of a defendant becoming a recidivist. In short, the `COMPAS` software uses an algorithm to assess potential recidivism risk. Northpointe created risk scales for general and **violent recidivism**, and for **pretrial misconduct**.

> **Note**: all datasets and models related to the course and repo are in the Hub. 🤗

A general critique of the use of proprietary software such as COMPAS is that since the algorithms it uses are **trade secrets**, they cannot be examined by the public and affected parties, which may violate due process. Another general criticism of machine-learning-based algorithms is that since they are data-dependent, the software will likely yield biased results if the data used in training is biased.

To test this critique, we will create a classifier from scratch and train it on the COMPAS dataset.



In [27]:
!pip install datasets -q

from datasets import load_dataset

# Load the datasets from the hub
dataset = load_dataset('AiresPucrs/COMPAS', split="train")

# Turn the datasets into a pandas.DataFrame
df = dataset.to_pandas()

After downloading our dataset, we need to eliminate the labels (scores and categories) that the original algorithm produced. We will only use a subset of the features from the original dataset in this tutorial. Also, for simplicity, we are merging the `Low` and `Medium` labels to turn this classification task into a binary problem (Fairness analyses are more straightforward in these cases).

> **Note:** `High` risk samples represent only 25% of our dataset, and these are precisely the cases we want to distinguish better.

In [28]:
import pandas as pd

# Create a Label column
df['label'] = df['score_text'].apply(lambda x: 0 if x == 'High' else 1)

# Select the features we will use
features = ['sex', 'age_cat', 'race',
        'juv_fel_count', 'juv_misd_count',
        'juv_other_count', 'priors_count',
        'days_b_screening_arrest', 'c_days_from_compas',
        'c_charge_degree', 'is_recid', 'is_violent_recid',
        'label']

df = df[features].dropna()
df.reset_index(inplace=True, drop=True)

with pd.option_context('display.max_columns', None):
    display(df)

Unnamed: 0,sex,age_cat,race,juv_fel_count,juv_misd_count,juv_other_count,priors_count,days_b_screening_arrest,c_days_from_compas,c_charge_degree,is_recid,is_violent_recid,label
0,Male,Greater than 45,Other,0,0,0,0,-1.0,1.0,(F3),0,0,1
1,Male,Greater than 45,Other,0,0,0,0,-1.0,1.0,(F3),0,0,1
2,Male,25 - 45,African-American,0,0,0,0,-1.0,1.0,(F3),1,1,1
3,Male,Less than 25,African-American,0,0,1,4,-1.0,1.0,(F3),1,0,1
4,Male,Less than 25,African-American,0,0,1,4,-1.0,1.0,(F3),1,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...
17014,Female,25 - 45,African-American,0,0,0,5,-1.0,1.0,(M1),0,0,1
17015,Male,Greater than 45,Other,0,0,0,0,-1.0,1.0,(F2),0,0,1
17016,Female,25 - 45,African-American,0,0,0,3,-1.0,1.0,(M1),0,0,1
17017,Female,Less than 25,Hispanic,0,0,0,2,-2.0,2.0,(F3),1,0,1


Now, let us see how our target is related to the sensitive attributes of our dataset (`Age`, `Race`, and `Sex`).

In [29]:
high_risk = []
low_risk = []

for element in list(df['sex'].unique()):
    a = df[df['sex'] == element]['label'].value_counts()[0]
    b = df[df['sex'] == element]['label'].value_counts()[1]
    high_risk.append(a)
    low_risk.append(b)

import plotly.graph_objects as go

fig = go.Figure(data=[
    go.Bar(name='High Risk', x=list(df['sex'].unique()), y=high_risk),
    go.Bar(name='Low Risk', x=list(df['sex'].unique()), y=low_risk)
])

fig.update_layout(
    barmode='group',
    template='plotly_dark',
    xaxis_title="<b>Sex</b>",
    yaxis_title="<b>Risk by Sex</b>",
    title='Distribution of <i>Risk Scores</i> by "Sex"',
    paper_bgcolor='rgba(0, 0, 0, 0)',
    plot_bgcolor='rgba(0, 0, 0, 0)',
    )
fig.show()

high_risk = []
low_risk = []

for element in list(df['age_cat'].unique()):
    a = df[df['age_cat'] == element]['label'].value_counts()[0]
    b = df[df['age_cat'] == element]['label'].value_counts()[1]
    high_risk.append(a)
    low_risk.append(b)

import plotly.graph_objects as go

fig = go.Figure(data=[
    go.Bar(name='High Risk', x=list(df['age_cat'].unique()), y=high_risk),
    go.Bar(name='Low Risk', x=list(df['age_cat'].unique()), y=low_risk)
])

fig.update_layout(
    barmode='group',
    template='plotly_dark',
    xaxis_title="<b>Age</b>",
    yaxis_title="<b>Risk by Age</b>",
    title='Distribution of <i>Risk Scores</i> by "Age"',
    paper_bgcolor='rgba(0, 0, 0, 0)',
    plot_bgcolor='rgba(0, 0, 0, 0)',
    )
fig.show()

high_risk = []
low_risk = []

for element in list(df['race'].unique()):
    a = df[df['race'] == element]['label'].value_counts()[0]
    b = df[df['race'] == element]['label'].value_counts()[1]
    high_risk.append(a)
    low_risk.append(b)

import plotly.graph_objects as go

fig = go.Figure(data=[
    go.Bar(name='High Risk', x=list(df['race'].unique()), y=high_risk),
    go.Bar(name='Low Risk', x=list(df['race'].unique()), y=low_risk)
])

fig.update_layout(
    barmode='group',
    template='plotly_dark',
    xaxis_title="<b>Race</b>",
    yaxis_title="<b>Risk by Race</b>",
    title='Distribution of <i>Risk Scores</i> by "Race"',
    paper_bgcolor='rgba(0, 0, 0, 0)',
    plot_bgcolor='rgba(0, 0, 0, 0)',
    )
fig.show()

Samples with the features `African-American`, `Male`, and `25-45` represent this dataset's bulk of high-risk samples. Let us see if our sensitive attributes correlate with our label. If they are, this is already a sign that our future model could inherit these biases against a specific unprivileged class.

> **Note: To be able to calculate correlations, let us transform all categorical values into numbers.**

In [30]:
from sklearn.preprocessing import LabelEncoder
import plotly.express as px

corr_df = df.copy()

le = LabelEncoder()

for column in list(set(df.columns) - set(df._get_numeric_data().columns)):
    corr_df[column] = le.fit_transform(corr_df[column])

fig = px.imshow(corr_df.corr(numeric_only=True).values,
                labels=dict(x="Features", y="Features"),
                x=list(corr_df.columns),
                y=list(corr_df.columns),
                text_auto=True
                )

fig.update_xaxes(side='top')

fig.update_layout(template='plotly_dark',
                  title='Correlation Matrix',
                  coloraxis_showscale=False,
                  paper_bgcolor='rgba(0, 0, 0, 0)',
                  plot_bgcolor='rgba(0, 0, 0, 0)')
fig.show()

According to the correlation scores, `race` has an alarming 0.22 correlation with our label. Let us see how this will impact our future model.

We will create two classifiers to deal with the classification problem: a `RandomForestClassifier` and a `LogisticRegressor`. We will make two classifiers to (1) compare their performance and (2) because analysis of coefficients is not possible with forest-type classifiers (decision trees).

In [31]:
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.model_selection import train_test_split
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline

# Create a seed for reproducibility
seed = 42

# Define features and labels
X, y = df[df.columns.values.tolist()[0:12]], df[df.columns.values.tolist()[-1]]

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.2,
    random_state=seed
)

# Preprocess features
preprocess = make_column_transformer(
    (StandardScaler(), ['juv_fel_count', 'juv_misd_count',
    'juv_other_count', 'priors_count', 'days_b_screening_arrest',
    'c_days_from_compas']),
    (OneHotEncoder(), ['sex', 'age_cat', 'race', 'c_charge_degree',
    'is_recid', 'is_violent_recid']))

from sklearn.ensemble import RandomForestClassifier

# Create an instance of a `RandomForestClassifier`
model_rf = make_pipeline(
    preprocess,
    RandomForestClassifier(max_depth=3, n_estimators=500))

# Fit the model
model_rf.fit(X_train, y_train.values.ravel())

# Evaluate the model
score = model_rf.score(X_test, y_test.values.ravel())

print(f'Accuracy (Random Forest): ' + '{:.2f}'.format(score * 100) + ' %')

Accuracy (Random Forest): 76.94 %


In [32]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
import plotly.express as px

# Create an instance of a `LogisticRegression`
model_lr = make_pipeline(
    preprocess,
    LogisticRegression(penalty='l2', max_iter= 500))

# Fit the model
model_lr.fit(X_train, y_train.values.ravel())

# Evaluate the model
score = model_lr.score(X_test, y_test.values.ravel())

print(f'Accuracy (Logistic Regression): ' +
      '{:.2f}'.format(score * 100) + ' %')

# Plot results as a confusion matrix
preds = model_lr.predict(X_test)
matrix = confusion_matrix(y_test.values.ravel(), preds)

fig = px.imshow(matrix,
                labels=dict(x="Predicted", y="True label"),
                x=['High', 'Low'],
                y=['High', 'Low'],
                text_auto=True
                )

fig.update_xaxes(side='top')

fig.update_layout(template='plotly_dark',
                  title='Confusion Matrix (Logistic Regression Model)',
                  coloraxis_showscale=False,
                  paper_bgcolor='rgba(0, 0, 0, 0)',
                  plot_bgcolor='rgba(0, 0, 0, 0)')

fig.show()

Accuracy (Logistic Regression): 80.20 %


Accuracy varies considerably between classifiers. If we look at the confusion matrix of the second classifier, we see that the class the algorithm has the most trouble getting right is "`High`" (label = 0). Meanwhile, the algorithm's most significant error is in the False Negatives (individuals classified as "`Low`" who should be given "`High`" risk score).

It is worth remembering that given the original distribution of the data (only 25% of samples are labeled "`High`"). Hence, to achieve 75% accuracy on this problem, it would be sufficient for the classifier to label all entries as "`Low`."

Let us now analyze the coefficients learned by the model during its training.

In [35]:
import plotly.graph_objects as go

# Get the model's coefficients and turn them into a data frame
coefs = pd.DataFrame(
    model_lr[-1].coef_,
    columns=model_lr[:-1].get_feature_names_out(),
    index=['Coefficients']).transpose()

display(coefs)

# Create a Go.figure
fig = go.Figure(go.Bar(
    x=coefs['Coefficients'],
    y=model_lr[:-1].get_feature_names_out(),
    orientation='h'))

# Updated the range of the x axis
fig.update_xaxes(range=[model_lr[-1].coef_.min(
) + (model_lr[-1].coef_.min() * 0.1), model_lr[-1].coef_.max() + (model_lr[-1].coef_.max() * 0.1)])

# Updated the layout of the figure
fig.update_layout(
    xaxis=dict(
        tickmode='linear',
        tick0=0,
        dtick=0.5
    ),
    template='plotly_dark',
    title_text='LogisticRegression Coefficients',
    paper_bgcolor='rgba(0, 0, 0, 0)',
    plot_bgcolor='rgba(0, 0, 0, 0)'

)

fig.show()

|                                         |   Coefficients |
|:----------------------------------------|---------------:|
| standardscaler__juv_fel_count           |    -0.257249   |
| standardscaler__juv_misd_count          |    -0.155662   |
| standardscaler__juv_other_count         |    -0.121177   |
| standardscaler__priors_count            |    -0.785661   |
| standardscaler__days_b_screening_arrest |    -0.0396254  |
| standardscaler__c_days_from_compas      |    -0.111371   |
| onehotencoder__sex_Female               |     0.046411   |
| onehotencoder__sex_Male                 |    -0.0527134  |
| onehotencoder__age_cat_25 - 45          |    -0.0985223  |
| onehotencoder__age_cat_Greater than 45  |     1.11599    |
| onehotencoder__age_cat_Less than 25     |    -1.02377    |
| onehotencoder__race_African-American    |    -0.631542   |
| onehotencoder__race_Asian               |    -0.188078   |
| onehotencoder__race_Caucasian           |    -0.0938192  |
| onehotencoder__race_Hispanic            |     0.135407   |
| onehotencoder__race_Native American     |    -0.349588   |
| onehotencoder__race_Other               |     1.12132    |
| onehotencoder__c_charge_degree_(CO3)    |     0.275411   |
| onehotencoder__c_charge_degree_(F1)     |    -0.377484   |
| onehotencoder__c_charge_degree_(F2)     |    -0.255121   |
| onehotencoder__c_charge_degree_(F3)     |    -0.388378   |
| onehotencoder__c_charge_degree_(F5)     |     1.09637    |
| onehotencoder__c_charge_degree_(F6)     |     0.407385   |
| onehotencoder__c_charge_degree_(F7)     |     0.00279768 |
| onehotencoder__c_charge_degree_(M1)     |     0.091829   |
| onehotencoder__c_charge_degree_(M2)     |    -0.184265   |
| onehotencoder__c_charge_degree_(MO3)    |    -0.72925    |
| onehotencoder__c_charge_degree_(NI0)    |    -0.351153   |
| onehotencoder__c_charge_degree_(TCX)    |     0.231925   |
| onehotencoder__c_charge_degree_(X)      |     0.173631   |
| onehotencoder__is_recid_-1              |     0.0177461  |
| onehotencoder__is_recid_0               |     0.179593   |
| onehotencoder__is_recid_1               |    -0.203641   |
| onehotencoder__is_violent_recid_0       |     0.14288    |
| onehotencoder__is_violent_recid_1       |    -0.149182   |

At first glance, several things contribute equally to this classifier.

However, we need to be cautious when interpreting coefficients from linear models. Since each feature represents a measured quantity on its scale, it doesn't make sense to compare them:

- Age can range from, e.g., 16 to 100, but binary features only from 0 to 1. **This does not mean that an age of 100 has 100 times more weight than a feature with gender.**

To get a correct view, we first need to normalize these values by their `standard deviation` (which brings all the values to a standard scale).

In [36]:
X_train_preprocessed = pd.DataFrame(
    model_lr[:-1].transform(X_train), columns=model_lr[:-1].get_feature_names_out(),
)

# Get the model's coefficients (standerdized by the std) and turn them into a data frame
coefs = pd.DataFrame(
    model_lr[-1].coef_ * X_train_preprocessed.std(axis=0).values,
    columns=model_lr[:-1].get_feature_names_out(),
    index=['Coefficients (Normalized by STD)']).transpose()

display(coefs)

fig = go.Figure(go.Bar(
    x=coefs['Coefficients (Normalized by STD)'],
    y=model_lr[:-1].get_feature_names_out(),
    orientation='h'))

fig.update_xaxes(range=[model_lr[-1].coef_.min(
) + (model_lr[-1].coef_.min() * 0.1), model_lr[-1].coef_.max() + (model_lr[-1].coef_.max() * 0.1)])

fig.update_layout(
    xaxis=dict(
        tickmode='linear',
        tick0=0,
        dtick=0.5
    ),
    template='plotly_dark',
    title_text='LogisticRegression Coefficients (Normalized by STD)',
    paper_bgcolor='rgba(0, 0, 0, 0)',
    plot_bgcolor='rgba(0, 0, 0, 0)'

)

fig.show()

Unnamed: 0,Coefficients (Normalized by STD)
standardscaler__juv_fel_count,-0.257258
standardscaler__juv_misd_count,-0.155668
standardscaler__juv_other_count,-0.121181
standardscaler__priors_count,-0.78569
standardscaler__days_b_screening_arrest,-0.039627
standardscaler__c_days_from_compas,-0.111375
onehotencoder__sex_Female,0.017934
onehotencoder__sex_Male,-0.020369
onehotencoder__age_cat_25 - 45,-0.04876
onehotencoder__age_cat_Greater than 45,0.442049


Now, we have a more reliable view of what our model has learned. Its most important features are `race`, `age` and `prior`. This is something that shallow models allow us to do: opening the box to see what is inside. And maybe that is why, [for some researchers](https://www.nature.com/articles/s42256-019-0048-x), they should be the default in high-stakes decisions.

For example, a fully interpretable model (based on an analysis of our `LogisticRegression` model) could be used in this problem. A model that can be written in 4 lines of pseudocode:

````

If sample has < 25 years & is African-American & has prior counts > 0:
    Risk = High
Else:
    Risk = Low
    
````

Let us call this model the `evil_model` and see how it performs compared to our `RandomForestClassifier`.

In [38]:
# Define the evil model by applying a very biased and racist rule ...
evil_model = lambda df: [0 if (sample['age_cat'] == 'Less than 25' and sample['race'] \
                               == 'African-American' and sample['priors_count'] > 0) else 1 for _, sample in df.iterrows()]

# Make predictions on the test set
predictions = evil_model(X_test)

# Plot the confusion matrix and retrive its values
matrix = confusion_matrix(y_test, predictions)
TN, FP, FN, TP = matrix.ravel()

print(f'Accuracy (Evil-Model): {((TP + TN)/(TP + FP + TN + FN)) * 100:.2f}%')


fig = px.imshow(matrix,
                labels=dict(x="Predicted", y="True label"),
                x=['High', 'Low'],
                y=['High', 'Low'],
                text_auto=True
                )

fig.update_xaxes(side='top')

fig.update_layout(template='plotly_dark',
                  title='Confusion Matrix (Evil_Model)',
                  coloraxis_showscale=False,
                  paper_bgcolor='rgba(0, 0, 0, 0)',
                  plot_bgcolor='rgba(0, 0, 0, 0)')

fig.show()


Accuracy (Evil-Model): 74.71%


In the end:

- Accuracy (Random Forest): 76%;
- Accuracy (Evil-Model): 74%.

The `evil_model` even seems wrong in the same way as the `LogisticRegressor` (i.e., most of the mistakes are False Negatives). However, the big difference between this model and the model produced by Northpointe is that the Northpointe model is a black box. We simply do not know how it works. In the case of the `evil_model`, with accuracy similar to the Random Forest model, we know exactly what it does. And so, we can say: **This is unacceptable**.

Let us now further explore our `LogisticRegressor` by using fairness metrics.

In [40]:
from IPython.display import Markdown

def calc_fair(model, DataFrame, protected_atributte, group_priv, group_unpriv, label):
    """
    The function calc_fair computes several fairness metrics for a given machine
    learning model on a test set DataFrame. The fairness metrics calculated include
    statistical parity ratio, true positive rate, positive predictive value,
    false positive rate, accuracy, equal opportunity ratio, predictive parity
    ratio, predictive equality ratio, and accuracy equality ratio. The function
    takes in the following arguments:

    Args:
    --------
        - model: The trained machine learning model to evaluate fairness on.
        - DataFrame: The test set data used to evaluate the model.
        - protected_attribute: The name of the protected attribute in the DataFrame.
        - group_priv: The value of the protected attribute for the privileged group.
        - group_unpriv: The value of the protected attribute for the unprivileged group.
        - label: The name of the column in the DataFrame that contains the ground truth labels.

    Returns:
    --------
    The function returns a dictionary containing the fairness metric names and their
    corresponding scores. The scores are rounded to two decimal places. Additionally,
    the function returns the equalized odds as a string.
    """
    test_set = DataFrame.copy()

    test_set_priv_labels, test_set_priv = list(test_set[test_set[protected_atributte] == group_priv][label]), test_set[test_set[protected_atributte] == group_priv].drop(label, axis = 1)
    test_set_unpriv_labels, test_set_unpriv = list(test_set[test_set[protected_atributte] == group_unpriv][label]), test_set[test_set[protected_atributte] == group_unpriv].drop(label, axis = 1)

    preds_priv = model.predict(test_set_priv)
    preds_unpriv = model.predict(test_set_unpriv)

    TN_PV, FP_PV, FN_PV, TP_PV = confusion_matrix(test_set_priv_labels, preds_priv).ravel()
    TN_UPV, FP_UPV, FN_UPV, TP_UPV = confusion_matrix(test_set_unpriv_labels, preds_unpriv).ravel()

    statistical_parity_priv = (TP_PV + FP_PV)/(TP_PV + FP_PV + TN_PV + FN_PV)  # STATISTICAL PARITY RATIO
    statistical_parity_unpriv = (TP_UPV + FP_UPV)/(TP_UPV + FP_UPV + TN_UPV + FN_UPV)  # STATISTICAL PARITY RATIO
    equal_oportunity_priv = TP_PV / (TP_PV+FN_PV)  # TRUE POSITIVE RATIO
    equal_oportunity_unpriv = TP_UPV / (TP_UPV+FN_UPV)  # TRUE POSITIVE RATIO
    predictive_parity_priv = TP_PV/(TP_PV + FP_PV)  # POSITIVE PREDICTIVE VALUE
    predictive_parity_unpriv = TP_UPV/(TP_UPV + FP_UPV)  # POSITIVE PREDICTIVE VALUE
    predictive_equality_priv = FP_PV / (FP_PV+TN_PV)  # FALSE POSITIVE RATE
    predictive_equality_unpriv = FP_UPV / (FP_UPV+TN_UPV)  # FALSE POSITIVE RATE
    accuracy_equality_priv = (TP_PV + TN_PV)/(TP_PV + FP_PV + TN_PV + FN_PV)  # ACCURACY EQUALITY RATIO
    accuracy_equality_unpriv = (TP_UPV + TN_UPV)/(TP_UPV + FP_UPV + TN_UPV + FN_UPV)  # ACCURACY EQUALITY RATIO

    if statistical_parity_priv >= statistical_parity_unpriv:
        statistical_parity_ratio = statistical_parity_unpriv/statistical_parity_priv
    elif statistical_parity_priv < statistical_parity_unpriv:
        statistical_parity_ratio = statistical_parity_priv/statistical_parity_unpriv

    if equal_oportunity_priv >= equal_oportunity_unpriv:
        equal_oportunity_ratio = equal_oportunity_unpriv/equal_oportunity_priv
    elif equal_oportunity_priv < equal_oportunity_unpriv:
        equal_oportunity_ratio = equal_oportunity_priv/equal_oportunity_unpriv

    if predictive_parity_priv >= predictive_parity_unpriv:
        predictive_parity_ratio = predictive_parity_unpriv/predictive_parity_priv
    elif predictive_parity_priv < predictive_parity_unpriv:
        predictive_parity_ratio = predictive_parity_priv/predictive_parity_unpriv

    if predictive_equality_priv >= predictive_equality_unpriv:
        predictive_equality_ratio = predictive_equality_unpriv/predictive_equality_priv
    elif predictive_equality_priv < predictive_equality_unpriv:
        predictive_equality_ratio = predictive_equality_priv/predictive_equality_unpriv

    if accuracy_equality_priv >= accuracy_equality_unpriv:
        accuracy_equality_ratio = accuracy_equality_unpriv/accuracy_equality_priv
    elif accuracy_equality_priv < accuracy_equality_unpriv:
        accuracy_equality_ratio = accuracy_equality_priv/accuracy_equality_unpriv

    equalized_odds = f'TPR: {round(equal_oportunity_priv, 2)} vs {round(equal_oportunity_unpriv, 2)} <br> FPR: {round(predictive_equality_priv,2)} vs {round(predictive_equality_unpriv,2)}'

    data = {'Fairness Metrics': ['Chance of receiving the positive class - privileged',
                                'Chance of receiving the positive class - unprivileged',
                                'Statistical Parity Ratio (SPR)',
                                'True Positive Rate - privileged',
                                'True Positive Rate - unprivileged',
                                'Equal Opportunity Ratio (EOR)',
                                'Positive Predictive Value - privileged',
                                'Positive Predictive Value - unprivileged',
                                'Predictive Parity Ratio (PPR)',
                                'False Positive Rate - privileged',
                                'False Positive Rate - unprivileged',
                                'Predictive Equality Ratio (PER)',
                                'Accuracy - privileged',
                                'Accuracy - unprivileged',
                                'Accuracy Equality Ratio (AER)',
                                'Equalized Odds'],
            'Scores': [round(statistical_parity_priv, 2),
                        round(statistical_parity_unpriv, 2),
                        round(statistical_parity_ratio,2),
                        round(equal_oportunity_priv, 2),
                        round(equal_oportunity_unpriv, 2),
                        round(equal_oportunity_ratio, 2),
                        round(predictive_parity_priv,2),
                        round(predictive_parity_unpriv,2),
                        round(predictive_parity_ratio,2),
                        round(predictive_equality_priv,2),
                        round(predictive_equality_unpriv,2),
                        round(predictive_equality_ratio,2),
                        round(accuracy_equality_priv,2),
                        round(accuracy_equality_unpriv,2),
                        round(accuracy_equality_ratio,2),
                        f'TPR: {round(equal_oportunity_priv, 2)} vs {round(equal_oportunity_unpriv, 2)}. FPR: {round(predictive_equality_priv,2)} vs {round(predictive_equality_unpriv,2)}']
            }
    df = pd.DataFrame(data).set_index('Fairness Metrics')

    return pd.DataFrame(data).set_index('Fairness Metrics')

X_test['labels'] = y_test

fairness_df = calc_fair(model_lr, X_test, 'race', 'Caucasian', 'African-American', 'labels')
display(Markdown(fairness_df.to_markdown()))

| Fairness Metrics                                      | Scores                               |
|:------------------------------------------------------|:-------------------------------------|
| Chance of receiving the positive class - privileged   | 0.96                                 |
| Chance of receiving the positive class - unprivileged | 0.8                                  |
| Statistical Parity Ratio (SPR)                        | 0.83                                 |
| True Positive Rate - privileged                       | 0.99                                 |
| True Positive Rate - unprivileged                     | 0.91                                 |
| Equal Opportunity Ratio (EOR)                         | 0.93                                 |
| Positive Predictive Value - privileged                | 0.86                                 |
| Positive Predictive Value - unprivileged              | 0.75                                 |
| Predictive Parity Ratio (PPR)                         | 0.87                                 |
| False Positive Rate - privileged                      | 0.83                                 |
| False Positive Rate - unprivileged                    | 0.59                                 |
| Predictive Equality Ratio (PER)                       | 0.7                                  |
| Accuracy - privileged                                 | 0.86                                 |
| Accuracy - unprivileged                               | 0.74                                 |
| Accuracy Equality Ratio (AER)                         | 0.87                                 |
| Equalized Odds                                        | TPR: 0.99 vs 0.91. FPR: 0.83 vs 0.59 |

Let us review these results:

1.  **Chance of receiving the positive class:**
    
    -   Privileged group: 0.96
    -   Unprivileged group: 0.8
    -   **Analysis:** There is a significant disparity in the chance of receiving the positive class between the privileged and unprivileged groups, indicating potential bias.

2.  **Statistical Parity Ratio (SPR):**
    
    -   SPR: 0.83
    -   **Analysis:** An SPR less than 1 suggests that a positive outcome is less likely for the unprivileged group, indicating a disparity in fairness.

3.  **True Positive Rate (TPR):**
    
    -   Privileged group: 0.99
    -   Unprivileged group: 0.91
    -   **Analysis:** While the TPR for both groups is relatively high, there is still a notable difference, indicating a potential disparate impact on true positives.

4.  **Equal Opportunity Ratio (EOR):**
    
    -   EOR: 0.93
    -   **Analysis:** An EOR close to 1 suggests a relatively balanced opportunity for positive outcomes.

5.  **Positive Predictive Value (PPV):**
    
    -   Privileged group: 0.86
    -   Unprivileged group: 0.75
    -   **Analysis:** The disparity in PPV values suggests differences in the accuracy of positive predictions between the two groups.

6.  **Predictive Parity Ratio (PPR):**
    
    -   PPR: 0.87
    -   **Analysis:** PPR less than 1 indicates potential disparities in the ratio of true positives to the total predicted positives between groups.

7.  **False Positive Rate (FPR):**
    
    -   Privileged group: 0.83
    -   Unprivileged group: 0.59
    -   **Analysis:** The higher FPR for the privileged group indicates a potential disparity in false positives between groups.

8.  **Predictive Equality Ratio (PER):**
    
    -   PER: 0.7
    -   **Analysis:** PER less than 1 suggests potential disparities in the ratio of false positives to the total predicted negatives between groups.

9.  **Accuracy:**
    
    -   Privileged group: 0.86
    -   Unprivileged group: 0.74
    -   **Analysis:** There is a noticeable difference in overall accuracy between the two groups, indicating potential bias.

10.  **Accuracy Equality Ratio (AER):**
    
    -   AER: 0.87
    -   **Analysis:** AER less than 1 suggests disparities in overall accuracy between the two groups.

11.  **Equalized Odds:**
    
    -   TPR: 0.99 vs. 0.91
    -   FPR: 0.83 vs. 0.59
    -   **Analysis:** The TPR and FPR values indicate disparities in both true positive and false positive rates between the privileged and unprivileged groups.

> **Note:** To learn more about ML Fairness, and how these scores are computed visit our directory on [ML Fairness](https://github.com/Nkluge-correa/TeenyTinyCastle/tree/master/ML-Fairness).

Now, let's create an `Explainer` around our model using [Dalex](https://dalex.drwhy.ai/python/).

> **Note:** The `dalex` library is a versatile tool designed to analyze and interpret the behavior of machine learning models. Its main feature is the Explainer object, which acts as a wrapper around predictive models, enabling users to gain insights into model intricacies. By providing model-level and prediction-level explanations, dalex aids in understanding the inner workings of complex models. Additionally, the library offers fairness methods and interactive exploration dashboards, empowering users to explore, compare, and enhance the transparency of their machine learning models.

In [42]:
!pip install dalex -q

import dalex as dx

# Wrap the model with an `Explainer`
model_lr_exp = dx.Explainer(model_lr,
                            X, y, label='Logistic Regression explainer',
                            model_type='binary classification')

[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/1.0 MB[0m [31m?[0m eta [36m-:--:--[0m[2K     [91m━━━━[0m[91m╸[0m[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.1/1.0 MB[0m [31m3.4 MB/s[0m eta [36m0:00:01[0m[2K     [91m━━━━━━━━━━━━━━━━━━━━[0m[91m╸[0m[90m━━━━━━━━━━━━━━━━━━━[0m [32m0.5/1.0 MB[0m [31m7.7 MB/s[0m eta [36m0:00:01[0m[2K     [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[91m╸[0m [32m1.0/1.0 MB[0m [31m10.2 MB/s[0m eta [36m0:00:01[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.0/1.0 MB[0m [31m8.9 MB/s[0m eta [36m0:00:00[0m
[?25h  Preparing metadata (setup.py) ... [?25l[?25hdone
  Building wheel for dalex (setup.py) ... [?25l[?25hdone
Preparation of a new explainer is initiated

  -> data              : 17019 rows 12 cols
  -> target variable   : Parameter 'y' was a pandas.Series. Converted to a numpy.ndarray.
  -> target variable   : 17019 values
  -> model_class       : sklearn.l

Now, we will set up a break down plot of two different samples to finish our interpretability analysis. One of the samples has the sensitive attribute "`African-American`", and the other has the attribute "`Caucasian`".

> **Note:** To learn more about Break-Down plots, visit this [tutorial](https://github.com/Nkluge-correa/TeenyTinyCastle/blob/master/ML-Explainability/Tabular/interpreter_for_tabular.ipynb).

In [51]:
# Print the probability distribution for a sample [[High, Low]]
print('Probabilities for sample_1 (African-American): ', model_lr.predict_proba(pd.DataFrame(df[df['label'] == 0]\
        .drop('label', axis = 1).iloc[57]).transpose()))

# Display the features of the sample
display(df[df['label'] == 0].drop('label', axis = 1).iloc[27])

print('Probabilities for sample_1 (Caucasian): ', model_lr.predict_proba(pd.DataFrame(df[df['label'] == 1]\
        .drop('label', axis = 1).iloc[14]).transpose()))

display( df[df['label'] == 1].drop('label', axis = 1).iloc[14])


Probabilities for sample_1 (African-American):  [[0.96093348 0.03906652]]


sex                                  Female
age_cat                        Less than 25
race                       African-American
juv_fel_count                             0
juv_misd_count                            0
juv_other_count                           0
priors_count                              2
days_b_screening_arrest                -1.0
c_days_from_compas                      1.0
c_charge_degree                        (F2)
is_recid                                  0
is_violent_recid                          0
Name: 102, dtype: object

Probabilities for sample_1 (Caucasian):  [[0.04934203 0.95065797]]


sex                           Female
age_cat                      25 - 45
race                       Caucasian
juv_fel_count                      0
juv_misd_count                     0
juv_other_count                    0
priors_count                       0
days_b_screening_arrest         -1.0
c_days_from_compas               1.0
c_charge_degree                 (M1)
is_recid                           0
is_violent_recid                   0
Name: 14, dtype: object

In the cell below, using the samples from the African-American and Caucasian groups, the logistic regression explainer (model_lr_exp) is then employed to calculate prediction breakdowns for each sample using the `break_down` method. Subsequently, the prediction breakdowns are visualized using Plotly.

In [52]:
# Format the sample
sample_african_american = df[df['label'] == 0].drop('label', axis = 1).iloc[57]

# Pass the sample by the `Explainer`
bd_sample_1 = model_lr_exp.predict_parts(sample_african_american,
                                         type='break_down')

# Plot the results!
fig = bd_sample_1.plot(show=False)

# Some extra styling
fig.update_layout(
    template='plotly_dark',
    title='sample_african_american (Logistic Regression Explainer)',
    paper_bgcolor='rgba(0, 0, 0, 0)',
    plot_bgcolor='rgba(0, 0, 0, 0)',
    font_color='white')

fig.show()

sample_caucasian = df[df['label'] == 1].drop('label', axis = 1).iloc[14]

bd_sample_2 = model_lr_exp.predict_parts(sample_caucasian,
                                         type='break_down')

fig = bd_sample_2.plot(show=False)

fig.update_layout(
    template='plotly_dark',
    title='sample_caucasian (Logistic Regression Explainer)',
    paper_bgcolor='rgba(0, 0, 0, 0)',
    plot_bgcolor='rgba(0, 0, 0, 0)',
    font_color='white')

fig.show()

According to these graphs, `caucasian` is the third most influential feature for attributing a beneficial label. Meanwhile, African-American is the fourth most influential feature for attributing the high-risk prediction. A fair algorithm should not even consider such attributes, or they should not influence the decision-making process. But this is not the case in our example. In summary, this visual analysis of prediction breakdowns reveals that the model in this example is influenced by sensitive attributes such as race, underscoring the importance of addressing fairness considerations in machine learning models.

----

Return to the [castle](https://github.com/Nkluge-correa/TeenyTinyCastle).