
# Creating _Ceteris-paribus Profiles_ with the COMPAS dataset

Return to the [castle](https://github.com/Nkluge-correa/teeny-tiny_castle).

**A possible criticism of fairness measures based only on statistical differences between groups and subgroups is that "_correlation is not causation._"**

**If you think about it, could we say that the reason individual $X$ (belonging to the _unprivileged group_) was assigned to the negative class is that the classifier in question more often favors the privileged group for the positive class? If "_correlation is not causation_", we cannot. Statistical fairness indices (_Statistical Parity, Equalized Odds, Equal Opportunity_, etc.) say something about the "_population_" and not the "_individual_."**
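**To make the "_population_" nature of these indices concrete, here is a minimal sketch that computes two of them on toy data. The arrays, the group names (`"p"` and `"u"`), and the helper functions are hypothetical and are not part of the COMPAS analysis below.**

```python
import numpy as np

# Toy data (hypothetical): 1 = positive class, 0 = negative class.
y_true = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 0])
y_pred = np.array([1, 1, 0, 1, 0, 1, 0, 0, 0, 0])
# Group membership: "p" = privileged, "u" = unprivileged.
group = np.array(["p", "p", "p", "p", "p", "u", "u", "u", "u", "u"])

def positive_rate(y_pred, mask):
    """Share of samples in the group that receive the positive class."""
    return y_pred[mask].mean()

def true_positive_rate(y_true, y_pred, mask):
    """Share of truly positive samples in the group predicted as positive."""
    positives = mask & (y_true == 1)
    return y_pred[positives].mean()

priv, unpriv = group == "p", group == "u"

# Both differences are aggregates over groups: they describe how the
# classifier treats each group on average, not why one person got a label.
statistical_parity_diff = positive_rate(y_pred, unpriv) - positive_rate(y_pred, priv)
equal_opportunity_diff = (true_positive_rate(y_true, y_pred, unpriv)
                          - true_positive_rate(y_true, y_pred, priv))

print(f"Statistical parity difference: {statistical_parity_diff:.2f}")
print(f"Equal opportunity difference:  {equal_opportunity_diff:.2f}")
```

**Both numbers are negative here, meaning the unprivileged group receives the positive class less often, yet neither says anything about a specific individual.**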

**To address this issue, another class of fairness metrics (and interpretability tools) exists. In ML Fairness, these tools/methods are called "[Causality-Based Fairness](https://www.frontiersin.org/articles/10.3389/fdata.2022.892837/full)", and in ML Interpretability (XAI), we call these "_[What-if models](https://ema.drwhy.ai/ceterisParibus.html#ref-ICEbox)_" or "_Individual Conditional Expectations_."** 

![what-it-gif](https://c.tenor.com/Ymg4RcW4klYAAAAC/what-if-neon.gif)

**In this notebook, we will explore the [COMPAS dataset](https://www.kaggle.com/datasets/danofer/compass?select=cox-violent-parsed_filt.csv) with two methods: Ceteris-paribus Profiles (_a "what-if" model_) and Counterfactual Fairness (_a Causality-Based Fairness metric_).**

**For another interpretability/fairness analysis of a classifier trained on the COMPAS dataset, go to [this notebook](xxx).**

**About this dataset:**

> *Correctional Offender Management Profiling for Alternative Sanctions* (COMPAS) is a case management and decision support tool developed and owned by Northpointe (now [Equivant](https://www.equivant.com/)) used by USA courts to assess the likelihood of a defendant becoming a recidivist. If you would like to know more about the COMPAS dataset and its algorithm, go to the ProPublica report on "[How they did it](https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm)". In this notebook, we will be using *COMPAS Recidivism Racial Bias* (available on [Kaggle](https://www.kaggle.com/datasets/danofer/compass?select=cox-violent-parsed_filt.csv)). More specifically, we will be using the parsed data from this dataset (`cox-violent-parsed_filt.csv`), which contains 18316 samples.

**To create a classifier from scratch, we first need to get rid of the labels (scores and categories) that the original algorithm produced. To improve performance, we also exclude features in which more than $10\%$ of the values are missing ("`NaN`"), and then drop the remaining rows that still contain missing values. This results in a final sample size of $17019$.**

**In the end, we are left with a dataset containing 12 features + the label. For simplicity, we are merging the "*Low*" and "*Medium*" labels, to turn this classification task into a binary problem. "*High Risk*" samples represent only $25\%$ of our dataset, and these are exactly the cases we want to better distinguish.**

In [1]:
import pandas as pd

df = pd.read_csv("data/COMPAS.csv")

# Report every feature with more than 10% missing values.
for column in df.columns:
    nan_pct = round((df[column].isna().sum() / len(df)) * 100, 2)
    if nan_pct > 10.0:
        print(f'Feature {column} : {nan_pct}% is NaN values.')

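def turn_to_binary(score):
    # 'Low' and 'Medium' become 1 (positive class, "Low Risk");
    # anything else, including 'High', becomes 0 (negative class, "High Risk").
    if score == 'Low' or score == 'Medium':
        return 1
    else:
        return 0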

df['label'] = df['score_text'].apply(turn_to_binary)

df = df[['sex', 'age_cat', 'race',
        'juv_fel_count', 'juv_misd_count',
        'juv_other_count', 'priors_count',
        'days_b_screening_arrest', 'c_days_from_compas',
        'c_charge_degree', 'is_recid', 'is_violent_recid',
        'label']].dropna()
        
with pd.option_context('display.max_columns', None):                     
    display(df)

Feature id : 39.94% is NaN values.
Feature r_charge_degree : 54.05% is NaN values.
Feature r_days_from_arrest : 65.28% is NaN values.
Feature r_offense_date : 54.05% is NaN values.
Feature r_charge_desc : 54.81% is NaN values.
Feature r_jail_in : 65.28% is NaN values.
Feature violent_recid : 100.0% is NaN values.
Feature vr_charge_degree : 92.69% is NaN values.
Feature vr_offense_date : 92.69% is NaN values.
Feature vr_charge_desc : 92.69% is NaN values.


| | sex | age_cat | race | juv_fel_count | juv_misd_count | juv_other_count | priors_count | days_b_screening_arrest | c_days_from_compas | c_charge_degree | is_recid | is_violent_recid | label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Male | Greater than 45 | Other | 0 | 0 | 0 | 0 | -1.0 | 1.0 | (F3) | 0 | 0 | 1 |
| 1 | Male | Greater than 45 | Other | 0 | 0 | 0 | 0 | -1.0 | 1.0 | (F3) | 0 | 0 | 1 |
| 3 | Male | 25 - 45 | African-American | 0 | 0 | 0 | 0 | -1.0 | 1.0 | (F3) | 1 | 1 | 1 |
| 4 | Male | Less than 25 | African-American | 0 | 0 | 1 | 4 | -1.0 | 1.0 | (F3) | 1 | 0 | 1 |
| 5 | Male | Less than 25 | African-American | 0 | 0 | 1 | 4 | -1.0 | 1.0 | (F3) | 1 | 0 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 18311 | Female | 25 - 45 | African-American | 0 | 0 | 0 | 5 | -1.0 | 1.0 | (M1) | 0 | 0 | 1 |
| 18312 | Male | Greater than 45 | Other | 0 | 0 | 0 | 0 | -1.0 | 1.0 | (F2) | 0 | 0 | 1 |
| 18313 | Female | 25 - 45 | African-American | 0 | 0 | 0 | 3 | -1.0 | 1.0 | (M1) | 0 | 0 | 1 |
| 18314 | Female | Less than 25 | Hispanic | 0 | 0 | 0 | 2 | -2.0 | 2.0 | (F3) | 1 | 0 | 1 |
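
**The cell above only prints the features with more than $10\%$ missing values and then selects the remaining columns by hand. As a minimal sketch of how the same filter could be applied programmatically, consider the toy frame below (the frame and its column names are hypothetical, not the COMPAS columns).**

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical): "mostly_missing" is 75% NaN, the rest is complete.
toy = pd.DataFrame({
    "age": [23, 35, 47, 51],
    "mostly_missing": [np.nan, np.nan, np.nan, 1.0],
    "priors": [0, 2, 5, 1],
})

# Fraction of missing values per column.
nan_share = toy.isna().mean()

# Keep only columns with at most 10% missing values, then drop any row
# that still contains a NaN.
kept_columns = nan_share[nan_share <= 0.10].index
filtered = toy[kept_columns].dropna()

print(filtered)
```

**Selecting columns by hand, as the notebook does, has the advantage of also removing the original COMPAS scores and other columns that should not be used as features.**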


**Now, let us try to see how our target is related to the sensitive attributes of our dataset (`Age`, `Race`, and `Sex`).**

In [2]:
import plotly.graph_objects as go

def plot_risk_by(df, column, axis_name):
    # Count High Risk (label 0) and Low Risk (label 1) samples per group.
    # Comparing against the label avoids a KeyError when a group has no
    # samples of one class.
    groups = list(df[column].unique())
    high_risk = [(df[df[column] == g]['label'] == 0).sum() for g in groups]
    low_risk = [(df[df[column] == g]['label'] == 1).sum() for g in groups]

    fig = go.Figure(data=[
        go.Bar(name='High Risk', x=groups, y=high_risk),
        go.Bar(name='Low Risk', x=groups, y=low_risk)
    ])

    fig.update_layout(
        barmode='group',
        template='plotly_dark',
        xaxis_title=f"<b>{axis_name}</b>",
        yaxis_title=f"<b>Risk by {axis_name}</b>",
        title=f'Distribution of <i>Risk Scores</i> by "{axis_name}"',
        paper_bgcolor='rgba(0, 0, 0, 0)',
        plot_bgcolor='rgba(0, 0, 0, 0)',
    )
    fig.show()

# One grouped bar chart per sensitive attribute.
for column, axis_name in [('sex', 'Sex'), ('age_cat', 'Age'), ('race', 'Race')]:
    plot_risk_by(df, column, axis_name)

**Let us now see if our sensitive attributes are correlated with `Risk`. If they are, this is already a sign that our future model could inherit these biases against a specific unprivileged group.**

**To be able to calculate correlations, let us transform all categorical values into numbers.**

In [3]:
corr_df = df.copy()

from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
 
# Encode every non-numeric column as integers (public API instead of the
# private `_get_numeric_data`).
for column in corr_df.select_dtypes(exclude='number').columns:
    corr_df[column] = le.fit_transform(corr_df[column])

import plotly.express as px

fig = px.imshow(corr_df.corr(numeric_only=True).values,
                labels=dict(x="Features", y="Features"),
                x=list(corr_df.columns),
                y=list(corr_df.columns),
                text_auto=True
                )
fig.update_xaxes(side='top')
fig.update_layout(template='plotly_dark',
                  title='Correlation Matrix',
                  coloraxis_showscale=False,
                  paper_bgcolor='rgba(0, 0, 0, 0)',
                  plot_bgcolor='rgba(0, 0, 0, 0)')
fig.show()

**According to the correlation scores, `race` has an alarming $0.22$ correlation with risk. Let us see how this will impact our future model.**
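
**One caveat: `LabelEncoder` assigns integers to categories in sorted order, so a Pearson correlation on a nominal feature such as `race` depends on an arbitrary ordering. As a hedged cross-check, the sketch below computes Cramér's V, a measure of association between two categorical variables. The `cramers_v` helper and the toy data are assumptions for illustration and are not part of the original analysis.**

```python
import numpy as np
import pandas as pd

def cramers_v(a, b):
    """Cramér's V between two categorical series (0 = none, 1 = perfect)."""
    observed = pd.crosstab(a, b).to_numpy().astype(float)
    n = observed.sum()
    # Expected counts under independence: outer product of the marginals.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    chi2 = ((observed - expected) ** 2 / expected).sum()
    min_dim = min(observed.shape) - 1
    return np.sqrt(chi2 / (n * min_dim))

# Toy data (hypothetical), not the COMPAS values.
toy = pd.DataFrame({
    "group": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "label": [1, 1, 1, 0, 0, 0, 0, 1],
})

v = cramers_v(toy["group"], toy["label"])
print(f"Cramér's V: {v:.2f}")
```

**On the notebook's data, the equivalent call would be `cramers_v(df['race'], df['label'])`. Unlike the label-encoded correlation, the result does not change if the categories are reordered.**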

**To deal with the classification problem, we will create a `RandomForestClassifier` using the `scikit-learn` module.**

In [4]:
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.model_selection import train_test_split
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline
from sklearn.metrics import confusion_matrix

seed = 42

X, y = df[df.columns.values.tolist()[0:12]], df[df.columns.values.tolist()[-1]]
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.2,
    random_state=seed
)

preprocess = make_column_transformer(
    (StandardScaler(), ['juv_fel_count', 'juv_misd_count', 'juv_other_count', 'priors_count', 'days_b_screening_arrest', 'c_days_from_compas']),
    (OneHotEncoder(), ['sex', 'age_cat', 'race', 'c_charge_degree', 'is_recid', 'is_violent_recid']))

from sklearn.ensemble import RandomForestClassifier

model_rf = make_pipeline(
    preprocess,
    # Fix the forest's random state so that reruns give the same model.
    RandomForestClassifier(max_depth=3, n_estimators=500, random_state=seed))
model_rf.fit(X_train, y_train.values.ravel())
score = model_rf.score(X_test, y_test.values.ravel())
print(f'Accuracy (Random Forest): {score * 100:.2f} %')

preds = model_rf.predict(X_test)
matrix = confusion_matrix(y_test.values.ravel(), preds)

import plotly.express as px
fig = px.imshow(matrix,
                labels=dict(x="Predicted", y="True label"),
                x=['High', 'Low'],
                y=['High', 'Low'],
                text_auto=True
                )
fig.update_xaxes(side='top')
fig.update_layout(template='plotly_dark',
                  title='Confusion Matrix (Random Forest Model)',
                  coloraxis_showscale=False,
                  paper_bgcolor='rgba(0, 0, 0, 0)',
                  plot_bgcolor='rgba(0, 0, 0, 0)')
fig.show()

Accuracy (Random Forest): 76.73 %
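
**Since "_High Risk_" samples make up only about $25\%$ of the dataset, a classifier that always predicts "_Low Risk_" would already reach roughly $75\%$ accuracy, so accuracy alone says little about the minority class. The sketch below shows how per-class recall can be read from a confusion matrix. The numbers in `matrix` are hypothetical and are not the results of the model above.**

```python
import numpy as np

# Hypothetical confusion matrix (rows = true label, columns = prediction),
# in the same order as the plot above: index 0 = High, index 1 = Low.
matrix = np.array([[50, 200],
                   [30, 720]])

# Recall per class: correct predictions divided by the row total.
recall = matrix.diagonal() / matrix.sum(axis=1)
accuracy = matrix.diagonal().sum() / matrix.sum()

# Accuracy obtained by always predicting the majority class.
majority_baseline = matrix.sum(axis=1).max() / matrix.sum()

print(f"Recall (High, Low): {recall}")
print(f"Accuracy: {accuracy:.2f} | Majority baseline: {majority_baseline:.2f}")
```

**With the notebook's objects, replacing the hypothetical array with `confusion_matrix(y_test, preds)` would give the corresponding values for the trained model.**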


## Ceteris paribus & Counterfactual Fairness

**_Ceteris paribus_ is a principle (and Latin phrase) meaning “_all else unchanged_”. Think of it this way:** 

> **"_if everything else was the same, minus this change, then what?_"**

**Ceteris paribus methods focus on evaluating the effect of a selected explanatory variable in terms of changes in a model’s prediction induced by changes in the variable’s values, i.e., _what would be the model prediction if this single variable is different?_ The main goal of this methodology is to understand how changes in the values of the variable affect the model’s predictions.**
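
**As a compact way of writing this idea down (the notation loosely follows the _What-if models_ reference linked above and is an assumption here, not something this notebook defines), the CP profile of a model $f$ for feature $j$ at an instance $x_*$ can be expressed as:**

$$
h^{f,j}_{x_*}(z) = f\left(x_*^{j|=z}\right)
$$

**Here, $x_*^{j|=z}$ denotes the instance $x_*$ with its $j$-th feature set to the value $z$ and every other feature left untouched. Evaluating $h$ over all values $z$ of interest gives the profile.**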

**This methodology has _appeal for the field of XAI_ because it explains a single classification _without relying on statistical evaluations of an entire population_, looking instead at the _influence_ that certain _features_ have on the _classification of a model_ for that specific sample.**

**A CP profile is nothing more than a profile of how the classification of a model varies in response to a change in a single explanatory variable/feature. And this can be used directly to measure something called [Counterfactual Fairness](https://arxiv.org/abs/1703.06856).**

**Counterfactual Fairness has a very intuitive and simple definition of fairness:**

> **An algorithm is said to be counterfactually fair if, _and only if_, the probability it assigns to each outcome for an individual $X$, a member of group $a$, would remain unchanged even if we lived in a world where individual $X$ was a member of group $b$.**

**We could relax the "unchanged" aspect to something like, "_the individual would still be assigned to the same class as before._"**
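
**For reference, the formal definition in the Counterfactual Fairness paper linked above can be written as follows. Note that in the paper's notation $A$ is the protected attribute, $X$ stands for the remaining observed features (not for the individual), and $U$ for latent background variables:**

$$
P\left(\hat{Y}_{A \leftarrow a}(U) = y \mid X = x, A = a\right) = P\left(\hat{Y}_{A \leftarrow a'}(U) = y \mid X = x, A = a\right)
$$

**for every outcome $y$ and every value $a'$ that $A$ can take. The formal definition relies on a causal model of how $A$ influences the other features. The CP-based check used in this notebook changes only the protected attribute and keeps all other features fixed, so it should be read as a simplified approximation of this criterion.**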

**Now let's implement a function that produces a CP profile for us. Then, we will use this function to evaluate if the trained model is "_counterfactually fair_."**

In [5]:
import plotly.graph_objects as go

def make_cp_profile(df, feature_name, sample, model):

    feature_values = list(df[feature_name].unique())

    # Build one copy of the sample per unique value of the feature,
    # changing only that feature (all else unchanged). Assigning the whole
    # column avoids chained assignment, which may silently fail to write.
    sample_features = pd.DataFrame([sample.to_dict() for _ in feature_values])
    sample_features[feature_name] = feature_values

    # Restore the original column dtypes before querying the model.
    sample_features = sample_features.astype(df[sample_features.columns].dtypes.to_dict())

    preds = model.predict_proba(sample_features)

    scores = []
    colors = []

    for i in range(len(preds)):

        if preds[i][0] > preds[i][1]:

            scores.append(-abs(preds[i][0] * 100))
            colors.append('red')

        else:

            scores.append(preds[i][1] * 100)
            colors.append('green')

    sample_features['model_score'] = scores
    sample_features['colors'] = colors
    
    fig = go.Figure(go.Bar(
        x=sample_features['model_score'],
        y=sample_features[feature_name],
        orientation='h',
        marker_color = sample_features.colors))

    fig.update_xaxes(ticksuffix = "%",
                    griddash='dash')

    fig.add_annotation(text="Positive Class",
                  xref="paper", yref="paper",
                  x=.9, y=1, showarrow=False)

    fig.add_annotation(text="Negative Class",
                  xref="paper", yref="paper",
                  x=.1, y=1, showarrow=False)

    fig.update_layout(
        xaxis=dict(
            tickmode='linear',
            tick0=0,
            dtick=10
        ),
        xaxis_range=[-100,100],
        template='plotly_dark',
        title_text=f'Ceteris-paribus Profile (Feature --> {feature_name})',
        paper_bgcolor='rgba(0, 0, 0, 0)',
        plot_bgcolor='rgba(0, 0, 0, 0)'

    )
    
    fig.show()

**The above function takes as input the dataset used for training, an explanatory variable (feature), a sample to be evaluated, and a model trained on the dataset.**

**We use the dataset to collect all the unique values that each variable can take. So, when we choose a variable (e.g., "_race_") and a sample (e.g., subject $X$), we create copies of that sample, each differing only in the value of that one variable, until every value the variable can take is covered. If our chosen variable were "_sex_", and the dataset contained only the values "_Male_" and "_Female_", we would have only two samples: one where $X$ is "_Male_", and one where $X$ is "_Female_".**

**With all possible variations of a single variable for the same sample, _we score each variation with the trained model_. For simplicity's sake, the above function will generate a _red bar_ if the sample was assigned to the _negative class_ and a _green bar_ for the _positive class_.**

**Let's now choose a sample to evaluate.**

In [6]:
print("Analyzing Sample: \n\n", X_test.iloc[12])

Analyzing Sample: 

 sex                                    Male
age_cat                             25 - 45
race                       African-American
juv_fel_count                             2
juv_misd_count                            0
juv_other_count                           0
priors_count                             18
days_b_screening_arrest                -1.0
c_days_from_compas                      1.0
c_charge_degree                        (M1)
is_recid                                  1
is_violent_recid                          0
Name: 6246, dtype: object


**Now let's evaluate this sample, varying each selected feature over its possible unique values, for "_race_", "_priors_count_", and "_age_cat_".**

In [7]:
make_cp_profile(X, 'race', X_test.iloc[12], model_rf)
make_cp_profile(X, 'priors_count', X_test.iloc[12], model_rf)
make_cp_profile(X, 'age_cat', X_test.iloc[12], model_rf)

**Using the principle of _Ceteris paribus_, we can arrive at an interpretation very similar to the one in [another notebook](xxx), using a completely different tool.**

**According to the generated graphs, when we talk about the feature:**

- **_race_: the sample is assigned to the negative class only if the value is "_African-American_". The value with the highest probability for the positive class is "_Caucasian_".**
- **_priors_count_: if the sample has a _priors_count_ greater than 6, it is assigned to the negative class.**
- **_age_cat_: only the category "_Greater than 45_" is assigned to the positive class.**

**This interpretation is specific to this sample, and shows how "_if individual X were Caucasian_", all else being equal, he would be classified as Low Risk, rather than High Risk (being "_African-American_").**

**Since race is a sensitive/protected attribute, this model cannot be considered "fair" according to the definition of Counterfactual Fairness.**
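
**The relaxed criterion ("_the individual would still be assigned to the same class as before_") can also be checked without a plot. The sketch below is a minimal illustration under stated assumptions: `ToyModel` is a hypothetical stand-in for a trained classifier that exposes a `predict` method, and `prediction_flips` is a helper introduced only for this example.**

```python
import pandas as pd

class ToyModel:
    """Hypothetical stand-in for a trained classifier with a `predict` method.

    It deliberately uses the protected attribute, so the check below has
    something to detect.
    """
    def predict(self, frame):
        penalised = (frame["race"] == "A") & (frame["priors_count"] > 3)
        return (~penalised).astype(int).to_numpy()

def prediction_flips(model, sample, feature_name, values):
    """Return True if changing only `feature_name` changes the predicted class."""
    # One copy of the sample per candidate value, all else unchanged.
    variants = pd.DataFrame([sample.to_dict() for _ in values])
    variants[feature_name] = list(values)
    return len(set(model.predict(variants))) > 1

sample = pd.Series({"race": "A", "priors_count": 5})
flips_on_race = prediction_flips(ToyModel(), sample, "race", ["A", "B", "C"])

low_priors = pd.Series({"race": "A", "priors_count": 1})
no_flip = prediction_flips(ToyModel(), low_priors, "race", ["A", "B", "C"])

print(flips_on_race, no_flip)
```

**With the notebook's objects, a call such as `prediction_flips(model_rf, X_test.iloc[12], 'race', X['race'].unique())` should report whether the sample analyzed above changes class when only its race is changed.**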

**As a final provocation, we ask whether "_Is the monotonicity principle used in what-if models metaphysically acceptable?_" That is, can we "_freeze_" all the other features that represent a sample, and _change only one?_**

**For a sociological and metaphysical discussion of this subject, we refer the reader to [this study](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9099231/).** ⚖️🔎


