# Applying model-agnostic explanations to classifiers with `dalex`  

<a href="https://colab.research.google.com/drive/1TQ8_lx3cNMGxMB7L8OKpkAB40ntUluvN" target="_blank">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab">
</a>

Return to the [castle](https://github.com/Nkluge-correa/TeenyTinyCastle).

According to "_[The mythos of model interpretability](https://arxiv.org/pdf/1606.03490.pdf)_", these are the properties of an "_interpretable model_":

> A human can repeat (_"simulatability"_) the computation process with a full understanding of the algorithm (_"algorithmic transparency"_) and every individual part of the model owns an intuitive explanation (_"decomposability"_).

Explainable AI (XAI) is an approach to artificial intelligence that aims to achieve this level of interpretability. In it, we seek to create transparent, interpretable models that can provide human-understandable explanations for their decisions or predictions. By providing clear and interpretable explanations for the behavior of AI models, XAI can help to increase the transparency and accountability of these systems, which is particularly important in applications such as healthcare, finance, and law enforcement, where decisions made by AI models can have significant impacts on people's lives.

![image](https://www.lambdatest.com/resources/images/testing-in-black-box.png)

[Source](https://www.lambdatest.com/learning-hub/black-box-testingS).

This notebook explores several interpretability techniques that work especially well with shallow ML models (**break-down plots**, **interactive break-down plots**, and **SHAP**). To have models to explore, we will train a few ML classifiers on the `German Scoring Dataset`, which was created by Professor Dr. Hans Hofmann.

> **Note:** This dataset contains information about 1000 individuals and their credit risk (bad = 0, good = 1), framed as a binary classification problem. We will not be using the full 21 features of the [original dataset](https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)), but a [reduced version](https://www.kaggle.com/datasets/kabure/german-credit-data-with-risk) with only ten features + the target.

> **Note**: all datasets and models related to the course and repo are in the Hub. 🤗

Below, we load our dataset and perform some pre-processing (e.g., replacing missing values with the column's `average value` or `mode`) to help train our model.

In [1]:
!pip install datasets -q

import pandas as pd
from datasets import load_dataset

# load the dataset from the hub
dataset = load_dataset("AiresPucrs/german-credit-data", split = 'train')
df = dataset.to_pandas()

for col in df.columns:
    if df[col].dtypes == 'object':
        # Replace categorical missing data with the mode
        df[col] = df[col].fillna(df[col].value_counts().index[0])
    elif df[col].dtypes == 'int64':
        # Replace numerical missing data with the mean
        df[col] = df[col].fillna(round(df[col].mean()))

display(df)

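| Unnamed: 0 | Age | Sex | Job | Housing | Saving accounts | Checking account | Credit amount | Duration | Purpose | Risk |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 67 | male | 2 | own | little | little | 1169 | 6 | radio/TV | good |
| 1 | 22 | female | 2 | own | little | moderate | 5951 | 48 | radio/TV | bad |
| 2 | 49 | male | 1 | own | little | little | 2096 | 12 | education | good |
| 3 | 45 | male | 2 | free | little | little | 7882 | 42 | furniture/equipment | good |
| 4 | 53 | male | 2 | free | little | little | 4870 | 24 | car | bad |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 995 | 31 | female | 1 | own | little | little | 1736 | 12 | furniture/equipment | good |
| 996 | 40 | male | 3 | own | little | little | 3857 | 30 | car | good |
| 997 | 38 | male | 2 | own | little | little | 804 | 12 | radio/TV | good |
| 998 | 23 | male | 2 | free | little | little | 1845 | 45 | radio/TV | bad |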


Now, we use the `LabelEncoder()` class (from `scikit-learn`) to turn our labels into numerical values.

In [2]:
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()

df['Risk'] = encoder.fit_transform(df['Risk']) # good becomes 1, and bad becomes 0

display(df)

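| Unnamed: 0 | Age | Sex | Job | Housing | Saving accounts | Checking account | Credit amount | Duration | Purpose | Risk |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 67 | male | 2 | own | little | little | 1169 | 6 | radio/TV | 1 |
| 1 | 22 | female | 2 | own | little | moderate | 5951 | 48 | radio/TV | 0 |
| 2 | 49 | male | 1 | own | little | little | 2096 | 12 | education | 1 |
| 3 | 45 | male | 2 | free | little | little | 7882 | 42 | furniture/equipment | 1 |
| 4 | 53 | male | 2 | free | little | little | 4870 | 24 | car | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 995 | 31 | female | 1 | own | little | little | 1736 | 12 | furniture/equipment | 1 |
| 996 | 40 | male | 3 | own | little | little | 3857 | 30 | car | 1 |
| 997 | 38 | male | 2 | own | little | little | 804 | 12 | radio/TV | 1 |
| 998 | 23 | male | 2 | free | little | little | 1845 | 45 | radio/TV | 0 |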
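
`LabelEncoder` assigns integer codes following the sorted order of the class names, which is why `bad` ends up as 0 and `good` as 1. The following is a minimal sketch on a toy list (the labels and the names `toy_labels`, `toy_encoder`, and `mapping` are illustrative, not taken from the dataset) showing how the mapping can be checked through `classes_` and reversed with `inverse_transform`:

```python
from sklearn.preprocessing import LabelEncoder

# Toy labels (illustrative only, not the real dataset)
toy_labels = ['good', 'bad', 'good', 'good', 'bad']

toy_encoder = LabelEncoder()
toy_codes = toy_encoder.fit_transform(toy_labels)

# classes_ is sorted, so the position of each class is its integer code
mapping = {str(label): int(code) for code, label in enumerate(toy_encoder.classes_)}
print(mapping)

# inverse_transform recovers the original strings from the codes
decoded = toy_encoder.inverse_transform(toy_codes)
print(list(decoded))
```

Checking `classes_` this way is a cheap safeguard against reading the explanation plots with the target classes swapped.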


As always, let us split the data set into `training` and `testing` groups so we can later evaluate the performance of our models.


In [3]:
import numpy as np
from sklearn.model_selection import train_test_split

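# Seed NumPy's global random number generator to allow reproducible results.
# `np.random.seed()` returns None, so its result should not be stored as a seed.
np.random.seed(42)

# Separate features (first nine columns) and target (last column)
X, y = df.iloc[:, 0:9], df.iloc[:, -1]

# Split! With `random_state=None`, scikit-learn draws from the global
# generator seeded above.
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.2,
    random_state=None
)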

Below, we create a preprocessing step (later placed inside a `pipeline`) that standardizes the numerical features to zero mean and unit variance (`StandardScaler()`) and one-hot-encodes the categorical features into sparse binary vectors (`OneHotEncoder()`). For this, we use classes and helper functions imported from the `scikit-learn` library.

> **Note:** This is a standard practice in ML, given that putting features on a comparable scale tends to improve the performance of scale-sensitive models, such as logistic regression and SVMs.


In [4]:
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline

preprocess = make_column_transformer(
    (StandardScaler(), ['Age', 'Job', 'Credit amount', 'Duration']),
    (OneHotEncoder(), ['Sex', 'Housing', 'Saving accounts', 'Checking account', 'Purpose']))
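
To make the two transformations concrete, here is a small NumPy-only sketch of what they compute. It is not the `scikit-learn` implementation: the toy values and the helper names `standardize` and `one_hot` are made up for illustration. `StandardScaler` subtracts the column mean and divides by the population standard deviation, while one-hot encoding creates one binary column per category:

```python
import numpy as np

def standardize(column):
    """Z-score a 1-D array: zero mean, unit (population) standard deviation."""
    column = np.asarray(column, dtype=float)
    return (column - column.mean()) / column.std()

def one_hot(values):
    """Return the sorted categories and one binary column per category."""
    categories = sorted(set(values))
    matrix = np.array([[int(v == c) for c in categories] for v in values])
    return categories, matrix

# Toy columns (illustrative values only)
toy_age = [20, 30, 40, 50]
toy_housing = ['own', 'rent', 'own', 'free']

scaled_age = standardize(toy_age)
categories, encoded_housing = one_hot(toy_housing)

print(scaled_age)
print(categories)
print(encoded_housing)
```

Because the explainers used later receive the whole pipeline, the explanations are expressed in terms of the original columns (e.g., `Housing`) rather than the encoded ones.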


Below, we train several different models, both to compare their performance and to have different objects to explore in our interpretability analysis. We will create the following models:

- Logistic Regression (LR): _LR is a statistical model that models the probability of one event (out of two alternatives) taking place by having the log-odds (the logarithm of the odds) for the event be a linear combination of one or more independent variables ("predictors"). To learn more, read "[Logistic regression and artificial neural network classification models: a methodology review](https://www.sciencedirect.com/science/article/pii/S1532046403000340)."_

In [5]:
from sklearn.linear_model import LogisticRegression

model_lr = make_pipeline(preprocess, LogisticRegression(penalty='l2'))

model_lr.fit(X_train, y_train.values.ravel())

score = model_lr.score(X_test, y_test.values.ravel())

print(f'Accuracy (Logistic Regression): {score * 100:.2f}%')


Accuracy (Logistic Regression): 70.00%
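
The log-odds description of logistic regression given above can be illustrated with a few lines of plain Python. This is a sketch with hypothetical numbers: the intercept, weights, and feature values below are invented and are not the ones learned by `model_lr`. It shows how a linear score (the log-odds) is turned into a probability by the sigmoid function:

```python
import math

def sigmoid(z):
    """Map log-odds z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical intercept, weights, and (already standardized) feature values
intercept = 0.5
weights = [-0.8, 0.3]
features = [1.0, 2.0]

# The log-odds are a linear combination of the features
log_odds = intercept + sum(w * x for w, x in zip(weights, features))
probability = sigmoid(log_odds)

print(f'log-odds: {log_odds:.2f}, probability: {probability:.3f}')
```

Each weight shifts the log-odds linearly, which is part of why logistic regression is often treated as a comparatively transparent baseline.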


- Random Forest (RF): _A random forest (or random decision forest) is an ensemble learning method for classification, regression, and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of the random forest is the class selected by most trees. To learn more, read "[A Random Forest Guided Tour](https://arxiv.org/abs/1511.05741)."_


In [6]:
from sklearn.ensemble import RandomForestClassifier

model_rf = make_pipeline(preprocess, RandomForestClassifier(max_depth=3, n_estimators=500))

model_rf.fit(X_train, y_train.values.ravel())

score = model_rf.score(X_test, y_test.values.ravel())

print(f'Accuracy (Random Forest): {score * 100:.2f}%')


Accuracy (Random Forest): 71.00%
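
The "class selected by most trees" rule mentioned above is a majority vote. The sketch below applies it to hypothetical per-tree outputs (the numbers are invented for illustration). Note that `scikit-learn`'s `RandomForestClassifier` combines its trees by averaging their predicted class probabilities rather than counting hard votes, so both aggregation rules are shown side by side:

```python
from collections import Counter

# Hypothetical probability of class 1 ("good") from five trees, for one sample
tree_probs = [0.9, 0.45, 0.4, 0.45, 0.95]

# Hard voting: each tree votes for its most likely class
votes = [int(p >= 0.5) for p in tree_probs]
majority_class = Counter(votes).most_common(1)[0][0]

# Soft voting: average the probabilities, then threshold
mean_prob = sum(tree_probs) / len(tree_probs)
averaged_class = int(mean_prob >= 0.5)

print(f'votes: {votes} -> majority class {majority_class}')
print(f'mean probability: {mean_prob:.2f} -> averaged class {averaged_class}')
```

In this toy case the two rules disagree, which is a reminder that the probabilities returned by `predict_proba` (and explained later) are averages over trees, not vote shares.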


- Gradient Boosting (GB): _Gradient boosting is a machine learning technique used in regression and classification tasks, among others. It gives a prediction model in the form of an ensemble of weak prediction models, which are typically decision trees. To learn more, read "[Gradient boosting machines, a tutorial](https://www.frontiersin.org/articles/10.3389/fnbot.2013.00021/full)."_


In [7]:
from sklearn.ensemble import GradientBoostingClassifier

model_gbc = make_pipeline(preprocess, GradientBoostingClassifier(n_estimators=100))

model_gbc.fit(X_train, y_train.values.ravel())

score = model_gbc.score(X_test, y_test.values.ravel())

print(f'Accuracy (Gradient Boosting Classifier):  {score * 100:.2f}%')

Accuracy (Gradient Boosting Classifier):  70.00%


- Support-vector machine (SVM): _SVMs are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. SVM maps training examples to points in space to maximize the width of the gap between the two categories. To learn more, read "[A Tutorial on Support Vector Machines for Pattern Recognition](https://www.di.ens.fr/~mallat/papiers/svmtutorial.pdf)."_


In [8]:
from sklearn.svm import SVC

model_svm = make_pipeline(preprocess, SVC(probability=True))

model_svm.fit(X_train, y_train.values.ravel())

score = model_svm.score(X_test, y_test.values.ravel())

print(f'Accuracy (Support-vector machine): {score * 100:.2f}%')


Accuracy (Support-vector machine): 70.50%


Now we compare the models' predictions using two made-up samples: `Bob` and `Charles`.

> **Note:** To better understand what these features represent, read the [dataset card](https://archive.ics.uci.edu/dataset/144/statlog+german+credit+data).


In [9]:

sample_1 = pd.DataFrame({'Age': [23],
                         'Sex': ['male'],
                         'Job': [2],
                         'Housing': ['rent'],
                         'Saving accounts': ['little'],
                         'Checking account': ['little'],
                         'Credit amount': [10000],
                         'Duration': [45],
                         'Purpose': ['car'],
                         },
                        index=['Bob'])

sample_2 = pd.DataFrame({'Age': [45],
                         'Sex': ['male'],
                         'Job': [2],
                         'Housing': ['own'],
                         'Saving accounts': ['moderate'],
                         'Checking account': ['moderate'],
                         'Credit amount': [4000],
                         'Duration': [24],
                         'Purpose': ['repairs'],
                         },
                        index=['Charles'])


def what_does_the_models_think(df):
    """
    A simple function to query our models.
    """
    sample = df

    models = [
        ("LR", model_lr),
        ("RF", model_rf),
        ("GB", model_gbc),
        ("SVM", model_svm)
    ]

    for model_name, model in models:
        prob_0, prob_1 = model.predict_proba(sample)[0]

        credibility = "not credible" if prob_0 > prob_1 else "is credible"
        confidence = prob_0 if prob_0 > prob_1 else prob_1

        print(f'{model_name}: {sample.index[0]} {credibility}...{confidence * 100:.2f} %')


print(f'Predictions for {sample_1.index[0]}:\n')

what_does_the_models_think(sample_1)

print(f'\nPredictions for {sample_2.index[0]}:\n')

what_does_the_models_think(sample_2)


Predictions for Bob:

LR: Bob not credible...64.93 %
RF: Bob not credible...51.78 %
GB: Bob not credible...80.85 %
SVM: Bob not credible...62.76 %

Predictions for Charles:

LR: Charles is credible...69.42 %
RF: Charles is credible...72.60 %
GB: Charles is credible...76.23 %
SVM: Charles is credible...74.86 %


For the rest of this tutorial, we want to explore why these models think Bob is not credible while Charles is.

## Creating Explainers with `dalex`

[Dalex](https://pypi.org/project/dalex/) is a package that helps explore and explain the behavior of machine learning models. It provides tools for model-agnostic global and local explanations, as well as model performance diagnostics, and it is part of the [DrWhy.AI](https://github.com/ModelOriented/DrWhy) universe of tools for responsible and interpretable machine learning.

One of the main objects of `dalex` is the `Explainer`, which is basically a wrapper around a predictive model. Wrapped models may then be explored and compared with model-level and predict-level explanations. As soon as the explainer object is created, you will receive some metadata about the development of the wrapped model.

In [10]:
!pip install dalex -q

import dalex as dx

model_lr_exp = dx.Explainer(model_lr,
                            X, y, label='Logistic Regression explainer',
                            model_type='binary classification')


model_svm_exp = dx.Explainer(model_svm,
                             X, y, label='Support-vector machine explainer',
                             model_type='binary classification')

Preparation of a new explainer is initiated

  -> data              : 1000 rows 9 cols
  -> target variable   : Parameter 'y' was a pandas.Series. Converted to a numpy.ndarray.
  -> target variable   : 1000 values
  -> model_class       : skle

Now, let us explore some XAI techniques pre-built into `dalex`.

## Break-down plots (`BD`)

Imagine you have a magic cake and want to know why it tastes so good. The cake has different ingredients like flour, sugar, eggs, and chocolate. Break-down plots are like a recipe card that shows exactly how much each ingredient contributes to the overall deliciousness of the cake.

Now, let's relate this to real life. In data science and machine learning, we have models that make predictions or decisions, just like a chef creates a cake. Break-down plots help us understand why a model made a specific prediction for a particular input.

Here's how it works:

1. **Choose a Prediction:** Our model predicts whether it will rain tomorrow.
2. **Ingredients (Features):** Think of features as the ingredients in our cake. For weather prediction, these could be temperature, humidity, and wind speed.
3. **Magic Recipe Card (Break-down Plot):** The break-down plot is like a magic recipe card showing how each feature (ingredient) contributes to the final prediction. It breaks down the prediction into individual parts.
4. **Understanding the Magic:** If the break-down plot says temperature contributes a lot to the prediction, it's like saying, "Hey, the warm temperature today is a big reason why the model thinks it will rain tomorrow."

So, break-down plots help us see which features are more important in influencing a model's decision, just like a recipe card helps us understand why a cake tastes the way it does. All of the techniques explored in this notebook try to assess precisely this.
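
To make the recipe-card idea concrete, here is a minimal from-scratch sketch of a break-down computation. It is not `dalex`'s implementation: the function `break_down`, the toy model, the toy data, and the fixed feature order are all assumptions made for illustration. Starting from the mean prediction over a dataset, each feature is fixed, in turn, to the value it has in the instance being explained, and the resulting change in the mean prediction is recorded as that feature's contribution:

```python
import numpy as np

def break_down(predict, data, instance, order):
    """Sequentially fix features to the instance's values and record
    how the mean prediction over `data` changes at each step."""
    modified = data.copy()
    baseline = predict(modified).mean()
    contributions = {}
    previous = baseline
    for j in order:
        modified[:, j] = instance[j]          # fix feature j for every row
        current = predict(modified).mean()
        contributions[j] = float(current - previous)
        previous = current
    return float(baseline), contributions

# Toy additive "model" and toy data (both invented for illustration)
def toy_model(X):
    return 0.2 + 0.5 * X[:, 0] - 0.3 * X[:, 1]

rng = np.random.default_rng(0)
toy_data = rng.normal(size=(200, 2))
toy_instance = np.array([1.0, -1.0])

baseline, contributions = break_down(toy_model, toy_data, toy_instance, order=[0, 1])
prediction = float(toy_model(toy_instance.reshape(1, -1))[0])

print(f'baseline (mean prediction): {baseline:.3f}')
for j, value in contributions.items():
    print(f'feature {j}: {value:+.3f}')
print(f'prediction for the instance: {prediction:.3f}')
```

The baseline plus all contributions adds up to the model's prediction for the instance, which is the property that lets a break-down plot be read as a waterfall from the average prediction to the individual one.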

Let us re-use our samples (Bob and Charles) to create some BD plots. Remember, according to our classifiers, Bob is not credible (sample 1), while Charles is (sample 2).


In [16]:
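def plot_breakdown(model_exp, sample):
    # Compute break-down attributions for a single sample
    bd_sample = model_exp.predict_parts(sample, type='break_down')
    # Build the figure without displaying it, so we can style it first
    fig = bd_sample.plot(show=False)
    fig.update_layout(template='ggplot2', font_color='black', title=f"BD plot - {sample.index[0]}")
    fig.show()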

plot_breakdown(model_lr_exp, sample_1)
plot_breakdown(model_lr_exp, sample_2)


According to our BD plots, `Age` is an important factor. Given that age is a sensitive attribute, this may be a problem for our model regarding its **fairness**.

One limitation of BD plots is that they only consider additive attributions, i.e., they overlook the fact that the effect of one feature may depend on the values of other features. To account for such dependencies, we can use Interactive Break-down plots.

## Interactive Break-down plots (`iBD`)

Interaction (deviation from additivity) means that the effect of an explanatory feature depends on the value(s) of other feature(s). When there are interactions, the order in which features are analyzed matters and can change a feature's contribution score. In short, iBDs let us take these interactions into account when investigating a model.

> **Note: To learn more, we recommend "_[iBreakDown: Uncertainty of Model Explanations for Non-additive Predictive Models](https://arxiv.org/abs/1903.11420v1)_".**
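
The effect of ordering can be seen on a tiny example. The sketch below is an illustration under stated assumptions, not the iBD algorithm itself: the helper `sequential_contributions`, the toy model (a pure interaction between two features), and the four toy rows are all invented. It computes break-down style contributions for the same instance under the two possible orderings:

```python
import numpy as np

def sequential_contributions(predict, data, instance, order):
    """Break-down style attributions for one fixed ordering of features."""
    modified = data.copy()
    previous = predict(modified).mean()
    contributions = {}
    for j in order:
        modified[:, j] = instance[j]          # fix feature j for every row
        current = predict(modified).mean()
        contributions[j] = float(current - previous)
        previous = current
    return contributions

# Toy model with a pure interaction: the effect of x0 depends on x1
def interaction_model(X):
    return X[:, 0] * X[:, 1]

# Four toy rows with binary features (invented for illustration)
toy_data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 0.0]])
toy_instance = np.array([1.0, 1.0])

first_x0 = sequential_contributions(interaction_model, toy_data, toy_instance, [0, 1])
first_x1 = sequential_contributions(interaction_model, toy_data, toy_instance, [1, 0])

print('order x0 -> x1:', first_x0)
print('order x1 -> x0:', first_x1)
```

Whichever feature is fixed last receives most of the credit here, even though the two features play symmetric roles in the model. Attributing such effects to the pair of features jointly is the kind of ambiguity that iBD plots aim to resolve.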

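Let us pass our samples through our explainers again, but instead of performing vanilla BD, we use the `break_down_interactions` type and set `interaction_preference=10`, which makes the method favor interaction terms more strongly over single-feature contributions.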

In [18]:
def plot_interactive_breakdown(model_exp, sample, title):
    ibd_sample = model_exp.predict_parts(
        sample, type='break_down_interactions', interaction_preference=10)

    fig = ibd_sample.plot(show=False)
    fig.update_layout(
        template='ggplot2', font_color='black',
        title=f'{title} Classification (Support-vector machine)',
    )

    fig.show()

plot_interactive_breakdown(model_svm_exp, sample_1, 'Bob')
plot_interactive_breakdown(model_svm_exp, sample_2, 'Charles')



According to this iBD analysis, our results still hold: `Age` is again a highly influential feature for this model's classification.

One of the downsides of iBD is its computational complexity, which grows quadratically with the number of features, making it time-consuming in models with many explanatory variables. Another explanation technique that can handle this computational bottleneck is the Shapley Additive Explanations method.

## Shapley Additive Explanations (`SHAP`) for Average Attributions

`SHAP` values are another approach we can use to explore the inner workings of ML models. The method averages a variable's attribution over all (or a large number of) possible orderings of the variables, an idea closely linked to "_[Shapley values](https://en.wikipedia.org/wiki/Shapley_value)_". This removes the dependence on variable ordering that BD plots suffer from and that iBD plots try to overcome.

> **Note: To learn more, we recommend "_[An Efficient Explanation of Individual Classifications Using Game Theory](http://dl.acm.org/citation.cfm?id=1756006.1756007)_".**
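
The averaging idea can be sketched directly: compute break-down style contributions for every possible ordering of the features and take the mean. The code below is a from-scratch illustration, not what `dalex` runs internally; the function `shapley_by_orderings`, the toy interaction model, and the toy data are assumptions made for this example:

```python
from itertools import permutations

import numpy as np

def shapley_by_orderings(predict, data, instance):
    """Average break-down contributions over every ordering of the features."""
    n_features = data.shape[1]
    totals = np.zeros(n_features)
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        modified = data.copy()
        previous = predict(modified).mean()
        for j in order:
            modified[:, j] = instance[j]      # fix feature j for every row
            current = predict(modified).mean()
            totals[j] += current - previous
            previous = current
    return totals / len(orderings)

# Toy model with a pure interaction between two features
def interaction_model(X):
    return X[:, 0] * X[:, 1]

# Four toy rows with binary features (invented for illustration)
toy_data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 0.0]])
toy_instance = np.array([1.0, 1.0])

shap_values = shapley_by_orderings(interaction_model, toy_data, toy_instance)
baseline = float(interaction_model(toy_data).mean())
prediction = float(interaction_model(toy_instance.reshape(1, -1))[0])

print(shap_values)
```

In this toy case the two features play symmetric roles, so averaging over both orderings gives them equal credit, and the attributions still sum to the gap between the instance's prediction and the mean prediction.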

In [19]:
def plot_shap(model_exp, sample, title):
    shap_sample = model_exp.predict_parts(sample, type='shap')

    fig = shap_sample.plot(show=False)
    fig.update_layout(
        template='ggplot2', font_color='black',
        title=f'{title} Classification (Logistic Regression)',
    )

    fig.show()

plot_shap(model_lr_exp, sample_1, 'Bob')
plot_shap(model_lr_exp, sample_2, 'Charles')

With three explanatory methods pointing to `Age` as a deciding feature, we can be more confident in our hypothesis that something problematic is happening in this model. Regarding our SHAP approach, it is worth mentioning that, for models with many features, the calculation of Shapley values takes time, making their use somewhat impractical. However, sub-sampling can address this issue (e.g., averaging over a random sample of orderings, or restricting the analysis to a smaller collection of features).
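
One common form of sub-sampling is to average over a random sample of feature orderings instead of all of them. With six features there are already 720 orderings, and the count grows factorially. The sketch below is a Monte Carlo approximation under stated assumptions: the function `sampled_shapley`, its `n_orderings` and `seed` parameters, the toy model, and the toy data are all hypothetical names and values chosen for illustration:

```python
import numpy as np

def sampled_shapley(predict, data, instance, n_orderings=25, seed=0):
    """Approximate Shapley values from a random sample of feature orderings."""
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    totals = np.zeros(n_features)
    for _ in range(n_orderings):
        order = rng.permutation(n_features)   # one random ordering
        modified = data.copy()
        previous = predict(modified).mean()
        for j in order:
            modified[:, j] = instance[j]      # fix feature j for every row
            current = predict(modified).mean()
            totals[j] += current - previous
            previous = current
    return totals / n_orderings

# Toy model with six features, one interaction, and two unused features
def toy_model(X):
    return X[:, 0] * X[:, 1] + 0.5 * X[:, 2] - X[:, 5]

rng = np.random.default_rng(1)
toy_data = rng.normal(size=(100, 6))
toy_instance = np.ones(6)

approx_values = sampled_shapley(toy_model, toy_data, toy_instance, n_orderings=25)
baseline = float(toy_model(toy_data).mean())
prediction = float(toy_model(toy_instance.reshape(1, -1))[0])

print(np.round(approx_values, 3))
```

The attributions still sum exactly to the difference between the prediction and the baseline, because that holds for every single ordering. What the sampling introduces is variance in how the interaction effect is split between the features involved.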

Another XAI technique we could cover is Local Interpretable Model-agnostic Explanations. However, we will leave this to the other tutorials.

> **Note: Learn about LIME with our tutorial [on language models](https://github.com/Nkluge-correa/TeenyTinyCastle/blob/master/ML-Explainability/NLP/lime_for_NLP.ipynb) and [CNNs](https://github.com/Nkluge-correa/TeenyTinyCastle/blob/master/ML-Explainability/CV/CNN_attribution_maps_with_LIME.ipynb).**

---

Return to the [castle](https://github.com/Nkluge-correa/TeenyTinyCastle).
