<a href="https://colab.research.google.com/github/leolorenzoii/ml2_interpretability/blob/main/notebooks/01_Model_Interpretability_and_Shapley_Values.ipynb" target="_blank">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" align="left" alt="Open In Colab"/>
</a>

# Introduction to Model Interpretability

Throughout the **Machine Learning 1** course, we learned about different machine learning algorithms: how they work, how to tune their hyperparameters, and how to select the optimal model through a cross-validation strategy. As a result, we can now train a machine learning model and use it to make predictions on unseen data (see Figure <a href='#fig:ml-nutshell'>1</a>).

<a name='fig:ml-nutshell'></a>
<div>
<img src="images/ml-nutshell.png" align="left" width="600"/>
</div>

<br style="clear:both" />

<div>
    <p style="font-size:12px;font-style:default;">
        <b>Figure 1. Training and testing a machine learning model in a nutshell.</b><br>
        We learned how to train a machine learning model and use it to predict unseen data.
    </p>
</div>

<br style="clear:both" />

However, our methodology so far is geared towards optimizing the predictive power of our models and much less towards using them for inference. When stakeholders require us to explain the predictions of our machine learning model, this methodology is insufficient. In particular, we lack the capacity to answer questions such as:

- How do one or more features affect the predictions of the model?
- What role does each feature value play in each individual prediction?
- How can we explain the model's predictions in a manner that is more useful for our stakeholders?

We will deal with methods that improve the explainability of our models in the next series of notebooks. This first notebook on interpretability gives a brief introduction to model interpretability and emphasizes its importance. In the concluding section of this notebook, we also show how we can incorporate these explainability methods into our machine learning pipeline.

## What is model interpretability?

In Christoph Molnar's [Interpretable Machine Learning](#ref:molnar) book [[2]](#ref:molnar), he collated two definitions of **interpretability**. A non-mathematical one:

> *Interpretability is the degree to which a* ***human*** *can* ***understand*** *the cause of a* ***decision***. [[3]](#ref:miller)

And a mathematical one:

> *Interpretability is the degree to which a* ***human*** *can consistently* ***predict*** *the* ***model’s result***. [[4]](#ref:kim)

In both definitions we see three important elements: the **human**, the **understanding**, and the **decision**. We construct our definition of *interpretability* in the context of machine learning in the same way:

> ***Model interpretability*** *refers to the degree to which the behaviors and tendencies of statistical and machine learning models are understandable to humans.*

Notice here that there is a *flexibility* in the definition of model interpretability. Indeed, defining how explicable a machine learning model is depends on the needs and requirements of a project and the different stakeholders.

Nevertheless, as ethical machine learning practitioners and data science leaders, model interpretability MUST be integrated as early as possible in the development process and should NOT just be taken as an afterthought.

## Why interpretability?

Now, one might ask: why bother with interpretability? Wouldn't high model performance ultimately yield higher business value? This is quite a debatable topic! In fact, the 2017 [Neural Information Processing Systems](https://nips.cc/) conference held its first-ever ***The Great AI Debate*** with the topic ***Is interpretability necessary for machine learning?*** [[6]](#ref:great-ai-debate) *(you are highly encouraged to watch this engaging and insightful discussion* 🙂).

In the video, it was shown that model interpretability is crucial, especially for applications where the model may learn quirky patterns from the data. Indeed, having an interpretability pipeline in your project gives you, the data scientist, the ability to debug your model and identify issues early on, which in turn gives you a chance to improve your model. Furthermore, it has added benefits for other stakeholders who are interested in and affected by your machine learning model (see Figure [4](#fig:stakeholders)).

<a name='fig:stakeholders'></a>
<div>
<img src="images/interpretability-stakeholders.png" align="left" width="700"/>
</div>

<br style="clear:both" />
<br style="clear:both" />

<div>
    <p style="font-size:12px;font-style:default;">
        <b>Figure 4. Benefits of model interpretability to various stakeholders of a machine learning project.</b><br>
           Model interpretability benefits the data scientist, business decision makers, approving authorities, and business customers.
    </p>
</div>

For **data scientists**, being able to explain the model to other stakeholders is another benefit of model explicability. The better you can explain the model to other people in the business, the greater its chance of being adopted and the more trust other stakeholders place in it. With model interpretability, **business executives** have the option to provide transparency to their end-users. Furthermore, it helps them justify the business case for the investment and identify other potential extensions and business use cases for the project. **Approving authorities** also benefit from model interpretability by having a clear understanding of the risk the business is going to take in adopting the model, understanding the impact of the model's decisions on humans, and anticipating any legal or regulatory issues that the model may face. Finally, **customer** experience and decision making can also be improved if customers understand why a model gives a certain prediction.

<div class="alert alert-info">

**Points for Discussion**

Here, we outlined the benefits of having an interpretability pipeline in our machine learning projects. Can you think of cases where having model interpretability is NOT preferred? Give a particular instance where having model interpretability can do more harm than good.

</div>

## What makes a good explanation?

Before we begin explaining the predictions of machine learning models, it helps to understand what makes an explanation good and acceptable for humans. We are, after all, making explanations for humans to digest. This helps us better frame the model explanations we get from model interpretability methods.

Here are a few important characteristics of a good explanation *(see Chapter 3.6 of Molnar's Interpretable Machine Learning for a complete list [[2]](#ref:molnar))*:

1. **Contrastive** - Model explanations must be able to answer *why a given prediction was made instead of another prediction*. For example, for a model that recommends whether an individual should be given a loan, a good explanation must be able to tell the applicant *what factors should/could I change to alter the model prediction*.
2. **Selective** - We *humans can only comprehend about 2 to 3 variables at a time*. Thus, a good model explanation must be able to *list the important drivers that explain an outcome*. Imagine having an interpretable model such as a linear regression or a decision tree, but with an explanation that looks at hundreds or thousands of variables. It would be really hard for any human to digest that explanation!
3. **Consistent with prior beliefs** - How model explanations are received depends greatly on how people perceive them. When *an explanation is consistent with the prior beliefs of an individual*, they tend to favor that explanation (also known as **confirmation bias**). This is not to discredit any novel or serendipitous discoveries from the model explanation. However, having an explanation that is in line with a domain expert's beliefs helps in the model's adoption. Furthermore, if a model is found to exhibit behavior inconsistent with the domain expert's beliefs, we can enforce constraints on the model or use a linear model that has the required property.

<div class="alert alert-info">

**Points for Discussion**

Among the interpretability methods that you currently know, can you create model explanations that satisfy all three of the characteristics we discussed above?

</div>

## Types of Explainability Methods

Model interpretability methods can be classified according to three different criteria (see Figure 5) [[7]](#ref:bbox-peek):

<a name='fig:taxonomy'></a>
<div>
<img src="images/taxonomy.png" align="left" width="550"/>
</div>

<br style="clear:both" />
<br style="clear:both" />

<div>
    <p style="font-size:12px;font-style:default;">
        <b>Figure 5. Taxonomy of different explainability methods.</b><br>
           Model interpretability or explainability methods can be intrinsic or post-hoc, model-specific or model-agnostic, local or global. Model-agnostic methods can be further classified whether they use surrogate models or are just visualizations of the behavior of the black-box model.
    </p>
</div>

First, we can classify a method by whether it stems from the model being **intrinsically** interpretable. If the model is not intrinsically interpretable, then the explanation method is applied **post-hoc**, that is, after model training. Examples of intrinsically interpretable models include linear models, decision trees, and rule-based models ([RuleFit](https://github.com/rohan-gt/rulefit) [[8]](#ref:rulefit)).
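
To make the idea of intrinsic interpretability concrete, here is a minimal sketch on hypothetical toy data (the data, variable names, and feature names `x0` and `x1` are assumptions made for illustration, not part of the course material). A linear model is read through its coefficients, while a shallow decision tree is read through its rules.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor, export_text

# Hypothetical toy data: y depends strongly on x0 and weakly on x1.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(300, 2))
y_toy = 3.0 * X_toy[:, 0] - 0.5 * X_toy[:, 1] + rng.normal(scale=0.1, size=300)

# A linear model is explained directly by its fitted coefficients.
linear = LinearRegression().fit(X_toy, y_toy)
print("coefficients:", linear.coef_.round(2))
print("intercept:", round(linear.intercept_, 2))

# A shallow decision tree is explained directly by its decision rules.
tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X_toy, y_toy)
rules = export_text(tree, feature_names=["x0", "x1"])
print(rules)
```

No separate explanation method is needed in either case: the fitted parameters and rules are the explanation.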

We can further classify a method by whether it is **model-specific** or **model-agnostic** (intrinsic explainability methods are model-specific by definition). Model-agnostic means that the explainability method can be applied to any black-box model. Model-agnostic methods can be further divided by whether they apply **surrogate** models (e.g., LIME or Kernel SHAP) or are **visualizations** of the behavior of the black-box model (e.g., Partial Dependence Plots, Individual Conditional Expectations, or Accumulated Local Effects) [[9]](#ref:med-image).
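
As an illustration of the visualization family, the sketch below computes Individual Conditional Expectation (ICE) curves and a Partial Dependence curve by hand for a black-box model. It only calls `predict`, which is what makes it model-agnostic. The toy data, the helper `ice_curves`, and the choice of an RBF SVR are assumptions for illustration only.

```python
import numpy as np
from sklearn.svm import SVR

# Hypothetical toy data with a cubic effect of the first feature.
rng = np.random.default_rng(0)
X_toy = rng.uniform(-2, 2, size=(200, 2))
y_toy = X_toy[:, 0] ** 3 + X_toy[:, 1]

# Any fitted model with a predict method would work here.
black_box = SVR(kernel="rbf", C=100.0).fit(X_toy, y_toy)


def ice_curves(model, X, feature, grid):
    """Return an (n_samples, n_grid) array of predictions obtained by
    forcing column `feature` to each value in `grid` for every sample."""
    curves = np.empty((X.shape[0], grid.size))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value
        curves[:, j] = model.predict(X_mod)
    return curves


grid = np.linspace(-2, 2, 9)
ice = ice_curves(black_box, X_toy, feature=0, grid=grid)  # one curve per instance
pdp = ice.mean(axis=0)  # the partial dependence is the average ICE curve
print(np.round(pdp, 2))
```

Plotting each row of `ice` against `grid` gives the ICE plot, and plotting `pdp` against `grid` gives the Partial Dependence Plot.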

Finally, we can classify an explainability method by whether it is a **global** explanation, i.e., it explains the whole model behavior (e.g., feature importance and summary visualizations), or a **local** explanation, i.e., it explains the prediction for a particular instance in the test or train dataset.
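
One example of a global, post-hoc, model-agnostic explanation is permutation feature importance, available in scikit-learn as `sklearn.inspection.permutation_importance`. The sketch below is only an illustration: the toy data and the random forest are assumptions, and the importances are computed on the training data for brevity.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# Hypothetical toy data: x0 matters most, x1 matters a little, x2 is pure noise.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(300, 3))
y_toy = 5.0 * X_toy[:, 0] + 1.0 * X_toy[:, 1] + rng.normal(scale=0.1, size=300)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_toy, y_toy)

# Shuffle one feature at a time and measure how much the score drops.
result = permutation_importance(model, X_toy, y_toy, n_repeats=10, random_state=0)

# Rank features from most to least important.
ranking = np.argsort(result.importances_mean)[::-1]
for idx in ranking:
    print(f"x{idx}: {result.importances_mean[idx]:.3f} "
          f"+/- {result.importances_std[idx]:.3f}")
```

The output is a single ranking for the whole model, which is what makes it a global explanation. A local method would instead produce one explanation per instance.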

## Predictive vs Explanatory Modeling

Two of the models we introduced in Machine Learning 1, namely Decision Trees and Linear Regression, can actually be considered intrinsically explainable. However, we've only presented them in the context of maximizing their prediction accuracy. To reframe their usage for explainability, we first need to differentiate between using models for prediction and using them for explanation [[12]](#ref:predict-explain).

- **Predictive Modeling** - *the process of applying a statistical or machine learning model to data for the purpose of predicting new or future observations.* As such, the goal of predictive modeling is to minimize the combination of bias and estimation variance to obtain the required empirical precision.

- **Explanatory Modeling** - *the process of applying statistical models to data for testing causal hypotheses about the theoretical constructs of the data generation process*. The focus of explanatory modeling is to minimize bias to obtain the most accurate representation of the underlying theory.
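
The difference between the two goals can be made explicit with the standard decomposition of the expected squared prediction error at a point $x$. The notation here is introduced for illustration only: we assume $Y = f(x) + \varepsilon$, where the noise $\varepsilon$ has zero mean and variance $\sigma^2$ and is independent of the fitted model $\hat{f}$.

\begin{equation}
\mathbb{E}\left[\left(Y - \hat{f}(x)\right)^2\right] = \sigma^2 + \left(\mathbb{E}\left[\hat{f}(x)\right] - f(x)\right)^2 + \mathrm{Var}\left(\hat{f}(x)\right)
\end{equation}

The second term is the squared bias and the third is the estimation variance. Predictive modeling minimizes their sum, so it may accept some bias in exchange for lower variance, whereas explanatory modeling concentrates on the bias term alone.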

To see how these differ, let's look at two different pipelines in the succeeding subsections. One will focus on predictive modeling (as usual), while the other will focus heavily on explanatory modeling.

We'll set our data generation process to be linear in $X_1$ and cubic in $X_2$ for our analysis to emphasize the difference between the two pipelines (see Equation \ref{eq:data-gen}).

\begin{equation}
y = \beta_2 X_2 ^ 3 + \beta_1 X_1 + \beta_0 \tag{1} \label{eq:data-gen}
\end{equation}

In [1]:
import numpy as np

# make_regression from scikit-learn generates a random regression problem
from sklearn.datasets import make_regression

# Generate 200 samples using the model:
# y = b_2 X_2^3 + b_1 X_1 + 0.77
# while adding Gaussian noise with a standard deviation of 4.50 to the target.
#
# Notice that we name the output data matrix X_processed. This is because
# we will take the cube root of the second feature to simulate a linear
# dependence on X_2^3.
X_processed, y, coef = make_regression(
    n_samples=200,
    n_features=2,
    bias=0.77,
    coef=True,
    noise=4.50,
    random_state=1337
)
X = np.array([X_processed[:, 0], np.cbrt(X_processed[:, 1])]).T
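
As a sanity check on the data generation process, one could cube the second feature again and fit an ordinary least squares model on $[X_1, X_2^3]$. If the setup matches Equation (1), the fitted coefficients should land close to the true `coef` returned by `make_regression`. This is only a sketch: the names `X_design` and `ols` are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Same data generation as in the cell above.
X_processed, y, coef = make_regression(
    n_samples=200,
    n_features=2,
    bias=0.77,
    coef=True,
    noise=4.50,
    random_state=1337
)
X = np.array([X_processed[:, 0], np.cbrt(X_processed[:, 1])]).T

# Rebuild the design matrix [X_1, X_2^3] implied by Equation (1).
X_design = np.column_stack([X[:, 0], X[:, 1] ** 3])

# Fit ordinary least squares and compare with the true parameters.
ols = LinearRegression().fit(X_design, y)
print("true coefficients:  ", coef)
print("fitted coefficients:", ols.coef_)
print("fitted intercept:   ", ols.intercept_, "(true bias: 0.77)")
```

Because of the added noise, the fitted values will not match the true ones exactly.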

### Predictive Modeling

A predictive modeling pipeline would proceed as follows (for the model experimentation and evaluation step):

1. Hold-out test set segregation - setting out a hold-out test set for evaluation
2. Model selection - shortlisting which models to use
3. Cross validation design - how to perform cross validation
4. Hyperparameter tuning - deciding which optimal model to use

Let's use 20% of the datapoints as our hold-out set.

In [2]:
from sklearn.model_selection import train_test_split

X_trainval, X_holdout, y_trainval, y_holdout = train_test_split(
    X, y, test_size=0.20, random_state=1337)

Suppose we decided to use the following models:

1. Ridge Regression
2. Linear SVM
3. Nonlinear SVM

We then tune the `C` or `alpha` hyperparameter of each model over the range `[1.e-05, 1.e-04, 1.e-03, 1.e-02, 1.e-01, 1.e+00, 1.e+01, 1.e+02, 1.e+03, 1.e+04, 1.e+05]`.


In [3]:
from sklearn.pipeline import Pipeline
from sklearn.linear_model import Ridge
from sklearn.svm import SVR

# Define C and alpha hyperparameter range
C_range = np.logspace(-5, 5, 11)

# Prepare the pipeline and parameter grid
pipe = Pipeline([('clf', Ridge())])
param_grid = [
    {'clf': [Ridge(random_state=1337)], 'clf__alpha': C_range},
    {'clf': [SVR()], 'clf__kernel': ['linear', 'rbf'], 'clf__C': C_range}
]

Next, let's find the optimal model for this case using a 5-fold cross-validation strategy with `r2` as our metric.

In [6]:
from sklearn.model_selection import GridSearchCV

grid_search = GridSearchCV(
    estimator=pipe,
    param_grid=param_grid,
    cv=5,
    scoring='r2',
    n_jobs=-1,
    return_train_score=True
)
grid_search.fit(X_trainval, y_trainval)
print(f"The best model is: {grid_search.best_params_}")

The best model is: {'clf': SVR(C=10000.0), 'clf__C': 10000.0, 'clf__kernel': 'rbf'}


In [8]:
grid_search.cv_results_

(Output truncated: `cv_results_` is a dictionary of per-candidate arrays, such as `mean_fit_time` and `std_fit_time`, with one entry for each of the 33 hyperparameter configurations.)
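
The raw `cv_results_` dictionary is hard to read. One common way to inspect it is to load it into a pandas DataFrame and sort by rank, after which the selected model can be scored on the hold-out set. The sketch below regenerates the same data but, as an assumption made to keep it fast, searches a reduced grid (`C_small`), so its results will not match the full search above.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVR

# Same data generation and hold-out split as in the cells above.
X_processed, y = make_regression(
    n_samples=200, n_features=2, bias=0.77, noise=4.50, random_state=1337)
X = np.column_stack([X_processed[:, 0], np.cbrt(X_processed[:, 1])])
X_trainval, X_holdout, y_trainval, y_holdout = train_test_split(
    X, y, test_size=0.20, random_state=1337)

# Assumption: a reduced hyperparameter range so that the sketch runs quickly.
C_small = np.logspace(-1, 2, 4)
pipe = Pipeline([('clf', Ridge())])
param_grid = [
    {'clf': [Ridge(random_state=1337)], 'clf__alpha': C_small},
    {'clf': [SVR()], 'clf__kernel': ['linear', 'rbf'], 'clf__C': C_small}
]
search = GridSearchCV(pipe, param_grid, cv=5, scoring='r2',
                      return_train_score=True)
search.fit(X_trainval, y_trainval)

# Keep only the most informative columns and sort by cross-validation rank.
columns = ['params', 'mean_train_score', 'mean_test_score',
           'std_test_score', 'rank_test_score']
results = pd.DataFrame(search.cv_results_)[columns].sort_values('rank_test_score')
print(results.head())

# Score the refitted best model on the hold-out set (R^2).
holdout_r2 = search.score(X_holdout, y_holdout)
print(f"Hold-out R^2: {holdout_r2:.3f}")
```

Comparing `mean_train_score` with `mean_test_score` in this table is a quick way to spot candidates that overfit.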

More often than not, when we focus on predictive accuracy, we sacrifice theoretical accuracy for improved empirical precision. Indeed, the model with the highest predictive accuracy is often among the least interpretable (see Figure <a href='#fig:accuracy-interpretability'>2</a>) [[1]](#ref:interpret-ml).

<a name='fig:accuracy-interpretability'></a>
<div>
<img src="images/accuracy-interpretability-trade-off.PNG" align="left" width="450"/>
</div>

<br style="clear:both" />

<div>
    <p style="font-size:12px;font-style:default;">
        <b>Figure 2. Machine learning model accuracy and interpretability tradeoff.</b><br>
           Models that are highly accurate are the least interpretable, while models that are highly interpretable have a sub-par accuracy [<a href='#ref:interpret-ml'>1</a>].
    </p>
</div>


## References

<a name='ref:interpret-ml'></a> [1] Guo, Mengzhuo, et al. "An interpretable machine learning framework for modelling human decision behavior." *arXiv preprint arXiv:1906.01233* (2019).

<a name='ref:molnar'></a> [2] Molnar, Christoph. “Interpretable machine learning. A Guide for Making Black Box Models Explainable”, 2019. https://christophm.github.io/interpretable-ml-book/.

<a name='ref:miller'></a> [3] Miller, Tim. “Explanation in artificial intelligence: Insights from the social sciences.” *arXiv preprint arXiv:1706.07269* (2017).

<a name='ref:kim'></a> [4] Kim, Been, Rajiv Khanna, and Oluwasanmi O. Koyejo. “Examples are not enough, learn to criticize! Criticism for interpretability.” *Advances in Neural Information Processing Systems* (2016).

<a name='ref:crisp-dm'></a> [5] Kelleher, John D., Brian Mac Namee, and Aoife D’Arcy. "Machine Learning for Predictive Analytics: The Predictive Data Analytics Project Lifecycle: CRISP-DM." *Fundamentals of machine learning for predictive analytics*, The MIT Press, 2020, pp. 15-17.

<a name='ref:great-ai-debate'></a> [6] NeurIPS 2017. “The Great AI Debate - NIPS2017 - Yann LeCun.” *YouTube*, uploaded by The Artificial Intelligence Channel, 1 February 2018, https://youtu.be/93Xv8vJ2acI.

<a name='ref:bbox-peek'></a> [7] Adadi, Amina, and Mohammed Berrada. "Peeking inside the black-box: a survey on explainable artificial intelligence (XAI)." *IEEE Access* 6 (2018): 52138-52160.

<a name='ref:rulefit'></a> [8] Friedman, Jerome H., and Bogdan E. Popescu. "Predictive learning via rule ensembles." *The Annals of Applied Statistics* 2.3 (2008): 916-954.

<a name='ref:med-image'></a> [9] Singh, Amitojdeep, Sourya Sengupta, and Vasudevan Lakshminarayanan. "Explainable deep learning models in medical image analysis." *Journal of Imaging* 6.6 (2020): 52.

<a name='ref:shap-how'></a> [10] Mazzanti, Samuele. "SHAP Values Explained Exactly How You Wished Someone Explained to You." *Towards Data Science*, 04 Apr. 2020, https://towardsdatascience.com/shap-explained-the-way-i-wish-someone-explained-it-to-me-ab81cc69ef30

<a name='ref:shapley-handbook'></a> [11] Algaba, Encarnación, Vito Fragnelli, and Joaquín Sánchez-Soriano, eds. *Handbook of the Shapley value*. CRC Press, 2019.

<a name='ref:predict-explain'></a> [12] Shmueli, Galit. "To explain or to predict?" *Statistical Science* 25.3 (2010): 289-310.

