# 03 - Create explainers

In the previous part we trained and validated a model, but can we really trust it? Often, when a model predicts the wrong thing, you're left wondering why it made that decision. This is where explainers can help. Ultimately, explainers help you build trust.

In this notebook we're going to train an explainer and use it to explore the model. We'll cover the following topics:

* [Loading the trained model](#loading-the-trained-model)
* [Loading the test dataset](#loading-the-test-dataset)
* [Creating a blackbox performance explainer](#creating-a-blackbox-performance-explainer)
* [Creating a blackbox model explainer](#creating-a-blackbox-model-explainer)
* [Creating a local model explainer](#creating-a-local-model-explainer)

Unlike the previous steps of the tutorial, this step contains prefilled code, because explainers are a complete course on their own.

Let's get started by loading the trained model from the previous step in the tutorial.

## Loading the trained model

In the previous step we saved our model using joblib. Let's load it back up again.

In [56]:
import joblib

In [57]:
classifier = joblib.load('../models/classifier.bin')

Now that we have the model, let's load up a testing set to create the explainers for the model.

## Loading the test dataset

We're going to use the test dataset, as it provides the best independent explanations for the model.
Let's use pandas to load up the dataset and extract the features and outputs of the model. We also load the training set, because some of the explainers use it as reference data.

In [58]:
import pandas as pd

In [59]:
df_train = pd.read_csv('../data/processed/train.csv')
df_test = pd.read_csv('../data/processed/test.csv')

In [60]:
feature_names = [
    'LIMIT_BAL',
    'SEX',
    'EDUCATION',
    'MARRIAGE',
    'AGE',
    'PAY_0', 'PAY_2', 'PAY_3', 'PAY_4', 'PAY_5', 'PAY_6',
    'BILL_AMT1', 'BILL_AMT2', 'BILL_AMT3', 'BILL_AMT4', 'BILL_AMT5', 'BILL_AMT6',
    'PAY_AMT1', 'PAY_AMT2', 'PAY_AMT3', 'PAY_AMT4', 'PAY_AMT5', 'PAY_AMT6'
]

target_name = 'default.payment.next.month'

In [61]:
x_test = df_test[feature_names]
y_test = df_test[target_name]
x_train = df_train[feature_names]
y_train = df_train[target_name]

With the dataset loaded, let's take a look at the first explainer.

## Creating a blackbox performance explainer
The first explainer we're going to create is a performance explainer. It gives you insight into how well the model separates the two classes. We're using an [ROC](https://en.wikipedia.org/wiki/Receiver_operating_characteristic) explainer.

In [62]:
from interpret.perf import ROC
from interpret import show

In [63]:
explain_perf = ROC(classifier.predict_proba, feature_names=feature_names).explain_perf(x_test, y_test, name='Performance')
show(explain_perf)

The performance explainer shows the receiver operating characteristic (ROC) curve that we used in the previous step of the tutorial. Here's how to interpret it.

### Interpreting the ROC curve (Top chart)
At the top there's the ROC curve. The orange line represents the classifier as we trained it.
The y-axis shows the true-positive rate: the fraction of actual positives (1) that the model correctly predicts as 1.
The x-axis shows the false-positive rate: the fraction of actual negatives (0) that the model incorrectly predicts as 1.

Ideally, the orange line passes as close as possible to the top-left corner, which corresponds to 0% false positives and 100% true positives.
In this case it doesn't, because the model isn't perfect: it picked up errors from the input data and misclassifies some samples.
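
To make the two axes concrete, here is a minimal sketch of how the points on an ROC curve can be computed by hand. It is not the code the `ROC` explainer runs internally. The helper names `roc_points` and `auc`, and the labels and scores, are made up for illustration.

```python
def roc_points(y_true, scores):
    """Return (false-positive rate, true-positive rate) pairs, one per threshold."""
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    # Use each distinct score as a threshold: predict 1 when score >= threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up labels and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
print(points)       # [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc(points))  # 0.75
```

A curve that hugs the top-left corner yields an area close to 1, while a model that guesses at random ends up around 0.5.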

### Interpreting the histogram (Bottom chart)
Underneath the first chart, there's a histogram of the predictions. It shows the distribution of the predicted probabilities.
You can use this to see how well the model separates 0 from 1. Note that values below 0.5 are rounded to 0 and values above 0.5 to 1 when you use the model in your application.

In this case you'll see a high number of predictions close to zero and a lower number of predictions close to one. Finally, there's a wide variety of predictions that fall in between the two classes that we specified. This means our classifier is unsure what to do in quite a lot of cases.
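
The rounding and the histogram can be sketched in a few lines of plain Python. The probabilities below are invented, and treating a value of exactly 0.5 as class 1 is an assumption of this sketch; the helper names `to_label` and `histogram` are hypothetical.

```python
from collections import Counter


def to_label(probability, threshold=0.5):
    """Turn a predicted probability into a class label (assumed cut-off: 0.5)."""
    return 1 if probability >= threshold else 0


def histogram(probabilities, bins=5):
    """Count how many probabilities fall into each of `bins` equal-width buckets."""
    counts = Counter(min(int(p * bins), bins - 1) for p in probabilities)
    return [counts.get(i, 0) for i in range(bins)]


# Invented predicted probabilities for the positive class.
probabilities = [0.05, 0.10, 0.12, 0.18, 0.45, 0.55, 0.62, 0.91]

labels = [to_label(p) for p in probabilities]
print(labels)                    # [0, 0, 0, 0, 0, 1, 1, 1]
print(histogram(probabilities))  # [4, 0, 2, 1, 1]
```

The predictions in the middle buckets are the ones the classifier is least sure about: a small change in the inputs could flip their label.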

Now that we have seen the performance, let's see what features have the most effect on the outcome of the model.

## Creating a blackbox model explainer
Let's explore our model further using a blackbox explainer. We're going to visualize which features have the most effect on the output of the model overall.
For this, we can use the MorrisSensitivity explainer.

In [64]:
from interpret.blackbox import MorrisSensitivity

In [65]:
explain_global = MorrisSensitivity(classifier.predict_proba, x_train, feature_names=feature_names).explain_global(name='Sensitivity')
show(explain_global)

The chart shows the relative impact of each feature on the output of the model. This chart should be interpreted as follows:

Each feature has a value (hover over a bar to read it) that it adds to or subtracts from the output.
These values are really small in this case, since we're dealing with an output on a scale from 0 to 1.

In practice, you can look at this chart and see that the PAY_0 column has the most influence on the outcome of the model.
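
To give an intuition for where such sensitivity values come from, here is a simplified sketch of the idea behind the Morris method: nudge one feature at a time and measure how much the output moves. It is a one-at-a-time approximation, not the trajectory sampling the library uses. The function `mean_abs_elementary_effects`, the `toy_model`, and all parameter values are hypothetical.

```python
import random


def mean_abs_elementary_effects(predict, n_features, n_points=50, delta=0.1, seed=0):
    """Average absolute change in output per unit change of each feature."""
    rng = random.Random(seed)
    totals = [0.0] * n_features
    for _ in range(n_points):
        # Random base point in the unit hypercube, leaving room for the step.
        base = [rng.uniform(0.0, 1.0 - delta) for _ in range(n_features)]
        base_output = predict(base)
        for i in range(n_features):
            moved = list(base)
            moved[i] += delta  # change only feature i
            effect = (predict(moved) - base_output) / delta
            totals[i] += abs(effect)
    return [total / n_points for total in totals]


def toy_model(x):
    # Hypothetical scoring function: feature 0 matters most, feature 2 not at all.
    return 0.6 * x[0] + 0.1 * x[1] + 0.0 * x[2]


sensitivity = mean_abs_elementary_effects(toy_model, n_features=3)
print(sensitivity)  # roughly [0.6, 0.1, 0.0]
```

In this toy setup the first feature plays the role that `PAY_0` plays in the chart: it has the largest value, so it has the most influence on the output.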

In the case of our model we could say that:

* If you ever missed a payment in the first month, it's more likely it will happen again.
* The difference in value between `PAY_0` and `PAY_2` is not that large, which means that when you miss your second payment, 
  it's even more likely you'll miss a payment again in the future. The effect of the length of missed payment terms is diminishing over time.
* After the payment terms, we see that age also has an effect.

Given the explainer, we can see that it's probably best to look at information about missed payments in the past, rather than age, gender, and education level, which matches common sense.


Now that we have an overall explanation of our model, we can start to zoom in on individual cases. Let's explore a local explainer next.

## Creating a local model explainer
Customers often want to know why the model made a particular prediction, especially when the model produces the wrong output.
Using a local explainer, you can see why a model produced a certain outcome for specific inputs.

We're using the LIME explainer for this purpose. It allows us to see the impact of input features on a specific outcome.


In [66]:
from interpret.blackbox import LimeTabular

In [67]:
explain_local = LimeTabular(classifier.predict_proba, x_train, feature_names=feature_names).explain_local(x_test.iloc[:5,:], y_test.iloc[:5])

In [68]:
show(explain_local)

> Note: To view individual explanations, select a prediction from the dropdown in the widget.

The local explainer helps you understand the impact of the various input values on the outcome of the model for a single sample.
Note that the feature importance for a single sample can differ considerably from the global explanation, because the global explanation summarizes the model over the whole dataset and assumes a lot about the shape of the data: it uses information about the mean and standard deviation of the features.
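
The core idea behind LIME can be sketched without any libraries: perturb the sample, weight the perturbed points by how close they are to the original, and fit a weighted linear model whose coefficients act as local feature importances. This is a heavily simplified illustration, not what `LimeTabular` does internally. The helpers `solve` and `local_linear_explanation`, the `toy_model`, and all parameter values are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b with Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [v - factor * p for v, p in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def local_linear_explanation(predict, instance, n_samples=500, scale=0.1, seed=0):
    """Fit a proximity-weighted linear model around `instance`."""
    rng = random.Random(seed)
    k = len(instance) + 1  # one coefficient per feature plus an intercept
    xtwx = [[0.0] * k for _ in range(k)]
    xtwy = [0.0] * k
    for _ in range(n_samples):
        # Perturb the instance with Gaussian noise.
        offsets = [rng.gauss(0.0, scale) for _ in instance]
        sample = [v + o for v, o in zip(instance, offsets)]
        # Samples closer to the instance get a larger weight.
        distance_sq = sum(o * o for o in offsets)
        weight = math.exp(-distance_sq / (2 * scale ** 2))
        row = [1.0] + offsets
        y = predict(sample)
        # Accumulate the weighted normal equations (X^T W X) c = X^T W y.
        for i in range(k):
            xtwy[i] += weight * row[i] * y
            for j in range(k):
                xtwx[i][j] += weight * row[i] * row[j]
    return solve(xtwx, xtwy)[1:]  # drop the intercept


def toy_model(x):
    # Hypothetical non-linear model: feature 0 pushes the score up, feature 1 down.
    return 1.0 / (1.0 + math.exp(-(3.0 * x[0] - x[1])))


coefficients = local_linear_explanation(toy_model, [0.2, 0.5])
print(coefficients)
```

The sign of each coefficient tells you whether the feature pushes this particular prediction up or down, and the magnitude tells you how strongly. Because the fit only looks at the neighborhood of one sample, a different sample can yield a different ranking.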


## Summary

In this notebook we've used various explainers to get a better understanding of the model in relation to the data that was used to train and validate the model. As you've noticed, you'll need different explainers to answer different questions:

* Use performance explainers to understand the overall performance.
* Use global explainers to get a sense of what features are most important for your model.
* Use local explainers to understand why a certain case produced an (un)expected output.

We hope you liked this tutorial! If you have any questions, don't hesitate to open an issue on GitHub or drop a note on Twitter @willem_meints.