## Motivation

"Explainable AI (XAI) refers to methods and techniques in the application of artificial intelligence technology (AI) such that the results of the solution can be understood by humans." 

Source - [Wikipedia](https://en.wikipedia.org/wiki/Explainable_artificial_intelligence)

I discovered a library called [SHAP](https://github.com/slundberg/shap) that helps you understand how a model arrives at a prediction.

This notebook is just a "how to use" example and stays close to the official documentation; if you prefer, you can go straight there and see for yourself.

In [None]:
!pip install nb_black -q

In [None]:
%load_ext nb_black

# Importing the dataset and model to use with the SHAP library

This section imports the 'churn modeling' dataset so that SHAP can show which features push a prediction toward churn and which push it away.

#### Data profile
- There are 10 feature columns plus the target;
- No missing values;
- The Exited column is the target;

#### Columns meaning
- CreditScore: The customer's score in a financial context;
- Geography: Represents the customer's country;
- Gender: The customer's sex;
- Age: The customer's age;
- Tenure: How long the person has been a customer;
- Balance: How much money the customer has in the bank;
- NumOfProducts: How many products the customer uses;
- HasCrCard: Does the customer have a credit card?
- IsActiveMember: Is the customer an active member?
- EstimatedSalary: The customer's estimated salary;
- Exited: Client churn flag

In [None]:
import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import plotly.express as px
import plotly.figure_factory as ff
import os
from sklearn.model_selection import train_test_split
import tensorflow as tf

data = pd.read_csv("/kaggle/input/churn-modeling-dataset/Churn_Modelling.csv").drop(
    ["RowNumber", "CustomerId", "Surname"], axis=1
)
data.head()

## Data formatting

- MinMaxScaler -> Scales each feature to a given range (by default [0, 1]). The scaled value of a sample x is calculated as: x_scaled = (x - min) / (max - min).

- LabelEncoder -> Encode target labels with value between 0 and n_classes-1.

- OneHotEncoder -> Encode categorical features as a one-hot numeric array.
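
To make these transformations visible, here is a minimal sketch that applies the three transformers used in the next cell (OneHotEncoder, MinMaxScaler and LabelEncoder) to a tiny hand-made frame. The `toy` DataFrame and its values are hypothetical and exist only for this illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, MinMaxScaler, OneHotEncoder

# Hypothetical three-row frame, used only to make the transformations visible
toy = pd.DataFrame(
    {
        "Geography": ["France", "Germany", "Spain"],
        "Age": [20, 40, 60],
        "Gender": ["Female", "Male", "Female"],
    }
)

# One column per country, with a 1 in the column of the row's country
one_hot = OneHotEncoder(handle_unknown="ignore").fit_transform(toy[["Geography"]]).toarray()

# (x - min) / (max - min), so the smallest age maps to 0 and the largest to 1
scaled = MinMaxScaler().fit_transform(toy[["Age"]])

# Integer codes, assigned in the sorted order of the class names
gender = LabelEncoder().fit_transform(toy["Gender"])

print(one_hot)
print(scaled.ravel())
print(gender)
```

Each transformer is fitted on the whole column here for brevity; in a real project you would normally fit on the training split only and reuse the fitted objects on new data.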


Data transformation is not my focus here, so I reuse the preprocessing from this [notebook](https://www.kaggle.com/mcarujo/churn-prediction-ann-over-under-sampling) to prepare the data for SHAP.

In [None]:
from sklearn.preprocessing import OneHotEncoder, MinMaxScaler, LabelEncoder

enc = OneHotEncoder(handle_unknown="ignore")
minmax_scaler = MinMaxScaler()
label_encoder = LabelEncoder()

X = np.concatenate(
    (
        ## OneHotEncoder
        enc.fit_transform(data[["Geography"]]).toarray(),
        ## MinMaxScaler
        minmax_scaler.fit_transform(
            data[
                [
                    "CreditScore",
                    "Age",
                    "Tenure",
                    "Balance",
                    "NumOfProducts",
                    "EstimatedSalary",
                ]
            ]
        ),
        ## LabelEncoder (expects a 1-D column, so pass a Series rather than a DataFrame)
        label_encoder.fit_transform(data["Gender"]).reshape(-1, 1),
        ## No transformation
        data[["HasCrCard", "IsActiveMember"]].values,
    ),
    axis=1,
)

y = data.Exited.values
X.shape

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.05, random_state=42, stratify=y
)

Getting the names of our new columns after the transformation...

In [None]:
columns = (
    [el for el in enc.categories_[0]]
    + ["CreditScore", "Age", "Tenure", "Balance", "NumOfProducts", "EstimatedSalary",]
    + ["Gender"]
    + ["HasCrCard", "IsActiveMember"]
    + ["Exited"]
)

## Correlation Matrix
It is not easy to read the impact of each feature on the target variable from the correlation matrix alone; however, a few relationships do stand out.

In this dataset, customers from Germany are more likely to churn.

'Age' and 'Exited' are correlated.

'Balance' is correlated with 'Germany', and through it with 'Exited'.

'France' and 'EstimatedSalary' also show some relation to 'Exited'.

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

table = pd.DataFrame(np.concatenate([X, y.reshape(-1, 1)], axis=1))
table.columns = columns
table = table.corr()
with sns.axes_style("white"):
    mask = np.zeros_like(table)
    mask[np.triu_indices_from(mask)] = True
    plt.figure(figsize=(10, 10))
    sns.heatmap(
        round(table, 2),
        cmap="Reds",
        mask=mask,
        vmax=table.max().max(),
        vmin=table.min().min(),
        linewidths=0.5,
        annot=True,
        annot_kws={"size": 12},
    ).set_title("Correlation Matrix Churn Modelling dataset")

columns.remove("Exited")

## RandomForestClassifier

My last notebook used an ANN; however, a simpler model is a good fit in this case. Random Forest, for example, is simpler than an ANN and also faster to train. :)

In [None]:
from sklearn.ensemble import RandomForestClassifier


model_rfc = RandomForestClassifier()
model_rfc.fit(X_train, y_train)

Only the training set is used to fit the model, because I would like to see how SHAP explains a prediction for data the model has never seen before.

# SHAP

SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions (see papers for details and citations).
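
To make the idea of Shapley values concrete, here is a minimal sketch that computes them exactly by brute force for a toy three-feature function. It is not how the SHAP library works internally (the library uses much faster model-specific algorithms); `shapley_values` and `toy_value` are hypothetical names invented for this illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every coalition (feasible only for tiny n)."""
    players = list(range(n_features))
    phi = [0.0] * n_features
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                coalition = set(subset)
                # Marginal gain of adding feature i to the coalition
                gain = value_fn(coalition | {i}) - value_fn(coalition)
                phi[i] += weight * gain
    return phi


def toy_value(coalition):
    """Hypothetical model output when only the features in `coalition` are known."""
    score = 0.2  # baseline: no feature is known
    if 0 in coalition:
        score += 0.3
    if 1 in coalition:
        score -= 0.1
    if 0 in coalition and 2 in coalition:
        score += 0.2  # interaction between features 0 and 2
    return score


phi = shapley_values(toy_value, 3)
baseline = toy_value(set())
full_output = toy_value({0, 1, 2})

print("Contributions:", phi)
print("Baseline + contributions:", baseline + sum(phi))
print("Model output with all features:", full_output)
```

The interaction bonus of 0.2 is split evenly between features 0 and 2, and the contributions always add up to the difference between the full output and the baseline. That additive property is what the rest of this notebook relies on.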

[Docs here.](https://shap.readthedocs.io/en/latest/examples.html#kernel-explainer)


Just taking one sample of our test dataset to be predicted and explained.


In [None]:
sample = pd.DataFrame(X_test[0]).T
sample.columns = columns
sample

### The explainer: the object that explains what happens here
TreeExplainer receives your model and then explains how it produces its output. There are similar explainers:
- [TreeExplainer](https://shap.readthedocs.io/en/latest/generated/shap.TreeExplainer.html#shap.TreeExplainer) -> Uses Tree SHAP algorithms to explain the output of ensemble tree models.
- [Kernel Explainer](https://shap.readthedocs.io/en/latest/generated/shap.KernelExplainer.html#shap.KernelExplainer) -> Uses the Kernel SHAP method to explain the output of any function.
- [Deep Explainer](https://shap.readthedocs.io/en/latest/generated/shap.DeepExplainer.html#shap.DeepExplainer) -> Meant to approximate SHAP values for deep learning models.
- [Gradient Explainer](https://shap.readthedocs.io/en/latest/generated/shap.GradientExplainer.html#shap.GradientExplainer) -> Explains a model using expected gradients (an extension of integrated gradients).
- [Linear Explainer](https://shap.readthedocs.io/en/latest/generated/shap.LinearExplainer.html#shap.LinearExplainer) -> Computes SHAP values for a linear model, optionally accounting for inter-feature correlations.
- [Partition Explainer](https://shap.readthedocs.io/en/latest/generated/shap.PartitionExplainer.html#shap.PartitionExplainer) -> Uses the Partition SHAP method to explain the output of any function.

In [None]:
import shap

shap.initjs()  # Just to create better graphs =D
explainer = shap.TreeExplainer(model_rfc)

Informally, 'explainer' is simply an object that is able to explain a prediction.

The next cell prints the model's prediction for the sample, so we can keep it in mind.

In [None]:
prediction = model_rfc.predict_proba(sample)
print("Direct print:", prediction)
print(
    "Probability to be class 0:",
    prediction[0][0],
    "\nProbability to be class 1:",
    prediction[0][1],
)

#### How the model predicts our sample

SHAP starts its analysis from a 'baseline' for each class prediction.

expected_value returns these baselines, and from there we can see the impact of each feature.
> This is the reference value that the feature contributions start from. For SHAP values it should
> be the value of explainer.expected_value. 
> 
[Docs here.](https://github.com/slundberg/shap/blob/06c9d18f3dd014e9ed037a084f48bfaf1bc8f75a/shap/plots/force.py#L31)

In [None]:
print(explainer.expected_value)

The expected_value is an array with 2 floats: one entry per class (0 and 1, since the problem is binary), and each float is the baseline probability for that class.

#### JUST AN OPINION
> From my point of view, the SHAP baseline is strongly influenced by the bias in your model, for example the class balance of the training data. Carujo.

In [None]:
print("How is the target balance?", y_train.mean())
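
The link between the baseline and the class balance can be checked on synthetic data without SHAP at all. The sketch below rests on an assumption: that for a random forest the baseline is close to the class distribution at the root of each tree, averaged over the trees. This is my reading of how a tree explainer weights leaves by the training samples that reach them, not something taken from the SHAP documentation. The data and all variable names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical imbalanced data: roughly 80% class 0 and 20% class 1
X_toy, y_toy = make_classification(
    n_samples=2000, n_features=8, weights=[0.8, 0.2], random_state=0
)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_toy, y_toy)

# Class distribution at each tree's root node, normalised so the result is the
# same whether scikit-learn stores counts or fractions there
roots = np.array([tree.tree_.value[0][0] for tree in forest.estimators_])
roots = roots / roots.sum(axis=1, keepdims=True)
approx_baseline = roots.mean(axis=0)

class_share = np.bincount(y_toy) / len(y_toy)
print("Approximate baseline per class:", approx_baseline)
print("Class share in the data:       ", class_share)
```

If the two printed arrays are close, the baseline is essentially the class prior of the training data, which would explain why an imbalanced target shifts the starting point of every explanation.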

## I want to know about the sample...
Now let's see how we can 'understand' the prediction for one sample.

In [None]:
shap_values = explainer.shap_values(sample.loc[0])
shap_values

Here it is in a more familiar, tabular form...

In [None]:
pd.DataFrame(shap_values, columns=columns)

### Baseline + Contributions = Predicted Probability

In [None]:
print("Direct prediction for class 0:", prediction[0][0])
aux = shap_values[0].sum() + explainer.expected_value[0]
print("Sum of Baseline + Feature Contributions:", aux)
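
Written as a formula, the check in this cell is the additivity property of SHAP values. The notation is my own: $c$ is the class index, $M = 12$ is the number of feature columns after preprocessing, $\phi_{0,c}$ corresponds to explainer.expected_value[c], and $\phi_{i,c}$ to the i-th entry of shap_values[c].

$$
f_c(x) = \phi_{0,c} + \sum_{i=1}^{M} \phi_{i,c}(x)
$$

For class 0, the left-hand side is prediction[0][0], so the two printed numbers should agree up to floating point error.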

### Plot possibilities

#### Probability to be class 0


In [None]:
shap.force_plot(explainer.expected_value[0], shap_values[0], sample)

#### Probability to be class 1

In [None]:
shap.force_plot(explainer.expected_value[1], shap_values[1], sample)

### Thinking about putting this in production

When we train our model, we can also build the SHAP explainer at the same time, so it is always ready to explain a prediction whenever you need one.

In [None]:
import shap
import pandas as pd
import numpy as np


# Build the SHAP explainer for your trained model
def explain_train(model):
    return shap.TreeExplainer(model)


# Pass the explainer returned by the previous function and a one-row DataFrame with the model's columns
def explain_this(explainer, sample):
    columns = sample.columns
    shap_values = explainer.shap_values(sample.iloc[0])
    aux = pd.DataFrame(shap_values, columns=columns)
    aux["_BASELINE"] = explainer.expected_value
    aux["_CLASSES"] = np.arange(len(explainer.expected_value))  # class index of each row
    return aux

In [None]:
sample

In [None]:
explainer = explain_train(model_rfc)

In [None]:
shap_values = explain_this(explainer, sample)
shap_values
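
In production you usually want only the few features that mattered most, not the whole table. Here is a minimal sketch of a hypothetical helper, `top_contributions`, that ranks the contributions for one class. It assumes a frame shaped like the one returned by explain_this (one row per class, helper columns prefixed with an underscore); the `explained` frame below is hand-made and its numbers are invented for the example.

```python
import pandas as pd


def top_contributions(explained, row=1, k=3):
    """Return the k features with the largest absolute contribution for one row."""
    # Helper columns such as _BASELINE start with an underscore; leave them out
    feature_cols = [c for c in explained.columns if not c.startswith("_")]
    contributions = explained.loc[row, feature_cols].astype(float)
    order = contributions.abs().sort_values(ascending=False).index[:k]
    return contributions[order]


# Hypothetical output for two classes and three features
explained = pd.DataFrame(
    {
        "Age": [-0.10, 0.10],
        "Balance": [0.02, -0.02],
        "Germany": [-0.05, 0.05],
        "_BASELINE": [0.8, 0.2],
        "_CLASSES": [0, 1],
    }
)

top = top_contributions(explained, row=1, k=2)
print(top)
```

Sorting by absolute value keeps features that push the prediction down as well as those that push it up, and the sign of each value tells you the direction.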

In the [documentation](https://github.com/slundberg/shap) you can find many other types of plots to use and explore in your own work.

I strongly recommend you go there and check the options.