# Explainable AI

Making models is really cool, but in practice, especially in businesses, people often also want to know why a certain prediction was made. Understanding why predictions are made is the field of Explainable AI. It can be as important as making the most accurate prediction, and in some cases even more important.

SHAP (SHapley Additive exPlanations) is a game-theoretic approach to explaining the output of any machine learning model, with the aim of increasing transparency and interpretability. Consider a cooperative game with as many players as there are features. SHAP discloses the individual contribution of each player (or feature) to the output of the model, for each example or observation.
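
To make the cooperative game concrete, here is a minimal sketch that computes exact Shapley values by brute force for a tiny, made-up model. The model `toy_model`, the observation `x` and the all-zero `baseline` are assumptions chosen for illustration only. The SHAP library uses much faster approximations, so this is not how it works internally.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_players):
    """Exact Shapley values by enumerating every coalition (exponential cost)."""
    players = range(n_players)
    phi = [0.0] * n_players
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (
                factorial(size) * factorial(n_players - size - 1) / factorial(n_players)
            )
            for coalition in combinations(others, size):
                with_i = value_fn(frozenset(coalition) | {i})
                without_i = value_fn(frozenset(coalition))
                phi[i] += weight * (with_i - without_i)
    return phi


# Hypothetical toy model with three features and one observation to explain.
x = [2.0, 1.0, 3.0]          # the observation
baseline = [0.0, 0.0, 0.0]   # reference values used for "absent" features


def toy_model(features):
    # Includes an interaction between feature 0 and feature 1.
    return 3.0 * features[0] + 2.0 * features[1] * features[0] + features[2]


def value_fn(coalition):
    # Features in the coalition take their observed value, the rest the baseline.
    mixed = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
    return toy_model(mixed)


phi = shapley_values(value_fn, n_players=3)
print(phi)
```

The contributions add up to the difference between the prediction for `x` and the prediction for the baseline, and the interaction between the first two features is split evenly between them.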

*Important: while SHAP shows the contribution or the importance of each feature on the prediction of the model, it does not evaluate the quality of the prediction itself.*

SHAP can be applied to all kinds of models, but it works differently for different kinds of models. In this notebook we will go through SHAP for tabular data. We will train a small neural network on the breast_cancer dataset, which has 30 variables and 1 binary target that shows whether the tumor is malignant (0) or benign (1). SHAP will help us understand which of these 30 variables made the largest difference in a single prediction. If we calculate the mean of the absolute SHAP values over all samples, we can say which of the variables are most important overall.

In [None]:
from sklearn import datasets
import pandas as pd
import numpy as np
import shap
import torch

In [None]:
from sklearn import datasets as sk_datasets
from typing import TypedDict
Tensor = torch.Tensor

class TensorDataset:
    """The main responsibility of the Dataset class is to
    offer a __len__ method and a __getitem__ method
    """

    def __init__(self, data: Tensor, targets: Tensor) -> None:
        self.data = data
        self.targets = targets
        assert len(data) == len(targets)

    def __len__(self) -> int:
        return len(self.targets)

    def __getitem__(self, idx: int) -> tuple[Tensor, Tensor]:
        return self.data[idx], self.targets[idx]

# Defined after TensorDataset, because these annotations are evaluated
# when the class is created and would otherwise raise a NameError
class Data(TypedDict):
    train: TensorDataset
    test: TensorDataset
    features: list[str]

def get_breast_cancer_dataset(
    train_perc: float,
) -> Data:
    npdata = sk_datasets.load_breast_cancer()
    featurenames = npdata.feature_names
    tensordata = torch.tensor(npdata.data, dtype=torch.float32)
    tensortarget = torch.tensor(npdata.target, dtype=torch.uint8)
    trainidx = int(len(tensordata) * train_perc)
    traindataset = TensorDataset(tensordata[:trainidx], tensortarget[:trainidx])
    testdataset = TensorDataset(tensordata[trainidx:], tensortarget[trainidx:])
    return {"train" : traindataset, "test": testdataset, "features" : list(featurenames)}


In [None]:
train_perc = 0.8
data = get_breast_cancer_dataset(train_perc)
traindataset = data["train"]
testdataset = data["test"]
featurenames = data["features"]

len(traindataset), len(testdataset), len(featurenames)

In [None]:
featurenames

In [None]:
from mads_datasets.base import BaseDatastreamer
from mltrainer.preprocessors import BasePreprocessor

preprocessor = BasePreprocessor()

trainstreamer = BaseDatastreamer(traindataset, batchsize=32, preprocessor=preprocessor).stream()
teststreamer = BaseDatastreamer(testdataset, batchsize=len(testdataset), preprocessor=preprocessor).stream()

In [None]:
X, Y = next(trainstreamer)
X.shape, Y.shape

In [None]:
import torch.nn as nn

class NeuralNetwork(nn.Module):
    def __init__(self, config: dict) -> None:
        super().__init__()
        self.linear = nn.Sequential(
            nn.Linear(config["input"], config["h1"]),
            nn.ReLU(),
            nn.Linear(config["h1"], config["h2"]),
            nn.Dropout(0.4),
            nn.ReLU(),
            nn.Linear(config["h2"], config["output"]),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.linear(x)
        return logits

In [None]:
import torch.optim as optim
from pathlib import Path
from mltrainer import Trainer, metrics, TrainerSettings, ReportTypes


config = {
    "input" : 30,
    "h1" : 20,
    "h2" : 10,
    "output" : 2
}

model = NeuralNetwork(config)

loss_fn = torch.nn.CrossEntropyLoss()
accuracy = metrics.Accuracy()

log_dir= Path("../../models/test").resolve()

settings = TrainerSettings(
    epochs=50,
    metrics=[accuracy],
    logdir=log_dir,
    # traindataset already holds only the training split, so one epoch is len // batchsize steps
    train_steps=len(traindataset) // 32,
    valid_steps=1,
    reporttypes=[ReportTypes.TENSORBOARD],
    scheduler_kwargs={"factor": 0.5, "patience": 5},
)

trainer = Trainer(
    model=model,
    settings=settings,
    loss_fn=loss_fn,
    optimizer=optim.Adam,
    traindataloader=trainstreamer,
    validdataloader=teststreamer,
    scheduler=optim.lr_scheduler.ReduceLROnPlateau
    )

trainer.loop()

We have a model! Now we can start using SHAP values to analyze it.

Because we are using a neural network, we use the DeepExplainer.

In [None]:
import shap
import pandas as pd

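# The teststreamer batchsize equals len(testdataset), so this is the full test set
X, Y = next(teststreamer)
# The same batch serves as background data and as the observations to explain
explainer = shap.DeepExplainer(model, X)
shap_values = explainer.shap_values(X)

# Make a DataFrame of the data so that we can add the feature names in our plots
df = pd.DataFrame(X.numpy(), columns=featurenames)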


### Visualize a single prediction

We can visualize a single prediction with the force plot, which shows the effect of each feature on the prediction for a given observation. In this plot the positive SHAP values are displayed on the left side and the negative ones on the right side, as if competing against each other. The highlighted value is the prediction for that observation.



In [None]:
Y

In [None]:
print(f"The input data has shape {X.shape}")
print(f"This means we have {X.shape[0]} samples and {X.shape[1]} features")
print("The labels are either 0 or 1, so we have two classes")
vals = [f"{x:.2f}" for x in explainer.expected_value]
print(f"We have {vals} as the expected values for either class")

<ol>
    <li>The output value is the prediction for that observation </li>
    <li>The base value: the mean prediction, or mean(yhat)</li>
    <li>Red/blue: Features that push the prediction higher (to the right) are shown in red, and those pushing the prediction lower are in blue.</li>
    <li>The plot is centered on the x-axis at explainer.expected_value. All SHAP values are relative to the model's expected value like a linear model's effects are relative to the intercept.</li>
</ol>
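
The relation between the base value, the SHAP values and the output value can be checked by hand for a linear model. For a linear model with independent features, the SHAP value of feature i is known to be w_i * (x_i - mean_i). The sketch below uses made-up weights, a made-up background set and a made-up observation, in plain Python so it runs without the notebook's dependencies.

```python
# Hypothetical linear model f(x) = bias + sum(w_i * x_i); all numbers are made up.
weights = [0.5, -1.0, 2.0]
bias = 0.25

background = [
    [1.0, 2.0, 0.0],
    [3.0, 0.0, 1.0],
    [2.0, 4.0, 2.0],
]


def predict(row):
    return bias + sum(w * v for w, v in zip(weights, row))


# Mean of each feature over the background data.
feature_means = [sum(col) / len(col) for col in zip(*background)]

# Base value: the mean prediction over the background data.
base_value = sum(predict(row) for row in background) / len(background)

# For a linear model with independent features, the SHAP value of feature i
# is w_i * (x_i - mean_i).
observation = [4.0, 1.0, 3.0]
shap_row = [w * (v - m) for w, v, m in zip(weights, observation, feature_means)]

print(f"base value:  {base_value:.3f}")
print(f"SHAP values: {shap_row}")
print(f"base + sum:  {base_value + sum(shap_row):.3f}")
print(f"prediction:  {predict(observation):.3f}")
```

The last two printed numbers agree: the base value plus the sum of the SHAP values gives back the prediction, which is exactly what the force plot draws.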


In [None]:
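# Initialize JavaScript in order to display the visuals
shap.initjs()

category = 1
observation = 3

print(Y[observation])
# Assumes shap_values is a list with one array per class, as older SHAP releases return;
# newer releases may return a single array of shape (samples, features, classes) instead.
shap_value = np.array(shap_values[category][observation, :])
features = df.iloc[observation, :]
shap.force_plot(explainer.expected_value[category], shap_value, features)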

### Bar chart of mean importance

This takes the average of the SHAP value magnitudes across the dataset and plots it as a simple bar chart.
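
As a rough sketch of what is being averaged, the snippet below computes the mean absolute SHAP value per feature on a small made-up matrix. The feature names and numbers are hypothetical. With a NumPy array of SHAP values for one class, the same idea would be `np.abs(values).mean(axis=0)`.

```python
# Made-up SHAP values: one row per observation, one column per feature.
feature_names = ["feature_a", "feature_b", "feature_c"]  # hypothetical names
shap_matrix = [
    [0.20, -0.05, 0.50],
    [-0.10, 0.05, -0.40],
    [0.30, -0.10, 0.60],
    [-0.20, 0.00, -0.50],
]

# Average the magnitudes per feature; the sign is dropped so that positive
# and negative contributions do not cancel each other out.
mean_abs = [
    sum(abs(v) for v in column) / len(column) for column in zip(*shap_matrix)
]

# Rank features from most to least important.
ranking = sorted(zip(feature_names, mean_abs), key=lambda pair: pair[1], reverse=True)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

Taking the plain mean instead would be misleading here: the signed values of `feature_a` and `feature_c` largely cancel out, even though both features clearly move individual predictions.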

In [None]:
shap.summary_plot(shap_values[1], df, plot_type="bar")

### SHAP Summary Plot

Rather than use a typical feature importance bar chart, we use a density scatter plot of SHAP values for each feature to identify how much impact each feature has on the model output for the observations in the test set. Features are sorted by the sum of the SHAP value magnitudes across all samples. Unlike the bar chart, this plot also shows how the impact is distributed: one feature can affect a few predictions by a large amount, while another affects all predictions by a smaller amount.

Note that when the scatter points don't fit on a line they pile up to show density, and the color of each point represents the feature value of that observation.
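
The difference between a feature that affects a few predictions strongly and one that affects all predictions a little can be made concrete with made-up numbers. The two features below are hypothetical and the values are chosen only to show why a single average can hide this pattern.

```python
# Made-up SHAP values for two hypothetical features over ten observations.
rare_but_strong = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.5]
common_but_weak = [0.2, -0.2, 0.2, -0.2, 0.2, -0.2, 0.2, -0.2, 0.2, -0.2]


def summarize(values):
    magnitudes = [abs(v) for v in values]
    return {
        "mean_abs": sum(magnitudes) / len(magnitudes),  # what the bar chart shows
        "max_abs": max(magnitudes),                     # largest single effect
    }


rare = summarize(rare_but_strong)
common = summarize(common_but_weak)
print("rare_but_strong:", rare)
print("common_but_weak:", common)
```

The weak feature ranks higher on mean magnitude, while the rare feature has by far the largest effect on a single prediction. The scatter plot shows both facts at once, because every observation is drawn as its own point.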

In [None]:
shap.summary_plot(shap_values[1], df)

So that's it for the tabular data. We can also use SHAP for images. See the next notebook for SHAP on image data.