# Explaining ML models - XAI (eXplainable AI)

In [None]:
#Load the environment for the notebook from Project.toml and Manifest.toml

In [None]:
]instantiate

Interpretation techniques for ML models focus on analysing how input features (or changes in their values) affect the predictions. Many classic models are highly interpretable by design, but the predictions of complex models such as neural networks are opaque (hence the often-used name 'black-box' models).

In recent years, the explainability of ML models (black-box ones in particular) has become a popular topic and has fueled many novel algorithms - the trend is often referred to as **Interpretable Machine Learning** or **Explainable Artificial Intelligence (XAI)**.

Based on the applicability, the interpretability techniques can be categorised as follows:
 - **model-specific (intrinsic)** - tied to a particular class of models and inherently available by design of the given algorithm, e.g. linear regression, logistic regression, decision trees
 - **model-agnostic** - applicable to many model families, mostly based on modifying the input data and 'probing' the influence on model predictions or quality

Additionally, the explainability algorithms may be broken down by the target they are applied to:
 - **prediction-level (local)** - provide an explanation for the prediction produced for a particular instance, useful if we want to understand the model's behaviour on a per-case basis
 - **dataset-level (global)** - highlight overall feature influence on the model prediction

Every interpretability technique is characterized by both breakdowns, so we may have a model-specific global technique, a model-agnostic local algorithm, etc.

## Preparing the data

In [None]:
using CSV
using DataFrames
using Random

We'll use a dataset about housing in the suburbs of Boston. You can find more information about the dataset in the [UCI repository](https://archive.ics.uci.edu/ml/machine-learning-databases/housing/). The data is available for ingestion in the **Boston.csv** file in the directory of the notebook.

Attribute Information:

1. CRIM - per capita crime rate by town
2. ZN - proportion of residential land zoned for lots over 25,000 sq.ft.
3. INDUS - proportion of non-retail business acres per town
4. CHAS - Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
5. NOX - nitric oxides concentration (parts per 10 million)
6. RM - average number of rooms per dwelling
7. AGE - proportion of owner-occupied units built prior to 1940
8. DIS - weighted distances to five Boston employment centres
9. RAD - index of accessibility to radial highways
10. TAX - full-value property-tax rate per \$10,000
11. PTRATIO - pupil-teacher ratio by town
12. B - 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
13. LSTAT - \% lower status of the population
14. MEDV - Median value of owner-occupied homes in \$1000's

In [None]:
houses = CSV.read("Boston.csv", DataFrame);
first(houses)

Our task is to predict the median house value (MEDV) based on all other available features. 

In [None]:
# Let's remove the Column1 containing observation ID
houses = houses[:, Not(:Column1)];

In [None]:
feature_names = names(houses, Not(:medv))

## Model-specific interpretability

In [None]:
using GLM
using XGBoost
using Plots
using Statistics
using Random

We'll train two models with 'built-in' (inherent) explainability: 
1. Linear Regression
2. Gradient Boosted Trees

In the first example, we'll **directly interpret the values of the parameters** of the trained model - this approach is applicable to the whole family of [**Generalized Linear Models (GLMs)**](https://en.wikipedia.org/wiki/Generalized_linear_model), e.g. logistic regression or Poisson regression, provided appropriate transformations are applied to the parameters and target variable values.

**Gradient Boosted Trees** are part of the bigger family of **tree-based models** (which includes various Decision Tree algorithms and ensemble models such as Random Forest). For tree-based models, we can calculate a **feature importance** - a measure of each feature's contribution to the quality of the model's fit to the training data. The parameter values are not analysed directly; instead we extract a statistic from the training process which is relevant to understanding the model's decisions. Feature importance values are usually available through an attribute or method of the trained tree-based model.
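
To give an intuition for what a gain-style importance measures, here is a minimal sketch on made-up data. It scores a single candidate split by the reduction in squared error it achieves. The helper names `sse` and `split_gain` are hypothetical, and XGBoost's own gain is computed from gradient statistics with regularisation, so this is an analogy rather than the library's formula.

```julia
using Statistics

# Sum of squared errors around the mean: the "impurity" of a regression node
sse(y) = sum((y .- mean(y)) .^ 2)

# Reduction in SSE obtained by splitting the node on feature x at the given threshold
function split_gain(x, y, threshold)
    left = y[x .<= threshold]
    right = y[x .> threshold]
    (isempty(left) || isempty(right)) && return 0.0
    return sse(y) - (sse(left) + sse(right))
end

# Toy data: y jumps when x1 crosses 3.5, while x2 is an unrelated ordering
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [3.0, 1.0, 6.0, 2.0, 5.0, 4.0]
y_toy = [10.0, 11.0, 10.5, 20.0, 21.0, 20.5]

gain_x1 = split_gain(x1, y_toy, 3.5)   # large: the split separates low and high targets
gain_x2 = split_gain(x2, y_toy, 3.5)   # much smaller: the split barely helps
```

Summing such gains over every split that uses a given feature, across all trees, yields one importance score per feature.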

Please note that we are not splitting the dataset into training, validation and test subsets, as we are not selecting the optimal model specifications. The models' predictive power is checked on the training data only to validate the output of the interpretability algorithms.

### Linear regression

In [None]:
lin_reg = lm(term(:medv) ~ sum(term.([1; feature_names])), houses)

We can directly interpret the calculated coefficients as the change in the predicted value (median house value in thousands of dollars) for a unit change in the value of the feature, with all other features held fixed.

For example, a coefficient of about 3.8 for RM means that the predicted median house value increases by roughly 3,800 dollars with each additional room in the dwelling.
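
The "unit change" reading can be checked numerically. The sketch below uses a made-up design matrix and made-up coefficients (not the Boston data) and fits ordinary least squares with the backslash operator; the names `X_toy`, `β_true` and `x_plus` are illustrative only.

```julia
using LinearAlgebra

# Toy design matrix: intercept column, number of rooms, and a second made-up feature
X_toy = [1.0 4.0 10.0;
         1.0 5.0 12.0;
         1.0 6.0  9.0;
         1.0 7.0 15.0;
         1.0 8.0 11.0]
β_true = [2.0, 3.8, -0.5]
y_lin = X_toy * β_true            # noise-free target, so least squares recovers β_true

β = X_toy \ y_lin                 # least-squares fit

x = [1.0, 6.0, 10.0]              # an observation (intercept, rooms, other feature)
x_plus = x .+ [0.0, 1.0, 0.0]     # the same observation with one more room

delta = dot(β, x_plus) - dot(β, x)   # equals the coefficient of the rooms feature
```

Whatever the values of the other features, the difference between the two predictions is the rooms coefficient itself, which is what makes linear models directly interpretable.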

In [None]:
r2(lin_reg)

R² is quite high, so the model fits well (on the training data). 
It's important to check the model quality while analysing the effect of the features - if the model quality is low, we may get an incorrect picture of which features are important for the task in general.
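
For reference, the coefficient of determination is usually defined as one minus the ratio of the residual to the total sum of squares. A small sketch with made-up numbers follows; the name `r_squared` is hypothetical and chosen so that it does not clash with other definitions in the notebook.

```julia
using Statistics

# R² = 1 - SS_res / SS_tot
r_squared(y, ŷ) = 1 - sum((y .- ŷ) .^ 2) / sum((y .- mean(y)) .^ 2)

y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.0]

r2_good = r_squared(y_true, y_pred)                     # close to 1 for good predictions
r2_base = r_squared(y_true, fill(mean(y_true), 4))      # 0 for the mean-only baseline
```

Under this definition a model that always predicts the mean scores 0, and a model worse than that baseline gets a negative value.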

### Gradient Boosted Trees

In [None]:
X = Matrix(houses[!,Not(:medv)])
xgb_reg = xgboost(X, 40, label = houses.medv,
                    objective = "reg:squarederror", seed = 42)

In [None]:
# Calculating R^2 for the trained XGBoost model
# It's quite high, so we may feel comfortable looking at the feature importance
# R^2 = 1 - SS_res / SS_tot; the explained-to-total ratio matches it only for OLS with an intercept
R2(y, ŷ) = 1 - sum((y .- ŷ).^2)/sum((y .- mean(y)).^2)
R2(houses.medv, XGBoost.predict(xgb_reg, X))

In [None]:
f_imp = importance(xgb_reg, feature_names)

In [None]:
# Plot all features and their importance (gain) on a bar plot
bar(getproperty.(f_imp, :fname), getproperty.(f_imp, :gain), 
         ylab="Feature gain", legend=nothing, title="XGBoost feature importance")

Based on the built-in feature importance, the two most important features for the model predictions are:
 - LSTAT
 - RM

## Model-agnostic interpretability

Model-agnostic algorithms may be applied on top of many Machine Learning models. This is possible because they operate by modifying the input data and analysing the returned predictions - the inner workings of the model are not relevant in that context. The model-agnostic techniques can be further divided into:

Global techniques (explaining overall features effect on predictions):
- [Partial Dependence Plots (PDP)](https://www.jstor.org/stable/2699986)
- [Accumulated Local Effects (ALE)](https://arxiv.org/abs/1612.08468)
- [Permutation-based feature importance](https://scikit-learn.org/stable/modules/permutation_importance.html)

Local techniques (explaining particular prediction):
- [Individual Conditional Expectations (ICE)](https://arxiv.org/abs/1309.6392)
- [LIME (Local Interpretable Model-agnostic Explanations)](https://arxiv.org/abs/1602.04938)
- [Shapley Values and SHAP (SHapley Additive exPlanations)](https://arxiv.org/abs/1705.07874)

Let's explore one algorithm from each category: global **Permutation feature importance** and local **SHAP values**.

### Permutation-based feature importance

The technique relies on measuring the effect of breaking the relation between a feature and the target variable. The idea is quite straightforward - if an important feature is randomly shuffled, the model performance should drop significantly; correspondingly, for an unimportant feature the effect on the model's quality will be small.

The algorithm can be applied to all models (it is model-agnostic), as it only modifies the data by shuffling one feature at a time and analyses the evaluation metric calculated from predictions on the distorted data. The technique is also global, as it provides an importance per feature over the whole dataset.

Let's implement our own algorithm for permutation feature importance based on the RMSE metric.

In [None]:
RMSE(y, ŷ) = sqrt(mean((y-ŷ).^2))

In [None]:
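# Permutation importance of a single feature for a trained XGBoost model.
# Assumes the target is the last column of df.
function varimp(df::DataFrame, 
        model::Booster, 
        name::Symbol, 
        ref_rmse::Float64, 
        reps::Int = 10,
        random_seed::Int = 1)
    df_shuffle = copy(df[:,1:end-1])
    y = df[:,end]
    rmse = Float64[]
    # Seed once, so that every repetition draws a different permutation
    Random.seed!(random_seed)
    for _ in 1:reps
        Random.shuffle!(df_shuffle[!, name])
        X = Matrix(df_shuffle)
        # Use the model passed as an argument rather than the global xgb_reg
        push!(rmse, RMSE(y,XGBoost.predict(model, X)))
    end
    rmse = rmse .- ref_rmse
    return (mean(rmse), std(rmse))
end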

In [None]:
xgb_rmse = RMSE(houses.medv,XGBoost.predict(xgb_reg, X))
perm_feat_imp = []
reps = 20
for name in feature_names
    (avg_rmse, std_rmse) = varimp(houses, xgb_reg, Symbol(name), xgb_rmse, reps, 42)
    push!(perm_feat_imp,(feature=name, rmse_change=avg_rmse, rmse_std=std_rmse))
end

In [None]:
perm_ft = DataFrame(perm_feat_imp)
sort!(perm_ft, :rmse_change, rev=true)

In [None]:
bar(perm_ft.feature,
    perm_ft.rmse_change,
    yerr = perm_ft.rmse_std,
    legend = nothing,
    ylab="RMSE change", title="Permutation-based feature importance ($reps reps)")

The outcome is aligned with the XGBoost feature importance, although the difference between LSTAT, RM and the less important features is relatively smaller.
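
Because the technique only needs predictions, the same idea can be written against any prediction function. Below is a minimal, self-contained sketch on synthetic data; `permutation_importance`, `toy_predict` and the lowercase `rmse` helper are hypothetical names, and the "model" is a fixed formula standing in for a trained one.

```julia
using Random, Statistics

rmse(y, ŷ) = sqrt(mean((y .- ŷ) .^ 2))

# predict: any function mapping a feature matrix (rows = observations) to predictions
function permutation_importance(predict, X, y; reps = 10, rng = MersenneTwister(1))
    baseline = rmse(y, predict(X))
    importances = zeros(size(X, 2))
    for j in axes(X, 2)
        increases = Float64[]
        for _ in 1:reps
            X_perm = copy(X)
            X_perm[:, j] = shuffle(rng, X[:, j])   # break the link between feature j and y
            push!(increases, rmse(y, predict(X_perm)) - baseline)
        end
        importances[j] = mean(increases)
    end
    return importances
end

# Toy "model": depends strongly on column 1, weakly on column 2, not at all on column 3
toy_predict(X) = 5.0 .* X[:, 1] .+ 0.5 .* X[:, 2]

X_syn = randn(MersenneTwister(42), 200, 3)
y_syn = toy_predict(X_syn)

imp = permutation_importance(toy_predict, X_syn, y_syn; reps = 20)
```

Shuffling the third column leaves the predictions unchanged, so its importance is exactly zero, while the first column dominates.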

### SHAP values

**Shapley values** are a concept from game theory regarding the fair distribution of the payout in a game with multiple players. The payout is calculated based on averaged contributions in all possible _coalitions_ (combinations of players taking part in the game). If we replace the notion of a player with a feature and the payout with the model's prediction, the algorithm can be applied to interpret Machine Learning models.

Shapley values are calculated on the prediction (local) level for each feature. The value can be interpreted as the feature's average contribution to the prediction, relative to the mean prediction over the whole dataset. To calculate the contributions, the input observation is modified by removing features - analogous to the absence of players in a coalition. The predictions over all coalitions (feature combinations) are gathered and an average 'payout' for each feature is calculated.
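
In symbols, with $N$ the set of all $n$ features and $v(S)$ the expected model prediction when only the features in coalition $S$ are known (this notation is introduced here for illustration), the Shapley value of feature $i$ is commonly written as:

$$
\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!} \left( v(S \cup \{i\}) - v(S) \right)
$$

The contributions add up to the difference between the explained prediction and the average one, $\sum_{i} \phi_i = v(N) - v(\emptyset)$.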

An exact calculation of Shapley values is compute-intensive, as the number of coalitions increases exponentially with the number of features; hence an approximate solution in the form of **SHAP values** is often used.
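
For a handful of features the exact calculation is still feasible, which makes the exponential cost easy to see. The following sketch enumerates every coalition for a toy linear model; `coalition_value` and `exact_shapley` are hypothetical helpers. A "missing" feature is simulated by taking its value from each row of some reference data and averaging, which is one common convention and an assumption here.

```julia
using Statistics

# Value of a coalition: average prediction when the features in `coalition` come from x
# and the remaining features come from each row of the reference data
function coalition_value(f, x, reference, coalition)
    preds = map(eachrow(reference)) do row
        z = collect(row)
        z[coalition] = x[coalition]
        f(z)
    end
    return mean(preds)
end

# Exact Shapley values: 2^(n-1) coalitions are evaluated for each of the n features
function exact_shapley(f, x, reference)
    n = length(x)
    ϕ = zeros(n)
    for i in 1:n
        others = setdiff(1:n, i)
        for mask in 0:(2^(n - 1) - 1)
            S = Int[others[k] for k in 1:(n - 1) if (mask >> (k - 1)) & 1 == 1]
            weight = factorial(length(S)) * factorial(n - length(S) - 1) / factorial(n)
            ϕ[i] += weight * (coalition_value(f, x, reference, [S; i]) -
                              coalition_value(f, x, reference, S))
        end
    end
    return ϕ
end

# Toy linear model and made-up data
w_toy = [2.0, -1.0, 0.5]
f_lin(z) = sum(w_toy .* z)
reference = [1.0 2.0 3.0;
             2.0 0.0 1.0;
             3.0 4.0 5.0]
x_obs = [4.0, 1.0, 2.0]

ϕ = exact_shapley(f_lin, x_obs, reference)   # ≈ [4.0, 1.0, -0.5]
```

For a linear model each value reduces to the weight times the feature's deviation from its reference mean, and the values sum to the prediction minus the average reference prediction. With 13 features the inner loop would already run 4096 times per feature, each time scoring the full reference set.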

In [None]:
using ShapML
using Flux

In [None]:
# Let's train a black-box model for the SHAP values example
# It's a simple neural network with one dense hidden layer and a ReLU activation function
X_flux = transpose(X)
y_flux = transpose(houses.medv)
model = Chain(Dense(13 => 30, relu),  Dense(30 => 1))
loss(x, y) = Flux.Losses.mse(model(x), y)
parameters = Flux.params(model)
data = [(X_flux, y_flux)]
opt = Flux.Adam(0.005)
for epoch in 1:30_000
    Flux.train!(loss, parameters, data, opt)
end

In [None]:
# The R^2 should have a value around 90%
R2(y_flux, model(X_flux))

In [None]:
RMSE(y_flux, model(X_flux))

In [None]:
function predict_function(model, data)
  data_pred = DataFrame(y_pred = vec(model(transpose(Matrix(data)))))
  return data_pred
end

In [None]:
# Calculating SHAP values for first observation
# 600 random coalitions are tested instead of all combinations
data_shap = ShapML.shap(explain = DataFrame(houses[1, Not(:medv)]),
                        reference = houses[:, Not(:medv)],
                        model = model,
                        predict_function = predict_function,
                        sample_size = 600,
                        seed = 1
                        )
sort!(data_shap, :shap_effect);

In [None]:
bar(data_shap.feature_name,
    data_shap.shap_effect,
    legend = nothing,
    ylab = "Shap effect", 
    title = "SHAP values for observation 1")

In [None]:
# SHAP values for observation 42
data_shap = ShapML.shap(explain = DataFrame(houses[42, Not(:medv)]),
                        reference = houses[:, Not(:medv)],
                        model = model,
                        predict_function = predict_function,
                        sample_size = 600,
                        seed = 1
                        )
sort!(data_shap, :shap_effect);

In [None]:
bar(data_shap.feature_name,
    data_shap.shap_effect,
    legend = nothing,
    ylab = "Shap effect", 
    title = "SHAP values for observation 42")

## XAI for unstructured data

We have worked only with tabular data so far, but a multitude of novel AI applications use unstructured data such as images and text. In such applications, mainly deep neural networks are used, as shallow models don't have sufficient capacity for the task. Hence we may expect a black-box approach whenever dealing with unstructured data.

The approach that is feasible for tabular data doesn't fit image or text datasets well in the context of interpretability: features can't be easily listed and assigned an importance. For each domain, XAI techniques focus on specific features - in image recognition the relevant pixels (or superpixels) may be highlighted, while in sentiment analysis it is the words contributing the most to the sentiment prediction.

Some of the interpretability algorithms used on tabular data may be reused for unstructured datasets (e.g. [LIME](https://ema.drwhy.ai/LIME.html)), but there are also methods specific to each domain of unstructured data. Often the specialised methods leverage the fact that deep neural networks are differentiable, so gradients of the output with respect to the input are available, see for example [Integrated Gradients](https://www.tensorflow.org/tutorials/interpretability/integrated_gradients) or [SmoothGrad](https://arxiv.org/abs/1706.03825).
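
As an illustration of a gradient-based method, here is a minimal sketch of Integrated Gradients on a made-up two-input function with a hand-written gradient. The names `g`, `grad_g` and `integrated_gradients` are hypothetical; a real application would obtain the gradient of a network by automatic differentiation instead.

```julia
# Toy differentiable "model" and its analytic gradient (both made up for illustration)
g(z) = z[1]^2 + 3 * z[1] * z[2]
grad_g(z) = [2 * z[1] + 3 * z[2], 3 * z[1]]

# Midpoint Riemann-sum approximation of Integrated Gradients
# along the straight path from the baseline to x
function integrated_gradients(grad, x, baseline; steps = 100)
    total = zeros(length(x))
    for k in 1:steps
        α = (k - 0.5) / steps                      # midpoint of the k-th sub-interval
        total .+= grad(baseline .+ α .* (x .- baseline))
    end
    return (x .- baseline) .* total ./ steps
end

x_in = [1.0, 2.0]
x_base = [0.0, 0.0]
attributions = integrated_gradients(grad_g, x_in, x_base)
```

The attributions sum to `g(x_in) - g(x_base)`, the so-called completeness property, which is a handy sanity check for any implementation.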


We'll use a pretrained image classification model, [VGG](https://arxiv.org/abs/1409.1556), on an image of the main building of the Warsaw School of Economics. After inspecting the classes predicted by the model, we'll use the [LRP](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0130140) (Layer-wise Relevance Propagation) algorithm to mark the pixels which contributed the most to the obtained prediction.
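
LRP works by passing a relevance score backwards through the network, layer by layer, down to the input pixels. The sketch below shows the basic ε-rule for a single dense layer without bias on made-up numbers; `lrp_epsilon` is a hypothetical helper, not the ExplainableAI.jl implementation, which combines several rules across layer types.

```julia
using LinearAlgebra

# ε-rule of LRP for one dense layer without bias
# W: output × input weight matrix, a: input activations, R_out: relevance of the outputs
function lrp_epsilon(W, a, R_out; ε = 1e-6)
    z = W * a                                                # pre-activations of the layer
    s = R_out ./ (z .+ ε .* ifelse.(z .>= 0, 1.0, -1.0))     # stabilised relevance per unit of z
    return a .* (W' * s)                                     # share of each input: a_j * w_kj * s_k
end

W_toy = [1.0 -2.0 0.5;
         0.0  1.0 1.0]
a_toy = [1.0, 0.5, 2.0]
R_out = [0.0, 1.0]              # explain the second output only

R_in = lrp_epsilon(W_toy, a_toy, R_out)
```

Up to the small stabiliser ε, the relevance is conserved: the input relevances sum to the relevance that was assigned to the outputs.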

In [None]:
using Metalhead
using ExplainableAI
using FileIO
using ImageShow

![](./sgh.jpeg)

The picture we'll use for the image classification task

In [None]:
# Load model
model = VGG(16, pretrain = true).layers
model = strip_softmax(flatten_chain(model))

#Load the list of Imagenet classes
imagenet_classes = CSV.read("imagenet.csv", DataFrame, delim = ';')

# Load input
input = preprocess_imagenet(load("sgh.jpeg"))
input = reshape(input, 224, 224, 3, :)

# Create the LRP algorithm analyzer
analyzer = LRP(model, EpsilonPlus());

In [None]:
#Sort output classes by the predicted score (logits, as the softmax layer was stripped)
best_classes = sortperm(vec(model(input)), rev = true);

In [None]:
#Main building of Warsaw School of Economics is classified as palace
imagenet_classes[best_classes[1], :class_name]

In [None]:
# Let's see which parts of the picture were important for the prediction
ExplainableAI.heatmap(input, analyzer)

In [None]:
# The second pick from the model is dome - quite accurate
imagenet_classes[best_classes[2], :class_name]

In [None]:
ExplainableAI.heatmap(input, analyzer, best_classes[2])

In [None]:
# The third pick is totally off - the closest sea is 250km away
# Let's see why the model predicted that
imagenet_classes[best_classes[3], :class_name]

In [None]:
ExplainableAI.heatmap(input, analyzer, best_classes[3])