<a href="https://colab.research.google.com/github/CALDISS-AAU/sdsphd19_coursematerials/blob/master/notebooks/SDS_PHD_Explainable_ML.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
# Standard stuff
import pandas as pd #for manipulating data
import numpy as np #for manipulating data
import sklearn #for building models

# Dataviz
import matplotlib.pyplot as plt #for custom graphs at the end
import seaborn as sns #for custom graphs at the end

# Other tooling
import os #needed to use Environment Variables in Domino
import time #some of the routines take a while so we monitor the time

# SML
import xgboost as xgb #for building models
import sklearn.ensemble #for building models
from sklearn.model_selection import train_test_split #for creating a hold-out sample
from sklearn import datasets # built-in datasets (we use the wine data below)

# Explainable ML&AI tools
!pip install lime
import lime #LIME package
import lime.lime_tabular #the type of LIME analysis we'll do
!pip install shap
import shap #SHAP package
import yellowbrick as yb
!pip install pdpbox
from pdpbox import pdp

# Introduction to Explainable ML&AI

Machine learning (ML) models are often considered black boxes because of their complex inner workings. More advanced ML models, such as random forests, gradient boosting machines (GBM), and artificial neural networks (ANN), are typically more accurate at predicting nonlinear, faint, or rare phenomena. Unfortunately, higher accuracy often comes at the expense of interpretability, and interpretability is crucial for business adoption, model documentation, regulatory oversight, and human acceptance and trust.

![](https://www.dropbox.com/s/r5w3o80q3k3prdm/interpret.png?dl=1)

However, in situations where the social or economic costs of failure are high (plane crashes, decisions about who gets insurance or who has to go to prison), algorithmic decisions need to be explainable, interpretable, and accountable.

![](https://www.dropbox.com/s/dl4xlxwl583cehi/random_computer_says_no.png?dl=1)

Luckily, several advancements have been made to aid in interpreting ML models.

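Broadly, one can classify such approaches as:

1. Global Explanations: how the model behaves overall, for example which features matter most across all predictions.
2. Local Explanations: why the model arrived at one particular prediction.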

# Prediction Model

To illustrate a few of these tools, I am going to use a scikit-learn dataset called the wine recognition set. This dataset has 13 features and 3 target classes and can be loaded directly from the scikit-learn library. In the code below I import the dataset and convert it to a data frame. The data can be used in a classifier without any additional preprocessing.

In [0]:
from sklearn import datasets

wine_data = datasets.load_wine()
df_wine = pd.DataFrame(wine_data.data,columns=wine_data.feature_names)
df_wine['target'] = pd.Series(wine_data.target)

In [0]:
df_wine.describe()

In [0]:
from sklearn.model_selection import train_test_split

X = df_wine.drop(['target'], axis=1)
y = df_wine['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Global Explanations

## Yellowbrick

This library is essentially an extension of the scikit-learn library and provides some really useful and pretty looking visualisations for machine learning models. The visualiser objects, the core interface, are scikit-learn estimators and so if you are used to working with scikit-learn the workflow should be quite familiar.

The visualisations that can be rendered cover model selection, feature importances and model performance analysis.

In [0]:
from yellowbrick.classifier import ClassificationReport
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier()
visualizer = ClassificationReport(model, size=(1080, 720))
visualizer.fit(X_train, y_train)
visualizer.score(X_test, y_test)
visualizer.show()  # show() replaces the deprecated poof() in Yellowbrick 1.0+; on older versions use poof()
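To see which numbers sit behind the heatmap, here is a minimal sketch that computes precision, recall and F1 per class directly with scikit-learn's `classification_report`. The `_demo` variable names, the fixed `random_state` and the stratified split are my own additions for reproducibility, not part of the cells above.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Same data and model family as above; random_state and stratify are
# assumptions added here so that the split is reproducible and balanced.
X_demo, y_demo = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0, stratify=y_demo)

rf_demo = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# output_dict=True returns nested dicts: report[class_label][metric]
report = classification_report(y_te, rf_demo.predict(X_te), output_dict=True)
for label in ["0", "1", "2"]:
    row = report[label]
    print(f"class {label}: precision={row['precision']:.2f} "
          f"recall={row['recall']:.2f} f1={row['f1-score']:.2f}")
```

Each row of the Yellowbrick heatmap corresponds to one of these per-class entries.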

## ELI5

ELI5 is another visualisation library that is useful for debugging machine learning models and explaining the predictions they have produced. It works with the most common Python machine learning libraries, including scikit-learn, XGBoost and Keras.

It provides easy functionality for creating model-specific measures of global feature importance.

In [0]:
!pip install eli5
import eli5
eli5.show_weights(model, feature_names = X.columns.tolist())

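For scikit-learn tree ensembles such as the random forest used here, `show_weights` displays the model's impurity-based `feature_importances_`. For XGBoost models it uses gain by default, and you can choose another measure with the `importance_type` argument.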
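As a cross-check on the table above, the same kind of ranking can be read straight off a fitted random forest. This is a small sketch using only scikit-learn; the names `wine_demo` and `rf_imp` are made up for this example, and the forest is refitted on the full data with a fixed seed, so the numbers will differ somewhat from the ELI5 output.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

wine_demo = load_wine()
rf_imp = RandomForestClassifier(random_state=0).fit(wine_demo.data, wine_demo.target)

# feature_importances_ holds the impurity-based importance of each feature,
# normalised so that the values sum to 1.
order = np.argsort(rf_imp.feature_importances_)[::-1]
for idx in order[:5]:
    print(f"{wine_demo.feature_names[idx]:<30s} {rf_imp.feature_importances_[idx]:.3f}")
```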

You can also use show_prediction to inspect the reasons for individual predictions.

In [0]:
from eli5 import show_prediction
show_prediction(model, X_train.iloc[1], feature_names = X.columns.tolist(), 
                show_feature_values=True)

## MLxtend

This library contains a host of helper functions for machine learning. This covers things like stacking and voting classifiers, model evaluation, feature extraction and engineering and plotting. In addition to the documentation, [this paper](https://sebastianraschka.com/pdf/software/mlxtend-latest.pdf) is a good resource for a more detailed understanding of the package.

In [0]:
!pip install mlxtend

In [0]:
from mlxtend.plotting import plot_decision_regions
from mlxtend.classifier import EnsembleVoteClassifier

import matplotlib.gridspec as gridspec
import itertools 
from sklearn import model_selection

from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier

In [0]:
# Unfortunately, decision regions can only be plotted for 2 features at once.
X_train_ml = X_train[['proline', 'color_intensity']].values
y_train_ml = y_train.values

In [0]:
# We run a variety of models here
clf1 = LogisticRegression(random_state=1)
clf2 = RandomForestClassifier(random_state=1)
clf3 = GaussianNB()
eclf = EnsembleVoteClassifier(clfs=[clf1, clf2, clf3], weights=[1,1,1])

In [0]:
# And plot the decision regions of each model in a 2x2 grid
gs = gridspec.GridSpec(2, 2)
fig = plt.figure(figsize=(10, 8))
labels = ['Logistic Regression', 'Random Forest', 'Naive Bayes', 'Ensemble']
for clf, lab, grd in zip([clf1, clf2, clf3, eclf],
                         labels,
                         itertools.product([0, 1], repeat=2)):
    clf.fit(X_train_ml, y_train_ml)
    ax = plt.subplot(gs[grd[0], grd[1]])
    # plot_decision_regions returns the Axes it drew on
    ax = plot_decision_regions(X=X_train_ml, y=y_train_ml, clf=clf, ax=ax, legend=2)
    ax.set_title(lab)
plt.show()
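The ensemble panel combines the three classifiers by voting. As a rough sketch of what a weighted majority vote does (assuming hard voting, which I believe is the `EnsembleVoteClassifier` default), the snippet below uses made-up predictions; `votes`, `clf_weights` and `weighted_majority` are hypothetical names, not mlxtend API.

```python
import numpy as np

# Hypothetical predicted class labels from three classifiers (columns)
# for four samples (rows).
votes = np.array([
    [0, 0, 1],   # two of three vote for class 0
    [2, 1, 1],   # two of three vote for class 1
    [2, 2, 2],   # unanimous
    [0, 1, 2],   # three-way tie
])
clf_weights = np.array([1, 1, 1])  # equal weights, as in weights=[1,1,1]
n_classes = 3

def weighted_majority(row, weights, n_classes):
    # Sum the weights of the classifiers voting for each class and
    # return the class with the largest total (ties go to the lowest label).
    totals = np.bincount(row, weights=weights, minlength=n_classes)
    return int(np.argmax(totals))

ensemble_pred = [weighted_majority(r, clf_weights, n_classes) for r in votes]
print(ensemble_pred)  # [0, 1, 2, 0]
```

Raising one classifier's weight lets it break ties or even overrule the other two.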

# Local Explanations

## Lime


### Introducing [`lime`](https://github.com/marcotcr/lime)

> *"There once was a package called lime, Whose models were simply sublime, It gave explanations for their variations, one observation at a time."*

*lime-rick by Mara Averick*

**Local Interpretable Model-agnostic Explanations** (LIME) is a technique for explaining individual predictions. As the name implies, it is model-agnostic, so it can be applied to any supervised regression or classification model. The original paper is mind-blowing; if you find the time, just read it!

* Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2016). ACM, New York, NY, USA, 1135-1144. DOI: https://doi.org/10.1145/2939672.2939778

LIME rests on the assumption that every complex model is approximately linear on a local scale, so that a simple model fitted around a single observation can mimic how the global model behaves in that neighbourhood. The simple model can then be used to explain the predictions of the more complex model locally.

The generalized algorithm LIME applies is:

1. Given an observation, permute it to create replicated feature data with slight value modifications.
2. Compute a similarity (distance) measure between the original observation and the permuted observations.
3. Apply the selected machine learning model to predict the outcomes of the permuted data.
4. Select the m features that best describe the predicted outcomes.
5. Fit a simple model to the permuted data, explaining the complex model outcome with the m features, with each permuted observation weighted by its similarity to the original observation.
6. Use the resulting feature weights to explain local behavior.
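The six steps above can be sketched from scratch with NumPy and scikit-learn. This is an illustration under simplifying assumptions, not the `lime` package's actual implementation: the Gaussian perturbation, the exponential kernel and its width, and the "largest ridge coefficients" feature selection are my own choices, and all variable names are made up for this example.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
data = load_wine()
X_all, y_all = data.data, data.target
black_box = RandomForestClassifier(random_state=0).fit(X_all, y_all)

x0 = X_all[0]                        # the observation we want to explain
scale = X_all.std(axis=0)            # per-feature scale for the perturbations
target_class = int(black_box.predict(x0.reshape(1, -1))[0])

# 1. Perturb the observation with Gaussian noise on each feature
Z = x0 + rng.normal(size=(1000, X_all.shape[1])) * scale
Zs = (Z - x0) / scale                # standardised offsets from x0

# 2. Similarity: exponential kernel on the scaled Euclidean distance
dist = np.sqrt((Zs ** 2).sum(axis=1))
kernel_width = 0.75 * np.sqrt(X_all.shape[1])
weights = np.exp(-(dist ** 2) / kernel_width ** 2)

# 3. Ask the black box for its predictions on the perturbed data
target = black_box.predict_proba(Z)[:, target_class]

# 4. Select m features: largest absolute coefficients of a full weighted fit
m = 4
full = Ridge(alpha=1.0).fit(Zs, target, sample_weight=weights)
top = np.argsort(np.abs(full.coef_))[::-1][:m]

# 5. Fit the simple (ridge) model on the m selected features only
local = Ridge(alpha=1.0).fit(Zs[:, top], target, sample_weight=weights)

# 6. The coefficients are the local explanation
for i, w in zip(top, local.coef_):
    print(f"{data.feature_names[i]:<30s} {w:+.4f}")
```

A positive weight means that increasing the feature around `x0` pushes the predicted probability of `target_class` up, a negative weight pushes it down.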

A little example on image data (original application)

![](https://www.dropbox.com/s/wyimw0dw5b8ifhb/ml_lime_example.png?dl=1)

### How to apply

In [0]:
import lime.lime_tabular
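# class_names must follow the column order of predict_proba (classes 0, 1, 2);
# y_train.unique() returns the labels in order of appearance instead.
explainer = lime.lime_tabular.LimeTabularExplainer(
    X_train.values,
    feature_names=X_train.columns.tolist(),
    class_names=wine_data.target_names.tolist())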

In [0]:
predict_fn = lambda x: model.predict_proba(x).astype(float)

The `explain_instance` method, called below, first creates permutations, then calculates similarities, and then selects the m features (`num_features`). Lastly, it fits the simple model. `lime` applies a ridge regression model (a special case of elastic nets) to the weighted permuted observations as the simple model. If the complex model is a regressor, the simple model predicts the output of the complex model directly. If the complex model is a classifier, the simple model predicts the probability of the chosen class(es).

`explain_instance` returns an `Explanation` object with information on the simple model's predictions. Most importantly, it contains the weight of each selected feature that best describes the local relationship; `exp.as_list()` returns these as (feature, weight) pairs, and `exp.show_in_notebook()` renders them as a plot.

In [0]:
exp = explainer.explain_instance(X_test.values[1], predict_fn, num_features=6)
exp.show_in_notebook(show_all=False)

## SHAP Values

SHAP and LIME are both popular Python libraries for model explainability. SHAP (SHapley Additive exPlanation) leverages the idea of [Shapley values](https://christophm.github.io/interpretable-ml-book/shapley.html) for model feature influence scoring. The technical definition of a Shapley value is the “average marginal contribution of a feature value over all possible coalitions.” In other words, Shapley values consider all possible predictions for an instance using all possible combinations of inputs. Because of this exhaustive approach, SHAP can guarantee properties like consistency and local accuracy.
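To make "average marginal contribution over all possible coalitions" concrete, here is a small brute-force sketch for a toy model with three features. It is plain Python and does not use the `shap` package; `toy_model`, the baseline of zeros and the helper names are assumptions made up for this illustration. Features outside a coalition are set to their baseline value.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values for a coalition value function (brute force)."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for coalition in combinations(others, size):
                with_i = value_fn(set(coalition) | {i})
                without_i = value_fn(set(coalition))
                phi[i] += weight * (with_i - without_i)
    return phi

# Toy model with an interaction between feature 0 and feature 2
def toy_model(v):
    return 2 * v[0] + 3 * v[1] + v[0] * v[2]

x = (1.0, 1.0, 1.0)          # instance to explain
baseline = (0.0, 0.0, 0.0)   # reference values for "absent" features

def value_fn(coalition):
    # Features in the coalition take their value from x, the rest from baseline
    v = [x[i] if i in coalition else baseline[i] for i in range(len(x))]
    return toy_model(v)

phi = shapley_values(value_fn, len(x))
print([round(p, 6) for p in phi])
print(round(sum(phi), 6), toy_model(x) - toy_model(baseline))
```

Up to floating-point rounding the contributions are 2.5, 3.0 and 0.5: the interaction term is split equally between features 0 and 2, and the contributions add up to the difference between the prediction and the baseline prediction (6.0), which is the local accuracy property mentioned above. The cost grows exponentially with the number of features, which is why SHAP relies on model-specific shortcuts such as `TreeExplainer`.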

### How to apply

As with LIME, we first create an explainer.

In [0]:
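# Tree on XGBoost: fit the model first, then build the explainer
xgb_model = xgb.XGBClassifier(random_state=1)
xgb_model.fit(X_train, y_train)
explainerXGB = shap.TreeExplainer(xgb_model)

# For a multiclass model SHAP returns one set of values per class.
# Assumption: older shap versions return a list of arrays, newer ones a
# 3D array (samples, features, classes). We keep the values for one class.
class_idx = 0
def select_class(values, k):
    return values[k] if isinstance(values, list) else np.asarray(values)[:, :, k]

shap_values_XGB_test = select_class(explainerXGB.shap_values(X_test), class_idx)
shap_values_XGB_train = select_class(explainerXGB.shap_values(X_train), class_idx)

In [0]:
# XGBoost
df_shap_XGB_test = pd.DataFrame(shap_values_XGB_test, columns=X_test.columns.values)
df_shap_XGB_train = pd.DataFrame(shap_values_XGB_train, columns=X_train.columns.values)


In [0]:
# if a feature has 10 or fewer unique values then treat it as categorical
categorical_features = np.argwhere(np.array([len(set(X_train.values[:,x]))
for x in range(X_train.values.shape[1])]) <= 10).flatten()

In [0]:
# j will be the record we explain
j = 1

In [0]:

# initialize js for SHAP
shap.initjs()
# the base value and the SHAP values must refer to the same class
base_value = np.ravel(explainerXGB.expected_value)[class_idx]
shap.force_plot(base_value, shap_values_XGB_test[j], X_test.iloc[[j]])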