# Tutorial 1 - SimplEx for Tabular Data

In this tutorial we create a SimplEx explainer object and use it to explain a test record. The explainer is then saved to disk so that it can be shared with someone else and viewed in the [Interpretability Suite App](https://vanderschaarlab-demo-interpretabi-interpretability-suite-1uteyn.streamlit.app/).

We will explain the predictions of a PyTorch multi-layer perceptron that we trained and saved separately on the iris dataset from scikit-learn. The `interpretability.models` module provides a few PyTorch models that are compatible with the trained `state_dict`s available via the Google Drive link below.

### Import the relevant modules

In [11]:
# IMPORTS
# Standard
import os
import pathlib

# Third Party
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
import torch
import pandas as pd

# Interpretability
from interpretability.interpretability_models import simplex_explainer
from interpretability.interpretability_models.utils import io
from interpretability.models.multilayer_perceptron import IrisMLP # This is the class of the model we have already trained

### Load the data 
Load the data and split it into the corpus of examples used for explanation and the test examples we will explain.

In [12]:
# Load the data
X, y = load_iris(return_X_y=True, as_frame=True)

# Get feature names
feature_names = X.columns.to_list()

# Split the data
X_corpus, X_test, y_corpus, y_test = train_test_split(X, y, test_size=0.2)
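
Because `train_test_split` shuffles the rows randomly, the split above differs from run to run, so the record at a given test index changes too. The sketch below shows one way to make the split reproducible. The `random_state=0` and `stratify=y` arguments are assumptions added for illustration; they are not part of the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Same data loading as in the cell above.
X, y = load_iris(return_X_y=True, as_frame=True)

# Assumption: a fixed seed and a stratified split, so that every run
# produces the same corpus/test partition with balanced classes.
X_corpus, X_test, y_corpus, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

print(X_corpus.shape, X_test.shape)
print(y_test.value_counts().sort_index().to_list())
```

With the 150-row iris data, a test fraction of 0.2 gives 120 corpus rows and 30 test rows.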

### Download the trained model from Google Drive

You could train your own model using the IrisMLP class and load it here, but we have trained one already.

Download the model using this link: https://drive.google.com/file/d/1MbQX1PYABB4XNO9c_SR-Mo3i6HjU0hB-/view?usp=sharing and save it in a location matching the path `TRAINED_MODEL_STATE_PATH` below. The default location is the `"resources/saved_models"` folder inside the root Interpretability directory.

### Load the model

In [13]:
## Load the model
model = IrisMLP(n_cont=4, input_feature_num=len(feature_names))

def load_trained_model(model, trained_model_state_path, device='cpu'):
    model.load_state_dict(torch.load(trained_model_state_path, map_location=torch.device(device)))
    model.eval()
    return model

DEVICE = "cpu"

root_path = pathlib.Path.cwd().parents[0]
saved_models_path = root_path / "resources/saved_models"
TRAINED_MODEL_STATE_PATH = saved_models_path / "model_cv1.pth"
model = load_trained_model(model, TRAINED_MODEL_STATE_PATH, device=DEVICE)
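
If the download was saved somewhere else, loading fails with a file-not-found error. The sketch below shows how the path above is built relative to the notebook directory and how one might check for the file before loading. The helper names `resolve_model_path` and `model_file_exists` are hypothetical and not part of the Interpretability package.

```python
import pathlib


def resolve_model_path(notebook_dir, filename="model_cv1.pth"):
    """Build the default model path relative to the notebook directory.

    Mirrors the cell above: the root is the parent of the notebook
    directory, and models live in resources/saved_models under it.
    """
    root_path = pathlib.Path(notebook_dir).parents[0]
    return root_path / "resources" / "saved_models" / filename


def model_file_exists(path):
    """Return True only if the path points at an existing regular file."""
    return pathlib.Path(path).is_file()


# Illustrative directory only; substitute pathlib.Path.cwd() in the notebook.
example_path = resolve_model_path("/tmp/Interpretability/Notebooks")
print(example_path)
print(model_file_exists(example_path))
```

Running the check before `load_trained_model` gives a clearer signal than the error raised from inside `torch.load`.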

### Initialize SimplEx
Initialize the explainer object by passing the predictive model and corpus.

In [14]:
my_explainer = simplex_explainer.SimplexTabluarExplainer(
    model,
    X_corpus,
    y_corpus,
    estimator_type="classifier",
    feature_names=feature_names,
    corpus_size=100,
    device="cpu",
)


### Fit the explainer

Fit the explainer on the test data. This makes explanations of the test data available in the subsequent step.

In [15]:
my_explainer.fit(X_test, y_test, n_epochs=10000)


Weight Fitting Epoch: 2000/10000 ; Error: 129 ; Regulator: 25.3 ; Reg Factor: 1
Weight Fitting Epoch: 4000/10000 ; Error: 31.8 ; Regulator: 19.3 ; Reg Factor: 1
Weight Fitting Epoch: 6000/10000 ; Error: 11 ; Regulator: 10.8 ; Reg Factor: 1
Weight Fitting Epoch: 8000/10000 ; Error: 5.86 ; Regulator: 5.16 ; Reg Factor: 1
Weight Fitting Epoch: 10000/10000 ; Error: 4.18 ; Regulator: 1.94 ; Reg Factor: 1
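
The log above reports a reconstruction error that shrinks as the weights are fitted. As a rough illustration of the underlying idea, the sketch below fits non-negative weights that sum to one, so that a weighted combination of corpus latent vectors approximates a test latent vector. This is not the library's implementation: the latent vectors are random stand-ins, the regularizer from the log is omitted, and the optimizer settings are assumptions.

```python
import torch

torch.manual_seed(0)

# Stand-in data (assumption): 5 corpus latent vectors of dimension 3,
# and a test latent that is a known convex combination of them.
corpus_latents = torch.randn(5, 3)
true_weights = torch.tensor([0.7, 0.2, 0.1, 0.0, 0.0])
test_latent = true_weights @ corpus_latents

# Unconstrained parameters; softmax maps them to non-negative
# weights that sum to 1.
pre_weights = torch.zeros(5, requires_grad=True)
optimizer = torch.optim.Adam([pre_weights], lr=0.1)


def reconstruction_error(pre_w):
    """Squared distance between the test latent and its corpus mixture."""
    weights = torch.softmax(pre_w, dim=0)
    return torch.sum((test_latent - weights @ corpus_latents) ** 2)


initial_error = reconstruction_error(pre_weights).item()
for _ in range(500):
    optimizer.zero_grad()
    loss = reconstruction_error(pre_weights)
    loss.backward()
    optimizer.step()

weights = torch.softmax(pre_weights, dim=0).detach()
final_error = reconstruction_error(pre_weights).item()
print(f"error: {initial_error:.4f} -> {final_error:.4f}")
print("weights:", weights)
```

The fitted weights play the role of the example importances reported in the explanation: a larger weight means the corpus member contributes more to reconstructing the test record.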


### Get the explanation
Explain any record in the test set by changing the index `i`.

In [16]:
# Explain
i = 29
explanation = my_explainer.explain(
    i,
    baseline="median",
)

### Plot the explanation

In this notebook the explanation is displayed as a styled DataFrame. It can also be viewed in the browser by setting `return_type` to "html".

In [21]:
df1, df2 = my_explainer.summary_plot(
    example_importance_threshold=0.000000001,
    output_file_prefix="",
    return_type="styled_df",
)
display(df1)
display(df2)


| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | Test Prediction | Test Label |
|---|---|---|---|---|---|---|
| Test Record | 5.5 | 2.3 | 4.0 | 1.3 | 1 | 1 |


| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | Example Importance | Corpus Prediction | Corpus Label |
|---|---|---|---|---|---|---|---|
| Corpus member 0 | 5.5 | 2.4 | 3.8 | 1.1 | 54.90% | 1 | 1 |
| Corpus member 1 | 6.3 | 2.3 | 4.4 | 1.3 | 20.57% | 0 | 0 |
| Corpus member 2 | 6.0 | 2.2 | 5.0 | 1.5 | 20.21% | 0 | 0 |
| Corpus member 3 | 5.5 | 2.4 | 3.7 | 1.0 | 1.14% | 1 | 1 |
| Corpus member 4 | 5.0 | 2.3 | 3.3 | 1.0 | 0.64% | 0 | 0 |
| Corpus member 5 | 5.8 | 2.6 | 4.0 | 1.2 | 0.17% | 2 | 2 |
| Corpus member 6 | 5.6 | 2.7 | 4.2 | 1.3 | 0.15% | 0 | 0 |
| Corpus member 7 | 5.2 | 2.7 | 3.9 | 1.4 | 0.14% | 2 | 2 |
| Corpus member 8 | 6.1 | 2.8 | 4.7 | 1.2 | 0.12% | 0 | 0 |
| Corpus member 9 | 5.7 | 2.8 | 4.1 | 1.3 | 0.09% | 1 | 1 |
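
To see how concentrated the explanation is, the arithmetic on the Example Importance column can be sketched in a few lines. The values are copied from the table above. The helper `members_above` is hypothetical and takes a threshold in percent; how the library itself applies `example_importance_threshold` (units, strict or non-strict comparison) is not shown in this tutorial.

```python
# Example importances (in percent) copied from the summary table above,
# ordered from corpus member 0 to corpus member 9.
importances = [54.90, 20.57, 20.21, 1.14, 0.64, 0.17, 0.15, 0.14, 0.12, 0.09]


def members_above(threshold_percent):
    """Return the indices of corpus members at or above the threshold."""
    return [i for i, w in enumerate(importances) if w >= threshold_percent]


top_three_share = sum(importances[:3])
shown_share = sum(importances)

print(members_above(1.0))
print(f"top three: {top_three_share:.2f}%")
print(f"all rows shown: {shown_share:.2f}%")
```

The three most important corpus members account for 95.68% of the weight, and the ten rows shown account for 98.13% in total.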


### Save the explainer to file
Save the explainer to a file that can be uploaded to the [Interpretability Suite App](https://vanderschaarlab-demo-interpretabi-interpretability-suite-1uteyn.streamlit.app/). The app provides a non-programmatic interface for viewing the various explanations, so you can send the explainer to a colleague who is less fluent in Python.

In [18]:
io.save_explainer(
    my_explainer, "my_new_iris_mlp_simplex_explainer.p"
)

Saving explainer to: /home/rob/Documents/projects/Interpretability/Notebooks/my_new_iris_mlp_simplex_explainer.p
