# Tutorial 2 - SimplEx for Time Series Data

In this tutorial we create a SimplEx explainer object and use it to explain a test record. The explainer is then saved to disk so that it can be given to someone else to view in the Interpretability App. (TODO: add link to app).

We will explain the predictions of a PyTorch convolutional neural net that we have separately trained on an engine noise dataset from the IEEE World Congress on Computational Intelligence, 2008, and then saved. The `interpretability.models` module provides a PyTorch model for this that is compatible with the trained model `state_dict`s available from the Google Drive link below.

### Import the relevant modules

In [1]:
# IMPORTS
# Standard
import os
import numpy as np
import pathlib
# Third Party
import torch
# Interpretability
from interpretability.interpretability_models import simplex_explainer
from interpretability.interpretability_models.utils import io
from interpretability.models.recurrent_neural_net import ConvNet

### Load the data 
Load the data and split it into the corpus of examples used for explanation and the test examples we will explain. This cell downloads the data from the `root_url` and saves it to a subdirectory of the folder in which this notebook is being run.

In [2]:
# LOADS
def load_forda_data():

    def readucr(filename):
        data = np.loadtxt(filename, delimiter="\t")
        y = data[:, 0]
        x = data[:, 1:]
        return x, y.astype(int)

    root_url = "https://raw.githubusercontent.com/hfawaz/cd-diagram/master/FordA/"

    x_train, y_train = readucr(root_url + "FordA_TRAIN.tsv")
    x_test, y_test = readucr(root_url + "FordA_TEST.tsv")

    x_train = x_train.reshape((x_train.shape[0], x_train.shape[1], 1))
    x_test = x_test.reshape((x_test.shape[0], x_test.shape[1], 1))

    idx = np.random.permutation(len(x_train))
    x_train = x_train[idx]
    y_train = y_train[idx]

    y_train[y_train == -1] = 0
    y_test[y_test == -1] = 0

    return x_train, y_train, x_test, y_test


# LOAD data
(
    X_corpus,
    y_corpus,
    X_explain,
    y_explain,
) = load_forda_data()
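
The reshaping and label remapping inside `load_forda_data` can be checked in isolation. The sketch below applies the same two steps to a small made-up array, assuming (as in the FordA files) that the first column holds labels in {-1, 1}. The helper name `to_model_inputs` and the toy values are illustrative and not part of the dataset or the library.

```python
import numpy as np

def to_model_inputs(table):
    """Split a (n_records, 1 + n_steps) table into features and labels."""
    y = table[:, 0].astype(int)
    x = table[:, 1:]
    # Add a trailing feature axis: (n_records, n_steps) -> (n_records, n_steps, 1)
    x = x.reshape((x.shape[0], x.shape[1], 1))
    # Map the negative class from -1 to 0 so that the labels are {0, 1}
    y[y == -1] = 0
    return x, y

# Hypothetical toy table: label in column 0, followed by four time steps
toy = np.array([
    [1.0, 0.1, 0.2, 0.3, 0.4],
    [-1.0, 0.5, 0.6, 0.7, 0.8],
    [-1.0, 0.9, 1.0, 1.1, 1.2],
])
x_toy, y_toy = to_model_inputs(toy)
print(x_toy.shape, y_toy)
```

The trailing axis of size 1 corresponds to the single feature that is later named "Engine Noise".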

### Download the trained model from Google Drive

You could train your own model using the ConvNet class and load it here, but we have trained one already.

Download the model using this link: https://drive.google.com/file/d/173vniHegUSGmdC6fKCLupynRoxEdz9Ko/view?usp=sharing and save it in a location matching the path `TRAINED_MODEL_STATE_PATH` below. The default is the desktop.

### Load the model

In [4]:
## Load the model
model = ConvNet()

def load_trained_model(model, trained_model_state_path):
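    # map_location="cpu" lets a state_dict saved on a GPU load on a CPU-only machine
    model.load_state_dict(torch.load(trained_model_state_path, map_location="cpu"))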
    model.eval()
    return model

desktop_path = pathlib.Path.home() / 'Desktop'

TRAINED_MODEL_STATE_PATH = os.path.join(desktop_path, "model_cv1_2.pth")
model = load_trained_model(model, TRAINED_MODEL_STATE_PATH)
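
The pattern used above (build the architecture, load the saved parameters into it, then switch to evaluation mode) can be tried without the downloaded file. The following is a minimal sketch that round-trips a `state_dict` through an in-memory buffer. `TinyNet` is a hypothetical stand-in for `ConvNet` and is not part of the library.

```python
from io import BytesIO

import torch
from torch import nn

class TinyNet(nn.Module):
    """Hypothetical stand-in for ConvNet, used only for this illustration."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)

trained = TinyNet()

# Save only the parameters (the state_dict), here to an in-memory buffer
buffer = BytesIO()
torch.save(trained.state_dict(), buffer)
buffer.seek(0)

# Recreate the architecture, then load the saved parameters into it
restored = TinyNet()
restored.load_state_dict(torch.load(buffer, map_location="cpu"))
restored.eval()  # put layers such as dropout and batch norm into inference mode

x = torch.ones(1, 4)
with torch.no_grad():
    same_output = torch.equal(trained(x), restored(x))
print(same_output)
```

Because a `state_dict` holds only parameters, the class used to rebuild the model must match the one that was trained, which is why the tutorial relies on the `ConvNet` class shipped with the package.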

### Initialize SimplEx
Initialize the explainer object by passing the predictive model and corpus.

In [5]:
# Initialize SimplEx
corpus_size = 100
# Pass the model and the corpus; fitting on the test examples happens in the next step
my_explainer = simplex_explainer.SimplexTimeSeriesExplainer(
    model,
    X_corpus,
    y_corpus,
    estimator_type="classifier",
    feature_names=["Engine Noise"],
    corpus_size=corpus_size,
    device="cuda" if torch.cuda.is_available() else "cpu",
)

### Fit the explainer

Fit the explainer on the test data. This makes explanations of the test data available in the subsequent step.

In [6]:
my_explainer.fit(X_explain, y_explain)

Weight Fitting Epoch: 2000/10000 ; Error: 5.86e+06 ; Regulator: 863 ; Reg Factor: 1
Weight Fitting Epoch: 4000/10000 ; Error: 5.63e+06 ; Regulator: 620 ; Reg Factor: 1
Weight Fitting Epoch: 6000/10000 ; Error: 5.6e+06 ; Regulator: 598 ; Reg Factor: 1
Weight Fitting Epoch: 8000/10000 ; Error: 5.59e+06 ; Regulator: 593 ; Reg Factor: 1
Weight Fitting Epoch: 10000/10000 ; Error: 5.59e+06 ; Regulator: 591 ; Reg Factor: 1
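
The log above reports the progress of a weight-fitting loop. As a rough illustration of the idea behind SimplEx, and not the library's implementation (which works on the model's latent representations and also applies the regularizer reported in the log), the sketch below finds non-negative weights that sum to one so that a weighted mix of corpus points approximates a test point. The function names, the toy vectors, the softmax parametrization, and the use of plain gradient descent are all assumptions chosen for brevity.

```python
import numpy as np

def softmax(z):
    """Map unconstrained values to non-negative weights that sum to one."""
    e = np.exp(z - z.max())
    return e / e.sum()

def fit_simplex_weights(corpus_points, test_point, n_steps=5000, lr=0.5):
    """Gradient descent on ||w @ corpus_points - test_point||^2 with w = softmax(logits)."""
    logits = np.zeros(corpus_points.shape[0])
    for _ in range(n_steps):
        w = softmax(logits)
        residual = w @ corpus_points - test_point
        # Gradient of the squared error with respect to the weights
        grad_w = 2.0 * corpus_points @ residual
        # Chain rule through the softmax
        grad_logits = w * (grad_w - w @ grad_w)
        logits -= lr * grad_logits
    return softmax(logits)

# Toy corpus: three points at the corners of a triangle (hypothetical data)
corpus_points = np.eye(3)
test_point = np.array([0.6, 0.3, 0.1])
weights = fit_simplex_weights(corpus_points, test_point)
print(weights.round(3))
```

With the identity matrix as the corpus, the fitted weights should approach the test point itself, roughly `[0.6, 0.3, 0.1]`. These weights play the role of the "Example Importance" values reported later in the tutorial.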


### Get the explanation
Explain any record in the test set by changing the index `i`.

In [7]:
i = 1
explanation = my_explainer.explain(i, baseline="median")

### Plot the explanation

The explanation is plotted as a styled DataFrame in this notebook, but it can also be viewed in the browser if `return_type` is set to "html".

In [12]:
# Plot the explanation for the record chosen above
my_explainer.summary_plot(
    # rescale_dict=rescale_dict,
    example_importance_threshold=0.08,
    time_steps_to_display=10,
    plot_test=True,
    output_file_prefix="testing",
    open_in_browser=True,
    return_type="styled_df",
)

| | Engine Noise |
|---|---|
| (t_max) - 9 | 0.435186 |
| (t_max) - 8 | -0.346502 |
| (t_max) - 7 | -0.924912 |
| (t_max) - 6 | -1.208716 |
| (t_max) - 5 | -1.247996 |
| (t_max) - 4 | -1.139974 |
| (t_max) - 3 | -1.041772 |
| (t_max) - 2 | -1.041772 |
| (t_max) - 1 | -1.159614 |
| (t_max) | -1.375659 |
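
The row labels in the output above count back from the final time step, `(t_max)`. Assuming that `time_steps_to_display=10` simply keeps the last ten steps of the series (an inference from the output, not something confirmed against the library), the labelling can be reproduced with a small helper. `last_steps_with_labels` and the toy series are hypothetical.

```python
def last_steps_with_labels(series, n_display):
    """Return (label, value) pairs for the final n_display time steps."""
    tail = series[-n_display:]
    labels = []
    # Count down so that the last element is labelled "(t_max)"
    for offset in range(len(tail) - 1, -1, -1):
        labels.append("(t_max)" if offset == 0 else f"(t_max) - {offset}")
    return list(zip(labels, tail))

# Hypothetical series of five time steps
toy_series = [0.5, 0.1, -0.2, -0.7, -1.1]
rows = last_steps_with_labels(toy_series, n_display=3)
for label, value in rows:
    print(label, value)
```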


Corpus Example: 0
Example Importance: 0.15802191197872162


| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Engine Noise | -0.665776 | -0.617297 | -0.422226 | -0.027928 | 0.51273 | 1.066778 | 1.529639 | 1.760492 | 1.702779 | 1.356499 |


Corpus Example: 1
Example Importance: 0.12179539352655411


| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Engine Noise | -1.292667 | -1.337287 | -1.337287 | -1.337287 | -1.359596 | -1.348442 | -1.370751 | -1.381906 | -1.393061 | -1.393061 |


Corpus Example: 2
Example Importance: 0.12088353931903839


| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Engine Noise | 0.647003 | -0.00981 | -0.780005 | -1.279253 | -1.217744 | -0.537045 | 0.570117 | 1.785945 | 2.729083 | 3.036628 |


Corpus Example: 3
Example Importance: 0.11819148808717728


| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Engine Noise | -1.672539 | -2.083083 | -1.864126 | -1.088655 | 0.009778 | 1.082665 | 1.830767 | 2.058847 | 1.739535 | 1.046173 |
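
All four corpus examples shown above have an importance greater than 0.08, which is consistent with `example_importance_threshold` acting as a cut-off on the fitted weights. Under that assumption (the library's exact comparison, `>` or `>=`, is not confirmed here), the selection can be sketched as follows. `select_important_examples` and the toy importances are hypothetical.

```python
def select_important_examples(importances, threshold):
    """Return (corpus_index, importance) pairs at or above threshold, largest first."""
    kept = [(i, w) for i, w in enumerate(importances) if w >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical importances for six corpus examples (not taken from the fitted explainer)
toy_importances = [0.03, 0.16, 0.12, 0.01, 0.09, 0.05]
selected = select_important_examples(toy_importances, threshold=0.08)
print(selected)
```

Raising the threshold shortens the list of corpus examples shown, and lowering it includes examples that contribute less to the explanation.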


### Save the explainer to file
This file can now be uploaded to the Interpretability Suite app (TODO: add link). The app provides a non-programmatic interface for viewing the various explanations, so you can send the explainer to a colleague who is less fluent in Python.

In [9]:
io.save_explainer(
    my_explainer, "my_new_forda_conv_time_simplex_explainer.p"
)

Saving explainer to: /home/rob/Documents/projects/Interpretability/Notebooks/my_new_forda_conv_time_simplex_explainer.p
