# Tutorial 2 - SimplEx for Time Series Data

In this tutorial we create a SimplEx explainer object and use it to explain a test record. The explainer is then saved to disk so that it can be shared with someone else and viewed in the [Interpretability Suite App](https://vanderschaarlab-demo-interpretabi-interpretability-suite-1uteyn.streamlit.app/).

We will explain the predictions of a PyTorch convolutional neural network that was trained and saved separately on an engine noise dataset from the IEEE World Congress on Computational Intelligence, 2008. The `interpretability.models` module provides a PyTorch model class that is compatible with the trained model `state_dict` available from the Google Drive link below.

### Import the relevant modules

In [None]:
# IMPORTS
# Standard
import os
import numpy as np
import pathlib
# Third Party
import sklearn.preprocessing  # only needed for the optional scaling example below
import torch
# Interpretability
from interpretability.interpretability_models import simplex_explainer
from interpretability.interpretability_models.utils import io
from interpretability.models.recurrent_neural_net import ConvNet

### Load the data
Load the data and split it into the corpus of examples used for explanation and the test examples we will explain. This cell downloads the data from the `root_url` and saves it to a subdirectory of the folder from which this notebook is run.

In [None]:
# LOADS
def load_forda_data():

    def readucr(filename):
        data = np.loadtxt(filename, delimiter="\t")
        y = data[:, 0]
        x = data[:, 1:]
        return x, y.astype(int)

    root_url = "https://raw.githubusercontent.com/hfawaz/cd-diagram/master/FordA/"

    x_train, y_train = readucr(root_url + "FordA_TRAIN.tsv")
    x_test, y_test = readucr(root_url + "FordA_TEST.tsv")

    x_train = x_train.reshape((x_train.shape[0], x_train.shape[1], 1))
    x_test = x_test.reshape((x_test.shape[0], x_test.shape[1], 1))

    idx = np.random.permutation(len(x_train))
    x_train = x_train[idx]
    y_train = y_train[idx]

    y_train[y_train == -1] = 0
    y_test[y_test == -1] = 0

    return x_train, y_train, x_test, y_test


# LOAD data
(
    X_corpus,
    y_corpus,
    X_explain,
    y_explain,
) = load_forda_data()

# Scaling is not required here; it is shown purely for illustrative purposes
# scaler = sklearn.preprocessing.MinMaxScaler()
# scaler.fit(X_corpus)
# X_corpus, X_explain = scaler.transform(X_corpus), scaler.transform(X_explain)
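
The array layout produced by the loader can be checked without downloading anything. The sketch below is a minimal illustration, assuming a tiny made-up table (`fake_tsv`) in the same layout as the FordA files: the label in the first column and the series values after it. It applies the same reshape and label remapping as `load_forda_data`. The seeded generator used for the shuffle is a choice made for this sketch only; the tutorial code shuffles without a seed.

```python
from io import StringIO

import numpy as np

# Three made-up rows: label first (1 or -1), then four time steps.
fake_tsv = (
    "1\t0.1\t0.2\t0.3\t0.4\n"
    "-1\t0.5\t0.6\t0.7\t0.8\n"
    "1\t0.9\t1.0\t1.1\t1.2\n"
)

data = np.loadtxt(StringIO(fake_tsv), delimiter="\t")
y = data[:, 0].astype(int)
x = data[:, 1:]

# Add a trailing channel axis: (n_series, n_time_steps) -> (n_series, n_time_steps, 1)
x = x.reshape((x.shape[0], x.shape[1], 1))

# Map the labels from {-1, 1} to {0, 1}
y[y == -1] = 0

# Shuffle series and labels together, with a fixed seed for repeatability
rng = np.random.default_rng(0)
idx = rng.permutation(len(x))
x, y = x[idx], y[idx]

print(x.shape)  # (3, 4, 1)
print(sorted(set(y.tolist())))  # [0, 1]
```

Each series ends up as a column of single-feature time steps, which matches the single entry in `feature_names` used later in this tutorial.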

### Download the trained model from Google Drive

You could train your own model using the ConvNet class and load it here, but we have trained one already.

Download the model using this link: https://drive.google.com/file/d/173vniHegUSGmdC6fKCLupynRoxEdz9Ko/view?usp=sharing and save it in a location matching the path `TRAINED_MODEL_STATE_PATH` below. The default location is the `"resources/saved_models"` folder inside the root Interpretability directory.
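
Before loading, it can help to confirm that the downloaded file is where the notebook expects it. The following is a small sketch using only the standard library; `require_file` is a hypothetical helper introduced here for illustration and is not part of the Interpretability package.

```python
import pathlib


def require_file(path):
    """Hypothetical helper: return `path` if the file exists, else raise a clear error."""
    path = pathlib.Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected the trained model state at {path}. "
            "Download it from the link above and save it there."
        )
    return path


# Default location described above, relative to the notebook's parent folder.
candidate = pathlib.Path.cwd().parent / "resources/saved_models" / "model_cv1_2.pth"
print(candidate, "found" if candidate.is_file() else "missing")
```

Calling `require_file(candidate)` just before the model is loaded would turn a missing download into an explicit message rather than a lower-level loading error.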


### Load the model

In [None]:
## Load the model
model = ConvNet()

def load_trained_model(model, trained_model_state_path, device='cpu'):
    model.load_state_dict(torch.load(trained_model_state_path, map_location=torch.device(device)))
    model.eval()
    return model

DEVICE = "cpu"

root_path = pathlib.Path.cwd().parents[0]
saved_models_path = root_path / "resources/saved_models"
TRAINED_MODEL_STATE_PATH = saved_models_path / "model_cv1_2.pth"
model = load_trained_model(model, TRAINED_MODEL_STATE_PATH, device=DEVICE)

### Initialize SimplEx
Initialize the explainer object by passing the predictive model and corpus.

In [None]:
# Corpus size passed to the explainer
corpus_size = 100
# Initialize SimplEx with the predictive model and the corpus
my_explainer = simplex_explainer.SimplexTimeSeriesExplainer(
    model,
    X_corpus,
    y_corpus,
    estimator_type="classifier",
    feature_names=["Engine Noise"],
    corpus_size=corpus_size,
    device="cuda" if torch.cuda.is_available() else "cpu",
)

### Fit the explainer

Fit the explainer on the test data. This makes explanations of the test data available in the subsequent step.

In [None]:
my_explainer.fit(X_explain, y_explain)
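
For intuition about what fitting involves, SimplEx approximates the latent representation of a test example as a weighted mix of corpus latent representations, with non-negative weights that sum to one. The sketch below illustrates that idea on made-up two-dimensional latents. It is not the library's implementation: the optimizer here is plain projected gradient descent, chosen for brevity, and the names `project_to_simplex` and `fit_simplex_weights` are hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) == 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def fit_simplex_weights(corpus_latents, test_latent, n_steps=200):
    """Minimise ||weights @ corpus_latents - test_latent||^2 over the simplex."""
    n_corpus = corpus_latents.shape[0]
    weights = np.full(n_corpus, 1.0 / n_corpus)  # start from uniform weights
    # Step size 1/L, where L = 2 * (largest singular value)^2 bounds the curvature.
    step = 1.0 / (2.0 * np.linalg.norm(corpus_latents, 2) ** 2)
    for _ in range(n_steps):
        residual = weights @ corpus_latents - test_latent
        grad = 2.0 * corpus_latents @ residual
        weights = project_to_simplex(weights - step * grad)
    return weights


# Made-up latents: three corpus points and one test point inside their hull.
corpus_latents = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
test_latent = np.array([0.2, 0.3])

weights = fit_simplex_weights(corpus_latents, test_latent)
print(np.round(weights, 3))
```

In this toy case the test point is exactly a 0.5 / 0.2 / 0.3 mix of the three corpus points, so the recovered weights should be close to those values. The weights are what make an explanation readable: each one says how much a given corpus example contributes to the approximation.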

### Get the explanation
Explain any given record in the test set by changing the index `i`.

In [None]:
i = 1
explanation = my_explainer.explain(i, baseline="median")

### Plot the explanation

The explanation is plotted as a styled DataFrame in this notebook, but it can also be viewed in the browser by setting `return_type` to `"html"`.

In [None]:
# Corpus examples (engine noise series) that explain the test record
my_explainer.summary_plot(
    example_importance_threshold=0.08,
    time_steps_to_display=10,
    return_type="styled_df",
    # rescaler=scaler,
)
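
As a rough illustration of what an importance threshold does, the sketch below keeps only the corpus examples whose weight reaches the cut-off. It assumes that `example_importance_threshold` hides corpus examples with an importance below the given value; that reading is an assumption based on the parameter name, and the weights are made up.

```python
import numpy as np

# Made-up importance weights for six corpus examples (they sum to 1).
example_weights = np.array([0.45, 0.30, 0.10, 0.07, 0.05, 0.03])
threshold = 0.08  # same value as passed to summary_plot above

# Assumed behaviour: keep examples whose weight is at least the threshold.
kept = np.flatnonzero(example_weights >= threshold)
covered = float(example_weights[kept].sum())

print(kept.tolist())  # [0, 1, 2]
print(round(covered, 2))  # 0.85
```

Under that assumption, a higher threshold gives a shorter table that accounts for less of the total weight, and a lower threshold gives a longer one.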

### Save the explainer to file
This file can now be uploaded to the [Interpretability Suite App](https://vanderschaarlab-demo-interpretabi-interpretability-suite-1uteyn.streamlit.app/). The app provides a non-programmatic interface for viewing the various explanations, so you can send the explainer to a colleague who is less fluent in Python.

In [None]:
io.save_explainer(
    my_explainer, "my_new_forda_conv_time_simplex_explainer.p"
)
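
The `.p` suffix suggests that the saved file is a pickle, although the tutorial does not state this, so treat it as an assumption. The sketch below shows a generic pickle round trip on a stand-in dictionary rather than a real explainer. It does not reproduce `io.save_explainer`; it only illustrates the kind of file that would be shared under that assumption.

```python
import pathlib
import pickle
import tempfile

# Stand-in for an explainer object; the real one holds the model and corpus.
stand_in = {"corpus_size": 100, "feature_names": ["Engine Noise"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = pathlib.Path(tmp_dir) / "stand_in_explainer.p"

    # Write the object to disk ...
    with open(path, "wb") as f:
        pickle.dump(stand_in, f)

    # ... and read it back, as a recipient of the file would.
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == stand_in)  # True
```

If the file is indeed a pickle, anyone loading it in Python needs the Interpretability package installed and should only open files from a trusted source, since unpickling can execute arbitrary code.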