In [1]:
from deepview import DeepView
import matplotlib.pyplot as plt
import numpy as np
import time
# ---------------------------
import demo_utils as demo

%load_ext autoreload
%autoreload 2
%matplotlib qt

In [2]:
# matplotlib qt seems to be a bit buggy with notebooks, so we execute it multiple times
%matplotlib qt

# Table of contents

<br>

<font size="+1"><b>
    
 0. [Usage Instructions](#DeepView-Usage-Instructions)
 0. [Setting up DeepView](#Demo-with-Torch-model)
 0. [Tuning $\lambda$ Hyperparameter](#Tuning-the-$\lambda$-Hyperparameter)
 
</b></font>

## Getting data and models
 
Each section in this notebook can be run independently. At the beginning of each section, the corresponding model (i.e. torch/knn/decision tree) and dataset are therefore initialized. The reason is that running both torch and tensorflow simultaneously on the GPU may lead to problems.
This notebook tests the DeepView framework on different classifiers:

 * ResNet-20 on CIFAR10
 * DecisionTree on MNIST
 * RandomForest on MNIST
 * KNN on MNIST

---

## DeepView Usage Instructions

 1. Create a wrapper function like ```pred_wrapper``` that receives a numpy array of samples and returns the corresponding class probabilities from the classifier as a numpy array
 2. Initialize a DeepView object and pass the wrapper function to the constructor
 3. Run your code and call ```add_samples(samples, labels)``` at any time to add samples to the visualization together with the ground truth labels.
    * The ground truth labels will be visualized along with the predicted labels
    * The object will keep track of a maximum number of samples specified by ```max_samples``` and it will throw away the oldest samples first
 4. Call the ```show``` method to render the plot
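
The `max_samples` behaviour described in step 3 can be pictured as a first-in, first-out buffer. The following is a minimal sketch of that idea only; `SampleBuffer` is a hypothetical helper written for illustration and is not taken from DeepView's implementation.

```python
import numpy as np

class SampleBuffer:
    """Hypothetical helper: keeps only the newest `max_samples` samples."""

    def __init__(self, max_samples, data_shape):
        self.max_samples = max_samples
        self.samples = np.empty((0, *data_shape), dtype=np.float32)
        self.labels = np.empty((0,), dtype=np.int64)

    def add_samples(self, samples, labels):
        samples = np.asarray(samples, dtype=np.float32)
        labels = np.asarray(labels, dtype=np.int64)
        # append the new samples, then keep only the newest `max_samples`,
        # which drops the oldest entries from the front
        self.samples = np.concatenate([self.samples, samples])[-self.max_samples:]
        self.labels = np.concatenate([self.labels, labels])[-self.max_samples:]

buffer = SampleBuffer(max_samples=5, data_shape=(2,))
buffer.add_samples(np.zeros((4, 2)), [0, 1, 2, 3])
buffer.add_samples(np.ones((3, 2)), [4, 5, 6])
print(buffer.labels)  # the two oldest labels (0 and 1) have been dropped
```

Seven samples were added to a buffer of size five, so only the five most recent ones remain.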

The following parameters can be specified on initialization; those marked with (!) are required:

| <p align="left">Variable               | <p align="left">Meaning           |
|------------------------|-------------------|
| <p align="left">(!)```pred_wrapper```     | <p align="left">Wrapper function allowing DeepView to use your model. Expects a single argument, which should be a batch of samples to classify. Returns (valid / softmaxed) prediction probabilities for this batch of samples. |
| <p align="left">(!)```classes```          | <p align="left">Names of all different classes in the data. |
| <p align="left">(!)```max_samples```      | <p align="left">The maximum amount of samples that DeepView will keep track of. When more samples are added, the oldest samples are removed from DeepView. |
| <p align="left">(!)```batch_size```       | <p align="left">The batch size used for classification |
| <p align="left">(!)```data_shape```       | <p align="left">Shape of the input data (complete shape; excluding the batch dimension) |
| <p align="left">```resolution```       | <p align="left">x- and y-resolution of the decision boundary plot. A higher resolution takes significantly longer to compute, as every point must be classified. Default: 100. |
| <p align="left">```cmap```             | <p align="left">Name of the colormap that should be used in the plots, default 'tab10'. |
| <p align="left">```interactive```      | <p align="left">When ```interactive``` is True, the ```show``` method is non-blocking to allow plot updates. When ```interactive``` is False, it is blocking to prevent termination of python scripts. Default: True. |
| <p align="left">```title```            | <p align="left">Title of the deepview-plot. |
| <p align="left">```data_viz```         | <p align="left">DeepView has a reactive plot that responds to mouse clicks and shows the corresponding data sample when a point is clicked. You can pass a custom visualization function. If ```data_viz``` is None, DeepView will try to show each sample as an image, if possible. (optional, default None)  |
| <p align="left">```mapper```           | <p align="left">An object that maps samples from the data space to 2D space. Normally UMAP is used for this, but you can pass a custom mapper as well. (optional)  |
| <p align="left">```inv_mapper```       | <p align="left">An object that maps samples from the 2D space to the data space. Normally ```deepview.embeddings.InvMapper``` is used for this, but you can pass a custom inverse mapper as well. (optional)  |
| <p align="left">```kwargs```       | <p align="left">Configuration for the embeddings in case they are not specifically given in mapper and inv_mapper. Defaults to ```deepview.config.py```.  (optional)  |
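
To illustrate the contract that the table describes for ```pred_wrapper``` (a batch of samples in, one row of valid probabilities per sample out), here is a small framework-free sketch. The random linear scorer `weights` is a hypothetical stand-in for a trained model, and the shapes are assumptions chosen to match CIFAR10.

```python
import numpy as np

rng = np.random.default_rng(0)
data_shape = (3, 32, 32)
n_classes = 10
# hypothetical stand-in for a trained model: a random linear scorer
weights = rng.normal(size=(int(np.prod(data_shape)), n_classes)).astype(np.float32)

def softmax(logits):
    # subtract the row-wise maximum for numerical stability
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def pred_wrapper(x):
    # cast the batch to float32 and flatten every sample to a vector
    x = np.asarray(x, dtype=np.float32)
    logits = x.reshape(len(x), -1) @ weights
    # return softmaxed probabilities of shape (batch, n_classes)
    return softmax(logits)

batch = rng.normal(size=(4, *data_shape))
probs = pred_wrapper(batch)
print(probs.shape)
```

Checking that each row of `probs` sums to one is a quick sanity test before handing a wrapper to DeepView.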

## Demo with Torch model

1. Initialize a pretrained torch model
2. Create CIFAR10 dataset
3. Write the wrapper function ```pred_wrapper```
    1. (optional) Create a visualization function if you want to visualize single examples by clicking in the DeepView plot.
4. Initialize a ```DeepView``` object
5. Add samples to DeepView and call ```deepview.show```

In [3]:
import torch

# device will be detected automatically
# Set to 'cpu' or 'cuda:0' to set the device manually
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
# get torch model
torch_model = demo.create_torch_model(device)
# get CIFAR-10 data
testset = demo.make_cifar_dataset()

print('\nUsing device:', device)

Created PyTorch model:	 ResNet
 * Dataset:		 CIFAR10
 * Best Test prec:	 91.78000183105469
Files already downloaded and verified

Using device: cuda:0


In [46]:
# softmax operation to use in pred_wrapper
softmax = torch.nn.Softmax(dim=-1)

# this is the prediction wrapper, it encapsulates the call to the model
# and does all the casting to the appropriate datatypes
def pred_wrapper(x):
    with torch.no_grad():
        x = np.array(x, dtype=np.float32)
        tensor = torch.from_numpy(x).to(device)
        logits = torch_model(tensor)
        probabilities = softmax(logits).cpu().numpy()
    return probabilities

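# optional custom visualization for a single clicked sample;
# it is not used below, because DeepView is created with data_viz=None
def visualization(image, point2d, pred, label=None, title=None):
    f, a = plt.subplots()
    a.set_title(title)
    # torch images are channels-first (C, H, W); imshow expects (H, W, C)
    a.imshow(image.transpose([1, 2, 0]))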

# the classes in the dataset to be used as labels in the plots
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# --- Deep View Parameters ----
batch_size = 512
max_samples = 500
data_shape = (3, 32, 32)
n = 5
lam = .0
resolution = 100
cmap = 'tab10'
title = 'ResNet-20 - CIFAR10'

deepview = DeepView(pred_wrapper, classes, max_samples, batch_size, 
                    data_shape, n, lam, resolution, cmap, title=title, data_viz=None)

In [47]:
n_samples = 200
sample_ids = np.random.choice(len(testset), n_samples)
X = np.array([ testset[i][0].numpy() for i in sample_ids ])
Y = np.array([ testset[i][1] for i in sample_ids ])

t0 = time.time()
deepview.add_samples(X, Y)
deepview.show()

print('Time to calculate visualization for %d samples: %.2f sec' % (n_samples, time.time() - t0))

Distance calculation 20.00 %
Distance calculation 40.00 %
Distance calculation 60.00 %
Distance calculation 80.00 %
Distance calculation 100.00 %
Embedding samples ...
Computing decision regions ...
Time to calculate visualization for 200 samples: 44.74 sec


> The visualization mostly contains collapsed clusters of data samples, because $\lambda$ has not been tuned yet (it is still set to 0).


## Tuning the $\lambda$ Hyperparameter

For $\lambda = 0$, the DeepView plot will be organized according to the certainty of the model. When this is used with a perfect model, the embedding will contain collapsed class clusters, as points with identical predictions will be mapped to the same position.

<img alt="img_collapsed_clusters" width=400px src="https://user-images.githubusercontent.com/30961397/90417945-be51fa00-e0b4-11ea-8951-d0183432c90b.png">

For $\lambda = 1$, the DeepView plot will be organized according to structural properties of the datapoints, ignoring the predictions of the model. So for most image-datasets, the points will be scattered all over the place, as the structure itself is an insufficient indicator for sample class.

<img alt="img_diffused_clusters" width=400px src="https://user-images.githubusercontent.com/30961397/90418110-fc4f1e00-e0b4-11ea-8c85-fa3b53a8e531.png">

A balance must be found here in order to get class clusters that are organized by the structural properties of the data points.
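
The two extremes can be made concrete with a toy distance. The sketch below assumes that $\lambda$ blends a prediction-based distance and a Euclidean distance as a convex combination, and it uses the Jensen-Shannon divergence between predicted probabilities as a simplified stand-in for the model-based part. Both `js_divergence` and `mixed_distance` are illustrative names, not DeepView functions.

```python
import numpy as np

def kl_divergence(a, b, eps=1e-12):
    # Kullback-Leibler divergence, with eps to avoid log(0)
    return float(np.sum(a * np.log((a + eps) / (b + eps))))

def js_divergence(p, q):
    # Jensen-Shannon divergence between two probability vectors
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def mixed_distance(x_a, x_b, p_a, p_b, lam):
    # assumption: lam weights the structural (Euclidean) part,
    # (1 - lam) weights the prediction-based part
    structural = float(np.linalg.norm(x_a - x_b))
    discriminative = js_divergence(p_a, p_b)
    return (1.0 - lam) * discriminative + lam * structural

# two different inputs that the model predicts identically
x_a, x_b = np.array([0.0, 0.0]), np.array([3.0, 4.0])
p_a = np.array([1.0, 0.0])
p_b = np.array([1.0, 0.0])

d_pred_only = mixed_distance(x_a, x_b, p_a, p_b, lam=0.0)    # 0.0: points collapse
d_struct_only = mixed_distance(x_a, x_b, p_a, p_b, lam=1.0)  # 5.0: predictions ignored
print(d_pred_only, d_struct_only)
```

With `lam=0.0` the two samples are at distance zero, which is the collapse shown in the first image; with `lam=1.0` only their input-space distance matters.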

**The [DeepView-paper](https://www.ijcai.org/Proceedings/2020/0319.pdf) suggests the following tuning method for $\lambda$:**

As a metric for how well the embedding $\pi$ represents the view of the model, the consistency of the point clusters in the embedding with the model's predictions can be used:
> [...] points that are close to each other should be points that are classified similarly by the classification model.

To evaluate this, a kNN model is trained on the embedded samples with the model predictions as labels.
The leave-one-out error $Q_{kNN}$ of this classifier then serves as the metric. In order to tune $\lambda$,

> [...] we evaluate $\pi$ for $Q_{kNN}$ with $\lambda \in [0.2;0.8]$ and choose the largest one that does not degrade $Q_{kNN}$ significantly.[...]
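
As a rough illustration of what a leave-one-out kNN error measures, the sketch below computes it with plain NumPy on a handful of 2D points. The function name `leave_one_out_knn_error`, the choice of `k`, and the toy data are assumptions made for this example; the notebook itself relies on `evaluate_umap` for the actual computation.

```python
import numpy as np

def leave_one_out_knn_error(points, labels, k=3):
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # pairwise Euclidean distances between all embedded points
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # leave-one-out: a point may not vote for itself
    np.fill_diagonal(dists, np.inf)
    neighbors = np.argsort(dists, axis=1)[:, :k]
    errors = 0
    for i, idx in enumerate(neighbors):
        # majority vote among the k nearest neighbours
        values, counts = np.unique(labels[idx], return_counts=True)
        predicted = values[np.argmax(counts)]
        errors += int(predicted != labels[i])
    return errors / len(points)

# two well separated clusters of four points each
embedded = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
clean_error = leave_one_out_knn_error(embedded, [0, 0, 0, 0, 1, 1, 1, 1])
# the first point now carries a label that disagrees with its neighbours
noisy_error = leave_one_out_knn_error(embedded, [1, 0, 0, 0, 1, 1, 1, 1])
print(clean_error, noisy_error)
```

A low error means that neighbouring points in the embedding share the same predicted label, which is the consistency property quoted above.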

This is applied here for 6 values linearly spaced across $[0; 1]$:

In [25]:
from deepview.evaluate import evaluate_umap

print('Evaluation of DeepView: %s\n' % deepview.title)

for l in np.linspace(0., 1., 6):
    deepview.verbose = False
    deepview.set_lambda(l)
    q_knn = evaluate_umap(deepview, X, Y, True)['pred']['fish_umap']
    print('Lambda: %.2f - Q_kNN: %.3f' % (l, q_knn))

Evaluation of DeepView: ResNet-20 - CIFAR10

Lambda: 0.00 - Q_kNN: 0.025
Lambda: 0.20 - Q_kNN: 0.020
Lambda: 0.40 - Q_kNN: 0.035
Lambda: 0.60 - Q_kNN: 0.065
Lambda: 0.80 - Q_kNN: 0.570
Lambda: 1.00 - Q_kNN: 0.825


In [27]:
for l in np.linspace(0.6, 0.8, 6):
    deepview.verbose = False
    deepview.set_lambda(l)
    q_knn = evaluate_umap(deepview, X, Y, True)['pred']['fish_umap']
    print('Lambda: %.2f - Q_kNN: %.3f' % (l, q_knn))

Lambda: 0.60 - Q_kNN: 0.065
Lambda: 0.64 - Q_kNN: 0.070
Lambda: 0.68 - Q_kNN: 0.135
Lambda: 0.72 - Q_kNN: 0.245
Lambda: 0.76 - Q_kNN: 0.355
Lambda: 0.80 - Q_kNN: 0.570
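
The rule "choose the largest $\lambda$ that does not degrade $Q_{kNN}$ significantly" can be written down explicitly. The sketch below applies it to the values printed by the two sweeps above. What counts as "significantly" is not specified here, so the `tolerance` of 0.06 above the best error is an arbitrary assumption, and `largest_acceptable_lambda` is a hypothetical helper.

```python
# values copied from the two sweeps above
lambdas = [0.0, 0.2, 0.4, 0.6, 0.64, 0.68, 0.72, 0.76, 0.8, 1.0]
q_knn = [0.025, 0.020, 0.035, 0.065, 0.070, 0.135, 0.245, 0.355, 0.570, 0.825]

def largest_acceptable_lambda(lambdas, errors, tolerance=0.06):
    # assumption: an error within `tolerance` of the best one is "not degraded"
    best = min(errors)
    acceptable = [lam for lam, err in zip(lambdas, errors) if err <= best + tolerance]
    return max(acceptable)

chosen = largest_acceptable_lambda(lambdas, q_knn)
print(chosen)
```

A stricter or looser tolerance shifts the result, so the threshold should be checked against the plot rather than trusted blindly.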


**Results**

These values show that for $\lambda \lesssim 0.64$, $Q_{kNN}$ stays low (at most 0.070), so classification clusters form in the embedding. For larger $\lambda$, the error rises quickly (0.135 at $\lambda = 0.68$) and the clusters diffuse too much. Choosing $\lambda = 0.65$ therefore allows the clusters to form while also letting the structural information shape them:

In [43]:
deepview.set_lambda(0.65)
deepview.show()

**Plot after $\lambda$-tuning**

<img alt="img_good_clusters" width=500px src="https://user-images.githubusercontent.com/30961397/90418547-8a2b0900-e0b5-11ea-8ae2-2b4a76c2e6ce.png">

In [None]:
deepview.close()