# HiPlot in Notebooks

A simple guide to getting HiPlot working for hyperparameter tuning.

https://github.com/facebookresearch/hiplot


## HiPlot + Optuna + PyTorch (Hyperparameter Optimisation Techniques and Displays)

This notebook uses Optuna to perform hyperparameter optimisation on an MLP PyTorch model, then visualises the results with HiPlot.

In [1]:
# install hiplot library for visualisation
!pip install -U hiplot
# install optuna library for hyperparameter optimisation
!pip install optuna

Collecting hiplot
  Downloading https://files.pythonhosted.org/packages/b7/b3/7e8cc68d23544c4b44e5a551b2fde4b0621860680aa146f9c9cf583f3d53/hiplot-0.1.22-py3-none-any.whl (748kB)

Collecting optuna
  Downloading https://files.pythonhosted.org/packages/87/10/06b58f4120f26b603d905a594650440ea1fd74476b8b360dbf01e111469b/optuna-2.3.0.tar.gz (258kB)

In [2]:
import hiplot as hip

In [3]:
#@title Hyperparameter tuning using PyTorch, Optuna {display-mode: "form"}
#@markdown Using the MNIST dataset, optimise a PyTorch model (MLP) to improve the accuracy of hand-written digit recognition. Adapted from https://github.com/optuna/optuna/blob/master/examples/pytorch_simple.py


# This code will be hidden when the notebook is loaded.

"""
Optuna example that optimizes multi-layer perceptrons using PyTorch.
In this example, we optimize the validation accuracy of hand-written digit recognition using
PyTorch and MNIST. We optimize the neural network architecture as well as the optimizer
configuration. As it is too time consuming to use the whole MNIST dataset, we here use a small
subset of it.
"""

import os

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torch.utils.data
from torchvision import datasets
from torchvision import transforms

import optuna


DEVICE = torch.device("cpu")
BATCHSIZE = 128
CLASSES = 10
DIR = os.getcwd()
EPOCHS = 10
LOG_INTERVAL = 10
N_TRAIN_EXAMPLES = BATCHSIZE * 30
N_VALID_EXAMPLES = BATCHSIZE * 10


def define_model(trial):
    # HiPlot demo: the number of layers is fixed at 3 rather than optimised.
    # n_layers = trial.suggest_int("n_layers", 1, 3)
    n_layers = 3
    layers = []

    in_features = 28 * 28
    for i in range(n_layers):
        out_features = trial.suggest_int("n_units_l{}".format(i), 4, 128)
        layers.append(nn.Linear(in_features, out_features))
        layers.append(nn.ReLU())
        p = trial.suggest_float("dropout_l{}".format(i), 0.2, 0.5)
        layers.append(nn.Dropout(p))

        in_features = out_features
    layers.append(nn.Linear(in_features, CLASSES))
    layers.append(nn.LogSoftmax(dim=1))

    return nn.Sequential(*layers)


def get_mnist():
    # Load MNIST dataset.
    train_loader = torch.utils.data.DataLoader(
        datasets.MNIST(DIR, train=True, download=True, transform=transforms.ToTensor()),
        batch_size=BATCHSIZE,
        shuffle=True,
    )
    valid_loader = torch.utils.data.DataLoader(
        datasets.MNIST(DIR, train=False, transform=transforms.ToTensor()),
        batch_size=BATCHSIZE,
        shuffle=True,
    )

    return train_loader, valid_loader


def objective(trial):

    # Generate the model.
    model = define_model(trial).to(DEVICE)

    # Generate the optimizers.
    optimizer_name = trial.suggest_categorical("optimizer", ["Adam", "RMSprop", "SGD"])
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    optimizer = getattr(optim, optimizer_name)(model.parameters(), lr=lr)

    # Get the MNIST dataset.
    train_loader, valid_loader = get_mnist()

    # Training of the model.
    for epoch in range(EPOCHS):
        model.train()
        for batch_idx, (data, target) in enumerate(train_loader):
            # Limiting training data for faster epochs.
            if batch_idx * BATCHSIZE >= N_TRAIN_EXAMPLES:
                break

            data, target = data.view(data.size(0), -1).to(DEVICE), target.to(DEVICE)

            optimizer.zero_grad()
            output = model(data)
            loss = F.nll_loss(output, target)
            loss.backward()
            optimizer.step()

        # Validation of the model.
        model.eval()
        correct = 0
        with torch.no_grad():
            for batch_idx, (data, target) in enumerate(valid_loader):
                # Limiting validation data.
                if batch_idx * BATCHSIZE >= N_VALID_EXAMPLES:
                    break
                data, target = data.view(data.size(0), -1).to(DEVICE), target.to(DEVICE)
                output = model(data)
                # Get the index of the max log-probability.
                pred = output.argmax(dim=1, keepdim=True)
                correct += pred.eq(target.view_as(pred)).sum().item()

        accuracy = correct / min(len(valid_loader.dataset), N_VALID_EXAMPLES)

        trial.report(accuracy, epoch)

        # Handle pruning based on the intermediate value.
        if trial.should_prune():
            raise optuna.exceptions.TrialPruned()

    return accuracy
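The early `break` statements in the training and validation loops limit each epoch to a fixed number of batches. The sketch below reproduces that counting logic in plain Python to show how many batches are actually used and what the accuracy denominator is. The helper name `batches_used` is hypothetical and not part of the notebook; the dataset sizes (60,000 training and 10,000 test images) are the standard MNIST sizes.

```python
import math

BATCHSIZE = 128
N_TRAIN_EXAMPLES = BATCHSIZE * 30  # 3840 images per epoch
N_VALID_EXAMPLES = BATCHSIZE * 10  # 1280 images per evaluation


def batches_used(dataset_size, batch_size, example_limit):
    """Count the batches processed before the `break` condition fires.

    Hypothetical helper: mirrors `if batch_idx * BATCHSIZE >= limit: break`.
    """
    # Number of batches a DataLoader yields when the last partial batch is kept.
    n_batches = math.ceil(dataset_size / batch_size)
    used = 0
    for batch_idx in range(n_batches):
        if batch_idx * batch_size >= example_limit:
            break
        used += 1
    return used


train_batches = batches_used(60_000, BATCHSIZE, N_TRAIN_EXAMPLES)
valid_batches = batches_used(10_000, BATCHSIZE, N_VALID_EXAMPLES)
print(train_batches, valid_batches)  # 30 10

# Accuracy is `correct / 1280`, so a reported value maps back to a count.
best_accuracy = 0.94921875
print(round(best_accuracy * N_VALID_EXAMPLES))  # 1215
```

Because only 1280 validation images are scored, each correctly classified image moves the accuracy by roughly 0.08 percentage points, which is worth keeping in mind when comparing trials that differ by small margins.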

In [4]:
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100, timeout=600)

pruned_trials = [t for t in study.trials if t.state == optuna.trial.TrialState.PRUNED]
complete_trials = [t for t in study.trials if t.state == optuna.trial.TrialState.COMPLETE]

print("Study statistics: ")
print("  Number of finished trials: ", len(study.trials))
print("  Number of pruned trials: ", len(pruned_trials))
print("  Number of complete trials: ", len(complete_trials))

print("Best trial:")
trial = study.best_trial

print("  Value: ", trial.value)

print("  Params: ")
for key, value in trial.params.items():
    print("    {}: {}".format(key, value))


[I 2020-12-29 12:58:09,341] A new study created in memory with name: no-name-febbc74c-4be2-479d-a052-63ab4e285ccc


Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to /content/MNIST/raw/train-images-idx3-ubyte.gz
Extracting /content/MNIST/raw/train-images-idx3-ubyte.gz to /content/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to /content/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting /content/MNIST/raw/train-labels-idx1-ubyte.gz to /content/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to /content/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting /content/MNIST/raw/t10k-images-idx3-ubyte.gz to /content/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to /content/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting /content/MNIST/raw/t10k-labels-idx1-ubyte.gz to /content/MNIST/raw
Processing...
Done!

[I 2020-12-29 12:58:18,934] Trial 0 finished with value: 0.90390625 and parameters: {'n_units_l0': 110, 'dropout_l0': 0.4836728228193914, 'n_units_l1': 20, 'dropout_l1': 0.41140888100563416, 'n_units_l2': 54, 'dropout_l2': 0.3223606449227492, 'optimizer': 'Adam', 'lr': 0.0015796953079710715}. Best is trial 0 with value: 0.90390625.
[I 2020-12-29 12:58:23,406] Trial 1 finished with value: 0.42109375 and parameters: {'n_units_l0': 18, 'dropout_l0': 0.27833853286383514, 'n_units_l1': 14, 'dropout_l1': 0.40999449958859474, 'n_units_l2': 5, 'dropout_l2': 0.2646475646866994, 'optimizer': 'RMSprop', 'lr': 0.00024158227667283385}. Best is trial 0 with value: 0.90390625.
[I 2020-12-29 12:58:28,303] Trial 2 finished with value: 0.30234375 and parameters: {'n_units_l0': 19, 'dropout_l0': 0.4959594057885399, 'n_units_l1': 59, 'dropout_l1': 0.2369056300546472, 'n_units_l2': 4, 'dropout_l2': 0.27327414114341636, 'optimizer': 'Adam', 'lr': 0.04808443449088103}. Best

Study statistics: 
  Number of finished trials:  100
  Number of pruned trials:  63
  Number of complete trials:  37
Best trial:
  Value:  0.94921875
  Params: 
    n_units_l0: 114
    dropout_l0: 0.3281262795464433
    n_units_l1: 67
    dropout_l1: 0.31844151838247703
    n_units_l2: 101
    dropout_l2: 0.40832913858419356
    optimizer: Adam
    lr: 0.004756481949923075
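
In this run 63 of the 100 trials were pruned. The study is created without an explicit pruner, in which case Optuna falls back to a median-based pruner. The sketch below is a deliberately simplified illustration of that idea for a maximised metric, not Optuna's implementation: the function name `should_prune_median` is hypothetical, the five-trial startup threshold is written in as an assumption, and the real pruner has further options and details that are omitted here.

```python
from statistics import median


def should_prune_median(current_value, earlier_values_at_step, n_startup_trials=5):
    """Simplified median rule (assumption: higher values are better).

    Prune when the current trial's intermediate value at a given step is
    below the median of the values earlier trials reported at that step.
    """
    # Too few earlier trials to compare against: never prune yet.
    if len(earlier_values_at_step) < n_startup_trials:
        return False
    return current_value < median(earlier_values_at_step)


# Made-up accuracies reported by five earlier trials at the same epoch.
earlier = [0.90, 0.42, 0.30, 0.75, 0.60]  # median is 0.60

print(should_prune_median(0.35, earlier))      # True: below the median
print(should_prune_median(0.80, earlier))      # False: above the median
print(should_prune_median(0.35, earlier[:3]))  # False: too few earlier trials
```

Under a rule like this, roughly half of the later trials can be expected to stop early, which is in line with the large share of pruned trials reported above.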


In [5]:
hyper_opt_data = []
for each_trial in study.trials:
    trial_params = each_trial.params.copy()
    trial_params["uid"] = each_trial.number
    trial_params["state"] = str(each_trial.state)
    trial_params["accuracy"] = each_trial.value
    hyper_opt_data.append(trial_params)
hip.Experiment.from_iterable(hyper_opt_data).display()

<IPython.core.display.Javascript object>

<hiplot.ipython.IPythonExperimentDisplayed at 0x7fdac08949e8>
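
The conversion loop above can be wrapped in a small reusable function. The following is a minimal sketch under stated assumptions: `trials_to_rows` is a hypothetical helper, it skips trials that have no value (for example, trials that failed), and it is exercised here with stand-in objects instead of a real Optuna study so that it runs without any third-party packages.

```python
from types import SimpleNamespace


def trials_to_rows(trials, metric_name="accuracy"):
    """Turn trial-like objects into the list of dicts that HiPlot consumes.

    Hypothetical helper: each trial needs `number`, `params`, `state`
    and `value` attributes, as Optuna's trial records have.
    """
    rows = []
    for t in trials:
        if t.value is None:
            # No objective value recorded, so there is nothing to plot.
            continue
        row = dict(t.params)
        row["uid"] = t.number
        # Enum members expose `.name`; plain strings fall back to str().
        row["state"] = getattr(t.state, "name", str(t.state))
        row[metric_name] = t.value
        rows.append(row)
    return rows


# Stand-in trials with made-up values, used only to exercise the helper.
fake_trials = [
    SimpleNamespace(number=0, params={"lr": 0.001, "optimizer": "Adam"},
                    state="COMPLETE", value=0.9),
    SimpleNamespace(number=1, params={"lr": 0.05, "optimizer": "SGD"},
                    state="FAIL", value=None),
]

rows = trials_to_rows(fake_trials)
print(rows)

# In the notebook, the same rows could be passed to HiPlot:
# hip.Experiment.from_iterable(trials_to_rows(study.trials)).display()
```

Using the enum's `name` keeps the `state` column short (`COMPLETE`, `PRUNED`) instead of the longer `TrialState.COMPLETE` form produced by `str()`.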