# Tutorial

In this notebook, I'll show you how to use Wandb to perform the following tasks in a typical ML workflow:

1. Data versioning
1. Experiment tracking
1. Hyperparameter tuning

In order to run this notebook, please follow the instruction in the `README.md` file to setup your working environment with Wandb installed and your Wandb account logged in.

In [1]:
PROJECT_NAME = "soict-2022"

## 1. Data versioning

> Those that fail to learn from history are doomed to repeat it. - Winston Churchill

In Wandb, an `Artifact` is the input or output of a process. A `Run` is a task that we want to perform.

In ML, the most important artifacts are _datasets_ and _models_. They should be organized so that you can learn from them.

In Wandb, we can log `Artifact` as ouputs of Wandb `Run`s or use `Artifact` as input to `Run`s, as in this diagram, where a training run takes in a dataset and produces a model.

![wandb-artifact-run](assets/wandb-artifact-run.png)

This example uses the MNIST database. The MNIST database is a large database of handwritten digits that is commonly used for training various image processing systems.

![mnist-examples](assets/mnist-exmaples.png)

We start with the `Dataset`s:

- A training set and a validation set, for model training
- A test set, for model evaluation

The cell below defines these three datasets.

In [2]:
import os
import random
import torch
import torchvision
from torch.utils.data import TensorDataset
from tqdm.notebook import tqdm
import wandb

# Ensure deterministic behavior
torch.backends.cudnn.deterministic = True
random.seed(1)
torch.manual_seed(1)
torch.cuda.manual_seed_all(1)

# Device configuration
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Data parameters
num_classes = 10
input_shape = (1, 28, 28)
N_TRAIN_VALID = 1000
N_TEST = 200

# drop slow mirror from list of MNIST mirrors
torchvision.datasets.MNIST.mirrors = [mirror for mirror in torchvision.datasets.MNIST.mirrors
                                        if not mirror.startswith("http://yann.lecun.com")]

def load(n_train_valid=N_TRAIN_VALID, n_test=N_TEST):
    # split between train and test sets
    train = torchvision.datasets.MNIST("./", train=True, download=True)
    test = torchvision.datasets.MNIST("./", train=False, download=True)
    (x_train, y_train), (x_test, y_test) = (train.data, train.targets), (test.data, test.targets)
    x_train = x_train[:n_train_valid]
    y_train = y_train[:n_train_valid]
    x_test = x_test[:n_test]
    y_test = y_test[:n_test]

    # split off a validation set for hyperparameter tuning
    train_size = int(n_train_valid * 0.75)
    x_train, x_val = x_train[:train_size], x_train[train_size:]
    y_train, y_val = y_train[:train_size], y_train[train_size:]

    training_set = TensorDataset(x_train, y_train)
    validation_set = TensorDataset(x_val, y_val)
    test_set = TensorDataset(x_test, y_test)
    datasets = [training_set, validation_set, test_set]
    return datasets

In order to log these datasets as Artifacts, we just need to:

1. Create a Run with `wandb.init`
1. Create an Artifact for the dataset
1. Save and log the associated files

In [3]:
def load_and_log(steps):
    # start a run, with a type to label it and a project name
    with wandb.init(project=PROJECT_NAME, job_type="load-data") as run:
        datasets = load()  # separate code for loading the datasets
        names = ["training", "validation", "test"]

        # create our Artifact
        raw_data = wandb.Artifact(
            "mnist-preprocess", type="dataset",
            description="Preprocessed MNIST dataset",
            metadata={"source": "torchvision.datasets.MNIST",
                        "sizes": [len(dataset) for dataset in datasets]})

        for name, data in zip(names, datasets):
            # Store a new file in the artifact, and write data
            with raw_data.new_file(name + ".pt", mode="wb") as file:
                processed_dataset = preprocess(data, **steps)
                x, y = processed_dataset.tensors
                torch.save((x, y), file)

        # Save the artifact to Wandb
        run.log_artifact(raw_data)

def preprocess(dataset, normalize=True, expand_dims=True):
    x, y = dataset.tensors
    if normalize:
        # Scale images to the [0, 1] range
        x = x.type(torch.float32) / 255
    if expand_dims:
        # Make sure images have shape (1, 28, 28)
        x = torch.unsqueeze(x, 1)
    return TensorDataset(x, y)

steps = {"normalize": True, "expand_dims": True}

load_and_log(steps)

Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
[34m[1mwandb[0m: Currently logged in as: [33mdemo[0m. Use [1m`wandb login --relogin`[0m to force relogin


### wandb.init

`Run` defines which Wandb project we want to run.

Artifacts that are logged will be kept inside a single Wandb project. This keeps things simple, but Artifacts are portable across projects!

To keep track of different types of jobs, it's useful to provide a `job_type` when making Runs. This keeps the graph of your Artifacts nice and tidy.

**Note**: the `job_type` should be descriptive and correspond to a single step of your pipeline. Here, we separate out loading data from preprocessing data.

<br>

### wandb.Artifact

To log an Artifact, we make an Artifact object with a name.

**Note**: the name should be descriptive, hyphen-separated, and correspond to variable names in the code.

An Artifact also has a type. Just like `job_type` for Runs, this is used for organizing the graph of Runs and Artifacts.

You can attach a description and some metadata as a dictionary. The metadata needs to be serializable to JSON.

<br>

### artifact.new_file and run.log_artifact

Once we've made an Artifact object, we need to add files to it.

Artifacts are structured like directories, with files and sub-directories.

We use the `new_file` method to simultaneously write the file and attach it to the Artifact. We also use the `add_file` method, which separates those two steps.

Once we've added all of our files, we call `log_artifact` to upload artifacts to wandb.ai.

## 2. Experiment tracking

This example show us how Artifacts can improve your ML workflow.

This cell below builds a simple Convolutional Neurnet Net model in PyTorch.

In [4]:
from math import floor
import torch.nn as nn

class ConvNet(nn.Module):
      def __init__(self, hidden_layer_sizes=[32, 64],
            kernel_sizes=[3],
            activation="ReLU",
            pool_sizes=[2],
            dropout=0.5,
            num_classes=num_classes,
            input_shape=input_shape):
            super(ConvNet, self).__init__()

            self.layer1 = nn.Sequential(
                  nn.Conv2d(in_channels=input_shape[0], out_channels=hidden_layer_sizes[0], kernel_size=kernel_sizes[0]),
                  getattr(nn, activation)(),
                  nn.MaxPool2d(kernel_size=pool_sizes[0])
            )
            self.layer2 = nn.Sequential(
                  nn.Conv2d(in_channels=hidden_layer_sizes[0], out_channels=hidden_layer_sizes[-1], kernel_size=kernel_sizes[-1]),
                  getattr(nn, activation)(),
                  nn.MaxPool2d(kernel_size=pool_sizes[-1])
            )
            self.layer3 = nn.Sequential(
                  nn.Flatten(),
                  nn.Dropout(dropout)
            )

            fc_input_dims = floor((input_shape[1] - kernel_sizes[0] + 1) / pool_sizes[0]) # layer 1 output size
            fc_input_dims = floor((fc_input_dims - kernel_sizes[-1] + 1) / pool_sizes[-1]) # layer 2 output size
            fc_input_dims = fc_input_dims*fc_input_dims*hidden_layer_sizes[-1] # layer 3 output size

            self.fc = nn.Linear(fc_input_dims, num_classes)

      def forward(self, x):
            x = self.layer1(x)
            x = self.layer2(x)
            x = self.layer3(x)
            x = self.fc(x)
            return x

Let's train the model.

In [5]:
import torch.nn.functional as F
from torch.utils.data import DataLoader

def train(model, train_loader, valid_loader, config):
    optimizer = getattr(torch.optim, config.optimizer)(model.parameters())
    model.train()
    for epoch in range(config.epochs):
        train_epoch(model, train_loader, valid_loader, config.batch_log_interval, optimizer, epoch)

def train_epoch(model, train_loader, valid_loader, batch_log_interval, optimizer, epoch):
    example_ct = epoch * len(train_loader)
    cumu_loss = 0
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.cross_entropy(output, target)
        cumu_loss += float(loss)
        loss.backward()
        optimizer.step()

        example_ct += len(data)
        if batch_idx % batch_log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0%})]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                batch_idx / len(train_loader), loss.item()))
            train_log(loss, example_ct, epoch)

    if not valid_loader is None:
        # evaluate the model on the validation set at each epoch
        loss, accuracy = test(model, valid_loader)
        test_log(loss, accuracy, example_ct, epoch)

    return cumu_loss / len(train_loader)

def test(model, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.cross_entropy(output, target, reduction='sum')  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum()

    test_loss /= len(test_loader.dataset)
    accuracy = 100. * correct / len(test_loader.dataset)
    return test_loss, accuracy

def train_log(loss, example_ct, epoch):
    loss = float(loss)
    # where the magic happens
    wandb.log({"epoch": epoch, "train/loss": loss}, step=example_ct)
    print(f"Loss after " + str(example_ct).zfill(5) + f" examples: {loss:.3f}")
    
def test_log(loss, accuracy, example_ct, epoch):
    loss = float(loss)
    accuracy = float(accuracy)
    # where the magic happens
    wandb.log({"epoch": epoch, "validation/loss": loss, "validation/accuracy": accuracy}, step=example_ct)
    print(f"Loss/accuracy after " + str(example_ct).zfill(5) + f" examples: {loss:.3f}/{accuracy:.3f}")

We'll run two separate Artifact-producing Runs this time.

Once the first finishes training the model, the second will consume the trained model Artifact by evaluating its performance on the `test_dataset`.

We also select the most confused 32 examples -- on which the `categorical_crossentropy` is highest.

In [6]:
def evaluate(model, test_loader):
    loss, accuracy = test(model, test_loader)
    highest_losses, hardest_examples, true_labels, predictions = get_hardest_k_examples(model, test_loader.dataset)
    return loss, accuracy, highest_losses, hardest_examples, true_labels, predictions

def get_hardest_k_examples(model, testing_set, k=32):
    model.eval()
    loader = DataLoader(testing_set, 1, shuffle=False)
    # get the losses and predictions for each item in the dataset
    losses = None
    predictions = None
    with torch.no_grad():
        for data, target in loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            loss = F.cross_entropy(output, target)
            pred = output.argmax(dim=1, keepdim=True)
            
            if losses is None:
                losses = loss.view((1, 1))
                predictions = pred
            else:
                losses = torch.cat((losses, loss.view((1, 1))), 0)
                predictions = torch.cat((predictions, pred), 0)

    argsort_loss = torch.argsort(losses, dim=0)
    highest_k_losses = losses[argsort_loss[-k:]]
    hardest_k_examples = testing_set[argsort_loss[-k:]][0]
    true_labels = testing_set[argsort_loss[-k:]][1]
    predicted_labels = predictions[argsort_loss[-k:]]
    return highest_k_losses, hardest_k_examples, true_labels, predicted_labels

def train_and_log(model_config, train_config):
    with wandb.init(project=PROJECT_NAME, job_type="train", config=train_config) as run:
        train_config = wandb.config
        data = run.use_artifact('mnist-preprocess:latest')
        data_dir = data.download()

        training_dataset = read(data_dir, "training")
        validation_dataset = read(data_dir, "validation")
        train_loader = DataLoader(training_dataset, batch_size=train_config.batch_size)
        validation_loader = DataLoader(validation_dataset, batch_size=train_config.batch_size)
        
        train_config.update(model_config)
        model = ConvNet(**model_config)
        model = model.to(device)
        train(model, train_loader, validation_loader, train_config)
        
        model_artifact = wandb.Artifact(
            "trained-model", type="model",
            description="Trained NN model",
            metadata=dict(model_config))

        with model_artifact.new_file("trained_model.pth", mode="wb") as file:
            torch.save(model.state_dict(), file)

        run.log_artifact(model_artifact)

    return model
    
def evaluate_and_log(config=None):
    with wandb.init(project=PROJECT_NAME, job_type="report", config=config) as run:
        data = run.use_artifact('mnist-preprocess:latest')
        data_dir = data.download()
        testing_set = read(data_dir, "test")
        test_loader = torch.utils.data.DataLoader(testing_set, batch_size=128, shuffle=False)

        model_artifact = run.use_artifact("trained-model:latest")
        model_dir = model_artifact.download()
        model_path = os.path.join(model_dir, "trained_model.pth")
        model_config = model_artifact.metadata

        model = ConvNet(**model_config)
        model.load_state_dict(torch.load(model_path))
        model.to(device)

        loss, accuracy, highest_losses, hardest_examples, true_labels, preds = evaluate(model, test_loader)
        run.summary.update({"loss": loss, "accuracy": accuracy})

        wandb.log({"high-loss-examples":
            [wandb.Image(hard_example, caption=str(int(pred)) + "," +  str(int(label)))
                for hard_example, pred, label in zip(hardest_examples, preds, true_labels)]})

def read(data_dir, ds_name):
    filename = ds_name + ".pt"
    x, y = torch.load(os.path.join(data_dir, filename))
    return TensorDataset(x, y)

model_config = {"hidden_layer_sizes": [32, 64],
                "kernel_sizes": [3],
                "activation": "ReLU",
                "pool_sizes": [2],
                "dropout": 0.5,
                "num_classes": 10}

train_config = {"batch_size": 128,
                "epochs": 3,
                "batch_log_interval": 2,
                "optimizer": "Adam"}

model = train_and_log(model_config, train_config)
evaluate_and_log()

[34m[1mwandb[0m:   3 of 3 files downloaded.  


Loss after 00128 examples: 2.333
Loss after 00384 examples: 2.262
Loss after 00640 examples: 2.207
Loss/accuracy after 00750 examples: 2.124/39.200
Loss after 00134 examples: 2.054
Loss after 00390 examples: 1.952
Loss after 00646 examples: 1.920
Loss/accuracy after 00756 examples: 1.728/76.000
Loss after 00140 examples: 1.615
Loss after 00396 examples: 1.441
Loss after 00652 examples: 1.445
Loss/accuracy after 00762 examples: 1.207/77.600


0,1
epoch,▁▁▁▁▅█
train/loss,█▄▁
validation/accuracy,▁██
validation/loss,█▅▁

0,1
epoch,2.0
train/loss,2.20724
validation/accuracy,77.6
validation/loss,1.2069


[34m[1mwandb[0m:   3 of 3 files downloaded.  
[34m[1mwandb[0m:   1 of 1 files downloaded.  


VBox(children=(Label(value='0.041 MB of 0.041 MB uploaded (0.000 MB deduped)\r'), FloatProgress(value=1.0, max…

0,1
accuracy,75.0
loss,1.1973


### run.use_artifact

To use an Artifact, we need to know its name and its version.

By default, the last uploaded version is tagged as `latest`. Versions are separated from names with `:`, so the Artifact we want is `mnist-preprocess:latest`.

<br>

### artifact.download

Before we actually download anything, we check to see if the right version is available locally by using *hashing*.

<br>

**Note**: the steps of the preprocessing are saved with the `preprocessed_data` as metadata.

If you're trying to make your experiments reproducible, capturing lots of metadata is a good idea!

## 3. Hyperparameter tuning

In Wandb, *Hyperparameter Sweeps* provide an organized and efficient way to search through high dimensional hyperparameter spaces to find the most performant model.

They enable this by automatically searching through combinations of hyperparameter values (e.g. learning rate, batch size, number of hidden layers, optimizer type) to find the most optimal values.

![wandb-sweep-overview](assets/wandb-sweep-overview.png)

To run a hyperparameter sweep with Wandb, there are 3 simple steps:

1.  Define the sweep configuration

    We do this by creating a dictionary that specifies the search strategy, optimization metric, and parameters to search through.

1.  Initialize the sweep
    
    We initialize the sweep and pass in the dictionary of sweep configurations

    ```bash
    sweep_id = wandb.sweep(sweep_config)
    ```

1.  Run the sweep agent
    
    We call `wandb.agent()` and pass the `sweep_id` to run, along with a function that defines your training steps:

    ```bash
    wandb.agent(sweep_id, function=train)
    ```

In this section, we'll see how you can run sophisticated hyperparameter sweeps using Wandb.


### 3.1. Define Sweep config

A Sweep combines a strategy for trying out a bunch of hyperparameter values with the code that evalutes them.

#### Pick a method
The first thing we need to define is the method for choosing new parameter values. It can be:

- `grid` Search – Iterate over every combination of hyperparameter values. Very effective, but can be computationally costly.
- `random` Search – Select each new combination at random according to provided distributions. Surprisingly effective!
- `bayesian` Search – Create a probabilistic model of metric score as a function of the hyperparameters, and choose parameters with high probability of improving the metric. Works well for small numbers of continuous parameters but scales poorly.

We select `random` Search for this notebook.

In [7]:
sweep_config = {
    'method': 'random',
}

Once you've picked a method to try out new values of the hyperparameters, you need to define what those parameters are.

This step is straightforward: just give the parameter a name and specify a list of legal values of the parameter.

In [8]:
parameters = {
    # epochs var doesn't vary, but we still want it here
    'epochs': {
        'value': 1,
    },
    'optimizer': {
        'values': ['adam', 'sgd'],
    },
    'hidden_layer_1_size': {
        'values': [16, 32],
    },
    'hidden_layer_2_size': {
        'values': [32, 64],
    },
    'dropout': {
        'values': [0.4, 0.5],
    },
    'learning_rate': {
        # a flat distribution between 0 and 0.1
        'distribution': 'uniform',
        'min': 0,
        'max': 0.1
    },
    'batch_size': {
        # integers between 32 and 256
        # with evenly-distributed logarithms 
        'distribution': 'q_log_uniform_values',
        'q': 8,
        'min': 32,
        'max': 256,
    }
}
sweep_config['parameters'] = parameters
sweep_config

{'method': 'random',
 'parameters': {'epochs': {'value': 1},
  'optimizer': {'values': ['adam', 'sgd']},
  'hidden_layer_1_size': {'values': [16, 32]},
  'hidden_layer_2_size': {'values': [32, 64]},
  'dropout': {'values': [0.4, 0.5]},
  'learning_rate': {'distribution': 'uniform', 'min': 0, 'max': 0.1},
  'batch_size': {'distribution': 'q_log_uniform_values',
   'q': 8,
   'min': 32,
   'max': 256}}}

Wandb also offers the option to `early_terminate` your runs with the `HyperBand` scheduling algorithm. See more [here](https://docs.wandb.ai/guides/sweeps/define-sweep-configuration#early_terminate).

### 3.2. Run the Sweep

The Sweep Controller is in charge of our Sweep. The Sweep Controller instructs how to run each set of hyperparameters via Agents.

In a typical Sweep, the Controller lives on Wandb's server, while the agents who complete runs live on your machine(s). This makes it easy to scale up Sweeps by just adding more machines to run agents!

![sweeps-diagram](assets/sweeps-diagram.png)

We can initialize a Sweep Controller by calling `wandb.sweep` with `sweep_config` and project name.

In [9]:
sweep_id = wandb.sweep(sweep_config, project=PROJECT_NAME)

Create sweep with ID: f2d0sj4o
Sweep URL: http://localhost:8080/demo/soict-2022/sweeps/f2d0sj4o


Before we can actually execute the sweep, we need to define the training procedure used in the Sweep.

In [10]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
batch_log_interval = 2

def train_sweep(config=None):
    with wandb.init(config=config) as run:
        # If called by wandb.agent, as below,
        # this config will be set by Sweep Controller
        config = wandb.config

        loader = build_dataset(run, config)
        model = build_model(run, config)
        optimizer = build_optimizer(model, config)

        for epoch in range(config.epochs):
            avg_loss = train_epoch(model, loader, None, batch_log_interval, optimizer, epoch)
            wandb.log({"loss": avg_loss, "epoch": epoch})

The cell below defines: `build_dataset`, `build_model`, and `build_optimizer`.

In [11]:
def build_dataset(run, config):
    batch_size = config.batch_size
    data = run.use_artifact('mnist-preprocess:latest')
    data_dir = data.download()
    training_dataset = read(data_dir, "training")
    sub_dataset = torch.utils.data.Subset(
        training_dataset, indices=range(0, len(training_dataset), 5))
    train_loader = DataLoader(sub_dataset, batch_size=batch_size)
    return train_loader

def build_model(run, config):
    model_config = {
        'hidden_layer_sizes': [
            config.hidden_layer_1_size,
            config.hidden_layer_2_size,
        ],
        'dropout': config.dropout,
    }
    model = ConvNet(**model_config)
    model = model.to(device)
    return model
        
def build_optimizer(model, config):
    optimizer = config.optimizer
    learning_rate = config.learning_rate
    if optimizer == "sgd":
        optimizer = torch.optim.SGD(model.parameters(),
                                lr=learning_rate, momentum=0.9)
    elif optimizer == "adam":
        optimizer = torch.optim.Adam(model.parameters(),
                                lr=learning_rate)
    return optimizer

The cell below will launch an agent that runs train 5 times.

In [12]:
wandb.agent(sweep_id, train_sweep, count=5)

[34m[1mwandb[0m: Agent Starting Run: wh3o1fbb with config:
[34m[1mwandb[0m: 	batch_size: 128
[34m[1mwandb[0m: 	dropout: 0.5
[34m[1mwandb[0m: 	epochs: 1
[34m[1mwandb[0m: 	hidden_layer_1_size: 16
[34m[1mwandb[0m: 	hidden_layer_2_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.09373099134024647
[34m[1mwandb[0m: 	optimizer: adam
Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.


[34m[1mwandb[0m:   3 of 3 files downloaded.  


Loss after 00128 examples: 2.310


VBox(children=(Label(value='0.026 MB of 0.026 MB uploaded (0.000 MB deduped)\r'), FloatProgress(value=1.0, max…

0,1
epoch,▁
loss,▁
train/loss,▁

0,1
epoch,0.0
loss,33.38466
train/loss,2.30975


[34m[1mwandb[0m: Agent Starting Run: grqs46wh with config:
[34m[1mwandb[0m: 	batch_size: 48
[34m[1mwandb[0m: 	dropout: 0.4
[34m[1mwandb[0m: 	epochs: 1
[34m[1mwandb[0m: 	hidden_layer_1_size: 16
[34m[1mwandb[0m: 	hidden_layer_2_size: 32
[34m[1mwandb[0m: 	learning_rate: 0.02563970103436769
[34m[1mwandb[0m: 	optimizer: sgd
Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.


[34m[1mwandb[0m:   3 of 3 files downloaded.  


Loss after 00048 examples: 2.307
Loss after 00144 examples: 2.287


VBox(children=(Label(value='0.026 MB of 0.026 MB uploaded (0.000 MB deduped)\r'), FloatProgress(value=1.0, max…

0,1
epoch,▁▁
loss,▁
train/loss,█▁

0,1
epoch,0.0
loss,2.30805
train/loss,2.28716


[34m[1mwandb[0m: Agent Starting Run: 9v8aq57l with config:
[34m[1mwandb[0m: 	batch_size: 208
[34m[1mwandb[0m: 	dropout: 0.4
[34m[1mwandb[0m: 	epochs: 1
[34m[1mwandb[0m: 	hidden_layer_1_size: 16
[34m[1mwandb[0m: 	hidden_layer_2_size: 32
[34m[1mwandb[0m: 	learning_rate: 0.023653073963399208
[34m[1mwandb[0m: 	optimizer: adam
Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.


[34m[1mwandb[0m:   3 of 3 files downloaded.  


Loss after 00150 examples: 2.307


VBox(children=(Label(value='0.026 MB of 0.026 MB uploaded (0.000 MB deduped)\r'), FloatProgress(value=1.0, max…

0,1
epoch,▁
loss,▁
train/loss,▁

0,1
epoch,0.0
loss,2.30698
train/loss,2.30698


[34m[1mwandb[0m: Agent Starting Run: 9kxfkpli with config:
[34m[1mwandb[0m: 	batch_size: 40
[34m[1mwandb[0m: 	dropout: 0.5
[34m[1mwandb[0m: 	epochs: 1
[34m[1mwandb[0m: 	hidden_layer_1_size: 32
[34m[1mwandb[0m: 	hidden_layer_2_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.039848386304843275
[34m[1mwandb[0m: 	optimizer: sgd
Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.


[34m[1mwandb[0m:   3 of 3 files downloaded.  


Loss after 00040 examples: 2.307
Loss after 00120 examples: 2.252


VBox(children=(Label(value='0.026 MB of 0.026 MB uploaded (0.000 MB deduped)\r'), FloatProgress(value=1.0, max…

0,1
epoch,▁▁
loss,▁
train/loss,█▁

0,1
epoch,0.0
loss,2.29722
train/loss,2.25237


[34m[1mwandb[0m: Agent Starting Run: 9v4z3xwn with config:
[34m[1mwandb[0m: 	batch_size: 32
[34m[1mwandb[0m: 	dropout: 0.4
[34m[1mwandb[0m: 	epochs: 1
[34m[1mwandb[0m: 	hidden_layer_1_size: 16
[34m[1mwandb[0m: 	hidden_layer_2_size: 64
[34m[1mwandb[0m: 	learning_rate: 0.019880191611981027
[34m[1mwandb[0m: 	optimizer: sgd
Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.


[34m[1mwandb[0m:   3 of 3 files downloaded.  


Loss after 00032 examples: 2.306
Loss after 00096 examples: 2.287
Loss after 00150 examples: 2.258


VBox(children=(Label(value='0.026 MB of 0.026 MB uploaded (0.000 MB deduped)\r'), FloatProgress(value=1.0, max…

0,1
epoch,▁▁▁
loss,▁
train/loss,█▅▁

0,1
epoch,0.0
loss,2.2897
train/loss,2.25834


### 3.3. Visualize Sweep Results

#### Parallel Coordinates Plot

This plot maps hyperparameter values to model metrics.

![hyperparam-plot](assets/hyperparam-plot.png)

#### Hyperparameter Importance Plot

The hyperparameter importance plot surfaces which hyperparameters were the best predictors of your metrics. Wandb reports feature importance (using a random forest model) and correlation (using a linear model).

![hyperparam-importance](assets/hyperparam-importance.png)

These visualizations can help you save both time and resources running expensive hyperparameter optimizations by refining the parameters (and value ranges), and thereby worthy of further exploration.

### 3.4. Stop the Sweep

In [13]:
# For self-hosted Wandb server
!wandb sweep --stop "demo/$PROJECT_NAME/$sweep_id"

# For Wandb cloud server
# Stop sweep at https://<wandb-server-address>/<wandb-user>/soict-2022/sweeps/<sweep-id>/controls

[34m[1mwandb[0m: Stopping sweep demo/soict-2022/f2d0sj4o.
[34m[1mwandb[0m: Done.
