## Hyperparameter Sweeps in PyTorch and Wandb AI

First, fetch the MNIST dataset.

In [1]:
!wget www.di.ens.fr/~lelarge/MNIST.tar.gz --no-check-certificate
!tar -zxvf MNIST.tar.gz

--2022-06-09 20:07:10--  http://www.di.ens.fr/~lelarge/MNIST.tar.gz
Resolving www.di.ens.fr... 129.199.99.14
Connecting to www.di.ens.fr|129.199.99.14|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://www.di.ens.fr/~lelarge/MNIST.tar.gz [following]
--2022-06-09 20:07:11--  https://www.di.ens.fr/~lelarge/MNIST.tar.gz
Connecting to www.di.ens.fr|129.199.99.14|:443... connected.
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-gzip]
Saving to: 'MNIST.tar.gz'


In [2]:
import wandb

wandb.login()

Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
wandb: Currently logged in as: raghvender. Use `wandb login --relogin` to force relogin


True

### 1. Pick a `Sweep` Method

In [3]:
sweep_config = {
    'method': 'random'
}

In [4]:
# Choose the Metric
metric = {
    'name': 'loss',
    'goal': 'minimize'   
    }

sweep_config['metric'] = metric

Name the hyperparameters to search over and list the candidate values for each:

In [5]:
parameters_dict = {
    'optimizer': {
        'values': ['adam', 'sgd']
        },
    'fc_layer_size': {
        'values': [128, 256, 512]
        },
    'dropout': {
          'values': [0.3, 0.4, 0.5]
        },
    }

sweep_config['parameters'] = parameters_dict
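To make the size of this search space concrete, the sketch below uses only the standard library (no Wandb calls). It counts the combinations a `grid` search would enumerate over these three parameters and mimics how a `random` search draws one configuration. The helper name `sample_random_config` is hypothetical, and the sampling illustrates the idea rather than Wandb's actual implementation.

```python
import itertools
import random

# Same discrete search space as parameters_dict above.
search_space = {
    'optimizer': ['adam', 'sgd'],
    'fc_layer_size': [128, 256, 512],
    'dropout': [0.3, 0.4, 0.5],
}

# A grid search would visit every combination: 2 * 3 * 3 = 18.
grid = list(itertools.product(*search_space.values()))
print(len(grid))


def sample_random_config(space, rng):
    """Hypothetical helper: pick one value per parameter, uniformly at random."""
    return {name: rng.choice(values) for name, values in space.items()}


rng = random.Random(0)  # fixed seed so the draw is repeatable
print(sample_random_config(search_space, rng))
```

With `count=5` runs, a random search covers only a fraction of these 18 combinations, which is the trade-off against an exhaustive grid.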

In [6]:
# Set a hyperparameter that does not vary but should still be recorded in sweep_config
parameters_dict.update({
    'epochs': {
        'value': 1
    }
})

For a `random` search, all the `values` of a parameter are equally likely to be chosen. Alternatively, we can specify a named `distribution` together with its parameters, for example the mean `mu` and standard deviation `sigma` of a `normal` distribution.
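As a sketch of that syntax, a normally distributed parameter could be declared as shown below. The parameter name `init_scale` and the numbers are placeholders chosen for illustration; they are not part of this tutorial's sweep.

```python
# Hypothetical parameter entry, shown only to illustrate the 'normal' distribution keys.
normal_param = {
    'init_scale': {
        'distribution': 'normal',
        'mu': 1.0,     # mean (placeholder value)
        'sigma': 0.1,  # standard deviation (placeholder value)
    }
}

# It would be merged into the sweep the same way as the other entries,
# e.g. parameters_dict.update(normal_param).
print(normal_param)
```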

In [7]:
parameters_dict.update({
    'learning_rate': {
        # a flat distribution between 0 and 0.1
        'distribution': 'uniform',
        'min': 0,
        'max': 0.1
      },
    'batch_size': {
        # integers between 32 and 256
        # with evenly-distributed logarithms 
        'distribution': 'q_log_uniform_values',
        'q': 8,
        'min': 32,
        'max': 256,
      }
    })
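The `q_log_uniform_values` distribution can be pictured as follows: draw a value whose logarithm is uniform between `log(min)` and `log(max)`, then round it to the nearest multiple of `q`. The sketch below reproduces that idea with the standard library. `sample_q_log_uniform` is a hypothetical helper, and the rounding rule is an assumption based on the description in the Wandb documentation, not Wandb's own code.

```python
import math
import random


def sample_q_log_uniform(low, high, q, rng):
    """Hypothetical helper: log-uniform draw in [low, high], rounded to a multiple of q."""
    x = math.exp(rng.uniform(math.log(low), math.log(high)))
    return int(round(x / q) * q)


rng = random.Random(0)
samples = [sample_q_log_uniform(32, 256, 8, rng) for _ in range(1000)]

# Every sample is a multiple of 8 inside [32, 256].
print(min(samples), max(samples))
```

Because the draw is uniform in log space, batch sizes between 32 and 64 are sampled about as often as those between 128 and 256, so small batch sizes are not under-represented.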

In [8]:
import pprint

pprint.pprint(sweep_config)

{'method': 'random',
 'metric': {'goal': 'minimize', 'name': 'loss'},
 'parameters': {'batch_size': {'distribution': 'q_log_uniform_values',
                               'max': 256,
                               'min': 32,
                               'q': 8},
                'dropout': {'values': [0.3, 0.4, 0.5]},
                'epochs': {'value': 1},
                'fc_layer_size': {'values': [128, 256, 512]},
                'learning_rate': {'distribution': 'uniform',
                                  'max': 0.1,
                                  'min': 0},
                'optimizer': {'values': ['adam', 'sgd']}}}


### 2. Initialize the Sweep

In [10]:
sweep_id = wandb.sweep(sweep_config, project='PyTorchMNIST_Sweep-WandbExamples')

Create sweep with ID: ioxyupew
Sweep URL: https://wandb.ai/raghvender/PyTorchMNIST_Sweep-WandbExamples/sweeps/ioxyupew


### 3. Run the Sweep Agent

1. `wandb.init()`: Start a new Wandb run.
2. `wandb.config`: The config object that stores all hyperparameters so they are logged with the run.
3. `wandb.log()`: Log model behaviour (here, the loss) to Wandb.

In [11]:
import torch
import torch.optim as optim
import torch.nn.functional as F
import torch.nn as nn
from torchvision import datasets, transforms

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

def train(config=None):
    with wandb.init(config=config):
        config = wandb.config

        loader = build_dataset(config.batch_size)
        network = build_network(config.fc_layer_size, config.dropout)
        optimizer = build_optimizer(network, config.optimizer, config.learning_rate)

        for epoch in range(config.epochs):
            avg_loss = train_epoch(network, loader, optimizer)
            wandb.log({'loss': avg_loss, 'epoch': epoch})

In [12]:
# Helper Functions
def build_dataset(batch_size):
    transform = transforms.Compose(
        [transforms.ToTensor(),
         transforms.Normalize((0.1307,), (0.3081,))])
    # download MNIST training dataset
    dataset = datasets.MNIST(".", train=True, download=True,
                             transform=transform)
    sub_dataset = torch.utils.data.Subset(
        dataset, indices=range(0, len(dataset), 5))
    loader = torch.utils.data.DataLoader(sub_dataset, batch_size=batch_size)

    return loader


def build_network(fc_layer_size, dropout):
    network = nn.Sequential(  # fully-connected, single hidden layer
        nn.Flatten(),
        nn.Linear(784, fc_layer_size), nn.ReLU(),
        nn.Dropout(dropout),
        nn.Linear(fc_layer_size, 10),
        nn.LogSoftmax(dim=1))

    return network.to(device)
        

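def build_optimizer(network, optimizer, learning_rate):
    if optimizer == "sgd":
        optimizer = optim.SGD(network.parameters(),
                              lr=learning_rate, momentum=0.9)
    elif optimizer == "adam":
        optimizer = optim.Adam(network.parameters(),
                               lr=learning_rate)
    else:
        # Fail loudly instead of returning the unrecognised name as a string
        raise ValueError(f"Unsupported optimizer: {optimizer!r}")
    return optimizer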


def train_epoch(network, loader, optimizer):
    cumu_loss = 0
    for data, target in loader:
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()

        # ➡ Forward pass
        loss = F.nll_loss(network(data), target)
        cumu_loss += loss.item()

        # ⬅ Backward pass + weight update
        loss.backward()
        optimizer.step()

        wandb.log({"batch loss": loss.item()})

    return cumu_loss / len(loader)
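Since `build_dataset` keeps every fifth training image, each epoch sees 12,000 of MNIST's 60,000 training samples, and the number of `batch loss` points logged per run depends on the sampled batch size. The arithmetic can be sketched as below. The helper name `batches_per_epoch` is hypothetical, and the calculation assumes the `DataLoader` default of keeping the last, smaller batch.

```python
import math

MNIST_TRAIN_SIZE = 60_000                          # size of the MNIST training split
subset_size = len(range(0, MNIST_TRAIN_SIZE, 5))   # every 5th sample -> 12,000


def batches_per_epoch(n_samples, batch_size):
    """Hypothetical helper: batch count when the last partial batch is kept."""
    return math.ceil(n_samples / batch_size)


for batch_size in (32, 40, 184, 256):
    print(batch_size, batches_per_epoch(subset_size, batch_size))
```

This is also the divisor `len(loader)` that `train_epoch` uses to turn the cumulative loss into the per-epoch average.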

In [13]:
# Run
wandb.agent(sweep_id, train, count=5)

wandb: Agent Starting Run: s25c3cat with config:
wandb: 	batch_size: 184
wandb: 	dropout: 0.5
wandb: 	epochs: 1
wandb: 	fc_layer_size: 256
wandb: 	learning_rate: 0.09130457309068588
wandb: 	optimizer: adam

Run history:
batch loss,▁▆█▃▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
epoch,▁
loss,▁

Run summary:
batch loss,1.79246
epoch,0.0
loss,7.77743


wandb: Agent Starting Run: icmcarrl with config:
wandb: 	batch_size: 248
wandb: 	dropout: 0.4
wandb: 	epochs: 1
wandb: 	fc_layer_size: 256
wandb: 	learning_rate: 0.02793892849268429
wandb: 	optimizer: adam

Run history:
batch loss,▂▇▇█▆▄▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
epoch,▁
loss,▁

Run summary:
batch loss,0.4941
epoch,0.0
loss,2.0077


wandb: Agent Starting Run: 39nspnem with config:
wandb: 	batch_size: 176
wandb: 	dropout: 0.3
wandb: 	epochs: 1
wandb: 	fc_layer_size: 512
wandb: 	learning_rate: 0.01055177385748306
wandb: 	optimizer: adam

Run history:
batch loss,▄▆█▄▃▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂
epoch,▁
loss,▁

Run summary:
batch loss,0.69347
epoch,0.0
loss,0.71802


wandb: Sweep Agent: Waiting for job.
wandb: Job received.
wandb: Agent Starting Run: nauf2asd with config:
wandb: 	batch_size: 40
wandb: 	dropout: 0.5
wandb: 	epochs: 1
wandb: 	fc_layer_size: 256
wandb: 	learning_rate: 0.028843491019654135
wandb: 	optimizer: sgd

Run history:
batch loss,█▆▄▅▄▃▂▂▃▄▃▃▃▂▂▃▄▄▂▃▂▂▁▂▂▃▂▂▂▁▂▃▁▂▂▁▂▂▁▃
epoch,▁
loss,▁

Run summary:
batch loss,0.69865
epoch,0.0
loss,0.61682


wandb: Agent Starting Run: p46i3gsf with config:
wandb: 	batch_size: 192
wandb: 	dropout: 0.3
wandb: 	epochs: 1
wandb: 	fc_layer_size: 128
wandb: 	learning_rate: 0.06753263373891516
wandb: 	optimizer: adam

Run history:
batch loss,▁▇█▅▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
epoch,▁
loss,▁

Run summary:
batch loss,1.36399
epoch,0.0
loss,3.98769
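
With the metric goal set to `minimize`, the best of these five runs is the one with the lowest final `loss`. The sketch below copies the run IDs and `loss` values from the output above into a plain dictionary and picks the minimum. The sweep page at the sweep URL offers the same comparison interactively.

```python
# Final epoch-average loss per run, copied from the agent output above.
run_losses = {
    's25c3cat': 7.77743,
    'icmcarrl': 2.0077,
    '39nspnem': 0.71802,
    'nauf2asd': 0.61682,
    'p46i3gsf': 3.98769,
}

# 'minimize' goal: the best run has the smallest loss.
best_run = min(run_losses, key=run_losses.get)
print(best_run, run_losses[best_run])
```

Here the lowest loss belongs to `nauf2asd`, the only run that sampled the `sgd` optimizer (batch size 40, learning rate about 0.029). With one epoch and five runs, this is a single data point rather than evidence that SGD is better in general.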
