<a href="https://colab.research.google.com/github/akanksha0911/DeepLearning_Week4/blob/main/DeepLearning_week4_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **B). Hyperparameter experiments with Weights & Biases: various optimizers, layer depth and width, learning rate, etc., in both Keras and PyTorch**



Reference: https://colab.research.google.com/github/wandb/examples/blob/master/colabs/pytorch/Organizing_Hyperparameter_Sweeps_in_PyTorch_with_W%26B.ipynb#scrollTo=PMiVjwp9vftb

Sweeps: An Overview
There are 3 simple steps:

Define the sweep: create a dictionary or a YAML file that specifies the parameters to search through, the search strategy, the optimization metric, etc.

Initialize the sweep: with one line of code, initialize the sweep and pass in the dictionary of sweep configurations: sweep_id = wandb.sweep(sweep_config)

Run the sweep agent: also with one line of code, call wandb.agent() and pass the sweep_id to run, along with a function that defines your model architecture and trains it: wandb.agent(sweep_id, function=train)

Start out by installing the experiment tracking library and setting up your free W&B account:

Install with !pip install
Import the library into Python
Call .login() so you can log metrics to your projects

In [1]:
%%capture
!pip install wandb --upgrade

# workaround to fetch MNIST data
!wget www.di.ens.fr/~lelarge/MNIST.tar.gz
!tar -zxvf MNIST.tar.gz

In [2]:
import wandb

wandb.login()

wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc


True

# Step 1️⃣. Define the Sweep

We just need to define the search strategy in the form of a configuration.

When you're setting up a Sweep in a notebook like this, that config object is a nested dictionary. When you run a Sweep via the command line, the config object is a YAML file.

In [3]:
sweep_config = {
    'method': 'random'
    }

In [4]:
metric = {
    'name': 'loss',
    'goal': 'minimize'   
    }

sweep_config['metric'] = metric

In [5]:
parameters_dict = {
    'optimizer': {
        'values': ['adam', 'sgd']
        },
    'fc_layer_size': {
        'values': [128, 256, 512]
        },
    'dropout': {
          'values': [0.3, 0.4, 0.5]
        },
    }

sweep_config['parameters'] = parameters_dict

In [6]:
parameters_dict.update({
    'epochs': {
        'value': 1}
    })

For a grid search, that's all you ever need.
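As a rough illustration of what a grid search over the discrete parameters defined so far would cover, the sketch below enumerates every combination with itertools.product. The `grid_parameters` dictionary simply repeats the values from the cells above, and `enumerate_grid` is a hypothetical helper, not part of wandb.

```python
import itertools

# Discrete choices copied from the sweep configuration above.
grid_parameters = {
    "optimizer": ["adam", "sgd"],
    "fc_layer_size": [128, 256, 512],
    "dropout": [0.3, 0.4, 0.5],
    "epochs": [1],  # a fixed 'value' behaves like a single-element list
}


def enumerate_grid(parameters):
    """Return one dict per combination of the listed values."""
    names = list(parameters)
    combos = itertools.product(*(parameters[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


grid = enumerate_grid(grid_parameters)
print(len(grid))  # 2 * 3 * 3 * 1 combinations
print(grid[0])
```

The number of combinations multiplies with every added parameter, which is one reason a random search is often preferred once continuous parameters such as the learning rate are involved.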

For a random search, all the values of a parameter are equally likely to be chosen on a given run.

If that just won't do, you can instead specify a named distribution, plus its parameters, like the mean mu and standard deviation sigma of a normal distribution.

In [7]:
import math

parameters_dict.update({
    'learning_rate': {
        # a flat distribution between 0 and 0.1
        'distribution': 'uniform',
        'min': 0,
        'max': 0.1
      },
    'batch_size': {
        # integers between 32 and 256
        # with evenly-distributed logarithms 
        'distribution': 'q_log_uniform',
        'q': 1,
        'min': math.log(32),
        'max': math.log(256),
      }
    })

In [8]:
import pprint

pprint.pprint(sweep_config)

{'method': 'random',
 'metric': {'goal': 'minimize', 'name': 'loss'},
 'parameters': {'batch_size': {'distribution': 'q_log_uniform',
                               'max': 5.545177444479562,
                               'min': 3.4657359027997265,
                               'q': 1},
                'dropout': {'values': [0.3, 0.4, 0.5]},
                'epochs': {'value': 1},
                'fc_layer_size': {'values': [128, 256, 512]},
                'learning_rate': {'distribution': 'uniform',
                                  'max': 0.1,
                                  'min': 0},
                'optimizer': {'values': ['adam', 'sgd']}}}
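
To make the distributions concrete, here is a sketch of how one configuration could be drawn from a dictionary shaped like sweep_config['parameters']. It is not W&B's implementation: `sample_parameter` and `sample_config` are hypothetical helpers, and the q_log_uniform branch assumes the interpretation used in the cell above, namely that min and max are natural logarithms of the bounds and the result is rounded to a multiple of q.

```python
import math
import random

# Same structure as sweep_config['parameters'] above.
parameters = {
    "optimizer": {"values": ["adam", "sgd"]},
    "fc_layer_size": {"values": [128, 256, 512]},
    "dropout": {"values": [0.3, 0.4, 0.5]},
    "epochs": {"value": 1},
    "learning_rate": {"distribution": "uniform", "min": 0, "max": 0.1},
    "batch_size": {
        "distribution": "q_log_uniform",
        "q": 1,
        "min": math.log(32),
        "max": math.log(256),
    },
}


def sample_parameter(spec, rng):
    """Draw a single value for one parameter specification."""
    if "value" in spec:    # constant
        return spec["value"]
    if "values" in spec:   # every listed value is equally likely
        return rng.choice(spec["values"])
    if spec["distribution"] == "uniform":
        return rng.uniform(spec["min"], spec["max"])
    if spec["distribution"] == "q_log_uniform":
        # Uniform in log space, then exponentiate and quantize to a multiple of q.
        x = rng.uniform(spec["min"], spec["max"])
        q = spec["q"]
        return int(round(math.exp(x) / q) * q)
    raise ValueError(f"unsupported specification: {spec}")


def sample_config(params, seed=None):
    """Draw one full configuration; seed makes the draw repeatable."""
    rng = random.Random(seed)
    return {name: sample_parameter(spec, rng) for name, spec in params.items()}


for seed in range(3):
    print(sample_config(parameters, seed=seed))
```

Sampling the batch size in log space means small batch sizes are drawn about as often as large ones, rather than most draws landing in the upper part of the 32 to 256 range.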


# Step 2️⃣. Initialize the Sweep

In [9]:
sweep_id = wandb.sweep(sweep_config, project="pytorch-sweeps-demo")

Create sweep with ID: e6fdfrn5
Sweep URL: https://wandb.ai/akanksharawat9/pytorch-sweeps-demo/sweeps/e6fdfrn5


# Step 3️⃣. Run the Sweep agent

We define a simple fully-connected neural network in PyTorch and add the following wandb tools to log model metrics, visualize performance and output, and track our experiments:

wandb.init() – Initialize a new W&B Run. Each Run is a single execution of the training function.
wandb.config – Save all your hyperparameters in a configuration object so they can be logged.
wandb.log() – Log model behavior to W&B. Here, we just log the performance; wandb.log can also record other rich media.

In [10]:
import torch
import torch.optim as optim
import torch.nn.functional as F
import torch.nn as nn
from torchvision import datasets, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def train(config=None):
    # Initialize a new wandb run
    with wandb.init(config=config):
        # If called by wandb.agent, as below,
        # this config will be set by Sweep Controller
        config = wandb.config

        loader = build_dataset(config.batch_size)
        network = build_network(config.fc_layer_size, config.dropout)
        optimizer = build_optimizer(network, config.optimizer, config.learning_rate)

        for epoch in range(config.epochs):
            avg_loss = train_epoch(network, loader, optimizer)
            wandb.log({"loss": avg_loss, "epoch": epoch})   

In [11]:
def build_dataset(batch_size):
   
    transform = transforms.Compose(
        [transforms.ToTensor(),
         transforms.Normalize((0.1307,), (0.3081,))])
    # download MNIST training dataset
    dataset = datasets.MNIST(".", train=True, download=True,
                             transform=transform)
    sub_dataset = torch.utils.data.Subset(
        dataset, indices=range(0, len(dataset), 5))
    loader = torch.utils.data.DataLoader(sub_dataset, batch_size=batch_size)

    return loader


def build_network(fc_layer_size, dropout):
    network = nn.Sequential(  # fully-connected, single hidden layer
        nn.Flatten(),
        nn.Linear(784, fc_layer_size), nn.ReLU(),
        nn.Dropout(dropout),
        nn.Linear(fc_layer_size, 10),
        nn.LogSoftmax(dim=1))

    return network.to(device)
        

def build_optimizer(network, optimizer, learning_rate):
    if optimizer == "sgd":
        optimizer = optim.SGD(network.parameters(),
                              lr=learning_rate, momentum=0.9)
    elif optimizer == "adam":
        optimizer = optim.Adam(network.parameters(),
                               lr=learning_rate)
    return optimizer


def train_epoch(network, loader, optimizer):
    cumu_loss = 0
    for _, (data, target) in enumerate(loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()

        # ➡ Forward pass
        loss = F.nll_loss(network(data), target)
        cumu_loss += loss.item()

        # ⬅ Backward pass + weight update
        loss.backward()
        optimizer.step()

        wandb.log({"batch loss": loss.item()})

    return cumu_loss / len(loader)

The function wandb.agent needs to know:

which Sweep it's a part of (sweep_id)
which function it's supposed to run (here, train)
(optionally) how many configs to ask the Controller for (count)

The cell below launches an agent that runs train 5 times, using the randomly generated hyperparameter values returned by the Sweep Controller. Execution takes under 5 minutes.

In [12]:
wandb.agent(sweep_id, train, count=5)

wandb: Agent Starting Run: 10ht7lzf with config:
wandb: 	batch_size: 155
wandb: 	dropout: 0.5
wandb: 	epochs: 1
wandb: 	fc_layer_size: 256
wandb: 	learning_rate: 0.05673733816932706
wandb: 	optimizer: adam
wandb: Currently logged in as: akanksharawat9 (use `wandb login --relogin` to force relogin)

Run history:
batch loss,▁█▇▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
epoch,▁
loss,▁

Run summary:
batch loss,1.37494
epoch,0.0
loss,3.75987

wandb: Sweep Agent: Waiting for job.
wandb: Job received.
wandb: Agent Starting Run: 0e14ejgc with config:
wandb: 	batch_size: 33
wandb: 	dropout: 0.5
wandb: 	epochs: 1
wandb: 	fc_layer_size: 256
wandb: 	learning_rate: 0.003753334263927255
wandb: 	optimizer: sgd

Run history:
batch loss,█▇▆▅▃▄▃▂▂▂▃▂▂▁▁▁▃▂▃▂▂▃▁▂▂▂▁▂▁▂▁▂▁▁▁▁▂▁▁▁
epoch,▁
loss,▁

Run summary:
batch loss,0.78865
epoch,0.0
loss,0.63759

wandb: Agent Starting Run: c0a9wb2j with config:
wandb: 	batch_size: 81
wandb: 	dropout: 0.3
wandb: 	epochs: 1
wandb: 	fc_layer_size: 512
wandb: 	learning_rate: 0.07923081354423983
wandb: 	optimizer: sgd

Run history:
batch loss,█▆▄▃▄▄▂▂▁▂▂▃▂▂▂▃▂▁▂▂▂▃▂▂▂▁▂▂▁▁▂▂▂▃▁▁▂▁▁▂
epoch,▁
loss,▁

Run summary:
batch loss,0.04618
epoch,0.0
loss,0.55105

wandb: Agent Starting Run: f47kxpey with config:
wandb: 	batch_size: 170
wandb: 	dropout: 0.4
wandb: 	epochs: 1
wandb: 	fc_layer_size: 128
wandb: 	learning_rate: 0.04566927047879006
wandb: 	optimizer: sgd

Run history:
batch loss,██▇▆▄▄▃▃▂▃▃▂▂▂▂▂▂▂▂▂▂▂▁▂▁▂▂▁▁▂▁▂▁▁▁▁▁▁▁▁
epoch,▁
loss,▁

Run summary:
batch loss,0.29723
epoch,0.0
loss,0.66329

wandb: Agent Starting Run: c2uxjpz4 with config:
wandb: 	batch_size: 45
wandb: 	dropout: 0.5
wandb: 	epochs: 1
wandb: 	fc_layer_size: 256
wandb: 	learning_rate: 0.05102739177684415
wandb: 	optimizer: adam

Run history:
batch loss,█▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
epoch,▁
loss,▁

Run summary:
batch loss,1.95789
epoch,0.0
loss,2.74241


# Visualize Sweep Results

Check on this link:

https://wandb.ai/akanksharawat9/pytorch-sweeps-demo/sweeps/f6k4trub?workspace=user-akanksharawat9

# **Integrating Keras with Weights & Biases**

In this section we build a simple image classifier and show how to use Weights & Biases while training it on the MedMNIST (BloodMNIST) dataset.

W&B comes with a lightweight integration for Keras (WandbCallback). With just a few lines of code you can log your metrics, save the model and training configuration, evaluate the model, and more.

In [13]:
!pip install -qq medmnist


In [14]:
# General Dependencies
import os
import random
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.cm as cm
%matplotlib inline

# For Deep Learning
import tensorflow as tf
print("TF: ", tf.__version__)
from tensorflow.keras import layers
from tensorflow.keras import models

# For MLOps
import wandb
print("W&B: ", wandb.__version__)
from wandb.keras import WandbCallback

# For medMNIST dataset
import medmnist
print("medMNIST: ", medmnist.__version__)
from medmnist import INFO

TF:  2.8.0
W&B:  0.12.11
medMNIST:  2.0.2


In [15]:
# Login to W&B
wandb.login()

True

## **Configs**

Keeping track of the hyperparameters used to train and evaluate your model is essential for reproducing experiments. Here we first define all the hyperparameters needed to train our classifier.



In [16]:
configs = dict(
    data_flag = 'bloodmnist',
    image_width = 32,
    image_height = 32,
    batch_size = 128,
    model_name = 'vgg16',
    pretrain_weights = 'imagenet',
    epochs = 100,
    init_learning_rate = 0.001,
    lr_decay_rate = 0.1,
    optimizer = 'adam',
    loss_fn = 'sparse_categorical_crossentropy',
    metrics = ['acc'],
    earlystopping_patience = 5
)

## **Prepare Dataset**

We use the BloodMNIST dataset. It contains a total of 17,092 images organized into 8 classes. The source dataset is split with a ratio of 7:1:2 into training, validation, and test sets. The source images, with a resolution of 3×360×363 pixels, are center-cropped to 3×200×200 and then resized to 3×28×28.

In [17]:
info = INFO[configs['data_flag']]
configs['class_names'] = info['label']
configs['image_channels'] = info['n_channels']

info

{'MD5': '7053d0359d879ad8a5505303e11de1dc',
 'description': 'The BloodMNIST is based on a dataset of individual normal cells, captured from individuals without infection, hematologic or oncologic disease and free of any pharmacologic treatment at the moment of blood collection. It contains a total of 17,092 images and is organized into 8 classes. We split the source dataset with a ratio of 7:1:2 into training, validation and test set. The source images with resolution 3×360×363 pixels are center-cropped into 3×200×200, and then resized into 3×28×28.',
 'label': {'0': 'basophil',
  '1': 'eosinophil',
  '2': 'erythroblast',
  '3': 'ig',
  '4': 'lymphocyte',
  '5': 'monocyte',
  '6': 'neutrophil',
  '7': 'platelet'},
 'license': 'CC BY 4.0',
 'n_channels': 3,
 'n_samples': {'test': 3421, 'train': 11959, 'val': 1712},
 'python_class': 'BloodMNIST',
 'task': 'multi-class',
 'url': 'https://zenodo.org/record/5208230/files/bloodmnist.npz?download=1'}
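
As a quick sanity check, the split sizes reported in the metadata can be compared with the stated total of 17,092 images and the 7:1:2 ratio. The snippet below only repeats the numbers shown in the output above.

```python
# Split sizes reported in info['n_samples'] above.
n_samples = {"train": 11959, "val": 1712, "test": 3421}

total = sum(n_samples.values())
fractions = {name: count / total for name, count in n_samples.items()}

print(total)
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.3f}")
```

The three fractions come out close to 0.7, 0.1, and 0.2, which agrees with the 7:1:2 split in the dataset description.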

Each MedMNIST dataset can be downloaded using the download_and_prepare_dataset function below and the downloaded dataset is in the .npz format.

Each subset (e.g., bloodmnist.npz) is comprised of 6 keys: train_images, train_labels, val_images, val_labels, test_images and test_labels

In [18]:
def download_and_prepare_dataset(data_info: dict):
    """
    Utility function to download the dataset and return train/valid/test images/labels.

    Arguments:
        data_info (dict): Dataset metadata
    """
    data_path = tf.keras.utils.get_file(origin=data_info['url'], md5_hash=data_info['MD5'])

    with np.load(data_path) as data:
        # Get images
        train_images = data['train_images']
        valid_images = data['val_images']
        test_images = data['test_images']

        # Get labels
        train_labels = data['train_labels'].flatten()
        valid_labels = data['val_labels'].flatten()
        test_labels = data['test_labels'].flatten()

    return train_images, train_labels, valid_images, valid_labels, test_images, test_labels

In [19]:
train_images, train_labels, valid_images, valid_labels, test_images, test_labels = download_and_prepare_dataset(info)

Downloading data from https://zenodo.org/record/5208230/files/bloodmnist.npz?download=1


Explore the Dataset using W&B Tables

You can log data to W&B Tables row-wise or column-wise. In the section below, we create the table column-wise. Use add_column to define the name of a column and provide the array of data associated with it. Simply adding an array of images will not render them in the W&B Tables UI; each image array has to be wrapped with wandb.Image. add_computed_columns is used to do that.

In [20]:
log_full = True #@param {type:"boolean"}

if log_full:
    log_train_samples = len(train_images)
else:
    log_train_samples = 1000 

print(f'Number of train images : {log_train_samples} to be logged')

Number of train images : 11959 to be logged


In [21]:

# Initialize a new W&B run
run = wandb.init(project='medmnist-bloodmnist', group='viz_data')

# Initialize a W&B Artifact
ds = wandb.Artifact("medmnist_bloodmnist_dataset", "dataset")

# Initialize an empty table
train_table = wandb.Table(columns=[], data=[])
# Add training data
train_table.add_column('image', train_images[:log_train_samples])
# Add training label_id
train_table.add_column('label_id', train_labels[:log_train_samples])
# Add training class names
train_table.add_computed_columns(lambda ndx, row:{
    "images": wandb.Image(row["image"]),
    "class_names": configs['class_names'][str(row["label_id"])]
    })

# Add the table to the Artifact
ds['train_data'] = train_table

# Let's do the same for the validation data
valid_table = wandb.Table(columns=[], data=[])
valid_table.add_column('image', valid_images)
valid_table.add_column('label_id', valid_labels)
valid_table.add_computed_columns(lambda ndx, row:{
    "images": wandb.Image(row["image"]),
    "class_name": configs['class_names'][str(row["label_id"])]
    })
ds['valid_data'] = valid_table

# Save the dataset as an Artifact
ds.save()

# Finish the run
wandb.finish()







## **Data Pipeline**
tf.data.Dataset is used to build the data pipeline.

In [22]:
@tf.function
def preprocess(image: tf.Tensor, label: tf.Tensor):
    """
    Preprocess the image tensors and parse the labels
    """
    # Preprocess images
    image = tf.image.convert_image_dtype(image, tf.float32)
    
    # Parse label
    label = tf.cast(label, tf.float32)
    
    return image, label


def prepare_dataloader(images: np.ndarray,
                       labels: np.ndarray,
                       loader_type: str='train',
                       batch_size: int=128):
    """
    Utility function to prepare dataloader.
    """
    dataset = tf.data.Dataset.from_tensor_slices((images, labels))

    if loader_type=='train':
        dataset = dataset.shuffle(1024)

    dataloader = (
        dataset
        .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
        .batch(batch_size)
        .prefetch(tf.data.AUTOTUNE)
    )

    return dataloader

In [23]:
trainloader = prepare_dataloader(train_images, train_labels, 'train', configs.get('batch_size', 64))
validloader = prepare_dataloader(valid_images, valid_labels, 'valid', configs.get('batch_size', 64))
testloader = prepare_dataloader(test_images, test_labels, 'test', configs.get('batch_size', 64))

## **Data Augmentation**


We apply simple image augmentation policies using the Keras preprocessing layers API.

In [24]:
img_augmentation = models.Sequential(
    [
        layers.RandomRotation(factor=0.15),
        layers.RandomFlip()],
    name="img_augmentation",
)

In [25]:
#@title
def augment_5_times(img):
    augmented_imgs = []
    for _ in range(5):
        aug_img = tf.squeeze(img_augmentation(img), axis=0)
        # Notice the use of wrapping the images with wandb.Image
        wandb_image = wandb.Image(aug_img.numpy())
        augmented_imgs.append(wandb_image)

    return augmented_imgs

In [26]:
%%time

viz_augment_samples = 100

# Initialize a W&B run
run = wandb.init(project='medmnist-bloodmnist', group='viz_augmentation')

# Use the already logged dataset
train_art = run.use_artifact('ayush-thakur/medmnist-bloodmnist/medmnist_bloodmnist_dataset:latest', type='dataset')

# Get the train_table to access the data
train_table = train_art.get("train_data")

# Get the images, ground truth label, and row index
images = train_table.get_column("images", convert_to="numpy")
labels = train_table.get_column("label_id", convert_to="numpy")
ids = train_table.get_index()
# Shuffle the ids and slice
random.shuffle(ids)
sample_ids = ids[0:viz_augment_samples]

# Create augmentation table
augment_table = wandb.Table(columns=['image', 'truth', 'label_id', 'aug1', 'aug2', 'aug3', 'aug4', 'aug5'])

# Get augmented images and log it onto the table
for sample_id in sample_ids:
    img = images[sample_id]
    label = labels[sample_id]
    augmented_imgs = augment_5_times(tf.expand_dims(img, axis=0))
    augment_table.add_data(wandb.Image(img),
                           configs['class_names'][str(label)],  # 'truth': class name
                           int(label),                          # 'label_id': integer label
                           augmented_imgs[0],
                           augmented_imgs[1],
                           augmented_imgs[2],
                           augmented_imgs[3],
                           augmented_imgs[4])

# Log the table
wandb.log({'augmented data': augment_table})

# Finish the run
wandb.finish()







CPU times: user 14.2 s, sys: 2.32 s, total: 16.5 s
Wall time: 35.8 s


## **Model**


We use VGG16 as the backbone CNN block.

In [27]:
def get_model(input_shape: tuple=(28, 28, 3), 
              resize: tuple=(32, 32, 3),
              dropout_rate: float=0.5,
              num_classes: int=8,
              output_activation: str='softmax'):
  
    inputs = layers.Input(input_shape)
    resize_img = layers.Resizing(resize[0], resize[1], interpolation='bilinear')(inputs)
    augment_img = img_augmentation(resize_img)
  
    base_model = tf.keras.applications.VGG16(include_top=False, 
                                             weights=configs['pretrain_weights'], 
                                             input_shape=resize,
                                             input_tensor=augment_img)
    base_model.trainable = True

    
    x = base_model.output
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(dropout_rate)(x)
    outputs = layers.Dense(num_classes, activation=output_activation)(x)

    return models.Model(inputs, outputs)

tf.keras.backend.clear_session()
model = get_model()
model.summary()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 28, 28, 3)]       0         
                                                                 
 resizing (Resizing)         (None, 32, 32, 3)         0         
                                                                 
 img_augmentation (Sequentia  (None, None, None, 3)    0         
 l)                                                              
                                                                 
 block1_conv1 (Conv2D)       (None, 32, 32, 64)        1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 32, 32, 64)        36928     
                                                      

Callbacks: define the early stopping callback.

In [28]:
earlystopper = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=configs['earlystopping_patience'], verbose=0, mode='auto',
    restore_best_weights=True
)
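
The idea behind patience can be sketched in plain Python. This is a simplified model, not Keras's actual implementation (which also supports options such as min_delta and baseline). `simulate_early_stopping` is a hypothetical helper and the loss values are made up for illustration.

```python
def simulate_early_stopping(val_losses, patience):
    """Return (stopped_epoch, best_epoch) for a sequence of validation losses.

    stopped_epoch is None when training runs through every epoch.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Hypothetical validation losses, made up for illustration.
losses = [1.00, 0.90, 0.95, 0.96, 0.97, 0.98, 0.99, 0.50]
stopped_epoch, best_epoch = simulate_early_stopping(losses, patience=5)
print(stopped_epoch, best_epoch)
```

In this made-up sequence, training stops before the late improvement in the last epoch is ever seen, which is the trade-off of a small patience. With restore_best_weights=True, the model would be rolled back to the weights from best_epoch.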

In [29]:
def lr_scheduler(epoch, lr):
    # log the current learning rate onto W&B
    if wandb.run is None:
        raise wandb.Error("You must call wandb.init() before WandbCallback()")

    wandb.log({'learning_rate': lr}, commit=False)
    
    if epoch < 7:
        return lr
    else:
        return lr * tf.math.exp(-configs['lr_decay_rate'])

lr_callback = tf.keras.callbacks.LearningRateScheduler(lr_scheduler)
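
To see what this schedule produces, the sketch below mirrors lr_scheduler in plain Python without the W&B logging. The constants repeat the values from configs, and `schedule` is a stand-in name used only for this illustration.

```python
import math

INIT_LR = 0.001    # configs['init_learning_rate']
DECAY_RATE = 0.1   # configs['lr_decay_rate']
HOLD_EPOCHS = 7    # epochs 0..6 keep the initial rate


def schedule(epoch, lr):
    """Pure-Python mirror of lr_scheduler above, without the W&B logging."""
    if epoch < HOLD_EPOCHS:
        return lr
    return lr * math.exp(-DECAY_RATE)


lr = INIT_LR
history = []
for epoch in range(12):
    lr = schedule(epoch, lr)  # the current rate is fed back in each epoch
    history.append(lr)
    print(f"epoch {epoch:2d}: lr = {lr:.6f}")
```

Because the current rate is passed back into the function every epoch, the decay compounds: from epoch 7 onward the rate is multiplied by exp(-0.1), roughly 0.905, per epoch.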

In [30]:
def train(config: dict, 
          callbacks: list,
          verbose: int=0):
    """
    Utility function to train the model.

    Arguments:
        config (dict): Dictionary of hyperparameters.
        callbacks (list): List of callbacks passed to `model.fit`.
        verbose (int): 0 for silent and 1 for progress bar.
    """

    # Initalize model
    tf.keras.backend.clear_session()
    model = get_model(resize=(config.image_width, config.image_height, config.image_channels))

    # Compile the model
    opt = tf.keras.optimizers.Adam(learning_rate=config.init_learning_rate)
    model.compile(opt,
                  config.loss_fn,
                  metrics=config.metrics)

    # Train the model
    _ = model.fit(trainloader,
                  epochs=config.epochs,
                  validation_data=validloader,
                  callbacks=callbacks,
                  verbose=verbose)

    return model

Train using WandbCallback


In [31]:
# Initialize the W&B run
run = wandb.init(project='medmnist-bloodmnist', config=configs, job_type='train')
config = wandb.config

# Define WandbCallback for experiment tracking
wandb_callback = WandbCallback(monitor='val_loss',
                               log_weights=True,
                               log_evaluation=True,
                               validation_steps=5)

# callbacks
callbacks = [earlystopper, wandb_callback, lr_callback]

# Train
model = train(config, callbacks=callbacks, verbose=1)

# Evaluate the trained model
loss, acc = model.evaluate(validloader)
wandb.log({'evaluate/accuracy': acc})

# Close the W&B run.
wandb.finish()






Run history:
acc,▁
epoch,▁
evaluate/accuracy,▁
learning_rate,▁
loss,▁
val_acc,▁
val_loss,▁

Run summary:
acc,0.21833
best_epoch,0.0
best_val_loss,1.55392
epoch,0.0
evaluate/accuracy,0.44451
learning_rate,0.001
loss,2.003
val_acc,0.44451
val_loss,1.55392


The Weights & Biases Keras integration enables experiment tracking and much more with just a few lines of code.


***************************

# **Integrating PyTorch with Weights & Biases**

In pseudocode, what we'll do is:
# import the library
import wandb

# start a new experiment
wandb.init(project="new-sota-model")

# capture a dictionary of hyperparameters with config
wandb.config = {"learning_rate": 0.001, "epochs": 100, "batch_size": 128}

# set up model and data
model, dataloader = get_model(), get_data()

# optional: track gradients
wandb.watch(model)

for batch in dataloader:
  metrics = model.training_step()
  # log metrics inside your training loop to visualize model performance
  wandb.log(metrics)

# optional: save model at the end
model.to_onnx()
wandb.save("model.onnx")

In [32]:
import os
import random

import numpy as np
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from tqdm.notebook import tqdm

# Ensure deterministic behavior.
# A fixed integer seed is used because Python's built-in hash() of a string
# changes between interpreter sessions unless PYTHONHASHSEED is set.
SEED = 42
torch.backends.cudnn.deterministic = True
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
torch.cuda.manual_seed_all(SEED)

# Device configuration
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# remove slow mirror from list of MNIST mirrors
torchvision.datasets.MNIST.mirrors = [mirror for mirror in torchvision.datasets.MNIST.mirrors
                                      if not mirror.startswith("http://yann.lecun.com")]

In [33]:
config = dict(
    epochs=5,
    classes=10,
    kernels=[16, 32],
    batch_size=128,
    learning_rate=0.005,
    dataset="MNIST",
    architecture="CNN")

Let's define the overall pipeline, which is pretty typical for model training:

we first make a model, plus associated data and optimizer, then

we train the model accordingly, and finally

we test it to see how training went.

In [34]:
def model_pipeline(hyperparameters):

    # tell wandb to get started
    with wandb.init(project="Akanksha_pytorch-demo", config=hyperparameters):
      # access all HPs through wandb.config, so logging matches execution!
      config = wandb.config

      # make the model, data, and optimization problem
      model, train_loader, test_loader, criterion, optimizer = make(config)
      print(model)

      # and use them to train the model
      train(model, train_loader, criterion, optimizer, config)

      # and test its final performance
      test(model, test_loader)

    return model

In [35]:
def make(config):
    # Make the data
    train, test = get_data(train=True), get_data(train=False)
    train_loader = make_loader(train, batch_size=config.batch_size)
    test_loader = make_loader(test, batch_size=config.batch_size)

    # Make the model
    model = ConvNet(config.kernels, config.classes).to(device)

    # Make the loss and optimizer
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(
        model.parameters(), lr=config.learning_rate)
    
    return model, train_loader, test_loader, criterion, optimizer

Define the Data Loading and Model

In [36]:
def get_data(slice=5, train=True):
    full_dataset = torchvision.datasets.MNIST(root=".",
                                              train=train, 
                                              transform=transforms.ToTensor(),
                                              download=True)
    #  equiv to slicing with [::slice] 
    sub_dataset = torch.utils.data.Subset(
      full_dataset, indices=range(0, len(full_dataset), slice))
    
    return sub_dataset


def make_loader(dataset, batch_size):
    loader = torch.utils.data.DataLoader(dataset=dataset,
                                         batch_size=batch_size, 
                                         shuffle=True,
                                         pin_memory=True, num_workers=2)
    return loader

In [37]:
# Conventional and convolutional neural network

class ConvNet(nn.Module):
    def __init__(self, kernels, classes=10):
        super(ConvNet, self).__init__()
        
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, kernels[0], kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(kernels[0], kernels[1], kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.fc = nn.Linear(7 * 7 * kernels[-1], classes)
        
    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.reshape(out.size(0), -1)
        out = self.fc(out)
        return out
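
The 7 * 7 * kernels[-1] input size of the final linear layer follows from the layer settings above. The sketch below walks through that arithmetic with the standard output-size formula for convolution and pooling layers. `conv_out` is a hypothetical helper and assumes no dilation.

```python
def conv_out(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 28  # MNIST images are 28x28
for _ in range(2):  # layer1, then layer2
    # Conv2d with kernel 5, stride 1, padding 2 keeps the spatial size
    size = conv_out(size, kernel_size=5, stride=1, padding=2)
    # MaxPool2d with kernel 2, stride 2 halves it
    size = conv_out(size, kernel_size=2, stride=2)

kernels = [16, 32]
in_features = size * size * kernels[-1]
print(size, in_features)
```

With kernels=[16, 32], this gives the same in_features value that appears for the fc layer when the model is printed.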

In [38]:
def train(model, loader, criterion, optimizer, config):
    # Tell wandb to watch what the model gets up to: gradients, weights, and more!
    wandb.watch(model, criterion, log="all", log_freq=10)

    # Run training and track with wandb
    total_batches = len(loader) * config.epochs
    example_ct = 0  # number of examples seen
    batch_ct = 0
    for epoch in tqdm(range(config.epochs)):
        for _, (images, labels) in enumerate(loader):

            loss = train_batch(images, labels, model, optimizer, criterion)
            example_ct +=  len(images)
            batch_ct += 1

            # Report metrics every 25th batch
            if ((batch_ct + 1) % 25) == 0:
                train_log(loss, example_ct, epoch)


def train_batch(images, labels, model, optimizer, criterion):
    images, labels = images.to(device), labels.to(device)
    
    # Forward pass ➡
    outputs = model(images)
    loss = criterion(outputs, labels)
    
    # Backward pass ⬅
    optimizer.zero_grad()
    loss.backward()

    # Step with optimizer
    optimizer.step()

    return loss

In [39]:
def train_log(loss, example_ct, epoch):
    # Where the magic happens
    wandb.log({"epoch": epoch, "loss": loss}, step=example_ct)
    print(f"Loss after " + str(example_ct).zfill(5) + f" examples: {loss:.3f}")

In [40]:
def test(model, test_loader):
    model.eval()

    # Run the model on some test examples
    with torch.no_grad():
        correct, total = 0, 0
        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

        print(f"Accuracy of the model on the {total} " +
              f"test images: {100 * correct / total}%")
        
        wandb.log({"test_accuracy": correct / total})

    # Save the model in the exchangeable ONNX format
    torch.onnx.export(model, images, "model.onnx")
    wandb.save("model.onnx")

In [43]:
# Build, train and analyze the model with the pipeline
model = model_pipeline(config)



ConvNet(
  (layer1): Sequential(
    (0): Conv2d(1, 16, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer2): Sequential(
    (0): Conv2d(16, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (fc): Linear(in_features=1568, out_features=10, bias=True)
)



Loss after 01080 examples: 2.328
Loss after 02205 examples: 2.280
Loss after 03330 examples: 2.288
Loss after 04455 examples: 2.295
Loss after 05580 examples: 2.316
Loss after 06705 examples: 2.312
Loss after 07830 examples: 2.300
Loss after 08955 examples: 2.279
Loss after 10080 examples: 2.300
Loss after 11205 examples: 2.312
Accuracy of the model on the 2000 test images: 7.9%



Run history:
epoch,▁▁▁▁▁▁▁▁▁▁
loss,█▁▂▃▆▆▄▁▄▆
test_accuracy,▁

Run summary:
epoch,0.0
loss,2.31232
test_accuracy,0.079
