
# Training a Classifier with TensorBoard

Last week, we looked at training a CNN classifier using the Fashion-MNIST dataset. Today we will learn how to track experiments with TensorBoard.
TensorBoard lets you log training results such as loss and accuracy and view them as graphs, so you can understand your model better.

In this lab session, we will cover the same training steps as in Lab session 1, this time using the CIFAR-10 color image dataset, with TensorBoard integrated. We will go through the following steps in order:

1. Import your libraries
2. Load the Dataset & Make the Dataset Iterable
3. Visualize the Data
4. Define the Network
5. Define Loss Function and Optimizer
6. Train the network
7. Save the model
8. Test the Network

Note that the code from Lab session 1 is already available in this notebook as a starting point, so you do not need to reuse your own code. You can also reread the [PyTorch CIFAR-10 tutorial](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#sphx-glr-beginner-blitz-cifar10-tutorial-py) for more explanation of the individual steps.
Please read the instructions for each step in the code comments carefully. Places where you need to add or modify code are highlighted with a TODO instruction in comments.

Before you get started, please read the following tutorials, which will help you complete the instructions:
* [Debuggercafe](https://debuggercafe.com/track-your-pytorch-deep-learning-project-with-tensorboard/)
* [PyTorch Tutorials](https://pytorch.org/tutorials/intermediate/tensorboard_tutorial.html)
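
As a warm-up before the lab steps, here is a minimal sketch of the logging workflow those tutorials describe: create a `SummaryWriter`, log one scalar per step, and close the writer. The log directory `runs/scalar_sketch`, the tag `sketch/fake_loss`, and the decaying values are made up for illustration and are not part of the lab.

```python
import math
import os

from torch.utils.tensorboard import SummaryWriter

# Hypothetical log directory, used only for this sketch.
log_dir = os.path.join("runs", "scalar_sketch")
writer = SummaryWriter(log_dir)

# Log a made-up, smoothly decaying "loss" for five steps.
for step in range(5):
    fake_loss = math.exp(-0.5 * step)
    writer.add_scalar("sketch/fake_loss", fake_loss, global_step=step)

# Flush pending events to disk and release the file handle.
writer.close()

event_files = os.listdir(log_dir)
print(f"{len(event_files)} event file(s) written to {log_dir}")
```

In a notebook, running `%load_ext tensorboard` followed by `%tensorboard --logdir runs` should display the logged curve inline.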

# Step 1: Import the libraries

In [None]:
# Import the PyTorch and torchvision libraries. The TensorBoard writer is
# imported in the next cell; you will use all three throughout the lab.
import torch
import torchvision
import torchvision.transforms as transforms

# Print the versions of the two imported libraries and check the GPU status.
print(torch.__version__)
print(torchvision.__version__)
!nvidia-smi

In [None]:
# TODO: Create a SummaryWriter using TensorBoard.
# By default, the writer outputs to the ./runs/ directory.
# Change this to the runs/CIFAR_Experiment_6 directory.


# TODO: Load the TensorBoard notebook extension


# Step 2: Load the Dataset & Make the Dataset Iterable


In [None]:
# Define a transform to normalize the data
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) # 3 channels (RGB) for colored images

# Download the training dataset
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
# Make the Dataset Iterable (you can later try a batch size of 8 instead of 4)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

# Download the test dataset
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
# Make the Dataset Iterable
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# Step 3: Visualize the Data

In [None]:
# Find the length of the dataset.
len(trainset)

In [None]:
# Find the shape of a single batch tensor in the dataset.
images, labels = next(iter(trainloader))
print("Type of images:", type(images))
print("Batch shape:", images.shape)
print("Corresponding labels vector shape:", labels.shape)

# Find the size of a single image in the dataset.
image, label = next(iter(trainset))
print("Single image shape:", image.shape)

## Display a sample grid of images
Let us show some of the training images.



In [None]:
# Import the matplotlib and numpy libraries.
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

# The images need to be unnormalised to view them correctly
def show_img(img):
    img = img / 2 + 0.5 
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    return npimg # return the unnormalized images

# Display the grid of images
# The imported libraries can be used to display the grid of images.
# get a batch of images
dataiter = iter(trainloader)
images, labels = next(dataiter)  # dataiter.next() fails on recent PyTorch versions
# create grid of images
img_grid = torchvision.utils.make_grid(images)
# show the grid and get the unnormalized images as a NumPy array
img_grid = show_img(img_grid)
# clip the values to the valid [0, 1] range
img_grid = np.clip(img_grid, 0., 1.)
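
The `img / 2 + 0.5` line in `show_img` is the inverse of `transforms.Normalize` with mean 0.5 and standard deviation 0.5, which computes `(x - mean) / std` per channel. The sketch below checks this round trip on a few made-up pixel values; the array `pixels` is an illustrative assumption, not data from CIFAR-10.

```python
import numpy as np

mean, std = 0.5, 0.5

# Toy pixel values in [0, 1], the range produced by transforms.ToTensor().
pixels = np.array([0.0, 0.25, 0.5, 1.0])

normalized = (pixels - mean) / std   # what Normalize computes per channel
restored = normalized * std + mean   # same as normalized / 2 + 0.5

print(normalized)
print(restored)
```

The normalized values lie in [-1, 1], and undoing the transform recovers the original pixels.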


In [None]:
# TODO: Load TensorBoard in this cell.
# You will need to run this cell again or use the TensorBoard refresh to view new changes throughout the notebook.

# TODO: Add the images to TensorBoard.
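
For orientation, here is a small sketch of `SummaryWriter.add_image` on a random tensor rather than on the lab's image grid. The directory `runs/image_sketch`, the tag, and the random "grid" are assumptions made for this example; `add_image` accepts a `torch.Tensor` or a NumPy array and expects channels first by default.

```python
import os

import torch
from torch.utils.tensorboard import SummaryWriter

# Hypothetical log directory, used only for this sketch.
log_dir = os.path.join("runs", "image_sketch")
writer = SummaryWriter(log_dir)

# Random stand-in for an image grid: 3 channels, 32x64 pixels, values in [0, 1].
fake_grid = torch.rand(3, 32, 64)

# dataformats="CHW" (the default) means the channel axis comes first.
writer.add_image("sketch/random_grid", fake_grid, global_step=0, dataformats="CHW")
writer.close()

print(os.listdir(log_dir))
```

The image should then appear under the Images tab in TensorBoard.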


# Step 4: Define the Network

In [None]:
# Create a Convolutional Neural Network with the following structure:
# Net(
#   (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
#   (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
#   (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
#   (fc1): Linear(in_features=16 * 5 * 5, out_features=120, bias=True)
#   (fc2): Linear(in_features=120, out_features=84, bias=True)
#   (fc3): Linear(in_features=84, out_features=10, bias=True)
# )
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# define net - Model
net = Net()
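
The `16 * 5 * 5` input size of `fc1` follows from the layer arithmetic. The sketch below traces the side length of the feature map through the network using the standard output-size formula for unpadded convolutions and pooling; the helper `conv_out` is a hypothetical name introduced for this example.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32                           # CIFAR-10 images are 32x32
size = conv_out(size, kernel=5)     # conv1 -> 28
size = conv_out(size, 2, stride=2)  # pool  -> 14
size = conv_out(size, kernel=5)     # conv2 -> 10
size = conv_out(size, 2, stride=2)  # pool  -> 5

flat_features = 16 * size * size    # 16 channels after conv2
print(size, flat_features)          # prints: 5 400
```

With the 28x28 inputs of Fashion-MNIST, the same chain ends at a side length of 4, which is why that network used `16 * 4 * 4`.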


In [None]:
# TODO: Add the neural network graph to TensorBoard.
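
As a rough illustration of `SummaryWriter.add_graph`, the sketch below logs the graph of a tiny stand-in model. The model `tiny_model`, the directory `runs/graph_sketch`, and the all-zeros example input are assumptions for this example; `add_graph` traces the model with the input you pass, so that input must have a shape the model accepts.

```python
import os

import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter

# Hypothetical log directory, used only for this sketch.
log_dir = os.path.join("runs", "graph_sketch")
writer = SummaryWriter(log_dir)

# Tiny stand-in model; the lab network would take an image batch instead.
tiny_model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
example_input = torch.zeros(1, 4)  # one sample with 4 features

# The model is traced with example_input to record its structure.
writer.add_graph(tiny_model, example_input)
writer.close()

print(os.listdir(log_dir))
```

The traced model should then show up under the Graphs tab in TensorBoard.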


# Step 5: Define Loss Function and Optimizer

Let's use a Classification Cross-Entropy loss and SGD with momentum.



In [None]:
# Create a criterion to calculate the cross entropy loss.
import torch.optim as optim
criterion = nn.CrossEntropyLoss()
# Create an optimizer to perform Stochastic Gradient Descent.
# Let the learning rate = 0.002 and momentum = 0.9. Note that you can modify
# these parameters later.
optimizer = optim.SGD(net.parameters(), lr=0.002, momentum=0.9)
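
To build some intuition for the loss values you will see in TensorBoard, here is a plain-Python sketch of the cross-entropy of a single sample, computed from raw scores (logits) as `nn.CrossEntropyLoss` does. The function `cross_entropy` below is a simplified stand-in written for this example, not the PyTorch implementation.

```python
import math

def cross_entropy(logits, label):
    """Cross-entropy of one sample: -log(softmax(logits)[label])."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[label]

# An uninformed model gives every one of the 10 classes the same score.
uniform_logits = [0.0] * 10
loss = cross_entropy(uniform_logits, label=3)
print(f"{loss:.4f}")  # prints: 2.3026
```

A loss near ln(10) ≈ 2.30 therefore means the network is doing no better than guessing uniformly among the 10 classes.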

# Step 6: Train the network

This is when things start to get interesting.
We simply have to loop over our data iterator, and feed the inputs to the
network and optimize.



In [None]:
# TODO: Add the training loss and accuracy to TensorBoard after each epoch
# to monitor the model's training.
# TODO: Modify the model's hyperparameters defined in previous steps, such as
# learning rate, momentum, batch size, number of epochs...
# Use TensorBoard to monitor the effect the hyperparameters have on
# the model's training loss and accuracy. Note that you can refresh TensorBoard
# while the training loop is running.
# What happens if you re-run this cell without resetting the network weights
# (i.e. without re-running step 4)?
num_epoch = 10
for epoch in range(num_epoch):  
    running_loss = 0.0
    running_correct = 0
    for data in (trainloader):
        # get the images and labels
        inputs, labels = data
        # set the parameter gradients to zero
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        _, preds = torch.max(outputs.data, 1)
        loss.backward()
        # update the parameters
        optimizer.step()
        running_loss += loss.item()
        running_correct += (preds == labels).sum().item()

    # log the epoch loss in TensorBoard here
   
    # log the epoch accuracy in TensorBoard here
    
    
    # loss.item() is a per-batch mean, so average it over the number of batches
    print(f"Epoch {epoch+1} train loss: {running_loss/len(trainloader):.3f} train acc: {running_correct/len(trainset):.3f}")
print('Finished Training')
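
When turning the running totals into epoch metrics, note that `nn.CrossEntropyLoss` returns the mean loss of a batch by default, while the correct-prediction counter is a count of samples. The sketch below works through the arithmetic on three made-up batches; all numbers are illustrative assumptions.

```python
# Made-up per-batch results: (mean loss of the batch, correct predictions, batch size)
batches = [(2.0, 1, 4), (1.5, 2, 4), (1.0, 3, 4)]

running_loss = 0.0
running_correct = 0
num_samples = 0
for batch_loss, correct, batch_size in batches:
    running_loss += batch_loss      # sum of per-batch means
    running_correct += correct      # count of correct predictions
    num_samples += batch_size

epoch_loss = running_loss / len(batches)     # average over batches
epoch_acc = running_correct / num_samples    # average over samples
print(epoch_loss, epoch_acc)                 # prints: 1.5 0.5
```

These two numbers are the kind of per-epoch values you would pass to `add_scalar`, with the epoch index as the step.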

# Step 7: Save the model
Let's quickly save our trained model:



In [None]:
# Save the model.
PATH = './cifar_net.pth'
torch.save(net.state_dict(), PATH)
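
A saved `state_dict` contains only the parameters, so loading it back requires an instance of the same architecture. The sketch below shows the round trip on a tiny stand-in model; the `nn.Linear(4, 2)` model and the temporary file name are assumptions for this example.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Tiny stand-in model, used only for this sketch.
model = nn.Linear(4, 2)

path = os.path.join(tempfile.mkdtemp(), "sketch_model.pth")
torch.save(model.state_dict(), path)

# Rebuild the same architecture, then copy the saved parameters into it.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path))
restored.eval()  # switch to inference mode before testing

same = all(torch.equal(a, b) for a, b in zip(model.state_dict().values(),
                                             restored.state_dict().values()))
print(same)  # prints: True
```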

# Step 8: Test the network on the test data

We have trained the network for several passes over the training dataset.
But we need to check if the network has learnt anything at all.

In [None]:
# TODO: Write a validation loop over the test set.
# The validation loop is very similar to the training loop, but
# the loss is not backpropagated through the network.

# TODO: Add the test accuracy for each class in the test set
# to TensorBoard.
# Hint: use the add_histogram() function.
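
The sketch below illustrates the general shape of such an evaluation loop on random stand-in data: the model is put in evaluation mode, gradients are disabled with `torch.no_grad()`, and correct predictions are counted per class. The 3-class linear model and the random `batches` list are hypothetical placeholders for the trained network and the test loader, so the resulting accuracies are meaningless.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_classes = 3

# Stand-ins for the trained network and the test loader (both hypothetical).
model = nn.Linear(8, num_classes)
batches = [(torch.randn(4, 8), torch.randint(0, num_classes, (4,))) for _ in range(5)]

class_correct = [0] * num_classes
class_total = [0] * num_classes

model.eval()
with torch.no_grad():  # no gradients are tracked, so nothing is backpropagated
    for inputs, labels in batches:
        preds = model(inputs).argmax(dim=1)
        for label, pred in zip(labels.tolist(), preds.tolist()):
            class_total[label] += 1
            class_correct[label] += int(label == pred)

for c in range(num_classes):
    if class_total[c] > 0:
        print(f"class {c}: {class_correct[c] / class_total[c]:.2f} ({class_total[c]} samples)")
```

The per-class counts collected this way are what you would then log to TensorBoard.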


# Further Reading
*   [CNN Explainer](https://poloclub.github.io/cnn-explainer/)
*   [Deep Learning Wizard](https://www.deeplearningwizard.com/deep_learning/practical_pytorch/pytorch_convolutional_neuralnetwork/)
*   [Deep Learning Tips and Tricks cheatsheet](https://stanford.edu/~shervine/teaching/cs-230/cheatsheet-deep-learning-tips-and-tricks)
*   [Understanding CNN-cs231n](https://cs231n.github.io/understanding-cnn/)
*   [CIFAR-10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html)


**Advanced:**

*   [Weights & Biases](https://www.wandb.com/)
*   [Weights & Biases with PyTorch Lightning](https://www.wandb.com/articles/pytorch-lightning-with-weights-biases)



