In [None]:
import numpy as np
import scipy.stats as st
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set_context('paper')
sns.set_style('white')
# A helper function for downloading files
import requests
import os
def download(url, local_filename=None):
    """
    Downloads the file in the ``url`` and saves it in the current working directory.
    """
    data = requests.get(url)
    if local_filename is None:
        local_filename = os.path.basename(url)
    with open(local_filename, 'wb') as fd:
        fd.write(data.content)

# Hands-on Activity 25 - Deep Neural Networks Continued

## Objectives

+ Implement an image classification network in `PyTorch`.
+ Add L2 regularization.
+ Add convolutional layers.
+ Add hyperparameter tuning.

## References 

+ [Deep Learning with PyTorch: A 60 minute blitz](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html) and in particular:
    - [Training a Classifier](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#sphx-glr-beginner-blitz-cifar10-tutorial-py), which uses the same dataset as this hands-on activity.

## The CIFAR10 dataset

We are going to use the [CIFAR10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html) to demonstrate multiclass classification.
The dataset consists of 60000 32x32 color images in 10 classes (plane, car, bird, cat, deer, dog, frog, horse, ship, and truck), with 6000 images per class.
The dataset can be downloaded directly from `PyTorch` using the module `torchvision`.

You can think of the original images as 32x32x3 arrays.
The first two dimensions correspond to the pixels.
The third dimension corresponds to the color (red, green, blue).
Of course, we will have to turn them into `PyTorch` tensors.
Also, it is more convenient to scale their values to lie in $[-1, 1]$.
We will achieve this using a transformation.
Don't worry about this now.
We will explain it as we go.

In [None]:
import torch
import torchvision
import torchvision.transforms as transforms

# This is the transformation that we will apply to each image
transform = transforms.Compose(
    [transforms.ToTensor(),   # This turns the picture to a Tensor
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) # This scales it to [-1, 1]

# Here is how you can download the training dataset
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)

# And here is how to download the test dataset:
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)

# These are the class labels
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

Now, all of this data went into the folder `./data`.
Here is what this folder contains:

In [None]:
!ls ./data/

The file `cifar-10-python.tar.gz` is a compressed file containing everything.
The contents were automatically extracted and put in the folder `cifar-10-batches-py`.
Let's look inside this folder:

In [None]:
!ls -lht data/cifar-10-batches-py

You see several files.
The important ones are `data_batch_1` to `data_batch_5` and `test_batch`.
Each of these contains 10000 images in a binary format.
The format is explained [here](https://www.cs.toronto.edu/~kriz/cifar.html).
We can read them as follows:

In [None]:
def unpickle(file):
    import pickle
    with open(file, 'rb') as fo:
        # Avoid naming the result `dict`, which would shadow the builtin
        batch = pickle.load(fo, encoding='bytes')
    return batch

data = unpickle('data/cifar-10-batches-py/data_batch_1')
# data is a dictionary
# Here are the keys
print(data.keys())

In [None]:
# One key has to do with the pictures
# It gives you a numpy array:
print(data[b'data'].shape)

In [None]:
# The first dimension corresponds to the different pictures
# The second dimension is the size of a flattened picture:
32 * 32 * 3

In [None]:
# So this is the first picture:
img = data[b'data'][0, :].reshape((32, 32, 3), order='F')
# Here is the Red channel:
print(img[:, :, 0])

The numbers go from 0 (no red) to 255 (full red).
Here is how to visualize it:

In [None]:
fig, ax = plt.subplots(figsize=(1, 1))
ax.imshow(np.transpose(img, (1, 0, 2)));

This is clearly a frog.
Let's verify this:

In [None]:
classes[data[b'labels'][0]]
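
The Fortran-order reshape used above works because, according to the format description linked earlier, each row stores the 1024 red values first, then the green ones, then the blue ones, each in row-major order. Here is a minimal sketch, using a synthetic row instead of a real image, which suggests that the reshape-and-transpose above gives the same picture as reshaping to `(3, 32, 32)` and moving the channel axis last. The name `fake_row` is hypothetical.

```python
import numpy as np

# Synthetic stand-in for one row of data[b'data'] (not a real image)
fake_row = np.arange(3 * 32 * 32) % 256

# Approach used above: Fortran-order reshape, then swap the two pixel axes
img_f = fake_row.reshape((32, 32, 3), order='F')
shown_f = np.transpose(img_f, (1, 0, 2))

# Alternative: channels-first reshape, then move the channels to the last axis
img_c = fake_row.reshape((3, 32, 32))
shown_c = np.transpose(img_c, (1, 2, 0))

same_picture = np.array_equal(shown_f, shown_c)
print(same_picture)
```

Either form can be used. The second one makes the channel layout a bit more explicit.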

This is nice. And we could proceed manually like this.
However, `PyTorch` offers some useful functionality.
Let's investigate the `trainset` that was returned by `CIFAR10`:

In [None]:
trainset

In [None]:
# Here are the classes:
trainset.classes

In [None]:
# Here is the correspondence between classes and discrete labels
trainset.class_to_idx

In [None]:
# Here are the images from all training batches
print(trainset.data.shape)

In [None]:
# Here are the labels
print(trainset.targets[:10])

Alright.
Now, let's use `PyTorch` functionality for looping over the training and the test datasets.
We need a [DataLoader](https://pytorch.org/docs/stable/data.html):

In [None]:
# One for the training data:
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=0)

# One for the test data:
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=0)


These objects work as follows:

In [None]:
# They help you loop over all the data in a random way (because we had shuffle=True)
for i, data in enumerate(trainloader, 0):
    inputs, labels = data
    # Here inputs are of size batch_size x 3 x 32 x 32.
    # Since we specified batch_size=4,
    # each iteration loads four images
    if i % 1000 == 0:
        print('Data point:', i, 'input size:', str(inputs.shape))
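
Conceptually, a shuffling loader draws a random permutation of the indices once per pass and hands them out in chunks of `batch_size`. The sketch below illustrates only this idea. It ignores workers, transformations, and the stacking of images into tensors, and `iterate_minibatches` is a hypothetical helper, not part of `PyTorch`.

```python
import numpy as np

def iterate_minibatches(num_items, batch_size, shuffle, rng):
    """Yield index arrays that together cover 0..num_items-1 exactly once."""
    order = rng.permutation(num_items) if shuffle else np.arange(num_items)
    for start in range(0, num_items, batch_size):
        yield order[start:start + batch_size]

rng = np.random.default_rng(0)
# 50000 training images, four per mini-batch, as in the loader above
batches = list(iterate_minibatches(50000, 4, shuffle=True, rng=rng))
num_batches = len(batches)
visited = np.sort(np.concatenate(batches))
print(num_batches)
```

With 50000 training images and a batch size of 4, one pass consists of 12500 mini-batches, which is why the loop above prints at multiples of 1000 up to 12000.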

When you reach the end of the loop you have visited all the images once.
Notice that `PyTorch` has reshaped the images to 3 x 32 x 32 3D arrays.
This is more convenient for the convolutional layers we are going to use later.
Also, `PyTorch` has applied the transformation we gave it to scale the array elements to $[-1, 1]$.
Let me show you an example:

In [None]:
for i, data in enumerate(trainloader, 0):
    inputs, labels = data
    print(inputs[0])
    break
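
The scaling seen above can be reproduced by hand. As far as the values are concerned, `ToTensor` maps the integers 0 to 255 to $[0, 1]$ (it also reorders the axes to channels-first), and `Normalize` with mean 0.5 and standard deviation 0.5 computes $(x - 0.5) / 0.5$. Here is a small sketch of this arithmetic in `numpy`. The pixel values in `raw` are made up for illustration.

```python
import numpy as np

# Hypothetical raw pixel values, in the 0..255 range used by the CIFAR10 files
raw = np.array([0, 64, 128, 255], dtype=np.uint8)

# Step 1 (value scaling done by ToTensor): map 0..255 to [0, 1]
unit = raw.astype(np.float32) / 255.0

# Step 2 (Normalize with mean 0.5 and std 0.5): (x - mean) / std
scaled = (unit - 0.5) / 0.5

# The inverse map, x / 2 + 0.5, brings the values back to [0, 1] for plotting
recovered = scaled / 2 + 0.5
print(scaled)
```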

## Training a classifier using a dense DNN

Let's train a classifier using a dense neural network.
It's not going to work very well, but it is very easy to put together.
The network starts with 3 x 32 x 32 = 3072 inputs, followed by a few dense layers that end at 10 outputs passed through softmax.
However, for reasons of numerical stability, we are not going to end with the softmax layer during training.

In [None]:
import torch.nn as nn

# The classifier - The dimensions of the layers have
# been picked to match those of the convolutional neural network
# that we are going to build later
# For now, just notice that we gradually take the 3072-dimensional input
# down to 10 dimensions (the number of classes we have)
# Also, notice that I do not add the softmax layer at this point
model_dense = nn.Sequential(nn.Linear(3072, 1176), nn.ReLU(),
                            nn.Linear(1176, 400), nn.ReLU(),
                            nn.Linear(400, 120), nn.ReLU(),
                            nn.Linear(120, 84), nn.ReLU(),
                            nn.Linear(84, 10))

# This is our loss function. 
# Read this: https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html
criterion = nn.CrossEntropyLoss()
# The reason we did not add the Softmax layer at the end is that
# the loss function above applies it internally.
# It expects inputs that "contain raw, unnormalized scores for each class"
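
To see why working directly with the raw scores helps numerically, here is a `numpy` sketch of a cross entropy computed from logits with the log-sum-exp trick. It is an illustration of the idea under simplifying assumptions (mean over the batch, no class weights), not `PyTorch`'s actual implementation, and the function names are hypothetical.

```python
import numpy as np

def log_softmax(logits):
    """Row-wise log-softmax using the log-sum-exp trick."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def cross_entropy_from_logits(logits, labels):
    """Mean negative log-probability of the correct class."""
    log_probs = log_softmax(logits)
    return -log_probs[np.arange(len(labels)), labels].mean()

# Hypothetical logits for a batch of two images and ten classes
logits = np.zeros((2, 10))
logits[1, 3] = 1000.0   # exp(1000) would overflow in a naive softmax
labels = np.array([0, 3])
loss = cross_entropy_from_logits(logits, labels)
print(loss)
```

Subtracting the row maximum before exponentiating keeps every exponent at or below zero, so nothing overflows even for very large scores.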

In [None]:
# Here is the optimizer
import torch.optim as optim
optimizer = optim.SGD(model_dense.parameters(), lr=0.001, momentum=0.9)
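
For intuition, here is a sketch of the update that stochastic gradient descent with momentum performs, as I understand the `torch.optim.SGD` documentation for the default settings (no dampening, no Nesterov): the gradient is accumulated into a velocity, and the parameters move along the velocity. The helper `sgd_momentum_step` and the toy objective are hypothetical, and the learning rate is larger than the one above so that the toy problem converges quickly.

```python
import numpy as np

def sgd_momentum_step(param, velocity, grad, lr=0.001, momentum=0.9):
    """One update: accumulate the gradient into the velocity, then step."""
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity

# Toy objective f(w) = 0.5 * ||w||^2, whose gradient is simply w
w = np.array([5.0, -3.0])
v = np.zeros_like(w)
for _ in range(500):
    w, v = sgd_momentum_step(w, v, grad=w, lr=0.1, momentum=0.9)
print(np.abs(w).max())
```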

Let's train the network. This is going to take a while...

In [None]:
# How many times do you want to go over the entire dataset?
# Don't pick a very big number because you will overfit
num_epochs = 2

# Here is the main training algorithm
for epoch in range(num_epochs):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        # flatten each image to a 3072-vector (works for any batch size)
        outputs = model_dense(inputs.reshape(inputs.shape[0], -1))
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 1000 == 999:    # print every 1000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 1000))
            running_loss = 0.0

print('Finished Training')

Since training a network takes a while, it's a good idea to save the trained model:

In [None]:
torch.save(model_dense.state_dict(), 'hands-on-25-model-dense.pth')

Here it is as a file:

In [None]:
!ls -lht hands-on-25-model-dense.pth

Now let's make some predictions:

In [None]:
# Get the first four images and their labels
dataiter = iter(testloader)
images, labels = next(dataiter)

In [None]:
print(labels)

In [None]:
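# Make predictions with the net and pass them through 
# softmax to turn them into probabilities
# (named `softmax` so that it does not overwrite the `st` alias of scipy.stats)
softmax = nn.Softmax(dim=1)
predictions = softmax(model_dense(images.reshape(images.shape[0], -1)))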

In [None]:
def imshow(img, ax):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    ax.imshow(np.transpose(npimg, (1, 2, 0)))

# Plot the pictures and the predictions
for i in range(4):
    fig, ax = plt.subplots(figsize=(1,1))
    imshow(images[i], ax)
    fig2, ax2 = plt.subplots()
    ax2.bar(np.arange(10), predictions[i].detach().numpy())
    ax2.set_xticks(np.arange(10))
    ax2.set_xticklabels(classes)

Now, let's do the same thing with a convolutional neural network.
We are not going to use `nn.Sequential` this time.
Instead, we are going to use `nn.Module` to manually create the network.
The documentation is [here](https://pytorch.org/docs/stable/generated/torch.nn.Module.html).
You basically need to inherit from `nn.Module` and implement `__init__()` and `forward()`.

In [None]:
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # A convolutional layer:
        # 3 = input channels (colors),
        # 6 = output channels (features),
        # 5 = kernel size
        self.conv1 = nn.Conv2d(3, 6, 5)
        # A 2 x 2 max pooling layer - we are going to use it two times
        self.pool = nn.MaxPool2d(2, 2)
        # Another convolutional layer
        self.conv2 = nn.Conv2d(6, 16, 5)
        # Some linear layers
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # This function implements your network output
        # Convolutional layer, followed by relu, followed by max pooling
        x = self.pool(F.relu(self.conv1(x)))
        # Same thing
        x = self.pool(F.relu(self.conv2(x)))
        # Flatten the output of the convolutional layers
        x = x.view(-1, 16 * 5 * 5)
        # Go through the first dense linear layer followed by relu
        x = F.relu(self.fc1(x))
        # Through the second dense layer
        x = F.relu(self.fc2(x))
        # Finish up with a linear transformation
        x = self.fc3(x)
        return x


model_cnn = Net()
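
The number `16 * 5 * 5` in the first linear layer comes from tracking the image size through the layers. Here is a short sketch of that arithmetic, assuming the standard output-size formula for convolution and pooling without dilation. The helper `conv_out` is hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32                                          # input images are 32 x 32
size = conv_out(size, kernel=5)                    # conv1: 32 -> 28
after_pool1 = conv_out(size, kernel=2, stride=2)   # pool:  28 -> 14
size = conv_out(after_pool1, kernel=5)             # conv2: 14 -> 10
after_pool2 = conv_out(size, kernel=2, stride=2)   # pool:  10 -> 5

flat_after_pool1 = 6 * after_pool1 ** 2            # 6 channels
flat_after_pool2 = 16 * after_pool2 ** 2           # 16 channels
print(flat_after_pool1, flat_after_pool2)
```

These two flattened sizes, 1176 and 400, are the hidden layer widths we used in the dense network so that the two models have matching dimensions.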

Here is the network, followed by a new optimizer for its parameters:

In [None]:
model_cnn
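
A rough parameter count helps compare the two models. The sketch below counts weights and biases by hand from the layer sizes used in this activity. The helper names are hypothetical.

```python
def linear_params(n_in, n_out):
    return n_in * n_out + n_out                    # weights + biases

def conv_params(c_in, c_out, kernel):
    return c_in * c_out * kernel * kernel + c_out  # filters + biases

# Dense network: 3072 -> 1176 -> 400 -> 120 -> 84 -> 10
dense_sizes = [3072, 1176, 400, 120, 84, 10]
dense_total = sum(linear_params(a, b)
                  for a, b in zip(dense_sizes[:-1], dense_sizes[1:]))

# Convolutional network: two 5 x 5 convolutions, then three linear layers
cnn_total = (conv_params(3, 6, 5) + conv_params(6, 16, 5)
             + linear_params(400, 120) + linear_params(120, 84)
             + linear_params(84, 10))
print(dense_total, cnn_total)
```

The convolutional network has far fewer parameters because its two convolutional layers replace the two largest dense layers with small shared filters.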

In [None]:
optimizer = optim.SGD(model_cnn.parameters(), lr=0.001, momentum=0.9)

In [None]:
# How many times do you want to go over the entire dataset?
# Don't pick a very big number because you will overfit
num_epochs = 2

# Here is the main training algorithm
for epoch in range(num_epochs):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = model_cnn(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 1000 == 999:    # print every 1000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 1000))
            running_loss = 0.0

print('Finished Training')

Make some predictions:

In [None]:
# Make predictions with the net and pass them through 
# softmax to turn them into probabilities
# (named `softmax` so that it does not overwrite the `st` alias of scipy.stats)
softmax = nn.Softmax(dim=1)
predictions = softmax(model_cnn(images))
for i in range(4):
    fig, ax = plt.subplots(figsize=(1,1))
    imshow(images[i], ax)
    fig2, ax2 = plt.subplots()
    ax2.bar(np.arange(10), predictions[i].detach().numpy())
    ax2.set_xticks(np.arange(10))
    ax2.set_xticklabels(classes)

It doesn't work equally well for all classes.
Here is some code from the `PyTorch` tutorial to get the accuracy for each class:

In [None]:
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = model_cnn(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))
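
The loop above assumes four images per mini-batch. If you collect all predictions and labels into arrays first, the same per-class accuracy can be computed without that assumption. Here is a `numpy` sketch on made-up labels. The function `per_class_accuracy` is hypothetical.

```python
import numpy as np

def per_class_accuracy(predicted, labels, num_classes):
    """Fraction of correctly classified examples for each true class."""
    correct = np.bincount(labels[predicted == labels], minlength=num_classes)
    total = np.bincount(labels, minlength=num_classes)
    # Classes with no examples get NaN instead of a division by zero
    return np.where(total > 0, correct / np.maximum(total, 1), np.nan)

# Hypothetical labels and predictions for three classes
labels = np.array([0, 0, 1, 1, 1, 2])
predicted = np.array([0, 1, 1, 1, 0, 2])
acc = per_class_accuracy(predicted, labels, num_classes=3)
print(acc)
```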

This is not very good. There are several things that we can do.
First, we could run this for more epochs. At least 50 epochs are probably needed to train it properly.
Second, we could add data augmentation.
This can be done through transformations, see [this](https://discuss.pytorch.org/t/data-augmentation-in-pytorch/7925).
Third, we could make the network a little bit bigger.
Here is [a list of large networks trained on CIFAR10](https://github.com/kuangliu/pytorch-cifar).
It is possible to reach an accuracy of 95%.
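
To give a feel for data augmentation, here is a `numpy` sketch of one commonly used recipe for small images: a random horizontal flip followed by a random crop from a zero-padded copy. This particular recipe is an assumption made for illustration, and `random_flip_and_crop` is a hypothetical helper. In practice you would express the same idea with `torchvision` transformations such as `transforms.RandomHorizontalFlip` and `transforms.RandomCrop`.

```python
import numpy as np

def random_flip_and_crop(img, rng, padding=4):
    """Randomly mirror a C x H x W image, then take a random H x W crop
    from a zero-padded copy."""
    if rng.random() < 0.5:
        img = img[:, :, ::-1]                      # horizontal flip
    _, h, w = img.shape
    padded = np.pad(img, ((0, 0), (padding, padding), (padding, padding)))
    top = rng.integers(0, 2 * padding + 1)         # vertical offset of the crop
    left = rng.integers(0, 2 * padding + 1)        # horizontal offset of the crop
    return padded[:, top:top + h, left:left + w]

rng = np.random.default_rng(0)
image = rng.uniform(-1, 1, size=(3, 32, 32))       # hypothetical scaled image
augmented = random_flip_and_crop(image, rng)
print(augmented.shape)
```

Each pass over the data then sees slightly different versions of the same pictures, which tends to reduce overfitting.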

### Questions

+ Set the number of epochs for the CNN-based model to 40. How much better accuracy do you get? Make sure you do this right before you go to bed and look at it in the morning. Alternatively, you can go for a run...