# Lecture 2 - Basic Neural Network Example

We are now up to speed with the PyTorch APIs that enable automatic differentiation and weight updates, so we can apply what we learned to a real dataset. Let us see how to train a neural network to classify handwritten digits using the MNIST dataset.

In [None]:
import time 
import torch
import torchvision

import numpy as np
import matplotlib.pyplot as plt

from torch import nn, optim
from torchvision import datasets, transforms

PyTorch has a sister package called `torchvision`, which includes a variety of datasets, and `torch.utils.data` provides a `DataLoader` class that batches and shuffles them for us. Image transformations are specified once, through the dataset's `transform` argument, and are applied to each sample as it is loaded, so there is no need to preprocess individual samples before training.

MNIST is a dataset of $28\times 28$ grayscale images of handwritten digits. There are 70,000 images in total: 60,000 for training and 10,000 for testing. We shall use `torchvision.datasets` to load the MNIST dataset into memory.

In [None]:
torch.manual_seed(13)

N_train = 64
N_test = 256

# We wrap each dataset in torch.utils.data.DataLoader, which handles batching
# and shuffling for us.
# Calling torchvision.datasets.MNIST() downloads the MNIST dataset (if needed)
# and applies the transforms we specify. Here, the transforms first convert
# each image to a PyTorch tensor and then normalize it with a given mean and
# standard deviation. Normalizing does: image = (image - mean) / std.
# We also shuffle the data by passing shuffle=True.

train_loader = torch.utils.data.DataLoader(
  torchvision.datasets.MNIST('../Datasets/', train=True, download=True,
                             transform=torchvision.transforms.Compose([
                               torchvision.transforms.ToTensor(),
                               torchvision.transforms.Normalize(
                                 (0.1307,), (0.3081,))
                             ])),
  batch_size=N_train, shuffle=True)

test_loader = torch.utils.data.DataLoader(
  torchvision.datasets.MNIST('../Datasets/', train=False, download=True,
                             transform=torchvision.transforms.Compose([
                               torchvision.transforms.ToTensor(),
                               torchvision.transforms.Normalize(
                                 (0.1307,), (0.3081,))
                             ])),
  batch_size=N_test, shuffle=True)
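
To make the `Normalize` step concrete, here is a minimal sketch that applies `(image - mean) / std` by hand. The random `image` tensor is an assumption standing in for a `ToTensor()` output (values in $[0, 1]$); it is not a real MNIST sample.

```python
import torch

torch.manual_seed(0)

# Stand-in for one ToTensor() output: a 1x28x28 image with values in [0, 1].
image = torch.rand(1, 28, 28)

# The constants passed to Normalize in the loaders above.
mean, std = 0.1307, 0.3081

# This is the arithmetic Normalize performs on a single-channel image.
normalized = (image - mean) / std

# Where the extremes of the input range end up after normalization.
black = (0.0 - mean) / std   # a fully black pixel
white = (1.0 - mean) / std   # a fully white pixel
print("black ->", round(black, 4), " white ->", round(white, 4))
```

The shape is unchanged; only the value range shifts, so black pixels become slightly negative and white pixels land a little below 3.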

Let's look at a small subset of the test dataset to get familiar with MNIST. Python's built-in `enumerate` function works on any iterable, including a `DataLoader`, and calling `next` on the resulting iterator returns one batch of data. Recall that we set the test batch size to `N_test = 256`, so we expect a batch of 256 tensors of shape $1\times 28 \times 28$, since PyTorch stores image batches in $NCHW$ format by default.

In [None]:
test_subset = enumerate(test_loader)
batch_idx, (one_batch_of_test_subset_x, one_batch_of_test_subset_y) = next(test_subset)

In [None]:
one_batch_of_test_subset_x.shape
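
As a rough illustration of the $NCHW$ layout, the sketch below builds a zero tensor with the same shape as one test batch (an assumption, so that nothing needs to be downloaded) and flattens it the way the training loop later does.

```python
import torch

N_test = 256

# Stand-in for one batch from test_loader: N images, C=1 channel, H=W=28.
fake_batch = torch.zeros(N_test, 1, 28, 28)

# Keep the batch dimension and merge C, H and W into one axis of 1*28*28 = 784.
flat = fake_batch.view(fake_batch.shape[0], -1)

print(fake_batch.shape)
print(flat.shape)
```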

We can use `matplotlib` to plot some of the images and corresponding labels inside the test data subset.

In [None]:
fig = plt.figure()
for i in range(6):
    plt.subplot(2,3,i+1)
    plt.tight_layout()
    plt.imshow(one_batch_of_test_subset_x[i][0], cmap='gray', interpolation='none')
    plt.title("Ground Truth: {}".format(one_batch_of_test_subset_y[i]))

In [None]:
def view_classify(img, ps, cmap='gray'):
    ''' 
    Function for viewing an image and its predicted classes.
    Source: https://github.com/amitrajitbose/handwritten-digit-recognition
    '''
    ps = ps.data.numpy().squeeze()

    fig, (ax1, ax2) = plt.subplots(figsize=(6,9), ncols=2)
    ax1.imshow(img.resize_(1, 28, 28).numpy().squeeze(), cmap=cmap)
    ax1.axis('off')
    ax2.barh(np.arange(10), ps)
    ax2.set_aspect(0.1)
    ax2.set_yticks(np.arange(10))
    ax2.set_yticklabels(np.arange(10))
    ax2.set_title('Class Probability')
    ax2.set_xlim(0, 1.1)
    plt.tight_layout()

Let us now build a fully connected network with 4 layers (counting the input layer, which has no module of its own in PyTorch): a 784-dimensional input, two hidden layers with 128 and 64 units, and a 10-unit output layer.

In [None]:
input_size = 784
hidden_sizes = [128, 64]
output_size = 10

model = nn.Sequential(nn.Linear(input_size, hidden_sizes[0]),
                      nn.ReLU(),
                      nn.Linear(hidden_sizes[0], hidden_sizes[1]),
                      nn.ReLU(),
                      nn.Linear(hidden_sizes[1], output_size),
                      nn.LogSoftmax(dim=1))
print(model)
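
To get a feel for the size of this network, the following sketch rebuilds the same architecture and counts its trainable parameters. Each `nn.Linear(in, out)` holds `in * out` weights plus `out` biases; the variable names are illustrative.

```python
from torch import nn

# Same architecture as above, rebuilt so the snippet runs on its own.
model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 64),
                      nn.ReLU(),
                      nn.Linear(64, 10),
                      nn.LogSoftmax(dim=1))

# List every parameter tensor with its shape.
for name, param in model.named_parameters():
    print(name, tuple(param.shape))

# Count by asking PyTorch, and again by hand (weights + biases per layer).
total = sum(param.numel() for param in model.parameters())
by_hand = (784 * 128 + 128) + (128 * 64 + 64) + (64 * 10 + 10)
print(total, by_hand)
```

`ReLU` and `LogSoftmax` contribute no parameters, so only the three `Linear` layers appear in the listing.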

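We will define the loss function as the negative log-likelihood, `nn.NLLLoss` (https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html). It expects log-probabilities as input, which is why the model ends with a `LogSoftmax` layer.

We will use Stochastic Gradient Descent (SGD) with momentum as the optimizer.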
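
The pairing of `LogSoftmax` with `NLLLoss` can be checked numerically. The sketch below uses random logits and made-up labels (both assumptions, not MNIST data) and compares three routes to the same number: `NLLLoss` on log-probabilities, `CrossEntropyLoss` on raw logits, and the formula written out by hand.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)

# Made-up scores for 4 samples and 10 classes, with made-up labels.
logits = torch.randn(4, 10)
labels = torch.tensor([3, 1, 7, 0])

# Route 1: log-softmax followed by NLLLoss (what this lecture uses).
log_probs = F.log_softmax(logits, dim=1)
nll = nn.NLLLoss()(log_probs, labels)

# Route 2: CrossEntropyLoss applied directly to the raw logits.
ce = nn.CrossEntropyLoss()(logits, labels)

# Route 3: by hand, the mean of -log p(correct class) over the batch.
manual = -log_probs[torch.arange(4), labels].mean()

print(nll.item(), ce.item(), manual.item())
```

All three values agree up to floating-point error, so a model without the final `LogSoftmax` would pair with `nn.CrossEntropyLoss` instead.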

In [None]:
loss_fn = nn.NLLLoss() # also called criterion sometimes.
optimizer = optim.SGD(model.parameters(), lr=0.003, momentum=0.9)
start = time.time()

NUM_EPOCHS = 5
for EPOCH in range(NUM_EPOCHS):
    running_loss = 0
    for images, labels in train_loader:
        # Flatten MNIST images into a 784 long vector
        images = images.view(images.shape[0], -1)
    
        # Training pass
        optimizer.zero_grad()
        
        output = model(images)
        loss = loss_fn(output, labels)
        
        #This is where the model learns by backpropagating
        loss.backward()
        
        #And optimizes its weights here
        optimizer.step()
        
        running_loss += loss.item()
    # All batches of this epoch are done; report the mean loss per batch.
    print("Epoch {} - Training loss: {}".format(EPOCH, running_loss/len(train_loader)))
print("\nTraining Time (in minutes) =",(time.time()-start)/60)
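
To see what `optimizer.step()` does when `momentum` is set, here is a small sketch that runs two SGD steps on a single weight with loss $w^2$ and repeats the arithmetic by hand. The toy loss and the learning rate of 0.1 are assumptions chosen to keep the numbers readable; they are not the lecture's settings.

```python
import torch
from torch import optim

lr, momentum = 0.1, 0.9

# One scalar weight, optimized by PyTorch.
w = torch.tensor([1.0], requires_grad=True)
optimizer = optim.SGD([w], lr=lr, momentum=momentum)

# The same update tracked by hand: buf = momentum * buf + grad; w -= lr * buf.
w_manual, buf = 1.0, 0.0

history = []
for step in range(2):
    optimizer.zero_grad()
    loss = (w ** 2).sum()      # d(loss)/dw = 2w
    loss.backward()
    optimizer.step()

    grad = 2 * w_manual
    buf = momentum * buf + grad
    w_manual -= lr * buf

    history.append((w.item(), w_manual))

for step, (w_torch, w_hand) in enumerate(history):
    print(step, round(w_torch, 4), round(w_hand, 4))
```

The second step moves further than plain gradient descent would, because the buffer still carries the first step's gradient.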

#### Model Inference

In [None]:
images, labels = next(iter(test_loader))

img = images[0].view(1, 784)
with torch.no_grad():
    logps = model(img)

ps = torch.exp(logps)
probab = list(ps.numpy()[0])
print("Predicted Digit =", probab.index(max(probab)))
view_classify(img.view(1, 28, 28), ps, cmap='gray_r')
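
The `torch.exp` call is only needed to display probabilities. As a quick sketch with made-up log-probabilities (an assumption, not the trained model's output), exponentiating gives values that sum to 1, and the predicted class is the same whether we take the argmax before or after exponentiating.

```python
import torch

torch.manual_seed(0)

# Made-up log-probabilities for one image over 10 classes.
logps = torch.log_softmax(torch.randn(1, 10), dim=1)

# Exponentiate to recover probabilities.
ps = torch.exp(logps)
print("sum of probabilities:", ps.sum().item())

# exp is monotonic, so the most likely class does not change.
pred_from_probs = ps.argmax(dim=1).item()
pred_from_logps = logps.argmax(dim=1).item()
print(pred_from_probs, pred_from_logps)
```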

In [None]:
correct_count, all_count = 0, 0
for images,labels in test_loader:
    for i in range(len(labels)):
        img = images[i].view(1, 784)
        # Turn off gradients to speed up this part
        with torch.no_grad():
            logps = model(img)

        # The network outputs log-probabilities; exponentiate to get probabilities
        ps = torch.exp(logps)
        probab = list(ps.numpy()[0])
        pred_label = probab.index(max(probab))
        true_label = labels.numpy()[i]
        if(true_label == pred_label):
            correct_count += 1
        all_count += 1

print("Number Of Images Tested =", all_count)
print("\nModel Accuracy =", (correct_count/all_count))
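
The loop above scores one image at a time. A possible alternative is to score a whole batch per forward pass and compare `argmax` predictions with the labels. The sketch below is one way to write that; `batch_accuracy` is a hypothetical helper name, and the toy "model" and synthetic data are assumptions used only to check the logic without MNIST.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def batch_accuracy(model, loader):
    """Fraction of samples whose argmax prediction matches the label."""
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in loader:
            flat = images.view(images.shape[0], -1)   # (N, 784)
            preds = model(flat).argmax(dim=1)         # one prediction per image
            correct += (preds == labels).sum().item()
            total += labels.shape[0]
    return correct / total


# Synthetic check: 8 blank images, each with one bright pixel whose column
# index encodes a class. The last two are deliberately encoded as class 9.
labels = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7])
encoded = labels.clone()
encoded[-2:] = 9
images = torch.zeros(8, 1, 28, 28)
images[torch.arange(8), 0, 0, encoded] = 1.0

# Toy stand-in for a model: use the first 10 pixels as class scores.
toy_model = lambda flat: flat[:, :10]

loader = DataLoader(TensorDataset(images, labels), batch_size=4)
accuracy = batch_accuracy(toy_model, loader)
print("accuracy:", accuracy)
```

With the trained network, `batch_accuracy(model, test_loader)` should report the same accuracy as the per-image loop while doing far fewer forward passes.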

#### Saving and Loading trained models

We can save the whole model by using `torch.save`.

In [None]:
torch.save(model, 'mnist.pytorch')

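We can use `torch.load` to load the model back. Let us load it into another variable. Note that recent PyTorch releases default `torch.load` to `weights_only=True`, in which case a fully pickled model like this one only loads if you pass `weights_only=False`; do that only for files you trust.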

In [None]:
model_loaded = torch.load('mnist.pytorch')
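
A commonly used alternative to pickling the whole model is to save only its `state_dict` (the parameter tensors) and load it into a freshly built model. The sketch below assumes the same architecture as above; `build_model` and the file name `mnist_state.pt` are hypothetical, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

import torch
from torch import nn


def build_model():
    # Same architecture as the lecture's model.
    return nn.Sequential(nn.Linear(784, 128),
                         nn.ReLU(),
                         nn.Linear(128, 64),
                         nn.ReLU(),
                         nn.Linear(64, 10),
                         nn.LogSoftmax(dim=1))


torch.manual_seed(0)
model = build_model()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "mnist_state.pt")  # hypothetical file name

    # Save only the parameters, not the Python objects around them.
    torch.save(model.state_dict(), path)

    # Rebuild the architecture, then copy the saved parameters into it.
    restored = build_model()
    restored.load_state_dict(torch.load(path))

restored.eval()

# Both models should now give identical outputs for the same input.
x = torch.randn(2, 784)
with torch.no_grad():
    same = torch.allclose(model(x), restored(x))
print("outputs match:", same)
```

The trade-off is that the code that defines the architecture must be available at load time, but the saved file no longer depends on how the model class was pickled.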

In [None]:
correct_count, all_count = 0, 0
for images,labels in test_loader:
    for i in range(len(labels)):
        img = images[i].view(1, 784)
        # Turn off gradients to speed up this part
        with torch.no_grad():
            logps = model_loaded(img)

        # The network outputs log-probabilities; exponentiate to get probabilities
        ps = torch.exp(logps)
        probab = list(ps.numpy()[0])
        pred_label = probab.index(max(probab))
        true_label = labels.numpy()[i]
        if(true_label == pred_label):
            correct_count += 1
        all_count += 1

print("Number Of Images Tested =", all_count)
print("\nModel Accuracy =", (correct_count/all_count))