<a href="https://colab.research.google.com/github/caocscar/workshops/blob/master/pytorch/Workshop_CNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Image Classification Problem

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import DataLoader
import torchvision
from torchvision import datasets, transforms
import numpy as np

print('Torch version', torch.__version__)
print('Torchvision version', torchvision.__version__)
print('Numpy version', np.__version__)

Torch version 1.3.1
Torchvision version 0.4.2
Numpy version 1.17.4


The following should say `cuda:0`. If it does not, we need to go to *Edit* -> *Notebook settings* and change it to a `GPU` from `None`. You only have to do this once per notebook.

In [2]:
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
device

'cpu'

Define a transform to convert image to PyTorch tensor

In [0]:
tf = transforms.ToTensor() # convert image to PyTorch tensor

Download training **dataset** and create `DataLoader`

In [0]:
train_loader = DataLoader(datasets.MNIST('data', download=True, train=True, transform=tf),
                           batch_size=100, 
                           shuffle=True)

Download validation **dataset** and create `DataLoader`


In [0]:
test_loader = DataLoader(datasets.MNIST('data', download=True, train=False, transform=tf),
                           batch_size=100, 
                           shuffle=True)

We'll write a python class to define out convolutional neural network.

In [0]:
class TwoLayerCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.batchnorm = nn.BatchNorm2d(1)
        self.conv1 = nn.Conv2d(1,4,5) # input image channel, output channels, square kernel size
        self.conv2 = nn.Conv2d(4,16,5)
        self.fc1 = nn.Linear(16*4*4,100) # fully connected, 4x4 image size result from 2 conv layers
        self.fc2 = nn.Linear(100,10)
        
    def forward(self,x):
        x1 = self.batchnorm(x)
        x1 = F.max_pool2d(F.relu(self.conv1(x1)), 2)
        x1 = F.max_pool2d(F.relu(self.conv2(x1)), 2)
        x1 = x1.view(-1, self.num_flat_features(x1))
        x1 = F.dropout(F.relu(self.fc1(x1), 0.4))
        x1 = F.relu(self.fc2(x1))
        return x1
                      
    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = np.prod(size)
        return num_features

We create an instance of this class

In [7]:
model = TwoLayerCNN().to(device)
model

TwoLayerCNN(
  (batchnorm): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (conv1): Conv2d(1, 4, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(4, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=256, out_features=100, bias=True)
  (fc2): Linear(in_features=100, out_features=10, bias=True)
)

We'll define a template for our `fit_model` function that contains `train`,  `validate`, and `accuracy` functions.

In [0]:
def fit_model(model, loss_fn, optimizer):
    def train(x,y):
        yhat = model(x)
        loss = loss_fn(yhat,y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item(), accuracy(yhat,y)
    
    def validate(x,y):
        yhat = model(x)
        loss = loss_fn(yhat,y)
        return loss.item(), accuracy(yhat,y)
    
    def accuracy(yhat,y):
        probs = np.argmax(yhat.cpu().detach().numpy(), axis=1)
        actual = y.cpu().detach().numpy()
        correct = (probs == actual).sum()
        total = y.shape[0]
        return correct / total   
    
    return train, validate

We define our *loss function*, *learning rate*, and our *optimizer*. We pass this to `fit_model` to return our `train` and `validate` functions.

In [0]:
loss_fn = nn.CrossEntropyLoss()
learning_rate = 0.01
optimizer = optim.Adagrad(model.parameters(), lr=learning_rate)
train, validate = fit_model(model, loss_fn, optimizer)

Here is our training loop with mini-batch processing. We have to move each batch onto the GPU. We also should have a `DataLoader` for the validation dataset but we'll skip that in this case since it is so small.

In [10]:
epochs = 5
for epoch in range(epochs):
    # training    
    losses, accuracy = [], []
    for i, (xbatch, ybatch) in enumerate(train_loader):
        xbatch = xbatch.to(device)
        ybatch = ybatch.to(device)
        loss, acc = train(xbatch, ybatch)
        losses.append(loss)
        accuracy.append(acc)
    training_loss = np.mean(losses)
    training_accuracy = np.mean(accuracy)
    # validation
    val_losses, val_accuracy = [], []
    for j, (xtest, ytest) in enumerate(test_loader):
        xtest = xtest.to(device)
        ytest = ytest.to(device)
        val_loss, val_acc = validate(xtest, ytest)
        val_losses.append(val_loss)
        val_accuracy.append(val_acc)
    validation_loss = np.mean(val_losses)
    validation_accuracy = np.mean(val_accuracy)
    # print intermediate results
    print(f'{epoch}, {training_loss:.4f}, {training_accuracy:.3f}, {validation_loss:.4f}, {validation_accuracy:.3f}')

0, 0.3363, 0.899, 0.1599, 0.954
1, 0.1516, 0.956, 0.1300, 0.961
2, 0.1271, 0.963, 0.1067, 0.965
3, 0.1139, 0.967, 0.1046, 0.969
4, 0.1044, 0.970, 0.0955, 0.972


### nn.Sequential

If we wanted to user the simpler `nn.Sequential` function, our model construction would have looked like this.

In [11]:
model_sequential = nn.Sequential(
    nn.BatchNorm2d(1),
    nn.Conv2d(1,4,5),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(4,16,5),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(256,100),
    nn.ReLU(),
    nn.Dropout(0.4),
    nn.Linear(100,10),
    nn.Softmax(dim=1),
).to(device)
model_sequential

Sequential(
  (0): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (1): Conv2d(1, 4, kernel_size=(5, 5), stride=(1, 1))
  (2): ReLU()
  (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (4): Conv2d(4, 16, kernel_size=(5, 5), stride=(1, 1))
  (5): ReLU()
  (6): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (7): Flatten()
  (8): Linear(in_features=256, out_features=100, bias=True)
  (9): ReLU()
  (10): Dropout(p=0.4, inplace=False)
  (11): Linear(in_features=100, out_features=10, bias=True)
  (12): Softmax(dim=1)
)