


<H1 style="text-align: center">ECMM426 - Computer Vision </H1>
<H1 style="text-align: center">Workshop 8</H1>
<H1 style="text-align: center">Neural Networks</H1>



## Neural Networks
In this workshop, we will implement a multi-layer perceptron (MLP) to classify handwritten digits. Specifically, we will train and test our model on the MNIST dataset. We will use [PyTorch](https://pytorch.org/), a widely used deep learning framework, to complete the task.

### MNIST Dataset and Data Loader

The MNIST dataset contains 70,000 grayscale images of handwritten digits in 10 classes (0 to 9), split into a training set of 60,000 images and a test set of 10,000 images. More details on the MNIST dataset can be found here: http://yann.lecun.com/exdb/mnist/

In [1]:
import torch
import torchvision
## In PyTorch, data is fed to the training loop through a DataLoader.
## torchvision provides the MNIST dataset and downloads it automatically,
## so there is no need to fetch it separately.

# Batch size: the number of samples processed in one iteration
batch_size_train = 256   # batch size for training
batch_size_test = 1024   # larger batch for evaluation, where no gradients are stored
# Define how each image is transformed: convert to a tensor, then normalize
image_transform = torchvision.transforms.Compose([torchvision.transforms.ToTensor(), 
                                                  torchvision.transforms.Normalize((0.1307,), (0.3081,))])

# MNIST is already implemented in 'torchvision.datasets', so we don't need to
# write a separate dataset class. Below is how to use it.
# Image datasets
train_dataset = torchvision.datasets.MNIST('data/', train=True, 
                                           download=True,
                                           transform=image_transform)
test_dataset = torchvision.datasets.MNIST('data/', train=False,
                                          transform=image_transform)
#data loaders
train_loader = torch.utils.data.DataLoader(train_dataset,
                                           batch_size=batch_size_train, 
                                           shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset,
                                          batch_size=batch_size_test, 
                                          shuffle=True)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz


Extracting data/MNIST/raw/train-images-idx3-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz


Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz


Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz


Extracting data/MNIST/raw/t10k-labels-idx1-ubyte.gz to data/MNIST/raw
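
The transform defined above first converts each image to a tensor with values in [0, 1] (`ToTensor`) and then subtracts a mean and divides by a standard deviation (`Normalize`). The following is a minimal sketch of that arithmetic in plain Python, assuming 8-bit input pixels; the helper name `normalize_pixel` is hypothetical and not part of torchvision.

```python
MEAN, STD = 0.1307, 0.3081  # the values passed to Normalize in the cell above

def normalize_pixel(raw_value):
    """Map a raw 8-bit pixel (0..255) to the value the model receives."""
    scaled = raw_value / 255.0      # ToTensor rescales uint8 images to [0, 1]
    return (scaled - MEAN) / STD    # Normalize: subtract mean, divide by std

background = normalize_pixel(0)     # black background pixel
ink = normalize_pixel(255)          # fully white stroke pixel
print(round(background, 4), round(ink, 4))
```

After normalization the mostly black background sits slightly below zero and the strokes sit well above it, which tends to make optimization better conditioned than feeding raw 0..255 values.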



### Model
You define trainable layers and combine them into a model. Have a look at the [PyTorch documentation](https://pytorch.org/docs/stable/index.html) to read more about the available layers and functionality. Here we implement two versions of the same network, `Net1` and `Net2`, and use one of them for classification. The two classes define the same architecture but are written differently: `Net1` registers each layer as a separate attribute and calls them one by one in `forward`, while `Net2` groups the layers in an `nn.Sequential` container.

In [2]:
## Now we can start to build our model.
## We first import the PyTorch nn module
import torch.nn as nn
## Then define the model classes
class Net1(nn.Module):
    def __init__(self, num_classes):
        super(Net1, self).__init__()
        self.linear1 = nn.Linear(in_features=784, out_features=4096, bias=True)
        self.relu1 = nn.ReLU()
        self.dropout1 = nn.Dropout()
        self.linear2 = nn.Linear(in_features=4096, out_features=4096, bias=True)
        self.relu2 = nn.ReLU()
        self.dropout2 = nn.Dropout()
        self.linear3 = nn.Linear(in_features=4096, out_features=num_classes, bias=True)

    def forward(self, x):
        # B x C x H x W -> B x C*H*W
        x = x.reshape(x.shape[0], -1)
        x = self.linear1(x)
        x = self.relu1(x)
        x = self.dropout1(x)
        x = self.linear2(x)
        x = self.relu2(x)
        x = self.dropout2(x)
        x = self.linear3(x)
        return x

class Net2(nn.Module):
    def __init__(self, num_classes):
        super(Net2, self).__init__()
        self.classifier = nn.Sequential(
            nn.Linear(in_features=784, out_features=4096, bias=True),
            nn.ReLU(),
            nn.Dropout(),
            nn.Linear(in_features=4096, out_features=4096, bias=True),
            nn.ReLU(),
            nn.Dropout(),
            nn.Linear(in_features=4096, out_features=num_classes, bias=True)
        )

    def forward(self, x):
        # B x C x H x W -> B x C*H*W
        x = x.reshape(x.shape[0], -1)
        x = self.classifier(x)
        return x
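
One way to convince yourself that the attribute style and the `nn.Sequential` style describe the same function is to copy the weights from one to the other and compare outputs in evaluation mode. The sketch below does this for a reduced version of the network; the class names `NamedLayers` and `SequentialLayers` and the hidden size of 32 are assumptions chosen to keep the check fast, not part of the workshop code.

```python
import torch
import torch.nn as nn

HIDDEN = 32  # assumption: far smaller than the workshop's 4096

class NamedLayers(nn.Module):
    """Net1 style: every layer is its own attribute."""
    def __init__(self, num_classes):
        super().__init__()
        self.linear1 = nn.Linear(784, HIDDEN)
        self.relu1 = nn.ReLU()
        self.dropout1 = nn.Dropout()
        self.linear2 = nn.Linear(HIDDEN, num_classes)

    def forward(self, x):
        x = x.reshape(x.shape[0], -1)  # B x C x H x W -> B x C*H*W
        return self.linear2(self.dropout1(self.relu1(self.linear1(x))))

class SequentialLayers(nn.Module):
    """Net2 style: the same layers grouped in nn.Sequential."""
    def __init__(self, num_classes):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(784, HIDDEN), nn.ReLU(), nn.Dropout(),
            nn.Linear(HIDDEN, num_classes),
        )

    def forward(self, x):
        x = x.reshape(x.shape[0], -1)
        return self.classifier(x)

torch.manual_seed(0)
a, b = NamedLayers(10), SequentialLayers(10)
# Copy weights: classifier[0] matches linear1, classifier[3] matches linear2
b.classifier[0].load_state_dict(a.linear1.state_dict())
b.classifier[3].load_state_dict(a.linear2.state_dict())
a.eval()
b.eval()  # eval mode disables dropout, so outputs are deterministic
x = torch.randn(4, 1, 28, 28)
with torch.no_grad():
    same = torch.allclose(a(x), b(x))
print(same)
```

The only practical difference is how parameters are named in `state_dict()` (`linear1.weight` versus `classifier.0.weight`), which matters when loading saved checkpoints.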

### Model and Optimizer Initialization
Please read the comments and understand the purpose of different lines of code.

In [3]:
import torch.optim as optim
## create model and optimizer
learning_rate = 0.0001
weight_decay = 0.0005
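# define the model
model = Net1(10)
# device: use CUDA if available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# map to device
model = model.to(device) # model.cuda()
# parameters are trainable by default; this loop just makes that explicit
for param in model.parameters():
    param.requires_grad = True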
# define optimizer
optimizer = optim.Adam(model.parameters(), lr=learning_rate, weight_decay=weight_decay)

### Meter
A meter that tracks the running average of losses and accuracies.

In [4]:
class AverageMeter(object):
    """Computes and stores the average and current value"""
    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count
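
The `n` argument of `update` matters because batches are not always the same size: the last batch of an epoch is usually smaller. The sketch below repeats a compact copy of the meter so that it runs on its own, and compares the sample-weighted average with a naive per-batch average. The loss values are made up for illustration.

```python
class AverageMeter(object):
    """Compact copy of the meter above, repeated so this example runs alone."""
    def __init__(self):
        self.val, self.avg, self.sum, self.count = 0, 0, 0, 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count

# Illustrative (made-up) batch losses: a full batch and a smaller final batch
batches = [(0.5, 256), (0.2, 96)]  # (mean loss of the batch, batch size)

meter = AverageMeter()
for batch_loss, batch_size in batches:
    meter.update(batch_loss, batch_size)

naive = sum(loss for loss, _ in batches) / len(batches)
print(meter.avg, naive)  # the weighted average leans towards the larger batch
```

Weighting by batch size makes `meter.avg` equal to the mean loss per sample, which is the quantity reported at the end of each epoch.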

### Train and Test Functions

In [5]:
import torch.nn.functional as F
from tqdm.notebook import tqdm  # tqdm.tqdm_notebook is deprecated
## define train function
def train(model, device, train_loader, optimizer):
    # meter
    loss = AverageMeter()
    # switch to train mode
    model.train()
    tk0 = tqdm(train_loader, total=int(len(train_loader)))
    for batch_idx, (data, target) in enumerate(tk0):
        data, target = data.to(device), target.to(device)  # data, target = data.cuda(), target.cuda()
        optimizer.zero_grad()
        output = model(data) # model.forward(data)
        loss_this = F.cross_entropy(output, target)
        loss_this.backward()
        optimizer.step()
        loss.update(loss_this.item(), target.shape[0])
    print('Train: Average loss: {:.4f}\n'.format(loss.avg))
        
##define test function
def test(model, device, test_loader):
    # meters
    loss = AverageMeter()
    acc = AverageMeter()
    correct = 0
    # switch to evaluation mode (disables dropout)
    model.eval()
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)  # data, target = data.cuda(), target.cuda()
            output = model(data)
            loss_this = F.cross_entropy(output, target) # mean loss over the batch
            pred = output.argmax(dim=1, keepdim=True) # index of the largest logit = predicted class
            correct_this = pred.eq(target.view_as(pred)).sum().item()
            correct += correct_this
            acc_this = correct_this/target.shape[0]*100.0
            acc.update(acc_this, target.shape[0])
            loss.update(loss_this.item(), target.shape[0])
    print('Test: Average loss: {:.4f}, Accuracy: {}/{} ({:.2f}%)\n'.format(
        loss.avg, correct, len(test_loader.dataset), acc.avg))
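
`F.cross_entropy` takes the raw outputs of the last linear layer (logits), applies a log-softmax internally, and by default returns the mean loss over the batch. The following plain-Python sketch reproduces that computation for a tiny made-up batch, together with the arg-max prediction used for accuracy; the function name `cross_entropy` and the numbers are illustrative assumptions.

```python
import math

def cross_entropy(logits, target):
    """Loss for one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# Made-up batch: 2 samples, 3 classes
batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
batch_targets = [0, 1]

losses = [cross_entropy(z, t) for z, t in zip(batch_logits, batch_targets)]
mean_loss = sum(losses) / len(losses)  # default 'mean' reduction

# Predicted class = index of the largest logit
preds = [max(range(len(z)), key=z.__getitem__) for z in batch_logits]
correct = sum(p == t for p, t in zip(preds, batch_targets))
print(mean_loss, preds, correct)
```

The first sample is classified correctly and contributes a small loss; the second puts most of its score on the wrong class and dominates the batch mean.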

### Training Loop
The training loop alternates between a training phase and a test phase in every epoch.

In [6]:
num_epoch = 5
for epoch in range(1, num_epoch + 1):
    train(model, device, train_loader, optimizer)
    test(model, device, test_loader)

Train: Average loss: 0.3609

Test: Average loss: 0.1601, Accuracy: 9514/10000 (95.14%)



Train: Average loss: 0.1541

Test: Average loss: 0.1128, Accuracy: 9671/10000 (96.71%)



Train: Average loss: 0.1095

Test: Average loss: 0.0896, Accuracy: 9721/10000 (97.21%)



Train: Average loss: 0.0853

Test: Average loss: 0.0775, Accuracy: 9771/10000 (97.71%)



Train: Average loss: 0.0720

Test: Average loss: 0.0714, Accuracy: 9786/10000 (97.86%)
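
The numbers in this output follow directly from the dataset and batch sizes. As a quick sanity check, the sketch below recomputes the number of iterations per epoch and the final accuracy, assuming the standard MNIST split of 60,000 training and 10,000 test images and the batch sizes defined earlier.

```python
import math

n_train, n_test = 60000, 10000            # standard MNIST split sizes
batch_size_train, batch_size_test = 256, 1024

train_iters = math.ceil(n_train / batch_size_train)   # iterations per epoch
test_iters = math.ceil(n_test / batch_size_test)
last_train_batch = n_train - (train_iters - 1) * batch_size_train
last_test_batch = n_test - (test_iters - 1) * batch_size_test

final_accuracy = 9786 / n_test * 100.0    # correct count from the last epoch
print(train_iters, last_train_batch, test_iters, last_test_batch, final_accuracy)
```

The smaller final batches are the reason the meters weight each update by the batch size.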



### Summary
Print the model to list its layers, then use `torchsummary` to show output shapes and parameter counts.

In [7]:
print(model)

Net1(
  (linear1): Linear(in_features=784, out_features=4096, bias=True)
  (relu1): ReLU()
  (dropout1): Dropout(p=0.5, inplace=False)
  (linear2): Linear(in_features=4096, out_features=4096, bias=True)
  (relu2): ReLU()
  (dropout2): Dropout(p=0.5, inplace=False)
  (linear3): Linear(in_features=4096, out_features=10, bias=True)
)
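
The `Dropout(p=0.5)` layers listed above behave differently in `model.train()` and `model.eval()` mode. During training each activation is zeroed with probability `p` and the survivors are scaled by `1 / (1 - p)`; during evaluation the layer is the identity. The following is a plain-Python sketch of that behaviour, not PyTorch's implementation; the function name `dropout` is an illustrative assumption.

```python
import random

def dropout(values, p=0.5, training=True):
    """Inverted dropout: zero with probability p, scale the rest by 1/(1-p)."""
    if not training or p == 0.0:
        return list(values)            # evaluation mode: identity
    keep_scale = 1.0 / (1.0 - p)
    return [0.0 if random.random() < p else v * keep_scale for v in values]

random.seed(0)
x = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(x, training=True)   # some entries zeroed, others doubled
eval_out = dropout(x, training=False)   # unchanged
print(train_out, eval_out)
```

The rescaling keeps the expected activation the same in both modes, so no extra correction is needed at test time.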


In [8]:
from torchsummary import summary
summary(model, (1, 28, 28))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Linear-1                 [-1, 4096]       3,215,360
              ReLU-2                 [-1, 4096]               0
           Dropout-3                 [-1, 4096]               0
            Linear-4                 [-1, 4096]      16,781,312
              ReLU-5                 [-1, 4096]               0
           Dropout-6                 [-1, 4096]               0
            Linear-7                   [-1, 10]          40,970
Total params: 20,037,642
Trainable params: 20,037,642
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.19
Params size (MB): 76.44
Estimated Total Size (MB): 76.63
----------------------------------------------------------------
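
The parameter counts in the table can be reproduced by hand: a linear layer with `in_features` inputs and `out_features` outputs has `in_features * out_features` weights plus `out_features` biases. The sketch below recomputes the totals, assuming 4 bytes per parameter (float32) for the size estimate; the helper name `linear_params` is hypothetical.

```python
def linear_params(in_features, out_features, bias=True):
    """Number of trainable parameters in a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)

layers = [(784, 4096), (4096, 4096), (4096, 10)]   # layer sizes from the model
counts = [linear_params(i, o) for i, o in layers]
total = sum(counts)
size_mb = total * 4 / (1024 ** 2)                  # assuming float32 parameters

print(counts, total, round(size_mb, 2))
```

Almost all of the parameters sit in the 4096 x 4096 hidden layer, which is typical for wide fully connected networks.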
