# Image recognition with MNIST Dataset

The MNIST dataset is one of the most well-studied datasets for image recognition.

**Objective**
Experiment with various deep learning techniques to see their effects and understand what they mean.

I will iteratively change one thing at a time and observe its effect on training accuracy and validation (test) accuracy. This is just an exercise to see how tinkering with each cog works, and the approach is advisable only for learning purposes.

**Model v1**

Made the mistake of using the training set for validation instead of the test set.


## Imports

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets
from torchvision import transforms

dtype = torch.float
device = 'cuda' if torch.cuda.is_available() else 'cpu'

## Load in data

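Let's load the data. Thankfully, the torchvision library has built-in functions to download the dataset and load it.
We will apply only minimal transformations: a small random rotation for training images, plus conversion to tensors and normalization for both splits.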

In [2]:
data_transforms = {
    'train' : transforms.Compose(
    [transforms.RandomRotation(10),
     transforms.ToTensor(),
     transforms.Normalize((0.1307,), (0.3081,))
    ]),
    'val' : transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.1307,), (0.3081,))
    ])
}

img_datasets = {'train': datasets.MNIST(root='./data', train=True, download=True, transform=data_transforms['train']),
                'val' : datasets.MNIST(root='./data', train=False, download=True, transform=data_transforms['val'])
                }
    
dataloaders_dict = {'train': torch.utils.data.DataLoader(img_datasets['train'], batch_size=64, 
                                                       shuffle=True, num_workers=4),
                    'val': torch.utils.data.DataLoader(img_datasets['val'], batch_size=64, 
                                                       shuffle=False, num_workers=4)
                    }
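
As a rough illustration of what `transforms.Normalize((0.1307,), (0.3081,))` does to each pixel once `ToTensor()` has scaled it to the range [0, 1], here is a minimal pure-Python sketch. The helper name `normalize_pixel` is made up for this example; the two constants are the ones used in the cell above.

```python
MNIST_MEAN = 0.1307  # mean passed to transforms.Normalize above
MNIST_STD = 0.3081   # std passed to transforms.Normalize above


def normalize_pixel(value, mean=MNIST_MEAN, std=MNIST_STD):
    """Hypothetical helper: the per-pixel arithmetic of Normalize, (x - mean) / std."""
    return (value - mean) / std


black = normalize_pixel(0.0)  # background pixel
white = normalize_pixel(1.0)  # brightest stroke pixel
print(f"black -> {black:.4f}, white -> {white:.4f}")
```

A pixel exactly at the mean maps to zero, so the mostly dark background ends up slightly negative and the strokes strongly positive.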

## Network definition

Here we define our network: two convolutional layers followed by two fully connected layers, with dropout in between.

In [3]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
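        # 1 input image channel (grayscale), 10 output channels, 5x5 square convolution kernel
        self.conv1 = nn.Conv2d(1, 10, 5)
        # 10 input channels, 20 output channels, 5x5 square convolution kernel
        self.conv2 = nn.Conv2d(10, 20, 5)
        # channel-wise dropout applied to the conv2 feature maps
        self.conv2_drop = nn.Dropout2d()
        # affine operations: y = Wx + b (320 = 20 channels * 4 * 4 after pooling)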
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)
        
    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

net = Net().to(device)  # keep the parameters on the same device as the batches
print(net)

Net(
  (conv1): Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1))
  (conv2_drop): Dropout2d(p=0.5)
  (fc1): Linear(in_features=320, out_features=50, bias=True)
  (fc2): Linear(in_features=50, out_features=10, bias=True)
)
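
To see where the `320` in `nn.Linear(320, 50)` and `x.view(-1, 320)` comes from, the sketch below traces the spatial size of a 28x28 MNIST image through the two conv/pool stages. The helper names `conv_out` and `pool_out` are made up for this example; they implement the standard output-size formula for unpadded, non-dilated layers.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Output width/height of max pooling whose stride equals its kernel size."""
    return (size - kernel) // kernel + 1


size = 28                              # MNIST images are 28x28
size = pool_out(conv_out(size, 5), 2)  # conv1: 28 -> 24, pool: 24 -> 12
size = pool_out(conv_out(size, 5), 2)  # conv2: 12 -> 8,  pool: 8 -> 4
flat_features = 20 * size * size       # 20 channels come out of conv2
print(size, flat_features)
```

If the kernel sizes or channel counts change, the input size of `fc1` has to be recomputed the same way.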


## Training & Testing

Next we will create functions that run one training pass and one evaluation pass over the data.

In [4]:
def train(model, device, train_loader, optimizer, criterion, epoch):
    model.train()
    running_corrects = 0
    running_loss = 0.0
    #optimizer = sgdr(optimizer, epoch+1)
    for data, target in train_loader:
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        _, preds = torch.max(output, 1)
        loss.backward()
        optimizer.step()
        
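        # accumulate the number of correct predictions and the per-sample loss total
        running_corrects += torch.sum(preds == target).item()
        running_loss += loss.item() * target.size(0)
    # 'Acc' is the raw count of correct predictions and 'Loss' is the summed (not mean) loss
    print('Train Epoch: {} Acc: {} Loss: {:.6f}'.format(epoch, running_corrects, running_loss))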
            
def test(model, device, test_loader, criterion):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
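            # criterion returns the mean loss of the batch, so this accumulates batch means
            test_loss += criterion(output, target).item()
            pred = output.max(1, keepdim=True)[1] # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    # Dividing summed batch means by the dataset size yields a value roughly
    # batch_size times smaller than the true per-sample average loss.
    test_loss /= len(test_loader.dataset)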
    print('Test set: Average loss: {:.4f}, Accuracy: {}/{} ({:.2f}%)'.format(
        test_loss, correct, len(test_loader.dataset),100. * correct / len(test_loader.dataset)))
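
One detail worth spelling out: `train` multiplies each batch loss by `target.size(0)`, while `test` adds the batch loss as-is and then divides by `len(test_loader.dataset)`. Assuming the criterion uses its default `reduction='mean'`, the two bookkeeping styles differ by about a factor of the batch size. The pure-Python sketch below shows this on made-up per-sample losses; both function names are hypothetical.

```python
def batch_means_over_dataset(per_sample_losses, batch_size):
    """Mimics test(): add each batch's mean loss, then divide by the sample count."""
    total = 0.0
    for start in range(0, len(per_sample_losses), batch_size):
        batch = per_sample_losses[start:start + batch_size]
        total += sum(batch) / len(batch)
    return total / len(per_sample_losses)


def per_sample_mean(per_sample_losses, batch_size):
    """Weights each batch mean by the batch size first, as train() does."""
    total = 0.0
    for start in range(0, len(per_sample_losses), batch_size):
        batch = per_sample_losses[start:start + batch_size]
        total += (sum(batch) / len(batch)) * len(batch)
    return total / len(per_sample_losses)


losses = [0.5] * 128  # made-up data: every sample has loss 0.5
print(batch_means_over_dataset(losses, 64), per_sample_mean(losses, 64))
```

This would explain why the "Average loss" values printed by `test` look so small next to the training loss.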

In [5]:
def run_model():
    num_epochs = 20
    learning_rate = 3e-4
    optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate, weight_decay=0.001)
    # anneal the learning rate along a cosine curve over the whole run
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)
    # Net already returns log-probabilities (log_softmax), so NLLLoss is the matching criterion
    criterion = nn.NLLLoss()
    for epoch in range(num_epochs):
        train(net, device, dataloaders_dict['train'], optimizer, criterion, epoch)
        test(net, device, dataloaders_dict['val'], criterion)
        scheduler.step()  # advance the schedule once per epoch
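
To get a feel for what `CosineAnnealingLR` does to the learning rate over the run, here is a small sketch of the closed-form cosine schedule. It assumes `eta_min=0` (the scheduler's default) and one `scheduler.step()` per epoch, as in `run_model`. The function name `cosine_annealing_lr` is made up for this example and is not part of PyTorch.

```python
import math


def cosine_annealing_lr(epoch, base_lr=3e-4, t_max=20, eta_min=0.0):
    """Closed-form cosine annealing: base_lr at epoch 0, eta_min at epoch t_max."""
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max)) / 2


for epoch in range(0, 21, 5):
    print(f"epoch {epoch:2d}: lr = {cosine_annealing_lr(epoch):.2e}")
```

The rate starts at `3e-4`, is halved at the midpoint, and reaches zero at the end of the 20 epochs, so later epochs make progressively smaller updates.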

Now let's run the model and see what we get.

In [6]:
run_model()

Train Epoch: 0 Acc: 43722 Loss: 50006.551396
Test set: Average loss: 0.0028, Accuracy: 9493/10000 (94.93%)
Train Epoch: 1 Acc: 52880 Loss: 23677.654387
Test set: Average loss: 0.0018, Accuracy: 9652/10000 (96.52%)
Train Epoch: 2 Acc: 54343 Loss: 19175.770447
Test set: Average loss: 0.0015, Accuracy: 9704/10000 (97.04%)
Train Epoch: 3 Acc: 54972 Loss: 16829.553546
Test set: Average loss: 0.0012, Accuracy: 9762/10000 (97.62%)
Train Epoch: 4 Acc: 55577 Loss: 15030.259017
Test set: Average loss: 0.0011, Accuracy: 9773/10000 (97.73%)
Train Epoch: 5 Acc: 55813 Loss: 14244.480922
Test set: Average loss: 0.0010, Accuracy: 9798/10000 (97.98%)
Train Epoch: 6 Acc: 56146 Loss: 13305.230465
Test set: Average loss: 0.0010, Accuracy: 9804/10000 (98.04%)
Train Epoch: 7 Acc: 56256 Loss: 12892.700940
Test set: Average loss: 0.0009, Accuracy: 9817/10000 (98.17%)
Train Epoch: 8 Acc: 56475 Loss: 12052.413511
Test set: Average loss: 0.0009, Accuracy: 9813/10000 (98.13%)
Train Epoch: 9 Acc: 56663 Loss: 11667

And that's it.
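
The training lines in the log report a raw count of correct predictions and a summed loss, which makes them hard to compare with the test lines. A minimal sketch for converting them, assuming the standard 60,000-image MNIST training split (the helper name `summarize_epoch` is made up for this example):

```python
TRAIN_SIZE = 60000  # number of images in the MNIST training split


def summarize_epoch(correct, summed_loss, n=TRAIN_SIZE):
    """Turn a correct-prediction count and a summed loss into accuracy (%) and mean loss."""
    return 100.0 * correct / n, summed_loss / n


# Values copied from the "Train Epoch: 0" line of the log
acc, mean_loss = summarize_epoch(43722, 50006.551396)
print(f"train accuracy: {acc:.2f}%, mean loss: {mean_loss:.4f}")
```

Read this way, training accuracy in epoch 0 is well below the test accuracy of the same epoch. Part of that gap is expected, since dropout and random rotation are active only during training.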