# MNIST Image Classification

## Introduction

I'm going to start with MNIST, because it's a dataset that I'm
very familiar with. On top of that, it has a continuous
distribution, so once I'm comfortable implementing a simple
convolutional neural network, I'll develop an autoencoder and a
variational autoencoder.

For this example, we will build a simple convolutional neural network
in PyTorch.

## TorchVision

The [torchvision](https://pytorch.org/docs/stable/torchvision/index.html) 
package contains utilities for many tasks in computer vision. 
These include transformations, loaders for toy datasets, and popular model
architectures. I don't intend to use the model architectures in this tutorial.
I'll just load MNIST (a popular dataset of images of handwritten digits) and
play about with it.

To begin with, we will train the model without any transformations beyond basic
normalization. While this tutorial focuses on models in PyTorch, it should be fun to
see how transforms impact MNIST. Up front, I'll hypothesize that any gains will be
marginal: the MNIST dataset is fairly free of noise when preprocessed properly. It's
worth noting that there are plenty of other datasets to play with, including QMNIST,
a reconstruction of MNIST from the original NIST data.

In [1]:
import torch
import torchvision
import torchvision.transforms as transforms
from torchvision.datasets import MNIST

## Loading and Preprocessing

As noted, we won't do any elaborate preprocessing. Instead, we just load the data
and normalize.

The image datasets are loaded as PIL images. We need to convert these to tensors and
normalize them. For this, we can apply transforms from `torchvision`.
These
[transforms](https://pytorch.org/docs/stable/torchvision/transforms.html#torchvision-transforms)
are similar to the sklearn transformers, but are specifically for images. There is
also a functional module that allows for more fine-grained control.

Users can use `Compose` to build pipelines of transformations. We will
just cast the images to tensors, then center and scale the data using statistics
we compute ourselves. This way, we don't have to make any assumptions about the
distribution of the data. Note that `Normalize` expects a sequence of per-channel
values. To normalize single-channel images, you must pass a sequence with one element.

Note that MNIST images have a single (grayscale) channel of 28x28 pixels.
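
To make the one-element requirement concrete, here is a minimal plain-Python sketch of per-channel normalization, `(x - mean[c]) / std[c]`. The helper `normalize_channels` and the tiny 2x2 image are hypothetical and exist only for illustration. This is not torchvision's implementation, just the arithmetic that `Normalize` is documented to apply to each channel.

```python
def normalize_channels(image, mean, std):
    """Normalize a [channels][rows][cols] nested list channel by channel.

    Hypothetical helper for illustration only; it mimics the per-channel
    arithmetic (x - mean[c]) / std[c], not torchvision's actual code.
    """
    if not (len(image) == len(mean) == len(std)):
        raise ValueError("need one mean and one std per channel")
    return [
        [[(pixel - m) / s for pixel in row] for row in channel]
        for channel, m, s in zip(image, mean, std)
    ]


# A made-up single-channel 2x2 "image" with values already scaled to [0, 1].
gray_image = [[[0.0, 0.5], [0.5, 1.0]]]

# One channel, so mean and std are one-element tuples (note the trailing comma).
normalized = normalize_channels(gray_image, mean=(0.5,), std=(0.5,))
print(normalized)
```

Because the statistics are paired with channels one by one, a grayscale image needs exactly one mean and one standard deviation, hence `(mu,)` rather than a bare `mu`.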

In [2]:
transform = transforms.Compose([
    transforms.ToTensor()
])
trainset = MNIST(
    root='./data/', train=True, 
    download=True, transform=transform
)

In [3]:
mu = trainset.data.float().mean() / 255
sigma = trainset.data.float().std() / 255

transform.transforms.append(
    transforms.Normalize(
        mean=(mu,), std=(sigma,)
    )
)
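
Dividing the raw statistics by 255 works because `ToTensor` rescales 8-bit pixel values from [0, 255] to [0, 1], and both the mean and the standard deviation scale linearly with the data. Here is a small check with the standard library, using made-up pixel values rather than MNIST itself:

```python
import math
import statistics

# Made-up 8-bit pixel intensities standing in for raw image data.
raw_pixels = [0, 0, 12, 128, 200, 255, 255, 64]

# The same pixels after rescaling to [0, 1], as ToTensor does for uint8 images.
scaled_pixels = [p / 255 for p in raw_pixels]

# Statistics of the raw data, divided by 255 afterwards ...
mu_from_raw = statistics.mean(raw_pixels) / 255
sigma_from_raw = statistics.stdev(raw_pixels) / 255

# ... match the statistics computed directly on the rescaled data.
mu_scaled = statistics.mean(scaled_pixels)
sigma_scaled = statistics.stdev(scaled_pixels)

print(math.isclose(mu_from_raw, mu_scaled))
print(math.isclose(sigma_from_raw, sigma_scaled))
```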

In [4]:
trainset

Dataset MNIST
    Number of datapoints: 60000
    Root location: ./data/
    Split: Train
    StandardTransform
Transform: Compose(
               ToTensor()
               Normalize(mean=(tensor(0.1307),), std=(tensor(0.3081),))
           )

In [5]:
BATCH_SIZE = 64
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=BATCH_SIZE,
    shuffle=True, num_workers=2
)

In [6]:
dataiter = iter(trainloader)
im, lab = next(dataiter)  # built-in next(); works across PyTorch versions

im.size()

torch.Size([64, 1, 28, 28])
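
The leading 64 in that shape is the batch size. As a quick sanity check on how many such batches one pass over the training set yields, here is the arithmetic, assuming the `DataLoader` default of `drop_last=False` (the loader above does not set it):

```python
import math

NUM_TRAIN_IMAGES = 60000  # "Number of datapoints" reported for the train split
BATCH_SIZE = 64

# With drop_last=False, the final partial batch is kept.
num_batches = math.ceil(NUM_TRAIN_IMAGES / BATCH_SIZE)
last_batch_size = NUM_TRAIN_IMAGES - (num_batches - 1) * BATCH_SIZE

print(num_batches)      # 938
print(last_batch_size)  # 32
```

So the last batch of each epoch holds 32 images rather than 64, which matters for any code that hard-codes the batch dimension.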