# THE MNIST DATABASE of handwritten digits

The MNIST database of handwritten digits, available from [here](http://yann.lecun.com/exdb/mnist/), has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.

It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting.

Four files are available on this site:
```
train-images-idx3-ubyte.gz:  training set images
train-labels-idx1-ubyte.gz:  training set labels
t10k-images-idx3-ubyte.gz:   test set images
t10k-labels-idx1-ubyte.gz:   test set labels
```
These files are not in any standard image format. You have to write your own (very simple) program to read them. So, let's download, load and prepare the MNIST dataset.


In [1]:
from pathlib import Path
import requests
import gzip
import numpy as np

data_path = Path("data") / "mnist" / "raw"
data_path.mkdir(parents=True, exist_ok=True)

mnist_url = "http://yann.lecun.com/exdb/mnist/"
train_images = "train-images-idx3-ubyte.gz"
train_labels = "train-labels-idx1-ubyte.gz"
test_images = "t10k-images-idx3-ubyte.gz"
test_labels = "t10k-labels-idx1-ubyte.gz"

for filename in [train_images, train_labels, test_images, test_labels]:
    if not (data_path / filename).exists():
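        response = requests.get(mnist_url + filename)
        response.raise_for_status()  # fail loudly instead of saving an error page
        (data_path / filename).write_bytes(response.content)  # writes and closes the file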

with gzip.open((data_path / train_images).as_posix(), "r") as f:
    x_train = np.frombuffer(f.read(), dtype=np.uint8, offset=16).reshape((-1, 28, 28))
with gzip.open((data_path / train_labels).as_posix(), "r") as f:
    y_train = np.frombuffer(f.read(), dtype=np.uint8, offset=8)
with gzip.open((data_path / test_images).as_posix(), "r") as f:
    x_test = np.frombuffer(f.read(), dtype=np.uint8, offset=16).reshape((-1, 28, 28))
with gzip.open((data_path / test_labels).as_posix(), "r") as f:
    y_test = np.frombuffer(f.read(), dtype=np.uint8, offset=8)

``x_train`` and ``x_test`` are ``uint8`` arrays of grayscale image data with shapes ``(num_samples, 28, 28)``. ``y_train`` and ``y_test`` are ``uint8`` arrays of digit labels (integers in range 0-9) with shapes ``(num_samples,)``.
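
The `offset=16` and `offset=8` arguments used when loading skip the IDX header: a 4-byte magic number (two zero bytes, a type code, and the number of dimensions) followed by one 4-byte big-endian integer per dimension. The following is a minimal sketch of a header-aware reader, tried on synthetic buffers rather than the real files; the name `parse_idx` and the fake data are illustrative assumptions, not part of the original notebook.

```python
import struct
import numpy as np


def parse_idx(raw: bytes) -> np.ndarray:
    """Parse an uncompressed IDX buffer of unsigned bytes into an array."""
    # Magic number: two zero bytes, a type code, and the number of dimensions.
    zero, type_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or type_code != 0x08:
        raise ValueError("expected an IDX buffer of unsigned bytes")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    shape = struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])
    header_size = 4 + 4 * ndim  # 16 for images (3 dims), 8 for labels (1 dim)
    return np.frombuffer(raw, dtype=np.uint8, offset=header_size).reshape(shape)


# Synthetic stand-in for an image file: 2 blank images of 28x28 pixels.
fake_images = struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28) + bytes(2 * 28 * 28)
# Synthetic stand-in for a label file: 2 labels.
fake_labels = struct.pack(">HBBI", 0, 0x08, 1, 2) + bytes([5, 0])

print(parse_idx(fake_images).shape)  # (2, 28, 28)
print(parse_idx(fake_labels))        # [5 0]
```

Reading the shape from the header avoids hard-coding both the offsets and the `28 x 28` image size.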

Let's normalize the image samples from integers (in range 0-255) to floating-point numbers (in range 0.0-1.0):


In [2]:
x_train, x_test = x_train / 255.0, x_test / 255.0
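
As a small illustration of what this division does, the sketch below applies it to a tiny made-up array. Dividing a `uint8` array by a Python float yields `float64`, which is why an explicit `float32` conversion is still useful afterwards.

```python
import numpy as np

# Three made-up pixel intensities covering the uint8 range.
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

scaled = pixels / 255.0                 # true division promotes uint8 to float64
print(scaled.dtype)                     # float64
print(scaled.min(), scaled.max())       # 0.0 1.0

scaled32 = scaled.astype(np.float32)    # halves the memory footprint
print(scaled32.dtype)                   # float32
```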

PyTorch operates on ``torch.Tensor`` objects rather than NumPy arrays, so we need to convert our data.


In [3]:
import torch

In [4]:
x_train = torch.tensor(x_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.int64)
x_test = torch.tensor(x_test, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.int64)


PyTorch’s [TensorDataset](https://pytorch.org/docs/stable/_modules/torch/utils/data/dataset.html#TensorDataset) is a Dataset wrapping tensors. By defining a length and way of indexing, this also gives us a way to iterate, index, and slice along the first dimension of a tensor. This will make it easier to access both the independent and dependent variables in the same line as we train.

PyTorch’s [DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader) is responsible for managing batches. You can create a ``DataLoader`` from any ``Dataset``, and iterating over it yields one batch at a time.


In [5]:
from torch.utils.data import TensorDataset, DataLoader

In [6]:
train_set = TensorDataset(x_train, y_train)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True, num_workers=0)

test_set = TensorDataset(x_test, y_test)
test_loader = DataLoader(test_set, batch_size=32, shuffle=False, num_workers=0)  # shuffling is not needed for evaluation
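
To see what indexing a ``TensorDataset`` and iterating a ``DataLoader`` return, here is a minimal sketch on synthetic tensors (100 blank images standing in for MNIST, an assumption made so the example runs without downloading anything).

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Synthetic stand-in for MNIST: 100 blank 28x28 "images" with labels 0-9.
images = torch.zeros(100, 28, 28)
labels = torch.arange(100) % 10

dataset = TensorDataset(images, labels)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

print(len(dataset))  # 100 samples
print(len(loader))   # 4 batches: 32 + 32 + 32 + 4

x, y = dataset[0]            # indexing returns one (image, label) pair
xb, yb = next(iter(loader))  # iterating returns (images, labels) batches
print(xb.shape, yb.shape)    # torch.Size([32, 28, 28]) torch.Size([32])
```

The last batch is smaller than ``batch_size`` whenever the dataset size is not a multiple of it.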

Note that there is an easier way to download, load, and prepare the MNIST dataset and create a ``DataLoader`` from it, using ``torchvision``:

```python
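import torch
from torchvision import datasets, transforms

# ToTensor scales pixels to [0, 1]; Normalize then standardizes them with the given mean and std.
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])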
train_set = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_set = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

train_loader = torch.utils.data.DataLoader(train_set, batch_size=1, shuffle=True, num_workers=0)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=1, shuffle=True, num_workers=0)
```
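
The arithmetic behind that transform pipeline can be sketched in plain NumPy: scale to the range 0.0-1.0, then subtract the mean and divide by the standard deviation. The constants are the ones passed to ``transforms.Normalize`` above; the helper name `normalize` is a hypothetical stand-in, not a torchvision function.

```python
import numpy as np

MEAN, STD = 0.1307, 0.3081  # the values passed to transforms.Normalize above


def normalize(pixels_uint8: np.ndarray) -> np.ndarray:
    """Mimic the value range of ToTensor followed by Normalize (illustrative only)."""
    scaled = pixels_uint8.astype(np.float32) / 255.0  # 0-255 -> 0.0-1.0
    return (scaled - MEAN) / STD                      # standardize


out = normalize(np.array([0, 255], dtype=np.uint8))
print(out)  # black maps to roughly -0.42, white to roughly 2.82
```

With this transform the inputs are no longer confined to the range 0.0-1.0, unlike the manual division by 255 used earlier.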

Now, let's build the ``Sequential`` model by stacking layers:


In [7]:
from torch import nn
from torch import optim

In [8]:
class Lambda(nn.Module):
    def __init__(self, func):
        super().__init__()
        self.func = func

    def forward(self, x):
        return self.func(x)

model = nn.Sequential(
    Lambda(lambda x: x.view(x.size(0), -1)),
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(128, 10)
)

print(model)

Sequential(
  (0): Lambda()
  (1): Linear(in_features=784, out_features=128, bias=True)
  (2): ReLU()
  (3): Dropout(p=0.2, inplace=False)
  (4): Linear(in_features=128, out_features=10, bias=True)
)
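
The printed layer sizes determine how many trainable parameters the model has. A quick worked count, using a hypothetical helper `linear_params`:

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Number of weights plus biases in a fully connected layer."""
    return in_features * out_features + out_features


hidden = linear_params(28 * 28, 128)  # 784 * 128 + 128 = 100480
output = linear_params(128, 10)       # 128 * 10 + 10  = 1290
print(hidden + output)                # 101770
```

``Lambda``, ``ReLU`` and ``Dropout`` contribute no parameters, so this is the total for the model.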


Now, we choose an optimizer and loss function for training.


In [9]:
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())

The [``nn.CrossEntropyLoss``](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss) loss combines [``nn.LogSoftmax()``](https://pytorch.org/docs/stable/generated/torch.nn.LogSoftmax.html#torch.nn.LogSoftmax) and [``nn.NLLLoss()``](https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html#torch.nn.NLLLoss) in one single class.
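
This equivalence can be checked numerically. The sketch below uses the functional forms of these losses on random logits (made-up data) and compares the one-step and two-step results.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)           # raw model outputs for 4 samples
targets = torch.tensor([3, 0, 9, 1])  # class indices

combined = F.cross_entropy(logits, targets)
two_step = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(torch.allclose(combined, two_step))  # True
```

This is also why the model ends with a plain ``nn.Linear`` layer: the loss applies the softmax itself, so the network should output raw logits.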


In [10]:
def accuracy(outs, labels):
    outputs = np.argmax(outs, axis=1)
    return np.sum(outputs==labels) / float(labels.size)
metrics = {'accuracy': accuracy}
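
A worked example of the same ``accuracy`` function on a few made-up scores, with three classes instead of ten to keep it readable:

```python
import numpy as np


def accuracy(outs, labels):
    outputs = np.argmax(outs, axis=1)  # predicted class = index of the largest score
    return np.sum(outputs == labels) / float(labels.size)


scores = np.array([[0.1, 2.0, -1.0],   # predicts class 1
                   [1.5, 0.2, 0.3],    # predicts class 0
                   [0.0, 0.1, 0.9],    # predicts class 2
                   [0.4, 0.3, 0.2]])   # predicts class 0
labels = np.array([1, 0, 2, 2])

print(accuracy(scores, labels))  # 0.75
```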

Now, let's train our model.


In [11]:
def fit(model, optimizer, loss_fn, train_dl, metrics, epoches=1):
    mtr_sum = []
    model.train()
    for epoch in range(epoches):
        for x, y in train_dl:
            optimizer.zero_grad()
            preds = model(x)
            loss = loss_fn(preds, y)
            loss.backward()
            optimizer.step()
            summary_batch = {metric: metrics[metric](preds.detach().numpy(), y.detach().numpy()) for metric in metrics}
            summary_batch['loss'] = loss.item()
            mtr_sum.append(summary_batch)
        metrics_mean = {metric: np.mean([mtr[metric] for mtr in mtr_sum]) for metric in mtr_sum[0]}
        metrics_string = " ; ".join("{}: {:09.6f}".format(k, v) for k, v in metrics_mean.items())
        print(f"Epoch {epoch+1} - ", metrics_string)

In [12]:
fit(model, optimizer, loss_fn, train_loader, metrics, epoches=10)

Epoch 1 -  accuracy: 00.909300 ; loss: 00.321321
Epoch 2 -  accuracy: 00.931658 ; loss: 00.236484
Epoch 3 -  accuracy: 00.942900 ; loss: 00.194619
Epoch 4 -  accuracy: 00.950079 ; loss: 00.168788
Epoch 5 -  accuracy: 00.955283 ; loss: 00.150335
Epoch 6 -  accuracy: 00.959075 ; loss: 00.136530
Epoch 7 -  accuracy: 00.961964 ; loss: 00.126071
Epoch 8 -  accuracy: 00.964506 ; loss: 00.117018
Epoch 9 -  accuracy: 00.966650 ; loss: 00.109446
Epoch 10 -  accuracy: 00.968472 ; loss: 00.103121
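
When reading these numbers, note that ``fit`` creates ``mtr_sum`` once, before the epoch loop, and never resets it. Each printed line is therefore a running average over all batches seen so far, not the average for that epoch alone. The sketch below contrasts the two on hypothetical per-batch accuracies.

```python
import numpy as np

# Hypothetical per-batch accuracies for three epochs (two batches each).
batches_per_epoch = [[0.80, 0.90], [0.92, 0.94], [0.96, 0.98]]

history = []  # plays the role of mtr_sum: it is never reset
for epoch, batch_values in enumerate(batches_per_epoch, start=1):
    history.extend(batch_values)
    running = np.mean(history)         # what fit prints
    per_epoch = np.mean(batch_values)  # average over this epoch only
    print(f"Epoch {epoch}: running={running:.4f} per-epoch={per_epoch:.4f}")
```

Because early, weaker epochs stay in the average, the printed values lag behind the model's current training accuracy.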


Now that our model is trained, it's time to evaluate its performance on the held-out test set.


In [13]:
def evaluate(model, loss_fn, test_dl, metrics):
    mtr_sum = []
    model.eval()
    with torch.no_grad():
        for x, y in test_dl:
            preds = model(x)
            loss = loss_fn(preds, y)
            summary_batch = {metric: metrics[metric](preds.detach().numpy(), y.detach().numpy()) for metric in metrics}
            summary_batch['loss'] = loss.item()
            mtr_sum.append(summary_batch)
        metrics_mean = {metric: np.mean([mtr[metric] for mtr in mtr_sum]) for metric in mtr_sum[0]}
        metrics_string = " ; ".join("{}: {:09.6f}".format(k, v) for k, v in metrics_mean.items())
        print(metrics_string)

In [14]:
evaluate(model, loss_fn, test_loader, metrics)

accuracy: 00.980431 ; loss: 00.071889
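
One subtlety: ``evaluate`` averages the per-batch metrics with equal weight. With 10,000 test samples and a batch size of 32, the last batch holds only 16 samples, so its samples count slightly more than the others. The sketch below uses exaggerated, hypothetical batch results to show the difference from a sample-weighted average.

```python
import numpy as np

# Hypothetical per-batch results: (number of correct predictions, batch size).
batches = [(31, 32), (30, 32), (8, 16)]

correct = np.array([c for c, _ in batches])
sizes = np.array([n for _, n in batches])

mean_of_batch_means = np.mean(correct / sizes)  # what evaluate computes
sample_weighted = correct.sum() / sizes.sum()   # accuracy over all samples

print(round(mean_of_batch_means, 4), round(sample_weighted, 4))
```

On the real test set the effect is small, since only one of the 313 batches is short.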


You can see that the image classifier reaches ~98% accuracy on the test set.


## Convolutional Neural Networks

Now, let's change our model to take advantage of convolutional neural networks. We will use PyTorch’s predefined [nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d) class as our convolutional layer.


In [15]:
conv_model = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * 26 * 26, 128),
    nn.ReLU(),
    nn.Linear(128, 10)
)
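
The ``32 * 26 * 26`` input size of the first linear layer follows from the convolution's output shape. A 3x3 kernel with no padding and stride 1 trims one pixel from each border, turning 28 into 26. The helper below applies the output-size formula from the ``nn.Conv2d`` documentation; the name `conv2d_out` is an illustrative assumption.

```python
def conv2d_out(size: int, kernel_size: int, stride: int = 1,
               padding: int = 0, dilation: int = 1) -> int:
    """Output length along one spatial dimension of a 2D convolution."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


side = conv2d_out(28, kernel_size=3)
print(side)              # 26
print(32 * side * side)  # 21632 features after flattening 32 channels
```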

Now, we choose an optimizer and loss function for training.


In [16]:
conv_loss_fn = nn.CrossEntropyLoss()
conv_optimizer = optim.Adam(conv_model.parameters())

We need to reshape the images so that the channel dimension, of size ``1``, comes before their height and width.


In [17]:
x_train = x_train.view(-1, 1, x_train.size(1), x_train.size(2))
x_test = x_test.view(-1, 1, x_test.size(1), x_test.size(2))

train_set = TensorDataset(x_train, y_train)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True, num_workers=0)

test_set = TensorDataset(x_test, y_test)
test_loader = DataLoader(test_set, batch_size=32, shuffle=False, num_workers=0)  # shuffling is not needed for evaluation
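
As a quick check of the reshape, the sketch below runs it on a small random batch (synthetic data, not MNIST) and passes the result through a convolution with the same settings as the model's first layer.

```python
import torch
from torch import nn

batch = torch.rand(8, 28, 28)  # synthetic stand-in for 8 grayscale images

# Same reshape as in the cell above: insert a channel dimension of size 1.
with_channel = batch.view(-1, 1, batch.size(1), batch.size(2))
print(with_channel.shape)                             # torch.Size([8, 1, 28, 28])
print(torch.equal(with_channel, batch.unsqueeze(1)))  # True: unsqueeze(1) is equivalent

conv = nn.Conv2d(1, 32, kernel_size=3)
print(conv(with_channel).shape)                       # torch.Size([8, 32, 26, 26])
```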

Now, we'll use the same ``fit`` function that we developed above to train our new model.


In [18]:
fit(conv_model, conv_optimizer, conv_loss_fn, train_loader, metrics, epoches=10)

Epoch 1 -  accuracy: 00.947083 ; loss: 00.178761
Epoch 2 -  accuracy: 00.964367 ; loss: 00.118494
Epoch 3 -  accuracy: 00.972544 ; loss: 00.090331
Epoch 4 -  accuracy: 00.977596 ; loss: 00.073194
Epoch 5 -  accuracy: 00.981197 ; loss: 00.061295
Epoch 6 -  accuracy: 00.983819 ; loss: 00.052630
Epoch 7 -  accuracy: 00.985774 ; loss: 00.046186
Epoch 8 -  accuracy: 00.987300 ; loss: 00.041166
Epoch 9 -  accuracy: 00.988456 ; loss: 00.037310
Epoch 10 -  accuracy: 00.989485 ; loss: 00.034013


Also, we'll use the same ``evaluate`` function that we developed above to test our model on an unseen dataset.


In [19]:
evaluate(conv_model, conv_loss_fn, test_loader, metrics)

accuracy: 00.981130 ; loss: 00.106488


The convolutional classifier also reaches ~98% accuracy on the test set (0.981, versus 0.980 for the fully connected model), although its test loss is higher (0.106 versus 0.072).
