# Multilayer Perceptron

### Data

**MNIST Data**: Modified National Institute of Standards and Technology database

MNIST is a large database of handwritten digits, stored as pixel images, that is commonly used for training image processing systems.

The database contains 60,000 training images and 10,000 testing images of digits written by United States Census Bureau employees and American high school students.

In [0]:
!pip3 install torch
!pip3 install torchvision

In [0]:
import torch
import torch.nn.functional as F
from torchvision import datasets, transforms
from torch import optim

In [0]:
# Define a transform to normalize the data
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,)),
                              ])
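
To make the effect of this transform concrete: `ToTensor` scales pixel intensities to the range [0, 1], and `Normalize` with mean 0.5 and standard deviation 0.5 then computes $(x - 0.5) / 0.5$, which maps them to [-1, 1]. The sketch below reproduces that arithmetic on a hand-written tensor; the sample values are made up for illustration and stand in for a real image.

```python
import torch

# Hypothetical pixel values after ToTensor, which scales intensities to [0, 1]
pixels = torch.tensor([0.0, 0.25, 0.5, 1.0])

# Normalize((0.5,), (0.5,)) computes (x - mean) / std for the single channel
mean, std = 0.5, 0.5
normalized = (pixels - mean) / std

# Black (0.0) maps to -1, mid-grey (0.5) to 0, white (1.0) to 1
print(normalized)
```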

In [0]:
trainset = datasets.MNIST('~/.pytorch/MNIST_data/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

In [0]:
testset = datasets.MNIST('~/.pytorch/MNIST_data/', download=True, train=False, transform=transform)
# Evaluation does not depend on sample order, so the test set is not shuffled
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)

PyTorch usually processes several samples at once, as a batch. Each training vector becomes one row of the matrix $X$:

\begin{equation}
X = \begin{pmatrix}
\text{---} & \text{---} & x_1 & \text{---} & \text{---}\\
\text{---} & \text{---} & x_2 & \text{---} & \text{---}\\
 & & \vdots & & \\
\text{---} & \text{---} & x_{\text{batch size}} & \text{---} & \text{---}\\
\end{pmatrix}
\end{equation}
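
The stacking can be checked on a stand-in batch. The sketch below uses random tensors in place of real MNIST images (an assumption made so the snippet runs without downloading data) and flattens a batch of shape (64, 1, 28, 28) into a matrix $X$ with one row per sample.

```python
import torch

batch_size = 64
# Random stand-in for one batch of MNIST images: (batch, channel, height, width)
images = torch.randn(batch_size, 1, 28, 28)

# Flatten every image into a row vector of 28 * 28 = 784 values
X = images.view(images.shape[0], -1)

print(X.shape)  # torch.Size([64, 784])
# Row i of X is the flattened i-th image
print(torch.equal(X[0], images[0].flatten()))  # True
```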

When instantiated, the torch.nn.Linear class creates an object holding a randomly initialised weight matrix $W$ of dimensions (out_features, in_features) and a randomly initialised bias vector $b$ of length out_features. The values in_features and out_features are passed as arguments to the constructor. When the object is called on an input $X$ of dimensions (batch size, in_features), its forward function applies the linear transformation $XW^T + b$ and returns a matrix of dimensions (batch size, out_features).
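
The parameter shapes can be inspected directly. PyTorch stores the weight with shape (out_features, in_features), so the layer computes $XW^T + b$. The sketch below uses a random stand-in batch and a fixed seed, both chosen only for illustration, and compares the layer's output with the same product written out by hand.

```python
import torch

torch.manual_seed(0)  # fixed seed so the random initialisation is repeatable

layer = torch.nn.Linear(in_features=784, out_features=128)
print(layer.weight.shape)  # torch.Size([128, 784])
print(layer.bias.shape)    # torch.Size([128])

# Random stand-in for a batch of 64 flattened images
X = torch.randn(64, 784)

# Calling the layer invokes forward(), which computes X @ W^T + b
out = layer(X)
manual = X @ layer.weight.T + layer.bias

print(out.shape)  # torch.Size([64, 128])
# The two results agree up to floating-point rounding
print(torch.allclose(out, manual, atol=1e-5))
```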

In [0]:
# Build a neural network

# Inherits from torch.nn.Module
class MLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # First fully connected layer: 784 inputs (28 x 28 pixels), 128 outputs
        self.fc1 = torch.nn.Linear(784, 128)
        # Hidden layer with 128 inputs and 64 outputs
        self.fc2 = torch.nn.Linear(128, 64)
        # Output layer with 64 inputs and 10 outputs, one per digit class
        self.fc3 = torch.nn.Linear(64, 10)
        # Softmax over dim=1 normalises each row, so a sample's 10 scores sum to 1
        self.softmax = torch.nn.Softmax(dim=1)
        self.relu = torch.nn.ReLU()

    def forward(self, x):
        # x has dimensions (batch size, 1, 28, 28), e.g. (64, 1, 28, 28)
        # Flatten each image to get (batch size, 784)
        x = x.view(x.shape[0], -1)
        # Call the layers directly rather than through .forward()
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        x = self.fc3(x)
        # Note: torch.nn.CrossEntropyLoss applies log-softmax itself and expects
        # raw scores, so this softmax is normally omitted during training. It is
        # kept here because the recorded output below was produced with it.
        x = self.softmax(x)
        return x
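
One detail of this model is worth checking against the loss used later. `torch.nn.CrossEntropyLoss` applies log-softmax internally and expects raw, unnormalised scores (logits), whereas the `forward` method above returns softmax probabilities. The sketch below, on made-up scores rather than MNIST data, shows that the criterion on raw logits matches an explicit log-softmax followed by negative log-likelihood, and that passing probabilities instead gives a different, larger loss for these values.

```python
import torch
import torch.nn.functional as F

# Made-up logits for 2 samples and 3 classes, with their true labels
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 2])

criterion = torch.nn.CrossEntropyLoss()

# CrossEntropyLoss on raw logits ...
loss_logits = criterion(logits, labels)
# ... equals log-softmax followed by negative log-likelihood
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

# Passing probabilities instead applies softmax a second time
probs = F.softmax(logits, dim=1)
loss_probs = criterion(probs, labels)

print(loss_logits.item(), loss_manual.item(), loss_probs.item())
```

A common variant therefore returns the output of the last linear layer from `forward` and applies softmax only when probabilities are needed for reporting. The predicted class is the same either way, since softmax preserves the ordering of the scores.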

In [0]:
# Create an object of class MLP
model = MLP()

# Define the loss
criterion = torch.nn.CrossEntropyLoss()

# Optimizers require the parameters to optimize and a learning rate
optimizer = optim.SGD(model.parameters(), lr = 0.05)
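
For intuition about what the optimizer does on each step: plain SGD, without momentum or weight decay (the defaults for `optim.SGD`), updates every parameter as $w \leftarrow w - \text{lr} \cdot \nabla_w L$. The sketch below applies one such step to a single made-up parameter with a toy loss $L(w) = w^2$; it is an illustration of the update rule, not part of the MLP training.

```python
import torch
from torch import optim

# A single made-up parameter and a toy loss L(w) = w^2, whose gradient is 2w
w = torch.nn.Parameter(torch.tensor([1.0]))
optimizer = optim.SGD([w], lr=0.05)

optimizer.zero_grad()   # clear any stale gradients
loss = (w ** 2).sum()
loss.backward()         # w.grad is now 2 * 1.0 = 2.0
optimizer.step()        # w <- 1.0 - 0.05 * 2.0 = 0.9

print(w.item())
```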

In [8]:
epochs = 100
for e in range(epochs):
    print("Epoch ", (e + 1))
    # Sum of the per-batch mean losses over one pass through the training set
    running_loss = 0
    for images, labels in trainloader:
        optimizer.zero_grad()
        # Calling the model invokes forward(); the outputs are softmax probabilities
        output = model(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    # Evaluate on the test set after every epoch, without tracking gradients
    accuracy = 0
    with torch.no_grad():
        for images, labels in testloader:
            ps = model(images)
            # Index of the most probable class for each sample
            top_p, top_class = ps.topk(1, dim=1)
            equals = top_class == labels.view(*top_class.shape)
            # Fraction of correct predictions in this batch
            accuracy += torch.mean(equals.type(torch.FloatTensor))
    print("Loss on training set: ", running_loss)
    print("Accuracy on test set: ", end=" ")
    # Average of the per-batch accuracies
    print((accuracy / len(testloader)).numpy())

Epoch  1
Loss on training set:  1842.2246737480164
Accuracy on test set:  0.71616244
Epoch  2
Loss on training set:  1532.93612074852
Accuracy on test set:  0.89938295
Epoch  3
Loss on training set:  1475.395410656929
Accuracy on test set:  0.9111266
Epoch  4
Loss on training set:  1462.8611141443253
Accuracy on test set:  0.9138137
Epoch  5
Loss on training set:  1454.5161266326904
Accuracy on test set:  0.9193869
Epoch  6
Loss on training set:  1448.6384273767471
Accuracy on test set:  0.92157644
Epoch  7
Loss on training set:  1443.8615982532501
Accuracy on test set:  0.9263535
Epoch  8
Loss on training set:  1439.0422087907791
Accuracy on test set:  0.93162817
Epoch  9
Loss on training set:  1435.063784480095
Accuracy on test set:  0.9369029
Epoch  10
Loss on training set:  1431.1449321508408
Accuracy on test set:  0.9403862
Epoch  11
Loss on training set:  1427.2730686664581
Accuracy on test set:  0.9426752
Epoch  12
Loss on training set:  1424.5146894454956
Accuracy on test set: 