**Importing Required Packages**

In [1]:
import torch
import torchvision
from torch import nn, optim

import torch.nn.functional as F
from torchsummary import summary

**Configuring Model**

In [2]:
# Model Configs
batch_size = 64
learning_rate = 0.01
cross_entropy = nn.CrossEntropyLoss()

**Load MNIST Data**

In [3]:
# Data loaders for the MNIST training and validation (test) splits
transform = torchvision.transforms.ToTensor()
train_data = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST(
        'mnist_data', train=True, download=True, transform=transform
    ),
    batch_size=batch_size,
)
val_data = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST(
        'mnist_data', train=False, download=True, transform=transform
    ),
    batch_size=batch_size,
)
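
As a quick sanity check on the loaders above, the number of mini-batches per epoch follows from the dataset sizes. The sketch below assumes the standard MNIST split of 60,000 training and 10,000 test images and the `DataLoader` default of keeping the last partial batch (`drop_last=False`); `batches_per_epoch` is a helper name introduced here for illustration only.

```python
import math


def batches_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of mini-batches when the final partial batch is kept."""
    return math.ceil(num_samples / batch_size)


batch_size = 64
train_batches = batches_per_epoch(60_000, batch_size)  # assumed MNIST training split size
val_batches = batches_per_epoch(10_000, batch_size)    # assumed MNIST test split size
print(train_batches, val_batches)
```

With a batch size of 64, neither split divides evenly, so the last batch of each epoch is smaller than the others.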

**Define Validation Function**

This function computes the model's accuracy (in percent) on the given validation data.

In [4]:
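# Validation function
def validate(model, data):
    """Return the accuracy (in percent) of `model` on the batches in `data`."""
    total = 0
    correct = 0
    # Gradients are not needed for evaluation
    with torch.no_grad():
        for images, labels in data:
            images = images.cuda()
            labels = labels.cuda()
            y_pred = model(images)
            # Predicted class = index of the highest score in each row
            _, pred = torch.max(y_pred, 1)
            total += labels.size(0)
            correct += (pred == labels).sum().item()
    return correct * 100 / total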
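
To make the accuracy computation concrete, here is a plain-Python sketch of what `validate` does for a single batch: take the index of the largest score in each row as the predicted class, then count how many predictions match the labels. The scores and labels are made-up illustrative values, and `batch_accuracy` is a hypothetical helper that is not part of the notebook.

```python
def batch_accuracy(scores, labels):
    """Return (correct, total) for one batch of per-class scores."""
    # Predicted class = index of the largest score in each row,
    # the same quantity torch.max(y_pred, 1) returns as its second value.
    preds = [max(range(len(row)), key=row.__getitem__) for row in scores]
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    return correct, len(labels)


# Illustrative 3-class scores for a batch of four samples (made-up values).
scores = [
    [0.1, 2.0, -1.0],
    [1.5, 0.3, 0.2],
    [-0.5, 0.0, 0.7],
    [0.9, 1.1, 1.0],
]
labels = [1, 0, 2, 2]

correct, total = batch_accuracy(scores, labels)
accuracy = correct * 100 / total
print(accuracy)
```

Summing `correct` and `total` over all batches before dividing, as `validate` does, gives the right overall accuracy even when the last batch is smaller than the rest.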

**Define Training Function**

Function to train the model on the training data

In [5]:
# Training function
def train(model, epochs=5):
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    for n in range(epochs):
        for images, labels in train_data:
            images = images.cuda()
            labels = labels.cuda()
            optimizer.zero_grad()
            prediction = model(images)
            loss = cross_entropy(prediction, labels)
            loss.backward()
            optimizer.step()
        # `loss` is the loss of the last mini-batch of the epoch, not an epoch average
        accuracy = float(validate(model, val_data))
        print("Epoch:", n+1, "Loss: ", loss.item(), "Accuracy:", accuracy)

**Define Model**

In [6]:
# Model
class ANN(nn.Module):
    def __init__(self):
        super().__init__()
        self.dense_1 = nn.Linear(in_features=784, out_features=256)
        self.dense_2 = nn.Linear(in_features=256, out_features=10)
        self.relu = nn.ReLU()

    def forward(self, x):
        # Flatten each 1x28x28 image into a 784-dimensional vector
        x = x.view(x.shape[0], -1)
        x = self.relu(self.dense_1(x))
        x = self.dense_2(x)
        # nn.CrossEntropyLoss applies log-softmax itself, so this call is
        # redundant but harmless: log-softmax is idempotent
        output = F.log_softmax(x, dim=1)
        return output
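
The model returns log-probabilities, while `nn.CrossEntropyLoss` expects raw scores and applies log-softmax internally. The loss is still the intended one, because log-softmax leaves already-normalized log-probabilities unchanged. The plain-Python sketch below illustrates this on made-up scores; `log_softmax` and `cross_entropy_from_scores` are illustrative helpers written for this example, not the PyTorch implementations.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax of a list of scores."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]


def cross_entropy_from_scores(scores, target):
    """Negative log-probability of the target class (log-softmax, then NLL)."""
    return -log_softmax(scores)[target]


logits = [2.0, -1.0, 0.5]   # made-up raw scores for three classes
once = log_softmax(logits)
twice = log_softmax(once)   # applying it a second time changes nothing

loss_from_logits = cross_entropy_from_scores(logits, target=0)
loss_from_log_probs = cross_entropy_from_scores(once, target=0)
print(loss_from_logits, loss_from_log_probs)
```

Returning the raw output of `dense_2` from `forward` would therefore be expected to give the same loss and the same predicted classes, with one fewer operation per forward pass.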

**Create Model Instance**

In [7]:
# Create the model and move it to the GPU
model = ANN().cuda()

**Model Summary**

In [8]:
# Summary
summary(model, (1, 28, 28))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Linear-1                  [-1, 256]         200,960
              ReLU-2                  [-1, 256]               0
            Linear-3                   [-1, 10]           2,570
================================================================
Total params: 203,530
Trainable params: 203,530
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.78
Estimated Total Size (MB): 0.78
----------------------------------------------------------------
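
The parameter counts in the summary can be reproduced by hand: a fully connected layer has one weight per input-output pair plus one bias per output. A short sketch of that arithmetic follows; `linear_params` is a helper name introduced here, and the size estimate assumes 4-byte (float32) parameters.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights (in * out) plus biases (out) of a fully connected layer."""
    return in_features * out_features + out_features


dense_1 = linear_params(784, 256)   # 28 * 28 input pixels -> 256 hidden units
dense_2 = linear_params(256, 10)    # 256 hidden units -> 10 digit classes
total = dense_1 + dense_2

# Assuming float32 parameters (4 bytes each), expressed in units of 1024**2 bytes.
size_mb = total * 4 / (1024 ** 2)
print(dense_1, dense_2, total, round(size_mb, 2))
```

Almost all of the parameters sit in the first layer, since every one of the 784 input pixels is connected to every hidden unit.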


**Train the Model**

In [9]:
# Train for 30 Epochs
train(model,epochs=30)

Epoch: 1 Loss:  0.021140538156032562 Accuracy: 93.81999969482422
Epoch: 2 Loss:  0.023417435586452484 Accuracy: 93.75999450683594
Epoch: 3 Loss:  0.02844458445906639 Accuracy: 95.31999969482422
Epoch: 4 Loss:  0.01106941420584917 Accuracy: 96.38999938964844
Epoch: 5 Loss:  0.0032071738969534636 Accuracy: 95.39999389648438
Epoch: 6 Loss:  0.009068711660802364 Accuracy: 96.04999542236328
Epoch: 7 Loss:  0.0013761625159531832 Accuracy: 95.97000122070312
Epoch: 8 Loss:  0.0003605204983614385 Accuracy: 96.45999908447266
Epoch: 9 Loss:  8.415226329816505e-05 Accuracy: 96.36000061035156
Epoch: 10 Loss:  0.003738099941983819 Accuracy: 96.79000091552734
Epoch: 11 Loss:  0.030550900846719742 Accuracy: 96.79000091552734
Epoch: 12 Loss:  0.00013384586782194674 Accuracy: 96.11000061035156
Epoch: 13 Loss:  0.0003037812712136656 Accuracy: 96.79000091552734
Epoch: 14 Loss:  0.1114736795425415 Accuracy: 96.83999633789062
Epoch: 15 Loss:  1.3038490465078212e-07 Accuracy: 96.70999908447266
Epoch: 16 Loss

We can see here that this model has even fewer parameters and that training is much faster. The printed loss keeps fluctuating even after reaching values close to 0; note that it is the loss of the last mini-batch in each epoch, not an average over the epoch.
Theoretically, the accuracy of a plain fully connected neural network should be lower than that of a convolutional neural network, because flattening the image discards spatial information.
Here the model still reaches a decent accuracy (about 96-97% on the validation data), because MNIST does not require complex features to be learned.
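
Because `train` prints the loss of the final mini-batch only, an epoch-level average would give a steadier signal. Below is a minimal sketch of a sample-weighted epoch average, using made-up batch losses and a hypothetical `epoch_mean_loss` helper that is not part of the notebook.

```python
def epoch_mean_loss(batch_losses, batch_sizes):
    """Sample-weighted mean of per-batch mean losses over one epoch."""
    total_loss = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total_loss / sum(batch_sizes)


# Made-up per-batch losses; the last batch is smaller, like a final partial batch.
batch_losses = [0.30, 0.10, 0.02, 0.50]
batch_sizes = [64, 64, 64, 32]

last_batch_loss = batch_losses[-1]
mean_loss = epoch_mean_loss(batch_losses, batch_sizes)
print(last_batch_loss, round(mean_loss, 4))
```

In the training loop, the same quantity could be accumulated as `loss.item() * labels.size(0)` per batch and divided by the number of samples seen at the end of the epoch.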