# Introduction

In this project, you will build a neural network of your own design to evaluate the MNIST dataset.

Some benchmark results on MNIST can be found [on Yann LeCun's page](https://webcache.googleusercontent.com/search?q=cache:stAVPik6onEJ:yann.lecun.com/exdb/mnist) and include:

88% [Lecun et al., 1998](https://hal.science/hal-03926082/document)

95.3% [Lecun et al., 1998](https://hal.science/hal-03926082v1/document)

99.65% [Ciresan et al., 2011](http://people.idsia.ch/~juergen/ijcai2011.pdf)


MNIST is a great dataset for sanity checking your models, since the accuracy levels achieved by large convolutional neural networks and small linear models are both quite high. This makes it important to be familiar with the data.

## Installation

In [1]:
# Update the PATH to include the user installation directory. 
import os
os.environ['PATH'] = f"{os.environ['PATH']}:/root/.local/bin"

# Restart the Kernel before you move on to the next step.

#### Important: Restart the Kernel before you move on to the next step.

In [2]:
# Install requirements
!python -m pip install -r requirements.txt

Defaulting to user installation because normal site-packages is not writeable


## Imports

In [3]:
## This cell contains the essential imports you will need – DO NOT CHANGE THE CONTENTS! ##
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader
import torchvision
import torchvision.transforms as transforms
from torchvision import datasets
import matplotlib.pyplot as plt
import numpy as np

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

## Load the Dataset

Specify your transforms as a list.
The transforms module is already loaded as `transforms`.

Fortunately, MNIST is included in the torchvision module.
You can create your dataset using the `MNIST` object from `torchvision.datasets` ([the documentation is available here](https://pytorch.org/vision/stable/datasets.html#mnist)).
Make sure to specify `download=True`!

Once your dataset is created, you'll also need to define a `DataLoader` from the `torch.utils.data` module for both the train and the test set.

## Justify your preprocessing

In your own words, why did you choose the transforms you chose? If you didn't use any preprocessing steps, why not?

The transforms I chose are `ToTensor` and `Normalize`. `ToTensor` converts each image to a tensor, and `Normalize` rescales the pixel values so that the network sees the same intensity range across the entire dataset.

## Explore the Dataset
Using matplotlib, numpy, and torch, explore the dimensions of your data.

You can view images using the `show5` function defined below – it takes a data loader as an argument.
Remember that normalized images will look really weird to you! You may want to try changing your transforms to view images.
Typically using no transforms other than `ToTensor()` works well for viewing, but not as well for training your network.
If `show5` doesn't work, go back and check your code for creating your data loaders and your training/test sets.
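
One way to explore the dimensions of the data is to inspect a single batch. The sketch below uses a hypothetical helper, `describe_batch`, and random stand-in data shaped like MNIST so that it runs without a download; with the real data you would pass your own training loader instead.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def describe_batch(loader):
    """Print and return basic facts about the first batch of a DataLoader."""
    images, labels = next(iter(loader))
    info = {
        "image_shape": tuple(images.shape),   # (batch, channels, height, width)
        "image_dtype": images.dtype,
        "pixel_min": images.min().item(),
        "pixel_max": images.max().item(),
        "label_shape": tuple(labels.shape),
        "labels_seen": sorted(labels.unique().tolist()),
    }
    for key, value in info.items():
        print(f"{key}: {value}")
    return info

# Stand-in data shaped like MNIST (assumption: 1x28x28 images, labels 0-9).
fake_images = torch.rand(32, 1, 28, 28)
fake_labels = torch.arange(32) % 10
demo_loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=8)
info = describe_batch(demo_loader)
```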

In [4]:
## This cell contains a function for showing 5 images from a dataloader – DO NOT CHANGE THE CONTENTS! ##
def show5(img_loader):
    dataiter = iter(img_loader)
    
    batch = next(dataiter)
    labels = batch[1][0:5]
    images = batch[0][0:5]
    for i in range(5):
        print(int(labels[i].detach()))
    
        image = images[i].numpy()
        plt.imshow(image.T.squeeze().T)
        plt.show()

In [5]:
# Explore data

# trainLoader is the training DataLoader created in the "Load the Dataset" step
show5(trainLoader)

def stats(img_loader):
    # Print the mean and standard deviation of the first batch,
    # averaged over the per-image statistics
    imgs, labs = next(iter(img_loader))
    imgs = imgs.to(device)
    img_mean = torch.mean(imgs, dim=(1, 2, 3))
    img_std = torch.std(imgs, dim=(1, 2, 3))
    # Average over the actual batch size instead of a hard-coded 64
    mean = img_mean.mean()
    std = img_std.mean()

    print(mean)
    print(std)

#stats(trainLoader)

NameError: name 'trainLoader' is not defined

## Build your Neural Network
Using the layers in `torch.nn` (which has been imported as `nn`) and the `torch.nn.functional` module (imported as `F`), construct a neural network based on the parameters of the dataset.
Use any architecture you like. 

*Note*: If you did not flatten your tensors in your transforms or as part of your preprocessing and you are using only `Linear` layers, make sure to use the `Flatten` layer in your network!
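
As an illustration of the note above, here is a minimal sketch of a `Linear`-only network that starts with `nn.Flatten`. The hidden size of 128 is an arbitrary assumption; only the 28 * 28 input size and the 10 output classes follow from MNIST.

```python
import torch
import torch.nn as nn

# Hypothetical layer sizes; only the 28 * 28 input and 10 outputs come from MNIST.
model = nn.Sequential(
    nn.Flatten(),              # (batch, 1, 28, 28) -> (batch, 784)
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

batch = torch.rand(4, 1, 28, 28)   # stand-in batch of four images
flattened = nn.Flatten()(batch)
logits = model(batch)
print(flattened.shape, logits.shape)
```

Without the `Flatten` layer, the first `Linear` layer would receive a 4-D tensor whose last dimension is 28 rather than 784, and the forward pass would fail with a shape mismatch.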

In [None]:
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.activation = F.relu
        self.fc1 = nn.Linear(28 * 28 * 1, 120)
        self.fc2 = nn.Linear(120, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = torch.flatten(x, 1) # flatten all dimensions except batch
        x = self.activation(self.fc1(x))
        x = self.activation(self.fc2(x))
        x = F.dropout(x, training=self.training)
        x = self.fc3(x)
        return x
    
mynet = Net()
mynet.to(device)

Specify a loss function and an optimizer, and instantiate the model.

If you use a less common loss function, please note why you chose that loss function in a comment.
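
For a multi-class task such as digit classification, `nn.CrossEntropyLoss` is a common choice. It expects raw, unnormalized scores (logits) and integer class labels, so the network should not apply softmax itself. The sketch below, using random stand-in values, shows that it matches log-softmax followed by the negative log-likelihood loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 10)               # raw scores for 5 samples, 10 classes
targets = torch.tensor([3, 1, 4, 1, 5])   # integer class labels

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, targets)

# Equivalent two-step form: log-softmax followed by negative log-likelihood.
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(loss.item(), manual.item())
```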

In [None]:
opt = optim.SGD(mynet.parameters(), lr=0.005,momentum=0.1)
crit = nn.CrossEntropyLoss()

## Running your Neural Network
Use whatever method you like to train your neural network, and ensure you record the average loss at each epoch. 
Don't forget to use `torch.device()` and the `.to()` method for both your model and your data if you are using GPU!

If you want to print your loss **during** each epoch, you can use the `enumerate` function and print the loss after a set number of batches. 250 batches works well for most people!
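
The idea of printing the loss every set number of batches can be sketched as below. The function name `train_one_epoch` and its `log_every` parameter are hypothetical, and the tiny linear model and random data are stand-ins so that the example runs on its own.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

def train_one_epoch(model, loader, criterion, optimizer, device, log_every=250):
    """Run one training epoch and return the average loss per sample."""
    model.train()
    total_loss, seen = 0.0, 0
    for i, (images, labels) in enumerate(loader, start=1):
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        # The criterion returns a batch mean, so weight it by the batch size.
        total_loss += loss.item() * labels.size(0)
        seen += labels.size(0)
        if i % log_every == 0:
            print(f"batch {i}: running average loss {total_loss / seen:.4f}")
    return total_loss / seen

# Stand-in data and model (assumptions, not the notebook's own objects).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
torch.manual_seed(0)
data = TensorDataset(torch.rand(64, 1, 28, 28), torch.randint(0, 10, (64,)))
loader = DataLoader(data, batch_size=16)
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)).to(device)
optimizer = optim.SGD(model.parameters(), lr=0.01)
avg_loss = train_one_epoch(model, loader, nn.CrossEntropyLoss(), optimizer, device, log_every=2)
```

With the real training loader, the default `log_every=250` corresponds to the interval suggested above.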

In [None]:
def digitTrainer(n_epochs):
    # Per-epoch average loss and accuracy (%) for the training and validation sets
    Training_Loss = list()
    Validation_Loss = list()
    Training_Accuracy = list()
    Validation_Accuracy = list()

    for epoch in range(n_epochs):
        # Training pass
        T_loss = 0.0
        T_correct = 0
        TPred = 0
        mynet.train()
        for i, data in enumerate(trainLoader):
            imgs, labs = data
            imgs, labs = imgs.to(device), labs.to(device)
            opt.zero_grad()
            output = mynet(imgs)
            loss = crit(output, labs)
            loss.backward()
            opt.step()
            _, pred = torch.max(output.data, 1)
            T_correct += (pred == labs).sum().item()
            # crit returns the batch mean, so weight it by the batch size
            T_loss += loss.item() * len(pred)
            TPred += len(pred)
        Training_Loss.append(T_loss / TPred)
        Training_Accuracy.append(100 * T_correct / TPred)
        print(f'Epoch {epoch+1} Training: percentage correct predictions: {Training_Accuracy[-1]:.2f}%')
        print(f'Epoch {epoch+1} Training: average loss: {Training_Loss[-1]:.4f}')

        # Validation pass: dropout disabled and no gradient tracking
        V_loss = 0.0
        V_correct = 0
        VPred = 0
        mynet.eval()
        with torch.no_grad():
            for i, data in enumerate(valLoader):
                imgs, labs = data
                imgs, labs = imgs.to(device), labs.to(device)
                output = mynet(imgs)
                loss = crit(output, labs)
                _, pred = torch.max(output.data, 1)
                V_correct += (pred == labs).sum().item()
                V_loss += loss.item() * len(pred)
                VPred += len(pred)
        Validation_Loss.append(V_loss / VPred)
        Validation_Accuracy.append(100 * V_correct / VPred)
        print(f'Epoch {epoch+1} Validation: percentage correct predictions: {Validation_Accuracy[-1]:.2f}%')
        print(f'Epoch {epoch+1} Validation: average loss: {Validation_Loss[-1]:.4f}')

    return Training_Loss, Training_Accuracy, Validation_Loss, Validation_Accuracy

n_epochs = 10
Training_Loss, Training_Accuracy, Validation_Loss, Validation_Accuracy = digitTrainer(n_epochs)

Plot the training loss (and validation loss/accuracy, if recorded).

In [None]:
plt.plot(Training_Loss, label="Training Loss")
plt.plot(Validation_Loss, label="Validation Loss")
plt.legend()
plt.show()

plt.plot(Training_Accuracy, label="Training Accuracy")
plt.plot(Validation_Accuracy, label="Validation Accuracy")
plt.legend()
plt.show()

## Testing your model
Using the previously created `DataLoader` for the test set, compute the percentage of correct predictions using the highest probability prediction. 

If your accuracy is over 90%, great work, but see if you can push a bit further! 
If your accuracy is under 90%, you'll need to make improvements.
Go back and check your model architecture, loss function, and optimizer to make sure they're appropriate for an image classification task.
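
"Highest probability prediction" means taking the class with the largest softmax probability. Because softmax preserves the ordering of the scores, taking the argmax of the raw network outputs gives the same class. A small sketch with random stand-in outputs and labels:

```python
import torch

torch.manual_seed(0)
logits = torch.randn(6, 10)            # stand-in network outputs
labels = torch.randint(0, 10, (6,))    # stand-in ground-truth labels

probs = torch.softmax(logits, dim=1)
pred_from_probs = probs.argmax(dim=1)
pred_from_logits = logits.argmax(dim=1)

# Softmax is monotonic, so both routes pick the same class.
same = torch.equal(pred_from_probs, pred_from_logits)

# Accuracy is the share of predictions that match the labels.
accuracy = 100 * (pred_from_logits == labels).sum().item() / labels.size(0)
print(same, f"{accuracy:.2f}%")
```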

In [None]:
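def test_accuracy(net):
    # `test` is the test-set DataLoader created earlier
    net.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        # Evaluate every batch instead of only the first one
        for example_data, example_labels in test:
            example_data, example_labels = example_data.to(device), example_labels.to(device)
            output = net(example_data)
            _, pred = torch.max(output, 1)
            correct += (pred == example_labels).sum().item()
            total += len(example_labels)
    print(f'Test accuracy: {100 * correct / total:.2f}%')

test_accuracy(mynet)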

## Improving your model

Once your model is done training, try tweaking your hyperparameters and training again below to improve your accuracy on the test set!
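
Momentum is one of the hyperparameters worth tweaking. As a sketch of what it does in `optim.SGD` with default settings (no dampening, no Nesterov), the example below replays the update by hand on a toy two-parameter problem: `buf = momentum * buf + grad`, then `w = w - lr * buf`. The learning rate, momentum value, and toy loss are illustrative assumptions.

```python
import torch
import torch.optim as optim

lr, momentum = 0.1, 0.5   # illustrative values

w = torch.tensor([1.0, -2.0], requires_grad=True)
optimizer = optim.SGD([w], lr=lr, momentum=momentum)

# Manual copy of the same update rule, tracked alongside the optimizer.
manual_w = w.detach().clone()
buf = torch.zeros_like(manual_w)

for step in range(3):
    optimizer.zero_grad()
    loss = (w ** 2).sum()      # toy loss whose gradient is 2 * w
    loss.backward()
    optimizer.step()

    grad = 2 * manual_w        # gradient at the pre-update weights
    buf = momentum * buf + grad
    manual_w = manual_w - lr * buf

matches = torch.allclose(w.detach(), manual_w)
print(matches)
```

A larger momentum lets earlier gradients keep contributing to each step, which often speeds up progress along directions where the gradient is consistent.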

In [None]:
# To improve my model, I increased the momentum parameter from 0.1 to 0.5 and trained for 10 more epochs.

opt = optim.SGD(mynet.parameters(), lr=0.005,momentum=0.5)
crit = nn.CrossEntropyLoss()

n_epochs =10  
Training_Loss,Training_Accuracy, Validation_Loss, Validation_Accuracy = digitTrainer(n_epochs)

plt.plot(Training_Loss, label="Training Loss")
plt.plot(Validation_Loss, label="Validation Loss")
plt.legend()
plt.show()

plt.plot(Training_Accuracy, label="Training Accuracy")
plt.plot(Validation_Accuracy, label="Validation Accuracy")
plt.legend()
plt.show()

test_accuracy(mynet)

# The model would probably improve further if part of the architecture were convolutional.
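
A convolutional variant could look like the sketch below. It was not part of the original experiments: the class name `ConvNet`, the channel counts, and the layer sizes are hypothetical, and no accuracy is claimed for it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvNet(nn.Module):
    """Hypothetical small CNN for 1x28x28 inputs; layer sizes are illustrative."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)   # keeps 28x28
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)  # keeps 14x14
        self.pool = nn.MaxPool2d(2)                               # halves height and width
        self.fc1 = nn.Linear(32 * 7 * 7, 64)
        self.fc2 = nn.Linear(64, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))   # (batch, 16, 14, 14)
        x = self.pool(F.relu(self.conv2(x)))   # (batch, 32, 7, 7)
        x = torch.flatten(x, 1)                # (batch, 32 * 7 * 7)
        x = F.relu(self.fc1(x))
        return self.fc2(x)                     # raw logits for 10 classes

convnet = ConvNet()
out = convnet(torch.rand(4, 1, 28, 28))        # stand-in batch of four images
print(out.shape)
```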

## Saving your model
Using `torch.save`, save your model for future loading.
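
A common alternative to saving the whole model object is to save only its `state_dict` and load it into a freshly constructed model. The sketch below uses a stand-in model and a temporary directory; the file name `model_state.pt` is hypothetical.

```python
import os
import tempfile
import torch
import torch.nn as nn

# Stand-in model (assumption); in the notebook this would be the trained network.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "model_state.pt")   # hypothetical file name
    torch.save(model.state_dict(), path)

    # Rebuild the same architecture, then load the saved parameters into it.
    restored = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    restored.load_state_dict(torch.load(path))
    restored.eval()

x = torch.rand(2, 1, 28, 28)
same_output = torch.allclose(model(x), restored(x))
print(same_output)
```

Saving the `state_dict` stores only the parameters, so the class definition must be available when the model is loaded again.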

In [None]:
# Save the trained network (the model instance in this notebook is `mynet`)
torch.save(mynet, 'BChick_Net.pt')