# PyTorch and Classification Test
Finally some real Neural Networks

## Section 1: Dataset and Dataloaders (5 min)
In this section you will implement a torchvision Dataset and a DataLoader for the MNIST dataset.

In [None]:
import numpy as np     # <----- Our old friend

import torch            # |  Our new best friends -- this is the main PyTorch library
import torch.nn as nn   # |  This is just shortening the name of this module since we're gonna use it a lot -- this is the one that has neural network objects (nn.modules)
import torchvision      # |  This is for importing the vision datasets we'll use
from torch.utils.data import Dataset, DataLoader, random_split, Subset # | These are particular objects that we use to load our data (and shuffle it and whatnot) we'll talk more about these later
import torchvision.transforms as tt # | Allows us to transform our data while we load it (or after) such as rotating, flipping, occluding, etc.

import torch.nn.functional as F # | This is for functional (stateless) operations -- for example if I wanted to apply a sigmoid as a plain function, not as a neural net object (gradients still flow through it)


from torchvision.utils import make_grid  # |   Utility stuff for plotting
import matplotlib.pyplot as plt          # |  <- I use this one a lot for plotting, seaborn is a good alternative
from matplotlib.image import imread      # |  it reads images... (png -> usable input (like a numpy array for ex))

import random

## MNIST
You remember our good friend MNIST? Well, it's about time that you two get acquainted. MNIST is a dataset of 60,000 training and 10,000 testing images of handwritten digits, with (human-made) labels of which digit is written. You can find the official MNIST website [here](http://yann.lecun.com/exdb/mnist/) and a description that looks a little less 80's [here](https://deepai.org/dataset/mnist).

Notably this dataset has no colors -- and because of how (relatively) simple this dataset is we consider this to be the "hello world" of vision datasets.

Here's what the dataset looks like to us humans:

<center><img src="https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Ftheanets.readthedocs.io%2Fen%2Fstable%2F_images%2Fmnist-digits-small.png&f=1&nofb=1" width="350" height="250" /></center>

In this assignment you will be loading your own dataset in order to practice working with *Datasets* in PyTorch. To iterate over your data, PyTorch has two very helpful mechanisms: ```Dataset``` and ```DataLoader```. A ```Dataset``` is a type of PyTorch class which makes it easier to access your data.

Here's a reference API that you should 100% use for [Datasets and Dataloaders](https://pytorch.org/vision/0.8/datasets.html#mnist)

You can create an object of a ```Dataset()``` class, and then access the size of the data set as ```len(dataset)```.  You can access the actual data at any given index by calling ```dataset[index]```. 
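
As a minimal sketch of that interface, the toy class below implements the two methods a map-style ```Dataset``` needs. The class name ```ToyDigits``` and the random data are assumptions made for illustration only; they are not part of the assignment.

```python
import torch
from torch.utils.data import Dataset


class ToyDigits(Dataset):
    """Hypothetical dataset: random 1x28x28 'images' with labels in 0..9."""

    def __init__(self, n_samples=100):
        generator = torch.Generator().manual_seed(0)
        self.images = torch.rand(n_samples, 1, 28, 28, generator=generator)
        self.labels = torch.randint(0, 10, (n_samples,), generator=generator)

    def __len__(self):
        # Called by len(dataset)
        return len(self.labels)

    def __getitem__(self, index):
        # Called by dataset[index]; returns a (sample, target) pair
        return self.images[index], int(self.labels[index])


toy_ds = ToyDigits()
print(len(toy_ds))
image, label = toy_ds[3]
print(image.shape, label)
```

The torchvision MNIST dataset follows the same pattern, so ```len(dataset)``` and ```dataset[index]``` behave the same way there.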

You can also apply *transformations* to the dataset, which are created by calling some predefined functions in PyTorch (or Torchvision) called [transforms](https://pytorch.org/vision/stable/transforms.html). For this data and HW the only transform you need to use is ```ToTensor()```, which will give us our data as a Tensor (a generalization of scalar, vector, and matrix with arbitrary dimensions), which will allow us to do gradient descent with PyTorch.

Once you create a ```Dataset``` class with your needed transformations, you can feed it into a ```DataLoader```. A ```DataLoader``` is an iterable over the ```Dataset```, which is just a fancy term for a helpful Python class that can run loops over your data without much code. For example, if your ```Dataset``` returns a (sample, target) pair, then you can iterate over the ```DataLoader``` as:

```
train_ds = MyDatasetClass()
train_dl = DataLoader(train_ds, batch_size=batch_size)
for input, output in train_dl:
    ...
```
Remember that the first dimension of ```input``` and ```output``` is set by the ```batch_size``` that you selected earlier!
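
Here is a quick sketch of how the batch size shows up in the tensor shapes. It uses ```TensorDataset``` with all-zero fake data (an assumption for illustration) so that nothing needs to be downloaded.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 10 fake samples, each a 1x28x28 image with an integer label
images = torch.zeros(10, 1, 28, 28)
labels = torch.arange(10)
toy_ds = TensorDataset(images, labels)
toy_dl = DataLoader(toy_ds, batch_size=4, shuffle=False)

# Collect the shape of every batch the loader yields
batch_shapes = [(x.shape, y.shape) for x, y in toy_dl]
for x_shape, y_shape in batch_shapes:
    print(x_shape, y_shape)
```

With 10 samples and a batch size of 4, the loader yields batches of 4, 4, and 2 samples. The last batch is smaller unless you pass ```drop_last=True```.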

In [None]:
def load_mnist(batch_size=32, train=True):

    '''
    Using the dataset and dataloader classes you should be able to make an MNIST set and loader
    the loader should use the 'batch_size' argument and the dataset should use 'train'

    Also, the 'ToTensor' transform is given, you should set the transform of the dataset to just this
    '''

    to_tensor_transform = torchvision.transforms.ToTensor()
    # TODO create a dataset and then dataloader object for MNIST using
    # the torchvision library

    #############################################
    
    dataset = None
    dataloader = None

    ##############################################

    return dataset, dataloader

In [None]:
def plot_image_and_label(image, label):
    
    '''
    Takes in an image and label and shows them using matplotlib 
    this is used to visualize the data and also the outputs of our network
    '''

    plt.imshow(image)
    if type(label) is not int:
        _,predicted = torch.max(label,1)
        plt.title("Best label = " + str(predicted.item()) + ", with Score: " + str(round(label[0][predicted].item() * 100,2)))
    else:
        plt.title("Label = " + str(label))
    plt.show()
    return

In [None]:
# TEST

train_dataset, train_dataloader = load_mnist(batch_size=1, train=True)
ex_image, ex_label = train_dataset[random.randint(0,1000)]
plot_image_and_label(ex_image.reshape(28,28), ex_label)

## Section 2: Torch Modules (10 min)

PyTorch is a library that lets you define neural network objects (```nn.Module```s) for whatever operations your network needs, select a loss function, and then automatically calculate the gradient for you.

PyTorch is built around modules (called ```nn.Module```), which consist of 2 parts: the initialization (defined in ```__init__()``` -- note that this is the Python convention for initializing classes) and the forward pass (defined, aptly, as ```forward()```).

What is magical about PyTorch is that you simply define these two things and then the gradient can be found *automatically*. So all that grueling code you wrote last time... totally unnecessary now. It still helps you in the long run though -- I promise.

Documentation for a pytorch module can be found [here](https://pytorch.org/docs/stable/generated/torch.nn.Module.html)
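
As a minimal sketch of the ```__init__()``` / ```forward()``` pattern, here is a one-layer module. The class name ```TinyLinear``` and its sizes are hypothetical and chosen only for illustration; your MLP will need more than this.

```python
import torch
import torch.nn as nn


class TinyLinear(nn.Module):
    """Hypothetical one-layer model, used only to illustrate the pattern."""

    def __init__(self, in_features=4, out_features=2):
        super().__init__()
        # Layers created here are registered as parameters of the module
        self.layer = nn.Linear(in_features, out_features)

    def forward(self, x):
        # Only the forward computation is written by hand
        return self.layer(x)


model = TinyLinear()
x = torch.ones(3, 4)      # batch of 3 inputs with 4 features each
out = model(x)            # calling the module runs forward()
out.sum().backward()      # autograd fills in .grad for every parameter
print(out.shape, model.layer.weight.grad.shape)
```

No backward pass is written anywhere in the class; the call to ```.backward()``` is enough for the gradients to appear on the parameters.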

In this section we will create a multi-layer perceptron (MLP) using ```nn.Module```.

In [None]:
class MyMLP(nn.Module):

    '''
    in init you should initialize your model
    in forward you just need to use the layers initialized in init to get the output of your model

    The input 'sizes' is a list of the layer sizes, with sizes[0] being the input size and sizes[-1] being the output size.
    This model will only use linear layers so you must flatten the input appropriately.
    '''

    def __init__(self, sizes=[784, 20, 10]):
        super(MyMLP, self).__init__() 

        # TODO initialize your model here
        #################################

        pass

        #################################


    def forward(self, x):

        # TODO perform the forward pass of your model
        # use the module you initialized above
        #################################

        out = None

        #################################

        return out

In [None]:
# For testing your model

## Section 3 : Classification (15 min)
In this section we will implement the Cross-Entropy loss and create a training loop to minimize it using our model.
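
For reference, one common way to write this loss for a single example is shown here. The symbols are chosen for illustration: $s_i$ is the score for class $i$, $p_i$ is the softmax probability, and $y_i$ is the true probability.

$$
p_i = \frac{e^{s_i}}{\sum_{j} e^{s_j}}, \qquad
\mathcal{L}(s, y) = -\sum_{i} y_i \log p_i
$$

When comparing against ```torch.nn.CrossEntropyLoss```, keep in mind that it averages the per-example losses over the batch by default.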

In [None]:
# TODO your implementation of cross entropy -- i.e. A softmax over input scores and then negative log loss on those probabilities and the true probs.
def cross_entropy(scores, true_probs):

    loss = None # Must return a torch Tensor (i.e. you must be able to differentiate through this)

    return loss

In [None]:
# Compare to torch's implementation

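x = torch.tensor([[9., 5., 4., 2., 1.]])        # scores, shape (batch=1, classes=5)
y = torch.tensor([[0.9, 0.01, 0.09, 0., 0.]])   # true probabilities, same shape

ce_loss = torch.nn.CrossEntropyLoss()
print("PyTorch CE:", ce_loss(x, y))  # probability targets need PyTorch >= 1.10
print("Your CE:", cross_entropy(x, y))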


#### Training loop
Now you are finally ready to train your neural network! Complete your ```training()``` function. You can iterate over your data for 30 epochs (epochs = number of times you iterate over all your data) to begin with. In order to write your training loop, the following general structure can be followed:
```
# initialize loss_function and optimizer
model = MyMLP()
for iteration in range(epochs):
    for input, output in train_dl:
        optimizer.zero_grad()                     # reset optimizer gradients
        my_output = model(input)
        loss = loss_function(my_output, output)   # compare predictions to targets
        loss.backward()                           # compute the gradients
        optimizer.step()                          # step over gradients using optimizer
```


[Hint for resetting the optimizer](https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.zero_grad.html#torch.optim.Optimizer.zero_grad)

[Hint for stepping with the optimizer](https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.step.html#torch.optim.Optimizer.step) (You'll have to use .backward() to get the gradient)
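
To see the reset / backward / step cycle in isolation, here is a sketch on a toy regression problem rather than MNIST. The data, the ```nn.Linear``` model, the MSE loss, and the learning rate are all assumptions chosen for illustration; your classifier will use your own model and the cross-entropy loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression data: y = 2x + 1
x = torch.linspace(-1, 1, 20).unsqueeze(1)
y = 2 * x + 1

model = nn.Linear(1, 1)
loss_function = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for step in range(50):
    optimizer.zero_grad()               # clear gradients from the previous step
    loss = loss_function(model(x), y)   # forward pass and loss
    loss.backward()                     # compute gradients
    optimizer.step()                    # update the parameters
    losses.append(loss.item())

print(losses[0], losses[-1])
```

Storing ```loss.item()``` rather than the loss tensor keeps plain Python floats in the list, which is convenient for plotting later.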

In [None]:
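def training(lr, epochs, train_data, model):
    '''
    Returns a trained model, as well as a list of training losses and a list of
    training accuracies -- each list should have 1 element per epoch.
    '''

    # TODO write your training loop here and fill in these three values
    trained_model = model
    train_losses = []
    train_accuracies = []

    return trained_model, train_losses, train_accuracies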

In [None]:
# Plot your training curve here -- both loss and accuracy should be reported. 

## Section 4 : Open-Ended (30 min per)
Here I give you 4 open-ended challenges. In the live session you may choose one to attempt in the 30 minutes. Afterward I expect that you will complete at least 2 of them. <br> (Additional completed challenges will be noted and will count towards a letter of recommendation.)
<br><br>
You may use any technique that is available in the given libraries to do each of these.

### Challenge 1 (Compute Restricted): Oh No! You only have 1kb of RAM but you need to be able to read these digits! You must create a model that uses fewer than 1000 parameters but achieves over 80% training accuracy and 50% test accuracy to pass this challenge. 
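
One way to check a model against the parameter budget is to count its trainable parameters. The helper name ```count_parameters``` is hypothetical, and the single linear layer is only a baseline for comparison, not a suggested solution.

```python
import torch.nn as nn


def count_parameters(model):
    """Total number of trainable scalar parameters in a module."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


# Baseline: one linear layer straight from 784 pixels to 10 classes
baseline = nn.Linear(784, 10)
print(count_parameters(baseline))  # 784 * 10 weights + 10 biases = 7850
```

Even this single layer is far over 1000 parameters, which suggests the budget cannot be met without shrinking the input or sharing weights.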

### Challenge 2 (Robustness): Create a model and prove that it is robust to partial occlusion, rotation, and reflection transformations. I do not have a specific number for this one, but it will be up to you to prove through testing or mathematics. 

### Challenge 3 (Oops! all inputs): Unsupervised learning (no labels) -- train a model without any labels such that it can achieve >=90% test accuracy. 

### Challenge 4 (Black hat beginner): Using any model which you have trained to >=95% train accuracy, create 10 examples which are changed from previously correctly-predicted data but are now misclassified. The original and changed inputs should be shown side by side (they should not look different). Whoever is able to do this with the smallest magnitude changes will also get special notice. 