# CIS 1902 Homework 4 Part 1: Deep Learning 👕 👠 👗 👟 🧥

**Due Friday March 31, 2023 11:59 pm EST**

## Objectives

- Practice using PyTorch
- Become familiar with deep learning model building and tuning


**First, make a copy of this Colab to your Google Drive by clicking "Copy to Drive" in the upper left or `File -> save a copy in Drive` so you can save any changes you make.**

- **Name:** TODO
- **PennKey:** TODO
- **Number of hours spent on homework** 
    - **Part 1:** TODO

Collaboration is NOT permitted.

Each function below that you need to complete currently raises a
`NotImplementedError`. The interpreter will not treat these unfinished
functions as syntax errors, but calling one will raise the exception.
Replace each `raise NotImplementedError` with code that completes the
function as described in its docstring.

## Setup

Installing PyTorch or any other deep learning framework can be a huge hassle on your local machine, especially if you're trying to take advantage of GPU acceleration. Thankfully Colab has all of this taken care of for us: PyTorch comes installed by default, and Google even provides free GPU resources for us to use.

To enable GPU acceleration for your notebook, go to the menu and `Runtime -> Change Runtime type`. Under `Hardware accelerator`, choose `GPU`.

Confirm that you have a GPU allocated by importing `torch` and running the following code:

In [None]:
import torch
from torchvision import datasets, transforms
import torch.nn as nn
import torch.optim as optim

import numpy as np
import matplotlib.pyplot as plt

In [None]:
torch.cuda.device_count()

If the above command returns 1, that means Colab has allocated a GPU for your session, and so you should be good to go.
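If you would like the rest of your code to keep working when no GPU is available, one common pattern is to pick the device at runtime. This is a minimal sketch, not part of the provided starter code; the variable names are illustrative.

```python
import torch

# Fall back to the CPU when no GPU has been allocated to the session.
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using device:", device)

# Tensors (and models) are moved to the chosen device with .to(device).
x = torch.zeros(2, 3).to(device)
print(x.device)
```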

## The Dataset



We looked at the MNIST dataset in class, which is the "Hello world!" dataset of sorts for image classification. However, with deep learning becoming so *fashionable* these days, the old MNIST classification task has become too easy, with many simple architectures achieving well above 99% accuracy. 

Enter the [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset, which is intended as a drop-in replacement to MNIST.

 ![](https://www.seas.upenn.edu/~cis1920/tliu/s23/hws/hw5/fashion-mnist-annotated.png)

There are still 10 classes to classify and the inputs are still 28 by 28 pixel grayscale images, but instead of handwritten digits there are articles of clothing, which is a harder classification task. We will be using Fashion-MNIST to practice using PyTorch as well as explore the full machine learning process of model training and tuning.

### Loading and processing the data

We first need to load and process the data. We've provided the `load_data()` function below, which downloads and loads the Fashion MNIST data.

In [None]:
def load_data(batch_size=64):
    """
    Load the fashion MNIST data using torchvision and apply the appropriate 
    transformations.
    
    transforms.Normalize manipulates the input such that:
    new_input = (old_input - mean) / std

    Args:
        batch_size (int): the batch size for training and validation

    Returns:
        trainloader (torch.utils.data.DataLoader): data loader for the train set
        validloader (torch.utils.data.DataLoader): data loader for the validation set
    """
    # Normalize the data to [-1, 1]
    transform = transforms.Compose([transforms.ToTensor(),
                                    transforms.Normalize(mean=0.5, std=0.5)])
    
    # Download and load data
    fmnist = datasets.FashionMNIST('pytorch/', download=True, train=True, 
                                     transform=transform)
    
    # split the data into training and validation, set seed for reproducibility
    rand_seed = torch.Generator().manual_seed(42)
    trainset, validset = torch.utils.data.random_split(fmnist, [30000, 30000], 
                                                       generator=rand_seed)
    
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, 
                                              shuffle=True)

    validloader = torch.utils.data.DataLoader(validset, batch_size=batch_size, 
                                             shuffle=True)

    return trainloader, validloader
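As a quick worked example of the normalization formula in the docstring above, the sketch below applies `new_input = (old_input - mean) / std` to a few pixel values by hand. It assumes the pixels are already in `[0, 1]`, which is what `transforms.ToTensor()` produces; the helper `normalize` is a hypothetical name used only for illustration.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Apply new_input = (old_input - mean) / std to a single pixel value."""
    return (pixel - mean) / std

# ToTensor() scales raw pixel values into [0, 1] before Normalize is applied.
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
# 0.0 -> -1.0
# 0.5 -> 0.0
# 1.0 -> 1.0
```

With `mean=0.5` and `std=0.5`, the darkest pixel maps to -1 and the brightest to 1, which is why the code comment describes the result as normalized to `[-1, 1]`.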


### Visualizing the examples

As we saw in class, we can iterate over a `torch.utils.data.DataLoader`, which yields an `(images, labels)` pair at each iteration. Let's use the data loaders returned from `load_data()` to explore what the classes look like.
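To see what iterating over a data loader looks like without downloading anything, here is a small sketch that uses synthetic data in place of Fashion-MNIST. The tensors and sizes are made up for illustration; only the iteration pattern carries over.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for Fashion-MNIST: 10 random 1x28x28 "images", labels 0-9.
fake_images = torch.randn(10, 1, 28, 28)
fake_labels = torch.arange(10)
loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=4)

batch_shapes = []
for images, labels in loader:
    # Each iteration yields one batch of images and the matching labels.
    batch_shapes.append((tuple(images.shape), tuple(labels.shape)))
    print(images.shape, labels.shape)
```

Note that the last batch is smaller than `batch_size` when the dataset size is not a multiple of it.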

In [None]:
img_dict = {0: 'T-shirt/top',
           1: 'Trouser',
           2: 'Pullover',
           3: 'Dress',
           4: 'Coat',
           5: 'Sandal',
           6: 'Shirt',
           7: 'Sneaker',
           8: 'Bag',
           9: 'Ankle boot'}

def show_random_fmnist():
    """
    Displays random fashion-MNIST examples.
    """
    num_examples = 10
    fig, ax = plt.subplots(figsize=(10,10))

    trainloader, _ = load_data(batch_size=num_examples)

    # grab random images and labels
    # images has shape (num_examples, 1, 28, 28)
    images, labels = next(iter(trainloader))
    
    # reshape to a grid shape
    images = images.reshape(num_examples*28,1*28)

    # imshow displays pixel images
    ax.imshow(images, cmap="gray")
    ax.set_xticks([])

    custom_ticks = np.arange(14, 14 + (28 * num_examples), 28)
    ax.set_yticks(custom_ticks)
    img_classes = [img_dict[label] for label in labels.numpy()]
    
    ax.set_yticklabels(img_classes)
    ax.set_title('{} random fashion MNIST examples'.format(num_examples))
    fig.show()


show_random_fmnist() 

## Building neural networks

### Fully connected network [1 pt]

Let's begin by creating a fully connected neural network in `create_fnn(in_size, num_classes)`. Your implementation should have:

- two `nn.Linear` hidden layers, each with 64 neurons
- an `nn.ReLU` activation after each hidden layer (two in total)
- a final `nn.Linear` layer for the output

**Note:** our autograder will check against this **exact** architecture, so please follow these instructions exactly. 
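As a reminder of how `nn.Sequential` chains modules together, here is a toy sketch. The layer sizes are deliberately different from the assignment's architecture and are chosen only for illustration.

```python
import torch
import torch.nn as nn

# Toy sizes chosen for illustration only; they are NOT the assignment's values.
toy_net = nn.Sequential(
    nn.Flatten(),        # (batch, 2, 2) -> (batch, 4)
    nn.Linear(4, 8),     # a hidden layer with 8 units
    nn.ReLU(),           # activation after the hidden layer
    nn.Linear(8, 3),     # output layer: one score per class
)

toy_batch = torch.randn(5, 2, 2)
toy_scores = toy_net(toy_batch)
print(toy_scores.shape)  # torch.Size([5, 3])
```

The output size of each `nn.Linear` must match the input size of the next one, and the final layer has no activation after it.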

### Your custom architecture

Once you have gotten our specified architecture to work, feel free to build your own custom architecture to achieve better test accuracy. For example, you could increase the number of layers, increase the number of neurons per layer, or experiment with different modules such as [`nn.Dropout()`](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html) for your network.

**We recommend coming back to `create_custom_nn` after you have implemented the other functions to compute the accuracy and train your networks.**

In [None]:
def create_fnn(in_size, num_classes):
    """
    Create a fully connected neural network as specified above: two hidden 
    linear layers with 64 units, each followed by ReLU activation.

    Note: there are 2 *hidden* layers. This means there must
    be an additional third layer for the output.

    Args:
        in_size (int): the input size of the network, should match data shape
        num_classes (int): the # of classes, which is the network output size

    Returns:
        fnn (nn.Sequential): the fully connected network whose input is of
        size in_size and output is of size num_classes
    """
    raise NotImplementedError
    
    fnn = nn.Sequential(
        # Flatten needed as the images are originally 2D
        nn.Flatten(),
        # TODO fill in the rest
    )

    return fnn

def create_custom_nn(in_size, num_classes):
    """
    Create a custom neural network of your choosing. Feel free to copy your
    implementation of create_fnn and modify it here.

    Args:
        in_size (int): the input size of the network, should match data shape
        num_classes (int): the # of classes, which is the network output size

    Returns:
        nn.Sequential: a nn Sequential representing your neural network.
    """
    raise NotImplementedError


## Training and evaluating neural networks

Once you have built your neural net architectures, we'll need to train them.


### Calculating accuracy [1 pt]

First implement the `compute_accuracy()` function, which computes the accuracy of a given `net` on the given `dataloader` data.

For a batch of n images, `nn.Flatten()` turns the input into an n x 784 matrix, and `net(x)` gives an n x 10 output. Each row i holds one score per class. These scores are unnormalized (logits) rather than probabilities, but a higher score still means a higher predicted probability for that class. To find the prediction for each row, take the index of the column with the highest score. You can use `torch.argmax()` to achieve this.
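As a small illustration of the `torch.argmax()` hint, the sketch below uses a made-up score matrix with 3 samples and 4 classes. The values and names are hypothetical; they only show how the `dim` argument and an elementwise comparison behave.

```python
import torch

# Hypothetical network output for 3 samples and 4 classes (one row per sample).
scores = torch.tensor([[0.1, 2.0, -1.0, 0.3],
                       [1.5, 0.2,  0.1, 0.0],
                       [0.0, 0.1,  0.2, 3.0]])
labels = torch.tensor([1, 2, 3])

# dim=1 takes the argmax across the class columns, giving one index per row.
preds = torch.argmax(scores, dim=1)

# Comparing elementwise gives a boolean tensor; summing it counts the matches.
num_correct = (preds == labels).sum()

print(preds)        # tensor([1, 0, 3])
print(num_correct)  # tensor(2)
```

The count here is a tensor rather than a Python `int`, which matters because the provided `return` statement in `compute_accuracy()` calls `.cpu()` on the final ratio.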

In [None]:
def compute_accuracy(net, dataloader, device="cpu"):
    """
    Return the accuracy of the network on all data points in the dataloader. 
    This is the sum of the number of correct predictions divided by the total 
    number of samples in the dataloader.

    Hint: use torch.argmax() to find the element with the highest probability.


    Args:
        net (nn.Sequential): the network to compute the accuracy of
        dataloader (utils.DataLoader): a dataloader to compute accuracy over
        device (str): the device to send Tensors to
    Returns:
        float: the net's accuracy on the data from dataloader
    """
    # no_grad speeds up computation by telling PyTorch not to compute gradients
    with torch.no_grad():
        tot_correct = 0
        tot_samples = 0
        for im, lab in dataloader:
            imgs, labels = im.to(device), lab.to(device)
            
            """
            TODO fill out:
                1. get y_hat predictions by calling net(imgs)
                2. add the number of correct predictions made for the batch
                3. add the number of samples seen in the batch
            """
            raise NotImplementedError

    return (tot_correct / tot_samples).cpu().numpy()



### Training loop [1 pt]

Next, fill out the `train_nn()` function, following the steps we discussed in class:

- For each epoch:
    - For each minibatch:
        1. zero the gradients
        2. compute the output predictions
        3. compute the loss
        4. backpropagate
        5. use the optimizer to take a step

Refer to the lecture recording for the corresponding PyTorch commands for each step. We provide the loss function `nn.CrossEntropyLoss()` and optimizer `optim.Adam()` as shown in lecture, but feel free to experiment with other options.
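The five steps listed above can be sketched on a toy problem. This example fits a line to synthetic data with `nn.MSELoss()` and `optim.SGD()`, which are swapped in purely for illustration; the assignment itself uses the loss and optimizer provided in `train_nn()`, and all names here are hypothetical.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Toy regression data (y = 2x + 1), unrelated to Fashion-MNIST.
x = torch.linspace(-1, 1, 32).unsqueeze(1)
y = 2 * x + 1

toy_net = nn.Sequential(nn.Linear(1, 1))
toy_optimizer = optim.SGD(toy_net.parameters(), lr=0.1)
toy_loss_fn = nn.MSELoss()

losses = []
for step in range(200):
    toy_optimizer.zero_grad()       # 1. zero the gradients
    y_hat = toy_net(x)              # 2. compute the output predictions
    loss = toy_loss_fn(y_hat, y)    # 3. compute the loss
    loss.backward()                 # 4. backpropagate
    toy_optimizer.step()            # 5. use the optimizer to take a step
    losses.append(loss.item())

print("first loss:", losses[0], "last loss:", losses[-1])
```

Skipping `zero_grad()` would make gradients accumulate across steps, which is why it comes first in every iteration.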

We also provide a `create_acc_curve()` function to plot the training and validation performance of your neural network over the training iterations.


In [None]:
def train_nn(net, trainloader, validloader, eval_freq, num_epochs, device="cpu"):
    """
    Train the network net on data from the trainloader.
    Recall the high-level algorithm:
    For each epoch
        For each minibatch
            zero the gradients
            compute the output predictions
            compute the loss
            backpropagate
            use the optimizer to take a step
    Args:
        net (nn.Sequential): a neural network
        trainloader (DataLoader): the data loader for training set
        validloader (DataLoader): the data loader for the validation set
        eval_freq (int): the frequency at which to compute the train/valid acc
        num_epochs (int): the number of epochs (or complete passes) over the 
                          train data
        device (str): the device to train on, either "cpu" or "cuda"

    Returns:
        train_acc (list): the accuracy over the training set, computed every 
            eval_freq iterations
        valid_acc (list): the accuracy over the validation set, computed every 
            eval_freq iterations
    """
    net.to(device)

    optimizer = optim.Adam(net.parameters())
    loss_fn = nn.CrossEntropyLoss()

    train_acc = []
    valid_acc = []
    iter_num = 0

    for epoch in range(num_epochs):
        print("Epoch: {}".format(epoch))
        for imgs, labels in trainloader:
            imgs_train, labels_train = imgs.to(device), labels.to(device)

            # TODO: fill out the steps needed, as outlined above
            

            
            # End TODO

            if iter_num % eval_freq == 0:
                print("Iteration: {}".format(iter_num))
                train_acc.append(compute_accuracy(net, trainloader, device))
                valid_acc.append(compute_accuracy(net, validloader, device))
            iter_num += 1
    return train_acc, valid_acc
    

def create_acc_curve(train_acc, valid_acc, eval_freq):
    """
    Create an accuracy curve plot for both the train and validation data.

    Args:
        train_acc (list): a list of the training accuracy over time, one for 
            each eval_freq iterations
        valid_acc (list): a list of the validation accuracy over time, one for 
            each eval_freq iterations
        eval_freq (int): the number of iterations between accuracy measurements
    
    Returns: 
        None, but writes to acc_curves.png
    """
    # a single figure is enough; an extra plt.figure() call would add a blank one
    fig, ax = plt.subplots()
    x_axis = np.arange(0, len(train_acc) * eval_freq, eval_freq)
    ax.plot(x_axis, train_acc, 
            label='Train final acc: {:.3f}'.format(train_acc[-1]))
    ax.plot(x_axis, valid_acc, 
            label='Validation final acc: {:.3f}'.format(valid_acc[-1]))
    ax.set_title('Accuracy on Fashion-MNIST')
    ax.set_ylabel('accuracy')
    ax.set_xlabel('# iterations')
    ax.legend()
    fig.savefig("acc_curves.png")
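As a rough worked example of how many points the accuracy curve contains after one epoch, the arithmetic below assumes the values used elsewhere in this notebook: 30,000 training images, a batch size of 64, and `eval_freq = 100`. The variable names are illustrative.

```python
import math

num_train = 30000   # training split size used by load_data()
batch_size = 64
eval_freq = 100

# The DataLoader keeps the final partial batch by default, hence the ceiling.
iters_per_epoch = math.ceil(num_train / batch_size)

# train_nn() evaluates whenever iter_num % eval_freq == 0, starting at iteration 0.
eval_iters = list(range(0, iters_per_epoch, eval_freq))

print(iters_per_epoch)  # 469
print(eval_iters)       # [0, 100, 200, 300, 400]
```

Each of those evaluations runs `compute_accuracy()` over the full training and validation loaders, so a larger `eval_freq` reduces training time at the cost of a coarser curve.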

## The model training process

Once both `train_nn()` and `compute_accuracy()` are implemented, you are ready to start training your network. Fill out and run the code in the `__main__` block below. On a GPU, this should take about 2 minutes per epoch.

In [None]:
%%time
# Note: the fully connected net will take about 2 minutes per epoch to train!
# (%%time is a cell magic, so it has to be the first line of the cell)
if __name__ == '__main__':
    device = "cuda"
    # TODO this should be the total number of pixels in an image
    input_size = None
    # TODO this should be the total number of FMNIST classes
    num_classes = None
    
    
    eval_freq = 100
    # TODO tune the number of epochs for better performance.
    num_epochs = 1
    batch_size = 64
    trainloader, validloader = load_data(batch_size=batch_size)

    nn_model = create_fnn(input_size, num_classes)
    # uncomment to experiment with your custom neural net
    # nn_model = create_custom_nn(input_size, num_classes)
    
    train_acc, valid_acc = train_nn(nn_model, 
                                   trainloader, 
                                   validloader, 
                                   eval_freq, 
                                   num_epochs, 
                                   device)

### Visualize the training curves 

After your network has been trained, you can visualize the training curves and print the final accuracies by running the `create_acc_curve()` function below:

In [None]:
if __name__ == '__main__':
    create_acc_curve(train_acc, valid_acc, eval_freq);

### Tuning your networks

Using the neural network created in `create_fnn()` as a base, tune your model implementation in `create_custom_nn()` to maximize validation accuracy. Some things you can experiment with:

- increasing the number of epochs
- adding more linear layers, or changing the number of neurons per layer
- adding dropout or convolutional layers (warning: may take quite a bit more tinkering than the other options)

Once you are done tuning your model, run the code below to save your model to `model.pkl` and include it as part of your final submission. 

**NOTE: the Google Colab runtime will sometimes time out and reset, so be sure to save a copy of your model.pkl file immediately to your local machine as a backup.**

Download and include the `acc_curves.png` plot for your final network as well.

In [None]:
if __name__ == '__main__':
    torch.save(nn_model, "model.pkl")
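If you want to check that a saved file can be read back, the sketch below saves and reloads a toy model. It writes to a temporary path rather than `model.pkl`, and it assumes a PyTorch version whose `torch.load()` accepts the `weights_only` argument; how the autograder loads your file is not specified here.

```python
import os
import tempfile

import torch
import torch.nn as nn

toy_net = nn.Sequential(nn.Linear(4, 2))

# Save to a temporary path so this sketch does not overwrite a real model.pkl.
path = os.path.join(tempfile.mkdtemp(), "toy_model.pkl")
torch.save(toy_net, path)

# weights_only=False is needed on recent PyTorch releases to unpickle a whole
# nn.Module; only use it on files you created yourself.
reloaded = torch.load(path, weights_only=False)
reloaded.eval()

x = torch.randn(3, 4)
same_output = torch.equal(toy_net(x), reloaded(x))
print(same_output)  # True
```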

### Final model test accuracy [1 pt]

In real-world machine learning situations, we have at least **three** data splits:

- **Training data**: Data that is used to train and optimize model parameters
- **Validation data**: Data that is used for tuning the model (choosing hyperparameters like model architecture, batch size, etc).
- **Testing data**: Data used to evaluate the final performance of the model, that is held out separately from the training and validation process.

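A very common pitfall is to use the same portion of the dataset for both validation and testing. This is **extremely bad practice**: the model ends up tuned to the very data it is evaluated on, which often leads to overfitting and to a test accuracy that overstates how well the model generalizes.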
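A three-way split can be sketched with `torch.utils.data.random_split()`, the same function `load_data()` uses for its two-way split. The dataset here is synthetic and the 60/20/20 proportions are an illustrative assumption, not a recommendation from this assignment.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# 100 synthetic samples; the 60/20/20 proportions are an illustrative choice.
data = TensorDataset(torch.arange(100))
gen = torch.Generator().manual_seed(42)
train_set, valid_set, test_set = random_split(data, [60, 20, 20], generator=gen)

# Each Subset records which original indices it holds; the three sets are disjoint.
train_idx, valid_idx, test_idx = (
    set(subset.indices) for subset in (train_set, valid_set, test_set)
)

print(len(train_set), len(valid_set), len(test_set))  # 60 20 20
print(train_idx.isdisjoint(valid_idx), valid_idx.isdisjoint(test_idx))  # True True
```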

Thus, the data the autograder uses to evaluate your neural network is a completely unseen test set. This mirrors the setup of real-world machine learning competitions such as those run by Kaggle, where competitors are given a set of the data to train and validate their models against, but their leaderboard ranking is based on a held-out test set that they do not have access to.

You will receive full credit if your model achieves >83% accuracy on the **testing data.** As with all the other homework assignments, you are free to submit to Gradescope as many times as you like without penalty, so you can experiment with model architectures to see what produces the best results. Generally, the validation accuracy is a decent approximation of the test accuracy, so you can use that as a guide.

You can get **1 extra credit point** on this assignment if your model achieves >89% accuracy.

> **TIP**: if implemented correctly, the fully connected network from `create_fnn()` should be able to reach the >83% full-credit threshold just by increasing the number of training epochs -- you won't need to implement `create_custom_nn()`. If you'd like to explore different deep learning architectures to try to get the extra credit point, we recommend looking into [convolutional neural networks](https://cs231n.github.io/convolutional-networks/), which have convenient layer implementations in PyTorch: [`nn.Conv2d()`](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html) and [`nn.MaxPool2d()`](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#maxpool2d).



## Part 1 discussion [1 pt]

**Write your response directly in this markdown cell.**

1. Describe what you observe in your `acc_curves.png` plot. Does your neural network appear to "learn" (in terms of better performance) quickly or slowly? Do you see any differences between training accuracy, validation accuracy, and the autograder-reported testing accuracy?

**Your response:** 

1. TODO

# Rubric

| Sections | Points |
|---------|--------|
| **Part 1** | |
| `create_fnn()` | 1 |
| `train_nn()` | 1 |
| `compute_accuracy()` | 1 |
| `acc_curves.png`, `model.pkl` uploaded | 1 |
| Achieve >83% test accuracy | 1 |
| Part 1 discussion | 1 |

---

| All Sections | Points |
|---------|--------|
| All functions implemented | 0.5 |
| PennKey, name, and hours estimate | 0.5 |
| Part 1 | 6 |
| Part 2 | 3 |
| **Total** | 10 |

## Extra credit

| Section | Points |
|---------|--------|
| Achieve >89% test accuracy | 1 |

# Submission

After you have completed the worksheet, download the Colab notebook as a `hw5_dl.py` file by going to `File -> Download .py`. Then submit your `hw5_dl.py`, `model.pkl`, and `acc_curves.png` files to Gradescope.

Since this is only part 1 of the assignment, there will be points missing from the autograder -- the maximum autograder score for part 1 (not including the extra credit) is **5 points**.

**Note: since you will be submitting Colab-formatted .py files, we will not be grading code style for this assignment.**

# Attribution

Elements of this homework were adapted from [Kevin McGuinness](http://www.eeng.dcu.ie/~mcguinne/)' "Introduction to Deep Learning for Computer Vision using PyTorch" as well as the Fall 2019 CIS 520 convolutional neural net homework assignment.