# Assignment 02 - Transforms

This assignment covers robustness testing using data transforms with PyTorch and TorchVision.

## Introduction

As we have seen in class, PyTorch and its related packages provide a powerful means of transforming data in a way that is decoupled from the actual code in our training loop.

The `torch.utils.data.Dataset` and `torch.utils.data.DataLoader` classes provide an interface for us to quickly download standard ML datasets and to make our own datasets, which can benefit from the full functionality of PyTorch and its related packages.

When combined with `torchvision.transforms`, these tools provide a means to transform our data, and to change which transforms we apply without having to make changes to our core training code. This is incredibly useful for data augmentation, domain generalization, and robustness testing.

For data augmentation, transforms can be applied before training to help us increase the diversity in our dataset, and to instill expert knowledge into our models by having experts select which transforms we apply to the training data (e.g., applying rotations to the training images for a model which must be perspective invariant).

For domain generalization, transforms can be applied during training to help us improve our model's ability to generalize to new domains (environments in which the model must perform its task). For example, varying the color of the background randomly ensures the model is not incorrectly sensitive to background color and will not break in a domain with different background colors.

For robustness testing, transforms can be applied after training to assess our model's robustness to challenging factors in our data, e.g., blur, color changes, perspective changes, and other variations. We can apply transforms to various degrees using the parameters that PyTorch exposes to adjust the transform. This enables us to extract performance curves for our models which go beyond simply displaying a confusion matrix as the output of model training.

This assignment will focus on applying transforms for robustness testing.

## Directions

In this assignment, we will train a model and assess its robustness using PyTorch. To do this, we will carry out the following steps.

1. **Load Data**: This step will require us to import a torch Dataset and use a torch DataLoader to access it. You may use any dataset you like, as long as it is imagery (since this assignment is focused on image transforms). Datasets built into PyTorch may be used. **0.25 points extra credit will be awarded for implementing a custom Dataset.** Be sure to pick a dataset you have the capacity to train a model on in your development environment. To keep the assignment fair, grading will not be based on the size of the dataset used in any way. Plot a few of the images to ensure you have loaded the data correctly.

2. **Define the Training Pipeline**: This step will require us to define the class which specifies our neural network architecture, and to define the functions which allow us to train it. We have seen examples of this before, both in class and in prior assignments. Be sure to choose an architecture with sufficient capacity to learn the dataset you have chosen in step #1.

3. **Train the Model**: In this step we will use our training functions from step #2 to train a model.

4. **Use Transforms to Robustness Test the Model Against Blur**: In this step, we will robustness test the model against an image blur transform. This step will require us to run model inference on the **test** set repeatedly, increasing the amount of image blur with each run. For each image blur level, we will need to save the accuracy, then plot the accuracy against the degree of blur (choose an appropriate metric for the degree of blur). The end result of this step should be a plot of accuracy with respect to degree of blur. Be sure to test the model **to failure** to fully characterize its limitations.

5. **Robustness Test the Model Against a Transform of Your Choice**: In this step, we will robustness test the model against a transform of our choice (perhaps perspective changes, color changes, or contrast changes). Like step #4, this step will require us to run model inference on the **test** set repeatedly, increasing the degree of the transform with each run, and to plot accuracy against the degree of the transform.

## Grading

This assignment is entirely open ended and will not be auto-graded. Each step is worth 20% of the assignment grade. For each step, ensure the following conditions are met to get full credit.

| **Step** | **Criteria for Full Credit** |
|----|----|
| 1. Data Loading |  The data is successfully loaded and a few of the images are visualized.  |
| 2. Training Pipeline Definition | The training pipeline and neural network architecture are defined and error free.  |
| 3. Training | The model training runs for a sufficient number of epochs and the slope of the error curve is negative, indicating some learning is taking place. |
| 4. Robustness Test - Blur  | A curve of accuracy with respect to degree of blur is produced and is of professional quality. |
| 5. Robustness Test - Other  | A curve of accuracy with respect to degree of transformation is produced and is of professional quality. |

## Tips

This assignment is intentionally open ended. **No starter code is provided.** The goal of this assignment is to test your ability to put the concepts we have discussed in class into practice.

The steps should be similar to, but not necessarily exactly the same as, the steps we have discussed in class.

## Imports

In [1]:
%load_ext autoreload
%autoreload 2
%matplotlib widget

# Your imports here


## Step 1 - Load Data

Use the cell below (or make additional cells) to load a torch Dataset and use a torch DataLoader to access it. You may use any dataset you like, as long as it is imagery (since this assignment is focused on image transforms). Datasets built into PyTorch may be used.

**0.25 points extra credit will be awarded for implementing a custom Dataset.** Be sure to pick a dataset you have the capacity to train a model on in your development environment. To keep the assignment fair, grading will not be based on the size of the dataset used in any way. Plot a few of the images to ensure you have loaded the data correctly.
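For reference, a custom Dataset only needs to subclass `torch.utils.data.Dataset` and implement `__len__` and `__getitem__`. The sketch below is one possible shape for such a class, not a required design: the class name `InMemoryImageDataset` is hypothetical, and random tensors stand in for real images and labels.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class InMemoryImageDataset(Dataset):
    """Hypothetical custom Dataset wrapping image and label tensors held in memory."""

    def __init__(self, images, labels, transform=None):
        if len(images) != len(labels):
            raise ValueError("images and labels must have the same length")
        self.images = images
        self.labels = labels
        self.transform = transform  # optional callable applied per image

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx]
        if self.transform is not None:
            image = self.transform(image)
        return image, self.labels[idx]


# Random data as a placeholder for real imagery (assumption).
images = torch.rand(10, 3, 32, 32)
labels = torch.arange(10) % 2
dataset = InMemoryImageDataset(images, labels)
loader = DataLoader(dataset, batch_size=4, shuffle=True)

batch_images, batch_labels = next(iter(loader))
```

When plotting a channels-first image tensor with matplotlib, `plt.imshow(image.permute(1, 2, 0))` moves the channel axis last, which is the layout `imshow` expects.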


In [2]:
# Your code here

## Step 2 - Define Training Pipeline

Define a training pipeline like those we have seen in class. Define the class which specifies our neural network architecture, and define the functions which allow us to train it. We have seen examples of this before, both in class and in prior assignments.

Be sure to choose an architecture with sufficient capacity to learn the dataset you have chosen in step #1.

Tip: if you see that your architecture is insufficient to learn the dataset you have chosen when you run step #3, you may need to return to step #2 to make your architecture wider or deeper.
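As a rough illustration of the two pieces this step asks for, the sketch below pairs a small convolutional network with a function that trains it for one epoch. The architecture, the names `SmallCNN` and `train_one_epoch`, and the hyperparameters are assumptions chosen for brevity, and the pipeline seen in class may be organized differently.

```python
import torch
from torch import nn


class SmallCNN(nn.Module):
    """Hypothetical small CNN for 3-channel images; widen or deepen as needed."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # makes the head independent of image size
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))


def train_one_epoch(model, loader, optimizer, loss_fn, device="cpu"):
    """Run one pass over the loader and return the mean loss per sample."""
    model.train()
    total_loss, seen = 0.0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * images.size(0)
        seen += images.size(0)
    return total_loss / seen


# Smoke test on one random batch; a real run would pass a DataLoader instead.
batches = [(torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,)))]
model = SmallCNN()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
epoch_loss = train_one_epoch(model, batches, optimizer, nn.CrossEntropyLoss())
```

Collecting the value returned by `train_one_epoch` after every epoch gives the error curve used in the next step.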

In [3]:
# Your code here

## Step 3 - Train a Model

Kick off your training run here. Be sure to print or plot the error with respect to the number of epochs so you can see if your model is learning.


In [None]:
# Your code here

## Step 4 - Use Transforms to Robustness Test the Model Against Image Blur

Use the cell below (or add more cells) to write code which repeatedly (e.g., in a for-loop) runs inference with your model on your test set. Each time you do so, apply a blur transform with an increasing amount of blur. Compute and save the average accuracy across the entire test set and then plot the accuracy against the degree of blur (choose an appropriate metric for the degree of blur). The end result of this step should be a plot of accuracy with respect to degree of blur. Be sure to test the model **to failure** to fully characterize its limitations.


In [2]:
# Your code here

## Step 5 - Robustness Test the Model Against a Transform of Your Choice

Repeat step 4 for a transform of your choice. Create a plot showing how model accuracy degrades as the transform is applied more drastically.


In [3]:
# Your code here