<a href="https://colab.research.google.com/github/wingated/cs474_labs_f2019/blob/master/DL_Lab2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab 2: Intro to PyTorch

## Deliverable

For this lab, you will submit an ipython notebook via learningsuite.
This lab will be mostly boilerplate code, but you will be required to implement a few extras.

**NOTE: you almost certainly will not understand most of what's going on in this lab!
That's OK. The point is just to get you going with PyTorch.
We'll be working on developing a deeper understanding of every part of this code
over the course of the next two weeks.**

A major goal of this lab is to help you become conversant in working through PyTorch
tutorials and documentation.
So feel free to google whatever you want or need!

This notebook will have four parts:

* Part 1: Your notebook should contain the boilerplate code. See below.

* Part 2: Your notebook should extend the boilerplate code by adding a testing loop.

* Part 3: Your notebook should extend the boilerplate code by adding a visualization of test/training performance over time.

The resulting image could, for example, look like this:
![](http://liftothers.org/dokuwiki/lib/exe/fetch.php?cache=&w=900&h=608&tok=3092fe&media=cs501r_f2018:lab2.png)

* Part 4: Your notebook should contain the completed microtasks and pass all the asserts.

See the assigned readings for pointers to the PyTorch documentation.
___

### Grading standards:
Your notebook will be graded on the following:

* 40% Successfully followed lab video and typed in code
* 20% Modified code to include a test/train split
* 20% Modified code to include a visualization of train/test losses
* 10% Tidy and legible figures, including labeled axes where appropriate
* 10% Correct solutions to the microtasks
___

### Description
Throughout this class, we will be using PyTorch to implement our deep neural networks.
PyTorch is a deep learning framework that handles the low-level details of
GPU integration and automatic differentiation.

The goal of this lab is to help you become familiar with pytorch. 
The four parts of the lab are outlined above.

For Part 1, you should watch the video below and type in the code as it is explained to you.

A more detailed outline of Part 1 is below.

For Part 2, you must add a validation (or testing) loop using the
FashionMNIST dataset with `train=False`.

For Part 3, you must plot the loss values.

For Part 4, you must complete the microtasks and pass all asserts.

Optional: Demonstrate overfitting on the training data.

The easiest way to do this is to limit the size of your training dataset
so that it yields only a single batch (i.e., `len(dataset) == batch_size`),
and then train for multiple epochs. For example,
I set my batch size to 42 and made my dataset produce only 42
unique items by overriding its `__len__` method to return 42.
In my training loop, I ran validation every epoch, which in this setup
amounts to validating every step.
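
The single-batch setup described above can be sketched as follows. This is a minimal illustration, not the code from the video: `SingleBatchDataset` is a hypothetical wrapper name, and random tensors shaped like FashionMNIST images stand in for the real data so that the snippet runs without a download.

```python
import torch
from torch.utils.data import Dataset, DataLoader, TensorDataset


class SingleBatchDataset(Dataset):
    """Hypothetical wrapper that exposes only the first `size` items of a dataset."""

    def __init__(self, base, size):
        self.base = base
        self.size = min(size, len(base))

    def __getitem__(self, i):
        if i >= self.size:
            raise IndexError(i)
        return self.base[i]

    def __len__(self):
        # Reporting a shorter length is what limits the loader to one batch
        return self.size


# Random stand-in data shaped like FashionMNIST images (1x28x28, 10 classes)
base = TensorDataset(torch.randn(1000, 1, 28, 28), torch.randint(0, 10, (1000,)))

batch_size = 42
small = SingleBatchDataset(base, batch_size)
loader = DataLoader(small, batch_size=batch_size, shuffle=True)

print(len(small), len(loader))  # number of items, number of batches per epoch
```

Because the dataset reports a length equal to the batch size, every epoch consists of exactly one optimizer step on the same 42 items, which is what makes overfitting easy to observe.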

In practice, you will normally compute your validation loss every n steps,
rather than at the end of every epoch. This is because some epochs can take hours
or even days, and you rarely want to wait that long to see your results.
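
Validating every n steps might look like the sketch below. It assumes a tiny linear model and random stand-in data so that it runs on a CPU in a moment; `validate_every` is a hypothetical name for the interval n, and the learning rate and sizes are arbitrary.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Random stand-in data; a real run would use the FashionMNIST loaders
train_loader = DataLoader(
    TensorDataset(torch.randn(256, 20), torch.randint(0, 10, (256,))), batch_size=32)
val_loader = DataLoader(
    TensorDataset(torch.randn(64, 20), torch.randint(0, 10, (64,))), batch_size=32)

model = nn.Linear(20, 10)
objective = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-2)

validate_every = 4  # hypothetical interval n, counted in optimizer steps
train_losses, val_losses, val_steps = [], [], []

step = 0
for epoch in range(2):
    for x, y_truth in train_loader:
        optimizer.zero_grad()
        loss = objective(model(x), y_truth)
        loss.backward()
        optimizer.step()
        train_losses.append(loss.item())

        if step % validate_every == 0:
            # No gradients are needed while measuring validation loss
            with torch.no_grad():
                batch_losses = [objective(model(vx), vy).item() for vx, vy in val_loader]
            val_losses.append(sum(batch_losses) / len(batch_losses))
            val_steps.append(step)
        step += 1
```

Recording `val_steps` alongside `val_losses` keeps the two curves aligned when they are plotted against the same step axis.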

Testing your algorithm by using a single batch and training until overfitting 
is a great way of making sure that your model and optimizer are working the way they should!

___

### Part 0
Watch Tutorial Video

[https://youtu.be/E76hLX9WCLE](https://youtu.be/E76hLX9WCLE)

**TODO:**
* Watch video

**DONE:**

___

### Part 1
Your notebook should contain the boilerplate code. See below.

**TODO:**

* Replicate boilerplate from the video

**DONE:**

___

### Part 2
Your notebook should extend the boilerplate code by adding a testing loop.

**TODO:**

* Add a testing (validation) loop

**DONE:**

In [2]:
!pip3 install torch 
!pip3 install torchvision
!pip3 install tqdm



In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
import numpy as np
import matplotlib.pyplot as plt
from torchvision import transforms, utils, datasets
from tqdm import tqdm
 
assert torch.cuda.is_available() # You need to request a GPU from Runtime > Change Runtime Type

# Write the boilerplate code from the video here

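class LinearNetwork(nn.Module):
  def __init__(self, dataset):
    super(LinearNetwork, self).__init__()
    x, y = dataset[0]
    c, h, w = x.size()
    out_dim = 10 # hardcoded

    # Assumed completion: the source cell is cut off after `nn.Sequential`.
    # A single linear layer over the flattened image matches the class name.
    self.net = nn.Sequential(nn.Linear(c * h * w, out_dim))

  def forward(self, x):
    # Flatten each image to a vector before the linear layer
    n, c, h, w = x.size()
    return self.net(x.view(n, c * h * w))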

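# Assumption: the source cell does not show the data or model setup, so the
# next lines are a hypothetical stand-in. The root path and batch size are
# placeholders, and the lab video may use a custom Dataset class instead.
train_dataset = datasets.FashionMNIST('/tmp/fashionmnist', train=True, transform=transforms.ToTensor(), download=True)
validation_dataset = datasets.FashionMNIST('/tmp/fashionmnist', train=False, transform=transforms.ToTensor(), download=True)
train_loader = DataLoader(train_dataset, batch_size=42, pin_memory=True)
validation_loader = DataLoader(validation_dataset, batch_size=42, pin_memory=True)
model = LinearNetwork(train_dataset).cuda()

optimizer = optim.SGD(model.parameters(), lr=1e-4)
objective = torch.nn.CrossEntropyLoss()

# Losses are stored as plain floats, one entry per validation pass
train_losses = []
validation_losses = []

num_epochs = 100
loop = tqdm(total=len(train_loader) * num_epochs, position=0)

for epoch in range(num_epochs):
  # train
  for batch, (x, y_truth) in enumerate(train_loader):
    # learn
    # `async` is a reserved word since Python 3.7; the argument is now `non_blocking`
    x, y_truth = x.cuda(non_blocking=True), y_truth.cuda(non_blocking=True)

    optimizer.zero_grad()

    y_hat = model(x)
    loss = objective(y_hat, y_truth)

    if epoch % 25 == 0 and batch == 0:
      # validate
      train_losses.append(loss.item())
      validation_loss_list = []
      with torch.no_grad():  # gradients are not needed for validation
        for val_x, val_y_truth in validation_loader:
          val_x, val_y_truth = val_x.cuda(non_blocking=True), val_y_truth.cuda(non_blocking=True)
          val_y_hat = model(val_x)
          validation_loss_list.append(objective(val_y_hat, val_y_truth).item())
      validation_losses.append(sum(validation_loss_list) / len(validation_loss_list))

    loop.set_description('batch:{} loss:{:.4f} val_loss:{:.4f}'.format(batch, loss.item(), validation_losses[-1]))
    loop.update(1)

    loss.backward()
    optimizer.step()

loop.close()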



# Create a dataset class that extends the torch.utils.data Dataset class here

# Extend the torch.nn.Module class to create your own neural network

# Instantiate the train and validation sets

# Instantiate your data loaders

# Instantiate your model and loss and optimizer functions

# Run your training / validation loops




___

### Part 3
Your notebook should extend the boilerplate code by adding a visualization of test/training
performance over time. Use `matplotlib.pyplot`.
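
One way to lay out such a figure is sketched below. The loss values are placeholders for illustration only, not results from this lab, and the names `steps`, `train_losses`, and `validation_losses` are assumptions about how you recorded your own losses.

```python
import matplotlib.pyplot as plt

# Placeholder values for illustration only, not results from this lab
steps = [0, 25, 50, 75]
train_losses = [2.3, 1.9, 1.6, 1.4]
validation_losses = [2.3, 2.0, 1.8, 1.7]

fig, ax = plt.subplots()
ax.plot(steps, train_losses, label="train")
ax.plot(steps, validation_losses, label="validation")
ax.set_xlabel("epoch")
ax.set_ylabel("cross-entropy loss")
ax.set_title("Loss over time")
ax.legend()
plt.show()
```

Labeling both axes and adding a legend covers the "tidy and legible figures" item in the grading standards.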

**TODO:**
* Add a visualization of test/train performance (i.e. loss) over time.

**DONE:**


In [None]:
# Write your code to create a plot of your loss over time


___

### Part 4
Complete the following microtasks to learn some important PyTorch skills.
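
As a warm-up, the sketch below runs the same kinds of operations on a separate 2x4 tensor. The variable names are hypothetical, and the device line falls back to the CPU when no GPU is present so that the snippet runs anywhere.

```python
import numpy as np
import torch

t = torch.zeros(2, 4, dtype=torch.long)  # 2x4 tensor of int64 zeros
t = t.unsqueeze(dim=1)                    # insert a size-1 dimension: 2x1x4
t = t.view(2, 4, 1)                       # reinterpret the same data as 2x4x1
t = t.squeeze()                           # drop size-1 dimensions: 2x4
flat = t.flatten()                        # one dimension holding all 8 elements

from_list = torch.tensor([1, 2, 3])       # build a tensor from a Python list

# Move to the GPU when one is available, then back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
on_device = flat.to(device)
back_on_cpu = on_device.cpu()

as_numpy = back_on_cpu.numpy()            # NumPy arrays must live on the CPU
as_tensor = torch.from_numpy(as_numpy)    # shares memory with the array

print(flat.shape, from_list.shape, type(as_numpy), as_tensor.dtype)
```

Note that `.numpy()` only works on CPU tensors, which is why the tensor is moved back with `.cpu()` before the conversion.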

**TODO:**
* Complete microtasks

**DONE:**

In [None]:
# Tensors are the lifeblood of PyTorch.
# Construct a 5x3 tensor, 'a', of zeros and of dtype long

print(a)
print(a.size())
assert a.size() == torch.Size([5, 3])
assert type(a[0][0].item()) is int

In [None]:
# Many of your bugs will come from incorrect tensor dimensions. 
# PyTorch has several built-in functions to give you the control you need.
# Using only the .unsqueeze() function, turn 'a' into a 5x1x3 tensor. Hint: use the dim= argument

print(a.shape)
assert a.shape == torch.Size([5, 1, 3])

In [None]:
# Each dimension means something different. 
# You can change the order of your dimensions without losing information. 
# Reshape 'a' into a 5x3x1 tensor, using the .view() function

print(a.shape)
assert a.shape == torch.Size([5, 3, 1])

In [None]:
# Dimensions of size 1 can sometimes be necessary for shape matching.
# However, they can be removed without losing information. 
# Squeeze 'a' to remove dimensions of 1

print(a.shape)
assert a.size() == torch.Size([5, 3])

In [None]:
# You can turn any tensor into a tensor of a single dimension. 
# Flatten 'a' to a single dimension

print(a.size())
assert a.size() == torch.Size([15])

In [None]:
# It's easy to integrate other common python data structures. 
# Initialize a tensor, 'b', from a list
my_list = [1,2,3,4,5]

print(b)
assert b.size() == torch.Size([5])

In [None]:
# GPUs will allow tensor operations to run much faster. 
# Assign 'a' and 'b' to run on GPU

print(a, b)
assert a.is_cuda and b.is_cuda

In [None]:
# You might not always have access to a GPU.
# Assign 'a' and 'b' to run on CPU

print(a, b)
assert not a.is_cuda and not b.is_cuda

In [None]:
# You will often want to convert tensors to numpy arrays to interact with other python libraries
# Convert 'a' to a numpy array 'c'

print(type(c))
assert type(c) == np.ndarray

In [None]:
# To get your data back into PyTorch, convert the array back to a tensor.
# Convert 'c' to tensor 'd'

print(d.type())
assert torch.is_tensor(d)