<a href="https://colab.research.google.com/github/dylanwalker/BA865/blob/master/BA865_Lecture_08.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Code Preface

In [0]:
import torch
import pandas as pd
import numpy as np


# Defining Neural Network Architectures

In past examples, we have discussed multi-layered neural networks but haven't actually shown any. We built a simple linear model using `nn.Linear()`, but how do you build more complex architectures using PyTorch's existing layers?

There are two conventional ways to do this:
1. Chain layers together using `torch.nn.Sequential()`
2. Create a class for your neural network that inherits from the `torch.nn.Module` base class and implements the `forward()` method.

I will show examples of these two approaches:

In [0]:
# Softmax over dim=1 so that each row (one example in the batch) sums to 1
model = torch.nn.Sequential(torch.nn.Linear(3,8),torch.nn.ReLU(),torch.nn.Linear(8,2),torch.nn.ReLU(),torch.nn.Softmax(dim=1))
model

In [0]:
#torch.rand(5,3)
model(torch.rand(5,3))
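
To see what the `dim` argument of `Softmax` does on a batch, here is a small sketch (the tensor `logits` is made up for illustration). For input of shape `(batch, classes)`, `dim=1` makes each row sum to 1, whereas `dim=0` makes each column sum to 1, which mixes examples across the batch and is rarely what you want for class probabilities.

```python
import torch

# Hypothetical scores for a batch of 5 examples and 2 classes
logits = torch.rand(5, 2)

per_example = torch.nn.Softmax(dim=1)(logits)  # normalizes across classes, within each example
per_column = torch.nn.Softmax(dim=0)(logits)   # normalizes across the batch, within each class

print(per_example.sum(dim=1))  # each of the 5 rows sums to 1
print(per_column.sum(dim=0))   # each of the 2 columns sums to 1
```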

The sequential approach works when you have a simple network, but an alternative (and much more configurable and robust) approach is to define a class with a constructor and a `forward()` method: 

In [0]:
class BasicNet(torch.nn.Module):
  def __init__(self):
    # The constructor calls the base class constructor and then defines the layers that will be used (ordering doesn't matter here, as layers are just properties)
    super().__init__()
    self.fc1 = torch.nn.Linear(3,8)
    self.relu1 = torch.nn.ReLU()
    self.fc2 = torch.nn.Linear(8,2)
    self.relu2 = torch.nn.ReLU()
    self.softmax = torch.nn.Softmax(dim=1) # dim=1 so each row (one example in the batch) sums to 1
  
  def forward(self,x):
    # The forward() method describes how an input tensor (the argument x) will be passed through the layers.
    # Here, the order matters.
    # Also note that we can do other things to the data at any point between the layers (such as functionally transform it in some way)
    #  -- we could add noise to the data somewhere in between some layers, normalize it, randomly drop or forget some of it... etc.
    #  Advanced NN approaches will often use such tricks.
    x = self.fc1(x)
    x = self.relu1(x)
    x = self.fc2(x)
    x = self.relu2(x)
    x = self.softmax(x)
    return x


In [0]:
model = BasicNet()
y = model(torch.rand(5,3)) # calling the model directly runs forward() (plus any registered hooks)
print(y)
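
The comments in `forward()` mention that you can transform the data between layers, for example by adding noise. Here is a minimal sketch of that idea. The class name `NoisyNet` and the `noise_std` parameter are hypothetical, chosen only for illustration, and the noise scale is an untuned assumption.

```python
import torch

class NoisyNet(torch.nn.Module):  # hypothetical name, for illustration only
    def __init__(self, noise_std=0.1):
        super().__init__()
        self.fc1 = torch.nn.Linear(3, 8)
        self.fc2 = torch.nn.Linear(8, 2)
        self.noise_std = noise_std  # assumed noise scale, not tuned

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        if self.training:
            # Add Gaussian noise to the hidden activations during training only
            x = x + self.noise_std * torch.randn_like(x)
        return self.fc2(x)

net = NoisyNet()
x = torch.rand(5, 3)

net.eval()   # noise is switched off in evaluation mode
print(torch.equal(net(x), net(x)))

net.train()  # noise is active in training mode
print(net(x).shape)
```

This kind of logic is awkward to express with `torch.nn.Sequential()` alone, which is one reason the class-based approach is more flexible.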

I'd like to turn our attention to architecture in PyTorch. Before I do that, because many of the concepts apply to tensor data that is 2D or higher, we'll take a brief foray into `torchvision` and image processing so I can explain how we represent images as tensors.

# Working with Image data

Pytorch has a bunch of useful utilities for working with image data under the `torchvision` module. 

Let's look at some example image datasets and how to use some of these utilities in practice. This will put us in a better position to discuss architecture.

First we'll need to import a bunch of modules:

In [0]:
# imports
import torch
import numpy as np
import matplotlib.pyplot as plt
import torchvision
from torchvision import transforms


We'll use one of the datasets built into torchvision called CIFAR10, a dataset of 60,000 32x32 pixel images, each belonging to one of ten classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck), that is described in detail [here](https://www.cs.toronto.edu/~kriz/cifar.html). (By the way, CIFAR stands for Canadian Institute for Advanced Research, though I always think of it as "Can It Fly And Run".)

We'll use `torchvision.datasets.CIFAR10()`, but we don't want to work with the raw image data alone, as it is delivered as a set of PIL (Python Imaging Library) image objects. We'll want to convert _each color channel of the image_ to a tensor and then normalize it so that the values all fall in the range [-1, 1].

To accomplish this, we'll use a tool from torchvision's transforms module, `transforms.Compose()`, which lets us chain a bunch of transformations together. We'll chain `transforms.ToTensor()` and `transforms.Normalize()`, which takes the mean and std for each of the three color channels. If you work with images, there are tons of useful image transformations in torchvision.

In [0]:
# Chain a bunch of transformations together using torchvision.transforms.Compose
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))]) # First make the input data a tensor, then apply a normalization to each of the 3 color channels of the image
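
To make the normalization concrete: `ToTensor()` scales pixel values to the range [0, 1], and `Normalize()` then computes `(x - mean) / std` for each channel. With a mean and std of 0.5, this maps [0, 1] onto [-1, 1]. The sketch below reproduces that arithmetic in plain `torch` on a made-up image, so it runs without downloading anything.

```python
import torch

# A fake 3-channel 32x32 "image" with values in [0, 1], like the output of ToTensor()
img = torch.rand(3, 32, 32)

mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)  # one mean per color channel
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)   # one std per color channel

normalized = (img - mean) / std     # the same per-channel formula that Normalize applies
restored = normalized * std + mean  # equivalent to normalized/2 + 0.5

print(normalized.min().item(), normalized.max().item())  # both lie within [-1, 1]
print(torch.allclose(restored, img))
```

The second line of arithmetic is why dividing by 2 and adding 0.5 undoes this particular normalization when displaying an image.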

With those transforms defined, we can actually grab the dataset and apply the chain of transformations all in one line of code. The method to download and load CIFAR10 also allows you to set the argument `train=True` if you are going to use the images to train a NN (as opposed to testing it). 

In [0]:
# Grab training data from one of the built-in datasets (CIFAR10)
trainset = torchvision.datasets.CIFAR10(root='./cifar10', train=True,
                                        download=True, transform=transform)

In [0]:
trainset

In [0]:
trainset.data.shape

In [0]:
trainset.classes

Just to see what one image looks like as a bunch of tensors:

In [0]:
exampleImage, exampleClassLabel = trainset[0]
print(exampleImage.shape) # 3 different 32x32 tensors (one for each color channel) -- Each of the 32x32 values represents the intensity of the color for that pixel 
print(exampleImage)
print(exampleClassLabel) # the class label; we have to look at trainset.classes to see what this means

We'll want some way to show one of these color channels, so we'll make a quick function to do this:

In [0]:
def imshow(img):
  img = img/2 + 0.5 # undo our normalization, just to show the image, because plt's imshow() expects numbers to be between (0,1)
  img = img.numpy() # plt's imshow() knows how to work with numpy arrays, not tensors, so we'll convert it first
  plt.imshow(img)


Now let's grab a random example item from our dataset. We know there are 50,000 ( `trainset.data.shape[0]` ) items, so we'll use `np.random.randint()` to grab an index in this range, then we'll display just one of the color channels using our custom `imshow()` function.

You should run this a few times, to get a feel for the data:

In [0]:
#imshow(exampleImage[0])

exItem = trainset[ np.random.randint(0,trainset.data.shape[0]) ] # note that exItem is a tuple, so exItem[0] is the image and exItem[1] is the class label index
colorChannel = 0 # You can change this to 1 or 2 if you'd like to see a different color channel
imshow(exItem[0][colorChannel])
print(trainset.classes[exItem[1]]) # remember trainset.classes is a list of the class labels, so this will translate the class label index (an int) into the class label


This same approach can be used to work with other image datasets, though the particulars (the number of pixels, color channels) may differ.

Now that you understand how images are represented as tensors, we can talk about the architecture.
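
Two tensor manipulations come up repeatedly when feeding images to a network or plotting them: flattening each image into a vector, and moving the color axis to the end. Here is a small sketch on a made-up batch (the batch size of 64 is an arbitrary choice):

```python
import torch

# A fake batch of 64 RGB images, laid out as (batch, color, height, width)
batch = torch.rand(64, 3, 32, 32)

# Flatten each image into one long vector so a Linear layer can consume it
flat = batch.view(batch.shape[0], -1)
print(flat.shape)  # torch.Size([64, 3072])

# Move the color axis last for one image, the layout plt.imshow() expects
one_image = batch[0].permute(1, 2, 0)
print(one_image.shape)  # torch.Size([32, 32, 3])
```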

In [0]:
import torch
import numpy as np
import matplotlib.pyplot as plt
import torchvision
from torchvision import transforms
from torch.utils.data import DataLoader

def imshow(img):
  if img.shape[0]==3: # it's probably (color,height,width) so make it (height,width,color) which is what plt.imshow() wants
    img = img.permute(1,2,0) # a tensor's transpose() swaps only two dims, so use permute() to reorder all three
  img = img/2 + 0.5 
  img = img.cpu().numpy()
  plt.imshow(img)

class ImgNet(torch.nn.Module):
  def __init__(self,sizeInput,sizeHiddenLayer1,sizeHiddenLayer2,sizeOutput):
    super().__init__()
    self.fc1 = torch.nn.Linear(sizeInput,sizeHiddenLayer1)
    self.relu1 = torch.nn.ReLU()
    self.fc2 = torch.nn.Linear(sizeHiddenLayer1,sizeHiddenLayer2)
    self.relu2 = torch.nn.ReLU()
    self.fc3 = torch.nn.Linear(sizeHiddenLayer2,sizeOutput)
    self.logsoftmax = torch.nn.LogSoftmax(dim=1)
  
  def forward(self,x):
    x = self.fc1(x)
    x = self.relu1(x)
    x = self.fc2(x)
    x = self.relu2(x)
    x = self.fc3(x)
    x = self.logsoftmax(x)
    return x

def fit(num_epochs, model, train_dl, loss_fn, opt):
  for epoch in range(num_epochs):
    running_loss=0
    for xb,yb in train_dl: 
      xb = xb.view(xb.shape[0],-1)
      xb = xb.to("cuda")
      yb = yb.to("cuda")
      opt.zero_grad()
      pred = model(xb)
      loss = loss_fn(pred, yb)
      loss.backward()
      opt.step()
      running_loss+=loss.item()
    print(f"Epoch {epoch} loss = {running_loss/len(train_dl)}")


In [0]:
# MNIST
transform_mnist = transforms.Compose( [transforms.ToTensor(), transforms.Normalize(mean=(0.5,), std=(0.5,)) ] )
trainset_mnist = torchvision.datasets.MNIST('./mnist', download=True, train=True, transform=transform_mnist)
testset_mnist = torchvision.datasets.MNIST('./mnist', download=True, train=False, transform=transform_mnist)

batch_size = 64
train_dl_mnist = DataLoader(trainset_mnist, batch_size=batch_size, shuffle=True)
test_dl_mnist = DataLoader(testset_mnist, batch_size=batch_size, shuffle=True)
imgnet_mnist = ImgNet(28*28,128,64,10).cuda()

loss_fn_mnist = torch.nn.functional.nll_loss
opt_mnist = torch.optim.SGD(imgnet_mnist.parameters(), lr=0.003, momentum=0.9) # where did I get these "magic numbers?"  Trial and error and voodoo.
fit(15, imgnet_mnist, train_dl_mnist, loss_fn_mnist, opt_mnist)
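
A note on the loss: `ImgNet` ends in `LogSoftmax`, which is why it is paired with `nll_loss` here. As a quick check on made-up data, that pairing gives the same value as applying `cross_entropy` directly to the raw scores. The tensors below are illustrative only.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)           # made-up raw scores: 4 examples, 10 classes
targets = torch.tensor([3, 0, 9, 1])  # made-up class indices

log_probs = F.log_softmax(logits, dim=1)
loss_a = F.nll_loss(log_probs, targets)    # log-probabilities + negative log-likelihood
loss_b = F.cross_entropy(logits, targets)  # the same loss computed from raw scores

print(torch.allclose(loss_a, loss_b))
```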

In [0]:
images, labels = next(iter(test_dl_mnist))
img = images[0].to("cuda")
label = labels[0].to("cuda")

with torch.no_grad():
  predLabel = torch.argmax(imgnet_mnist(img.view(1,-1))).item()

imshow(img.view(28,28))
print(f"Predicted label was: {predLabel}")
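
Looking at one prediction is a useful sanity check, but you'll usually want an overall accuracy on the test set. Here is a minimal sketch of how that might look. The helper `accuracy()` is hypothetical (it is not part of the code above), and it runs on stand-in random data and a stand-in linear model so that it works without a GPU or a download.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def accuracy(model, dl, device="cpu"):  # hypothetical helper, for illustration only
    correct, total = 0, 0
    model.eval()
    with torch.no_grad():
        for xb, yb in dl:
            xb = xb.view(xb.shape[0], -1).to(device)  # flatten each image, as in fit()
            yb = yb.to(device)
            pred = model(xb).argmax(dim=1)            # most likely class per example
            correct += (pred == yb).sum().item()
            total += yb.shape[0]
    return correct / total

# Stand-in data and model so the sketch runs on its own
fake_images = torch.rand(100, 1, 28, 28)
fake_labels = torch.randint(0, 10, (100,))
fake_dl = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=32)
fake_model = torch.nn.Linear(28 * 28, 10)

print(accuracy(fake_model, fake_dl))  # a fraction between 0 and 1
```

With the objects defined earlier, the same helper could presumably be called as `accuracy(imgnet_mnist, test_dl_mnist, device="cuda")`.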


In [0]:
# CIFAR10
transform_cifar = transforms.Compose( [ transforms.ToTensor(),transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)) ] )
trainset_cifar = torchvision.datasets.CIFAR10(root='./cifar10', train=True, download=True, transform=transform_cifar)
testset_cifar = torchvision.datasets.CIFAR10(root='./cifar10', train=False, download=True, transform=transform_cifar)

batch_size = 64
train_dl_cifar = DataLoader(trainset_cifar, batch_size=batch_size, shuffle=True)
test_dl_cifar = DataLoader(testset_cifar, batch_size=batch_size, shuffle=True)
imgnet_cifar = ImgNet(3*32*32,128,64,10).cuda()

loss_fn_cifar = torch.nn.functional.nll_loss
opt_cifar = torch.optim.SGD(imgnet_cifar.parameters(), lr=0.003, momentum=0.9) # where did I get these "magic numbers?"  Trial and error and voodoo.
fit(50, imgnet_cifar, train_dl_cifar, loss_fn_cifar, opt_cifar)


In [0]:
images, labels = next(iter(test_dl_cifar))
img = images[0].to("cuda")
label = labels[0].to("cuda")

with torch.no_grad():
  predLabel = torch.argmax(imgnet_cifar(img.view(1,-1))).item()

#imshow(img.permute(1,2,0)) # permute because matplotlib's imshow expects (width,height,color) but we have (color,width,height)
imshow(img)
print(f"Predicted label was: {testset_cifar.classes[predLabel]} ; Actual label was: {testset_cifar.classes[label]}")


# Architecture

Research into building new types of neural networks has advanced rapidly to solve a rich variety of machine learning problems in computer vision, natural language processing, and many other contexts. These advances have led to all sorts of new types of layers that have been implemented in PyTorch.

We'll look at the following concepts, the layers associated with them, and discuss the ideas behind them:
- Pooling
- BatchNorm
- Dropout
- Convolution
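
As a quick preview before the details, here is a sketch of what each of these layers does to the shape of a batch of images. The layer sizes (8 output channels, a 3x3 kernel, a 2x2 pool, a dropout probability of 0.5) are arbitrary choices for illustration, not recommendations.

```python
import torch

x = torch.rand(4, 3, 32, 32)  # fake batch: 4 RGB images of 32x32 pixels

conv = torch.nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
bn = torch.nn.BatchNorm2d(8)              # normalizes each of the 8 channels over the batch
pool = torch.nn.MaxPool2d(kernel_size=2)  # keeps the max of each 2x2 window
drop = torch.nn.Dropout(p=0.5)            # randomly zeroes elements during training

h_conv = conv(x)
print(h_conv.shape)  # torch.Size([4, 8, 32, 32]): more channels, same height and width
h_bn = bn(h_conv)
print(h_bn.shape)    # torch.Size([4, 8, 32, 32]): shape unchanged
h_pool = pool(h_bn)
print(h_pool.shape)  # torch.Size([4, 8, 16, 16]): height and width halved
h_drop = drop(h_pool)
print(h_drop.shape)  # torch.Size([4, 8, 16, 16]): shape unchanged
```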



## Pooling

# Example: Recognizing handwritten digits with a 