
# Convolutional Model

### Introduction

In this lesson, we'll work through constructing a convolutional neural network in PyTorch and understanding the transformations that occur in its different layers.  We'll start by interpreting a premade neural network, and then we'll move on to constructing our own neural network.  Let's get started.

### Loading our Data

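To begin, we'll create a `cuda` device object.  Creating the device does not move anything by itself: tensors and the model must be moved to it (for example with `.to(device)`) before calculations run on the GPU and speed up training.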

In [None]:
import torch
torch.device("cuda")

device(type='cuda')
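As a minimal sketch, assuming we also want the code to run on machines without a GPU, we can fall back to the CPU and then move a tensor onto the chosen device.  The tensor `x` here is just a placeholder, not part of the lesson's data.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Creating the device object does not move any data by itself;
# tensors (and models) are moved explicitly with .to(device)
x = torch.zeros(2, 3).to(device)
print(x.device.type)
```

A model would be moved the same way, with a call like `net.to(device)`, and each batch would need to be on the same device as the model.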

Let's begin by loading our Fashion MNIST dataset.

Import the `datasets` and `transforms` modules from the `torchvision` package.  Then load the FashionMNIST dataset, making sure to apply the `ToTensor` transformation.

In [None]:
from torchvision import transforms, datasets

In [None]:
train = datasets.FashionMNIST("", train = True, download = True, 
                       transform = transforms.Compose([transforms.ToTensor()]))

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to FashionMNIST/raw/train-images-idx3-ubyte.gz
Extracting FashionMNIST/raw/train-images-idx3-ubyte.gz to FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to FashionMNIST/raw/train-labels-idx1-ubyte.gz
Extracting FashionMNIST/raw/train-labels-idx1-ubyte.gz to FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to FashionMNIST/raw/t10k-images-idx3-ubyte.gz
Extracting FashionMNIST/raw/t10k-images-idx3-ubyte.gz to FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to FashionMNIST/raw
Processing...
Done!


In [None]:
test = datasets.FashionMNIST("", train = False, download = True, 
                       transform = transforms.Compose([transforms.ToTensor()]))

Now let's `reshape` the training data so that each batch has $100$ images inside of it, and each image has one channel with a $28 \times 28$ grid of pixels.

In [None]:
# train.data
X_train_reshaped = train.data.reshape(600, 100, 1, 28, 28)

In [None]:
X_train_reshaped.shape
# torch.Size([600, 100, 1, 28, 28])

torch.Size([600, 100, 1, 28, 28])
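To see what this reshape does, here's a small sketch on placeholder data rather than the real dataset: ten fake 28x28 images grouped into two batches of five.  One thing to keep in mind is that `train.data` holds the raw `uint8` pixels (0 to 255), since the `ToTensor` transform is only applied when we index the dataset itself, so the sketch also shows an optional rescale to the 0 to 1 range.

```python
import torch

# Placeholder stand-in for train.data: 10 images of 28x28 raw uint8 pixels
fake_data = (torch.arange(10 * 28 * 28) % 256).to(torch.uint8).reshape(10, 28, 28)

# Group into 2 batches of 5 images, adding a channel dimension of 1
batches = fake_data.reshape(2, 5, 1, 28, 28)
print(batches.shape)

# Order is preserved: the first image of the second batch
# is the sixth image of the original data
same = torch.equal(batches[1, 0, 0], fake_data[5])

# Optional: rescale to floats in [0, 1], as ToTensor would
scaled = batches.float() / 255
```

Because the reshape keeps images in order, the targets can be batched with a matching `reshape` and will still line up with their images.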

Then reshape the training targets into batches of 100.

In [None]:
# train.targets
y_reshaped = train.targets.reshape(600, 100)

In [None]:
y_reshaped.shape
# torch.Size([600, 100])

torch.Size([600, 100])

Finally, we'll zip together the reshaped training features and targets, so that each element pairs a batch of images with its batch of labels.

In [None]:
combined = list(zip(X_train_reshaped, y_reshaped))

### Initializing a Neural Network

Now this [blog post](https://towardsdatascience.com/build-a-fashion-mnist-cnn-pytorch-style-efb297e22582) did the work of defining a neural network for us. 

It will be our task to unpack and interpret what's occurring in the different layers.

Ok, let's see it.

In [None]:
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
  def __init__(self):
    super().__init__()
    self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
    self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
    self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
    self.fc2 = nn.Linear(in_features=120, out_features=60)
    self.out = nn.Linear(in_features=60, out_features=10)

  # define forward function
  def forward(self, t):
    t = self.conv1(t)
    t = F.relu(t)
    t = F.max_pool2d(t, kernel_size=2, stride=2)
    # conv 2
    t = self.conv2(t)
    t = F.relu(t)
    t = F.max_pool2d(t, kernel_size=2, stride=2)
    # fc1
    t = t.reshape(-1, 12*4*4)
    t = self.fc1(t)
    t = F.relu(t)
    # fc2
    t = self.fc2(t)
    t = F.relu(t)
    # output
    t = self.out(t)
    return F.log_softmax(t, dim = 1)

> Now let's initialize the neural network and pass through a batch of data.

Initialize the neural network.

In [None]:
net = Net()
net

# Net(
#   (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
#   (conv2): Conv2d(6, 12, kernel_size=(5, 5), stride=(1, 1))
#   (fc1): Linear(in_features=192, out_features=120, bias=True)
#   (fc2): Linear(in_features=120, out_features=60, bias=True)
#   (out): Linear(in_features=60, out_features=10, bias=True)
# )

Net(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 12, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=192, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=60, bias=True)
  (out): Linear(in_features=60, out_features=10, bias=True)
)

And select the first batch of data.

In [None]:
first_batch = X_train_reshaped[0]

first_batch.shape

# torch.Size([100, 1, 28, 28])

torch.Size([100, 1, 28, 28])

In [None]:
import torch.nn as nn
conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=3)

Ok, now let's break down what occurs at each layer.  To do so, let's select the first convolutional layer, `net.conv1` from our neural network and pass through some data.

In [None]:
conv_outputs = net.conv1(first_batch.float())
conv_outputs.shape

# torch.Size([100, 6, 24, 24])

torch.Size([100, 6, 24, 24])

Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
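To see where the 24 comes from, here's a rough sketch of a single convolution written in plain Python, assuming a stride of 1 and no padding as in `conv1`.  The image and kernel below are placeholders filled with ones, not the network's learned weights, and `conv2d_valid` is a hypothetical helper name.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists) with stride 1 and no padding."""
    k = len(kernel)
    out_size = len(image) - k + 1  # the kernel fits in this many positions
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(out_size)
        ]
        for r in range(out_size)
    ]

image = [[1.0] * 28 for _ in range(28)]   # placeholder 28x28 image of ones
kernel = [[1.0] * 5 for _ in range(5)]    # placeholder 5x5 kernel of ones

feature_map = conv2d_valid(image, kernel)
print(len(feature_map), len(feature_map[0]))  # 24 24
```

A 5x5 kernel can only be placed at 28 - 5 + 1 = 24 positions along each side, and `conv1` repeats this with 6 different kernels, which gives the 6 output channels.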

> This data gets passed to a relu function so call `net.conv1` again, and this time pass the outputs of `conv1` to the `F.relu` function.

In [None]:
output_relu_1 = F.relu(net.conv1(first_batch.float()))

In [None]:
output_relu_1.shape

torch.Size([100, 6, 24, 24])

> Next pass `output_relu_1` to a `max_pool2d` function with a `kernel_size` of 2 to see the change in outputs.

In [None]:
max_pool = nn.MaxPool2d(2)
output_maxpool_1 = max_pool(output_relu_1)
output_maxpool_1.shape

# torch.Size([100, 6, 12, 12])

torch.Size([100, 6, 12, 12])
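As a small illustration of why pooling halves each side, here's a sketch of max pooling in plain Python on a made-up 4x4 grid.  The function `max_pool_2d` is a hypothetical helper, assuming a square window whose stride equals its size, as with `nn.MaxPool2d(2)`.

```python
def max_pool_2d(matrix, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    out = len(matrix) // size
    return [
        [
            max(matrix[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out)
        ]
        for r in range(out)
    ]

grid = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 1, 8],
]

pooled = max_pool_2d(grid)
print(pooled)  # [[4, 5], [3, 8]]
```

Each 2x2 block collapses to a single number, so a 4x4 grid becomes 2x2, in the same way that each 24x24 feature map becomes 12x12.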

We can see that this reduced each feature map from a 24x24 matrix to a 12x12 matrix.  Now that we've finished with the `conv1` layer followed by a relu and a maxpool, let's move on to the `conv2` sequence.

```python
class Net(nn.Module):
  def __init__(self):
    super().__init__()
    self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
    self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
    self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
    self.fc2 = nn.Linear(in_features=120, out_features=60)
    self.out = nn.Linear(in_features=60, out_features=10)

  # define forward function
  def forward(self, t):
    t = self.conv1(t)
    t = F.relu(t)
    t = F.max_pool2d(t, kernel_size=2, stride=2)
    # conv 2
    t = self.conv2(t)
    t = F.relu(t)
    t = F.max_pool2d(t, kernel_size=2, stride=2)
    # fc1
    t = t.reshape(-1, 12*4*4)
    t = self.fc1(t)
    t = F.relu(t)
    # fc2
    t = self.fc2(t)
    t = F.relu(t)
    # output
    t = self.out(t)
    return F.log_softmax(t, dim = 1)
```

Ok, so let's get a sense of what occurs in the next convolutional layer.
```python
self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
```

We again have a kernel size of 5, this time applied to $6$ input channels, each a 12x12 matrix, which we get from the preceding maxpool.

Now if we look at how this convolutional layer is used in the forward function, we can begin to calculate the output shape that comes next.

In [None]:
t = net.conv2(output_maxpool_1)
t = F.relu(t)
t = F.max_pool2d(t, kernel_size=2, stride=2)

So we start with the output from the previous sequence.

In [None]:
output_maxpool_1.shape

torch.Size([100, 6, 12, 12])

Then, the `conv2` layer is executed, passed through a relu, followed by a maxpool of 2.  

* `conv2 > relu > maxpool`

Without using PyTorch, calculate the resulting output.  Remember that our formula for calculating the output shape is:

$output = \frac{i - k + 2p}{s} + 1$ 

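First assign the numbers that we'll need for the convolution step.  Note that `conv2` uses the default stride of 1 and no padding; the stride of 2 belongs to the maxpool that follows.

In [None]:
i = 12
k = 5
p = 0
s = 1

> Then translate the formula above into code to predict the dimensions of the output, applying it once for the convolution and once for the maxpool.

In [None]:
def olp(i,k,p,s):
  # floor division handles the rounding down
  return (i-k+2*p)//s+1
conv_size = olp(i,k,p,s)   # 8 after conv2
olp(conv_size, 2, 0, 2)    # maxpool with a kernel of 2 and a stride of 2
#  4

4

So we get $12$ output channels, each with a 4x4 matrix.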

In [None]:
F.max_pool2d(F.relu(net.conv2(output_maxpool_1)), 2).shape

torch.Size([100, 12, 4, 4])
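The same formula can trace the height and width through both sequences of the network.  This is a sketch in plain Python, assuming the strides shown in the network printout (1 for the convolutions) and in the `forward` function (2 for the maxpools), with no padding anywhere; `output_size` is a hypothetical helper name.

```python
def output_size(i, k, p=0, s=1):
    """Output side length for input size i, kernel k, padding p, stride s."""
    return (i - k + 2 * p) // s + 1

layers = [
    ("conv1", 5, 1),  # (name, kernel size, stride)
    ("pool1", 2, 2),
    ("conv2", 5, 1),
    ("pool2", 2, 2),
]

size = 28
trace = [size]
for name, k, s in layers:
    size = output_size(size, k, s=s)
    trace.append(size)

print(trace)  # [28, 24, 12, 8, 4]
```

The final 4 is the side length of each of the 12 feature maps, which matches the shape PyTorch reports.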

* Transitioning to linear layers

If we keep going, we see that after our two sequences of `conv > relu > maxpool`, the next step is to pass our output to a linear layer.  To do so, we need to take our output of 12 channels, each a 4x4 matrix, and translate it to a vector of length $12*4*4 = 192$ features that we pass to a linear layer.  This explains the following line in the `forward` function.

```python
t = t.reshape(-1, 12*4*4)
```

In [None]:
12*4*4

192
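Here's a brief sketch of that reshape on a placeholder tensor with the same shape as the output of the second maxpool.  The values are just consecutive integers so that we can check where each entry ends up.

```python
import torch

# Placeholder batch shaped like the output of the second maxpool
t = torch.arange(100 * 12 * 4 * 4).reshape(100, 12, 4, 4)

# -1 lets PyTorch infer the batch dimension (100 here)
flat = t.reshape(-1, 12 * 4 * 4)
print(flat.shape)

# Values are laid out channel by channel, then row by row
n, c, r, col = 3, 7, 2, 1
matches = bool(flat[n, c * 16 + r * 4 + col] == t[n, c, r, col])
```

Each image keeps its own row of 192 features, so the batch dimension is untouched while the channel, height, and width dimensions are merged.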

This vector of 192 features is then passed to a linear layer of 120 neurons, then to 60 neurons, and finally to 10 neurons, whose outputs go to the `log_softmax` function.
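To get a feel for what `log_softmax` returns, here's a rough version in plain Python applied to three made-up scores.  This is only an illustration of the math, not PyTorch's implementation.

```python
import math

def log_softmax(scores):
    """Log of the softmax probabilities, computed in a numerically stable way."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]

scores = [2.0, 1.0, 0.1]            # made-up outputs for three classes
log_probs = log_softmax(scores)

# Exponentiating recovers probabilities that sum to 1
probs = [math.exp(v) for v in log_probs]
print(round(sum(probs), 6))  # 1.0
```

The largest score still has the largest log probability, which is why we can later take the argmax of the network's output to get the predicted class.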

### Performing Training

Ok, now let's train our neural network.

> Initialize the Adam optimizer, passing through the parameters and the learning rate.

In [None]:
import torch.optim as optim
optimizer  = optim.Adam(net.parameters(), lr=0.0005)
optimizer 
x_loss = nn.CrossEntropyLoss()

And fill in the middle part of the training loop.

In [None]:
for epoch in range(15):
    for X_batch, y_batch in combined:
        net.zero_grad()
        #X_reshaped = X_batch.view(-1,28*28)
        prediction_batch = net(X_batch.float())
        loss = x_loss(prediction_batch, y_batch) 
        loss.backward()  
        optimizer.step()
        
    print(loss)

tensor(0.4872, grad_fn=<NllLossBackward>)
tensor(0.4571, grad_fn=<NllLossBackward>)
tensor(0.3595, grad_fn=<NllLossBackward>)
tensor(0.2545, grad_fn=<NllLossBackward>)
tensor(0.2108, grad_fn=<NllLossBackward>)
tensor(0.2141, grad_fn=<NllLossBackward>)
tensor(0.1989, grad_fn=<NllLossBackward>)
tensor(0.2251, grad_fn=<NllLossBackward>)
tensor(0.2133, grad_fn=<NllLossBackward>)
tensor(0.1762, grad_fn=<NllLossBackward>)
tensor(0.1915, grad_fn=<NllLossBackward>)
tensor(0.1549, grad_fn=<NllLossBackward>)
tensor(0.1690, grad_fn=<NllLossBackward>)
tensor(0.1472, grad_fn=<NllLossBackward>)
tensor(0.1467, grad_fn=<NllLossBackward>)
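Note that the loss printed above is the loss of the last batch in each epoch, which can be noisy.  As a sketch of one alternative, here's a loop that reports the mean loss over all batches.  It assumes a tiny stand-in model and random placeholder data so that it runs on its own; `model`, `batches`, and `epoch_means` are hypothetical names.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

model = nn.Linear(4, 3)  # hypothetical stand-in for `net`
# Placeholder data: 5 batches of 8 observations with 4 features and 3 classes
batches = [(torch.randn(8, 4), torch.randint(0, 3, (8,))) for _ in range(5)]

optimizer = optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

epoch_means = []
for epoch in range(3):
    total = 0.0
    for X_batch, y_batch in batches:
        optimizer.zero_grad()                   # clear gradients from the last step
        loss = loss_fn(model(X_batch), y_batch)
        loss.backward()                         # compute gradients
        optimizer.step()                        # update the weights
        total += loss.item()                    # accumulate as a plain float
    epoch_means.append(total / len(batches))

print(epoch_means)
```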


Ok, now that we have trained our neural network, let's see how it performed.

> The target test data is already in an adequate format.

In [None]:
test_y = test.targets
test_y[:5]

tensor([9, 2, 1, 1, 6])

But the feature data is not.

In [None]:
test_X = test.data
test_X.shape

torch.Size([10000, 28, 28])

Reshape the `test_X` data so that each observation has one channel holding a 28x28 grid of pixels.

In [None]:
test_X_reshaped = test.data.reshape(10000, 1, 28, 28)

In [None]:
test_X_reshaped.shape

# torch.Size([10000, 1, 28, 28])

torch.Size([10000, 1, 28, 28])

Now get a set of predictions of the test data from the neural network, find the argmax of each prediction, and use sklearn to compute the accuracy.

In [None]:
predictions = net(test_X_reshaped.float())
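Since we only need predictions here and not gradients, one option is to wrap the forward pass in `torch.no_grad()`, which avoids building the autograd graph for all 10,000 test images.  The sketch below assumes a small stand-in model and random placeholder inputs so that it is self-contained.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 3)        # hypothetical stand-in for the trained `net`
inputs = torch.randn(10, 4)    # placeholder test features

model.eval()                   # put layers such as dropout in inference mode
with torch.no_grad():          # skip gradient tracking during prediction
    outputs = model(inputs)

labels = outputs.argmax(dim=1) # index of the highest score in each row
print(labels.shape)
```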

In [None]:
import torch
max_predictions = torch.argmax(predictions, dim = 1)

In [None]:
from sklearn.metrics import accuracy_score

accuracy_score(test.targets, max_predictions )

0.8684
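The accuracy score is simply the fraction of predictions that match their targets.  Here's a quick sketch in plain Python, using the first five test targets shown earlier and a made-up set of predictions; `accuracy` is a hypothetical helper, not the sklearn function.

```python
def accuracy(targets, predicted):
    """Fraction of predictions that equal their targets."""
    correct = sum(1 for t, p in zip(targets, predicted) if t == p)
    return correct / len(targets)

targets = [9, 2, 1, 1, 6]      # first five test targets from above
predicted = [9, 2, 1, 0, 6]    # made-up predictions with one mistake

print(accuracy(targets, predicted))  # 0.8
```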

### Summary

In this lesson, we learned how to read through and work with a convolutional neural network in PyTorch.  We saw that we can access convolutional layers and pass data through them to find the output returned from each layer.

We can also use our formula to predict how the data will change shape as it passes through the layers.

$output = \frac{i - k + 2p}{s} + 1$ 

We saw that when passing data from a convolutional layer to a linear layer, we needed to transform our data to be in the form of vectors.

```python
F.max_pool2d(F.relu(net.conv2(output_maxpool_1)), 2).shape
# torch.Size([100, 12, 4, 4])

t = t.reshape(-1, 12*4*4)
```

Then we continued on with our training loop and assessing our neural network.

### Resources

[Fashion MNIST](https://towardsdatascience.com/build-a-fashion-mnist-cnn-pytorch-style-efb297e22582)