# Training a Convolutional Neural Network

In this exercise, you will create a CNN model and train it on the CIFAR10 dataset. The data loading, training, and testing logic are already included in your code. In fact, they are the same as for the feed-forward neural network you built in the previous exercises.

Here are the steps you need to do to complete this exercise:

1. In the Starter Code below, finish the `Model()` class. It should define the layers of your model in the `__init__()` function and the model execution in the `forward()` function.
2. Add a cost function and an optimizer. You can use the same cost function and optimizer as in the previous exercise.
3. Run the cells to make sure that the model is training properly.

If you get stuck, you can look at the solution by clicking the Jupyter symbol at the top left and navigating to `training_a_cnn_solution.ipynb`.

## Try It Out!
- Play around with the number of layers and filters in your model. How does the accuracy change? How long does it take to train the model?
- Try training your model with other types of convolutional layers, such as depthwise separable convolutions.
- Can you create the same network in TensorFlow as well?
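
For the depthwise separable suggestion above, one possible sketch in PyTorch pairs a per-channel `nn.Conv2d` (using `groups=in_channels`) with a 1x1 pointwise convolution. The class name `DepthwiseSeparableConv2d`, the helper `count_parameters`, and the channel sizes are illustrative assumptions, not part of the starter code.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableConv2d(nn.Module):
    """Hypothetical helper: depthwise conv followed by a 1x1 pointwise conv."""

    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()
        # groups=in_channels gives each input channel its own spatial filter
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size, groups=in_channels)
        # The 1x1 convolution mixes information across channels
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


def count_parameters(module):
    """Total number of weights and biases in a module."""
    return sum(p.numel() for p in module.parameters())


standard = nn.Conv2d(6, 16, 5)
separable = DepthwiseSeparableConv2d(6, 16, 5)

x = torch.randn(1, 6, 14, 14)  # dummy batch: (N, C, H, W)
print(standard(x).shape, separable(x).shape)
print(count_parameters(standard), count_parameters(separable))
```

Both layers produce outputs of the same shape, but the separable version uses far fewer parameters, which is usually the reason to try it.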


## Package Installations
**NOTE**: Every time you start the GPU, run this before your code.

In [None]:
#!pip install ipywidgets
#!pip list

In [None]:
#from IPython.core.display import HTML

#HTML("<script>Jupyter.notebook.kernel.restart()</script>")

## Starter Code

**Remember** to DISABLE the GPU when you are not training.

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision
from torchvision import transforms

## Download and load data

In [None]:
batch_size = 10

# Define data transformations
training_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])

testing_transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])

# Download data
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=training_transform)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=testing_transform)

# Instantiate data loaders
train_loader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False)

### Create a CNN Class

In [None]:

# Create a Model subclass from nn.Module
class Model(nn.Module):
    def __init__(self):
        super().__init__()  # In Python 3, the zero-argument super() call is enough
        self.conv1 = nn.Conv2d(3, 6, 5)  # 3 input channels, 6 output channels (one per kernel), 5x5 kernels
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # The transforms resize images to 224x224, so the spatial size goes
        # 224 -> 220 (conv1) -> 110 (pool) -> 106 (conv2) -> 53 (pool)
        self.fc1 = nn.Linear(16 * 53 * 53, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)  # 10 outputs, one per CIFAR10 class

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x))) 
        # print(x.shape)  # uncomment to inspect the feature map size (shape is an attribute, not a method)
        x = torch.flatten(x, 1)  # Flatten all dimensions except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)  # Return raw logits; a loss such as nn.CrossEntropyLoss applies log-softmax itself

        return x
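
The `in_features` of the first linear layer has to equal the number of values left after the last pooling step, and that number depends on the input image size. Below is a small sketch of the arithmetic, assuming square inputs, no padding, and the kernel sizes used in `Model`. The helper names `conv_output_size` and `flattened_features` are illustrative.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (square inputs, no dilation)."""
    return (size + 2 * padding - kernel_size) // stride + 1


def flattened_features(input_size, out_channels=16):
    """Trace conv1 -> pool -> conv2 -> pool for the layer sizes used in Model."""
    size = conv_output_size(input_size, kernel_size=5)       # conv1: 5x5 kernel
    size = conv_output_size(size, kernel_size=2, stride=2)   # 2x2 max pool
    size = conv_output_size(size, kernel_size=5)             # conv2: 5x5 kernel
    size = conv_output_size(size, kernel_size=2, stride=2)   # 2x2 max pool
    return size, out_channels * size * size


for input_size in (32, 224):
    size, features = flattened_features(input_size)
    print(f"{input_size}x{input_size} input -> {size}x{size} maps -> {features} features")
```

A native 32x32 CIFAR10 image leaves 16 * 5 * 5 = 400 features, while an image resized to 224x224 leaves 16 * 53 * 53 = 44944, so the linear layer and the `Resize` transform have to agree.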

In [None]:
def train(model, train_loader, loss_fn, optimizer, epochs):
    model.train()
    for e in range(1, epochs + 1):  # run exactly `epochs` epochs
        running_loss = 0.0
        correct = 0
        for data, target in train_loader:
            optimizer.zero_grad()
            # NOTE: We do not flatten the data here, because convolutional layers
            # expect image batches of shape (batch, channels, height, width)
            pred = model(data)
            loss = loss_fn(pred, target)
            running_loss += loss.item()  # .item() keeps the autograd graph from being retained
            loss.backward()
            optimizer.step()
            pred = pred.argmax(dim=1, keepdim=True)  # get the predicted class
            correct += pred.eq(target.view_as(pred)).sum().item()  # count the number of correct predictions
            # Accuracy = correct predictions / all predictions
        # loss_fn returns a per-batch mean, so average over the number of batches
        print(f"Epoch {e}: Loss {running_loss/len(train_loader)}, Accuracy {100*(correct/len(train_loader.dataset))}%")

def test(model, test_loader):
    model.eval()
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            # NOTE: We do not flatten the data here, because convolutional layers
            # expect image batches of shape (batch, channels, height, width)
            output = model(data)
            pred = output.argmax(dim=1, keepdim=True)  # index of the largest output score (the predicted class)
            correct += pred.eq(target.view_as(pred)).sum().item()  # count the number of correct predictions

    print(f'Test set: Accuracy: {correct}/{len(test_loader.dataset)} = {100*(correct/len(test_loader.dataset))}%')
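
Both loops count correct predictions by taking the argmax over the class dimension and comparing it with the targets. A minimal sketch of that step on hand-made values (the tensors below are dummy data chosen for illustration, not real model outputs):

```python
import torch

# Dummy scores for 3 samples and 4 classes
output = torch.tensor([[0.1, 2.0, -1.0, 0.3],
                       [1.5, 0.2, 0.1, 0.0],
                       [0.0, 0.1, 0.2, 3.0]])
target = torch.tensor([1, 2, 3])

pred = output.argmax(dim=1, keepdim=True)              # shape (3, 1): predicted class per sample
correct = pred.eq(target.view_as(pred)).sum().item()   # target is reshaped to (3, 1) before comparing
accuracy = 100 * correct / len(target)
print(pred.squeeze(1).tolist(), correct, accuracy)
```

Here the second sample is misclassified (predicted class 0, target 2), so 2 of the 3 predictions count as correct.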

In [None]:
model = Model()

### Instantiate and train the model

In [None]:
# Set the loss function and optimizer
lr = 0.01
loss_fn = nn.CrossEntropyLoss()  # takes raw logits, which is what Model.forward() returns
optimizer = optim.Adam(model.parameters(), lr=lr)

# Set training hyperparameters
epochs = 5
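
The choice of loss function has to match what `forward()` returns: `nn.NLLLoss` expects log-probabilities, while `nn.CrossEntropyLoss` takes raw logits and applies log-softmax internally. The sketch below uses the functional equivalents on dummy tensors (the values are random and purely illustrative) to show that the two agree only when log-softmax is applied first.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # dummy raw model outputs: 4 samples, 10 classes
targets = torch.tensor([1, 0, 9, 3])   # dummy class labels

ce = F.cross_entropy(logits, targets)
nll_on_log_probs = F.nll_loss(F.log_softmax(logits, dim=1), targets)
nll_on_logits = F.nll_loss(logits, targets)  # raw logits are not log-probabilities

print(ce.item(), nll_on_log_probs.item(), nll_on_logits.item())
```

The first two values match. The third does not, and it can even be negative, which is a common symptom of pairing `nn.NLLLoss` with a model that outputs raw logits.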

In [None]:
train(model,
      train_loader,
      loss_fn,
      optimizer,
      epochs)

In [None]:
test(model,
     test_loader)