# PyTorch Assignment: Convolutional Neural Network (CNN)

**[Duke Community Standard](http://integrity.duke.edu/standard.html): By typing your name below, you are certifying that you have adhered to the Duke Community Standard in completing this assignment.**

Name: Manuel Jesús Corbacho Sánchez

### Convolutional Neural Network

Adapt the CNN example for MNIST digit classification from Notebook 3A. 
Feel free to play around with the model architecture and see how the training time/performance changes, but to begin, try the following:

Image ->  
convolution (32 3x3 filters) -> nonlinearity (ReLU) ->  
convolution (32 3x3 filters) -> nonlinearity (ReLU) -> (2x2 max pool) ->  
convolution (64 3x3 filters) -> nonlinearity (ReLU) ->  
convolution (64 3x3 filters) -> nonlinearity (ReLU) -> (2x2 max pool) -> flatten ->  
fully connected (256 hidden units) -> nonlinearity (ReLU) ->  
fully connected (10 output units) -> softmax 
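
The spatial size after each layer can be checked by hand before building the model. The sketch below assumes 28x28 MNIST inputs and unpadded 3x3 convolutions with stride 1; the helper names `out_size` and `flattened_features` are illustrative only. It applies the usual output-size rule, floor((n - k) / s) + 1, to each layer in turn.

```python
def out_size(n, kernel, stride=1):
    """Output length along one spatial dimension for an unpadded conv or pool."""
    return (n - kernel) // stride + 1


def flattened_features(pool_stride, n=28, channels=64):
    """Trace the layer stack above and return the flattened feature count."""
    n = out_size(n, 3)                 # conv 1: 28 -> 26
    n = out_size(n, 3)                 # conv 2: 26 -> 24
    n = out_size(n, 2, pool_stride)    # first 2x2 max pool
    n = out_size(n, 3)                 # conv 3
    n = out_size(n, 3)                 # conv 4
    n = out_size(n, 2, pool_stride)    # second 2x2 max pool
    return channels * n * n


print(flattened_features(pool_stride=2))  # 1024
print(flattened_features(pool_stride=1))  # 20736
```

The pooling stride decides how many inputs the first fully connected layer receives: 1024 features with a stride of 2, versus 20736 with a stride of 1.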

Note: The CNN model might take a while to train. Depending on your machine, you might expect this to take up to half an hour. If you see your validation performance start to plateau, you can kill the training.
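
Instead of stopping the run by hand, the plateau check can be automated. The following is a minimal sketch, assuming validation accuracy is measured once per epoch; the class name `PlateauStopper`, its `patience` and `min_delta` parameters, and the accuracy values are all hypothetical and not part of the assignment.

```python
class PlateauStopper:
    """Signal a stop when validation accuracy has not improved for `patience` checks."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience      # number of non-improving checks tolerated
        self.min_delta = min_delta    # minimum gain that counts as an improvement
        self.best = float("-inf")
        self.stale = 0

    def should_stop(self, val_accuracy):
        if val_accuracy > self.best + self.min_delta:
            self.best = val_accuracy
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience


stopper = PlateauStopper(patience=2)
history = [0.970, 0.981, 0.980, 0.981, 0.979]  # made-up accuracies for illustration
for epoch, acc in enumerate(history):
    if stopper.should_stop(acc):
        print(f"stopping after epoch {epoch}")  # stopping after epoch 3
        break
```

In a real run, `should_stop` would be called at the end of each epoch with the accuracy measured on a held-out validation split.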

In [3]:
### YOUR CODE HERE ###
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, transforms
from tqdm.notebook import tqdm, trange

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Training hyperparameters
epochs = 5
learning_rate = 0.001
samples_per_batch = 100
stride_size = 1  # stride of the 2x2 max pool; 1 shrinks each side by one pixel, 2 would halve it

class MNIST_CNN_assignment(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d( 1, 32, kernel_size=3)
        self.conv2 = nn.Conv2d(32, 32, kernel_size=3)
        self.conv3 = nn.Conv2d(32, 64, kernel_size=3)
        self.conv4 = nn.Conv2d(64, 64, kernel_size=3)
        self.max_pool = nn.MaxPool2d(2, stride_size)
        # Infer the flattened feature size from a dummy 28x28 input so that fc1
        # exists before the optimizer is created. A layer first built inside
        # forward() would be missing from model.parameters() and never trained.
        with torch.no_grad():
            self.viewsize = self.features(torch.zeros(1, 1, 28, 28)).numel()
        self.fc1 = nn.Linear(self.viewsize, 256)
        self.fc2 = nn.Linear(256, 10)

    def features(self, x):
        x = F.relu(self.conv1(x))   # conv layer 1
        x = F.relu(self.conv2(x))   # conv layer 2
        x = self.max_pool(x)        # pool
        x = F.relu(self.conv3(x))   # conv layer 3
        x = F.relu(self.conv4(x))   # conv layer 4
        x = self.max_pool(x)        # pool
        return x

    def forward(self, x):
        x = self.features(x)
        x = F.relu(self.fc1(x.view(-1, self.viewsize)))  # flatten, then fc layer 1
        x = self.fc2(x)             # fc layer 2 (logits; CrossEntropyLoss applies the softmax)
        return x

    # Shapes for a batch of 100 with stride_size = 1:
    #   conv1 -> [100, 32, 26, 26]
    #   conv2 -> [100, 32, 24, 24]
    #   pool  -> [100, 32, 23, 23]
    #   conv3 -> [100, 64, 21, 21]
    #   conv4 -> [100, 64, 19, 19]
    #   pool  -> [100, 64, 18, 18]
    #   fc1   -> [100, 256]
    #   fc2   -> [100, 10]
    
# Load the data
mnist_train = datasets.MNIST(root="./datasets", train=True, transform=transforms.ToTensor(), download=True)
mnist_test = datasets.MNIST(root="./datasets", train=False, transform=transforms.ToTensor(), download=True)
train_loader = torch.utils.data.DataLoader(mnist_train, batch_size=samples_per_batch, shuffle=True)
test_loader = torch.utils.data.DataLoader(mnist_test, batch_size=samples_per_batch, shuffle=False)

## Training
# Instantiate model
model = MNIST_CNN_assignment().to(device)
# Loss and Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
for i in trange(epochs):
    # Iterate through train set minibatches
    for images, labels in tqdm(train_loader):        
        # Zero out the gradients
        optimizer.zero_grad()
        # Forward pass        
        x = images.to(device)
        y = model(x)
        loss = criterion(y, labels.to(device))
        # Backward pass
        loss.backward()
        optimizer.step()

## Testing
correct = 0
total = len(mnist_test)

with torch.no_grad():
    # Iterate through test set minibatches
    for images, labels in tqdm(test_loader):
        # Forward pass
        x = images.to(device)
        y = model(x)        
        predictions = torch.argmax(y, dim=1)
        # Accumulate a plain Python int so the final accuracy prints as a number, not a tensor
        correct += (predictions == labels.to(device)).sum().item()
print('Test accuracy: {}'.format(correct/total))
# Make sure to print out your accuracy on the test set at the end.

### Short answer

1\. How does the CNN compare in accuracy with yesterday's logistic regression and MLP models? How about training time?

`Logistic regression reached about 90% accuracy and the MLP 92-95%, while the CNN reaches about 98%.`
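
To compare training time as the question asks, wall-clock time can be measured around each model's training loop. This is a minimal sketch using the standard library; `timed` and `fake_training` are hypothetical names, and the stand-in workload would be replaced by the actual training loop for each model.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_training(steps):
    """Stand-in workload; in the notebook this would be one model's training loop."""
    total = 0
    for i in range(steps):
        total += i * i
    return total


result, seconds = timed(fake_training, 100_000)
print(f"finished in {seconds:.3f} s")
```

On a GPU, calling `torch.cuda.synchronize()` before reading the clock avoids under-counting, since CUDA kernels run asynchronously.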

2\. How many trainable parameters are there in the CNN you built for this assignment?

*Note: The total of trainable parameters counts each element in a tensor. For example, a weight matrix that is 10x5 has 50 trainable parameters.*

In [4]:
pytorch_total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(pytorch_total_params)

5376234


`Our model has 5,376,234 trainable parameters.`
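
The printed total can be cross-checked layer by layer. The sketch below assumes the first fully connected layer receives 64 x 18 x 18 = 20736 features, which is what the model produces with the pooling stride of 1 used in the code; the helper names `conv_params` and `linear_params` are illustrative.

```python
def conv_params(c_in, c_out, k):
    """Weights (c_out * c_in * k * k) plus one bias per output channel."""
    return c_out * c_in * k * k + c_out


def linear_params(n_in, n_out):
    """Weight matrix (n_out * n_in) plus one bias per output unit."""
    return n_out * n_in + n_out


layers = {
    "conv1": conv_params(1, 32, 3),             # 320
    "conv2": conv_params(32, 32, 3),            # 9248
    "conv3": conv_params(32, 64, 3),            # 18496
    "conv4": conv_params(64, 64, 3),            # 36928
    "fc1": linear_params(64 * 18 * 18, 256),    # 5308672
    "fc2": linear_params(256, 10),              # 2570
}

for name, count in layers.items():
    print(f"{name}: {count}")
total = sum(layers.values())
print(total)  # 5376234
```

Almost all of the parameters sit in the first fully connected layer, so its input size dominates the total.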

3\. When would you use a CNN versus a logistic regression model or an MLP?

`CNNs are mostly used for images, where they perform better than an MLP or logistic regression. A CNN should also be able to handle harder problems than those an MLP or logistic regression can solve.`