<a href="https://colab.research.google.com/github/jacekwachowiak/UCA-data-science-seminars/blob/master/Lab1_CNN_student.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab 1: Convolutional Neural Networks

In this first Lab, we discover a Deep Learning framework (PyTorch), which we use to create our very first CNN (LeNet) and perform handwritten digit recognition.

This Lab assumes that you are familiar with the Python language. If you're not, please do Lab 0 first: https://colab.research.google.com/drive/16XlCqmmUQvwBD3D5u0lOy1rUFfDDZdYi

![alt text](https://camo.githubusercontent.com/d440ac2eee1cb3ea33340a2c5f6f15a0878e9275/687474703a2f2f692e7974696d672e636f6d2f76692f3051493378675875422d512f687164656661756c742e6a7067)

# Introduction

## What's a CNN?

Convolutional Neural Networks (CNNs) are a subclass of Neural Networks that use Convolutional layers. Each of these layers slides small learned filters over its input, which works quite well for vision tasks.

![alt text](https://embarc.org/embarc_mli/doc/build/html/_images/image104.jpg)
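
To make the "sliding filter" idea concrete, here is a minimal pure-Python sketch. The function name `slide_filter` and the example values are made up for illustration; real frameworks implement the same operation far more efficiently (and, strictly speaking, what Deep Learning frameworks call a convolution is this cross-correlation, without flipping the filter).

```python
def slide_filter(image, kernel):
    """Slide `kernel` over `image` (both lists of rows) and return the result.

    No padding, stride 1: the output shrinks by (kernel size - 1) in each dimension.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the filter with the patch under it and sum the products
            total = sum(
                kernel[a][b] * image[i + a][j + b]
                for a in range(k_h)
                for b in range(k_w)
            )
            row.append(total)
        output.append(row)
    return output


# Hypothetical 3x3 "image" and 2x2 filter, chosen only for illustration
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]

filtered = slide_filter(image, kernel)
print(filtered)  # [[-4, -4], [-4, -4]]
```

In a CNN, the values inside the filter are not chosen by hand: they are the weights that training adjusts.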

## What's a framework and why are we using one (PyTorch)?

A programming framework is a collection of functions and utilities that is ready to use. Modern Deep Learning frameworks contain everything that is needed (layers, optimizers, losses, gradient computation...) to create and use neural networks, and make that really easy.

PyTorch, originally created by Facebook, is one of the most used frameworks, especially among researchers. The other most used frameworks are TensorFlow (created by Google) and Keras (an abstraction layer for multiple frameworks, including TensorFlow). PyTorch has gained a lot of popularity since its 1.0 release in 2018.

![alt text](https://miro.medium.com/max/1200/1*s_BwkYxpGv34vjOHi8tDzg.png)

## Why use Google Colab and GPUs?

Google Colab is a collaborative workspace based on Jupyter Notebook, that lets you use a Python environment on Google Cloud with GPUs, for FREE!

GPUs (Graphics Processing Units) are powerful chips that let you train and use neural networks much faster than CPUs. Having access to a GPU is very important for Deep Learning, as it can often make training more than 100x faster. It might not seem that huge at first, but state-of-the-art neural nets can take days to train on common datasets, even with multiple powerful GPUs. On CPUs, it would take **years**.

Colab gives you access to a free Nvidia Tesla K80 (most of the time), which is a 1000€ graphics card with 24GB of VRAM.

![alt text](https://gitcdn.xyz/cdn/Tony607/blog_statics/b9259799b5bf99efdf761fed8827ff28638c8599/images/colab/pytorch-colab.png)

## /!\ IMPORTANT: Use a GPU Runtime

To use a GPU in Colab, go to Runtime -> Change Runtime Type -> Hardware Accelerator -> GPU.

**Do this step before running any of the code below, otherwise you will have to run it again.**

## Importing libraries

For this lab, we only need PyTorch packages: "torch" and "torchvision".

"**torch**" contains the Deep Learning framework itself. "**torchvision**" contains datasets, pre-trained models, and image manipulation functions.

In [0]:
#Import Pytorch
import torch
import torchvision

#Little commonly used shortcut
import torch.nn as nn

#We need the display function from IPython for Jupyter Notebook/Colab
from IPython.display import display

#A package to make beautiful progress bars :) 
from tqdm.notebook import tqdm as tqdm_notebook

## Downloading the data

We are working with the MNIST dataset, which contains handwritten digits.

We download the data using torchvision. We get the data in the form of a PyTorch "Dataset" object, which is a class that abstracts data loading and that can be iterated like a list.
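
To illustrate what a "Dataset" object is, here is a minimal sketch of a custom one. The class `SquaresDataset` is a hypothetical toy example, not part of PyTorch or of this lab: a map-style Dataset only needs a length and a way to return the sample at a given index.

```python
import torch
from torch.utils.data import Dataset


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is the pair (i, i squared)."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        # Number of samples in the dataset
        return self.n

    def __getitem__(self, index):
        # Raising IndexError lets Python stop when iterating like a list
        if index < 0 or index >= self.n:
            raise IndexError(index)
        return torch.tensor(float(index)), torch.tensor(float(index ** 2))


squares = SquaresDataset(5)
print(len(squares))        # 5

x, y = squares[3]          # indexing works like a list
print(x.item(), y.item())  # 3.0 9.0

for x, y in squares:       # so does iteration
    print(x.item(), y.item())
```

The MNIST dataset used below works the same way, except that each sample is an (image, label) pair.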

In [0]:
mnist_dataset = torchvision.datasets.MNIST(".", train=True, download=True)

print(mnist_dataset[1])

We see that each element is a tuple containing a PIL Image (Python Imaging Library) and a label (0 here).
We can visualize PIL images in Colab using the **display** function.

In [0]:
display(mnist_dataset[128][0])

# Part 1: Looking at the data

In Data Science, it is very important to visualize the data we are working with, in order to understand it better and detect issues.

## Q1: Display 10 images with their label

Using a Python for loop and the **display** function, show 10 images from the dataset and their labels.

In [0]:
## YOUR CODE HERE

## Q2: Display 10 images from a specific class

Using a Python "for" loop and conditions (or list comprehensions), display 10 images from a specific class (4, for example).

In [0]:
## YOUR CODE HERE

## Q3: Count the number of elements in each class

Using a "for" loop, the "count" method and the "print" function, display the number of elements in each class of the MNIST dataset.

In [0]:
## YOUR CODE HERE

# Part 2: Creating a Neural Network with torch.nn

For our first CNN, we are going to implement LeNet5 (published in 1998), the best-known member of the LeNet family of networks, which was first used for handwritten digit classification in 1989 (http://yann.lecun.com/exdb/lenet/).

It is common to represent Neural Network architectures using charts such as this one:

![alt text](https://miro.medium.com/fit/c/1838/551/0*H9_eGAtkQXJXtkoK) 




## Q4: Implementing the network

All networks created with torch.nn are subclasses of nn.Module.

To create our network, we need to define two methods: the init method ("\_\_init\_\_") and the **forward** propagation method.



*   In the "**init**" function, we define the **parts** (layers) of the network we are going to use and store them as attributes 
*   In the "**forward**" function, we define the **order** of the layers by applying them to an input (define-by-run)




**Fill in the code below** so that it corresponds to the architecture represented above. Please note that **there is a "ReLU" activation after every convolutional and linear (fully connected) layer in this architecture, which is not pictured.**

The torch.nn documentation will probably be useful https://pytorch.org/docs/stable/nn.html
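
When choosing layer parameters, it helps to track how the image size changes through the network. The sketch below is only arithmetic: the helper `conv_output_size` is a hypothetical name, and it assumes square inputs, 5x5 convolutions without padding, and 2x2 pooling with stride 2, as in the standard LeNet5 description.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (square inputs)."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 28                                               # MNIST image: 28x28
size = size + 2 * 2                                     # zero padding of 2 on each side -> 32
size = conv_output_size(size, kernel_size=5)            # first 5x5 convolution -> 28
size = conv_output_size(size, kernel_size=2, stride=2)  # first 2x2 pooling     -> 14
size = conv_output_size(size, kernel_size=5)            # second 5x5 convolution -> 10
size = conv_output_size(size, kernel_size=2, stride=2)  # second 2x2 pooling     -> 5

# With 16 feature maps after the second convolution (as in LeNet5),
# flattening gives 16 * 5 * 5 values per image
flattened = 16 * size * size
print(size, flattened)  # 5 400
```

This is where the 400 input features of the first fully connected layer come from.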


In [0]:

class MyFirstNetwork(nn.Module):
  
  def __init__(self):
    super(MyFirstNetwork, self).__init__()
    
    ## MNIST images are 28x28 but LeNet5 expects 32x32
    ## -> we pad the images with zeroes
    self.padding = nn.ZeroPad2d(2)
    
    ## First convolution
    self.conv1 = nn.Conv2d(in_channels = 1, out_channels = 6, kernel_size = 5)
    
    ## Second convolution
    self.conv2 = ## YOUR CODE HERE
    
    ## Pooling (subsampling) layer
    self.maxpool = nn.MaxPool2d(2)
    
    ## Activation layer
    self.relu = nn.ReLU()
    
    ## Fully connected layers
    self.fc1 = nn.Linear(in_features = 400, out_features = 120)
    self.fc2 = ## YOUR CODE HERE
    self.output = ## YOUR CODE HERE
    
    ## Final activation layer
    self.softmax = nn.LogSoftmax(dim=1)
    
  def forward(self, x):
    
    ## Pad the input
    x = self.padding(x)
    
    ## First convolution + activation
    x = self.conv1(x)
    x = self.relu(x)
    
    ## First pooling
    x = self.maxpool(x)
    
    ## Second Convolution + activation
    
    ## YOUR CODE HERE
    
    ## Second Pooling
    
    ## YOUR CODE HERE
    
    ## "Flatten" the output to make it 1D
    x = x.view(-1, 16*5*5)
    
    ## First full connection
    x = self.fc1(x)
    x = self.relu(x)
    
    ## Second full connection
    
    ## YOUR CODE HERE
    
    ## Output layer
    x = self.output(x)
    y = self.softmax(x)
    
    return y

### Testing our implementation

To check that the network is working and possibly correct mistakes, we apply the forward method to an image from the MNIST dataset. 

**Check that your network is working by running the code below.**

At any point in the forward pass, you can execute Python code allowing you to debug the network. This is one of the main benefits of using PyTorch. This framework has "eager execution", and the architecture is "define-by-run". Other frameworks, such as TensorFlow 1.x, compile the network into a fixed graph that cannot be debugged using prints.

PyTorch works on objects called **Tensors**, which are N-dimensional arrays (a generalization of vectors and matrices).

Tensors can be displayed using the Python '**print**' function, their size can be accessed using **x.size()**, etc.

See the PyTorch documentation for other operations on Tensors https://pytorch.org/docs/stable/tensors.html
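
As a quick illustration, here is a small sketch of a few common Tensor operations. It uses a fake all-zero "image" rather than MNIST data, so it runs without downloading anything.

```python
import torch

x = torch.zeros(1, 28, 28)      # a fake 1-channel 28x28 "image"
print(x.size())                 # torch.Size([1, 28, 28])

batch = x.unsqueeze(0)          # add a batch dimension in front
print(batch.size())             # torch.Size([1, 1, 28, 28])

flat = batch.view(-1, 28 * 28)  # reshape: one row of 784 values per image
print(flat.size())              # torch.Size([1, 784])

print(flat.sum().item())        # 0.0 (item() converts a 1-element tensor to a Python number)
```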


In [0]:
## Create an instance of our network
net = MyFirstNetwork()

## Create a conversion function to convert PIL images into Tensors
convert = torchvision.transforms.ToTensor()

## Get our input image as a tensor. We add a dimension with "unsqueeze", because
## PyTorch is used to working with batches.
x = convert(mnist_dataset[0][0]).unsqueeze(0)

## Apply the network to the input
net(x)

The result should be a tensor of 10 log-probabilities (shape 1x10), one per digit class, because the last layer is a LogSoftmax. If your network does not work, edit it and try again!

NOTE: You can notice a "LogSoftmaxBackward" object. PyTorch automatically tracks which operations have been performed on Tensors, so that it can perform automatic gradient computation during training. 

#  Part 3: Training the network



## Choosing a loss function and an optimizer

As a Loss, the negative log likelihood is well suited here, because we have an N-class classification problem. Applied to the log-probabilities produced by our final LogSoftmax layer, it is equivalent to the cross-entropy loss.

Stochastic Gradient Descent is the most basic optimization algorithm. You can try to use another one from torch.optim (https://pytorch.org/docs/stable/optim.html). You can also try to adjust the learning rate (lr parameter).
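
The relationship between the negative log likelihood and the cross-entropy can be checked numerically. The sketch below uses random scores and arbitrary labels (both made up for illustration) and compares `nn.NLLLoss` applied after `nn.LogSoftmax` with `nn.CrossEntropyLoss` applied to the raw scores.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

logits = torch.randn(4, 10)          # raw scores: 4 samples, 10 classes
labels = torch.tensor([3, 0, 7, 1])  # arbitrary target classes

# Option 1: LogSoftmax followed by the negative log likelihood loss
log_probs = nn.LogSoftmax(dim=1)(logits)
nll = nn.NLLLoss()(log_probs, labels)

# Option 2: cross-entropy computed directly on the raw scores
ce = nn.CrossEntropyLoss()(logits, labels)

# By hand: minus the log-probability of the correct class, averaged over samples
manual = -log_probs[torch.arange(4), labels].mean()

print(nll.item(), ce.item(), manual.item())  # three (almost) identical values
```

In practice, this means a network ending with LogSoftmax should be trained with `nn.NLLLoss`, while a network that outputs raw scores should be trained with `nn.CrossEntropyLoss`.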

In [0]:
# Negative log likelihood loss
criterion = nn.NLLLoss()

# Stochastic Gradient Descent
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

## Basic training bricks

There are 5 steps to train the network on one sample of the dataset:



1.   Compute the forward pass

```
    y = net(x)
```


2.   Compute the loss


```
    loss = criterion(y, label)
```


3.   Reset the gradients


```
    optimizer.zero_grad()
```


4.   Compute the backward pass



```
     loss.backward()

```



5.   Apply one optimization step



```
     optimizer.step()

```
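
To see the five steps working together, here is a minimal sketch on a toy problem. Everything in it is a stand-in made up for illustration (a single linear layer named `toy_net`, random inputs, random labels); it is not the LeNet network or the MNIST data used in this lab.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the network, loss and optimizer
toy_net = nn.Sequential(nn.Linear(4, 3), nn.LogSoftmax(dim=1))
toy_criterion = nn.NLLLoss()
toy_optimizer = torch.optim.SGD(toy_net.parameters(), lr=0.1)

toy_x = torch.randn(8, 4)                  # 8 random samples with 4 features
toy_label = torch.randint(0, 3, (8,))      # 8 random labels in {0, 1, 2}

losses = []
for step in range(20):
    toy_y = toy_net(toy_x)                     # 1. forward pass
    loss = toy_criterion(toy_y, toy_label)     # 2. loss
    toy_optimizer.zero_grad()                  # 3. reset the gradients
    loss.backward()                            # 4. backward pass
    toy_optimizer.step()                       # 5. optimization step
    losses.append(loss.item())

print("first loss:", losses[0], "last loss:", losses[-1])
```

Repeating the five steps on the same data makes the loss go down, which is exactly what training does on a larger scale.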





## Q5: Creating a basic training loop

Neural Networks are trained using training loops. We go over the dataset multiple times (each time is called an "epoch"), using "batches" of data.

First, we are going to create the most basic training loop possible, going over the dataset manually using a batch size of 1 (the network sees the images one by one). Every time an image goes through the network, we compute the loss and the gradients and update the network weights.

**Fill the code below using the basic training bricks above to create the training loop.**

In [0]:
## MODIFY THIS LINE IF THE TRAINING TAKES TOO LONG (MAX 60000)
SAMPLES_TO_USE = 1000

## NUMBER OF EPOCHS TO TRAIN
N_EPOCHS = 5

## Put the network in training mode
net.train()

for e in tqdm_notebook(range(N_EPOCHS), desc='Epochs'):
  
  running_loss = 0
  running_accuracy = 0
  
  for i in tqdm_notebook(range(SAMPLES_TO_USE), desc="Samples"):
    
    # Get a sample from the dataset
    sample = mnist_dataset[i]
    x = convert(sample[0]).unsqueeze(0)
    label = torch.tensor([sample[1]])
    
    ###
    ### YOUR CODE HERE
    ###
    
    ## Compute some statistics
    with torch.no_grad():
      running_loss += loss.item()
      running_accuracy += 1 if y.max(1)[1] == label else 0
    
  print("Training accuracy:", running_accuracy/SAMPLES_TO_USE)
  print("Training loss:", running_loss/SAMPLES_TO_USE)
      

## Evaluating on a test set

While performance on the training set might be good (you can reach 100% accuracy in this example), what really matters is performance on the test set. The test set is composed of images the network has never seen before.

**Run the following code to evaluate your network.** Make sure that you reach at least approximately 90% test accuracy before moving on to the next part!

In [0]:
#Get the MNIST test set
mnist_test_dataset = torchvision.datasets.MNIST(".", train=False, transform=convert, download=True)

print("Number of test images:", len(mnist_test_dataset))

#Put the network in eval mode
net.eval()

acc = 0
#Disable gradient computation for this (we do not need them, this will speed up testing)
with torch.no_grad():
  for img, label in tqdm_notebook(mnist_test_dataset):

    y = net(img.unsqueeze(0))

    if y.max(1)[1] == label:
      acc +=1

  print("Test Accuracy:", acc/len(mnist_test_dataset))


## Q6: Displaying a few random results from the test set

Let's look at some results, to visualize in which cases our network makes mistakes.

Using Python's **random** package and your trained neural network, display 10 results from the test set.

HINT: to get the network output for a sample as a number, you can use: 

```
    net(convert(sample[0]).unsqueeze(0)).max(1)[1]
```



In [0]:
import random

mnist_test_dataset_PIL = torchvision.datasets.MNIST(".", train=False, download=True)

for i in range(10):
  ### YOUR CODE HERE

# Part 4: Creating a better training loop

Our basic training method works, but as you might have noticed it is pretty slow.
In this section, we improve it to create a "good" PyTorch training loop, by using batches, data loaders, validation data, and GPUs.

## Splitting between validation and training data

It is common to split the training dataset into a training set and a validation set. 

The network is trained on the training set, and its performance is validated at each epoch using the validation set.

This is used to detect overfitting. If the training accuracy/loss keeps improving but the validation accuracy/loss stagnates or worsens, we are **overfitting** the training set.

The final evaluation is still done on the test set.
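
As a small illustration of this reasoning, the sketch below finds the epoch with the lowest validation loss. The helper `best_epoch` and the loss values are invented for the example; they are not results from this lab.

```python
def best_epoch(val_losses):
    """Index of the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=lambda e: val_losses[e])


# Hypothetical loss curves, made up for illustration
train_losses = [0.90, 0.60, 0.40, 0.30, 0.22, 0.15]
val_losses = [0.95, 0.70, 0.55, 0.52, 0.56, 0.63]

e = best_epoch(val_losses)
print("Best epoch:", e)  # Best epoch: 3

# After that epoch, the training loss keeps falling while the validation
# loss rises: the network has started to overfit the training set.
```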

![alt text](https://miro.medium.com/max/1552/1*Nv2NNALuokZEcV6hYEHdGA.png)

This is very easy in PyTorch. Let's split our MNIST **Dataset** into two random **Subsets** (it's a subclass of **Dataset**).

In [0]:
## Load Dataset
mnist_dataset = torchvision.datasets.MNIST(".", train=True, transform=convert, download=True)

## Percentage of validation data
validation_split = 0.1

N_val_samples = round(validation_split * len(mnist_dataset))

## Split into two Subsets
train_set, val_set = torch.utils.data.random_split(mnist_dataset, [len(mnist_dataset) - N_val_samples, N_val_samples])

# train and val are Subset objects
print(train_set)
print(val_set)

# Their sizes should be correct
len(train_set) + len(val_set) == len(mnist_dataset)

## DataLoaders in PyTorch

In PyTorch, **DataLoaders** are tools that load **batches** of data from a **Dataset** (or any of its subclasses).

Training in batches of multiple samples (8, 32, 512...) can be a lot faster thanks to parallelism. Also, it helps with gradient descent by averaging the gradient over the whole batch.

Documentation on DataLoaders is here: https://pytorch.org/docs/stable/data.html
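
Here is a minimal sketch of how a DataLoader cuts a Dataset into batches. It uses a tiny made-up dataset (10 samples built with `TensorDataset`) instead of MNIST, so the names `toy_dataset` and `toy_dl` are only illustrative.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy data: 10 samples with 3 features each, and 10 labels
features = torch.arange(30, dtype=torch.float32).view(10, 3)
targets = torch.arange(10)
toy_dataset = TensorDataset(features, targets)

# No shuffling here, so the batches come out in a predictable order
toy_dl = DataLoader(toy_dataset, batch_size=4, shuffle=False)

print(len(toy_dl))  # 3 batches: 4 + 4 + 2 samples

for xb, yb in toy_dl:
    # xb holds the features of the batch, yb the matching labels
    print(xb.shape, yb.tolist())
```

Note that the last batch is smaller when the dataset size is not a multiple of the batch size.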

In [0]:
## This line creates a basic DataLoader from our mnist training set

## You can change options such as batch size, shuffling, number of workers...

BATCH_SIZE = 64

mnist_train_dl = torch.utils.data.DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True, num_workers=4)

## Q7: Use the DataLoaders

Using the DataLoader we just created, **print the number of batches it contains (you should see len(train_set)/batch_size, rounded up), and print a batch from the dataset.**

HINT: you can get a Python iterator from a DataLoader using the **iter** function, because a DataLoader is an **iterable** object. https://wiki.python.org/moin/Iterator

You should see a number of samples and labels equal to your batch size.

In [0]:
## Print the length of the dataloader

### YOUR CODE HERE

## Print a batch

### YOUR CODE HERE


## Q8: Create a DataLoader for the validation set

In the same way, **create a DataLoader for the validation set**.

In [0]:
### YOUR CODE HERE

## Using a GPU

Using a GPU in PyTorch is very easy and will make our training A LOT faster.

To perform calculations on a batch on the GPU, our batch has to be moved to the GPU.

To move a Tensor to the GPU memory, use the **cuda()** method.

If you do not have a GPU, you will get a cuda runtime error. To add a GPU to Colab, go to Runtime -> Change Runtime Type -> Hardware Accelerator -> GPU.
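
If you want code that also runs on a machine without a GPU, a common pattern is to pick the device once and use the **to()** method instead of **cuda()**. This is an optional sketch, not the approach used in the rest of this lab; the tensor and layer below are placeholders.

```python
import torch

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

t = torch.ones(2, 3).to(device)           # move a Tensor to the chosen device
layer = torch.nn.Linear(3, 1).to(device)  # move a (tiny) network to the same device

out = layer(t)                            # the computation happens on that device
print(out.device)
```

The Tensor and the network must be on the same device, otherwise PyTorch raises an error.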

In [0]:
batch = torch.Tensor(next(iter(mnist_train_dl))[0])

batch = batch.cuda()

print(batch)

You can see "device='cuda:0'", which means our Tensor has been moved to the first GPU in the system!

Now, we also need to move our neural network to the GPU, because the computations between the network weights and the training data will happen on the GPU.

For this, we also use the **cuda()** method.

In [0]:
## Create an instance of our network
net = MyFirstNetwork()

## Move it to the GPU
net = net.cuda()

Let's test this by computing an output on the GPU.

In [0]:
output = net(batch)

print(output)

## Q9: Our new training/validation loop

Now, we have all the necessary bricks to build our new improved training/validation loop!

We have already written the training loop for you. Complete the validation loop!

You should see that your network trains very fast on the whole training set now. One epoch should take about 10 seconds, depending on your batch size, compared to about 3 minutes on the CPU.

In [0]:
##RE-RUN THIS CODE TO GET A "NEW" NETWORK

LEARNING_RATE = 0.001
MOMENTUM = 0

## Create an instance of our network
net = MyFirstNetwork()

## Move it to the GPU
net = net.cuda()

# Negative log likelihood loss
criterion = nn.NLLLoss()

# Stochastic Gradient Descent
optimizer = torch.optim.SGD(net.parameters(), lr=LEARNING_RATE, momentum=MOMENTUM)

In [0]:
## NUMBER OF EPOCHS TO TRAIN
N_EPOCHS = 20

epoch_loss, epoch_acc, epoch_val_loss, epoch_val_acc = [], [], [], []

for e in tqdm_notebook(range(N_EPOCHS), desc='Epochs'):
  
  
  ### TRAINING LOOP
  running_loss = 0
  running_accuracy = 0
  
  ## Put the network in training mode
  net.train()
  
  for i, batch in enumerate(tqdm_notebook(mnist_train_dl, desc="Training Batches")):
    
    # Get a batch from the dataloader
    x = batch[0]
    labels = batch[1]
    
    # move the batch to GPU
    x = x.cuda()
    labels = labels.cuda()

    # Compute the network output
    y = net(x)
    
    # Compute the loss
    loss = criterion(y, labels)
    
    # Reset the gradients
    optimizer.zero_grad()
    
    # Compute the gradients
    loss.backward()
    
    # Apply one step of the descent algorithm to update the weights
    optimizer.step()
    
    ## Compute some statistics
    with torch.no_grad():
      running_loss += loss.item()
      running_accuracy += (y.max(1)[1] == labels).sum().item()
    
  print("Training accuracy:", running_accuracy/float(len(train_set)),
        "Training loss:", running_loss/float(len(mnist_train_dl)))
  
  ## The loss is already averaged over each batch, so we divide by the number of batches
  epoch_loss.append(running_loss/len(mnist_train_dl))
  epoch_acc.append(running_accuracy/len(train_set))
  
  ### VALIDATION LOOP
  ## Put the network in evaluation mode
  net.eval()
  
  running_val_loss = 0
  running_val_accuracy = 0
  
  for i, batch in enumerate(tqdm_notebook(mnist_val_dl, desc="Validation Batches")):
    
    ### YOUR CODE HERE
    
    print("You need to write the validation loop!")
    break
      

## Q10: Making a graph with training/validation accuracy and loss

You can visualize your losses and accuracies over time using matplotlib.

Modify the training/val loop above to store the values (in a list), and display them in a graph.

Can you spot the moment where we start to overfit the training set? 

Try adjusting the number of epochs, the batch size, the learning rate or changing the optimizer (add some momentum or try Adam) to get a better result on the validation set.

Optimizers documentation: https://pytorch.org/docs/stable/optim.html#algorithms

In [0]:
import matplotlib.pyplot as plt
import numpy as np

### YOUR CODE HERE


## Q11: Evaluate our network on the test set (on GPU)

In the same way you wrote the validation loop, create a DataLoader on the mnist test dataset, and write a test loop!

You should be able to get 90% accuracy on the test set :)


In [0]:
#Get the MNIST test set
mnist_test_dataset = torchvision.datasets.MNIST(".", train=False, transform=convert, download=True)

### WRITE THE TEST LOOP (HINT: it looks a lot like the validation loop)

# (OPTIONAL) Part 5: Cats vs Dogs with a torchvision network

To test our skills on a new dataset, we will work on the famous Dogs vs Cats Kaggle dataset.

![alt text](http://adilmoujahid.com/images/cats-dogs.jpg)


Kaggle is a website that hosts machine learning/data science competitions. Check it out! https://www.kaggle.com/

![alt text](https://miro.medium.com/max/668/1*GZrTyTz0OKMbxnO5Trhcew.png)

## Downloading the data

First, go to the Kaggle website and create an account.

Then go to your account and click on Create New API Token. This will download a kaggle.json file to your machine.

Upload the kaggle.json file using this code:




In [0]:
! pip install -q kaggle

from google.colab import files

files.upload()

Download the dogs vs cats dataset using this code:

In [0]:
! mkdir -p ~/.kaggle

! cp kaggle.json ~/.kaggle/

! chmod 600 ~/.kaggle/kaggle.json

! kaggle competitions download -c dogs-vs-cats

Extract the archives:

In [0]:
! rm -rf test1 train

! unzip -q train.zip

! unzip -q test1.zip

Put the data in separate directories:

In [0]:
! mkdir -p train/cats
! mkdir -p train/dogs
! mv train/cat.* train/cats
! mv train/dog.* train/dogs

## Q12: Load the data

Using ImageFolder from torchvision (https://pytorch.org/docs/stable/torchvision/datasets.html#imagefolder), load the dataset. 

The training set is in the "train" directory.

In [0]:
### YOUR CODE HERE

## Q13: Display a few images

As before, display a few images with the **display** function. You can see that these images have varying sizes.

In [0]:
### YOUR CODE HERE

## Using a torchvision model

torchvision has a repository of popular models ready to use for diverse computer vision tasks (classification, segmentation,...)

https://pytorch.org/docs/stable/torchvision/models.html#classification



In [0]:
## You can change the model if you want
net = torchvision.models.resnet18()

print(net)

## torchvision models are meant to be used on ImageNet (1000 classes)
## since we only have two classes, we need to modify the last layer

net.fc = nn.Linear(512,2)

## Q14: Resize the images on the fly using torchvision transforms 

We can see from the documentation that torchvision models expect images of at least 224x224 pixels.

1. Using torchvision.transforms, create a new ImageFolder dataset with on-the-fly resizing of images.

2. Split this Dataset into training and validation sets, as before.

3. Create a DataLoader for each set as well, just like before.

In [0]:
### YOUR CODE HERE

## Q15: Training the model

Write the training loop. You should be able to pretty much copy-paste the one from Q9.

In [0]:
LEARNING_RATE = 0.001

## Move model to the GPU
net = net.cuda()

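# Cross-entropy loss: ResNet outputs raw scores (it has no LogSoftmax layer),
# so we use CrossEntropyLoss, which applies log-softmax before the NLL loss
criterion = nn.CrossEntropyLoss()

# Adam optimizer
optimizer = torch.optim.Adam(net.parameters(), lr=LEARNING_RATE)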

In [0]:
### YOUR CODE HERE

## Q16: Test the network

Compute some predictions on the test set.

In [0]:
### YOUR CODE HERE

## Q17: Going further

Try different networks from torchvision, and different parameters. The winner of the competition got more than 98% accuracy. How much can you get?

Data augmentation (modifying your input data to make "more" of it) is a huge thing in deep learning. Try some techniques such as random cropping and rotation using torchvision transforms in your Dataset objects!

PyTorch has a lot of tutorials to get you started: https://pytorch.org/tutorials/index.html

Have fun!

