# Convolutional Neural Network

Throughout this practical, you will follow a series of steps to gain a deep understanding of CNNs and their applications. These steps include:

Loading and preprocessing the CIFAR-10 dataset using PyTorch's data transformations. The CIFAR-10 dataset consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class.

Splitting the dataset into train and validation sets to evaluate the performance of your models.

Building a simple CNN architecture from scratch, experimenting with various layers, and exploring the effects of adding batch normalization and dropout.

Developing a more advanced CNN model, ResNet, from scratch, and understanding its unique residual connections that enable the training of deeper networks.

Training and evaluating your ResNet model on the CIFAR-10 dataset, and comparing its performance to the simpler CNN architectures.

By the end of this practical, you will have gained valuable experience in building and training CNNs for image classification tasks, as well as an understanding of the importance of residual connections in deep neural networks. Let's dive into the world of CNNs and discover their potential for solving complex visual recognition tasks using the CIFAR-10 dataset!


In [3]:
!nvidia-smi

Sun Mar 10 18:42:25 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   38C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [1]:
import torch
import torchvision
import torchvision.transforms as transforms

In this part of the practical, you will define a sequence of transformations for the CIFAR-10 dataset using torchvision's `transforms.Compose`. These transformations preprocess and augment the input images before they are fed into the Convolutional Neural Network (CNN). The code cell below should apply the following steps, in order:

Random Horizontal Flip: Apply the `transforms.RandomHorizontalFlip` transformation with a probability of 0.5, so that each image has a 50% chance of being mirrored left to right. Horizontal flipping is a common data augmentation technique that helps the CNN recognize objects regardless of whether they face left or right, which improves the model's robustness and generalization.

Random Rotation: Use the transforms.RandomRotation transformation to randomly rotate the images within a specified range of degrees, in this case, between -15 and 15 degrees. Random rotation is another data augmentation technique that helps the CNN learn to be invariant to slight rotations in the input images. By exposing the model to a variety of rotated images during training, it becomes more resilient to rotational variations in real-world scenarios.

Convert the images to tensors: Apply the `transforms.ToTensor` transformation to convert each image from PIL Image format to a PyTorch tensor. This step is necessary because CNNs operate on tensors rather than raw image data. `ToTensor` also reorders the dimensions from (height, width, channels) to (channels, height, width) and scales the pixel values from the range [0, 255] to [0, 1].

By applying these transformations, you are effectively preprocessing and augmenting the CIFAR-10 dataset before passing it to the CNN. Data augmentation techniques like random horizontal flipping and random rotation help increase the diversity of the training data, reducing overfitting and improving the model's ability to generalize to unseen examples.

In [71]:
transform = transforms.Compose([
    # FIXME
])

Now load the dataset using `torchvision.datasets.CIFAR10`.

In [77]:
dataset = torchvision.datasets.CIFAR10(root="train_data/", transform=transform, download=True)

Files already downloaded and verified


Display the type of the dataset object you have just loaded.

In [None]:
# FIXME

Display the number of samples for train and validation

In [None]:
# FIXME

Get the first element

In [None]:
# FIXME

Display the shape of the first element

In [None]:
# FIXME

Now display some images

In [None]:
# FIXME

Display the list of classes (`dataset.classes`) and the number of classes.

In [None]:
# FIXME

Split the dataset into train and validation sets.

You can do this with `SubsetRandomSampler`, passing one sampler to each dataloader you build.

In [83]:
from torch.utils.data import SubsetRandomSampler

In [None]:
# FIXME

In [86]:
from torch.utils.data import DataLoader

In [None]:
# FIXME
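
A minimal sketch of the sampler-based split follows. The 80/20 ratio, the batch size of 64, and the random `toy_dataset` (used so the example runs without a download) are all assumptions; in the notebook you would pass the CIFAR-10 `dataset` instead.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, SubsetRandomSampler, TensorDataset

# Hypothetical stand-in with CIFAR-10-like shapes; replace with `dataset` in the notebook.
toy_dataset = TensorDataset(torch.rand(1000, 3, 32, 32), torch.randint(0, 10, (1000,)))

val_fraction = 0.2  # assumed split ratio
batch_size = 64     # assumed batch size

# Shuffle all indices once, then cut them into two disjoint sets.
rng = np.random.default_rng(42)
indices = rng.permutation(len(toy_dataset)).tolist()
n_val = int(val_fraction * len(toy_dataset))
val_indices, train_indices = indices[:n_val], indices[n_val:]

# Each sampler draws, in random order, only from its own indices.
train_loader = DataLoader(toy_dataset, batch_size=batch_size,
                          sampler=SubsetRandomSampler(train_indices))
val_loader = DataLoader(toy_dataset, batch_size=batch_size,
                        sampler=SubsetRandomSampler(val_indices))

# Pull one batch to check the shapes.
images, labels = next(iter(train_loader))
print(len(train_indices), len(val_indices), images.shape, labels.shape)
```

Because both loaders wrap the same dataset object, validation samples pass through the same random augmentations as training samples. If that is undesirable, one option is to create a second dataset object with a plain `ToTensor` transform for validation.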

Now that you have your dataloaders, extract the first batch from one of them.

In [None]:
# FIXME

Import the necessary PyTorch modules for building the CNN architecture.

Create a custom CNN class that inherits from PyTorch's nn.Module.

Define the constructor method for your custom CNN class, accepting the number of classes as an argument.

Initialize convolutional layers, max-pooling layers, fully connected (linear) layers, and dropout layers within the constructor.

Implement the forward method in your custom CNN class to pass the input tensor through the defined layers.

Flatten the output tensor from the convolutional and pooling layers before passing it to the fully connected layers.

Return the output tensor from the final fully connected layer in the forward method.

In [None]:
# FIXME
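
The steps listed above could be sketched as follows. The class name `SimpleCNN`, the channel counts, the hidden size of 128, and the dropout rate are illustrative assumptions, not values prescribed by this practical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleCNN(nn.Module):
    """Small CNN for 3x32x32 inputs; all layer sizes are illustrative assumptions."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)   # keeps the 32x32 spatial size
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)  # keeps the 16x16 spatial size
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)         # halves height and width
        self.dropout = nn.Dropout(p=0.5)
        self.fc1 = nn.Linear(64 * 8 * 8, 128)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # -> 32 x 16 x 16
        x = self.pool(F.relu(self.conv2(x)))  # -> 64 x 8 x 8
        x = torch.flatten(x, start_dim=1)     # -> 4096 features per sample
        x = self.dropout(F.relu(self.fc1(x)))
        return self.fc2(x)                    # raw logits, one per class


# Forward pass on a random batch to confirm the shapes line up.
model = SimpleCNN(num_classes=10)
logits = model(torch.randn(4, 3, 32, 32))
print(logits.shape)  # torch.Size([4, 10])
```

The input size of `fc1` depends on the spatial size after pooling (here 8x8), so it has to be recomputed whenever the convolution or pooling layers change.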

Run a forward pass of the model on a batch of random data to verify that everything works properly.

In [None]:
# FIXME

In [144]:
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

In [None]:
# count_parameters(model)
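
To see what `count_parameters` reports, it can help to check it against a hand calculation on single layers. The two layers below are arbitrary examples chosen for the arithmetic, not part of the model you are asked to build.

```python
import torch.nn as nn


def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
fc = nn.Linear(128, 10)

# Conv2d: out_channels * in_channels * k * k weights, plus out_channels biases.
print(count_parameters(conv))  # 16 * 3 * 3 * 3 + 16 = 448
# Linear: in_features * out_features weights, plus out_features biases.
print(count_parameters(fc))    # 128 * 10 + 10 = 1290
```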

Build the loss function and the optimizer

In [None]:
# FIXME
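
As a rough sketch, a common choice for multi-class classification is cross-entropy loss with the Adam optimizer. The optimizer, the learning rate of 1e-3, and the placeholder linear model are assumptions; the practical does not prescribe them.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 10)  # placeholder model; use your CNN in the notebook

criterion = nn.CrossEntropyLoss()  # expects raw logits and integer class labels
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed optimizer and learning rate

# One optimization step on random data to show how the pieces fit together.
inputs, targets = torch.randn(16, 8), torch.randint(0, 10, (16,))
optimizer.zero_grad()                     # clear gradients from any previous step
loss = criterion(model(inputs), targets)  # scalar loss averaged over the batch
loss.backward()                           # compute gradients
optimizer.step()                          # update the weights
print(loss.item())
```

Since `nn.CrossEntropyLoss` applies log-softmax internally, the model should return raw logits rather than probabilities.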

Now build the training and validation functions

In [148]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [None]:
# FIXME
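
One way to structure the training and validation functions is sketched below. The function names `train_one_epoch` and `validate`, their signatures, and the tiny random dataset used for the smoke test are hypothetical choices made for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def train_one_epoch(model, loader, criterion, optimizer, device):
    """Run one pass over `loader`, updating the weights; return the mean loss."""
    model.train()  # enables dropout and batch-norm statistics updates
    total_loss, n_samples = 0.0, 0
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * inputs.size(0)
        n_samples += inputs.size(0)
    return total_loss / n_samples


@torch.no_grad()
def validate(model, loader, criterion, device):
    """Evaluate on `loader` without gradient tracking; return (mean loss, accuracy)."""
    model.eval()  # disables dropout and uses running batch-norm statistics
    total_loss, n_correct, n_samples = 0.0, 0, 0
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        logits = model(inputs)
        total_loss += criterion(logits, targets).item() * inputs.size(0)
        n_correct += (logits.argmax(dim=1) == targets).sum().item()
        n_samples += inputs.size(0)
    return total_loss / n_samples, n_correct / n_samples


# Smoke test with a tiny placeholder model and random data.
toy_data = TensorDataset(torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,)))
toy_loader = DataLoader(toy_data, batch_size=16)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.01)

train_loss = train_one_epoch(toy_model, toy_loader, criterion, optimizer, device)
val_loss, val_acc = validate(toy_model, toy_loader, criterion, device)
print(train_loss, val_loss, val_acc)
```

Calling these two functions once per epoch, and recording the returned values, gives the curves needed to compare architectures later on.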

Experiment with the architecture of the model and the hyperparameters.

Then implement a ResNet model from scratch.

In [None]:
# FIXME
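
As a starting point, the core building block of a ResNet can be sketched as below. This is a basic two-convolution residual block with an optional 1x1 projection on the skip path. The class name `ResidualBlock` and the channel sizes are assumptions, and a full ResNet would stack several such blocks between an initial convolution and a final pooling and linear layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualBlock(nn.Module):
    """Basic residual block: two 3x3 convolutions plus a skip connection."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # If the output shape differs from the input shape, project the input
        # with a 1x1 convolution so the two tensors can be added.
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + self.shortcut(x)  # residual connection: F(x) + x
        return F.relu(out)


# Shape check: one block that keeps the shape, one that downsamples.
block_same = ResidualBlock(16, 16)
block_down = ResidualBlock(16, 32, stride=2)
x = torch.randn(2, 16, 32, 32)
y_same, y_down = block_same(x), block_down(x)
print(y_same.shape, y_down.shape)
```

The addition in `forward` gives gradients a direct path around the two convolutions, which is what makes deeper stacks of these blocks trainable.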