# TC 5033
## Deep Learning
## Convolutional Neural Networks
<br>

#### Activity 2b: Building a CNN for CIFAR10 dataset with PyTorch
<br>

- Objective

    The main goal of this activity is to further your understanding of Convolutional Neural Networks (CNNs) by building one using PyTorch. You will apply this architecture to the famous CIFAR10 dataset, taking what you've learned from the guide code that replicated the Fully Connected model in PyTorch (Activity 2a).

- Instructions
    This activity requires submission in teams of 3 or 4 members. Submissions from smaller or larger teams will not be accepted unless prior approval has been granted (only due to exceptional circumstances). While teamwork is encouraged, each member is expected to contribute individually to the assignment. The final submission should feature the best arguments and solutions from each team member. Only one person per team needs to submit the completed work, but it is imperative that the names of all team members are listed in a Markdown cell at the very beginning of the notebook (either the first or second cell). Failure to include all team member names will result in the grade being awarded solely to the individual who submitted the assignment, with zero points given to other team members (no exceptions will be made to this rule).

    Understand the Guide Code: Review the guide code from Activity 2a that implemented a Fully Connected model in PyTorch. Note how PyTorch makes it easier to implement neural networks.

    Familiarize Yourself with CNNs: Take some time to understand their architecture and the rationale behind using convolutional layers.

    Prepare the Dataset: Use PyTorch's DataLoader to manage the dataset. Make sure the data is appropriately preprocessed for a CNN.

    Design the CNN Architecture: Create a new architecture that incorporates convolutional layers. Use PyTorch modules like nn.Conv2d, nn.MaxPool2d, and others to build your network.

    Training Loop and Backpropagation: Implement the training loop, leveraging PyTorch’s autograd for backpropagation. Keep track of relevant performance metrics.

    Analyze and Document: Use Markdown cells to explain your architectural decisions, performance results, and any challenges you faced. Compare this model with your previous Fully Connected model in terms of performance and efficiency.

- Evaluation Criteria

    - Understanding of CNN architecture and its application to the CIFAR10 dataset
    - Code Readability and Comments
    - Appropriateness and efficiency of the chosen CNN architecture
    - Correct implementation of the Training Loop and Accuracy Function
    - Model's performance metrics on the CIFAR10 dataset (at least 65% accuracy)
    - Quality of Markdown documentation

- Submission

Submit via Canvas your Jupyter Notebook with the CNN implemented in PyTorch. Your submission should include well-commented code and Markdown cells that provide a comprehensive view of your design decisions, performance metrics, and learnings.

# Team members:

*   Oscar Villa Cardenas - A01794052
*   Diego Alberto Olarte Mira - A01794028
*   Erick Alexei Cambray Servín - A01794243
*   Andres Javier Galindo Vargas - A01793927

In [None]:
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torch.utils.data import sampler
import torchvision.datasets as datasets
import torchvision.transforms as T
import matplotlib.pyplot as plt


### Download CIFAR10 dataset

In [None]:
torch.cuda.is_available()


In [None]:
DATA_PATH = './data/'
NUM_TRAIN = 50000
NUM_VAL = 5000
NUM_TEST = 5000
MINIBATCH_SIZE = 64

transform_cifar = T.Compose([
                T.ToTensor(),
                T.Normalize([0.491, 0.482, 0.447], [0.247, 0.243, 0.261])
            ])

# Train dataset
cifar10_train = datasets.CIFAR10(DATA_PATH, train=True, download=True,
                             transform=transform_cifar)
train_loader = DataLoader(cifar10_train, batch_size=MINIBATCH_SIZE,
                          sampler=sampler.SubsetRandomSampler(range(NUM_TRAIN)))
#Validation set
cifar10_val = datasets.CIFAR10(DATA_PATH, train=False, download=True,
                           transform=transform_cifar)
val_loader = DataLoader(cifar10_val, batch_size=MINIBATCH_SIZE,
                        sampler=sampler.SubsetRandomSampler(range(NUM_VAL)))
#Test set
cifar10_test = datasets.CIFAR10(DATA_PATH, train=False, download=True,
                            transform=transform_cifar)
test_loader = DataLoader(cifar10_test, batch_size=MINIBATCH_SIZE,
                        sampler=sampler.SubsetRandomSampler(range(NUM_VAL, len(cifar10_test))))
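The normalization step in transform_cifar can be illustrated in isolation. The following is a minimal sketch that uses a synthetic random tensor in place of a real CIFAR10 image (an assumption made only to keep the example self-contained); the mean and standard deviation values are the ones passed to T.Normalize above.

```python
import torch

# Per-channel statistics copied from transform_cifar above.
mean = torch.tensor([0.491, 0.482, 0.447]).view(3, 1, 1)
std = torch.tensor([0.247, 0.243, 0.261]).view(3, 1, 1)

# Synthetic stand-in for one image after T.ToTensor(): shape (C, H, W), values in [0, 1].
image = torch.rand(3, 32, 32)

# T.Normalize computes (x - mean) / std independently for each channel.
normalized = (image - mean) / std

# Undoing the normalization recovers the original pixel values (useful before plotting).
restored = normalized * std + mean
print(normalized.shape, torch.allclose(restored, image, atol=1e-5))
```

Because normalized values fall outside the [0, 1] range, an image has to be rescaled or de-normalized before it is shown with plt.imshow.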


In [None]:
cifar10_train


In [None]:
train_loader.batch_size


In [None]:
for i, (x, y) in enumerate(train_loader):
    print(x, y)
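To see what a loader built with SubsetRandomSampler yields without printing every batch, the sketch below reproduces the validation/test split pattern on a small synthetic dataset. The names toy_data, toy_val and toy_test are hypothetical and exist only for this illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, sampler

# Synthetic stand-in for the CIFAR10 test split: 200 random "images" with random labels.
toy_data = TensorDataset(torch.rand(200, 3, 32, 32), torch.randint(0, 10, (200,)))

# Same pattern as val_loader / test_loader: two disjoint index ranges of one dataset.
toy_val = DataLoader(toy_data, batch_size=64,
                     sampler=sampler.SubsetRandomSampler(range(100)))
toy_test = DataLoader(toy_data, batch_size=64,
                      sampler=sampler.SubsetRandomSampler(range(100, 200)))

# A single batch has shape (batch, channels, height, width) plus one label per image.
x, y = next(iter(toy_val))
print(x.shape, y.shape)  # torch.Size([64, 3, 32, 32]) torch.Size([64])

# Each loader visits exactly its 100 samples: one batch of 64 and one of 36.
batch_sizes = [xb.size(0) for xb, _ in toy_test]
print(len(toy_test), batch_sizes)  # 2 [64, 36]
```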


### Using GPUs

In [None]:
if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')
print(device)


### Show images

In [None]:
classes = test_loader.dataset.classes
def plot_figure(image):
    plt.imshow(np.transpose(image,(1,2,0)))
    plt.axis('off')
    plt.show()

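# Sample an index from the dataset itself; len(test_loader) is the number of batches, not of images
rnd_sample_idx = np.random.randint(len(test_loader.dataset))
print(f'The sampled image represents a: {classes[test_loader.dataset[rnd_sample_idx][1]]}')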
image = test_loader.dataset[rnd_sample_idx][0]
image = (image - image.min()) / (image.max() -image.min() )
plot_figure(image)


# Class visualization
The function below plots a grid with 8 randomly chosen examples for each of the 10 CIFAR10 classes.

In [None]:
def plot_cifar10_grid():
    classes = test_loader.dataset.classes
    total_samples = 8
    plt.figure(figsize=(15,15))
    for label, sample in enumerate(classes):
        class_idxs = np.flatnonzero(label == np.array(test_loader.dataset.targets))
        sample_idxs = np.random.choice(class_idxs, total_samples, replace = False)
        for i, idx in enumerate(sample_idxs):
            plt_idx = i*len(classes) + label + 1
            plt.subplot(total_samples, len(classes), plt_idx)
            plt.imshow(test_loader.dataset.data[idx])
            plt.axis('off')

            if i == 0: plt.title(sample)
    plt.show()

plot_cifar10_grid()


## Accuracy
This function calculates the accuracy of a model on a given data loader.

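To calculate the accuracy, we first set the model to evaluation mode, so that layers such as batch normalization and dropout behave as they should at inference time. We then move the model to the desired device (CPU or GPU) and run the evaluation inside a torch.no_grad() block, which is what prevents gradients from being tracked during the accuracy calculation.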

Next, we iterate over the data loader and calculate the accuracy for each batch. For each batch, we first move the data and labels to the same device as the model. We then pass the data to the model and get the predicted scores. We then get the predicted labels by taking the maximum score for each batch element. Finally, we compare the predicted labels to the true labels and update the number of correct and total predictions.

After iterating over the entire data loader, we calculate the overall accuracy by dividing the number of correct predictions by the total number of predictions.

This function can be used to evaluate the accuracy of a model on a held-out test set. It can also be used to track the accuracy of a model during training.
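The prediction and counting steps described above can be illustrated on a toy example. This is a minimal sketch with hypothetical scores for 4 samples and 3 classes, not output from any of the models in this notebook.

```python
import torch

# Hypothetical scores for 4 samples and 3 classes (one row per sample).
scores = torch.tensor([[0.1, 2.0, 0.3],
                       [1.5, 0.2, 0.1],
                       [0.0, 0.1, 0.9],
                       [0.3, 0.2, 0.1]])
labels = torch.tensor([1, 0, 2, 2])

# max over dim=1 returns (values, indices); the indices are the predicted classes.
_, pred = scores.max(dim=1)

# Count the matches and divide by the number of predictions.
num_correct = (pred == labels).sum().item()
num_total = pred.size(0)
toy_accuracy = num_correct / num_total
print(pred.tolist(), toy_accuracy)  # [1, 0, 2, 0] 0.75
```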


In [None]:
def accuracy(model, loader):
  """Calculates the accuracy of a model on a given data loader.

  Args:
    model: A PyTorch model.
    loader: A PyTorch data loader.

  Returns:
    The accuracy of the model on the given data loader.
  """

  num_correct = 0
  num_total = 0
  model.eval()
  model = model.to(device=device)
  with torch.no_grad():
    for xi, yi in loader:
      xi = xi.to(device=device, dtype=torch.float32)
      yi = yi.to(device=device, dtype=torch.long)
      scores = model(xi)
      _, pred = scores.max(dim=1)
      num_correct += (pred == yi).sum()
      num_total += pred.size(0)
  return float(num_correct) / num_total
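The function above calls both model.eval() and torch.no_grad(), which play different roles. The sketch below shows the difference on a tiny stand-in model (hypothetical, not one of the notebook's models): evaluation mode changes layer behavior, while torch.no_grad() is what turns off gradient tracking.

```python
import torch
import torch.nn as nn

# Tiny stand-in model with a layer that behaves differently in train and eval mode.
model = nn.Sequential(nn.Linear(4, 3), nn.Dropout(p=0.5))
x = torch.rand(2, 4)

model.eval()                      # Dropout becomes a no-op, but autograd is still active
out_eval = model(x)
print(out_eval.requires_grad)     # True

with torch.no_grad():             # this is what disables gradient tracking
    out_no_grad = model(x)
print(out_no_grad.requires_grad)  # False

# In eval mode the forward pass is deterministic, so both outputs hold the same values.
print(torch.allclose(out_eval, out_no_grad))  # True
```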


## Training Loop

This function trains a PyTorch model on the training data served by train_loader for a specified number of epochs. The function takes three arguments:

- model: A PyTorch model.
- optimiser: A PyTorch optimizer.
- epochs: The number of epochs to train the model for.

The function works as follows:

- It moves the model to the desired device (CPU or GPU).
- It starts a loop over the specified number of epochs.
- For each epoch, it iterates over the mini-batches of the training dataset and performs the following steps:
    - It sets the model to training mode.
    - Moves the data and labels to the same device as the model.
    - Runs the model on the data and calculates the predicted scores.
    - Calculates the loss function (cross-entropy in this case).
    - Sets the gradients to zero.
    - Backpropagates the loss through the network to calculate the gradients for all the parameters.
    - Updates the model parameters using the optimizer.
- After each epoch, the function calculates the accuracy of the model on the validation dataset and prints it together with the cost of the last mini-batch.

The function does not return a value; the parameters of the model passed in are updated in place.

In [44]:
def train(model, optimiser, epochs=100):
    """Trains a PyTorch model on train_loader for a specified number of epochs.

    The model is updated in place. After each epoch, the cost of the last
    mini-batch and the accuracy on val_loader are printed.

    Args:
        model: A PyTorch model.
        optimiser: A PyTorch optimizer.
        epochs: The number of epochs to train the model for.
    """

    model = model.to(device=device)
    for epoch in range(epochs):
        for xi, yi in train_loader:
            # accuracy() leaves the model in eval mode, so training mode is re-enabled here
            model.train()
            xi = xi.to(device=device, dtype=torch.float32)
            yi = yi.to(device=device, dtype=torch.long)
            scores = model(xi)
            cost = F.cross_entropy(input=scores, target=yi)
            # Clear old gradients, backpropagate, and update the parameters
            optimiser.zero_grad()
            cost.backward()
            optimiser.step()

        acc = accuracy(model, val_loader)
        print(f'Epoch: {epoch}, costo: {cost.item()}, accuracy: {acc},')
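The reason the loop clears the gradients before every backward pass is that PyTorch accumulates gradients by default. A minimal sketch on a two-element tensor (a hypothetical stand-in for the model parameters) shows the effect:

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

# First backward pass: d/dw sum(w^2) = 2w.
loss = (w ** 2).sum()
loss.backward()
first = w.grad.clone()        # tensor([2., 4.])

# Without clearing, a second backward pass adds to the stored gradient.
loss = (w ** 2).sum()
loss.backward()
accumulated = w.grad.clone()  # tensor([4., 8.])

# Clearing the gradient first (the job optimiser.zero_grad() does for every
# parameter it manages) gives the gradient of the current pass only.
w.grad.zero_()
loss = (w ** 2).sum()
loss.backward()
fresh = w.grad.clone()        # tensor([2., 4.])

print(first.tolist(), accumulated.tolist(), fresh.tolist())
```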



## Linear model
To test the base model.

This code creates a fully connected neural network model and an optimizer. The model has three linear layers: two hidden layers and an output layer. The input is a flattened color image with 32x32x3 = 3072 values, one per pixel and color channel. Each hidden layer has 256 neurons. The output layer has 10 neurons, which is the number of classes in the dataset.

The optimizer is Adam, which is a popular optimizer for training deep learning models.

In [None]:
def create_model(hidden1=256, hidden=256, lr=0.001, epochs=10):
    """Creates a fully connected neural network model and an optimizer.

    Args:
        hidden1: The number of neurons in the first hidden layer.
        hidden: The number of neurons in the second hidden layer.
        lr: The learning rate.
        epochs: Unused here; the number of epochs is passed to train() instead.

    Returns:
        A tuple containing the model and the optimizer.
    """

    model1 = nn.Sequential(
        nn.Flatten(),
        nn.Linear(in_features=32*32*3, out_features=hidden1), nn.ReLU(),
        nn.Linear(in_features=hidden1, out_features=hidden), nn.ReLU(),
        nn.Linear(in_features=hidden, out_features=10))

    optimiser = torch.optim.Adam(model1.parameters(), lr=lr)

    return model1, optimiser

model1, optimiser = create_model()
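As a rough measure of the size of this baseline, the number of trainable parameters can be worked out by hand. The sketch below assumes the default sizes used above (hidden1=256, hidden=256); linear_params is a hypothetical helper written for this illustration.

```python
def linear_params(in_features, out_features):
    """Weights plus one bias per output neuron of an nn.Linear layer."""
    return in_features * out_features + out_features

# (in_features, out_features) of the three linear layers with the default sizes.
layer_sizes = [(32 * 32 * 3, 256), (256, 256), (256, 10)]

per_layer = [linear_params(i, o) for i, o in layer_sizes]
total = sum(per_layer)
print(per_layer, total)  # [786688, 65792, 2570] 855050
```

Almost all of the parameters sit in the first layer, because every one of the 3072 input values is connected to every hidden neuron.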


In [None]:
epochs = 10
train(model1, optimiser, epochs)


### Sequential CNN

- The model has two convolutional layers, a max pooling layer, and a fully connected layer. 
- First convolutional layer has 16 filters and the second convolutional layer has 32 filters. 
- Max pooling layer downsamples the output of the second convolutional layer by a factor of 2. The fully connected layer has 10 neurons, which is the number of classes in the dataset.

In [None]:
def create_cnn_model(channel1=16, channel2=32, epochs=10, lr=0.0001):
    """Creates a convolutional neural network model and an optimizer.

    Args:
        channel1: The number of filters in the first convolutional layer.
        channel2: The number of filters in the second convolutional layer.
        epochs: Unused here; the number of epochs is passed to train() instead.
        lr: The learning rate.

    Returns:
        A tuple containing the model and the optimizer.
    """

    modelCNN1 = nn.Sequential(
                nn.Conv2d(in_channels=3, out_channels=channel1,
                        kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv2d(in_channels=channel1, out_channels=channel2,
                        kernel_size= 3, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(2, 2),
                nn.Flatten(),
                nn.Linear(in_features=16*16*channel2, out_features=10)
            )

    optimiser = torch.optim.Adam(modelCNN1.parameters(), lr)

    return modelCNN1, optimiser

modelCNN1, optimiser = create_cnn_model()
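The value 16*16*channel2 used for in_features follows from how each layer changes the spatial size of the image. A minimal sketch of that arithmetic, using the standard output-size formula for convolution and pooling layers (conv_output_size is a hypothetical helper, not part of PyTorch):

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = 32                                                 # CIFAR10 images are 32x32
size = conv_output_size(size, kernel_size=3, padding=1)   # first convolution keeps 32
size = conv_output_size(size, kernel_size=3, padding=1)   # second convolution keeps 32
size = conv_output_size(size, kernel_size=2, stride=2)    # max pooling halves it to 16

channel2 = 32
flat_features = size * size * channel2
print(size, flat_features)  # 16 8192
```

With a 3x3 kernel and padding of 1 the convolutions preserve the 32x32 size, so only the pooling layer reduces it.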


In [None]:
train(modelCNN1, optimiser, epochs)


This lambda function takes two parameters, channel1 and channel2, and returns an nn.Conv2d() layer with the specified input and output channels, a kernel size of 3, and padding of 1.

Lambda functions are anonymous functions that can be created on the fly. They are useful for creating concise and expressive code. In this case, the lambda function is used to reduce the amount of duplicate code in the CNN_class4 class defined below, which creates two such layers.

In [None]:
conv_k_3 = lambda channel1, channel2: nn.Conv2d(channel1, channel2, kernel_size=3, padding=1)
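For comparison, the same helper can be written as a named function, which behaves identically. The sketch below uses the hypothetical name conv_k_3_def and a random input tensor to check that a 3x3 convolution with padding of 1 keeps the spatial size unchanged.

```python
import torch
import torch.nn as nn

def conv_k_3_def(in_channels, out_channels):
    """Named-function equivalent of the conv_k_3 lambda."""
    return nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)

layer = conv_k_3_def(3, 16)
x = torch.rand(1, 3, 32, 32)   # one synthetic RGB image of 32x32 pixels
y = layer(x)
print(y.shape)                 # torch.Size([1, 16, 32, 32])
```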


This class is built on top of the PyTorch framework, specifically extending the "nn.Module" class. 
The constructor of the "CNN_class4" class accepts three parameters: "in_channels" representing the number of input channels, "channel1" representing the output channels for the first convolutional layer and the input channels for the second convolutional layer, and "channel2" representing the output channels for the second convolutional layer. 

Inside the constructor, it initializes two convolutional layers, applies batch normalization to their outputs, and sets up a max-pooling operation with a 2x2 kernel. 
The "forward" method overrides the parent class's forward function and defines the sequence of operations: applying the first convolutional layer, normalizing and applying the ReLU activation function, then applying the second convolutional layer, normalizing and applying ReLU again, followed by max-pooling. This class facilitates the creation and customization of a CNN architecture for image processing tasks.


In [None]:
class CNN_class4(nn.Module):
    """
    A PyTorch-based Convolutional Neural Network (CNN) class with customizable architecture.

    Args:
        in_channels (int): Number of input channels.
        channel1 (int): Number of output channels for the first convolutional layer and input channels for the second layer.
        channel2 (int): Number of output channels for the second convolutional layer.

    Attributes:
        conv1 (nn.Module): First convolutional layer.
        bn1 (nn.Module): Batch normalization layer for the first convolutional layer.
        conv2 (nn.Module): Second convolutional layer.
        bn2 (nn.Module): Batch normalization layer for the second convolutional layer.
        max_pool (nn.Module): Max-pooling operation with a 2x2 kernel and stride 2.
    """

    def __init__(self, in_channels, channel1, channel2):
        super().__init__()
        self.conv1 = conv_k_3(in_channels, channel1)
        self.bn1 = nn.BatchNorm2d(channel1)
        self.conv2 = conv_k_3(channel1, channel2)
        self.bn2 = nn.BatchNorm2d(channel2)
        self.max_pool = nn.MaxPool2d(2, 2)

    def forward(self, x):
        """
        Forward pass through the CNN architecture.

        Args:
            x (torch.Tensor): Input data.

        Returns:
            torch.Tensor: Output of the CNN after applying convolution, batch normalization, ReLU activation,
            and max-pooling.
        """
        x = F.relu(self.bn2(self.conv2(F.relu(self.bn1(self.conv1(x))))))
        return self.max_pool(x)
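The effect of the batch normalization layers used in this class can be checked on its own. The following is a minimal sketch on synthetic data (random values with an arbitrary mean of 5 and standard deviation of 3, chosen only for illustration): in training mode, nn.BatchNorm2d rescales every channel of the batch to approximately zero mean and unit variance.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

bn = nn.BatchNorm2d(4)                  # one scale and one shift per channel
x = torch.randn(8, 4, 16, 16) * 3 + 5   # synthetic activations, far from mean 0 / std 1

bn.train()                              # training mode: normalize with batch statistics
y = bn(x)

# Statistics per channel, taken over the batch and both spatial dimensions.
channel_mean = y.mean(dim=(0, 2, 3))
channel_std = ((y - channel_mean.view(1, 4, 1, 1)) ** 2).mean(dim=(0, 2, 3)).sqrt()
print(channel_mean, channel_std)
```

In evaluation mode the layer uses the running statistics collected during training instead of the statistics of the current batch, which is one reason the accuracy function switches the model to evaluation mode.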



In [49]:
# Define the number of channels for different layers
channel1 = 16  # Channels for the first layer
channel2 = 32  # Channels for the second layer
channel4 = 128  # Channels for the third and fourth layers

epochs = 10
lr = 0.001

"""
A PyTorch-based neural network model composed of convolutional layers and a linear layer for image classification.

Architecture:
- Layer 1: Input channels = 3, Output channels = channel1.
- Layer 2: Input channels = channel1, Output channels = channel2.
- Layer 3: Input channels = channel2, Output channels = channel4.
- Layer 4: Input channels = channel4, Output channels = channel4.
- After the convolutional layers, the output is flattened and passed through a single linear
    classification layer with 10 output features.

Args:
    channel1 (int): Number of output channels for the first layer.
    channel2 (int): Number of output channels for the second layer.
    channel4 (int): Number of output channels for the third and fourth layers.

Attributes:
    Sequential: PyTorch Sequential container for defining the model's architecture.
"""
modelCNN5 = nn.Sequential(
    # First CNN_class4 block, with the parameters:
    #   3 as the in_channels of the network (RGB image)
    #   channel1 as both the out_channels of the first layer and the in_channels of the second layer
    #   channel2 as the out_channels of the second layer
    CNN_class4(3, channel1, channel2),
    # Second CNN_class4 block, with the parameters:
    #   channel2 as the in_channels of the third layer
    #   channel4 as the out_channels of the third layer and as both in_channels and out_channels of the fourth layer
    CNN_class4(channel2, channel4, channel4),
    # Flatten the 8x8 feature maps with channel4 (128) channels into a one-dimensional vector
    nn.Flatten(),
    # Final linear layer, with the parameters:
    #   in_features as 8*8*channel4, the size of the flattened vector
    #   out_features as 10 since CIFAR10 has 10 classes
    nn.Linear(in_features=8*8*channel4, out_features=10)
)
# We use Adam as the optimizer since it is our usual default choice
optimiser = torch.optim.Adam(modelCNN5.parameters(), lr)
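To compare the size of this network with the fully connected baseline, the trainable parameters of both can be counted by hand. This is a sketch under the channel sizes defined above (16, 32, 128) and the default hidden sizes of the baseline (256 and 256); the helper functions are hypothetical and written only for this calculation.

```python
def conv_params(in_ch, out_ch, kernel_size=3):
    # kernel weights plus one bias per output channel
    return in_ch * out_ch * kernel_size * kernel_size + out_ch

def batchnorm_params(channels):
    # one learnable scale and one learnable shift per channel
    return 2 * channels

def linear_params(in_features, out_features):
    # weights plus one bias per output neuron
    return in_features * out_features + out_features

channel1, channel2, channel4 = 16, 32, 128
conv_layers = [(3, channel1), (channel1, channel2),
               (channel2, channel4), (channel4, channel4)]

# Every convolution is followed by a batch normalization over its output channels.
conv_total = sum(conv_params(i, o) + batchnorm_params(o) for i, o in conv_layers)

flat_features = 8 * 8 * channel4   # 32 -> 16 -> 8 after the two 2x2 poolings
cnn_total = conv_total + linear_params(flat_features, 10)

# Fully connected baseline: 3072 -> 256 -> 256 -> 10.
fc_total = linear_params(3072, 256) + linear_params(256, 256) + linear_params(256, 10)

print(flat_features, cnn_total, fc_total)  # 8192 272202 855050
```

Under these assumptions the CNN has roughly a third of the parameters of the fully connected model, because convolutional filters are shared across all positions of the image.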


In [50]:
train(modelCNN5, optimiser, epochs)


Epoch: 0, costo: 0.7745968103408813, accuracy: 0.6174,
Epoch: 1, costo: 0.7720070481300354, accuracy: 0.6956,
Epoch: 2, costo: 0.9943426251411438, accuracy: 0.7344,
Epoch: 3, costo: 0.7879390716552734, accuracy: 0.7678,
Epoch: 4, costo: 1.01791512966156, accuracy: 0.7642,
Epoch: 5, costo: 0.6820675730705261, accuracy: 0.7818,
Epoch: 6, costo: 0.3965768814086914, accuracy: 0.7702,
Epoch: 7, costo: 0.10358618944883347, accuracy: 0.7908,
Epoch: 8, costo: 0.16872303187847137, accuracy: 0.79,
Epoch: 9, costo: 0.08826413005590439, accuracy: 0.7624,
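The log above can be summarized with a few lines of code. The sketch below copies the validation accuracies from the printed output and finds the best epoch; it only processes the logged numbers and does not re-run the training.

```python
# Validation accuracy per epoch, copied from the training log above.
val_acc = [0.6174, 0.6956, 0.7344, 0.7678, 0.7642,
           0.7818, 0.7702, 0.7908, 0.79, 0.7624]

best_acc = max(val_acc)
best_epoch = val_acc.index(best_acc)
final_acc = val_acc[-1]
drop = round(best_acc - final_acc, 4)

print(best_epoch, best_acc, final_acc, drop)  # 7 0.7908 0.7624 0.0284
```

The best validation accuracy is reached at epoch 7 rather than at the last epoch, so keeping a copy of the weights from the best epoch would be one possible refinement of the training loop.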


# Conclusion


In this project, we have demonstrated how convolutional neural networks (CNNs) can be used for image recognition. The final model reached a validation accuracy of up to 0.79 on the CIFAR10 dataset (0.7908 at epoch 7 and 0.7624 after the last epoch), which is above the 65% required for this activity.

This result is promising, as it indicates that CNNs can be used for image classification tasks with a high degree of accuracy. However, it is important to note that this result was obtained with a relatively small dataset (50,000 training images of 32x32 pixels) and was measured on the validation split only. To obtain more accurate results, it would be necessary to use a larger and more diverse dataset.