# TC 5033
## Deep Learning
## Convolutional Neural Networks
<br>

#### Activity 2b: Building a CNN for the CIFAR10 dataset with PyTorch
<br>

- Objective

    The main goal of this activity is to further your understanding of Convolutional Neural Networks (CNNs) by building one using PyTorch. You will apply this architecture to the famous CIFAR10 dataset, taking what you've learned from the guide code that replicated the Fully Connected model in PyTorch (Activity 2a).

- Instructions
    This activity requires submission in teams of 5 or 6 members. Submissions from smaller or larger teams will not be accepted unless prior approval has been granted (only due to exceptional circumstances). While teamwork is encouraged, each member is expected to contribute individually to the assignment. The final submission should feature the best arguments and solutions from each team member. Only one person per team needs to submit the completed work, but it is imperative that the names of all team members are listed in a Markdown cell at the very beginning of the notebook (either the first or second cell). Failure to include all team member names will result in the grade being awarded solely to the individual who submitted the assignment, with zero points given to other team members (no exceptions will be made to this rule).

    Understand the Guide Code: Review the guide code from Activity 2a that implemented a Fully Connected model in PyTorch. Note how PyTorch makes it easier to implement neural networks.

    Familiarize Yourself with CNNs: Take some time to understand their architecture and the rationale behind using convolutional layers.

    Prepare the Dataset: Use PyTorch's DataLoader to manage the dataset. Make sure the data is appropriately preprocessed for a CNN.

    Design the CNN Architecture: Create a new architecture that incorporates convolutional layers. Use PyTorch modules like nn.Conv2d, nn.MaxPool2d, and others to build your network.

    Training Loop and Backpropagation: Implement the training loop, leveraging PyTorch’s autograd for backpropagation. Keep track of relevant performance metrics.

    Analyze and Document: Use Markdown cells to explain your architectural decisions, performance results, and any challenges you faced. Compare this model with your previous Fully Connected model in terms of performance and efficiency.

- Evaluation Criteria

    - Understanding of CNN architecture and its application to the CIFAR10 dataset
    - Code Readability and Comments
    - Appropriateness and efficiency of the chosen CNN architecture
    - Correct implementation of the training loop and accuracy function
    - Model's performance metrics on the CIFAR10 dataset (at least 65% accuracy)
    - Quality of Markdown documentation

- Submission

Submit via Canvas your Jupyter Notebook with the CNN implemented in PyTorch. Your submission should include well-commented code and Markdown cells that provide a comprehensive view of your design decisions, performance metrics, and learnings.

In [1]:
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torch.utils.data import sampler
import torchvision.datasets as datasets
import torchvision.transforms as T
import matplotlib.pyplot as plt

### Download the CIFAR10 dataset

In [2]:
torch.cuda.is_available()

True

In [3]:
DATA_PATH = 'CIFAR10/'
NUM_TRAIN = 50000
NUM_VAL = 5000
NUM_TEST = 5000
MINIBATCH_SIZE = 64

transform_cifar = T.Compose([
                T.ToTensor(),
                T.Normalize([0.491, 0.482, 0.447], [0.247, 0.243, 0.261])
            ])

# Train dataset
cifar10_train = datasets.CIFAR10(DATA_PATH, train=True, download=True,
                             transform=transform_cifar)
train_loader = DataLoader(cifar10_train, batch_size=MINIBATCH_SIZE, 
                          sampler=sampler.SubsetRandomSampler(range(NUM_TRAIN)))
# Validation set: the first NUM_VAL images of the CIFAR10 test split (train=False)
cifar10_val = datasets.CIFAR10(DATA_PATH, train=False, download=True,
                           transform=transform_cifar)
val_loader = DataLoader(cifar10_val, batch_size=MINIBATCH_SIZE, 
                        sampler=sampler.SubsetRandomSampler(range(NUM_VAL)))
# Test set: the remaining images of the same CIFAR10 test split
cifar10_test = datasets.CIFAR10(DATA_PATH, train=False, download=True, 
                            transform=transform_cifar)
test_loader = DataLoader(cifar10_test, batch_size=MINIBATCH_SIZE,
                        sampler=sampler.SubsetRandomSampler(range(NUM_VAL, len(cifar10_test))))

Files already downloaded and verified
Files already downloaded and verified
Files already downloaded and verified
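
As a rough sanity check on the split above, the following plain-Python sketch (no PyTorch needed) confirms that the validation and test index ranges do not overlap and counts the minibatches per epoch. It assumes the standard CIFAR10 sizes of 50,000 training and 10,000 test images; `TEST_SPLIT_SIZE` and `num_batches` are hypothetical names introduced only for this illustration.

```python
import math

NUM_TRAIN = 50000
NUM_VAL = 5000
TEST_SPLIT_SIZE = 10000   # assumed size of the CIFAR10 test split
MINIBATCH_SIZE = 64

# Index ranges handed to SubsetRandomSampler in the cell above
val_indices = range(NUM_VAL)
test_indices = range(NUM_VAL, TEST_SPLIT_SIZE)

# The two ranges share no index, so no image lands in both sets
overlap = set(val_indices) & set(test_indices)

def num_batches(num_samples, batch_size):
    """Batches per epoch when the last, smaller batch is kept (the DataLoader default)."""
    return math.ceil(num_samples / batch_size)

train_batches = num_batches(NUM_TRAIN, MINIBATCH_SIZE)
val_batches = num_batches(len(val_indices), MINIBATCH_SIZE)
print(len(overlap), train_batches, val_batches)
```

Because both loaders read from the same test split, keeping the two index ranges disjoint is what prevents validation images from leaking into the final test score.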


In [None]:
# Exploring the object
cifar10_train

In [None]:
# Printing Batch size
train_loader.batch_size

In [None]:
# Printing data from data loader
for i, (x, y) in enumerate(train_loader):
    print(x, y)


### Using GPUs

In [4]:
# Use CUDA as the device when available, otherwise fall back to the CPU
if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')
print(device)

cuda


### Show images

In [None]:
# Function to show images
classes = test_loader.dataset.classes
def plot_figure(image):
    plt.imshow(np.transpose(image,(1,2,0)))
    plt.axis('off')
    plt.show()

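# Sample an index over the images in the dataset (len(test_loader) would count batches, not images)
rnd_sample_idx = np.random.randint(len(test_loader.dataset))
print(f'The sampled image is a: {classes[test_loader.dataset[rnd_sample_idx][1]]}')
image = test_loader.dataset[rnd_sample_idx][0]
# Rescale the normalized image to [0, 1] so that imshow can display it
image = (image - image.min()) / (image.max() - image.min())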
plot_figure(image)
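
The min-max rescaling in the cell above is needed because the normalized pixel values no longer lie in [0, 1]. The sketch below reproduces that arithmetic in plain Python for the first channel, using the mean and standard deviation passed to `T.Normalize`; the helper names `normalize` and `min_max_rescale` are hypothetical and only serve this illustration.

```python
MEAN = [0.491, 0.482, 0.447]
STD = [0.247, 0.243, 0.261]

def normalize(value, channel):
    """Per-channel normalization, the same arithmetic T.Normalize applies."""
    return (value - MEAN[channel]) / STD[channel]

def min_max_rescale(values):
    """Map a list of numbers linearly onto [0, 1] for display."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Darkest and brightest possible first-channel pixels after ToTensor()
darkest = normalize(0.0, channel=0)
brightest = normalize(1.0, channel=0)

# After rescaling, the extremes map back to 0 and 1
rescaled = min_max_rescale([darkest, normalize(0.5, channel=0), brightest])
print(darkest, brightest, rescaled)
```

Note that the rescaling uses the minimum and maximum of each individual image, so the displayed contrast is stretched per image rather than restored to the original pixel values.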


In [None]:
# Function to randomly show 8 images by class in a grid
def plot_cifar10_grid():
    classes = test_loader.dataset.classes
    total_samples = 8
    plt.figure(figsize=(15,15))
    for label, sample in enumerate(classes):
        class_idxs = np.flatnonzero(label == np.array(test_loader.dataset.targets))
        sample_idxs = np.random.choice(class_idxs, total_samples, replace = False)
        for i, idx in enumerate(sample_idxs):
            plt_idx = i*len(classes) + label + 1
            plt.subplot(total_samples, len(classes), plt_idx)
            plt.imshow(test_loader.dataset.data[idx])
            plt.axis('off')
            
            if i == 0: plt.title(sample)
    plt.show()

plot_cifar10_grid() 
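
The expression `i*len(classes) + label + 1` in the grid function places each class in its own column and each sample in its own row. A minimal sketch of that indexing, assuming the 10 classes and 8 samples used above (`subplot_index` is a hypothetical helper):

```python
num_classes = 10
total_samples = 8

def subplot_index(row, col, num_cols):
    """1-based, row-major index expected by plt.subplot(rows, cols, index)."""
    return row * num_cols + col + 1

# Same loop order as plot_cifar10_grid: classes outside, samples inside
indices = [subplot_index(i, label, num_classes)
           for label in range(num_classes)
           for i in range(total_samples)]

first_cell = subplot_index(0, 0, num_classes)
last_cell = subplot_index(total_samples - 1, num_classes - 1, num_classes)
print(first_cell, last_cell, len(set(indices)))
```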

### Compute accuracy


In [5]:
def accuracy(model, loader):
    num_correct = 0
    num_total = 0
    model.eval()    # Put the model in evaluation mode
    model = model.to(device=device) # Make sure the model is on the selected device
    with torch.no_grad():
        for xi, yi in loader:
            xi = xi.to(device=device, dtype=torch.float32)
            yi = yi.to(device=device, dtype=torch.long)
            scores = model(xi)
            _, pred = scores.max(dim=1)   # Predicted class = index of the highest score
            num_correct += (pred == yi).sum() # Count predictions that match the true class
            num_total += pred.size(0)
        return float(num_correct)/num_total   # Correct predictions divided by total samples
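
To make the logic of `accuracy` concrete, here is a small plain-Python sketch of the same computation: take the index of the highest score per sample and compare it with the label. The scores and labels are made-up values for illustration, and `argmax` and `accuracy_from_scores` are hypothetical helpers, not part of the notebook.

```python
def argmax(row):
    """Index of the largest score in a list."""
    return max(range(len(row)), key=lambda j: row[j])

def accuracy_from_scores(scores, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    num_correct = sum(1 for row, y in zip(scores, labels) if argmax(row) == y)
    return num_correct / len(labels)

# Made-up scores for 4 samples and 3 classes
scores = [[2.0, 0.1, -1.0],
          [0.3, 0.2, 1.5],
          [0.0, 4.0, 0.5],
          [1.0, 0.9, 0.8]]
labels = [0, 2, 1, 2]

acc = accuracy_from_scores(scores, labels)
print(acc)  # the first three predictions match, the last one does not
```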

### Training loop

In [6]:
# Training function
def train(model, optimiser, epochs=100):
    model = model.to(device=device)
    for epoch in range(epochs):
        for i, (xi, yi) in enumerate(train_loader):
            model.train()   # Put the model in training mode
            xi = xi.to(device=device, dtype=torch.float32)
            yi = yi.to(device=device, dtype=torch.long)
            scores = model(xi)
            cost = F.cross_entropy(input=scores, target=yi)
            optimiser.zero_grad()
            cost.backward() # Backpropagate to compute the gradients
            optimiser.step()
        acc = accuracy(model, val_loader)   # Validation accuracy at the end of the epoch
        # cost is the loss of the last minibatch only, so it fluctuates from epoch to epoch
        print(f'Epoch: {epoch+1}, costo: {cost.item()}, accuracy: {acc},')
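
The cost printed by `train` is a cross-entropy value, which is easier to interpret with a reference point. The sketch below computes the per-sample cross-entropy in plain Python (the function name `cross_entropy` here is a hypothetical stand-in, and `F.cross_entropy` additionally averages over the minibatch by default). It shows that a model scoring all 10 classes equally has a loss of ln(10), roughly 2.30, so values well below that indicate the model has learned something.

```python
import math

def cross_entropy(scores, target):
    """Negative log-softmax of the target class for a single sample."""
    m = max(scores)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[target]

# An untrained model that scores all 10 classes equally
uniform_loss = cross_entropy([0.0] * 10, target=3)

# A confident, correct prediction gives a loss close to zero
confident_loss = cross_entropy([10.0] + [0.0] * 9, target=0)

print(uniform_loss, confident_loss)
```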

### Linear model

---

##### **Evaluation 1**
- **Model**: `Linear NN`
- **Hyperparameters**:
  - `Learning Rate`: `0.001`
  - `Epochs`: `10`
  - `Hidden Layer 1`: `256`
  - `Hidden Layer 2`: `256`

- **Results**:
  - **Train Accuracy**: **`52.98%`**
  - **Test Accuracy**: **`51.48%`**
---

##### **Evaluation 2**
- **Model**: `Linear NN`
- **Hyperparameters**:
  - `Learning Rate`: `0.00055`
  - `Epochs`: `25`
  - `Hidden Layer 1`: `512`
  - `Hidden Layer 2`: `256`
  - `Hidden Layer 3`: `128`

- **Results**:
  - **Train Accuracy**: **`53.04%`**
  - **Test Accuracy**: **`53.4%`**
---

In [26]:
# Hyperparameters
hidden2 = 512   # Units in the first hidden layer
hidden1 = 256   # Units in the second hidden layer
hidden = 128    # Units in the third hidden layer
lr = 0.00055
epochs = 25
# Build the fully connected network
model1 = nn.Sequential(nn.Flatten(),
                       nn.Linear(in_features=32*32*3, out_features=hidden2), nn.ReLU(),
                       nn.Linear(in_features=hidden2, out_features=hidden1), nn.ReLU(),
                       nn.Linear(in_features=hidden1, out_features=hidden), nn.ReLU(),
                       nn.Linear(in_features=hidden, out_features=10))  # 10 outputs, one per class
optimiser = torch.optim.Adam(model1.parameters(), lr=lr)
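
For the later efficiency comparison with the CNNs, it helps to know how many trainable parameters this fully connected model has. A minimal sketch, assuming the layer sizes defined above (3072 inputs, hidden layers of 512, 256 and 128 units, 10 outputs); `linear_params` is a hypothetical helper:

```python
def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features

layer_sizes = [32 * 32 * 3, 512, 256, 128, 10]

# One entry per nn.Linear layer, in order
per_layer = [linear_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)
print(per_layer, total_params)
```

Most of the roughly 1.7 million parameters sit in the first layer, because every one of the 3072 input values is connected to every hidden unit.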

In [None]:
# Train the model
train(model1, optimiser, epochs)

In [None]:
# Check the accuracy on the test data
accuracy(model1, test_loader)

# SEQUENTIAL CNN

---

##### **Evaluation 1**
- **Model**: `Convolutional NN`
- **Hyperparameters**:
  - `Learning Rate`: `0.0001`
  - `Epochs`: `30`
  - `Convolutional Layer 1`: `16`
  - `Convolutional Layer 2`: `32`
  - `Max Pooling`: `2x2`

- **Results**:
  - **Train Accuracy**: **`66.70%`**
  - **Test Accuracy**: **`65.14%`**
---

##### **Evaluation 2**
- **Model**: `Convolutional NN`
- **Hyperparameters**:
  - `Learning Rate`: `0.0001`
  - `Epochs`: `20`
  - `Convolutional Layer 1`: `64`
  - `Convolutional Layer 2`: `32`
  - `Max Pooling`: `2x2`

- **Results**:
  - **Train Accuracy**: **`68.44%`**
  - **Test Accuracy**: **`67.92%`**
---

In [9]:
# 3x3 convolution with padding = 1, which preserves the spatial size of the input
# Helper used to build the convolutional layers of the networks below
conv_k_3 = lambda channel1, channel2: nn.Conv2d(channel1, channel2, kernel_size=3, padding=1)
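
The choice of `kernel_size=3` with `padding=1` keeps the 32x32 feature maps at the same size after each convolution. A short sketch of the standard output-size formula for a convolution without dilation illustrates this; `conv_output_size` is a hypothetical helper written for this note.

```python
def conv_output_size(size, kernel_size, padding=0, stride=1):
    """Spatial output size of a convolution without dilation."""
    return (size + 2 * padding - kernel_size) // stride + 1

# With padding=1, a 3x3 kernel preserves the 32x32 input size
same = conv_output_size(32, kernel_size=3, padding=1)

# Without padding, each convolution would trim one pixel from every border
shrunk = conv_output_size(32, kernel_size=3, padding=0)

print(same, shrunk)
```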

In [35]:
# Convolutional network with 2 convolutional layers followed by a single max pooling step
class CNN_class1(nn.Module):
    def __init__(self, in_channel, channel1, channel2):
        super().__init__()
        self.conv1 = conv_k_3(in_channel,channel1)
        self.conv2 = conv_k_3(channel1,channel2)
        self.max_pool = nn.MaxPool2d(2,2)
        self.fc = nn.Linear(in_features=16*16*channel2, out_features=10)
        self.flatten = nn.Flatten()
    def forward(self, x):
        x = F.relu(self.conv2(F.relu(self.conv1(x))))
        x = self.max_pool(x)
        x = self.flatten(x)
        return self.fc(x)
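
The `in_features=16*16*channel2` of the final layer follows from tracing the tensor shape through the network: the two padded convolutions keep the 32x32 size and the single 2x2 pooling halves it. The sketch below does that bookkeeping and counts the parameters, assuming the 64 and 32 filters listed in Evaluation 2; `conv3x3_params` is a hypothetical helper.

```python
def conv3x3_params(in_channels, out_channels):
    """Weights plus biases of a 3x3 convolution."""
    return in_channels * out_channels * 3 * 3 + out_channels

channel1, channel2 = 64, 32          # filter counts assumed from Evaluation 2
side = 32                            # CIFAR10 images are 32x32
side_after_pool = side // 2          # one 2x2 max pooling halves each side
fc_in = side_after_pool * side_after_pool * channel2

conv_params = conv3x3_params(3, channel1) + conv3x3_params(channel1, channel2)
fc_params = fc_in * 10 + 10
total_params = conv_params + fc_params
print(fc_in, conv_params, fc_params, total_params)
```

Under these assumptions the CNN has roughly 100 thousand parameters, most of them in the final linear layer, since the convolutional layers reuse the same small kernels across the whole image.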

In [None]:
channel1 = 64   # Number of filters in the first layer
channel2 = 32   # Number of filters in the second layer
epochs = 20
lr = 0.0001
modelCNN1 = CNN_class1(3, channel1, channel2)
optimiser = torch.optim.Adam(modelCNN1.parameters(), lr)

# Train the model
train(modelCNN1, optimiser, epochs)

In [None]:
# Check the accuracy on the test data
accuracy(modelCNN1, test_loader)

# CNN WITH BATCH NORMALIZATION

---

##### **Evaluation 1**
- **Model**: `Convolutional NN`
- **Hyperparameters**:
  - `Learning Rate`: `0.0001`
  - `Epochs`: `25`
  - `Convolutional Layer 1`: `32`
  - `Batch Normalization`
  - `Convolutional Layer 2`: `64`
  - `Batch Normalization`
  - `Max Pooling`: `2x2`
  - `Convolutional Layer 3`: `128`
  - `Batch Normalization`
  - `Convolutional Layer 4`: `256`
  - `Batch Normalization`
  - `Max Pooling`: `2x2`

- **Results**:
  - **Train Accuracy**: **`71.14%`**
  - **Test Accuracy**: **`70.94%`**
---

##### **Evaluation 2**
- **Model**: `Convolutional NN`
- **Hyperparameters**:
  - `Learning Rate`: `0.0001`
  - `Epochs`: `25`
  - `Convolutional Layer 1`: `16`
  - `Batch Normalization`
  - `Convolutional Layer 2`: `32`
  - `Batch Normalization`
  - `Max Pooling`: `2x2`
  - `Convolutional Layer 3`: `64`
  - `Batch Normalization`
  - `Convolutional Layer 4`: `128`
  - `Batch Normalization`
  - `Max Pooling`: `2x2`

- **Results**:
  - **Train Accuracy**: **`73.62%`**
  - **Test Accuracy**: **`72.66%`**
---

In [7]:
# Convolutional network with four convolutional layers, each followed by batch normalization
# Max pooling is applied after each pair of convolutional layers
class CNN_class2(nn.Module):
    def __init__(self, in_channel, channel1, channel2, channel3, channel4):
        super().__init__()
        # First layer
        self.conv1 = conv_k_3(in_channel,channel1)
        self.bn1 = nn.BatchNorm2d(channel1)

        # Second layer
        self.conv2 = conv_k_3(channel1,channel2)
        self.bn2 = nn.BatchNorm2d(channel2)

        # Third layer
        self.conv3 = conv_k_3(channel2,channel3)
        self.bn3 = nn.BatchNorm2d(channel3)

        # Fourth layer
        self.conv4 = conv_k_3(channel3,channel4)
        self.bn4 = nn.BatchNorm2d(channel4)

        # Max pooling
        self.max_pool = nn.MaxPool2d(2,2)

        # Two 2x2 poolings reduce the 32x32 input to 8x8 feature maps
        self.fc = nn.Linear(in_features=8*8*channel4, out_features=10)
        self.flatten = nn.Flatten()
    def forward(self, x):

        # First convolution, then batch normalization, then the ReLU activation
        x = F.relu(self.bn1(self.conv1(x)))
        # Second convolution, then batch normalization, then the ReLU activation
        x = F.relu(self.bn2(self.conv2(x)))
        # Max pooling after the first pair of layers
        x = self.max_pool(x)

        # Third convolution, then batch normalization, then the ReLU activation
        x = F.relu(self.bn3(self.conv3(x)))
        # Fourth convolution, then batch normalization, then the ReLU activation
        x = F.relu(self.bn4(self.conv4(x)))
        # Max pooling after the second pair of layers
        x = self.max_pool(x)

        x = self.flatten(x)
        return self.fc(x)
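
As a reminder of what each `nn.BatchNorm2d` layer computes in training mode, the sketch below normalizes the values of a single channel to zero mean and roughly unit variance. It is a simplified illustration in plain Python: it omits the learnable scale and shift parameters and the running statistics used in evaluation mode, and the activation values are made up. `batch_norm` is a hypothetical helper; `eps=1e-5` mirrors the PyTorch default.

```python
import math

def batch_norm(values, eps=1e-5):
    """Normalize one channel's activations using the batch mean and biased variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

# Made-up activations of one channel across a batch
activations = [0.5, 2.0, -1.0, 3.5]
normalized = batch_norm(activations)

out_mean = sum(normalized) / len(normalized)
out_var = sum((v - out_mean) ** 2 for v in normalized) / len(normalized)
print(normalized, out_mean, out_var)
```

Keeping the activations on a comparable scale from layer to layer is one common explanation for why batch normalization lets deeper networks such as this one train stably at the same learning rate.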

In [13]:
channel1 = 16   # Number of filters in the first layer
channel2 = 32   # Number of filters in the second layer
channel3 = 64   # Number of filters in the third layer
channel4 = 128   # Number of filters in the fourth layer

epochs = 25
lr = 0.0001
modelCNN2 = CNN_class2(3, channel1, channel2, channel3, channel4)
optimiser = torch.optim.Adam(modelCNN2.parameters(), lr)

# Train the model
train(modelCNN2, optimiser, epochs)

Epoch: 1, costo: 1.0900850296020508, accuracy: 0.6068,
Epoch: 2, costo: 1.2050460577011108, accuracy: 0.6656,
Epoch: 3, costo: 0.8763821125030518, accuracy: 0.6928,
Epoch: 4, costo: 0.8058571219444275, accuracy: 0.6904,
Epoch: 5, costo: 0.5028868317604065, accuracy: 0.7026,
Epoch: 6, costo: 0.32012438774108887, accuracy: 0.719,
Epoch: 7, costo: 1.0410475730895996, accuracy: 0.7388,
Epoch: 8, costo: 0.7200767397880554, accuracy: 0.7286,
Epoch: 9, costo: 0.37254637479782104, accuracy: 0.734,
Epoch: 10, costo: 0.6962599158287048, accuracy: 0.7248,
Epoch: 11, costo: 0.416248083114624, accuracy: 0.7438,
Epoch: 12, costo: 0.2988698184490204, accuracy: 0.7364,
Epoch: 13, costo: 0.9376838803291321, accuracy: 0.7374,
Epoch: 14, costo: 0.6824215650558472, accuracy: 0.7444,
Epoch: 15, costo: 0.34041354060173035, accuracy: 0.738,
Epoch: 16, costo: 0.5254177451133728, accuracy: 0.741,
Epoch: 17, costo: 0.6479150056838989, accuracy: 0.744,
Epoch: 18, costo: 0.2111598253250122, accuracy: 0.7398,
Epoc

In [14]:
# Check the accuracy on the test data
accuracy(modelCNN2, test_loader)

0.7266
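
The validation accuracy in the log above rises quickly and then fluctuates around 0.74. A small sketch, using only the 18 epochs that are visible in the recorded output (the log is cut off after epoch 18), finds the best of those epochs; the variable names are hypothetical.

```python
# Validation accuracies copied from the visible part of the training log (epochs 1 to 18)
val_acc = [0.6068, 0.6656, 0.6928, 0.6904, 0.7026, 0.719,
           0.7388, 0.7286, 0.734, 0.7248, 0.7438, 0.7364,
           0.7374, 0.7444, 0.738, 0.741, 0.744, 0.7398]

# Epoch numbers are 1-based, list positions are 0-based
best_acc = max(val_acc)
best_epoch = val_acc.index(best_acc) + 1

# Improvement during the first 7 epochs versus the following 11 visible epochs
early_gain = val_acc[6] - val_acc[0]
late_gain = val_acc[-1] - val_acc[6]
print(best_epoch, best_acc, round(early_gain, 4), round(late_gain, 4))
```

Within the visible epochs almost all of the improvement happens in the first seven, which suggests that the remaining epochs add little for this configuration.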

# CNN WITH BATCH NORMALIZATION AND DROPOUT

---

##### **Evaluation 1**
- **Model**: `Convolutional NN`
- **Hyperparameters**:
  - `Learning Rate`: `0.0001`
  - `Epochs`: `30`
  - `Convolutional Layer 1`: `32`
  - `Batch Normalization`
  - `Convolutional Layer 2`: `64`
  - `Batch Normalization`
  - `Max Pooling`: `2x2`
  - `Convolutional Layer 3`: `128`
  - `Batch Normalization`
  - `Convolutional Layer 4`: `256`
  - `Batch Normalization`
  - `Max Pooling`: `2x2`

  - `Dropout`: `p = 0.5`

- **Results**:
  - **Train Accuracy**: **`71.14%`**
  - **Test Accuracy**: **`70.94%`**
---

In [11]:
# Convolutional network with four convolutional layers, each followed by batch normalization
# Max pooling is applied after each pair of layers, and dropout before the final linear layer
class CNN_class2(nn.Module):
    def __init__(self, in_channel, channel1, channel2, channel3, channel4):
        super().__init__()
        # First layer
        self.conv1 = conv_k_3(in_channel,channel1)
        self.bn1 = nn.BatchNorm2d(channel1)

        # Second layer
        self.conv2 = conv_k_3(channel1,channel2)
        self.bn2 = nn.BatchNorm2d(channel2)

        # Third layer
        self.conv3 = conv_k_3(channel2,channel3)
        self.bn3 = nn.BatchNorm2d(channel3)

        # Fourth layer
        self.conv4 = conv_k_3(channel3,channel4)
        self.bn4 = nn.BatchNorm2d(channel4)

        # Max pooling
        self.max_pool = nn.MaxPool2d(2,2)

        # Dropout (nn.Dropout defaults to p = 0.5)
        self.dropout = nn.Dropout()

        # Two 2x2 poolings reduce the 32x32 input to 8x8 feature maps
        self.fc = nn.Linear(in_features=8*8*channel4, out_features=10)
        self.flatten = nn.Flatten()
    def forward(self, x):

        # First convolution, then batch normalization, then the ReLU activation
        x = F.relu(self.bn1(self.conv1(x)))
        # Second convolution, then batch normalization, then the ReLU activation
        x = F.relu(self.bn2(self.conv2(x)))

        # Max pooling after the first pair of layers
        x = self.max_pool(x)

        # Third convolution, then batch normalization, then the ReLU activation
        x = F.relu(self.bn3(self.conv3(x)))
        # Fourth convolution, then batch normalization, then the ReLU activation
        x = F.relu(self.bn4(self.conv4(x)))

        # Max pooling after the second pair of layers
        x = self.max_pool(x)

        x = self.flatten(x)
        # Apply dropout to reduce overfitting
        x = self.dropout(x)

        return self.fc(x)
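
Dropout behaves differently in training and evaluation mode, which is why the calls to `model.train()` and `model.eval()` in the earlier cells matter for this network. The sketch below imitates the behavior of `nn.Dropout` in plain Python under the assumption 0 <= p < 1: in training, each value is zeroed with probability `p` and the survivors are scaled by 1/(1-p); in evaluation, the input passes through unchanged. The `dropout` function and the fixed seed are hypothetical choices for illustration.

```python
import random

def dropout(values, p=0.5, training=True, rng=random):
    """Inverted dropout: zero each value with probability p, rescale the rest."""
    if not training or p == 0:
        return list(values)          # evaluation mode: identity
    scale = 1.0 / (1.0 - p)          # keeps the expected value unchanged
    return [0.0 if rng.random() < p else v * scale for v in values]

rng = random.Random(0)               # fixed seed so the sketch is repeatable
x = [1.0] * 1000

train_out = dropout(x, p=0.5, training=True, rng=rng)
eval_out = dropout(x, p=0.5, training=False)

kept_fraction = sum(1 for v in train_out if v != 0.0) / len(train_out)
print(kept_fraction, set(train_out), eval_out == x)
```

Because the surviving activations are scaled up during training, no rescaling is needed at evaluation time, so the accuracy function can use the network as is.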

In [12]:
channel1 = 16   # Number of filters in the first layer
channel2 = 32   # Number of filters in the second layer
channel3 = 64   # Number of filters in the third layer
channel4 = 128   # Number of filters in the fourth layer

epochs = 30
lr = 0.0001
modelCNN2 = CNN_class2(3, channel1, channel2, channel3, channel4)
optimiser = torch.optim.Adam(modelCNN2.parameters(), lr)

# Train the model
train(modelCNN2, optimiser, epochs)

Epoch: 1, costo: 1.1141241788864136, accuracy: 0.5822,
Epoch: 2, costo: 1.13069748878479, accuracy: 0.6352,
Epoch: 3, costo: 0.9112893342971802, accuracy: 0.6542,
Epoch: 4, costo: 0.7465352416038513, accuracy: 0.7016,
Epoch: 5, costo: 0.8618942499160767, accuracy: 0.7122,
Epoch: 6, costo: 0.8661471009254456, accuracy: 0.7322,
Epoch: 7, costo: 1.0242751836776733, accuracy: 0.7318,
Epoch: 8, costo: 0.7354901432991028, accuracy: 0.7308,
Epoch: 9, costo: 0.39422786235809326, accuracy: 0.7468,
Epoch: 10, costo: 1.2691540718078613, accuracy: 0.7596,
Epoch: 11, costo: 0.8837869763374329, accuracy: 0.7508,
Epoch: 12, costo: 1.2687300443649292, accuracy: 0.7666,
Epoch: 13, costo: 0.6549330353736877, accuracy: 0.7634,
Epoch: 14, costo: 0.9818738698959351, accuracy: 0.767,
Epoch: 15, costo: 0.4830458164215088, accuracy: 0.7666,
Epoch: 16, costo: 0.7345582842826843, accuracy: 0.774,
Epoch: 17, costo: 0.5041328072547913, accuracy: 0.7782,
Epoch: 18, costo: 0.7752000689506531, accuracy: 0.7774,
Epoc

In [13]:
# Check the accuracy on the test data
accuracy(modelCNN2, test_loader)

0.7836