## CIFAR 100 Classification

# Stage 1: Create Dataset and Dataloader

In this notebook we tackle the CIFAR-100 classification problem using the CIFAR-100 dataset available in torchvision. We apply data augmentation to make the neural network more robust, and we use a batch size of 64.

This time I use 256x256 images (upscaled from CIFAR-100's native 32x32) so that the neural network can learn better, which required some changes in Stage 2.
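As a quick sanity check on the batch size, the sketch below estimates how many batches each loader yields per epoch. It assumes the standard CIFAR-100 split sizes (50,000 training and 10,000 test images) and the `DataLoader` default of keeping the last partial batch; the helper name `batches_per_epoch` is only illustrative.

```python
import math


def batches_per_epoch(num_samples: int, batch_size: int) -> int:
    """Batches yielded per epoch when the final, smaller batch is kept."""
    return math.ceil(num_samples / batch_size)


# Assumed split sizes: 50,000 training images and 10,000 test images.
train_batches = batches_per_epoch(50_000, 64)
test_batches = batches_per_epoch(10_000, 64)

print(train_batches, test_batches)  # 782 157
```

The 782 training batches match the step count shown by the progress bar in the Stage 3 output.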

In [3]:
import torch,os
from torchvision import datasets, transforms
from torch.utils.data import random_split, ConcatDataset, DataLoader

# Data augmentation to make the model more robust
transform_train = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomAdjustSharpness(sharpness_factor=2, p=0.5),
    transforms.RandomResizedCrop(256),
    transforms.ToTensor(),  # Converts the image into a tensor
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) 
])

transform_test = transforms.Compose([
    transforms.Resize((256, 256)),  # Resizes the image to 256x256
    transforms.ToTensor(),  # Converts the image into a tensor
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # Normalizes the tensors (mean and std deviation for 3 color channels)
])

# Create Train dataset
train_dataset = datasets.CIFAR100(root = './dataset/train',download=True, train=True, transform=transform_train)
# Create Test dataset
test_dataset = datasets.CIFAR100(root = './dataset/test',download=True, train=False, transform=transform_test) 

#Create train loader
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
#Create test loader
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=True)

Files already downloaded and verified
Files already downloaded and verified


# Stage 2: Building the neural network model

Stride: The stride refers to how many pixels the filter moves through the image or input volume at each step during the convolution operation. A stride of 1 means that the filter moves one pixel at a time. A stride of 2 means that the filter moves two pixels at a time, and so on. A larger stride will result in a lower spatial dimension output.

Padding: Padding refers to the addition of extra pixels around the input image or volume before applying the convolution operation. The purpose of padding is to control the spatial dimension of the output. It is especially useful when you want to keep the spatial dimensions of the input and output the same after the convolution operation.

This time I use four convolutional layers, which lets the network learn more complex shapes.
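To make the effect of stride and padding concrete, here is a minimal sketch of the standard output-size formula, `floor((size + 2 * padding - kernel) / stride) + 1`, applied to a 256x256 input passing through four blocks of a 3x3 convolution (stride 1, padding 1) followed by 2x2 max pooling (stride 2). The helper name `conv_out` is an assumption made for illustration, not part of PyTorch.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


size = 256  # input height and width
for block in range(1, 5):
    # 3x3 convolution with stride 1 and padding 1 keeps the spatial size.
    size = conv_out(size, kernel=3, stride=1, padding=1)
    # 2x2 max pooling with stride 2 halves it.
    size = conv_out(size, kernel=2, stride=2)
    print(f"after block {block}: {size}x{size}")

# 256 output channels in the last block, flattened for the first linear layer.
flatten_dim = 256 * size * size
print(flatten_dim)  # 65536
```

This is where the `256 * 16 * 16` input size of the first fully connected layer comes from: four halvings take 256 down to 16.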

In [9]:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # First convolutional layer: input channels = 3 (RGB), output channels = 32
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1)
        # Second convolutional layer: input channels = 32 (from previous layer), output channels = 64
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        # Third convolutional layer: input channels = 64 (from previous layer), output channels = 128
        self.conv3 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        # Fourth convolutional layer: input channels = 128 (from previous layer), output channels = 256
        self.conv4 = nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1)
        # Max pooling layer
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        # Dropout layer
        self.dropout = nn.Dropout(0.1)
        # First fully connected layer, input size should match the output size of the last conv layer
        self.fc1 = nn.Linear(256 * 16 * 16, 500)
        # Second fully connected layer, output size is the same as the number of classes
        self.fc2 = nn.Linear(500, 100)

    def forward(self, x):
        # Apply first conv layer, followed by ReLU, then max pooling
        x = self.pool(F.relu(self.conv1(x)))
        # Apply second conv layer, followed by ReLU, then max pooling
        x = self.pool(F.relu(self.conv2(x)))
        # Apply third conv layer, followed by ReLU, then max pooling
        x = self.pool(F.relu(self.conv3(x)))
        # Apply fourth conv layer, followed by ReLU, then max pooling
        x = self.pool(F.relu(self.conv4(x)))
        # Flatten the tensor output from the conv layers
        x = x.view(-1, 256 * 16 * 16)
        # Apply first fully connected layer with ReLU after applying dropout
        x = F.relu(self.fc1(self.dropout(x)))
        # Apply second fully connected layer after applying dropout
        x = self.fc2(self.dropout(x))
        return x


# Stage 3: Train model

For this, we need to define a loss function and an optimiser. We will use Cross Entropy as our loss function, as it is a good choice for classification problems. For the optimiser, we will use Adam.
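For intuition about the loss values, the following sketch computes cross entropy for a single sample by hand, in plain Python. It assumes raw logits as input, which is also what `nn.CrossEntropyLoss` expects; the function name `cross_entropy` and the example logits are illustrative only.

```python
import math


def cross_entropy(logits: list[float], target: int) -> float:
    """Negative log-probability of the target class under a softmax."""
    # Subtract the maximum logit before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# A model that assigns equal scores to all 100 classes is guessing at random.
uniform_loss = cross_entropy([0.0] * 100, target=7)
print(round(uniform_loss, 3))  # 4.605, i.e. ln(100)

# Raising the logit of the correct class lowers the loss.
confident_logits = [5.0 if i == 7 else 0.0 for i in range(100)]
confident_loss = cross_entropy(confident_logits, target=7)
print(confident_loss < uniform_loss)  # True
```

A loss near ln(100) ≈ 4.605 therefore corresponds to random guessing over 100 classes, which gives a reference point for reading the training logs.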

Furthermore, we train on the CIFAR-100 training split and use the test split as a validation set. During each epoch, we train the model on the training set and then evaluate it on the validation set. If the validation loss improves, we save the model.
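The "save only when validation improves" rule can be sketched independently of PyTorch. The helper `checkpoint_epochs` below is hypothetical, and the sample values are the first five mean validation losses from this notebook's training log, rounded to four decimals. Unlike the actual run, the sketch starts from `float("inf")` rather than from a best value carried over from an earlier session.

```python
def checkpoint_epochs(val_losses, best_so_far=float("inf")):
    """Return the 1-based epochs at which a checkpoint would be saved."""
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_so_far:
            best_so_far = loss
            saved.append(epoch)
    return saved, best_so_far


# Rounded mean validation losses of epochs 1 to 5 from the training log.
losses = [2.4579, 2.4714, 2.4793, 2.4753, 2.4352]
saved, best = checkpoint_epochs(losses)
print(saved, best)  # [1, 5] 2.4352
```

Passing the previous best as `best_so_far` reproduces the resume behaviour, where early epochs of a new session may not trigger a save at all.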

At first I used lr = 0.01 and dropout = 0.5, but the network could not learn. With the current setup (lr = 0.001, dropout = 0.2) and 42 epochs, the loss improved from 5.7... to 2.622346130905637 and is still getting better.

In [12]:
import torch.optim as optim
from torch.utils.data import random_split, DataLoader
from torchvision import transforms
from tqdm import tqdm
import torch

# Try to use CUDA if possible
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("It is using: " + device.type)

# Path to save the model
model_path = 'best_model.pth'

# Initialize the network, resuming from a saved checkpoint when one exists
model = Net()
if os.path.exists(model_path):
    print("Previous mode was loaded.")
    # map_location lets a checkpoint saved on a GPU load on a CPU-only machine
    model.load_state_dict(torch.load(model_path, map_location=device))
else:
    print("No previous model found.")
model = model.to(device)

    
# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-5)  # L2 regularization

# Define a number of training epochs
epochs = 30

# Best validation loss reached so far (summed over all test batches), carried over between runs.
# Use float('inf') instead when training from scratch.
best_val_loss = 368.8841189146042

# Imports for the evaluation metrics and plots
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

# Placeholder names for the 100 classes of the CIFAR-100 dataset
class_names = [f'class_{i}' for i in range(100)]

# Training loop

for epoch in range(epochs):
    # Set the model to training mode
    model.train()
    
    # Create a progress bar
    progress_bar = tqdm(train_loader, desc=f'Epoch {epoch+1}')
    
    for inputs, labels in progress_bar:
        # Move data to the GPU if available
        inputs, labels = inputs.to(device), labels.to(device)

        # Zero the parameter gradients
        optimizer.zero_grad()

        # Forward pass
        outputs = model(inputs)
        loss = criterion(outputs, labels)

        # Backward pass and optimization
        loss.backward()
        optimizer.step()

        # Update the progress bar
        progress_bar.set_postfix({'training_loss': loss.item()})

    # Initialize lists to store predictions and labels
    all_preds = []
    all_labels = []

    # Set the model to evaluation mode
    model.eval()
    val_loss = 0
    with torch.no_grad():
        for inputs, labels in test_loader:
            # Move data to the GPU if available
            inputs, labels = inputs.to(device), labels.to(device)

            # Forward pass
            outputs = model(inputs)
            loss = criterion(outputs, labels)

            # Update the validation loss
            val_loss += loss.item()

            # Get predictions
            _, preds = torch.max(outputs, 1)
            all_preds.extend(preds.cpu().numpy())
            all_labels.extend(labels.cpu().numpy())

    # Calculate and print precision, recall, F1-score, and accuracy
    print(classification_report(all_labels, all_preds, target_names=class_names))

    # Calculate and plot the confusion matrix
    cm = confusion_matrix(all_labels, all_preds)
    plt.figure(figsize=(12, 10))
    sns.heatmap(cm, annot=True, fmt="d",
                xticklabels=class_names, yticklabels=class_names)
    plt.ylabel("Real value")
    plt.xlabel("Predicted value")
    plt.show()

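    # Print the mean validation loss per batch for this epoch
    print(f'Epoch {epoch+1}, Validation Loss: {val_loss/len(test_loader)}')

    # Save the model if it has the best validation loss so far.
    # The Spanish message reads "The current loss is {val_loss} and the best is {best_val_loss}";
    # both values are sums over all test batches, not per-batch means.
    print(f'El loss actual es {val_loss} y el mejor es {best_val_loss}')
    if val_loss < best_val_loss:
        print("model saved")
        best_val_loss = val_loss
        torch.save(model.state_dict(), model_path)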

print('Finished Training')

It is using: cuda
Previous mode was loaded.


Epoch 1: 100%|██████████| 782/782 [02:56<00:00,  4.42it/s, training_loss=2.33]


Epoch 1, Validation Loss: 2.457882341305921
El loss actual es 385.8875275850296 y el mejor es 381.35527324676514


Epoch 2: 100%|██████████| 782/782 [02:57<00:00,  4.40it/s, training_loss=2.65]


Epoch 2, Validation Loss: 2.471396891934097
El loss actual es 388.00931203365326 y el mejor es 381.35527324676514


Epoch 3: 100%|██████████| 782/782 [02:56<00:00,  4.42it/s, training_loss=3.46]


Epoch 3, Validation Loss: 2.4793459670558855
El loss actual es 389.25731682777405 y el mejor es 381.35527324676514


Epoch 4: 100%|██████████| 782/782 [02:57<00:00,  4.41it/s, training_loss=2.68]


Epoch 4, Validation Loss: 2.47534755992282
El loss actual es 388.6295669078827 y el mejor es 381.35527324676514


Epoch 5: 100%|██████████| 782/782 [02:56<00:00,  4.42it/s, training_loss=2.8] 


Epoch 5, Validation Loss: 2.4352493232982173
El loss actual es 382.33414375782013 y el mejor es 381.35527324676514


Epoch 6: 100%|██████████| 782/782 [02:56<00:00,  4.42it/s, training_loss=3.12]


Epoch 6, Validation Loss: 2.4388904624683843
El loss actual es 382.9058026075363 y el mejor es 381.35527324676514


Epoch 7: 100%|██████████| 782/782 [02:56<00:00,  4.42it/s, training_loss=2.36]


Epoch 7, Validation Loss: 2.4659176981373196
El loss actual es 387.1490786075592 y el mejor es 381.35527324676514


Epoch 8: 100%|██████████| 782/782 [02:57<00:00,  4.41it/s, training_loss=2.65]


Epoch 8, Validation Loss: 2.432062246237591
El loss actual es 381.83377265930176 y el mejor es 381.35527324676514


Epoch 9: 100%|██████████| 782/782 [02:57<00:00,  4.42it/s, training_loss=2.05]


Epoch 9, Validation Loss: 2.4212390480527453
El loss actual es 380.134530544281 y el mejor es 381.35527324676514
model saved


Epoch 10: 100%|██████████| 782/782 [02:57<00:00,  4.41it/s, training_loss=2.17]


Epoch 10, Validation Loss: 2.4139975931993716
El loss actual es 378.99762213230133 y el mejor es 380.134530544281
model saved


Epoch 11: 100%|██████████| 782/782 [02:56<00:00,  4.42it/s, training_loss=2.68]


Epoch 11, Validation Loss: 2.418883190033542
El loss actual es 379.7646608352661 y el mejor es 378.99762213230133


Epoch 12: 100%|██████████| 782/782 [02:57<00:00,  4.41it/s, training_loss=3.25]


Epoch 12, Validation Loss: 2.413293953913792
El loss actual es 378.88715076446533 y el mejor es 378.99762213230133
model saved


Epoch 13: 100%|██████████| 782/782 [02:57<00:00,  4.41it/s, training_loss=2.77]


Epoch 13, Validation Loss: 2.4218598308077284
El loss actual es 380.23199343681335 y el mejor es 378.88715076446533


Epoch 14: 100%|██████████| 782/782 [02:56<00:00,  4.42it/s, training_loss=2.97]


Epoch 14, Validation Loss: 2.4128958427222673
El loss actual es 378.82464730739594 y el mejor es 378.88715076446533
model saved


Epoch 15: 100%|██████████| 782/782 [02:56<00:00,  4.42it/s, training_loss=2.8] 


Epoch 15, Validation Loss: 2.3972633742982414
El loss actual es 376.3703497648239 y el mejor es 378.82464730739594
model saved


Epoch 16: 100%|██████████| 782/782 [02:56<00:00,  4.42it/s, training_loss=2.46]


Epoch 16, Validation Loss: 2.384923676016984
El loss actual es 374.43301713466644 y el mejor es 376.3703497648239
model saved


Epoch 17: 100%|██████████| 782/782 [02:57<00:00,  4.41it/s, training_loss=2.48]


Epoch 17, Validation Loss: 2.4300110362897254
El loss actual es 381.5117326974869 y el mejor es 374.43301713466644


Epoch 18: 100%|██████████| 782/782 [02:56<00:00,  4.42it/s, training_loss=2.51]


Epoch 18, Validation Loss: 2.4310199842331515
El loss actual es 381.6701375246048 y el mejor es 374.43301713466644


Epoch 19: 100%|██████████| 782/782 [02:57<00:00,  4.41it/s, training_loss=2.18]


Epoch 19, Validation Loss: 2.4399983040086783
El loss actual es 383.0797337293625 y el mejor es 374.43301713466644


Epoch 20: 100%|██████████| 782/782 [02:57<00:00,  4.42it/s, training_loss=3.39]


Epoch 20, Validation Loss: 2.3580068782636316
El loss actual es 370.20707988739014 y el mejor es 374.43301713466644
model saved


Epoch 21: 100%|██████████| 782/782 [02:56<00:00,  4.42it/s, training_loss=2.94]


Epoch 21, Validation Loss: 2.4301258135753074
El loss actual es 381.52975273132324 y el mejor es 370.20707988739014


Epoch 22: 100%|██████████| 782/782 [02:56<00:00,  4.43it/s, training_loss=2.48]


Epoch 22, Validation Loss: 2.3499532801330467
El loss actual es 368.94266498088837 y el mejor es 370.20707988739014
model saved


Epoch 23: 100%|██████████| 782/782 [02:56<00:00,  4.43it/s, training_loss=3.04]


Epoch 23, Validation Loss: 2.3892290349219256
El loss actual es 375.1089584827423 y el mejor es 368.94266498088837


Epoch 24: 100%|██████████| 782/782 [02:57<00:00,  4.41it/s, training_loss=2.87]


Epoch 24, Validation Loss: 2.349580375252256
El loss actual es 368.8841189146042 y el mejor es 368.94266498088837
model saved


Epoch 25: 100%|██████████| 782/782 [02:57<00:00,  4.42it/s, training_loss=2.49]


Epoch 25, Validation Loss: 2.4026530312884384
El loss actual es 377.21652591228485 y el mejor es 368.8841189146042


Epoch 26: 100%|██████████| 782/782 [02:56<00:00,  4.43it/s, training_loss=3.27]


Epoch 26, Validation Loss: 2.3621443175965813
El loss actual es 370.85665786266327 y el mejor es 368.8841189146042


Epoch 27: 100%|██████████| 782/782 [02:57<00:00,  4.41it/s, training_loss=2.89]


Epoch 27, Validation Loss: 2.3597951087222735
El loss actual es 370.487832069397 y el mejor es 368.8841189146042


Epoch 28: 100%|██████████| 782/782 [02:56<00:00,  4.42it/s, training_loss=3.24]


Epoch 28, Validation Loss: 2.3563966136069814
El loss actual es 369.9542683362961 y el mejor es 368.8841189146042


Epoch 29: 100%|██████████| 782/782 [02:56<00:00,  4.43it/s, training_loss=2.94]


Epoch 29, Validation Loss: 2.376310438107533
El loss actual es 373.0807387828827 y el mejor es 368.8841189146042


Epoch 30: 100%|██████████| 782/782 [02:56<00:00,  4.43it/s, training_loss=2.94]


Epoch 30, Validation Loss: 2.3991162754168176
El loss actual es 376.66125524044037 y el mejor es 368.8841189146042
Finished Training


# Stage 4: My own tests

As the predictions below show, the neural network still needs more epochs to improve its results.
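One rough way to quantify this is to compare predicted and true label names directly. The sketch below uses a hypothetical helper, `top1_accuracy`, and only the first eight label pairs from the output of the cell that follows, so the resulting figure is illustrative rather than a measure of the model's overall accuracy.

```python
def top1_accuracy(predicted: list[str], true: list[str]) -> float:
    """Fraction of positions where the predicted label equals the true label."""
    if len(predicted) != len(true):
        raise ValueError("predicted and true must have the same length")
    if not predicted:
        return 0.0
    correct = sum(p == t for p, t in zip(predicted, true))
    return correct / len(predicted)


# First eight label pairs from the batch printed in this section.
predicted = ['oak_tree', 'shrew', 'otter', 'bee',
             'rose', 'snail', 'keyboard', 'clock']
true = ['maple_tree', 'shrew', 'shark', 'bee',
        'tulip', 'mushroom', 'telephone', 'hamster']

accuracy = top1_accuracy(predicted, true)
print(accuracy)  # 0.25
```

Several of the misses in this small sample are between related classes (`oak_tree` vs `maple_tree`, `rose` vs `tulip`).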

In [13]:
import pickle
# Load the saved model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("It is using: " + device.type)
model = Net().to(device)
# map_location lets a checkpoint saved on a GPU load on a CPU-only machine
model.load_state_dict(torch.load('best_model.pth', map_location=device))

# Set the model to evaluation mode
model.eval()

# Get a batch of validation data
inputs, labels = next(iter(test_loader))
inputs = inputs.to(device)

# Make predictions
with torch.no_grad():
    outputs = model(inputs)

probabilities = F.softmax(outputs, dim=1)

# The outputs are probabilities for each class. To get the predicted class, we take the index of the highest probability.
_, preds = torch.max(probabilities, 1)

# Load the CIFAR-100 label names
with open('./dataset/train/cifar-100-python/meta', 'rb') as file:
    data = pickle.load(file, encoding='bytes')
    fine_label_names = [t.decode('utf8') for t in data[b'fine_label_names']]

# Use the label names to print the predicted and true classes
print('Predicted:', [fine_label_names[i] for i in preds])
print('True:     ', [fine_label_names[i] for i in labels])


It is using: cuda
Predicted: ['oak_tree', 'shrew', 'otter', 'bee', 'rose', 'snail', 'keyboard', 'clock', 'table', 'squirrel', 'skyscraper', 'cockroach', 'sea', 'lobster', 'table', 'sunflower', 'ray', 'spider', 'train', 'chimpanzee', 'flatfish', 'dinosaur', 'bear', 'willow_tree', 'house', 'butterfly', 'chimpanzee', 'forest', 'plate', 'wardrobe', 'rose', 'caterpillar', 'wolf', 'cockroach', 'whale', 'orchid', 'leopard', 'baby', 'butterfly', 'crocodile', 'raccoon', 'shrew', 'beaver', 'oak_tree', 'porcupine', 'leopard', 'elephant', 'palm_tree', 'clock', 'table', 'pine_tree', 'baby', 'bed', 'train', 'raccoon', 'cup', 'chimpanzee', 'bed', 'bee', 'beetle', 'leopard', 'cockroach', 'snail', 'spider']
True:      ['maple_tree', 'shrew', 'shark', 'bee', 'tulip', 'mushroom', 'telephone', 'hamster', 'sea', 'bicycle', 'skyscraper', 'butterfly', 'sea', 'leopard', 'table', 'sunflower', 'ray', 'spider', 'couch', 'cup', 'flatfish', 'dinosaur', 'mouse', 'willow_tree', 'house', 'pear', 'chimpanzee', 'televi