## 5.2 Going deeper with ResNet Model

⚠️⚠️⚠️ *Please open this notebook in Google Colab* by clicking the link below ⚠️⚠️⚠️<br><br>
<a href="https://colab.research.google.com/github/Muhammad-Yunus/Belajar-Image-Classification/blob/main/Pertemuan%205/5.2%20going_deeper_with_resnet_model.ipynb" target="_blank"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a><br><br><br>
- Click the `Connect` button at the top right of the Google Colab notebook,<br>
<img src="resource/cl-connect-gpu.png" width="250px">
- Once the connection is established, the button changes to something like this<br>
<img src="resource/cl-connect-gpu-success.png" width="250px">

- Check that the GPU attached to the Colab environment is active

In [None]:
!nvidia-smi

#### 5.2.1 Deeper Network vs Model Performance
- In the previous experiment, we achieved good enough image classification performance by simply stacking several layers into a network, <br>
<img src="resource/small-net.png" width="100%"> <br>
- What if we keep adding layers and make the network deeper and deeper?
    - A deeper convolutional neural network can sometimes give the model the capacity to learn better.
- This only holds until the training and validation accuracy become <font color="orange">saturated (stop improving)</font>, or even get worse.<br>
<img src="resource/deep-network.png" width="100%"><br>
- This problem is known as network <font color="orange">degradation</font>.
    - Adding extra layers brings no benefit, and accuracy stays the same or even drops.
    - The network may experience <font color="orange">vanishing/exploding gradients</font>, making it hard for the optimizer to <font color="orange">converge</font>. <br>
    <img src="resource/vanish-exploding-grad.png" width="500px"><br>
- This explains why stacking more layers in a deep neural network does not always make the model learn better.
- In that situation, the best an extra layer can do is <font color="orange">learn to do nothing</font>, leaving the result <font color="orange">unchanged</font>.
    - The layer then acts like an <font color="orange">Identity Function</font>, <br>
    <img src="resource/Identity-Function.png" width="550px"><br>
- In the illustration above, we can verify that <font color="orange">with or without the additional layer</font>, the result is <font color="orange">unchanged</font>, 
    - So it makes sense to just <font color="orange">skip it</font> (a.k.a. <font color="cyan">Skip Connection</font>).    
    - This helps ensure the network <font color="orange">won’t degrade</font>.
- A <font color="cyan">Skip Connection</font> allows the input to <font color="orange">skip</font> one or more layers and get <font color="orange">added</font> to the output of a later layer. 
    - This effectively means the network learns both the <font color="orange">transformation</font> and an <font color="orange">identity mapping</font>.<br>
    <img src="resource/residual-block-cat.png" width="600px"><br>
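
The effect of depth on gradients, and why an additive skip path helps, can be illustrated with a toy scalar model. This is only a sketch under strong simplifying assumptions: each layer is a single multiplication by a hypothetical weight `w`, with no activation or normalization, and `chain_gradient` is an illustrative helper that is not part of this notebook.

```python
def chain_gradient(w, depth, skip=False):
    """Gradient d(output)/d(input) of a toy chain of `depth` scalar layers.

    Plain layer: y = w * x      -> local gradient w
    Skip layer:  y = x + w * x  -> local gradient 1 + w
    """
    local_grad = 1.0 + w if skip else w
    grad = 1.0
    for _ in range(depth):
        grad *= local_grad  # chain rule: multiply the local gradients
    return grad

print(chain_gradient(0.5, depth=20))             # shrinks towards 0 (vanishing)
print(chain_gradient(1.5, depth=20))             # grows very large (exploding)
print(chain_gradient(0.0, depth=20, skip=True))  # stays at 1.0
```

With the skip path, a layer that has "learned nothing" (`w = 0`) passes the gradient through unchanged, which matches the identity-function intuition above.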

#### 5.2.2 Residual Block
- Based on the idea above, Kaiming He et al., in their paper *['Deep Residual Learning for Image Recognition' - arxiv.org](https://arxiv.org/abs/1512.03385)*, proposed the <font color="orange">Residual Block</font> to handle the degradation problem in very deep neural networks. <br>
<img src="resource/residual-block.png" width="500px"><br><i>regular block (left) vs Residual Block (right) - source [[link](https://d2l.ai/chapter_convolutional-modern/resnet.html)]</i><br><br>
- It's called <font color="orange">"residual"</font> because it represents the difference between the <font color="orange">original signal</font> (input : $x$) and the <font color="orange">modified signal</font> (output : $F(x)$).
    - In the context of neural networks, the residual $G(x) = F(x) - x$ captures what <font color="orange">remains after subtracting</font> the original signal from the modified one. 
    - It’s like a <font color="orange">visual residue</font> of the changes made. 
    - This concept helps the model focus on learning the changes or details that improve performance, rather than relearning everything from scratch.


- Implementation of the <font color="orange">Residual Block</font> in PyTorch

In [None]:
import cv2
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms as transforms

- Define <font color="orange">Residual Block</font> following this structure, <br>
<img src="resource/residual-block-2.png" width="700px">

In [None]:
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super(ResidualBlock, self).__init__()
        # Define first convolutional layer with batch normalization
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)

        # Define second convolutional layer with batch normalization
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        # Apply first convolution -> batch normalization -> ReLU activation
        out = F.relu(self.bn1(self.conv1(x)))
        # Apply second convolution -> batch normalization
        out = self.bn2(self.conv2(out))
        # Add the input to the output
        out += x
        # Apply final ReLU activation
        out = F.relu(out)
        return out

In [None]:
# Load and preprocess an image
image = cv2.imread("cat.jpg")  # Load image 'cat.jpg' using OpenCV
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Convert BGR to RGB

# Convert image to tensor
image = torch.from_numpy(image).to(torch.float32) # Convert Numpy array to PyTorch tensor
image = image.permute(2, 0, 1)  # Change the order of dimensions from (H, W, C) to (C, H, W)
input_image = image.unsqueeze(0)  # Add batch dimension (1, C, H, W)
input_image = input_image / 255.0 # Normalize tensor to [0, 1]

In [None]:
# Create a residual block and process the image
rb = ResidualBlock(channels=3)
residual_image = rb(input_image)

In [None]:
# Calculate the residual (absolute difference between input and output)
visual_residual = torch.abs(input_image - residual_image)

In [None]:
import numpy as np

def rescale_tensor(tensor):
    # find the min and max values of the tensor
    min_val = torch.min(tensor)
    max_val = torch.max(tensor)
    
    # Normalize the tensor to the range [0, 1]
    # (clamp the denominator so a constant tensor does not divide by zero)
    normalized_tensor = (tensor - min_val) / (max_val - min_val).clamp(min=1e-8)
    
    # Scale to the range 0 - 255
    rescaled_tensor = normalized_tensor * 255

    # convert the tensor to an 8-bit NumPy array with shape (H, W, C)
    rescaled_tensor = rescaled_tensor.squeeze().permute(1, 2, 0).detach().numpy().astype(np.uint8)
    return rescaled_tensor

# convert & rescale the tensors to 8-bit NumPy arrays
input_image_np = rescale_tensor(input_image)
residual_image_np = rescale_tensor(residual_image)
visual_residual_np = rescale_tensor(visual_residual)

In [None]:
import matplotlib.pyplot as plt

def imshow(input_image_np, residual_image_np, visual_residual_np) :
    fig, ax = plt.subplots(1, 3, figsize=(15, 5))
    ax[0].imshow(input_image_np)
    ax[0].set_title('Input Image')
    ax[0].axis('off')

    ax[1].imshow(residual_image_np)
    ax[1].set_title('Output Image (After Residual Block)')
    ax[1].axis('off')

    ax[2].imshow(visual_residual_np)
    ax[2].set_title('Visual Residual')
    ax[2].axis('off')


# Plot the images
imshow(input_image_np, residual_image_np, visual_residual_np)
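
The "learn to do nothing" idea can also be checked numerically. The sketch below redefines a block with the same structure as the `ResidualBlock` above (under the hypothetical name `TinyResidualBlock`, so the snippet is self-contained) and, as an assumption made purely for illustration, zeroes the scale of the second batch normalization so the residual branch outputs zeros. A small random tensor stands in for the normalized image.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyResidualBlock(nn.Module):
    # Same structure as the ResidualBlock above, redefined to keep this sketch self-contained
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)

torch.manual_seed(0)
block = TinyResidualBlock(channels=3)
nn.init.zeros_(block.bn2.weight)  # assumption: force the residual branch to output zeros

x = torch.rand(1, 3, 8, 8)        # stand-in for a normalized image in [0, 1)
y = block(x)
max_residual = torch.abs(x - y).max().item()
print(max_residual)               # largest difference between input and output
```

With the residual branch silenced, the block reduces to the identity for non-negative inputs, so the visual residual vanishes. With the default random initialization used in the cells above, the residual is generally non-zero.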

#### 5.2.3 Residual Block vs Dimension Mismatch
- In the example above, we saw how to implement a Residual Block in PyTorch, with some limitations:
    1. The <font color="orange">number of channels</font> in the input and output must be <font color="orange">exactly the same</font>,
        ```
        rb = ResidualBlock(channels=3)
        ```
    2. The <font color="orange">spatial size (W, H)</font> of the input and output must be <font color="orange">exactly the same</font>, which we get by setting the stride and padding of the 3x3 convolution to 1,
        ```
        nn.Conv2d(.... stride=1, padding=1 ....)
        ```
- Usually, we gradually <font color="orange">reduce</font> the spatial size (W, H) and increase the number of output channels as the network gets deeper.
- The network needs this to learn <font color="orange">more features</font> and to <font color="orange">simplify the abstraction</font> from its input to its output.
- With the implementation above, we can't simply change the number of channels or reduce the spatial size, since that would cause a <font color="cyan">dimension mismatch</font> when applying the skip connection inside the <font color="orange">Residual Block</font>.<br>
<img src="resource/dimention-missmatch.png" width="700px">

- To handle this problem, we can apply a <font color="orange">1x1 Convolution</font> to the input on the skip connection, with a <font color="orange">configurable</font> number of output <font color="orange">channels</font> and <font color="orange">stride</font>.
- <font color="orange">Residual block</font> with vs without the <font color="orange">1x1 Convolution</font>, which transforms the input into the shape required by the addition operation.<br>
<img src="resource/residual-block-3.png" width="700px"><br>
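
The reason a strided 1x1 convolution lines up with the main path can be checked with the standard convolution output-size formula, `floor((size + 2 * padding - kernel_size) / stride) + 1`. The helper name `conv_output_size` and the example size of 224 below are assumptions chosen for illustration.

```python
def conv_output_size(size, kernel_size, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = 224  # example input height/width

# 3x3 convolution with stride 1 and padding 1 keeps the size unchanged
same = conv_output_size(size, kernel_size=3, stride=1, padding=1)
# 3x3 convolution with stride 2 and padding 1 (main path) halves the size
main_path = conv_output_size(size, kernel_size=3, stride=2, padding=1)
# 1x1 convolution with stride 2 and no padding (shortcut) gives the same result
shortcut = conv_output_size(size, kernel_size=1, stride=2, padding=0)

print(same, main_path, shortcut)  # 224 112 112
```

Because both expressions reduce to `(size - 1) // stride + 1`, the main path and the shortcut produce the same spatial size for any input size, so the addition is always well defined.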

- Modified <font color="orange">ResidualBlock()</font> implementation with an additional 1x1 Convolution,

In [None]:
class ResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super(ResidualBlock, self).__init__()
        # Define first convolutional layer with batch normalization
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        
        # Define second convolutional layer with batch normalization
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        
        # Shortcut connection to match input and output dimensions if needed
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels: # check dimension reduced (stride > 1) or number of channel change
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False), # apply 1x1 convolution
                nn.BatchNorm2d(out_channels) # additional normalization
            )

    def forward(self, x):
        # Apply first convolution and batch normalization, then ReLU activation
        out = F.relu(self.bn1(self.conv1(x)))
        # Apply second convolution and batch normalization
        out = self.bn2(self.conv2(out))
        # Add the shortcut connection to the output
        out += self.shortcut(x)
        # Apply final ReLU activation
        out = F.relu(out)
        return out

In [None]:
# Create a residual block and process the image
# double the number of channels and halve the spatial size

rb = ResidualBlock(in_channels=3, out_channels=6, stride=2)
residual_image = rb(input_image)

In [None]:
# check the output shape

residual_image.shape

#### 5.2.3 ResNet Architecture

- The example above covered how to implement a Residual Block even when the spatial size (W, H) and/or the number of channels (C) do not match.
- Now we will learn how Kaiming He et al. use the Residual Block in the ResNet model architecture.
- There are five versions of the ResNet model, with 18, 34, 50, 101, and 152 layers.
- Here we will explore the simplest one, ResNet-18, which is 18 layers deep. 
<img src="resource/Resnet-18-Model.png" width="100%"><br><i>simplified ResNet18 architecture - source [[link](https://d2l.ai/chapter_convolutional-modern/resnet.html)]</i><br><br><br>
- Full ResNet-18 architecture
<img src="resource/Resnet-18-Model-Long.png" width="100%">
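
The layer counts in the model names can be reproduced with a little arithmetic. The sketch below counts only the weighted layers on the main path (the first convolution, the convolutions inside the blocks, and the final fully connected layer). The helper name `count_weighted_layers` is hypothetical. The block counts for ResNet-34 and ResNet-50, and the three-convolution "bottleneck" block used by ResNet-50 and deeper variants, come from the original paper and are not implemented in this notebook.

```python
def count_weighted_layers(blocks_per_stage, convs_per_block):
    """Count conv/linear layers on the main path: first conv + block convs + final FC.

    The 1x1 shortcut convolutions are conventionally not counted.
    """
    first_conv = 1
    final_fc = 1
    return first_conv + sum(blocks_per_stage) * convs_per_block + final_fc

print(count_weighted_layers([2, 2, 2, 2], convs_per_block=2))  # 18
print(count_weighted_layers([3, 4, 6, 3], convs_per_block=2))  # 34
print(count_weighted_layers([3, 4, 6, 3], convs_per_block=3))  # 50 (bottleneck blocks)
```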



- Implementation of ResNet-18 in PyTorch

In [None]:
# Define the Residual Block

class ResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super(ResidualBlock, self).__init__()
        
        # Define first convolutional layer with batch normalization
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        
        # Define second convolutional layer with batch normalization
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        
        # Shortcut connection to match input and output dimensions if needed
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels: # check dimension reduced (stride > 1) or number of channel change
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False), # apply 1x1 convolution
                nn.BatchNorm2d(out_channels) # additional normalization
            )

    def forward(self, x):
        # Apply first convolution and batch normalization, then ReLU activation
        out = F.relu(self.bn1(self.conv1(x)))
        # Apply second convolution and batch normalization
        out = self.bn2(self.conv2(out))
        # Add the shortcut connection to the output
        out += self.shortcut(x)
        # Apply final ReLU activation
        out = F.relu(out)
        return out

In [None]:
class ResNet(nn.Module):
    def __init__(self, block, num_blocks, num_classes=1000):
        super(ResNet, self).__init__()

        self.in_channels = 64 # initial input channel size

        # Initial convolutional layer
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

        # Layer definitions
        self.layer1 = self._make_layer(block, num_blocks[0], out_channels=64, stride=1)
        self.layer2 = self._make_layer(block, num_blocks[1], out_channels=128, stride=2)
        self.layer3 = self._make_layer(block, num_blocks[2], out_channels=256, stride=2)
        self.layer4 = self._make_layer(block, num_blocks[3], out_channels=512, stride=2)

        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        
        self.fc = nn.Linear(512, num_classes)

    # Function to create layers
    def _make_layer(self, block, num_block, out_channels, stride):
        
        # the first block uses the given stride, the remaining blocks use stride 1
        # stride = 1, num_block = 2 gives stride_list = [1, 1]
        # stride = 2, num_block = 2 gives stride_list = [2, 1]
        stride_list = [stride] + [1] * (num_block - 1)  

        layers = []
        for block_stride in stride_list: 
            layers.append(block(self.in_channels, out_channels, block_stride))
            self.in_channels = out_channels  # update input channel size for next block

        # *layers: The * operator unpacks the layers list, 
        # so each layer/block is treated as an individual argument to nn.Sequential.
        return nn.Sequential(*layers)

    def forward(self, x):
        # Forward pass through the network
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.maxpool(out)
        
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)

        out = self.avgpool(out)
        out = torch.flatten(out, 1)
        out = self.fc(out)
        return out

- Instantiate the ResNet-18 Model

In [None]:
def ResNet18(num_classes=1000):
    
    # ResNet-18 is built from 4 layers of 2 residual blocks each.
    # The 2nd, 3rd and 4th layers halve the spatial size by using stride = 2 in their first block. 
    return ResNet(ResidualBlock, [2, 2, 2, 2], num_classes=num_classes) 

# Instantiate the model
model = ResNet18(num_classes=1000)

- Testing the Model

In [None]:
# Create a dummy input tensor
input_tensor = torch.randn(1, 3, 224, 224)  # Batch size of 1, 3 color channels, 224x224 image

# Pass the dummy input through the model
output = model(input_tensor)

# Print the output shape
print("Output shape:", output.shape)  # Should be (1, 1000) for ImageNet classes
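
To see how a 224x224 input ends up as a 512-element vector before the fully connected layer, the feature-map size can be traced by hand. This is a pure-Python sketch that mirrors the kernel sizes, strides and paddings used in the `ResNet` class above. The helper names `conv_output_size` and `trace_resnet18` are hypothetical, and only the first convolution of each block is traced, since the second one (stride 1, padding 1) keeps the size unchanged.

```python
def conv_output_size(size, kernel_size, stride, padding):
    """Spatial output size of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1

def trace_resnet18(size=224, num_blocks=(2, 2, 2, 2)):
    """Return a list of (name, channels, spatial size) after each part of the network."""
    trace = []
    size = conv_output_size(size, kernel_size=7, stride=2, padding=3)  # conv1
    trace.append(("conv1", 64, size))
    size = conv_output_size(size, kernel_size=3, stride=2, padding=1)  # maxpool
    trace.append(("maxpool", 64, size))

    layer_config = [(64, 1), (128, 2), (256, 2), (512, 2)]  # (out_channels, stride)
    for i, ((channels, stride), n) in enumerate(zip(layer_config, num_blocks), start=1):
        stride_list = [stride] + [1] * (n - 1)  # same rule as _make_layer
        for s in stride_list:
            size = conv_output_size(size, kernel_size=3, stride=s, padding=1)
        trace.append((f"layer{i}", channels, size))

    trace.append(("avgpool", 512, 1))  # adaptive average pooling to 1x1
    return trace

for name, channels, size in trace_resnet18():
    print(f"{name:8s} -> {channels} x {size} x {size}")
```

The last entry, 512 x 1 x 1, is what `torch.flatten` turns into the 512 inputs of `nn.Linear(512, num_classes)`.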

#### 5.2.4 Train ResNet-18 using MNIST Digit Dataset

In [None]:
!pip install gdown

import os
import cv2
import gdown
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader, random_split

import torchvision
from torchvision import transforms

from IPython import display

# clear output cell
display.clear_output()

print(f"torch : {torch.__version__}")
print(f"torch vision : {torchvision.__version__}")

- Download MNIST Dataset

In [None]:
DATASET_NAME = 'MNIST' # the dataset name
DATASET_NUM_CLASS = 10 # number of class in dataset

In [None]:
# default using gdrive_id Dataset `mnist_dataset.zip` (1-FfwJrllyHofQwIbMb_IxAkxnfMGSFmR)
gdrive_id = '1-FfwJrllyHofQwIbMb_IxAkxnfMGSFmR' # <-----  ⚠️⚠️⚠️ USE YOUR OWN GDrive ID FOR CUSTOM DATASET ⚠️⚠️⚠️

# download zip from GDrive
url = f'https://drive.google.com/uc?id={gdrive_id}'
gdown.download(url, DATASET_NAME + ".zip", quiet=False)

# unzip dataset
!unzip {DATASET_NAME}.zip -d {DATASET_NAME}

# clear output cell
display.clear_output()

- Load MNIST Dataset

In [None]:
# Define a custom Dataset class
# it is just a helper that loads each image with OpenCV and converts it to a PyTorch tensor
# it also encodes the label using one-hot encoding
class CustomDataset(Dataset):
    def __init__(self, root_dir):
        self.root_dir = root_dir
        self.image_files = sorted([file for file in os.listdir(root_dir) if file.lower().endswith('.png')])

    def __len__(self):
        return len(self.image_files)

    def __getitem__(self, idx):
        # Read image from corresponding .png file
        image_path = os.path.join(self.root_dir, self.image_files[idx])
        image = cv2.imread(image_path)  # Load image using OpenCV
        image = cv2.resize(image, (224,224)) # resize to make it comply with ResNet input size
        image = torch.from_numpy(image).to(torch.float32)  # Convert NumPy array to PyTorch tensor
        image = image.permute(2, 0, 1)  # Change the order of dimensions from (H, W, C) to (C, H, W)

        # Read label from corresponding .txt file
        label_path = os.path.splitext(image_path)[0] + ".txt"
        with open(label_path, 'r') as label_file:
            label = int(label_file.read().strip())  # Assuming labels are integers

        # Apply one-hot encoding into label
        labels_tensor = torch.tensor(label)
        one_hot_encoded = F.one_hot(labels_tensor, num_classes=DATASET_NUM_CLASS).to(torch.float32)

        return image, one_hot_encoded



# instantiate the datasets
# at this point the images are not loaded yet
# we only read the image file names in the dataset folders
all_train_dataset = CustomDataset(root_dir=f'{DATASET_NAME}/dataset/train')
test_dataset = CustomDataset(root_dir=f'{DATASET_NAME}/dataset/test')

In [None]:
print(f"All Train Dataset : {len(all_train_dataset)} data")
print(f"Test Dataset : {len(test_dataset)} data")

In [None]:
# Split 'all_train_dataset' into 'train' and 'validation' set using `random_split()` function
train_dataset, validation_dataset = random_split(all_train_dataset, [50000, 10000])

print(f"Train Dataset : {len(train_dataset)} data")
print(f"Validation Dataset : {len(validation_dataset)} data")
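
`random_split()` draws a new random partition on every run, so the validation set changes between sessions. If a repeatable split is wanted, a seeded `torch.Generator` can be passed in. The sketch below uses a small `range(10)` as a stand-in dataset and an arbitrary seed of 42; both are assumptions for illustration only.

```python
import torch
from torch.utils.data import random_split

toy_dataset = range(10)  # stand-in for a dataset with 10 samples

# Two splits created with the same seed
train_a, val_a = random_split(toy_dataset, [8, 2], generator=torch.Generator().manual_seed(42))
train_b, val_b = random_split(toy_dataset, [8, 2], generator=torch.Generator().manual_seed(42))

same_split = list(val_a.indices) == list(val_b.indices)
print(len(train_a), len(val_a))  # 8 2
print(same_split)                # True: same seed, same split
```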

In [None]:
# Create data loaders
BATCH_SIZE = 128

train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
validation_loader = DataLoader(validation_dataset, batch_size=BATCH_SIZE, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False)

- Instantiate <font color="orange">ResNet-18</font>

In [None]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# instantiate ResNet-18 for the 10 classes of the MNIST dataset
model = ResNet18(num_classes=10).to(device)

In [None]:
# setup optimizer, loss function & metric
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_function = nn.CrossEntropyLoss()
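
Since the dataset class returns one-hot encoded labels, it is worth checking that `nn.CrossEntropyLoss` handles them as expected. The sketch below uses made-up logits and labels and assumes a reasonably recent PyTorch release, since support for probability-style targets was added around version 1.10. It compares the loss computed from integer class indices with the loss computed from the equivalent one-hot float targets.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)                 # fake model outputs: 4 samples, 10 classes
class_indices = torch.tensor([3, 0, 9, 5])  # fake integer labels
one_hot = F.one_hot(class_indices, num_classes=10).to(torch.float32)

loss_function = nn.CrossEntropyLoss()
loss_from_indices = loss_function(logits, class_indices)  # targets as class indices
loss_from_one_hot = loss_function(logits, one_hot)        # targets as class probabilities

print(loss_from_indices.item(), loss_from_one_hot.item())
```

Both calls should give the same value up to floating-point error, which is why the training code can recover the class index from the one-hot label with `labels.argmax(1)` when computing accuracy.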

- To run the training process, we can use the following code

In [None]:
!pip install tqdm

from tqdm import tqdm

In [None]:
def train(model, train_loader, optimizer, loss_function):
    model.train()
    running_loss = 0.0
    correct_predictions = 0
    total_predictions = 0

    # Add progress bar for training loop
    progress_bar = tqdm(train_loader, desc='Training', leave=False)

    for inputs, labels in progress_bar:
        inputs = inputs.to(device) # move inputs to device
        labels = labels.to(device) # move labels to device

        # resets the gradients of all the model's parameters before the backward pass
        optimizer.zero_grad()
        # pass the batch of 3x224x224 input tensors to the ResNet-18 model
        outputs = model(inputs)
        # calc loss value
        loss = loss_function(outputs, labels)
        # computes the gradient of the loss with respect to each parameter in model
        loss.backward()
        # adjust model parameter
        optimizer.step()
        # sum loss value
        running_loss += loss.item()

        # Calculate correct & total prediction
        _, predicted = torch.max(outputs, 1)
        correct_predictions += (predicted == labels.argmax(1)).sum().item()
        total_predictions += labels.size(0)

        # Update progress bar description with current loss
        progress_bar.set_postfix(loss=loss.item())

    # Calculate average training loss
    # (loss.item() is already the mean over a batch, so divide by the number of batches)
    average_train_loss = running_loss / len(train_loader)
    # Calculate training accuracy
    train_accuracy = correct_predictions / total_predictions
    return average_train_loss, train_accuracy

def validate(model, val_loader, loss_function):
    model.eval()
    running_loss = 0.0
    correct_predictions = 0
    total_predictions = 0

    # Add progress bar for validation loop
    progress_bar = tqdm(val_loader, desc='Validating', leave=False)

    with torch.no_grad():
        for inputs, labels in progress_bar:
            inputs = inputs.to(device) # move inputs to device
            labels = labels.to(device) # move labels to device

            # pass the batch of 3x224x224 input tensors to the ResNet-18 model
            outputs = model(inputs)
            # calc loss value
            loss = loss_function(outputs, labels)
            # sum loss value
            running_loss += loss.item()

            # Calculate correct & total prediction
            _, predicted = torch.max(outputs, 1)
            correct_predictions += (predicted == labels.argmax(1)).sum().item()
            total_predictions += labels.size(0)

            # Update progress bar description with loss
            progress_bar.set_postfix(loss=loss.item())

    # Calculate average validation loss (mean of the per-batch mean losses)
    average_val_loss = running_loss / len(val_loader)
    # Calculate validation accuracy
    val_accuracy = correct_predictions / total_predictions
    return average_val_loss, val_accuracy





# This is a training loop for selected Epoch
# each epoch will process all training and validation set, chunked into small batch size data
# then measure the loss & accuracy of training and validation set
NUM_EPOCH = 10      # you can change this value

train_losses = []
val_losses = []
train_accuracies = []
val_accuracies = []

for epoch in range(NUM_EPOCH):
    print(f"Epoch {epoch+1}/{NUM_EPOCH}")

    train_loss, train_accuracy = train(model, train_loader, optimizer, loss_function)
    val_loss, val_accuracy = validate(model, validation_loader, loss_function)

    train_losses.append(train_loss)
    val_losses.append(val_loss)
    train_accuracies.append(train_accuracy * 100)  # convert to percentage
    val_accuracies.append(val_accuracy * 100)  # convert to percentage

    print(f"Train Loss = {train_loss:.4f}, Val Loss = {val_loss:.4f}, Train Accuracy = {train_accuracy:.4f}, Val Accuracy = {val_accuracy:.4f}\n")
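
When reading the reported loss values, keep in mind that `nn.CrossEntropyLoss` averages over the batch by default, so each `loss.item()` is already a per-sample mean. The pure-Python sketch below uses made-up per-sample losses to show how the choice of divisor changes the reported number. The helper `batch_means` is hypothetical and only mimics the mean reduction.

```python
def batch_means(per_sample_losses, batch_size):
    """Mimic what a mean-reduced loss returns for each mini-batch."""
    means = []
    for start in range(0, len(per_sample_losses), batch_size):
        batch = per_sample_losses[start:start + batch_size]
        means.append(sum(batch) / len(batch))
    return means

per_sample_losses = [0.5] * 256   # made-up data: every sample has loss 0.5
batch_size = 128

means = batch_means(per_sample_losses, batch_size)
running_loss = sum(means)         # what accumulating loss.item() produces
num_batches = len(means)

avg_per_batch = running_loss / num_batches              # recovers the true mean loss
avg_per_sample = running_loss / len(per_sample_losses)  # too small by a factor of batch_size
print(avg_per_batch, avg_per_sample)  # 0.5 0.00390625
```

Dividing the accumulated batch means by the number of batches recovers the mean loss (exactly so when all batches have the same size), whereas dividing by the number of samples shrinks it by roughly the batch size.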

- Plot Loss and Accuracy of Training vs Validation Set 

In [None]:
# visualize Loss & Accuracy
import matplotlib.pyplot as plt

epochs = list(range(1, NUM_EPOCH + 1))

# Plotting loss
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.plot(epochs, train_losses, 'b', label='Training Loss')
plt.plot(epochs, val_losses, 'r', label='Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.grid(True)

# Plotting accuracy
plt.subplot(1, 2, 2)
plt.plot(epochs, train_accuracies, 'b', label='Training Accuracy')
plt.plot(epochs, val_accuracies, 'r', label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy (%)')
plt.legend()
plt.grid(True)

plt.tight_layout()


- Evaluate the model: compute per-class Precision and Recall, measure accuracy, and compute the confusion matrix

In [None]:
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
import seaborn as sns
import numpy as np

# define evaluate function for test set
def evaluate(model, test_loader):
    model.eval()
    all_labels = []
    all_preds = []

    # Add progress bar for evaluation loop
    progress_bar = tqdm(test_loader, desc='Evaluating', leave=False)

    with torch.no_grad():
        # iterate over all batched test set
        for inputs, labels in progress_bar:
            inputs = inputs.to(device) # move inputs to device
            labels = labels.to(device) # move labels to device

            # pass the batch of 3x224x224 input tensors to the ResNet-18 model
            outputs = model(inputs)
            # get prediction
            _, preds = torch.max(outputs, 1)
            # collect all labels & preds
            all_labels.extend(labels.cpu().numpy())
            all_preds.extend(preds.cpu().numpy())

    return all_labels, all_preds

# Evaluation on test set
all_labels, all_preds = evaluate(model, test_loader)
all_labels = np.argmax(all_labels, axis=1)

# Calculate classification report
labels = [str(i) for i in range(DATASET_NUM_CLASS)]
print(classification_report(all_labels, all_preds, target_names=labels))

# Confusion Matrix
conf_matrix = confusion_matrix(all_labels, all_preds)

# Plotting the confusion matrix
plt.figure(figsize=(10, 7))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues")
plt.xlabel('Predicted Class')
plt.ylabel('Actual Class')
plt.title('Confusion Matrix')
plt.show()
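
The precision and recall values in the classification report can be read directly off the confusion matrix. The sketch below uses a made-up 2-class matrix in the same orientation as scikit-learn's `confusion_matrix` (rows are actual classes, columns are predicted classes). The helper `precision_recall` is hypothetical and written in plain Python for clarity.

```python
def precision_recall(conf_matrix):
    """Per-class (precision, recall) from a square confusion matrix.

    Rows are actual classes and columns are predicted classes.
    """
    n = len(conf_matrix)
    results = []
    for c in range(n):
        true_pos = conf_matrix[c][c]
        predicted_as_c = sum(conf_matrix[r][c] for r in range(n))  # column sum
        actually_c = sum(conf_matrix[c])                           # row sum
        precision = true_pos / predicted_as_c if predicted_as_c else 0.0
        recall = true_pos / actually_c if actually_c else 0.0
        results.append((precision, recall))
    return results

toy_matrix = [
    [8, 2],   # actual class 0: 8 predicted as 0, 2 predicted as 1
    [1, 9],   # actual class 1: 1 predicted as 0, 9 predicted as 1
]
for cls, (p, r) in enumerate(precision_recall(toy_matrix)):
    print(f"class {cls}: precision={p:.3f}, recall={r:.3f}")
```

Precision divides the diagonal entry by its column sum (everything predicted as that class), while recall divides it by its row sum (everything that truly belongs to that class).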

- Download Model 

In [None]:
# Save the model
torch.save(model.state_dict(), 'trained_cnn_model.pt')

# Download the model file
from google.colab import files
files.download('trained_cnn_model.pt')
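
Because only the `state_dict` is saved, the weights have to be loaded back into a model with the same architecture before it can be used for inference. The sketch below shows the round trip with a tiny `nn.Linear` stand-in and a hypothetical file name in a temporary directory. In this notebook the model would instead be re-created with `ResNet18(num_classes=10)`.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Tiny stand-in model; in the notebook this would be ResNet18(num_classes=10)
model = nn.Linear(4, 2)

path = os.path.join(tempfile.mkdtemp(), "toy_model.pt")  # hypothetical file name
torch.save(model.state_dict(), path)

# Re-create the same architecture, then load the saved weights into it
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path, map_location="cpu"))
restored.eval()  # switch to inference mode before predicting

weights_match = torch.equal(model.weight, restored.weight)
print(weights_match)
```

Passing `map_location="cpu"` lets a model trained on a Colab GPU be loaded on a machine without CUDA.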