# Lab: Image Classification using Convolutional Neural Networks

By the end of this laboratory, you will be familiar with

*   Creating deep networks using Keras
*   Steps necessary in training a neural network
*   Prediction and performance analysis using neural networks

---

# **If you are using a Colab environment**
By default, Colab notebooks run on CPU.
You can switch your notebook to run with GPU.

In order to obtain access to the GPU, you need to choose the tab Runtime and then select “Change runtime type” as shown in the following figure:

![Changing runtime](https://miro.medium.com/max/747/1*euE7nGZ0uJQcgvkpgvkoQg.png)

When the pop-up window appears, set “Hardware accelerator” to GPU.

# **Working with a new dataset: CIFAR-10**

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. More information about CIFAR-10 can be found [here](https://www.cs.toronto.edu/~kriz/cifar.html).

In Keras, the CIFAR-10 dataset comes preloaded as four NumPy arrays: x_train and y_train contain the training set, while x_test and y_test contain the test set. The images are encoded as NumPy arrays, and their labels are integers ranging from 0 to 9.

Your task is to:

*   Visualize the images in CIFAR-10 dataset. Create a 10 x 10 plot showing 10 random samples from each class.
*   Convert the labels to one-hot encoded form.
*   Normalize the images.
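
The last two steps can be sketched with plain NumPy to show what they do to the arrays. This is only an illustration under stated assumptions: the tiny random `images` array and the `labels` column are made-up stand-ins for `x_train` and `y_train`, and `one_hot` and `normalize` are hypothetical helper names, not Keras functions.

```python
import numpy as np

# Made-up stand-ins for x_train / y_train: 4 tiny RGB "images" and a label column.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 2, 2, 3), dtype=np.uint8)
labels = np.array([[3], [0], [9], [3]])  # shape (N, 1), like the CIFAR-10 labels


def one_hot(labels, num_classes):
    """Turn integer labels of shape (N, 1) or (N,) into an (N, num_classes) matrix."""
    flat = labels.reshape(-1)
    encoded = np.zeros((flat.size, num_classes), dtype=np.float32)
    encoded[np.arange(flat.size), flat] = 1.0  # a single 1 per row, at the label index
    return encoded


def normalize(images):
    """Scale uint8 pixel values from [0, 255] to floats in [0, 1]."""
    return images.astype(np.float32) / 255.0


labels_one_hot = one_hot(labels, num_classes=10)
images_norm = normalize(images)

print(labels_one_hot.shape)                    # one row per sample, one column per class
print(images_norm.min(), images_norm.max())    # both values lie within [0, 1]
```

Taking `np.argmax` along the class axis recovers the original integer labels, which is how one-hot labels are turned back into class indices later in the lab.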




In [None]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
from tensorflow.keras import layers, models
import seaborn as sns
from sklearn.metrics import classification_report, confusion_matrix

In [None]:

# Load CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print(f"x_train shape: {x_train.shape}")
print(f"x_test shape: {x_test.shape}")
print("Labels: ", np.unique(y_train))

In [None]:
# Your code here :

# Plot 10 random samples from each class
fig, axes = plt.subplots(10, 10, figsize=(10, 10))
fig.suptitle("10 Random Samples from Each CIFAR-10 Class", fontsize=14)

for class_idx in range(10):
    class_images = x_train[y_train.flatten() == class_idx]  # Filter images by class
    sample_images = class_images[np.random.choice(class_images.shape[0], 10, replace=False)]  # Randomly pick 10

    for j in range(10): # Display the samples
        axes[class_idx, j].imshow(sample_images[j])
        axes[class_idx, j].axis('off')

plt.tight_layout()
plt.subplots_adjust(top=0.92)
plt.show()


In [None]:

# Convert labels to one-hot encoding
y_train_one_hot = to_categorical(y_train, num_classes=10)
y_test_one_hot = to_categorical(y_test, num_classes=10)

# Normalize images (convert pixel values from [0, 255] to [0, 1])
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

print(f"y_train shape: {y_train_one_hot.shape}")
print(f"y_test shape: {y_test_one_hot.shape}")


## Define the following model (same as the one in the tutorial)

For the convolutional front-end, start with a single convolutional layer with a small filter size (3,3) and a modest number of filters (32) followed by a max pooling layer.

Use the input as (32,32,3).

The filter maps can then be flattened to provide features to the classifier.

Use a dense layer with 100 units before the classification layer (which is also a dense layer with softmax activation).
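
As a sanity check on this architecture, the number of trainable parameters can be worked out by hand. The sketch below assumes 'same' padding in the convolution and a 2 x 2 max pooling window (the description above does not fix either), and `conv2d_params` and `dense_params` are hypothetical helper names.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units


height, width, channels = 32, 32, 3

conv = conv2d_params(3, 3, channels, 32)

# Assumption: 'same' padding keeps 32 x 32, then 2 x 2 pooling halves it to 16 x 16.
flat_units = (height // 2) * (width // 2) * 32

hidden = dense_params(flat_units, 100)
output = dense_params(100, 10)
total = conv + hidden + output

print(f"conv: {conv}, flatten: {flat_units}, dense: {hidden}, output: {output}")
print(f"total trainable parameters: {total}")
```

Under these assumptions almost all parameters sit in the first dense layer, because it is connected to every value of the flattened feature maps.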

In [None]:
from tensorflow.keras.backend import clear_session
clear_session()

In [None]:
# Your code here :

# Define input shape
input_shape = (32, 32, 3)
num_classes = 10

# Build the CNN model
model1 = models.Sequential([
    layers.Input(shape=input_shape),

    # Convolutional layer with 32 filters, 3x3 kernel, ReLU activation
    layers.Conv2D(32, kernel_size=(3, 3), activation="relu", padding='same'),

    # Max pooling layer
    layers.MaxPooling2D(pool_size=(2, 2)),

    # Flattening the feature maps
    layers.Flatten(),

    # Fully connected layer with 100 units
    layers.Dense(100, activation="relu"),

    # Output layer with softmax activation
    layers.Dense(num_classes, activation="softmax")
])

# Print model summary
model1.summary()


*   Compile the model using categorical_crossentropy loss and the SGD optimizer, with 'accuracy' as the metric.
*   Train the model defined above on CIFAR-10 for 50 epochs with a batch size of 512.
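
To get a feel for what these settings mean, the number of weight updates can be computed directly. This is a small arithmetic sketch, assuming all 50000 training images are used for training and that the final, smaller batch of each epoch is kept; `updates_per_epoch` is a hypothetical helper name.

```python
import math


def updates_per_epoch(num_samples, batch_size):
    """One gradient update per mini-batch; the last batch may be smaller."""
    return math.ceil(num_samples / batch_size)


num_train = 50_000
batch_size = 512
epochs = 50

steps = updates_per_epoch(num_train, batch_size)
total_updates = steps * epochs
last_batch = num_train - (steps - 1) * batch_size  # samples left for the final batch

print(f"{steps} updates per epoch, {total_updates} updates in total")
print(f"the last batch of each epoch holds {last_batch} samples")
```

A large batch size means relatively few updates per epoch, which is one reason plain SGD can need many epochs to converge.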

In [None]:
# Your code here :
# Compile the model
model1.compile(
    loss="categorical_crossentropy",
    optimizer=tf.keras.optimizers.SGD(),  # Stochastic Gradient Descent
    metrics=["accuracy"]
)

# Train the model
history1 = model1.fit(
    x_train, y_train_one_hot,
    validation_data=(x_test, y_test_one_hot),  # validation metrics are computed on the test split
    epochs=50,
    batch_size=512,
    verbose=1  # show a progress bar with loss and accuracy for each epoch
)

# Print model summary
model1.summary()



## Save the trained model then Load it

In [None]:
# Save the trained model
model1.save('cifar10_model.h5')

# Load the saved model
loaded_model = tf.keras.models.load_model('cifar10_model.h5')

*   Plot the cross entropy loss curve and the accuracy curve

In [None]:
# Your code here :
# Extract loss and accuracy values
train_loss = history1.history['loss']
val_loss = history1.history['val_loss']
train_acc = history1.history['accuracy']
val_acc = history1.history['val_accuracy']

# Plot Loss Curve
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)  # Create subplot (1 row, 2 cols, 1st plot)
plt.plot(train_loss, label='Training Loss', color='blue')
plt.plot(val_loss, label='Validation Loss', color='red')
plt.xlabel('Epochs')
plt.ylabel('Cross-Entropy Loss')
plt.title('Loss Curve')
plt.legend()

# Plot Accuracy Curve
plt.subplot(1, 2, 2)  # Create subplot (1 row, 2 cols, 2nd plot)
plt.plot(train_acc, label='Training Accuracy', color='blue')
plt.plot(val_acc, label='Validation Accuracy', color='red')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.title('Accuracy Curve')
plt.legend()

# Show plots
plt.show()

## Defining Deeper Architectures: VGG Models

*   Define a deeper model architecture for the CIFAR-10 dataset and train the new model for 50 epochs with a batch size of 512. We will use a VGG-style architecture.

Stack two convolutional layers with 32 filters, each of 3 x 3.

Use a max pooling layer and next flatten the output of the previous layer and add a dense layer with 128 units before the classification layer.

For all the layers, use ReLU activation function.

Use same padding for the convolutional layers so that the height and width of each layer's output match its input.
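
The effect of the padding mode on the spatial size can be sketched with the usual output-size formulas. `conv_output_size` is a hypothetical helper written for this illustration, not a Keras function.

```python
import math


def conv_output_size(size, kernel, stride=1, padding="same"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        # Input is zero-padded so that only the stride shrinks the output.
        return math.ceil(size / stride)
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding mode: {padding}")


size_same = 32
size_valid = 32
for _ in range(2):  # two stacked 3 x 3 convolutions
    size_same = conv_output_size(size_same, kernel=3, padding="same")
    size_valid = conv_output_size(size_valid, kernel=3, padding="valid")

print(f"after two 'same' convolutions:  {size_same} x {size_same}")
print(f"after two 'valid' convolutions: {size_valid} x {size_valid}")
```

With 'valid' padding every 3 x 3 convolution trims a one-pixel border, so deep stacks shrink the feature maps even before pooling.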


In [None]:
from tensorflow.keras.backend import clear_session
clear_session()

In [None]:
# Your code here :
# Define the model
model2 = models.Sequential()

# First convolutional block with 32 filters, 3x3 kernel, ReLU activation, and same padding
model2.add(layers.Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3)))
model2.add(layers.Conv2D(32, (3, 3), activation='relu', padding='same'))

# Max pooling layer
model2.add(layers.MaxPooling2D(pool_size=(2, 2)))

# Flatten the output of the previous layer
model2.add(layers.Flatten())

# Dense layer with 128 units and ReLU activation
model2.add(layers.Dense(128, activation='relu'))

# Output layer with 10 units (one for each class) and softmax activation
model2.add(layers.Dense(10, activation='softmax'))


*   Compile the model using categorical_crossentropy loss and the SGD optimizer, with 'accuracy' as the metric.
*   Train the model defined above on CIFAR-10 for 50 epochs with a batch size of 512.

In [None]:
# Your code here :
# Compile the model with SGD optimizer and categorical_crossentropy loss
model2.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])

# Summary of the model
model2.summary()

# Load the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Convert labels to one-hot encoding
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

# Normalize the images to the range [0, 1]
x_train, x_test = x_train / 255.0, x_test / 255.0

# Train the model for 50 epochs with a batch size of 512
history2 = model2.fit(x_train, y_train, epochs=50, batch_size=512, validation_data=(x_test, y_test))


*   Compare the performance of both the models by plotting the loss and accuracy curves of both the training steps. Does the deeper model perform better? Comment on the observation.
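
Besides reading the curves, a few summary numbers can support the comparison. The sketch below works on a dictionary shaped like `History.history`; `summarize_history` is a hypothetical helper, and the values in `toy_history` are made up for illustration and are not results from this lab.

```python
def summarize_history(history):
    """Summarize a dict of per-epoch metrics shaped like Keras' History.history."""
    val_acc = history["val_accuracy"]
    best_index = max(range(len(val_acc)), key=val_acc.__getitem__)
    return {
        "best_epoch": best_index + 1,                        # 1-based epoch number
        "best_val_accuracy": val_acc[best_index],
        "final_gap": history["accuracy"][-1] - val_acc[-1],  # train minus validation
    }


# Made-up numbers, for illustration only.
toy_history = {
    "accuracy": [0.30, 0.45, 0.60, 0.75],
    "val_accuracy": [0.32, 0.44, 0.52, 0.50],
}

summary = summarize_history(toy_history)
print(summary)
```

A large positive `final_gap` together with a best epoch well before the last one would hint at overfitting; a small gap with accuracy still rising would suggest the model simply needs more epochs.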


In [None]:
# Your code here :
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
# Plot Loss Curves
axes[0].plot(history1.history['loss'], label='Model 1 Train')
axes[0].plot(history1.history['val_loss'], label='Model 1 Validation')
axes[0].plot(history2.history['loss'], label='VGG Model Train')
axes[0].plot(history2.history['val_loss'], label='VGG Model Validation')
axes[0].set_title('Loss Comparison')
axes[0].set_xlabel('Epochs')
axes[0].set_ylabel('Loss')
axes[0].legend()

# Plot Accuracy Curves
axes[1].plot(history1.history['accuracy'], label='Model 1 Train')
axes[1].plot(history1.history['val_accuracy'], label='Model 1 Validation')
axes[1].plot(history2.history['accuracy'], label='VGG Model Train')
axes[1].plot(history2.history['val_accuracy'], label='VGG Model Validation')
axes[1].set_title('Accuracy Comparison')
axes[1].set_xlabel('Epochs')
axes[1].set_ylabel('Accuracy')
axes[1].legend()

plt.show()

**Comment on the observation**

*(Deeper model performance: with plain SGD, the deeper model may converge more slowly, but after enough epochs its validation accuracy may match or exceed that of the shallower model. Check this against the curves plotted above.)*

...

*   Use predict function to predict the output for the test split
*   Plot the confusion matrix for the new model and comment on the class confusions.
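
To make the confusion matrix easier to comment on, the sketch below builds one by hand with NumPy and lists the most frequent off-diagonal entries. The label arrays are made-up toy data with three classes, and `confusion_matrix_np` and `most_confused_pairs` are hypothetical helpers; rows are true classes and columns are predicted classes, the same layout that `sklearn.metrics.confusion_matrix` uses.

```python
import numpy as np


def confusion_matrix_np(y_true, y_pred, num_classes):
    """Count samples per (true class, predicted class) pair."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # unbuffered add handles repeated index pairs
    return cm


def most_confused_pairs(cm, top_k=3):
    """Return (true, predicted, count) for the largest off-diagonal entries."""
    off_diag = cm.copy()
    np.fill_diagonal(off_diag, 0)  # ignore correct predictions
    order = np.argsort(off_diag, axis=None)[::-1][:top_k]
    rows, cols = np.unravel_index(order, cm.shape)
    return [(int(t), int(p), int(off_diag[t, p]))
            for t, p in zip(rows, cols) if off_diag[t, p] > 0]


# Made-up toy labels, for illustration only.
y_true = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2, 2, 1, 1])

cm = confusion_matrix_np(y_true, y_pred, num_classes=3)
per_class_recall = cm.diagonal() / cm.sum(axis=1)

print(cm)
print(most_confused_pairs(cm))
print(per_class_recall)
```

Applied to the real predictions, the top pairs point at the classes worth discussing, and the per-class recall shows which classes the model finds hardest.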


In [None]:
# Your code here :
# Predict class probabilities on the test split with both models
y_pred_model1 = model1.predict(x_test)
y_pred_model2 = model2.predict(x_test)

# Convert predicted probabilities to class labels (argmax)
y_pred_model1 = np.argmax(y_pred_model1, axis=1)
y_pred_model2 = np.argmax(y_pred_model2, axis=1)

# The confusion matrix below is for the new (deeper) model, so reuse its predictions
new_pred = y_pred_model2

# Convert ground truth labels to a 1D array (class indices)
gt = np.argmax(y_test, axis=1)

# Compute the confusion matrix
cm = confusion_matrix(gt, new_pred)

# Plot the confusion matrix using seaborn
plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", xticklabels=np.arange(10), yticklabels=np.arange(10))
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.title('Confusion Matrix for VGG Model')
plt.show()

**Comment here :**

*(Double-click or enter to edit)*

...

*    Print the test accuracy for the trained model.

In [None]:
# Your code here :
# Evaluate model1 on test data
test_loss1, test_acc1 = model1.evaluate(x_test, y_test, verbose=0)
print(f"Test Accuracy for Model 1: {test_acc1:.4f}")

# Evaluate model2 on test data
test_loss2, test_acc2 = model2.evaluate(x_test, y_test, verbose=0)
print(f"Test Accuracy for Model VGG: {test_acc2:.4f}")

## Define the complete VGG architecture.

Stack two convolutional layers with 64 filters, each of 3 x 3 followed by max pooling layer.

Stack two more convolutional layers with 128 filters, each of 3 x 3, followed by max pooling, followed by two more convolutional layers with 256 filters, each of 3 x 3, followed by max pooling.

Flatten the output of the previous layer and add a dense layer with 128 units before the classification layer.

For all the layers, use ReLU activation function.

Use same padding for the convolutional layers so that the height and width of each layer's output match its input.

*   Change the size of input to 64 x 64.
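
The input size matters because it determines how many values reach the dense layer. The sketch below traces the three conv-pool blocks for a 32 x 32 and a 64 x 64 input, assuming 'same' padding and 2 x 2 pooling; `flattened_units` is a hypothetical helper name.

```python
def flattened_units(input_size, block_filters, pool=2):
    """Length of the flattened feature vector after a stack of conv-pool blocks."""
    size = input_size
    for _ in block_filters:
        size //= pool  # 'same' convolutions keep the size; each pooling layer halves it
    return size * size * block_filters[-1]


block_filters = [64, 128, 256]  # filters per block, as described above
results = {}
for input_size in (32, 64):
    units = flattened_units(input_size, block_filters)
    dense_weights = (units + 1) * 128  # parameters of the 128-unit dense layer
    results[input_size] = (units, dense_weights)
    print(f"input {input_size} x {input_size}: flatten = {units}, dense parameters = {dense_weights}")
```

Doubling the input side length quadruples the flattened vector, and with it the parameter count of the first dense layer.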

In [None]:
from tensorflow.keras.backend import clear_session
clear_session()

In [None]:
# Your code here :
# Load CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# 1. Resize images from (32, 32, 3) to (64, 64, 3)
X_train = tf.image.resize(x_train, (64, 64))
X_test = tf.image.resize(x_test, (64, 64))

# 2. Convert the labels to one-hot encoded form
Y_train = to_categorical(y_train, num_classes=10)
Y_test = to_categorical(y_test, num_classes=10)

# 3. Normalize the images
X_train, X_test = X_train / 255.0, X_test / 255.0


# Define the VGG-style CNN model
model3 = models.Sequential([

    layers.Conv2D(64, (3, 3), activation='relu', input_shape=(64, 64, 3), padding='same'),
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),

    layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
    layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),

    layers.Conv2D(256, (3, 3), activation='relu', padding='same'),
    layers.Conv2D(256, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),

    # Flatten the output to convert 2D feature maps into a 1D feature vector
    layers.Flatten(),

    # Fully connected dense layer with 128 neurons
    layers.Dense(128, activation='relu'),

    # Output layer: 10 neurons (one for each class) with softmax activation
    layers.Dense(10, activation='softmax')
])

*   Compile the model using categorical_crossentropy loss and the SGD optimizer, with 'accuracy' as the metric.
*   Train the model defined above on CIFAR-10 for 10 epochs with a batch size of 512.
*   Predict the output for the test split, plot the confusion matrix for the new model, and comment on the class confusions.

In [None]:
# Compile, then train the model for 10 epochs with a batch size of 512
model3.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
# Train on the resized, normalized images and the one-hot labels prepared above
history3 = model3.fit(X_train, Y_train, batch_size=512, epochs=10, validation_split=0.1)

# Predict on the resized, normalized test set
new_VGG = model3.predict(X_test)

# Get the class predictions (highest probability class)
y_pred_classes = np.argmax(new_VGG, axis=1)

# Recover the true class indices from the one-hot test labels
y_true = np.argmax(Y_test, axis=1)

# Generate confusion matrix
cm = confusion_matrix(y_true, y_pred_classes)

# Plot the confusion matrix using Seaborn heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=np.arange(10), yticklabels=np.arange(10))  # tick labels match class indices 0-9
plt.title("Confusion Matrix")
plt.xlabel("Predicted Labels")
plt.ylabel("True Labels")
plt.show()

# Generate classification report
report = classification_report(y_true, y_pred_classes, target_names=[str(i) for i in range(10)])  # names match class indices 0-9

# Print classification report
print("Classification Report for each class:\n")
print(report)

# Understanding deep networks

*   What is the use of activation functions in a network? Why are they needed?
*   We have used softmax activation function in the exercise. There are other activation functions available too. What is the difference between sigmoid activation and softmax activation?
*   What is the difference between categorical crossentropy and binary crossentropy loss?
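
For the first question, a small NumPy experiment can help. It is only a sketch with random made-up weights: it checks that two stacked layers without an activation are equivalent to a single linear layer, and that inserting a ReLU between them breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))  # 5 made-up samples with 4 features each

# Two dense layers with random weights and biases.
W1, b1 = rng.normal(size=(4, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 3)), rng.normal(size=3)

# Without an activation, the two layers compose into one linear map.
two_linear = (x @ W1 + b1) @ W2 + b2
W, b = W1 @ W2, b1 @ W2 + b2
one_linear = x @ W + b
layers_collapse = np.allclose(two_linear, one_linear)

# With a ReLU in between, the network is no longer that single linear map.
with_relu = np.maximum(x @ W1 + b1, 0) @ W2 + b2
relu_changes_output = not np.allclose(with_relu, one_linear)

print("two linear layers equal one linear layer:", layers_collapse)
print("ReLU changes the function:", relu_changes_output)
```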

**Write the answers below :**

1 - Use of activation functions:

Activation functions introduce non-linearity, which lets a neural network learn complex patterns and model real-world data effectively. Without them, a stack of layers collapses into a single linear transformation, no matter how many layers it has.

2 - Key differences between sigmoid and softmax:

Sigmoid maps each output independently to a value in (0, 1), so it suits binary classification (or multi-label problems, where several classes can be active at once). Softmax normalizes all outputs jointly into a probability distribution that sums to 1, so it suits multi-class classification with exactly one correct class.

3 - Key differences between categorical crossentropy and binary crossentropy loss:

Categorical crossentropy is used for multi-class classification with one-hot encoded labels and is typically paired with softmax. Binary crossentropy is used for binary classification with labels of 0 or 1 and is typically paired with sigmoid.
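
The differences in answers 2 and 3 can be checked numerically. The functions below are plain NumPy re-implementations written for this illustration, not the Keras ones. The logits and target are made up, and averaging the binary loss over the outputs is an assumption of this sketch.

```python
import numpy as np


def sigmoid(z):
    """Squash each value independently into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    """Normalize a vector of logits into probabilities that sum to 1."""
    shifted = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def categorical_crossentropy(y_true, probs, eps=1e-7):
    """Only the probability assigned to the true class contributes."""
    probs = np.clip(probs, eps, 1.0)
    return -np.sum(y_true * np.log(probs), axis=-1)


def binary_crossentropy(y_true, probs, eps=1e-7):
    """Every output is scored as its own yes/no decision, then averaged."""
    probs = np.clip(probs, eps, 1.0 - eps)
    per_output = y_true * np.log(probs) + (1.0 - y_true) * np.log(1.0 - probs)
    return -np.mean(per_output, axis=-1)


logits = np.array([2.0, 1.0, 0.1])  # made-up scores for three classes
target = np.array([1.0, 0.0, 0.0])  # one-hot label: class 0 is correct

p_softmax = softmax(logits)
p_sigmoid = sigmoid(logits)

print("softmax:", p_softmax, "sum =", p_softmax.sum())
print("sigmoid:", p_sigmoid, "sum =", p_sigmoid.sum())
print("categorical crossentropy:", categorical_crossentropy(target, p_softmax))
print("binary crossentropy:", binary_crossentropy(target, p_sigmoid))
```

The softmax outputs compete with each other and sum to 1, while the sigmoid outputs are independent and their sum is unconstrained.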
