# Lab: Image Classification using Convolutional Neural Networks

By the end of this laboratory, you will be familiar with:

*   Creating deep networks using Keras
*   Steps necessary in training a neural network
*   Prediction and performance analysis using neural networks

---

# **If you use a Colaboratory environment**
By default, Colab notebooks run on CPU.
You can switch your notebook to run with GPU.

To get access to a GPU, open the Runtime menu and select “Change runtime type”, as shown in the following figure:

![Changing runtime](https://miro.medium.com/max/747/1*euE7nGZ0uJQcgvkpgvkoQg.png)

In the pop-up window that appears, set “Hardware accelerator” to GPU.

# **Working with a new dataset: CIFAR-10**

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. More information about CIFAR-10 can be found [here](https://www.cs.toronto.edu/~kriz/cifar.html).

In Keras, the CIFAR-10 dataset comes preloaded as four NumPy arrays: x_train and y_train contain the training set, while x_test and y_test contain the test set. The images are encoded as NumPy arrays, and their corresponding labels are integers ranging from 0 to 9.

Your task is to:

*   Visualize the images in the CIFAR-10 dataset. Create a 10 x 10 plot showing 10 random samples from each class.
*   Convert the labels to one-hot encoded form.
*   Normalize the images.
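
As a rough illustration of the last two steps, the sketch below reproduces one-hot encoding and normalization with plain NumPy on a tiny made-up batch. The arrays `labels` and `pixels` are hypothetical stand-ins for y_train and x_train; the lab solution itself uses `to_categorical` from Keras.

```python
import numpy as np

# Hypothetical stand-ins for y_train (shape (n, 1)) and x_train (uint8 pixels)
labels = np.array([[3], [0], [9]])
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

# One-hot encoding: row i of the identity matrix is the code for class i
num_classes = 10
one_hot = np.eye(num_classes, dtype=np.float32)[labels.ravel()]

# Normalization: map uint8 intensities from [0, 255] to [0.0, 1.0]
normalized = pixels.astype(np.float32) / 255.0

print(one_hot.shape)        # (3, 10)
print(one_hot.argmax(1))    # [3 0 9]
print(normalized.min(), normalized.max())  # 0.0 1.0
```

Taking the argmax of a one-hot row recovers the original integer label, which is how predictions are decoded later in the lab.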




In [2]:
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical

# Load CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

In [None]:


# Define class names for visualization
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

# Create a 10x10 plot showing 10 random samples from each class
fig, axes = plt.subplots(10, 10, figsize=(15, 15))
fig.subplots_adjust(hspace=0.5, wspace=0.5)

for i in range(10):
    class_indices = np.where(y_train[:, 0] == i)[0]
    random_indices = np.random.choice(class_indices, 10, replace=False)
    for j, index in enumerate(random_indices):
        ax = axes[i, j]
        ax.imshow(x_train[index])
        ax.axis('off')
        if j == 0:
            ax.set_title(class_names[i])

plt.show()

# Step 2: Convert labels to one-hot encoded form
y_train_one_hot = to_categorical(y_train, num_classes=10)
y_test_one_hot = to_categorical(y_test, num_classes=10)

# Step 3: Normalize the images
x_train_normalized = x_train.astype('float32') / 255.0
x_test_normalized = x_test.astype('float32') / 255.0

# Print shapes to verify the preprocessing steps
print(f"x_train shape: {x_train_normalized.shape}")
print(f"y_train shape: {y_train_one_hot.shape}")
print(f"x_test shape: {x_test_normalized.shape}")
print(f"y_test shape: {y_test_one_hot.shape}")




## Define the following model (same as the one in the tutorial)

For the convolutional front-end, start with a single convolutional layer with a small filter size (3,3) and a modest number of filters (32) followed by a max pooling layer. 

Use an input shape of (32, 32, 3).

The filter maps can then be flattened to provide features to the classifier. 

Use a dense layer with 100 units before the classification layer (which is also a dense layer with softmax activation).
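
To make the size of this model concrete, the following sketch counts its trainable parameters by hand. It assumes the Keras defaults for the layers described above (no padding and stride 1 for the convolution, and a 2 x 2 pooling window); the helper names `conv2d_params` and `dense_params` are hypothetical. The total can be compared against the output of `model.summary()`.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units

# 32x32x3 input, 3x3 convolution without padding: 32 - 3 + 1 = 30
conv_out = 32 - 3 + 1
# 2x2 max pooling halves the spatial size: 30 // 2 = 15
pool_out = conv_out // 2
flat = pool_out * pool_out * 32  # 15 * 15 * 32 = 7200 features

total = (
    conv2d_params(3, 3, 32)     # 896
    + dense_params(flat, 100)   # 720100
    + dense_params(100, 10)     # 1010
)
print(flat, total)  # 7200 722006
```

Almost all of the parameters sit in the first dense layer, because it is connected to every value of the flattened feature maps.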

In [4]:
from tensorflow.keras.backend import clear_session
clear_session()

In [None]:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Define the model
model = Sequential()

# Add a single convolutional layer
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))

# Add a max pooling layer
model.add(MaxPooling2D(pool_size=(2, 2)))

# Flatten the feature maps
model.add(Flatten())

# Add a dense layer with 100 units
model.add(Dense(100, activation='relu'))

# Add the classification layer with softmax activation
model.add(Dense(10, activation='softmax'))



# Print the model summary
model.summary()


*   Compile the model using the categorical_crossentropy loss and the SGD optimizer, with 'accuracy' as the metric.
*   Train the model defined above on CIFAR-10 for 50 epochs with a batch size of 512.
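
For a sense of how much training these settings amount to, the short calculation below counts the gradient updates. It is only a sketch and assumes that the final, smaller batch of each epoch is kept, which is the usual behaviour when NumPy arrays are passed to `fit`.

```python
import math

num_train = 50_000   # CIFAR-10 training images
batch_size = 512
epochs = 50

# 50000 is not a multiple of 512, so the last batch of each epoch is smaller
steps_per_epoch = math.ceil(num_train / batch_size)
last_batch = num_train - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch, total_updates)  # 98 336 4900
```

A large batch size means few weight updates per epoch, which is one reason plain SGD can need many epochs to converge here.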

In [None]:
from tensorflow.keras.optimizers import SGD

# Compile with categorical cross-entropy loss and the SGD optimizer (default learning rate)
model.compile(loss="categorical_crossentropy", optimizer=SGD(), metrics=["accuracy"])

# Train the model for 50 epochs with a batch size of 512
history = model.fit(x_train_normalized, y_train_one_hot,
                    batch_size=512,
                    epochs=50,
                    verbose=1,
                    validation_data=(x_test_normalized, y_test_one_hot))

# Evaluate the model on test data
test_loss, test_acc = model.evaluate(x_test_normalized, y_test_one_hot)
print(f"Test Loss: {test_loss}")
print(f"Test Accuracy: {test_acc}")

# Save the model
model.save('cifar10_model.h5')
print("Model saved as cifar10_model.h5")


*   Plot the cross entropy loss curve and the accuracy curve

In [None]:


# Plot the cross-entropy loss curve and the accuracy curve
# Plot training & validation loss values
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')

# Plot training & validation accuracy values
plt.subplot(1, 2, 2)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')

plt.show()


## Defining Deeper Architectures: VGG Models

*   Define a deeper model architecture for the CIFAR-10 dataset and train the new model for 50 epochs with a batch size of 512. We will use a VGG-style model as the architecture.

Stack two convolutional layers with 32 filters, each of size 3 x 3.

Add a max pooling layer, then flatten its output and add a dense layer with 128 units before the classification layer.

For all layers except the softmax classification layer, use the ReLU activation function.

Use same padding for the convolutional layers so that the height and width of each layer's output match its input.
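
The effect of the padding choice can be checked with a little arithmetic. The sketch below uses the standard output-size formulas for "valid" and "same" padding; the helper `conv_output_size` is a hypothetical name introduced for illustration, not a Keras function.

```python
import math

def conv_output_size(n, kernel, stride=1, padding="valid"):
    """Spatial output size of a convolution or pooling layer along one axis."""
    if padding == "same":
        return math.ceil(n / stride)
    return (n - kernel) // stride + 1  # "valid": no padding

size_same = size_valid = 32
for _ in range(2):  # two stacked 3x3 convolutions
    size_same = conv_output_size(size_same, 3, padding="same")
    size_valid = conv_output_size(size_valid, 3, padding="valid")

print(size_same, size_valid)  # 32 28
print(conv_output_size(size_same, 2, stride=2, padding="same"))  # 16
```

With same padding only the pooling layers shrink the feature maps, which makes it easier to stack many convolutional layers.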


In [8]:
from tensorflow.keras.backend import clear_session
clear_session()

In [9]:
from tensorflow.keras.layers import Dropout
# Step 1: Define the deeper VGG-inspired model
model2 = Sequential()
model2.add(Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3)))
model2.add(Dropout(0.2))
model2.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model2.add(Dropout(0.3))
model2.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
model2.add(Flatten())
model2.add(Dense(128, activation='relu'))
model2.add(Dropout(0.4))
model2.add(Dense(10, activation='softmax'))

# Step 2: Compile the model

model2.compile(optimizer='SGD', loss='categorical_crossentropy', metrics=['accuracy'])



*   Compile the model using the categorical_crossentropy loss and the SGD optimizer, with 'accuracy' as the metric.
*   Train the model defined above on CIFAR-10 for 50 epochs with a batch size of 512.

In [None]:
# Compile the model with the SGD optimizer (default learning rate)
optimizer = SGD()
model2.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model for 50 epochs with a batch size of 512
history2 = model2.fit(x_train_normalized, y_train_one_hot,
                     batch_size=512,
                     epochs=50,
                     verbose=1,
                     validation_data=(x_test_normalized, y_test_one_hot))


# Evaluate the model on test data
test_loss, test_acc = model2.evaluate(x_test_normalized, y_test_one_hot)
print(f"Test Loss: {test_loss}")
print(f"Test Accuracy: {test_acc}")

*   Compare the performance of both the models by plotting the loss and accuracy curves of both the training steps. Does the deeper model perform better? Comment on the observation.
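
Alongside the plots, a few summary numbers can help with the comparison. The helper below is a minimal sketch: `summarize_history` is a hypothetical function, and it assumes the history dictionary contains the 'accuracy' and 'val_accuracy' keys that Keras records when 'accuracy' is the metric and validation data is supplied.

```python
def summarize_history(history_dict):
    """Return the best validation accuracy, its epoch, and the final train/val gap."""
    val_acc = history_dict["val_accuracy"]
    train_acc = history_dict["accuracy"]
    best_epoch = max(range(len(val_acc)), key=lambda i: val_acc[i])
    return {
        "best_val_accuracy": val_acc[best_epoch],
        "best_epoch": best_epoch + 1,               # epochs are reported 1-based
        "final_gap": train_acc[-1] - val_acc[-1],   # a large positive gap suggests overfitting
    }

# Intended usage in this notebook (not run here):
# print(summarize_history(history.history))
# print(summarize_history(history2.history))
```

A higher best validation accuracy with a small final gap would support the claim that the deeper model generalizes better.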
 

In [None]:
import matplotlib.pyplot as plt

# history contains the training history for the initial model
# and history2 contains the training history for the deeper VGG-inspired model

# Plot training & validation loss values for both models
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(history.history['loss'], label='Initial Model - Train Loss')
plt.plot(history.history['val_loss'], label='Initial Model - Val Loss')
plt.plot(history2.history['loss'], label='Deeper Model - Train Loss')
plt.plot(history2.history['val_loss'], label='Deeper Model - Val Loss')
plt.title('Model Loss Comparison')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(loc='upper right')

# Plot training & validation accuracy values for both models
plt.subplot(1, 2, 2)
plt.plot(history.history['accuracy'], label='Initial Model - Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Initial Model - Val Accuracy')
plt.plot(history2.history['accuracy'], label='Deeper Model - Train Accuracy')
plt.plot(history2.history['val_accuracy'], label='Deeper Model - Val Accuracy')
plt.title('Model Accuracy Comparison')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(loc='lower right')

plt.show()


**Comment on the observation**


...

*   Use the predict function to predict the output for the test split.
*   Plot the confusion matrix for the new model and comment on the class confusions.
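
The two steps above can be sketched with NumPy alone to show what the confusion matrix contains. The probabilities and labels below are made-up toy values for three classes, not results from the lab; the lab solution uses `confusion_matrix` from scikit-learn instead.

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples and 3 classes (toy values)
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.5, 0.3],
    [0.1, 0.1, 0.8],
])
true_labels = np.array([0, 1, 2, 2])

pred_labels = probs.argmax(axis=1)  # most probable class per sample

# cm[i, j] counts samples of true class i that were predicted as class j
num_classes = probs.shape[1]
cm = np.zeros((num_classes, num_classes), dtype=int)
np.add.at(cm, (true_labels, pred_labels), 1)

print(pred_labels)  # [0 1 1 2]
print(cm)
```

The diagonal holds the correct predictions, so the off-diagonal entry in row 2, column 1 records the one sample of class 2 that was mistaken for class 1.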


In [None]:
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# Use the predict function to predict the output for the test split
y_pred_prob = model2.predict(x_test_normalized)
y_pred = np.argmax(y_pred_prob, axis=1)

# Plot the confusion matrix (y_test has shape (n, 1), so flatten it to 1-D first)
cm = confusion_matrix(y_test.ravel(), y_pred)
cmd = ConfusionMatrixDisplay(cm, display_labels=class_names)
cmd.plot(cmap=plt.cm.Blues)
plt.title('Confusion Matrix for VGG-inspired Model with Dropouts')
plt.show()

# Comment on class confusions
print("Confusion Matrix Analysis:")
print("Each row represents the actual class, while each column represents the predicted class.")
print("Higher values off the diagonal indicate greater confusion between those classes.")

# Optionally, print the confusion matrix values
print(cm)


**Comment here :**


...

*    Print the test accuracy for the trained model.

In [None]:
# Evaluate the model on the test data and print the test accuracy
test_loss, test_acc = model2.evaluate(x_test_normalized, y_test_one_hot, verbose=0)
print(f"Test Loss: {test_loss}")
print(f"Test Accuracy: {test_acc}")


## Define the complete VGG architecture.

Stack two convolutional layers with 64 filters, each of size 3 x 3, followed by a max pooling layer.

Stack two more convolutional layers with 128 filters, each of size 3 x 3, followed by max pooling. Then stack two more convolutional layers with 256 filters, each of size 3 x 3, again followed by max pooling.

Flatten the output of the previous layer and add a dense layer with 128 units before the classification layer.

For all layers except the softmax classification layer, use the ReLU activation function.

Use same padding for the convolutional layers so that the height and width of each layer's output match its input.

*   Change the size of input to 64 x 64.
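
Before building this model, it can help to estimate what the larger input implies. The sketch below traces the feature-map size through the three blocks and estimates the memory needed to hold the resized training set as float32. It assumes same padding for the convolutions and 2 x 2 pooling, as described above.

```python
side = 64            # input height and width
for _ in range(3):   # three convolutional blocks
    # same-padded convolutions keep the size; each 2x2 max pooling halves it
    side //= 2

flat_features = side * side * 256            # the last block has 256 filters
dense_weights = flat_features * 128 + 128    # dense layer with 128 units

# Memory for the resized training set stored as float32 (4 bytes per value)
train_bytes = 50_000 * 64 * 64 * 3 * 4

print(side, flat_features, dense_weights)     # 8 16384 2097280
print(round(train_bytes / 2**30, 2), "GiB")   # 2.29 GiB
```

Resizing the whole dataset in memory is therefore noticeably more expensive than working with the original 32 x 32 images.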

In [14]:
from tensorflow.keras.backend import clear_session
clear_session()

In [None]:
import tensorflow as tf




# Resize the images to 64x64, then normalize them to [0, 1]
# Note: this overwrites the 32x32 normalized arrays used by the earlier models
x_train_resized = tf.image.resize(x_train, (64, 64)).numpy()
x_test_resized = tf.image.resize(x_test, (64, 64)).numpy()
x_train_normalized = x_train_resized.astype('float32') / 255.0
x_test_normalized = x_test_resized.astype('float32') / 255.0

# Step 4: Define the complete VGG architecture
model_vgg = Sequential()
# First stack of convolutional layers with 64 filters
model_vgg.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(64, 64, 3)))
model_vgg.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model_vgg.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
# Second stack of convolutional layers with 128 filters
model_vgg.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model_vgg.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model_vgg.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
# Third stack of convolutional layers with 256 filters
model_vgg.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model_vgg.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model_vgg.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
# Flatten and add dense layers
model_vgg.add(Flatten())
model_vgg.add(Dense(128, activation='relu'))
model_vgg.add(Dense(10, activation='softmax'))



*   Compile the model using the categorical_crossentropy loss and the SGD optimizer, with 'accuracy' as the metric.
*   Train the model defined above on CIFAR-10 for 10 epochs with a batch size of 512.
*   Predict the output for the test split and plot the confusion matrix for the new model and comment on the class confusions.

In [None]:

model_vgg.compile(optimizer='SGD', loss='categorical_crossentropy', metrics=['accuracy'])

# Step 6: Train the model
history_vgg = model_vgg.fit(x_train_normalized, y_train_one_hot,
                            batch_size=512,
                            epochs=10,
                            verbose=1,
                            validation_data=(x_test_normalized, y_test_one_hot))

# Step 7: Use the predict function to predict the output for the test split
y_pred_prob = model_vgg.predict(x_test_normalized)
y_pred = np.argmax(y_pred_prob, axis=1)

# Step 8: Plot the confusion matrix (y_test has shape (n, 1), so flatten it to 1-D first)
cm = confusion_matrix(y_test.ravel(), y_pred)
cmd = ConfusionMatrixDisplay(cm, display_labels=class_names)
cmd.plot(cmap=plt.cm.Blues)
plt.title('Confusion Matrix for Complete VGG Model')
plt.show()

# Step 9: Print the test accuracy for the trained model
test_loss, test_acc = model_vgg.evaluate(x_test_normalized, y_test_one_hot, verbose=0)
print(f"Test Loss: {test_loss}")
print(f"Test Accuracy: {test_acc}")

# Comment on class confusions
print("Confusion Matrix Analysis:")
print("Each row represents the actual class, while each column represents the predicted class.")
print("Higher values off the diagonal indicate greater confusion between those classes.")

# Optionally, print the confusion matrix values
print(cm)


# Understanding deep networks

*   What is the use of activation functions in a network? Why are they needed?
*   We have used softmax activation function in the exercise. There are other activation functions available too. What is the difference between sigmoid activation and softmax activation?
*   What is the difference between categorical crossentropy and binary crossentropy loss?

**Write the answers below :**

1 - Use of activation functions:


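Activation functions introduce non-linearity into the network, allowing it to learn and model complex relationships between inputs and outputs. Without them, a stack of layers would collapse into a single linear transformation, no matter how many layers the network has.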
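
The point about non-linearity can be checked numerically. The sketch below uses random toy weights (biases are omitted for brevity) to compare two stacked linear layers with a single equivalent layer, and then inserts a ReLU between them.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))    # 5 samples, 4 features
W1 = rng.normal(size=(4, 8))   # first layer weights
W2 = rng.normal(size=(8, 3))   # second layer weights

# Two layers without activation equal one layer with weights W1 @ W2
two_linear = (x @ W1) @ W2
one_linear = x @ (W1 @ W2)
print(np.allclose(two_linear, one_linear))  # True

# With ReLU in between, the network is no longer a single linear map
with_relu = np.maximum(x @ W1, 0.0) @ W2
print(np.allclose(with_relu, one_linear))   # False
```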

2 - Key Differences between sigmoid and softmax:

Sigmoid activation function:

*   Range: each output is a value between 0 and 1.
*   Use case: commonly used for binary classification, where the output represents the probability of the positive class.
*   Independence: each output is computed independently of the others.

Softmax activation function:

*   Range: each output is a value between 0 and 1, and together the outputs form a probability distribution that sums to 1.
*   Use case: typically used for multi-class classification, where each output represents the probability of one class.
*   Dependence: the outputs are interdependent, since raising one probability lowers the others.
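
The difference can be seen on a small example. The sketch below implements both functions in NumPy and applies them to made-up scores for three classes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.0])  # toy scores for three classes

s = sigmoid(logits)
p = softmax(logits)
print(s.sum() > 1.0)             # True: sigmoid outputs need not sum to 1
print(np.isclose(p.sum(), 1.0))  # True: softmax outputs always sum to 1

# Changing one logit leaves the other sigmoid outputs untouched,
# but shifts every softmax output
logits2 = np.array([5.0, 1.0, 0.0])
print(np.allclose(sigmoid(logits2)[1:], s[1:]))  # True
print(np.allclose(softmax(logits2)[1:], p[1:]))  # False
```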


3 - Key Differences between categorical crossentropy and binary crossentropy loss:

Categorical crossentropy loss:

*   Use case: multi-class classification problems where each sample belongs to exactly one of many classes.
*   Label representation: works with one-hot encoded labels.
*   Calculation: measures the difference between the predicted probability distribution and the actual one-hot distribution. Only the predicted probability of the true class contributes to the loss.

Binary crossentropy loss:

*   Use case: binary classification problems where each sample belongs to one of two classes. Applied to each output separately, it also covers multi-label problems.
*   Label representation: works with binary labels (0 or 1).
*   Calculation: measures the difference between the predicted probability and the actual binary label, penalizing both the positive and the negative case.
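
Both losses can be written in a few lines of NumPy. The functions below are a minimal sketch rather than the Keras implementation; the clipping constant `eps` is an assumption added to avoid taking the logarithm of zero, and the inputs are toy values.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """y_true: one-hot vector, y_pred: probabilities summing to 1."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """y_true: 0/1 labels, y_pred: independent probabilities (mean over outputs)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Toy three-class example: the true class is class 0
y_true = np.array([1.0, 0.0, 0.0])
y_pred = np.array([0.5, 0.25, 0.25])
cce = categorical_crossentropy(y_true, y_pred)
print(cce)  # -log(0.5), about 0.693

# Toy binary example: a single output with true label 1
bce = binary_crossentropy(np.array([1.0]), np.array([0.5]))
print(bce)  # also -log(0.5)
```

In the categorical case only the probability assigned to the true class enters the sum, whereas the binary form has a second term that penalizes probability mass placed on a label of 0.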
