# Convolutional Neural Network to classify images from the CIFAR-10 dataset

In this lecture we are going to learn about Convolutional Neural Networks (CNNs).
We will learn how to build them and how to use them to make predictions.

The dataset for today's classification task is CIFAR-10: https://www.cs.toronto.edu/~kriz/cifar.html

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow import keras
from tensorflow.keras.utils import plot_model  # import from tensorflow.keras, like the other imports

### Dataset loading and some data preprocessing

In [None]:
# TODO: load the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = ...

# print dataset shape and a single image shape
print("Train dataset shape:", x_train.shape)
print("Single image shape:", x_train[0].shape)

# Normalize pixel values to be between 0 and 1
x_train = ...
x_test =  ...

print("Train label shape:", y_train.shape)
print("Unique labels: ", ...)

In [None]:
# Plot the first 25 images of the dataset (in a 5x5 grid) with their labels
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',     # The class names make the output easier to read.
               'dog', 'frog', 'horse', 'ship', 'truck']             # They are listed in the dataset documentation


plt.figure(figsize=(8, 8))
for i in range(25):
    plt.subplot(5, 5, i+1)
    plt.imshow(x_train[i])
    plt.title(class_names[y_train[i][0]], fontsize=10)
    plt.axis('off')

plt.show()

## Building the CNN

We are going to create a CNN model with the following hidden layers:
1. `layer1`: Conv2D with 32 filters of size 3x3, stride=1, ReLU activation, padding "same"
2. `layer2`: MaxPool with filter size 2x2 and stride=1
3. `layer3`: Conv2D with 64 filters of size 3x3, stride=1, ReLU activation, padding "same"
4. `layer4`: MaxPool with filter size 2x2 and stride=1
5. `layer5`: Conv2D with 64 filters of size 3x3, stride=1, ReLU activation, padding "same"
6. `layer6`: MaxPool with filter size 2x2 and stride=1
7. `layer7`: MLP with 64 nodes

- **Keras sequential** documentation: https://keras.io/guides/sequential_model/
- **Keras documentation for Conv2D** class: https://keras.io/api/layers/convolution_layers/convolution2d/
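
Before coding the model, it can help to work out the spatial size of each layer's output by hand. The sketch below is plain Python, not Keras: the helper names `conv_same_out` and `pool_valid_out` are made up for this illustration, and it assumes the pooling layers use "valid" padding (the Keras default), since the list above does not state it.

```python
import math

def conv_same_out(size, stride=1):
    # With padding "same", the output size depends only on the stride.
    return math.ceil(size / stride)

def pool_valid_out(size, pool=2, stride=1):
    # With padding "valid" (an assumption), the window must fit inside the input.
    return (size - pool) // stride + 1

size = 32  # CIFAR-10 images are 32x32 pixels
sizes = []
for _ in range(3):  # three conv + pool pairs, as in the layer list
    size = conv_same_out(size)
    size = pool_valid_out(size)
    sizes.append(size)

# The last conv layer has 64 filters, so this many values reach layer7.
values_into_layer7 = size * size * 64

print(sizes)               # [31, 30, 29]
print(values_into_layer7)  # 53824
```

You can compare these numbers with the output shapes printed by `model.summary()`; with stride=1 each pooling layer removes only one pixel per side.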

In [None]:
# TODO: define the CNN model as described above
# HINT: pay attention when coding the 6th and 7th layers: we need something in between!
model = Sequential([
    ...
])

# Model architecture visualization
model.summary()

#### Visualize and plot the model architecture

In [None]:
!pip install visualkeras

In [None]:
import visualkeras

visualkeras.layered_view(model).show() # display using your system viewer
visualkeras.layered_view(model, to_file='output.png') # write to disk

In [None]:
plot_model(model, to_file='model.png')

### CNN Training

In [None]:
# TODO: compile the model as follows:
#       - use the Adam optimizer
#       - use the Sparse Categorical Crossentropy loss
#       - add the Accuracy metric
#
# NOTE: Sparse Categorical Crossentropy is similar to Categorical Crossentropy but is designed for cases
#       where the target labels are not one-hot encoded. Instead, each label is a single integer
#       giving the index of the true class.
...

# TODO: train the Model for 15 epochs
history = ...
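
The note about Sparse Categorical Crossentropy in the cell above can be checked by hand. The following is a minimal NumPy sketch, not the Keras implementation: the probabilities and labels are invented for illustration, and it only shows that integer labels and one-hot labels lead to the same loss value.

```python
import numpy as np

# Made-up softmax outputs for 3 samples and 4 classes (each row sums to 1).
probs = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.05, 0.80, 0.10, 0.05],
    [0.25, 0.25, 0.40, 0.10],
])
labels = np.array([0, 1, 2])  # integer class indices, one per sample

# Sparse form: pick the probability of the true class directly by index.
sparse_loss = -np.log(probs[np.arange(len(labels)), labels]).mean()

# Dense form: one-hot encode the labels first, then sum over the classes.
one_hot = np.eye(probs.shape[1])[labels]
dense_loss = -(one_hot * np.log(probs)).sum(axis=1).mean()

print(sparse_loss, dense_loss)  # the two values coincide
```

The sparse form avoids building the one-hot matrix, which is convenient here because the CIFAR-10 labels are already stored as integers.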

### CNN evaluation

- The metrics recorded during training are stored in a **History** object.
- Its `History.history` attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values.
- If you don't remember what type `history.history` is, you can run
    ```python
    type(history.history)
    ```
- Moreover, since it is a dictionary (a key: value structure), you can list the metrics stored in it (the keys) using
    ```python
    history.history.keys()
    ```
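
To see how such a dictionary is organized without training a model, here is a small sketch with a hand-written stand-in. The name `fake_history` and all the numbers are invented; the keys are the ones Keras is expected to produce when the model is compiled with the accuracy metric and trained with validation data (the `val_` keys only appear in that case).

```python
# Hypothetical stand-in for history.history; the numbers are invented for illustration.
fake_history = {
    "loss": [1.52, 1.15, 0.98],
    "accuracy": [0.45, 0.59, 0.66],
    "val_loss": [1.25, 1.08, 1.02],
    "val_accuracy": [0.55, 0.62, 0.64],
}

print(type(fake_history))         # a plain Python dict
print(list(fake_history.keys()))  # the names of the recorded metrics

# Each value is a list with one entry per epoch.
epochs = range(1, len(fake_history["loss"]) + 1)

# Epoch (counted from 1) with the highest validation accuracy.
best_epoch = max(epochs, key=lambda e: fake_history["val_accuracy"][e - 1])
print(best_epoch)
```

Each list can be passed directly to `plt.plot` to draw one curve per metric.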

**Model evaluation**

In order to evaluate our model we want to:
- plot the loss and accuracy curves on the training and validation sets
- test the model on the test set

In [None]:
import matplotlib.pyplot as plt

# Define a 1x2 subplot grid
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)

# Plot for loss and val_loss
plt.title("Loss Function")
plt.plot(..., label='loss')
plt.xlabel('Epoch', fontsize=13)
plt.ylabel('Loss', fontsize=13)
plt.ylim([0.0, 2])
plt.legend(loc='upper right')

plt.subplot(1, 2, 2)

# Plot for accuracy and val_accuracy
plt.title("Accuracy")
plt.plot(..., label='accuracy', color='tab:orange')
plt.xlabel('Epoch', fontsize=13)
plt.ylabel('Accuracy', fontsize=13)
plt.legend(loc='lower right')

plt.tight_layout()
plt.show()

In [None]:
# TODO: evaluate the Model on test data
test_loss, test_accuracy = ...

print(f'Loss on test set: {test_loss}')
print(f'Accuracy on test set: {test_accuracy*100:.2f}%')

### Confusion matrix

- A confusion matrix is a tool for evaluating the performance of a classification model.
- It is a square matrix in which each row represents the instances of an actual class and each column represents the instances of a predicted class. This is the convention used by scikit-learn's `confusion_matrix`; other sources swap rows and columns.
- The diagonal elements of the matrix represent the number of correct predictions for each class, while the off-diagonal elements represent incorrect predictions.

By analyzing the confusion matrix, we can gain insights into the model's performance, such as:
- `Accuracy`: The overall accuracy of the model, calculated as the ratio of the sum of correct predictions to the total number of predictions.
- `Precision`: The ratio of true positive predictions to the total number of positive predictions, indicating the model's ability to correctly identify positive cases.
- `Recall`: The ratio of true positive predictions to the total number of actual positive cases, indicating the model's ability to capture all positive cases.
- `F1 Score`: The harmonic mean of precision and recall, providing a balance between the two metrics.

Overall, the confusion matrix provides a comprehensive overview of the model's performance across different classes, enabling us to identify areas for improvement and fine-tuning.
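
The metrics listed above can be computed directly from the matrix. The sketch below uses a made-up 3-class confusion matrix and assumes that rows are actual classes and columns are predicted classes; if your matrix uses the opposite convention, swap the two `axis` arguments.

```python
import numpy as np

# Made-up confusion matrix: rows = actual class, columns = predicted class (assumed).
cm = np.array([
    [50,  5,  5],
    [10, 30, 10],
    [ 0, 10, 80],
])

true_pos = np.diag(cm)                 # correct predictions per class
accuracy = true_pos.sum() / cm.sum()   # overall accuracy
precision = true_pos / cm.sum(axis=0)  # column sums: everything predicted as that class
recall = true_pos / cm.sum(axis=1)     # row sums: everything that truly belongs to that class
f1 = 2 * precision * recall / (precision + recall)

print("accuracy :", accuracy)
print("precision:", precision.round(3))
print("recall   :", recall.round(3))
print("f1       :", f1.round(3))
```

In a multi-class problem precision, recall and F1 are computed per class, so each of them is a vector with one value per class.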

In [None]:
from sklearn.metrics import confusion_matrix
import seaborn as sn    # https://seaborn.pydata.org/
import pandas  as pd

In [None]:
# TODO: get the model predictions on the test set
y_pred = ...

# TODO: compute the confusion matrix
# HINT: use the confusion_matrix function provided by the scikit-learn library
matrix = ...

# We plot the confusion matrix
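df_cm = pd.DataFrame(matrix, index=class_names, columns=class_names)
plt.figure(figsize=(10, 7))
sn.set(font_scale=1.4)  # label size
# fmt="d" prints the counts as integers instead of scientific notation
sn.heatmap(df_cm, cmap="BuPu", annot=True, fmt="d", annot_kws={"size": 10})  # annotation font size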
plt.show()

In [None]:
# TODO: save the Model
...

### Visualize the feature maps
Feature maps are the **representations of features extracted from the input image at each level of the CNN**.

To visualize the latent features computed by a convolutional layer for a given image, you have to extract the output values of that layer.

To do this:
- you need to create a new model with the same input as the original model and the layer you want to analyze as the output layer.
- once you have this new model, you can call it on the image you want to visualize, and it will output the feature maps for that specific layer.

This can help you understand what features the model is detecting in the image and how it is processing the input data.
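
To make the idea of a feature map concrete, here is a small NumPy sketch that is independent of Keras. The function `feature_map`, the toy image and the hand-written kernel are all invented for this illustration; it handles a single channel and a single filter, whereas a real Conv2D layer also sums over the input channels and adds a bias.

```python
import numpy as np

def feature_map(image, kernel):
    """Slide a square 2D kernel over a 2D image (stride 1, zero "same" padding), then apply ReLU."""
    k = kernel.shape[0]
    padded = np.pad(image, k // 2)  # zero padding keeps the output the same size as the input
    out = np.zeros_like(image, dtype=float)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = np.sum(padded[r:r + k, c:c + k] * kernel)
    return np.maximum(out, 0.0)  # ReLU

# Toy 6x6 image: dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-written vertical-edge filter (a trained layer learns its own values).
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

toy_fmap = feature_map(image, kernel)
print(toy_fmap.shape)  # (6, 6)
print(toy_fmap[2])     # [0. 0. 3. 3. 0. 0.]
```

The map is non-zero only around the dark-to-bright edge: this is the sense in which a feature map shows where in the image a filter's pattern was detected.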

To access the layers, you can use `model.layers`.

In [None]:
# TODO: print some layers
print(type(model.layers))
print(model.layers[0])

# TODO: print only the conv layers
for ...:
	...
		...

1. Build a new model whose output is taken right after the first hidden layer
2. Show the feature maps extracted by the first conv layer

In [None]:
# We create a new model with the first conv layer as output
# NOTE: You can select a layer by its name, but the assigned names change if you re-run the code, so
#       it's better to select the layer using the list index
model_v = keras.Model(inputs=model.inputs, outputs=model.layers[0].output)
model_v.summary()

In [None]:
im = x_train[14]

# TODO: get the feature maps for an image
# NOTE: you have to reshape the image to include the batch dimension (of size 1)
feature_maps = ...

# Print the shape of feature_maps
print(feature_maps.shape)
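
The note about the batch dimension can be illustrated on a dummy array. This is only a sketch: `fake_image` is a stand-in filled with zeros, not a real CIFAR-10 image, and the two options shown are just two common NumPy ways of adding a leading axis.

```python
import numpy as np

# Stand-in for a single image: height 32, width 32, 3 colour channels.
fake_image = np.zeros((32, 32, 3), dtype=np.float32)

# Two equivalent ways to add a leading batch axis of size 1.
batch_a = np.expand_dims(fake_image, axis=0)
batch_b = fake_image[np.newaxis, ...]

print(fake_image.shape)  # (32, 32, 3)
print(batch_a.shape)     # (1, 32, 32, 3)
print(batch_b.shape)     # (1, 32, 32, 3)
```

A Keras model expects its input as a batch, so even a single image has to be passed as a batch containing one element.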

In [None]:
# We plot the image for which we want to compute the feature maps and its class
plt.imshow(im)
print(class_names[y_train[14].item()])

In [None]:
# TODO: show the feature map corresponding to the 5th filter as an image
# NOTE: Remember that feature_maps.shape = (1, 32, 32, 32), where the last axis indexes the filters
fmap = ...
print(fmap.shape)

plt.imshow(fmap,cmap="gray")

In [None]:
# We plot all the feature maps
fig  = plt.figure(figsize=(16,8))

for i in range(32):
    sub = fig.add_subplot(4,8, i+1)
    plt.xticks([])
    plt.yticks([])
    sub.imshow(feature_maps[0,:,:,i], cmap = "gray")
    sub.set_title(f"Feature map {i+1}")

plt.tight_layout()
plt.show()

### Plot the learned Filters

In [None]:
# TODO: extract the weights of the first convolutional layer
# NOTE: remember that for a conv layer the learned weights are the filters!
conv_weights, conv_biases = ...

# Normalizing the weights to [0, 1] to make them easy to visualize.
conv_weights_normalized = (conv_weights - np.min(conv_weights)) / (np.max(conv_weights) - np.min(conv_weights))

# Plotting the learned filters
plt.figure(figsize=(10, 10))
for i in range(conv_weights.shape[-1]):
    plt.subplot(6, 6, i + 1)
    plt.imshow(conv_weights_normalized[:, :, :, i], cmap='gray')
    plt.axis('off')
    plt.title(f'Filter {i+1}')
plt.show()