<center>

### COSC2753 - Machine Learning

# **Model Development - Convolutional Neural Network (CNN)**

<center>────────────────────────────</center>
&nbsp;


# I. Introduction

In this notebook, we develop a Convolutional Neural Network (CNN) model: we train it on preprocessed image data and optimize its performance through hyperparameter tuning. Specifically, we will perform the following steps:

- **Training:** We will train the selected CNN model using the preprocessed image data. This involves feeding the data into the model and adjusting its parameters to minimize the loss function.

- **Hyperparameter Tuning:** We will explore different combinations of hyperparameters to optimize the performance of our CNN model. This may include tuning parameters such as learning rate, batch size, and regularization strength.

- **Model Evaluation:** After training and tuning the CNN model, we will evaluate its performance using appropriate evaluation metrics. This step will help us assess how well the model generalizes to unseen data and determine its effectiveness in predicting labels for new images in the dataset.

By the end of this notebook, we will have developed a well-trained CNN model and evaluated its performance, providing insights into its effectiveness for image recognition tasks. This model will serve as a foundation for further analysis and applications in image classification.

# II. Importing Libraries

In [1]:
import os  # OS related functions
import numpy as np  # Numerical functions
import pandas as pd  # Data manipulation
import matplotlib.pyplot as plt  # Plotting
import seaborn as sns  # Plotting

# Deep learning
import tensorflow as tf
from keras.models import Sequential  # Pipeline
from keras.layers import (
    Dense,
    Conv2D,
    Flatten,
    MaxPooling2D,
    BatchNormalization,
    Dropout,
)  # Layers
from keras.callbacks import EarlyStopping, ReduceLROnPlateau  # Callbacks
from tensorflow.keras.preprocessing.image import (
    ImageDataGenerator,
)  # Image data generator

# Sklearn
from sklearn.metrics import classification_report  # Metrics

# III. Data Loading and Preprocessing

References supporting the following choice (to be cited later):

- Batch size: https://medium.com/data-science-365/determining-the-right-batch-size-for-a-neural-network-to-get-better-and-faster-results-7a8662830f15#:~:text=It%20is%20a%20good%20practice,requires%20fewer%20epochs%20to%20converge.

In [None]:
df_train = pd.read_csv("../../data/train/") # Load train data
df_test = pd.read_csv("../../data/test/") # Load test data

In [None]:
batch_size = 32  # Number of samples per gradient update
num_classes = df_train["Category"].nunique()  # Number of classes

# Image data generator
datagen = ImageDataGenerator(rescale=1.0 / 255)

# Common arguments
common_args = {
    "x_col": "Path",  # Path to image
    "y_col": "Category",  # Target column
    "batch_size": batch_size,  # Batch size
    "class_mode": "categorical",  # Multi-class classification
}

# Create generator for training data
train_dataset = datagen.flow_from_dataframe(
    dataframe=df_train,  # Training data
    shuffle=True,  # Shuffle the data
    **common_args  # Common arguments
)

# Create generator for testing data
test_dataset = datagen.flow_from_dataframe(
    dataframe=df_test,  # Testing data
    shuffle=False,  # Do not shuffle the data
    **common_args  # Common arguments
)
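
To make concrete what the two generators yield, the following sketch mimics three of their behaviours in plain Python: the number of batches per epoch, the `rescale=1.0 / 255` step, and the one-hot labels implied by `class_mode="categorical"`. The sample count, label names, and helper functions are hypothetical, and the sorted label-to-index mapping is an assumption about how the generator orders classes.

```python
from __future__ import annotations

import math


def batches_per_epoch(n_samples: int, batch_size: int) -> int:
    """Batches needed to see every sample once (the last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)


def rescale(pixels: list[int], factor: float = 1.0 / 255) -> list[float]:
    """Map 8-bit pixel intensities from [0, 255] to [0, 1]."""
    return [p * factor for p in pixels]


def one_hot_labels(labels: list[str]) -> tuple[dict[str, int], list[list[float]]]:
    """Assign an index to each class (sorted order, assumed) and one-hot encode."""
    class_indices = {name: i for i, name in enumerate(sorted(set(labels)))}
    encoded = []
    for label in labels:
        row = [0.0] * len(class_indices)
        row[class_indices[label]] = 1.0
        encoded.append(row)
    return class_indices, encoded


# Hypothetical values, for illustration only
n_batches = batches_per_epoch(n_samples=1000, batch_size=32)
scaled = rescale([0, 51, 255])
class_indices, encoded = one_hot_labels(["sofa", "bed", "sofa", "lamp"])

print(n_batches)       # 32 batches: 31 full batches plus one of 8 samples
print(class_indices)   # {'bed': 0, 'lamp': 1, 'sofa': 2}
print(encoded[0])      # one-hot vector for "sofa"
```

Because the test generator uses `shuffle=False`, its labels stay aligned with the row order of the dataframe, which matters when predictions are later compared against `test_dataset.classes`.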

# IV. Model Development (CNN)

## CNN Model Architecture Initialization

References supporting the following choices (to be cited later):

- Convolutional Layers: https://www.linkedin.com/pulse/choosing-number-hidden-layers-neurons-neural-networks-sachdev/

- Pooling Layers: https://machinelearningmastery.com/pooling-layers-for-convolutional-neural-networks/

- Dropout Layers: https://nchlis.github.io/2017_08_10/page.html

- Activation function choice: https://thangasami.medium.com/cnn-and-ann-performance-with-different-activation-functions-like-relu-selu-elu-sigmoid-gelu-etc-c542dd3b1365

- Kernel size: https://medium.com/analytics-vidhya/significance-of-kernel-size-200d769aecb1#:~:text=Limiting%20the%20number%20of%20parameters,size%20at%203x3%20or%205x5.

In [None]:
cnn = Sequential()  # Pipeline

# Convolutional layer 1
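cnn.add(
    Conv2D(
        32,
        kernel_size=(3, 3),
        activation="relu",
        # Assumed input shape: flow_from_dataframe's default target_size (256, 256) with RGB images
        input_shape=(256, 256, 3),
    )
)  # Convolutional layer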
cnn.add(BatchNormalization())  # Batch normalization
cnn.add(MaxPooling2D(pool_size=(2, 2)))  # Max pooling
cnn.add(Dropout(0.2))  # Dropout

# # Convolutional layer 2
# cnn.add(Conv2D(64, kernel_size=(3, 3), activation="relu"))  # Convolutional layer
# cnn.add(BatchNormalization())  # Batch normalization
# cnn.add(MaxPooling2D(pool_size=(2, 2)))  # Max pooling
# cnn.add(Dropout(0.2))  # Dropout

# # Convolutional layer 3
# cnn.add(Conv2D(128, kernel_size=(3, 3), activation="relu"))  # Convolutional layer
# cnn.add(BatchNormalization())  # Batch normalization
# cnn.add(MaxPooling2D(pool_size=(2, 2)))  # Max pooling
# cnn.add(Dropout(0.2))  # Dropout

# # Convolutional layer 4
# cnn.add(Conv2D(256, kernel_size=(3, 3), activation="relu"))  # Convolutional layer
# cnn.add(BatchNormalization())  # Batch normalization
# cnn.add(MaxPooling2D(pool_size=(2, 2)))  # Max pooling
# cnn.add(Dropout(0.2))  # Dropout

cnn.add(Flatten())  # Flatten

cnn.add(Dense(512, activation="relu"))  # Dense layer
cnn.add(Dropout(0.5))  # Dropout
# cnn.add(Dense(256, activation="relu"))  # Dense layer
# cnn.add(Dropout(0.5))  # Dropout

cnn.add(Dense(num_classes, activation="softmax"))  # Output layer

cnn.summary()  # Model summary
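
The shapes and parameter counts that a model summary reports can be approximated by hand. The sketch below does this for the active layers only, assuming a 256 x 256 RGB input (the default `target_size` of `flow_from_dataframe`) and a hypothetical `num_classes` of 10; the helper functions are illustrative and not part of Keras.

```python
def conv2d_output(height, width, kernel, filters):
    """Output shape of a stride-1 convolution with 'valid' padding."""
    return height - kernel + 1, width - kernel + 1, filters


def conv2d_params(kernel, in_channels, filters):
    """Weights per filter (kernel * kernel * in_channels) plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def maxpool_output(height, width, channels, pool):
    """Output shape of non-overlapping max pooling (stride equals pool size)."""
    return height // pool, width // pool, channels


def dense_params(inputs, units):
    """One weight per input per unit, plus one bias per unit."""
    return (inputs + 1) * units


# Assumed input: 256 x 256 RGB images
in_h, in_w, in_c = 256, 256, 3
num_classes = 10  # Hypothetical; the notebook derives this from df_train

conv_shape = conv2d_output(in_h, in_w, kernel=3, filters=32)
conv_count = conv2d_params(kernel=3, in_channels=in_c, filters=32)

# BatchNormalization and Dropout leave the shape unchanged
pool_shape = maxpool_output(*conv_shape, pool=2)

flat_size = pool_shape[0] * pool_shape[1] * pool_shape[2]
dense_count = dense_params(flat_size, 512)
output_count = dense_params(512, num_classes)

print(conv_shape, conv_count)   # (254, 254, 32) 896
print(pool_shape, flat_size)    # (127, 127, 32) 516128
print(dense_count)              # 264258048
print(output_count)             # 5130
```

Under these assumptions almost all parameters sit in the first dense layer, which is one argument for enabling further convolution and pooling blocks (such as the commented-out ones) before `Flatten`.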

## Training the CNN Model

In [None]:
# Early stopping
early_stopping = EarlyStopping(
    monitor="val_loss",  # Monitor validation loss
    patience=5,  # Stop training if no improvement for 5 epochs
    restore_best_weights=True,  # Restore the best weights when stopping
    min_delta=0.001,  # Minimum change to qualify as an improvement
    verbose=1,  # Print messages
)

# Reduce learning rate
reduce_lr = ReduceLROnPlateau(
    monitor="val_loss",  # Monitor validation loss
    patience=3,  # Reduce learning rate if no improvement for 3 epochs
    factor=0.2,  # Reduce learning rate by a factor of 0.2
    min_lr=0.00001,  # Minimum learning rate
    verbose=1,  # Print messages
)
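
The interplay of `patience`, `min_delta`, `factor`, and `min_lr` is easier to see on a toy example. The sketch below is a simplified re-implementation of the idea behind both callbacks, not the Keras source; the validation losses are made up, and the starting learning rate of 0.001 is assumed to be Adam's default.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.001):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    stop_epoch is None if training runs through all epochs.
    """
    best = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:  # Improvement must exceed min_delta
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


def reduced_learning_rates(initial_lr, factor=0.2, min_lr=0.00001, reductions=4):
    """Learning rate after each successive reduction, clipped at min_lr."""
    rates = [initial_lr]
    for _ in range(reductions):
        rates.append(max(rates[-1] * factor, min_lr))
    return rates


# Made-up validation losses: the best value occurs at epoch 2
val_losses = [1.00, 0.80, 0.70, 0.6995, 0.71, 0.72, 0.70, 0.705, 0.73, 0.74]
stop_epoch, best_epoch = early_stopping_epoch(val_losses)
rates = reduced_learning_rates(0.001)

print(stop_epoch, best_epoch)  # 7 2
print(rates)                   # Shrinks by 0.2 each step until it hits min_lr
```

Note that the loss of 0.6995 at epoch 3 does not count as an improvement, because it beats the best value by less than `min_delta`. With `restore_best_weights=True`, the weights from the best epoch would be the ones kept.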

In [None]:
cnn.compile(
    optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"]
)  # Compile model

cnn.fit(
    train_dataset,
    validation_data=test_dataset,  # Note: the test set doubles as validation data, so the test metrics below are optimistic
    epochs=10,  # 10 epochs
    callbacks=[early_stopping, reduce_lr],
)  # Fit model
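
For intuition about what `loss="categorical_crossentropy"` measures, the sketch below computes a softmax and the cross-entropy for a single sample in plain Python. It is a hand-rolled illustration with made-up numbers, not the Keras implementation; the `eps` clipping value is an assumption added to avoid `log(0)`.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # Subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Negative log-probability assigned to the true (one-hot) class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


y_true = [0.0, 1.0, 0.0]  # One-hot label: the sample belongs to class 1

probs = softmax([1.0, 3.0, 0.5])
loss_good = categorical_crossentropy(y_true, [0.1, 0.7, 0.2])  # -ln(0.7)
loss_bad = categorical_crossentropy(y_true, [0.7, 0.1, 0.2])   # -ln(0.1)

print(round(sum(probs), 6))
print(round(loss_good, 4), round(loss_bad, 4))
```

The loss is small when the model puts high probability on the true class and grows quickly as that probability shrinks, which is the quantity the optimizer minimizes during `fit`.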

## Model Evaluation

In [None]:
# Evaluate model
loss, accuracy = cnn.evaluate(test_dataset, verbose=0)

print(f"Loss: {loss:.2f}")
print(f"Accuracy: {accuracy:.2f}")

# Predictions
y_pred = cnn.predict(test_dataset)
y_pred = np.argmax(y_pred, axis=1)

# True values
y_true = test_dataset.classes

# Classification report
print(classification_report(y_true, y_pred))
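
The per-class figures in a classification report come from comparing the arg-max of each predicted probability vector with the true class index. The sketch below reproduces precision, recall, and accuracy on a tiny made-up example in plain Python; the helper functions and data are hypothetical and stand in for `np.argmax` and `classification_report`.

```python
def argmax(row):
    """Index of the largest value (first index wins on ties)."""
    return max(range(len(row)), key=row.__getitem__)


def per_class_precision_recall(y_true, y_pred, n_classes):
    """Return {class_index: (precision, recall)}."""
    report = {}
    for cls in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        predicted = sum(1 for p in y_pred if p == cls)
        actual = sum(1 for t in y_true if t == cls)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        report[cls] = (precision, recall)
    return report


# Made-up true labels and softmax outputs for six samples, three classes
y_true_demo = [0, 0, 1, 1, 2, 2]
y_prob_demo = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],  # Misclassified as class 1
    [0.1, 0.8, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
]

y_pred_demo = [argmax(row) for row in y_prob_demo]
report = per_class_precision_recall(y_true_demo, y_pred_demo, n_classes=3)
accuracy_demo = sum(t == p for t, p in zip(y_true_demo, y_pred_demo)) / len(y_true_demo)

print(y_pred_demo)  # [0, 1, 1, 1, 2, 2]
for cls, (precision, recall) in report.items():
    print(f"class {cls}: precision={precision:.2f} recall={recall:.2f}")
print(f"accuracy={accuracy_demo:.2f}")
```

A single misclassified sample lowers the recall of its true class and the precision of the class it was assigned to, which is why the report is more informative than accuracy alone.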