# **Deep Learning Project - Pothole detection** <br/>
# **Ensemble Model**
**Data Science and Advanced Analytics with a specialization in Data Science**<br/>
**NOVA IMS**<br/>
Authors of this notebook:
*   Mafalda Paço - 20220619@novaims.unl.pt
*   Mª Margarida Graça - 20220602@novaims.unl.pt
*   Marta Dinis - 20220611@novaims.unl.pt
*   Nuno Dias - 20220603@novaims.unl.pt
*   Patrícia Morais - 20220638@novaims.unl.pt


## References
1.  https://www.v7labs.com/blog/ensemble-learning
2.  https://medium.com/randomai/ensemble-and-store-models-in-keras-2-x-b881a6d7693f
3.  https://builtin.com/machine-learning/ensemble-model

## Ready-to-use Dataset
https://drive.google.com/file/d/1KE507iE7Hwb7TiJINnvMYCXNIGrEgPvt/view?usp=share_link

## **Summary**

In this notebook you'll find an ensemble model combining our handcrafted model, ResNet50 and VGG16. By combining the predictions of multiple models we have a chance of boosting accuracy. This approach tends to generate more robust predictions, relying on the wisdom of crowds.
All the choices made during the development of this model are detailed as we implemented them. You can also find the accuracy, AUROC and loss values for train and validation datasets.
We concluded that this model is worse than the one we handcrafted.

## **Data Import**

Necessary library imports.

In [None]:
!pip install -q -U keras-tuner

import os
import numpy as np
from PIL import Image

import time
import shutil
import zipfile

import tensorflow as tf
from tensorflow.keras.preprocessing import image_dataset_from_directory
from tensorflow.keras import Sequential, Model, layers, initializers, regularizers, optimizers, metrics

from tensorflow.keras.layers import Dense, Flatten, Dropout, GlobalAveragePooling2D, Input
from tensorflow.keras.applications import ResNet50, VGG16
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import binary_crossentropy

import cv2
from sklearn.metrics import accuracy_score, roc_auc_score


Connection to the Data Source.

In [None]:
# Set the machine
gdrive = True
# Set the connection string
path = "/content/drive/MyDrive/DL/Project/"
main_folder, training_folder, testing_folder = "DATA/", "train/", "test/"
# If using Google Drive
if gdrive:
    # Setup drive
    from google.colab import drive
    drive.mount('/content/drive')
    # Transfer zip dataset to the current virtual machine
    t0 = time.time()
    shutil.copyfile(path + 'DATA.zip', 'DATA.zip')
    # Extract files (the context manager closes the archive afterwards)
    with zipfile.ZipFile('DATA.zip') as zip_:
        zip_.extractall()
    print("File transfer completed in %0.3f seconds" % (time.time() - t0))
    path = ""

Mounted at /content/drive
File transfer completed in 6.447 seconds


Definition of the parameters for the function image_dataset_from_directory. We defined the size to which all images will be resized as well as the batch size, i.e. the number of images the model processes at a time during training. All of these parameters were chosen according to our problem's complexity.

In [None]:
image_size=(224, 224) # To experiment with
crop_to_aspect_ratio=True # To experiment with
color_mode='rgb'
batch_size=64
label_mode="binary"
validation_split=0.2
shuffle=True
seed=0

Loads the training data using the ``image_dataset_from_directory()`` function and automatically splits it into training and validation data via validation_split, holding out 20% for validation.

In [None]:
# Generate an object of type tf.data.Dataset
ds_train, ds_val = image_dataset_from_directory(path + main_folder + training_folder,
                                                image_size=image_size,
                                                crop_to_aspect_ratio=crop_to_aspect_ratio,
                                                color_mode=color_mode,
                                                batch_size=batch_size,
                                                label_mode=label_mode,
                                                subset='both',
                                                validation_split=validation_split,
                                                shuffle=shuffle,
                                                seed=seed)

Found 1436 files belonging to 2 classes.
Using 1149 files for training.
Using 287 files for validation.
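
The file counts reported above can be reproduced with a little arithmetic. The sketch below assumes that the validation size is the split fraction times the number of files, truncated to an integer, and that the final batch may be smaller than the others. `n_files` is taken from the output above; the variable names are illustrative.

```python
import math

n_files = 1436          # total files found in the training folder
validation_split = 0.2
batch_size = 64

# Assumption: validation size is the truncated product, the rest is training
n_val = int(validation_split * n_files)
n_train = n_files - n_val

# Number of batches per epoch, counting a smaller final batch
train_batches = math.ceil(n_train / batch_size)
val_batches = math.ceil(n_val / batch_size)

print(n_train, n_val)              # 1149 287
print(train_batches, val_batches)  # 18 5
```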


Loads the testing data using the ``image_dataset_from_directory()`` function.

In [None]:
ds_test = image_dataset_from_directory(path + main_folder + testing_folder,
                                       image_size=image_size,
                                       crop_to_aspect_ratio=crop_to_aspect_ratio,
                                       color_mode=color_mode,
                                       batch_size=batch_size,
                                       label_mode=label_mode,
                                       shuffle=shuffle,
                                       seed=seed)

Found 16 files belonging to 2 classes.


In [None]:
input_shape=(*image_size, 3)
input_shape

(224, 224, 3)

With this preprocess function we are flattening and normalizing our image data, in order to make it suitable to train a machine learning model.

In [None]:
def preprocess(ds):
    X = []
    y = []
    for images, labels in ds:
        # Flatten images
        flat_images = images.numpy().reshape(images.shape[0], -1)
        X.extend(flat_images)

        # Get labels
        y.extend(labels.numpy())

    # Normalize images
    X = np.array(X).astype('float32') / 255.0
    y = np.array(y)

    return X, y

X_train, y_train = preprocess(ds_train)
X_val, y_val = preprocess(ds_val)
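
To make the effect of this step concrete, here is a minimal sketch on a tiny made-up batch (the 2x2 image size and the pixel values are assumptions for illustration only). It shows the flattened shape, the scaling to the [0, 1] range, and that `reshape` can map a flattened row back to its image layout when one is needed.

```python
import numpy as np

# Hypothetical batch: 2 RGB images of size 2x2 with pixel values in [0, 255]
batch = np.arange(2 * 2 * 2 * 3, dtype="float32").reshape(2, 2, 2, 3) * 10

# Flatten every image into one row, as preprocess() does
flat = batch.reshape(batch.shape[0], -1)

# Scale pixel values to the [0, 1] range
flat_norm = flat / 255.0

# A flattened row can be mapped back to (height, width, channels)
restored = flat_norm.reshape(-1, 2, 2, 3)

print(flat.shape)      # (2, 12)
print(restored.shape)  # (2, 2, 2, 3)
```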

This cell ensures that the arrays have the shape the pre-trained models expect: images of size (224, 224) with 3 color channels.

In [None]:
# Load and resize the input images to (224, 224)
X_train_resized = []
X_val_resized = []

for img in X_train:
    img_resized = cv2.resize(img, (224, 224))
    X_train_resized.append(img_resized)

for img in X_val:
    img_resized = cv2.resize(img, (224, 224))
    X_val_resized.append(img_resized)

X_train_resized = np.array(X_train_resized)
X_val_resized = np.array(X_val_resized)

# Ensure the input data has the correct number of channels
if X_train_resized.ndim == 3:
    X_train_resized = np.expand_dims(X_train_resized, axis=-1)
    X_train_resized = np.repeat(X_train_resized, 3, axis=-1)

if X_val_resized.ndim == 3:
    X_val_resized = np.expand_dims(X_val_resized, axis=-1)
    X_val_resized = np.repeat(X_val_resized, 3, axis=-1)
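
The channel handling can be illustrated in isolation. In this sketch a made-up stack of single-channel images (an assumption; the notebook's data comes from the loader above) receives a trailing channel axis, which is then repeated three times.

```python
import numpy as np

# Hypothetical stack of 4 single-channel images of size 8x8
gray = np.random.default_rng(0).random((4, 8, 8)).astype("float32")

if gray.ndim == 3:
    # Add a trailing channel axis: (4, 8, 8) -> (4, 8, 8, 1)
    rgb_like = np.expand_dims(gray, axis=-1)
    # Copy the single channel three times: (4, 8, 8, 1) -> (4, 8, 8, 3)
    rgb_like = np.repeat(rgb_like, 3, axis=-1)

print(rgb_like.shape)  # (4, 8, 8, 3)
```

All three channels hold the same values, so the result has the right shape for an RGB model but carries no color information.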

## **Models**


### ResNet50

ResNet stands for Residual Network; ResNet50 is its 50-layer convolutional neural network (CNN) variant. It uses shortcut connections that skip over some convolutional layers, which mitigates the vanishing gradient problem (as gradients are backpropagated through many layers they can become so small that the weights of the earlier layers barely change, so those layers fail to learn effectively). Thanks to these residual blocks, the network can be made deep enough to learn more complex representations of the input images.

In [None]:
# Load pre-trained ResNet50 model
resnet50 = ResNet50(include_top=False, weights='imagenet', input_shape=input_shape)
resnet50.trainable = False

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
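
The idea of a shortcut connection can be sketched with plain NumPy. The function below is a simplified, fully connected stand-in for a residual block (real ResNet blocks use convolutions and batch normalization); the names `residual_block`, `w1` and `w2` are illustrative and not part of Keras.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Return relu(F(x) + x), where F is a small two-layer transformation."""
    fx = relu(x @ w1) @ w2   # the residual branch F(x)
    return relu(fx + x)      # the shortcut adds the input back

x = np.array([[1.0, 2.0, 3.0]])

# With all-zero weights the branch contributes nothing and the block
# passes its (non-negative) input through unchanged.
zeros = np.zeros((3, 3))
out_identity = residual_block(x, zeros, zeros)
print(out_identity)

# With non-zero weights the block adds a correction on top of x.
rng = np.random.default_rng(0)
out = residual_block(x, rng.normal(size=(3, 3)), rng.normal(size=(3, 3)))
print(out.shape)  # (1, 3)
```

Because the input is added back to the branch output, the signal (and its gradient) has a direct path around the transformation.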


### VGG16

VGG16 is a convolutional neural network with 16 weight layers. It uses 3x3 filters throughout, the smallest size that still captures spatial notions such as left/right and up/down. Unlike ResNet it has no shortcut connections and relies on stacking more layers of small filters to improve accuracy; in its full form (with the fully connected top) it also has more parameters than ResNet50. By using many filters with small receptive fields, it is able to capture fine-grained features in the input images.

In [None]:
# Load pre-trained VGG16 model
vgg16 = VGG16(include_top=False, weights='imagenet', input_shape=input_shape)
vgg16.trainable = False

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
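
To see why stacking small filters is attractive, the sketch below compares three stacked 3x3 convolutions with a single larger one covering the same receptive field. It assumes stride 1, equal input and output channel counts and no biases; the channel count of 64 and the helper names are illustrative.

```python
def stacked_receptive_field(n_layers, kernel_size=3):
    """Receptive field of n stacked stride-1 convolutions."""
    return 1 + n_layers * (kernel_size - 1)

def conv_weights(kernel_size, channels):
    """Weights of one conv layer with `channels` inputs and outputs (no bias)."""
    return kernel_size * kernel_size * channels * channels

channels = 64  # illustrative channel count
rf = stacked_receptive_field(3)          # three stacked 3x3 layers
stacked = 3 * conv_weights(3, channels)  # weights of the whole stack
single = conv_weights(rf, channels)      # one layer with the same field

print(rf)               # 7
print(stacked, single)  # 110592 200704
```

Under these assumptions the stack sees the same 7x7 region with roughly half the weights, and it applies a non-linearity after each layer.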


### Our best model

In [None]:
# Define the custom model
class SimpleCNN_DO_L2_ES_3CL_BN(Model):
    def __init__(self, seed=0):
        super().__init__()
        self.preprocess = tf.keras.layers.BatchNormalization()
        self.conv1 = tf.keras.layers.Conv2D(filters=32, kernel_size=3, padding='same', activation='relu',
                                           kernel_initializer=tf.keras.initializers.GlorotNormal(seed=seed))
        self.bn1 = tf.keras.layers.BatchNormalization()
        self.conv2 = tf.keras.layers.Conv2D(filters=64, kernel_size=3, padding='same', activation='relu',
                                           kernel_initializer=tf.keras.initializers.GlorotNormal(seed=seed))
        self.bn2 = tf.keras.layers.BatchNormalization()
        self.conv3 = tf.keras.layers.Conv2D(filters=64, kernel_size=3, padding='same', activation='relu',
                                           kernel_initializer=tf.keras.initializers.GlorotNormal(seed=seed))
        self.bn3 = tf.keras.layers.BatchNormalization()
        self.dense1 = tf.keras.layers.Dense(units=1, activation='sigmoid',
                                            kernel_initializer=tf.keras.initializers.GlorotNormal(seed=seed),
                                            kernel_regularizer=tf.keras.regularizers.l2(0.01))  # Add L2 regularization
        self.dropout1 = tf.keras.layers.Dropout(0.15, seed=seed)  # Add dropout layer

        # Non-learnable layers (define only once)
        self.gmp = tf.keras.layers.GlobalMaxPooling2D()
        self.maxpool2x2 = tf.keras.layers.MaxPooling2D(pool_size=2, strides=2)

    def call(self, inputs):
        x = self.preprocess(inputs)
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.maxpool2x2(x)
        x = self.dropout1(x)  # Apply dropout to the output of conv1

        x = self.conv2(x)
        x = self.bn2(x)
        x = self.maxpool2x2(x)

        x = self.conv3(x)
        x = self.bn3(x)
        x = self.gmp(x)
        x = self.dense1(x)
        return x

# Create an instance of the custom model
custom_model = SimpleCNN_DO_L2_ES_3CL_BN()
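
As a rough check of the architecture above, the following sketch traces the feature-map shape and the number of convolution and dense parameters for a 224x224 RGB input. It is plain arithmetic, not a Keras call; batch-normalization parameters are left out, and the helper name `conv_params` is illustrative.

```python
def conv_params(kernel_size, in_ch, out_ch):
    """Weights plus biases of one Conv2D layer."""
    return kernel_size * kernel_size * in_ch * out_ch + out_ch

size, in_ch = 224, 3  # input images are 224x224 RGB
trace = []

# (layer name, number of filters, followed by 2x2 max pooling?)
for name, out_ch, pooled in [("conv1", 32, True),
                             ("conv2", 64, True),
                             ("conv3", 64, False)]:
    params = conv_params(3, in_ch, out_ch)
    if pooled:
        size //= 2  # pool_size=2, strides=2 halves height and width
    # Shape recorded after the pooling step, when there is one
    trace.append((name, params, (size, size, out_ch)))
    in_ch = out_ch

# Global max pooling keeps one value per channel, then Dense(1) follows
dense_params = in_ch * 1 + 1

for row in trace:
    print(row)
print("dense1", dense_params)
```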


Using the pre-trained ResNet50 and VGG16 together with our custom model, we passed the training images through each network and took its output as features. These features were then flattened and concatenated to form the dataset for the final ensemble classifier.

In [None]:
# Extract features from the pre-trained models
resnet50_features = resnet50.predict(X_train_resized)
vgg16_features = vgg16.predict(X_train_resized)
custom_model_features = custom_model.predict(X_train_resized)

# Flatten the features
resnet50_features_flat = resnet50_features.reshape((resnet50_features.shape[0], -1))
vgg16_features_flat = vgg16_features.reshape((vgg16_features.shape[0], -1))
custom_model_features_flat = custom_model_features.reshape((custom_model_features.shape[0], -1))

# Concatenate the features
ensemble_features = np.concatenate([resnet50_features_flat, vgg16_features_flat, custom_model_features_flat], axis=1)
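
The size of the concatenated feature vector can be worked out without running the networks. The sketch below uses zero-filled arrays as stand-ins for the model outputs; it assumes that, for 224x224 inputs and `include_top=False`, ResNet50 yields a 7x7x2048 map and VGG16 a 7x7x512 map, while the custom model ends in a single sigmoid unit.

```python
import numpy as np

n_samples = 2  # tiny illustrative batch

# Stand-ins for the model outputs (shapes are assumptions, see above)
resnet50_out = np.zeros((n_samples, 7, 7, 2048), dtype="float32")
vgg16_out = np.zeros((n_samples, 7, 7, 512), dtype="float32")
custom_out = np.zeros((n_samples, 1), dtype="float32")  # single sigmoid unit

# Flatten each output to one row per sample, then join along the feature axis
flat = [a.reshape(a.shape[0], -1) for a in (resnet50_out, vgg16_out, custom_out)]
features = np.concatenate(flat, axis=1)

print([f.shape[1] for f in flat])  # [100352, 25088, 1]
print(features.shape)              # (2, 125441)
```

Under these assumptions the custom model contributes a single column out of 125,441, so the two pre-trained networks dominate the ensemble input.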




We then created a simple classifier with one dense layer and a sigmoid activation function. We compiled and trained the ensemble classifier using the combined features from the pre-trained models and corresponding labels.

In [None]:
# Create the final classifier
classifier = tf.keras.Sequential()
classifier.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))

# Compile the classifier
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the classifier
classifier.fit(ensemble_features, y_train, epochs=10, batch_size=32)


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f653ce87400>
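
A single dense unit with a sigmoid activation computes a weighted sum of its inputs and squashes it into a probability, which is the same form as logistic regression. The sketch below shows that computation in plain Python; the feature values, weights and bias are made up, and `dense_sigmoid` is an illustrative helper, not a Keras function.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense_sigmoid(features, weights, bias):
    """Output of a single Dense unit with sigmoid activation."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return sigmoid(z)

# Made-up example with three features
features = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, 0.1]
bias = -0.1

prob = dense_sigmoid(features, weights, bias)
n_params = len(weights) + 1  # one weight per feature plus the bias

print(round(prob, 4))
print(n_params)
```

The classifier therefore has one trainable weight per concatenated feature plus one bias.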

We extracted the features for the validation data in the same way: each model's output was flattened and the results were concatenated, so that the validation features have the same layout as the training features.

In [None]:
# Extract features from the pre-trained models for validation data
resnet50_features_val = resnet50.predict(X_val_resized)
vgg16_features_val = vgg16.predict(X_val_resized)
custom_model_features_val = custom_model.predict(X_val_resized)

# Flatten the features
resnet50_features_val_flat = resnet50_features_val.reshape((resnet50_features_val.shape[0], -1))
vgg16_features_val_flat = vgg16_features_val.reshape((vgg16_features_val.shape[0], -1))
custom_model_features_val_flat = custom_model_features_val.reshape((custom_model_features_val.shape[0], -1))

# Concatenate the features
ensemble_features_val = np.concatenate([resnet50_features_val_flat, vgg16_features_val_flat, custom_model_features_val_flat], axis=1)



Finally we printed the accuracy, AUROC, and loss for both training and validation data.

In [None]:
# Make predictions with the classifier on training and validation data
y_pred_train = classifier.predict(ensemble_features)
y_pred = classifier.predict(ensemble_features_val)

# Calculate the accuracy for train and validation data
train_accuracy = accuracy_score(y_train, y_pred_train.round())
val_accuracy = accuracy_score(y_val, y_pred.round())

# Calculate the AUROC score for train and validation data
train_auroc = roc_auc_score(y_train, y_pred_train)
val_auroc = roc_auc_score(y_val, y_pred)

# Calculate the loss for train and validation data
train_loss = binary_crossentropy(y_train, y_pred_train).numpy().mean()
val_loss = binary_crossentropy(y_val, y_pred).numpy().mean()

# Print the evaluation metrics for train data
print("Training Data Metrics:")
print("Accuracy:", train_accuracy)
print("AUROC:", train_auroc)
print("Loss:", train_loss)
print()

# Print the evaluation metrics for validation data
print("Validation Data Metrics:")
print("Accuracy:", val_accuracy)
print("AUROC:", val_auroc)
print("Loss:", val_loss)

Training Data Metrics:
Accuracy: 0.8268059181897301
AUROC: 0.9417990540868301
Loss: 0.37872666

Validation Data Metrics:
Accuracy: 0.7735191637630662
AUROC: 0.8951824817518248
Loss: 0.5018571
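
To clarify what the three metrics measure, here is a minimal hand computation on four made-up predictions (the labels and probabilities are assumptions for illustration, not notebook data). Accuracy depends on a 0.5 threshold, AUROC only on how positives rank against negatives, and the loss on how confident each probability is.

```python
import math

# Made-up labels and predicted probabilities
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.4, 0.35, 0.1]

# Accuracy: threshold the probabilities at 0.5 and compare with the labels
y_hat = [int(p >= 0.5) for p in y_prob]
accuracy = sum(t == h for t, h in zip(y_true, y_hat)) / len(y_true)

# AUROC: share of (positive, negative) pairs ranked correctly, ties count half
pos = [p for t, p in zip(y_true, y_prob) if t == 1]
neg = [p for t, p in zip(y_true, y_prob) if t == 0]
auroc = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg) / (len(pos) * len(neg))

# Binary cross-entropy averaged over the samples
loss = -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
            for t, p in zip(y_true, y_prob)) / len(y_true)

print(accuracy)        # 0.75
print(auroc)           # 0.75
print(round(loss, 4))
```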


### **Conclusion**

We couldn't reach a satisfactory result: the validation accuracy (about 0.77) is worse than that of other models we tested, and the gap to the training accuracy (about 0.83) shows some overfitting, even though ensembling is a good technique.
It might be possible to get better results if we used other models in the ensemble.