<div style="display: flex; align-items: center; justify-content: center; flex-wrap: wrap;">
    <div style="flex: 1; max-width: 400px; display: flex; justify-content: center;">
        <img src="https://i.ibb.co/JBPWVYR/Logo-Nova-IMS-Black.png" style="max-width: 50%; height: auto; margin-top: 50px; margin-bottom: 50px;margin-left: 6rem;">
    </div>
    <div style="flex: 2; text-align: center; margin-top: 20px;margin-left: 8rem;">
        <div style="font-size: 28px; font-weight: bold; line-height: 1.2;">
            <span style="color: #22c1c3;">DL Project |</span> <span style="color: #08529C;">Predicting Rare Species from Images using Deep Learning</span>
        </div>
        <div style="font-size: 17px; font-weight: bold; margin-top: 10px;">
            Spring Semester | 2024 - 2025
        </div>
        <div style="font-size: 17px; font-weight: bold;">
            Master in Data Science and Advanced Analytics
        </div>
        <div style="margin-top: 20px;">
            <div>André Silvestre, 20240502</div>
            <div>Diogo Duarte, 20240525</div>
            <div>Filipa Pereira, 20240509</div>
            <div>Maria Cruz, 20230760</div>
            <div>Umeima Mahomed, 20240543</div>
        </div>
        <div style="margin-top: 20px; font-weight: bold;">
            Group 37
        </div>
    </div>
</div>

<div style="background: linear-gradient(to right, #22c1c3, #27b1dd, #2d9cfd, #090979);
            padding: 1px; color: white; border-radius: 500px; text-align: center;">
</div>

## **📚 Libraries Import**

In [1]:
# !pip install visualkeras

In [2]:
# System imports
import os
import sys
import time
import datetime
from tqdm import tqdm
from typing_extensions import Self, Any      # For Python 3.10 and earlier
# from typing import Self, Any               # For Python >= 3.11

from pathlib import Path

# Data manipulation imports
import numpy as np
import pandas as pd
import warnings
warnings.filterwarnings("ignore")

# Data visualization imports
import matplotlib.pyplot as plt
import seaborn as sns

# Deep learning imports
import tensorflow as tf
from keras.ops import add
from keras.losses import CategoricalCrossentropy
from tensorflow.keras.optimizers import Adam, SGD
from tensorflow.keras import Model, Sequential, Input
from tensorflow.keras.callbacks import ModelCheckpoint, CSVLogger, LearningRateScheduler, EarlyStopping
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Dropout, Rescaling, Lambda
import visualkeras

# Evaluation imports
from keras.metrics import CategoricalAccuracy, AUC, F1Score, Precision, Recall

# Other imports
from itertools import product

# pandas options
pd.set_option('future.no_silent_downcasting', True)   # opt in to the future pandas behavior (no silent downcasting)
pd.set_option("display.max_columns", None)            # display all columns

# Disable FutureWarning and UserWarning messages (warnings is already imported above)
warnings.filterwarnings('ignore', category=FutureWarning)
warnings.filterwarnings('ignore', category=UserWarning)

# Set random seed for reproducibility
np.random.seed(2025)

2025-04-11 12:17:33.229800: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1744370253.248429   10457 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1744370253.253607   10457 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1744370253.269875   10457 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1744370253.269912   10457 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1744370253.269914   10457 computation_placer.cc:177] computation placer alr

In [3]:
print("TensorFlow Version:", tf.__version__)
print("Is TensorFlow built with CUDA?", tf.test.is_built_with_cuda())
print("GPU Available:", tf.config.list_physical_devices('GPU'))
print("GPU Device Name:", tf.test.gpu_device_name())                                # If this fails in Google Colab, make sure the hardware accelerator is set to GPU:
                                                                                    # Runtime > Change runtime type > Hardware accelerator

TensorFlow Version: 2.19.0
Is TensorFlow built with CUDA? True
GPU Available: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
GPU Device Name: /device:GPU:0


I0000 00:00:1744370256.124663   10457 gpu_device.cc:2019] Created device /device:GPU:0 with 3586 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3060 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6


In [4]:
# Extra: https://www.tensorflow.org/api_docs/python/tf/config/experimental/set_memory_growth
# By default, TensorFlow pre-allocates nearly all of the GPU memory as soon as the GPU is initialized.
# Enabling memory growth makes it allocate GPU memory only as it is needed.
# Note: memory growth has to be set before the GPUs are initialized, otherwise TensorFlow raises a RuntimeError.
if tf.test.is_built_with_cuda():
    gpus = tf.config.list_physical_devices('GPU')
    if gpus:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)

In [5]:
# Import custom module for importing data, visualization, and utilities
import utilities

<div style="background: linear-gradient(to right, #22c1c3, #27b1dd, #2d9cfd, #090979);
            padding: 1px; color: white; border-radius: 500px; text-align: center;">
</div>

## **🧮 Import Databases**

In [6]:
# # Run in Google Colab to download the dataset (already split into train/val/test)
# # Source: https://stackoverflow.com/questions/25010369/wget-curl-large-file-from-google-drivez
# # Download the file from Google Drive using wget
# !wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate \
#   "https://drive.usercontent.google.com/download?id=11vkRJLP-re8E-8DWaoKeSuG66u64ez0J&export=download" -O- | \
#   sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p' > /tmp/confirm.txt

# # Read the confirmation token from the temporary file
# with open('/tmp/confirm.txt', 'r') as f:
#     confirm_token = f.read().strip()

# # Download the file using the confirmation token and cookies
# !wget --load-cookies /tmp/cookies.txt \
#   "https://drive.usercontent.google.com/download?id=11vkRJLP-re8E-8DWaoKeSuG66u64ez0J&export=download&confirm={confirm_token}" \
#   -O /content/RareSpecies_Split.zip

# # Clean up temporary files
# !rm /tmp/cookies.txt /tmp/confirm.txt

# # List files in the /content directory to verify the download
# !ls -lh /content/

# # Unzip the downloaded file
# !unzip /content/RareSpecies_Split.zip -d /content/

# # List the unzipped files to verify
# !ls -lh /content/

In [7]:
# Define the path to the data
train_dir = Path("data/RareSpecies_Split/train")
val_dir = Path("data/RareSpecies_Split/val")
test_dir = Path("data/RareSpecies_Split/test")

# For Google Colab
# train_dir = Path("/content/RareSpecies_Split/train")
# val_dir = Path("/content/RareSpecies_Split/val")
# test_dir = Path("/content/RareSpecies_Split/test")

In [8]:
# Image Generators
n_classes = 202                                     # Number of classes (we already know this based on previous notebook)
image_size = (224, 224)                             # Image size (224x224)
img_height, img_width = image_size                  # Image dimensions
batch_size = 32                                     # Batch size
input_shape = (img_height, img_width, 3)            # Input shape of the model
value_range = (0.0, 1.0)                            # Range of pixel values

In [9]:
# Get class names from directory
class_names = sorted(os.listdir(train_dir))
class_indices = {name: i for i, name in enumerate(class_names)}

# Import the image dataset from the directory
from utilities import load_images_from_directory
train_datagen, val_datagen, test_datagen = load_images_from_directory(train_dir, val_dir, test_dir,
                                                                      labels='inferred', label_mode='categorical',
                                                                      class_names=class_names, color_mode='rgb',
                                                                      batch_size=batch_size, image_size=image_size, seed=2025,
                                                                      interpolation='bilinear', crop_to_aspect_ratio=False, pad_to_aspect_ratio=False)

print(f"\nLoaded: Train ({train_datagen.cardinality().numpy() * batch_size}), "
        f"Val ({val_datagen.cardinality().numpy() * batch_size}), "
        f"Test ({test_datagen.cardinality().numpy() * batch_size})")

Found 9586 files belonging to 202 classes.


I0000 00:00:1744370257.322203   10457 gpu_device.cc:2019] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3586 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3060 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6


Found 1198 files belonging to 202 classes.
Found 1199 files belonging to 202 classes.

Loaded: Train (9600), Val (1216), Test (1216)
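
The counts in the "Loaded" line are larger than the file counts reported just above because `cardinality() * batch_size` counts every batch as full, including the last, partial one. The following is a minimal pure-Python sketch of that arithmetic, assuming the datasets keep their final partial batch (no `drop_remainder`); `reported_count` is a hypothetical helper, not part of `utilities`.

```python
import math


def reported_count(n_files: int, batch_size: int) -> int:
    """Mimic `cardinality() * batch_size`: number of batches times the batch size."""
    n_batches = math.ceil(n_files / batch_size)  # the last batch may be partial
    return n_batches * batch_size


batch_size = 32
file_counts = {"train": 9586, "val": 1198, "test": 1199}  # from the "Found ... files" lines

for split, n_files in file_counts.items():
    n_batches = math.ceil(n_files / batch_size)
    last_batch = n_files - (n_batches - 1) * batch_size
    print(f"{split}: {n_files} files in {n_batches} batches "
          f"(last batch holds {last_batch}), reported as {reported_count(n_files, batch_size)}")
```

The exact sample counts are therefore the ones in the "Found ... files" lines; the 300 training batches also match the `300/300` step counter shown during training.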


In [10]:
# Check the shape of the data: images are (batch_size, img_height, img_width, 3), labels are (batch_size, n_classes)
for x, y in train_datagen.take(1):
    print("Train batch shape:", x.shape, y.shape)
for x, y in val_datagen.take(1):
    print("Val batch shape:", x.shape, y.shape)
for x, y in test_datagen.take(1):
    print("Test batch shape:", x.shape, y.shape)

Train batch shape: (32, 224, 224, 3) (32, 202)


2025-04-11 12:17:39.243418: I tensorflow/core/framework/local_rendezvous.cc:407] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence


Val batch shape: (32, 224, 224, 3) (32, 202)


2025-04-11 12:17:39.680507: I tensorflow/core/framework/local_rendezvous.cc:407] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence


Test batch shape: (32, 224, 224, 3) (32, 202)


# <a class='anchor' id='3'></a>
<br>
<style>
@import url('https://fonts.cdnfonts.com/css/avenir-next-lt-pro?styles=29974');
</style>

<div style="background: linear-gradient(to right, #22c1c3, #27b1dd, #2d9cfd, #090979);
            padding: 10px; color: white; border-radius: 300px; text-align: center;">
    <center><h1 style="margin-left: 140px;margin-top: 10px; margin-bottom: 4px; color: white;
                       font-size: 32px; font-family: 'Avenir Next LT Pro', sans-serif;">
        <b>3 | Modeling - Baseline Model</b></h1></center>
</div>

<br><br>

# **💡 Modeling**

In [11]:
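# Workaround for SSL certificate errors when downloading the pre-trained weights.
# Note: this disables HTTPS certificate verification for the whole Python process.
import ssl
ssl._create_default_https_context = ssl._create_unverified_context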

In [12]:
from keras.applications import VGG19
from keras.layers import GlobalAveragePooling2D

# Baseline Model
class RareSpeciesCNN(Model):
    """Baseline model for rare species classification.

    Architecture: Frozen VGG19 backbone (ImageNet weights) used as a feature extractor,
                  followed by Global Average Pooling and a softmax Dense layer.
    Why: Only the final Dense layer is trained, which keeps the baseline small and limits overfitting on 202 classes.
    Alternatives: Other backbones (e.g., ResNet, EfficientNet) or fine-tuning part of the backbone.
    Allows selection of preprocessing steps like grayscale, contrast, and saturation adjustment.
    """
    def __init__(self, n_classes=202,
                 apply_grayscale=False,
                 apply_contrast=False, contrast_factor=1.5,
                 apply_saturation=False, saturation_factor=1.5):
        """Initializes the model.

        Args:
            n_classes (int): Number of output classes.
            apply_grayscale (bool): If True, convert images to grayscale.
            apply_contrast (bool): If True, adjust image contrast.
            contrast_factor (float): Factor to adjust contrast by (if apply_contrast is True).
            apply_saturation (bool): If True, adjust image saturation.
            saturation_factor (float): Factor to adjust saturation by (if apply_saturation is True).
        """
        super().__init__() # Call the parent class constructor

        # Store preprocessing flags and factors
        self.apply_grayscale = apply_grayscale
        self.apply_contrast = apply_contrast
        self.apply_saturation = apply_saturation

        # --- Preprocessing Layers ---
        # Rescaling layer (always applied)
        self.rescale_layer = Rescaling(scale= 1 / 255.0, name="Rescale_Layer")    # Rescales pixel values to [0, 1]

        # Conditionally define Lambda layer for contrast adjustment
        if self.apply_contrast:
            # Define Lambda layer for contrast adjustment
            # Source: https://keras.io/api/layers/core_layers/lambda/
            #         https://www.tensorflow.org/api_docs/python/tf/image/adjust_contrast
            #         contrast_factor > 1 increases contrast, < 1 decreases contrast
            self.contrast_layer = Lambda(
                lambda x: tf.image.adjust_contrast(x, contrast_factor=contrast_factor),
                name='Adjust_Contrast'
            )

        # Conditionally define Lambda layer for saturation adjustment
        if self.apply_saturation:
            # Define Lambda layer for saturation adjustment
            # Source: https://www.tensorflow.org/api_docs/python/tf/image/adjust_saturation
            #         saturation_factor > 1 increases saturation, < 1 decreases saturation
            self.saturation_layer = Lambda(
                lambda x: tf.image.adjust_saturation(x, saturation_factor=saturation_factor),
                name='Adjust_Saturation'
            )

        # Conditionally define Lambda layer for grayscale conversion
        if self.apply_grayscale:
            # Define Lambda layer for grayscale conversion
            # Source: https://www.tensorflow.org/api_docs/python/tf/image/rgb_to_grayscale
            #         https://www.tensorflow.org/api_docs/python/tf/image/grayscale_to_rgb
            # IMPORTANT: The VGG19 backbone expects 3 input channels, so the single
            # grayscale channel is copied back into 3 identical channels.
            self.grayscale_layer = Lambda(
                lambda x: tf.image.grayscale_to_rgb(tf.image.rgb_to_grayscale(x)),
                name='RGB_to_Grayscale'
            )


        # # --- Convolutional Layers ---
        # # Adjust the first Conv layer's input channels if grayscale is applied and not converted back to RGB
        # # If grayscale IS applied, the input to conv1 will have 1 channel.
        # # If grayscale IS NOT applied, the input will have 3 channels (after rescaling).
        # # We will handle this by checking the shape dynamically or assuming subsequent layers can handle 1 channel if needed.
        # # For simplicity here, let's assume conv1 works with either 1 or 3 channels.
        # # If grayscale is applied, the input depth is 1, otherwise 3.
        # # A more robust way might involve explicitly setting input_shape or checking channels.
        # # Let's define conv1 to work even if input is grayscale (1 channel)
        # self.conv1 = Conv2D(filters=3*8, kernel_size=(3, 3), activation='relu', name="Conv_Layer1", padding="same")    # 24 filters
        # self.pool1 = MaxPooling2D(pool_size=(2, 2), name="Max_Pool_Layer1")                                            # Reduces spatial dimensions by half

        # # Subsequent layers
        # self.conv2l = Conv2D(filters=3*16, kernel_size=(3, 3), activation='relu', name="Conv_Layer2l", padding="same") # 48 filters
        # self.conv2r = Conv2D(filters=3*16, kernel_size=(3, 3), activation='relu', name="Conv_Layer2r", padding="same") # 48 filters (parallel path example)
        # # Need to combine conv2l and conv2r, e.g., by concatenation or addition before pooling
        # # For simplicity, let's just use one path for now:
        # self.conv2 = Conv2D(filters=3*16, kernel_size=(3, 3), activation='relu', name="Conv_Layer2", padding="same") # 48 filters
        # self.pool2 = MaxPooling2D(pool_size=(2, 2), name="MaxPool_Layer2")                                            # Further reduces spatial dimensions

        # Pre-trained VGG19 model - Using as a feature extractor
        # Source: https://keras.io/api/applications/vgg/#vgg19-function
        self.vgg19 = VGG19(include_top=False, weights='imagenet')
        # Freeze the VGG19 layers to prevent training
        for layer in self.vgg19.layers:
            layer.trainable = False
            
        # Use Global Average Pooling instead of Flatten to reduce memory usage
        self.global_pool = GlobalAveragePooling2D(name="GlobalAvgPool")
        
        self.output_layer = Dense(n_classes, activation='softmax', name="Output_Layer")

    def call(self, inputs, training=False):
        """Defines the forward pass of the model.

        Args:
            inputs: Input tensor (batch of images).
            training (bool): Indicates if the model is in training mode (not used here: the backbone is frozen and there is no Dropout).

        Returns:
            Output tensor (probabilities for each class).
        """
        # Apply mandatory rescaling
        x = self.rescale_layer(inputs)

        # Apply conditional preprocessing layers
        if self.apply_contrast:
            x = self.contrast_layer(x)
        if self.apply_saturation:
            x = self.saturation_layer(x)
        if self.apply_grayscale:
            x = self.grayscale_layer(x)
            # Note: the VGG19 backbone requires a 3-channel input, so the tensor must have 3 channels at this point.

        x = self.vgg19(x)
        x = self.global_pool(x)       
        outputs = self.output_layer(x)
        return outputs

# Example Instantiation and Summary
model = RareSpeciesCNN(
    n_classes=n_classes,
    apply_grayscale=False,
    apply_contrast=False,
    apply_saturation=False
)

# Build the model by providing an input shape
inputs = Input(shape=input_shape)                       # Input shape: (img_height, img_width, 3)
_ = model(inputs)                                       # Call the model on a symbolic input to build it
model.summary()                                         # Print the model summary
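
One point worth checking in the forward pass above: the backbone receives inputs rescaled to [0, 1], whereas the ImageNet weights of VGG19 are documented by Keras as expecting "caffe"-style preprocessing (RGB converted to BGR, each channel zero-centered with the ImageNet means, no scaling), available as `keras.applications.vgg19.preprocess_input`. The sketch below reproduces that transformation for a single pixel in plain Python, only to show how different the two input ranges are; the helper names and the example pixel are hypothetical.

```python
# ImageNet channel means in BGR order, as used by the "caffe"-style preprocessing.
IMAGENET_BGR_MEANS = (103.939, 116.779, 123.68)


def caffe_style_pixel(rgb):
    """Map one RGB pixel in [0, 255] to BGR order with the ImageNet means subtracted."""
    r, g, b = rgb
    bgr = (b, g, r)
    return tuple(round(c - m, 3) for c, m in zip(bgr, IMAGENET_BGR_MEANS))


def rescaled_pixel(rgb):
    """What a Rescaling(1/255) layer produces for the same pixel."""
    return tuple(round(c / 255.0, 3) for c in rgb)


pixel = (255, 128, 0)  # hypothetical orange pixel (R, G, B)
print("caffe-style:", caffe_style_pixel(pixel))
print("rescaled:   ", rescaled_pixel(pixel))
```

Whether switching the preprocessing improves the baseline is an empirical question; the sketch only illustrates that the frozen filters would otherwise see values on a very different scale and in a different channel order than during pre-training.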

In [13]:
# # Visualize the model architecture
# # Source: https://www.kaggle.com/code/devsubhash/visualize-deep-learning-models-using-visualkeras
# visualkeras.layered_view(model,
#                          legend=True,
#                          show_dimension=True,
#                          scale_xy=1,                                        # Adjust the scale of the image
#                         #  scale_z=1,
#                          # to_file='./BaselineModel_Architecture.png',
# )

In [14]:
# Compile model
# optimizer = SGD(learning_rate=0.1, momentum=0.9, name="Optimizer")                                        # SGD with momentum
optimizer = Adam(learning_rate=0.001, name="Optimizer")                                                   # Adam for faster convergence
# optimizer = Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, amsgrad=False, name="Optimizer")          # Adam with its default parameters written out

loss = CategoricalCrossentropy(name="Loss")                            # Suitable for multi-class one-hot labels
metrics = [CategoricalAccuracy(name="accuracy"),
           Precision(name="precision"),
           Recall(name="recall"),
           F1Score(average="macro", name="f1_score"),
           AUC(name="auc")]
model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
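
When reading these metrics it helps to keep two reference points in mind. First, a uniform guess over 202 classes gives a cross-entropy of ln(202) and an accuracy of 1/202. Second, the Keras `Precision` and `Recall` metrics use a default threshold of 0.5 on each class probability, so a sample that is classified correctly but with a top probability below 0.5 counts toward accuracy and not toward recall. The following plain-Python sketch illustrates both points on made-up toy data; the helper functions are hypothetical and only imitate the thresholding logic.

```python
import math

n_classes = 202

# Reference values for a model that predicts a uniform distribution over all classes.
chance_loss = math.log(n_classes)        # categorical cross-entropy of a uniform guess
chance_accuracy = 1.0 / n_classes
print(f"chance-level loss: {chance_loss:.4f}, chance-level accuracy: {chance_accuracy:.4f}")


def thresholded_precision_recall(y_true, y_prob, threshold=0.5):
    """Precision/recall where every class probability above `threshold` is a positive prediction."""
    tp = fp = fn = 0
    for true_row, prob_row in zip(y_true, y_prob):
        for t, p in zip(true_row, prob_row):
            predicted = p > threshold
            if predicted and t == 1:
                tp += 1
            elif predicted and t == 0:
                fp += 1
            elif not predicted and t == 1:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def top1_accuracy(y_true, y_prob):
    """Fraction of samples whose highest-probability class is the true class."""
    hits = sum(row_t.index(1) == row_p.index(max(row_p)) for row_t, row_p in zip(y_true, y_prob))
    return hits / len(y_true)


# Toy data with 4 classes (made up for illustration).
y_true = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
y_prob = [[0.40, 0.30, 0.20, 0.10],   # correct, but below the 0.5 threshold
          [0.10, 0.70, 0.10, 0.10],   # correct and confident
          [0.45, 0.10, 0.35, 0.10]]   # wrong and unsure

precision, recall = thresholded_precision_recall(y_true, y_prob)
accuracy = top1_accuracy(y_true, y_prob)
print(f"accuracy={accuracy:.3f}, precision={precision:.3f}, recall={recall:.3f}")
```

This pattern (precision near 1 with very low recall while accuracy is noticeably higher) is what one would expect from a many-class softmax model that rarely assigns more than 0.5 to any single class.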

In [15]:
# Define the model name (used in the checkpoint and log file names)
model_name = f"RareSpeciesCNN_{datetime.datetime.now().strftime('%Y%m%d')}"    # Model name
print(f"\n\033[1mModel name:\033[0m {model_name}")


Model name: RareSpeciesCNN_20250411


In [16]:
# Callbacks
model_name = f"RareSpeciesCNN_{datetime.datetime.now().strftime('%Y%m%d')}"       # Model name
os.makedirs("./ModelCallbacks", exist_ok=True)                                  # Create directory if it doesn't exist
callbacks = [
    ModelCheckpoint(f"./ModelCallbacks/checkpoint_{model_name}.keras", monitor="val_loss", save_best_only=True, verbose=0),       # Save best model
    CSVLogger(f"./ModelCallbacks/metrics_{model_name}.csv"),                                                                      # Log training metrics
    LearningRateScheduler(lambda epoch, lr: lr * 0.95),                                                                           # Exponential decay for learning rate
    EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True, verbose=1)                                           # Stop training when the validation loss stops improving
]
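
The `LearningRateScheduler` above multiplies the current learning rate by 0.95 at the start of every epoch, including the first one, so the first epoch already trains at 0.95 times the optimizer's initial value. A minimal sketch of the resulting schedule in plain Python, assuming nothing else modifies the learning rate (`lr_for_epoch` is a hypothetical helper):

```python
initial_lr = 0.001   # Adam learning rate passed to the optimizer
decay = 0.95         # factor used in the LearningRateScheduler lambda


def lr_for_epoch(epoch_number: int) -> float:
    """Learning rate in effect during the 1-based `epoch_number`."""
    return initial_lr * decay ** epoch_number


for epoch in (1, 2, 11, 12, 25):
    print(f"epoch {epoch:2d}: lr = {lr_for_epoch(epoch):.4e}")
```

These values can be compared with the `learning_rate` entries that Keras appends to each epoch line of the training log.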

---

### **Original Data**

In [None]:
print(f"\n\033[1mBatch size:\033[0m {batch_size}")


Batch size: 32


In [None]:
# Train model
start_time = time.time()
history = model.fit(train_datagen, epochs=25, validation_data=val_datagen, callbacks=callbacks, verbose=1)   # The datasets are already batched, so batch_size is not passed
train_time = round(time.time() - start_time, 2)

print(f"\nTraining completed in \033[1m{train_time} seconds ({str(datetime.timedelta(seconds=train_time))} h)\033[0m.")

Epoch 1/25


I0000 00:00:1744370263.073893   10573 service.cc:152] XLA service 0x77eeb4005c40 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1744370263.073938   10573 service.cc:160]   StreamExecutor device (0): NVIDIA GeForce RTX 3060 Laptop GPU, Compute Capability 8.6
2025-04-11 12:17:43.565867: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
I0000 00:00:1744370264.315481   10573 cuda_dnn.cc:529] Loaded cuDNN version 90300
I0000 00:00:1744370276.074765   10573 device_compiler.h:188] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
2025-04-11 12:17:56.620156: W external/local_xla/xla/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 162656685 exceeds 10% of free system memory.
2025-04-11 12:17:58.742286: W external/local_xla/xla/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 150962688 exceeds 10% of free system memory.
2025-04-11 12:17:58.858175: W external/local_xla/xla/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 150962688 exceeds 10% of free system memory.
2025-04-11 12:18:05.272162: W external/local_xla/xla/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 150962688 exceeds 10% of free system memory.
2025-04-11 12:18:05.739647: W external/local_xla/xla/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 150962688 exceeds 10% of free system memory.


300/300 ━━━━━━━━━━━━━━━━━━━━ 82s 226ms/step - accuracy: 0.0457 - auc: 0.6489 - f1_score: 0.0041 - loss: 5.0928 - precision: 0.0000e+00 - recall: 0.0000e+00 - val_accuracy: 0.0902 - val_auc: 0.7410 - val_f1_score: 0.0112 - val_loss: 4.6934 - val_precision: 0.0000e+00 - val_recall: 0.0000e+00 - learning_rate: 9.5000e-04
Epoch 2/25
300/300 ━━━━━━━━━━━━━━━━━━━━ 63s 210ms/step - accuracy: 0.1050 - auc: 0.7644 - f1_score: 0.0147 - loss: 4.5958 - precision: 0.0000e+00 - recall: 0.0000e+00 - val_accuracy: 0.1194 - val_auc: 0.7840 - val_f1_score: 0.0212 - val_loss: 4.4611 - val_precision: 0.0000e+00 - val_recall: 0.0000e+00 - learning_rate: 9.0250e-04
Epoch 3/25
300/300 ━━━━━━━━━━━━━━━━━━━━ 73s 243ms/step - accuracy: 0.1362 - auc: 0.8099 - f1_score: 0.0290 - loss: 4.3386 - precision: 0.0000e+00 - recall: 0.0000e+00 - val_accuracy: 0.1444 - val_auc: 0.8099 - val_f1_score: 0.0347 - val_loss: 4.29

2025-04-11 12:32:52.233596: W tensorflow/core/kernels/data/prefetch_autotuner.cc:52] Prefetch autotuner tried to allocate 33554432 bytes after encountering the first element of size 33554432 bytes.This already causes the autotune ram budget to be exceeded. To stay within the ram budget, either increase the ram budget or reduce element size


300/300 ━━━━━━━━━━━━━━━━━━━━ 144s 479ms/step - accuracy: 0.2657 - auc: 0.9169 - f1_score: 0.1719 - loss: 3.4542 - precision: 0.9522 - recall: 0.0278 - val_accuracy: 0.2129 - val_auc: 0.8723 - val_f1_score: 0.1061 - val_loss: 3.7267 - val_precision: 0.8889 - val_recall: 0.0267 - learning_rate: 5.6880e-04
Epoch 12/25
300/300 ━━━━━━━━━━━━━━━━━━━━ 79s 263ms/step - accuracy: 0.2748 - auc: 0.9197 - f1_score: 0.1842 - loss: 3.4064 - precision: 0.9362 - recall: 0.0306 - val_accuracy: 0.2212 - val_auc: 0.8752 - val_f1_score: 0.1188 - val_loss: 3.6919 - val_precision: 0.8974 - val_recall: 0.0292 - learning_rate: 5.4036e-04
Epoch 13/25
300/300 ━━━━━━━━━━━━━━━━━━━━ 86s 288ms/step - accuracy: 0.2837 - auc: 0.9231 - f1_score: 0.1957 - loss: 3.3575 - precision: 0.9495 - recall: 0.0352 - val_accuracy: 0.2337 - val_auc: 0.8778 - val_f1_score: 0.1284 - val_loss: 3.6592 - val_precision: 0.9000 - val_reca

#### **🧪 Model Selection & 📏 Model Evaluation**

In [None]:
# Evaluate model
from utilities import plot_metrics

plot_metrics(history,
            #  file_path=f"./ModelsEvaluation/2_Training_Validation_Metrics_{datetime.datetime.now().strftime('%Y%m%d')}.png"
             )

In [None]:
# Evaluate on validation and test sets
# Note: the training metrics are read from the last training epoch; with EarlyStopping(restore_best_weights=True)
#       the restored weights may come from an earlier epoch.
train_results = {metric: history.history[metric][-1] for metric in ('accuracy', 'precision', 'recall', 'f1_score', 'auc')}
val_results = model.evaluate(val_datagen, return_dict=True, verbose=0)     # The datasets are already batched
test_results = model.evaluate(test_datagen, return_dict=True, verbose=0)

In [None]:
# Display results
from utilities import display_side_by_side, create_evaluation_dataframe

results_df = create_evaluation_dataframe(
    model_name="Baseline Model",
    variation="Default",
    train_metrics=train_results,
    val_metrics=val_results,
    test_metrics=test_results,
    train_time=train_time
)

display_side_by_side(results_df, super_title="Model Evaluation Results")

---

## **📊 Best Model - Predictions Analysis**

In [None]:
from utilities import plot_confusion_matrix

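# Plot confusion matrix for test set
# Note: a plain tf.data.Dataset has no `.classes` attribute; this call assumes the object
#       returned by `utilities.load_images_from_directory` provides it.
plot_confusion_matrix(
    y_true=test_datagen.classes,
    y_pred=model.predict(test_datagen),
    title="Confusion Matrix | Best Baseline Model",
    # file_path="./ModelsEvaluation/3_Test_Confusion_Matrix.png"
)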
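
Because the implementation of `plot_confusion_matrix` lives in `utilities` and is not shown here, it is worth making sure that true labels and predictions refer to the same images in the same order, especially if the dataset is shuffled between iterations. One way to guarantee this is to collect both in a single pass over the batches. The sketch below shows the idea in plain Python with hypothetical stand-ins (`toy_batches`, `toy_predict`) in place of `test_datagen` and `model.predict`.

```python
def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)


def collect_labels_and_predictions(batches, predict_fn):
    """Iterate once over (images, one_hot_labels) batches so that every true label
    stays paired with the prediction for the same image."""
    y_true, y_pred = [], []
    for images, one_hot_labels in batches:
        probabilities = predict_fn(images)
        y_true.extend(argmax(row) for row in one_hot_labels)
        y_pred.extend(argmax(row) for row in probabilities)
    return y_true, y_pred


def confusion_counts(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Hypothetical stand-ins for the test dataset and the model (3 classes, 3 images).
toy_batches = [
    (["img0", "img1"], [[1, 0, 0], [0, 1, 0]]),
    (["img2"], [[0, 0, 1]]),
]
toy_probabilities = {"img0": [0.7, 0.2, 0.1], "img1": [0.5, 0.3, 0.2], "img2": [0.1, 0.1, 0.8]}


def toy_predict(images):
    return [toy_probabilities[name] for name in images]


y_true, y_pred = collect_labels_and_predictions(toy_batches, toy_predict)
matrix = confusion_counts(y_true, y_pred, n_classes=3)
for row in matrix:
    print(row)
```

With the real objects, the same loop would iterate over the batched test dataset and call the model on each image batch; the resulting integer labels can then be handed to whatever plotting helper is used.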

In [None]:
# Plot 5 right and 5 wrong predictions
from utilities import plot_predictions
plot_predictions(
    model=model,
    data=test_datagen,
    n_samples=5,
    file_path=None
)

In [None]:
# # Save to CSV
# results_df.set_index('Models', inplace=True)
# results_df.to_csv("ModelsEvaluation/BaselineModelEvaluation_1_29.03.2025.csv", index=True)                 ### Change the file name before saving (index=True keeps the 'Models' index in the file)

---

In [None]:
import os
import datetime
import tensorflow as tf
from tensorflow.keras import Model, Input
from tensorflow.keras.applications import VGG19
from tensorflow.keras.layers import (GlobalAveragePooling2D, BatchNormalization,
                                     Dense, Dropout, Lambda, Rescaling)
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.losses import CategoricalCrossentropy
from tensorflow.keras.callbacks import (ModelCheckpoint, CSVLogger, EarlyStopping,
                                        LearningRateScheduler)

# Improved model class definition
class ImprovedRareSpeciesCNN(Model):
    """
    Improved Rare Species CNN using VGG19 as the backbone.

    This model includes:
      - Rescaling and optional preprocessing (contrast, saturation, grayscale adjustments).
      - No built-in data augmentation (it would have to be added separately, e.g. in the input pipeline).
      - A pre-trained VGG19 backbone with the option to fine-tune later layers.
      - Global Average Pooling to reduce the number of parameters.
      - Batch Normalization and Dropout for regularization.
      - A final Dense layer for classification (using softmax activation).

    Args:
        n_classes (int): Number of output classes.
        apply_grayscale (bool): Convert images to grayscale if True.
        apply_contrast (bool): Adjust image contrast if True.
        contrast_factor (float): Factor for contrast adjustment.
        apply_saturation (bool): Adjust image saturation if True.
        saturation_factor (float): Factor for saturation adjustment.
        fine_tune_at (int or None): If specified, unfreezes layers from this index onward in VGG19.
    """
    def __init__(self, n_classes=202,
                 apply_grayscale=False,
                 apply_contrast=False, contrast_factor=1.5,
                 apply_saturation=False, saturation_factor=1.5,
                 fine_tune_at=None):
        super().__init__()

        # Mandatory rescaling layer: scales pixel values to [0, 1]
        self.rescale_layer = Rescaling(1/255.0, name="Rescale")

        # Optional preprocessing layers
        self.contrast_layer = (Lambda(lambda x: tf.image.adjust_contrast(x, contrast_factor=contrast_factor),
                                      name="Contrast")
                               if apply_contrast else None)

        self.saturation_layer = (Lambda(lambda x: tf.image.adjust_saturation(x, saturation_factor=saturation_factor),
                                        name="Saturation")
                                 if apply_saturation else None)

        self.grayscale_layer = (Lambda(lambda x: tf.image.rgb_to_grayscale(x), name="Grayscale")
                                if apply_grayscale else None)
        # VGG19 expects 3 input channels, so after grayscale conversion
        # the single channel is converted back to 3 identical channels:
        self.grayscale_to_rgb_layer = (Lambda(lambda x: tf.image.grayscale_to_rgb(x), name="GrayscaleToRGB")
                                       if apply_grayscale else None)

        # Pre-trained VGG19 backbone (without the top classification layers)
        self.vgg19 = VGG19(include_top=False, weights='imagenet', input_shape=(224, 224, 3))
        self.vgg19.trainable = False  # Initially freeze the backbone

        # Option to fine-tune: Unfreeze layers from the specified index onward
        if fine_tune_at is not None:
            self.vgg19.trainable = True
            for layer in self.vgg19.layers[:fine_tune_at]:
                layer.trainable = False

        # Replace Flatten with Global Average Pooling to reduce parameters
        self.global_pool = GlobalAveragePooling2D(name="GlobalAvgPool")
        self.batch_norm = BatchNormalization(name="BatchNorm")
        self.dropout = Dropout(0.5, name="Dropout")
        self.output_layer = Dense(n_classes, activation='softmax', name="Predictions")

    def call(self, inputs, training=False):
        """
        Forward pass of the model.

        Args:
            inputs: Input tensor (batch of images).
            training (bool): Indicates if the model is in training mode.

        Returns:
            Output tensor (class probabilities).
        """
        # Rescale input
        x = self.rescale_layer(inputs)


        # Optional preprocessing: contrast, saturation, grayscale adjustments
        if self.contrast_layer is not None:
            x = self.contrast_layer(x)
        if self.saturation_layer is not None:
            x = self.saturation_layer(x)
        if self.grayscale_layer is not None:
            x = self.grayscale_layer(x)
            if self.grayscale_to_rgb_layer is not None:
                x = self.grayscale_to_rgb_layer(x)

        # Pass through the pre-trained backbone
        x = self.vgg19(x, training=training)

        # Global pooling, normalization, and dropout
        x = self.global_pool(x)
        x = self.batch_norm(x, training=training)
        x = self.dropout(x, training=training)

        # Final classification layer
        return self.output_layer(x)

# Instantiate the improved model
model = ImprovedRareSpeciesCNN(n_classes=202,
                               apply_grayscale=False,
                               apply_contrast=True, contrast_factor=1.2,
                               apply_saturation=True, saturation_factor=1.2,
                               fine_tune_at=15)  # Freeze layers 0-14; fine-tune from index 15 onward

# Build the model by calling it on a symbolic input tensor
inputs = Input(shape=(224, 224, 3))
_ = model(inputs)  # Use __call__ rather than .call() so Keras tracks the build

# Compile the model using SGD with momentum and a categorical crossentropy loss
optimizer = SGD(learning_rate=1e-3, momentum=0.9)
loss = CategoricalCrossentropy()
metrics = ['accuracy', tf.keras.metrics.Precision(name="precision"), tf.keras.metrics.Recall(name="recall")]

model.compile(optimizer=optimizer, loss=loss, metrics=metrics)

# Set up callbacks for training
model_name = f"ImprovedRareSpeciesCNN_{datetime.datetime.now().strftime('%Y%m%d')}"
os.makedirs("ModelCallbacks", exist_ok=True)
callbacks = [
    ModelCheckpoint(f"ModelCallbacks/checkpoint_{model_name}.keras", monitor="val_loss", save_best_only=True, verbose=1),
    CSVLogger(f"ModelCallbacks/metrics_{model_name}.csv"),
    EarlyStopping(monitor="val_loss", patience=3, restore_best_weights=True, verbose=1),
    LearningRateScheduler(lambda epoch, lr: lr * 0.95, verbose=1)
]

# Display the model summary
model.summary()
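
The `LearningRateScheduler` callback above multiplies the learning rate by 0.95 at the start of every epoch, including the first, so training never runs at the initial `1e-3`. A minimal sketch of the resulting schedule in plain Python (the helper name `decayed_lr` is hypothetical and not part of the notebook):

```python
def decayed_lr(initial_lr: float, epoch: int, factor: float = 0.95) -> float:
    """Learning rate in effect during the given 1-based epoch.

    Mirrors `lambda epoch, lr: lr * 0.95`, which Keras applies once at the
    start of each epoch, so epoch 1 already runs at initial_lr * factor.
    """
    return initial_lr * factor ** epoch


# Print the first three epochs of the schedule
for epoch in range(1, 4):
    print(f"Epoch {epoch}: lr = {decayed_lr(1e-3, epoch):.6e}")
```

Under these assumptions the first three rates are about 9.5e-4, 9.025e-4 and 8.574e-4, which agrees with the values reported in the training log. To keep the first epoch at the initial rate, the lambda could return `lr` unchanged when `epoch == 0`.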

In [None]:
# Import the image dataset from the directory
from utilities import load_images_from_directory
train_datagen_SMOTE, val_datagen, test_datagen = load_images_from_directory('/content/RareSpecies_Split/train_DataAugmentationSMOTE', val_dir, test_dir,
                                                                      labels='inferred', label_mode='categorical',
                                                                      class_names=class_names, color_mode='rgb',
                                                                      batch_size=batch_size, image_size=image_size, seed=2025,
                                                                      interpolation='bilinear', crop_to_aspect_ratio=False, pad_to_aspect_ratio=False)

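# Note: cardinality() counts batches, so these totals round up to full batches
# and can slightly exceed the file counts reported by the loader.
print(f"\nLoaded: Train ({train_datagen_SMOTE.cardinality().numpy() * batch_size}), "
      f"Val ({val_datagen.cardinality().numpy() * batch_size}), "
      f"Test ({test_datagen.cardinality().numpy() * batch_size})")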

Found 48480 files belonging to 202 classes.
Found 1198 files belonging to 202 classes.
Found 1199 files belonging to 202 classes.

Loaded: Train (48480), Val (1216), Test (1216)
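
The totals printed above are the number of batches times the batch size, so each split is rounded up to a full final batch: 1,198 validation files are reported as 1,216. A small sketch of that arithmetic, assuming `batch_size = 32` (not shown in this section, but consistent with 1,515 training steps per epoch for 48,480 images); the helper name `reported_count` is hypothetical:

```python
import math


def reported_count(n_files: int, batch_size: int) -> tuple[int, int]:
    """Return (number of batches, batches * batch_size) for a batched dataset.

    The last batch may be partial, so the product can exceed n_files.
    """
    n_batches = math.ceil(n_files / batch_size)
    return n_batches, n_batches * batch_size


# File counts taken from the loader output; batch size 32 is an assumption
for split, n_files in [("Train", 48480), ("Val", 1198), ("Test", 1199)]:
    n_batches, reported = reported_count(n_files, 32)
    print(f"{split}: {n_files} files -> {n_batches} batches -> reported {reported}")
```

The "Found ... files" lines therefore give the exact sample counts, while the `cardinality() * batch_size` figure is only an upper bound.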


In [None]:
# Train the model (batch size is already set by the tf.data pipeline, so it is not passed to fit)
start_time = time.time()
history = model.fit(train_datagen_SMOTE, epochs=25, validation_data=val_datagen, callbacks=callbacks, verbose=2)
train_time = round(time.time() - start_time, 2)

print(f"\nTraining completed in \033[1m{train_time} seconds ({datetime.timedelta(seconds=train_time)})\033[0m.")


Epoch 1: LearningRateScheduler setting learning rate to 0.0009500000451225787.
Epoch 1/25

Epoch 1: val_loss improved from inf to 5.31813, saving model to ModelCallbacks/checkpoint_ImprovedRareSpeciesCNN_20250411.keras
1515/1515 - 430s - 284ms/step - accuracy: 7.8383e-04 - loss: 5.3882 - precision: 0.0000e+00 - recall: 0.0000e+00 - val_accuracy: 0.0025 - val_loss: 5.3181 - val_precision: 0.0000e+00 - val_recall: 0.0000e+00 - learning_rate: 9.5000e-04

Epoch 2: LearningRateScheduler setting learning rate to 0.0009025000152178108.
Epoch 2/25

Epoch 2: val_loss did not improve from 5.31813
1515/1515 - 424s - 280ms/step - accuracy: 2.8878e-04 - loss: 5.3720 - precision: 0.0000e+00 - recall: 0.0000e+00 - val_accuracy: 0.0025 - val_loss: 5.3188 - val_precision: 0.0000e+00 - val_recall: 0.0000e+00 - learning_rate: 9.0250e-04

Epoch 3: LearningRateScheduler setting learning rate to 0.0008573750033974647.
Epoch 3/25


KeyboardInterrupt: 

---
