# Pretrained Model Comparison

In this notebook, we aim to identify the best-performing pretrained model.

- The same preprocessing steps and model pipeline were used for each model to ensure a fair comparison.
- To build the pretrained models, we used a pipeline similar to the one from the practical class, adapted to fit the function used for `model_from_scratch`.  
  This includes:  
  - An augmentation layer (again a simple one)
  - A rescaling layer  
  - A pretrained model without top layers  
  - A Flatten layer  
  - A Dropout layer (added by us, as we observed it could reduce overfitting)  
  - A Dense output layer

We used the following models:
- **VGG16**, as it was presented in the practical class  
- **ResNet50**, as we saw it is a robust model for complex image classification tasks  
- **MobileNetV2**, as it is known to be a very efficient and fast model

We also decided to **freeze the pretrained layers**, as it is considered good practice when using transfer learning.  
In future iterations, we plan to experiment with unfreezing the layers after the initial training phase.

# Conclusion

From this run, we can conclude that the pretrained model **MobileNetV2** performs better than the others: already after the first epoch its validation accuracy is about 0.45, compared with about 0.17 for VGG16 and 0.06 for ResNet50.  
Therefore, it is likely that we will choose this model for the next steps.

We are aware that this could be due to our pipeline being better suited to this specific model. Nevertheless, the results are promising.


# Imports

In [None]:
from google.colab import drive
import zipfile
drive.mount('/content/drive')

zip_path = '/content/drive/MyDrive/rare_species 1.zip'
extract_path = '/content/rare_species 1'
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    zip_ref.extractall(extract_path)

Mounted at /content/drive


In [None]:
import os
import shutil
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from tensorflow import data as tf_data
from tensorflow.keras import layers
from tensorflow.keras.applications import VGG16, ResNet50, MobileNetV2, Xception, DenseNet121
from tensorflow.keras.layers import Rescaling, RandAugment
from sklearn.metrics import classification_report

In [None]:
# With colab
folder_path = '/content/rare_species 1'
meta = pd.read_csv('/content/rare_species 1/metadata.csv')


# Splits

In [None]:

image_size = (224, 224)
seed = 42
batch_size = 32

train_ds, val_ds = keras.utils.image_dataset_from_directory(
    folder_path,
    validation_split=0.2,
    subset="both",
    seed=seed,
    image_size=image_size,
    batch_size=batch_size,
)


Found 11983 files belonging to 202 classes.
Using 9587 files for training.
Using 2396 files for validation.
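
As a quick sanity check, the split sizes and the number of batches per epoch can be reproduced by hand. The sketch below assumes that the validation count is the truncated product of the file count and `validation_split`, which is consistent with the numbers printed above; the variable names are ours.

```python
import math

total_files = 11983       # "Found 11983 files" in the output above
validation_split = 0.2
batch_size = 32

# Assumption: the validation size is the integer part of total * split
val_files = int(total_files * validation_split)
train_files = total_files - val_files

# A final, smaller batch still counts as one step
train_steps = math.ceil(train_files / batch_size)
val_steps = math.ceil(val_files / batch_size)

print(train_files, val_files)   # 9587 2396
print(train_steps, val_steps)   # 300 75
```

The 300 training steps agree with the `300/300` progress counter shown in the training logs.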


# Defining the different models

In this section, we create three different models. All models are built using the same architecture to allow for fair comparison. The pipeline for each model includes the following components:

- **Augmentation layer**: Applies basic random transformations to simulate data variability.
- **Rescaling layer**: Normalizes pixel values.
- **Pretrained model**: Varies between the models to test performance differences.
- **Flatten layer**: Converts the output of the convolutional base to a 1D vector.
- **Dropout layer**: Helps the model generalize better by reducing overfitting.
- **Dense output layer**: Produces softmax probabilities over the 202 classes.

Although this setup is not optimal, since different pretrained models may respond better to different image sizes and configurations, we believe this approach provides a consistent and efficient basis for comparison.
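
One concrete way in which a shared pipeline may favour a particular model is input scaling. To our knowledge, the `preprocess_input` helpers in `keras.applications` scale pixels to [-1, 1] for MobileNetV2, while for VGG16 and ResNet50 they convert RGB to BGR and subtract per-channel ImageNet means without scaling. The pure-Python sketch below mimics these conventions for a single RGB pixel to show how far `Rescaling(1./255)` is from each; the function names are illustrative and are not Keras APIs.

```python
# ImageNet channel means used by the "caffe"-style preprocessing, in BGR order
IMAGENET_BGR_MEANS = (103.939, 116.779, 123.68)

def rescale_unit(pixel):
    """Shared pipeline in this notebook: Rescaling(1./255) -> [0, 1]."""
    return tuple(c / 255.0 for c in pixel)

def scale_symmetric(pixel):
    """MobileNetV2 convention: scale to [-1, 1]."""
    return tuple(c / 127.5 - 1.0 for c in pixel)

def caffe_style(pixel):
    """VGG16 / ResNet50 convention: RGB -> BGR, subtract means, no scaling."""
    r, g, b = pixel
    return tuple(c - m for c, m in zip((b, g, r), IMAGENET_BGR_MEANS))

white = (255.0, 255.0, 255.0)
print(rescale_unit(white))     # (1.0, 1.0, 1.0)
print(scale_symmetric(white))  # (1.0, 1.0, 1.0)
print(caffe_style(white))      # values around 130 to 150, far outside [0, 1]
```

For bright pixels the [0, 1] and [-1, 1] conventions are close, whereas the caffe-style values are two orders of magnitude larger. This may partly explain why the frozen VGG16 and ResNet50 bases struggle with rescaled inputs, although we have not verified it in this notebook.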

In [None]:
# Model creation functions for different architectures
def make_model_vgg16(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)
    x = RandAugment(value_range= (0, 255))(inputs)
    x = Rescaling(1./255)(x)

    # Pretrained VGG16
    base_model = VGG16(include_top=False, input_tensor=x, weights="imagenet")
    base_model.trainable = False  # Freeze for transfer learning

    x = base_model.output
    x = layers.Flatten()(x)
    x = layers.Dropout(0.1)(x)  # To somewhat prevent overfitting, though it might be too little

    outputs = layers.Dense(num_classes, activation="softmax")(x)

    return keras.Model(inputs, outputs)

def make_model_resnet50(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)
    x = RandAugment(value_range= (0, 255))(inputs)
    x = Rescaling(1./255)(x)

    # Pretrained ResNet50
    base_model = ResNet50(include_top=False, input_tensor=x, weights="imagenet")
    base_model.trainable = False  # Freeze for transfer learning

    x = base_model.output
    x = layers.Flatten()(x)
    x = layers.Dropout(0.1)(x)  # To somewhat prevent overfitting, though it might be too little

    outputs = layers.Dense(num_classes, activation="softmax")(x)

    return keras.Model(inputs, outputs)

def make_model_mobilenetv2(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)
    x = RandAugment(value_range= (0, 255))(inputs)
    x = Rescaling(1./255)(x)

    # Pretrained MobileNetV2
    base_model = MobileNetV2(include_top=False, input_tensor=x, weights="imagenet")
    base_model.trainable = False  # Freeze for transfer learning

    x = base_model.output
    x = layers.Flatten()(x)
    x = layers.Dropout(0.1)(x)  # To somewhat prevent overfitting, though it might be too little

    outputs = layers.Dense(num_classes, activation="softmax")(x)

    return keras.Model(inputs, outputs)
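
Because each model flattens the base output before the Dense layer, the size of the trainable head depends strongly on the base. The sketch below estimates it, assuming the 7×7 feature maps that these bases usually produce for 224×224 inputs (512, 2048, and 1280 channels for VGG16, ResNet50, and MobileNetV2 respectively). These shapes are our assumption and can be verified with `model.summary()`.

```python
num_classes = 202

# Assumed output shapes (height, width, channels) of the bases without top layers
feature_maps = {
    "vgg16": (7, 7, 512),
    "resnet50": (7, 7, 2048),
    "mobilenetv2": (7, 7, 1280),
}

def dense_head_params(shape, num_classes):
    """Weights plus biases of a Dense layer fed by a flattened feature map."""
    height, width, channels = shape
    flat = height * width * channels
    return flat * num_classes + num_classes

head_params = {name: dense_head_params(shape, num_classes)
               for name, shape in feature_maps.items()}

for name, count in head_params.items():
    print(f"{name}: {count:,} trainable parameters in the Dense layer")
```

Under these assumptions the heads have roughly 5, 20, and 13 million trainable parameters, which is large relative to about 9,600 training images. Replacing `Flatten` with `GlobalAveragePooling2D` is a common alternative that would shrink the heads considerably; we have not tried it here.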

# Train and evaluate the models

For each model, we keep only the checkpoint with the best validation accuracy so that the models can be compared. A very high validation score alone does not mean that a model will be picked, since other factors such as over- or underfitting and the loss are also taken into account.
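
The trade-off described above can be made explicit by looking at the gap between training and validation accuracy at the best epoch. The sketch below uses only the first two MobileNetV2 epochs copied from the training log of this run; `best_epoch_summary` is a hypothetical helper, and the dictionary mirrors the structure of `history.history`.

```python
def best_epoch_summary(history):
    """Return the epoch with the highest val_acc and its train/val gap."""
    val_acc = history["val_acc"]
    best_index = max(range(len(val_acc)), key=val_acc.__getitem__)
    return {
        "epoch": best_index + 1,  # Keras logs epochs starting at 1
        "val_acc": val_acc[best_index],
        "gap": history["acc"][best_index] - val_acc[best_index],
        "val_loss": history["val_loss"][best_index],
    }

# First two MobileNetV2 epochs, as printed in the log of this run
mobilenet_history = {
    "acc": [0.2683, 0.7997],
    "val_acc": [0.4462, 0.4553],
    "val_loss": [19.5264, 22.9600],
}

summary = best_epoch_summary(mobilenet_history)
print(summary["epoch"], round(summary["gap"], 4))  # 2 0.3444
```

A gap of about 0.34 at epoch 2, together with a rising validation loss, already hints at overfitting even though validation accuracy is still improving.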

In [None]:
def train_and_evaluate_model(model, model_name, train_ds, val_ds, epochs=50):
    """Train and evaluate a model, saving the best version"""

    checkpoint_path = f"best_model_{model_name}.keras"
    callbacks = [
        keras.callbacks.ModelCheckpoint(
            checkpoint_path,
            save_best_only=True,
            monitor="val_acc",
            mode="max",
            verbose=1
        )
    ]

    model.compile(
        # Larger learning rate than for the model from scratch, as mentioned in class
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=False),
        metrics=[keras.metrics.SparseCategoricalAccuracy(name="acc")],
    )

    history = model.fit(
        train_ds,
        epochs=epochs,
        callbacks=callbacks,
        validation_data=val_ds,
    )

    # load the best model and make a final prediction on val
    best_model = keras.models.load_model(checkpoint_path)
    y_pred_probs = best_model.predict(val_ds)
    y_pred = np.argmax(y_pred_probs, axis=1)

    y_true = np.concatenate([y for x, y in val_ds], axis=0)

    # Print classification report
    print(f"\nClassification Report for {model_name}:")
    report = classification_report(y_true, y_pred, output_dict=True)
    print(classification_report(y_true, y_pred))

    # Return metrics and paths
    return {
        'model_name': model_name,
        'history': history.history,
        'accuracy': report['accuracy'],
        'f1_macro': report['macro avg']['f1-score'],
        'f1_weighted': report['weighted avg']['f1-score'],
        'model_path': checkpoint_path
    }

# Model comparison
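
Once all runs have finished, the dictionaries returned by `train_and_evaluate_model` can be compared side by side. A minimal sketch, assuming that result structure; `rank_results` is a hypothetical helper, and the numbers below are placeholders for illustration only, not results from this run.

```python
def rank_results(results, key="f1_macro"):
    """Sort result dictionaries from best to worst according to one metric."""
    return sorted(results, key=lambda r: r[key], reverse=True)

# Placeholder values, NOT measured results
example_results = [
    {"model_name": "model_a", "accuracy": 0.50, "f1_macro": 0.40},
    {"model_name": "model_b", "accuracy": 0.70, "f1_macro": 0.65},
    {"model_name": "model_c", "accuracy": 0.60, "f1_macro": 0.55},
]

ranked = rank_results(example_results)
for rank, result in enumerate(ranked, start=1):
    print(f"{rank}. {result['model_name']}: "
          f"accuracy={result['accuracy']:.2f}, f1_macro={result['f1_macro']:.2f}")
```

In this notebook the helper would be called as `rank_results([result_vgg16, result_resnet50, result_mobilenet])`. Ranking by macro F1 rather than accuracy is a design choice that gives every one of the 202 classes the same weight.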


# Model run

In [None]:
# We decided to run for 20 epochs, as a longer run might take too long; we hope this is enough to see patterns in the scores
epochs = 20

model_vgg16 = make_model_vgg16(input_shape=image_size + (3,), num_classes=202)
model_resnet50 = make_model_resnet50(input_shape=image_size + (3,), num_classes=202)
model_mobilenet = make_model_mobilenetv2(input_shape=image_size + (3,), num_classes=202)





print("VGG16:")
result_vgg16 = train_and_evaluate_model(
    model=model_vgg16,
    model_name="vgg16",
    train_ds=train_ds,
    val_ds=val_ds,
    epochs=epochs
)


print("Resnet50:")
result_resnet50 = train_and_evaluate_model(
    model=model_resnet50,
    model_name="resnet50",
    train_ds=train_ds,
    val_ds=val_ds,
    epochs=epochs
)


print("MobileNet:")
result_mobilenet = train_and_evaluate_model(
    model=model_mobilenet,
    model_name="mobilenetv2",
    train_ds=train_ds,
    val_ds=val_ds,
    epochs=epochs
)





Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224_no_top.h5
9406464/9406464 ━━━━━━━━━━━━━━━━━━━━ 1s 0us/step
VGG16:
Epoch 1/20
300/300 ━━━━━━━━━━━━━━━━━━━━ 0s 160ms/step - acc: 0.0964 - loss: 6.5189
Epoch 1: val_acc improved from -inf to 0.16694, saving model to best_model_vgg16.keras
300/300 ━━━━━━━━━━━━━━━━━━━━ 71s 213ms/step - acc: 0.0965 - loss: 6.5173 - val_acc: 0.1669 - val_loss: 5.5781
Epoch 2/20
300/300 ━━━━━━━━━━━━━━━━━━━━ 0s 162ms/step - acc: 0.2925 - loss: 4.2564
Epoch 2: val_acc improved from 0.16694 to 0.17362, saving model to best_model_vgg16.keras
300/300 ━━━━━━━━━━━━━━━━━━━━ 62s 204ms/step - acc: 0.2925 - loss: 4.2568 - val_acc: 0.1736 - val_loss: 5.8831
Epoch 3/20
300/300 ━━━━━━━━━━━━━━━━━

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))

300/300 ━━━━━━━━━━━━━━━━━━━━ 0s 145ms/step - acc: 0.0286 - loss: 22.3596
Epoch 1: val_acc improved from -inf to 0.06052, saving model to best_model_resnet50.keras
300/300 ━━━━━━━━━━━━━━━━━━━━ 66s 192ms/step - acc: 0.0287 - loss: 22.3446 - val_acc: 0.0605 - val_loss: 12.3880
Epoch 2/20
300/300 ━━━━━━━━━━━━━━━━━━━━ 0s 139ms/step - acc: 0.0955 - loss: 11.4062
Epoch 2: val_acc did not improve from 0.06052
300/300 ━━━━━━━━━━━━━━━━━━━━ 51s 171ms/step - acc: 0.0955 - loss: 11.4074 - val_acc: 0.0551 - val_loss: 15.6258
Epoch 3/20
300/300 ━━━━━━━━━━━━━━━━━━━━ 0s 139ms/step - acc: 0.1509 - loss: 11.0204
Epoch 3: val_acc did not improve from 0.06052
300/300 ━━━━━━━━━━━━━━━━━━━━ 51s 171ms/step - acc: 0.1509 - loss: 11.0202 - val_acc: 0.0392 - val_loss: 11.7611
Epoch 4/20
300/300 ━━━━━

300/300 ━━━━━━━━━━━━━━━━━━━━ 0s 112ms/step - acc: 0.2680 - loss: 21.1300
Epoch 1: val_acc improved from -inf to 0.44616, saving model to best_model_mobilenetv2.keras
300/300 ━━━━━━━━━━━━━━━━━━━━ 50s 151ms/step - acc: 0.2683 - loss: 21.1297 - val_acc: 0.4462 - val_loss: 19.5264
Epoch 2/20
299/300 ━━━━━━━━━━━━━━━━━━━━ 0s 114ms/step - acc: 0.7998 - loss: 4.3389
Epoch 2: val_acc improved from 0.44616 to 0.45534, saving model to best_model_mobilenetv2.keras
300/300 ━━━━━━━━━━━━━━━━━━━━ 43s 144ms/step - acc: 0.7997 - loss: 4.3408 - val_acc: 0.4553 - val_loss: 22.9600
Epoch 3/20
299/300 ━━━━━━━━━━━━━━━━━━━━ 0s 112ms/step - acc: 0.8463 - loss: 3.4365
Epoch 3: val_acc improved from 0.45534 to 0.49207, saving model to best_model_mobilenetv2.keras
300/300 ━━━━━━━━━━━━━━━━━━━━ 43s 144ms/step - acc

