# Master course in Object Recognition
## Practice 1

### Title: Deep learning advanced architectures

The goal is to practice advanced deep learning architectures for multi-label classification on the [Pascal VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/voc2007/index.html). We specifically examine ResNet50, Inception and MobileNet. We will see 1) how a ResNet50 pretrained on ImageNet performs on multi-label images, 2) how to modify the classification head, and 3) how to implement the F1 metric.

### NOTES

- Hyperparameters are modifiable.
- The dataset is PASCAL VOC 2012.
- The code uses the Keras library.
- The code can run in Google Colab.
- Fine-tuning a pretrained model (i.e. freezing the pretrained network and training the head, then fine-tuning everything) is not included.
- No validation set has been defined; the test and validation sets are the same.
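
The F1 metric mentioned in the goals can be illustrated with a small framework-free sketch. The notebook's own F1 code lives in the imported modules and is not shown here, so the function name `micro_f1`, the 0.5 decision threshold, and the choice of micro-averaging are all assumptions made for illustration.

```python
def micro_f1(y_true, y_prob, threshold=0.5):
    """Micro-averaged F1 over every (image, class) pair of a multi-label batch.

    y_true: list of rows of 0/1 labels, one row per image.
    y_prob: list of rows of sigmoid outputs with the same shape.
    threshold: assumed cut-off that turns a probability into a 0/1 prediction.
    """
    tp = fp = fn = 0
    for true_row, prob_row in zip(y_true, y_prob):
        for t, p in zip(true_row, prob_row):
            pred = 1 if p >= threshold else 0
            if pred == 1 and t == 1:
                tp += 1  # predicted present, actually present
            elif pred == 1 and t == 0:
                fp += 1  # predicted present, actually absent
            elif pred == 0 and t == 1:
                fn += 1  # missed a present class
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when nothing is positive.
    return 2 * tp / denom if denom else 0.0


# Two images, three classes (toy data, not taken from Pascal VOC).
y_true = [[1, 0, 1], [0, 1, 0]]
y_prob = [[0.9, 0.2, 0.4], [0.1, 0.8, 0.7]]
print(micro_f1(y_true, y_prob))
```

True negatives never enter the formula, which is one reason F1 can stay low in the logs while per-label accuracy is high: most of the 20 labels are absent from any given image.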

In [1]:
# Imports

import pandas as pd
import time

In [None]:
import warnings

warnings.filterwarnings("ignore", category=UserWarning)



In [1]:
# Importing from .py files

from config import *
from experiment_config import experiments
from train_and_test import train_and_test
from load_data import load_data, create_dataset
from augmentation import apply_augmentation
from imbalance_handling import (
    create_balanced_dataset,
    create_weighted_binary_crossentropy,
)
from models import create_model, setup_best_config

In [None]:
train_list = load_data(TRAIN_TXT)
test_list = load_data(TEST_TXT)

# Create dictionaries to store datasets for different batch sizes
train_datasets = {}
test_datasets = {}

start_time = time.time()
# Iterate over batch sizes and create datasets
for batch_size in BATCH_SIZES:
    train_datasets[batch_size] = create_dataset(train_list, batch_size)
    test_datasets[batch_size] = create_dataset(test_list, batch_size)
print(f"Time taken to create datasets: {time.time() - start_time} seconds")

I0000 00:00:1741482891.440283   18734 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 22455 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:07:00.0, compute capability: 8.6


Time taken to create datasets: 3.6416664123535156 seconds


In [4]:
# Run model experiments
exp_name = "model-experiments"
for exp in experiments[exp_name]:

    # Create the model
    base_model, model = create_model(exp)

    train_dataset = train_datasets[exp.batch_size]
    test_dataset = test_datasets[exp.batch_size]

    train_and_test(
        model,
        base_model,
        exp_name,
        exp,
        train_dataset,
        test_dataset,
        train_list,
        test_list,
    )

Defining model: resnet50 no-pretraining no-warmup
In training loop: resnet50 no-pretraining no-warmup
Recompiling model at epoch 0 (Optimizer changed)




Time taken for training one epoch: 126.57s
Epoch 0 training loss: 0.17, acc: 0.89, f1: 0.55, mAP: 0.73
Time taken for testing one epoch: 51.18s
Epoch 0 test loss: 0.22, acc: 0.79, f1: 0.14, mAP: 0.47
Time taken for training one epoch: 56.84s
Epoch 1 training loss: 0.09, acc: 0.96, f1: 0.72, mAP: 0.87
Time taken for testing one epoch: 40.28s
Epoch 1 test loss: 0.16, acc: 0.91, f1: 0.53, mAP: 0.73
Time taken for training one epoch: 55.10s
Epoch 2 training loss: 0.07, acc: 0.98, f1: 0.79, mAP: 0.92
Time taken for testing one epoch: 40.40s
Epoch 2 test loss: 0.14, acc: 0.93, f1: 0.65, mAP: 0.80
Time taken for training one epoch: 56.78s
Epoch 3 training loss: 0.06, acc: 0.99, f1: 0.84, mAP: 0.95
Time taken for testing one epoch: 40.43s
Epoch 3 test loss: 0.14, acc: 0.93, f1: 0.66, mAP: 0.82
Time taken for training one epoch: 55.13s
Epoch 4 training loss: 0.05, acc: 0.99, f1: 0.87, mAP: 0.96
Time taken for testing one epoch: 37.79s
Epoch 4 test loss: 0.18, acc: 0.89, f1: 0.62, mAP: 0.76
Time

  mynet(include_top=False)


In training loop: mobilenet_v2 no-pretraining no-warmup
Recompiling model at epoch 0 (Optimizer changed)
Time taken for training one epoch: 95.08s
Epoch 0 training loss: 0.14, acc: 0.92, f1: 0.62, mAP: 0.78
Time taken for testing one epoch: 47.92s
Epoch 0 test loss: 0.96, acc: 0.56, f1: 0.04, mAP: 0.61
Time taken for training one epoch: 40.56s
Epoch 1 training loss: 0.11, acc: 0.95, f1: 0.73, mAP: 0.88
Time taken for testing one epoch: 40.21s
Epoch 1 test loss: 0.86, acc: 0.67, f1: 0.23, mAP: 0.64
Time taken for training one epoch: 42.57s
Epoch 2 training loss: 0.09, acc: 0.97, f1: 0.77, mAP: 0.90
Time taken for testing one epoch: 38.89s
Epoch 2 test loss: 0.57, acc: 0.71, f1: 0.34, mAP: 0.60
Time taken for training one epoch: 41.36s
Epoch 3 training loss: 0.07, acc: 0.98, f1: 0.80, mAP: 0.92
Time taken for testing one epoch: 38.84s
Epoch 3 test loss: 0.80, acc: 0.55, f1: 0.03, mAP: 0.39
Time taken for training one epoch: 40.48s
Epoch 4 training loss: 0.07, acc: 0.98, f1: 0.82, mAP: 0.93
Time taken for testing one epoch: 38.25s
Epoch 4 test loss: 0.49, acc: 0.69, f1: 0.17, mAP: 0.46
Time 

  else mynet(weights="imagenet", include_top=False)


In training loop: mobilenet_v2 pretraining no-warmup
Recompiling model at epoch 0 (Optimizer changed)
Time taken for training one epoch: 86.55s
Epoch 0 training loss: 0.14, acc: 0.92, f1: 0.61, mAP: 0.79
Time taken for testing one epoch: 45.11s
Epoch 0 test loss: 1.04, acc: 0.57, f1: 0.05, mAP: 0.56
Time taken for training one epoch: 40.93s
Epoch 1 training loss: 0.10, acc: 0.96, f1: 0.72, mAP: 0.87
Time taken for testing one epoch: 40.00s
Epoch 1 test loss: 0.51, acc: 0.74, f1: 0.45, mAP: 0.66
Time taken for training one epoch: 40.44s
Epoch 2 training loss: 0.08, acc: 0.97, f1: 0.77, mAP: 0.90
Time taken for testing one epoch: 40.38s
Epoch 2 test loss: 0.57, acc: 0.70, f1: 0.23, mAP: 0.54
Time taken for training one epoch: 42.14s
Epoch 3 training loss: 0.07, acc: 0.98, f1: 0.80, mAP: 0.92
Time taken for testing one epoch: 39.13s
Epoch 3 test loss: 0.69, acc: 0.60, f1: 0.10, mAP: 0.39
Time taken for training one epoch: 43.30s
Epoch 4 training loss: 0.07, acc: 0.98, f1: 0.82, mAP: 0.93


  else mynet(weights="imagenet", include_top=False)


In training loop: mobilenet_v2 pretraining warmup
Freezing base model layers for warmup.
Recompiling model at epoch 0 (Optimizer changed)
Time taken for training one epoch: 48.84s
Epoch 0 training loss: 0.19, acc: 0.84, f1: 0.51, mAP: 0.69
Time taken for testing one epoch: 47.91s
Epoch 0 test loss: 0.10, acc: 0.95, f1: 0.73, mAP: 0.89
Time taken for training one epoch: 34.73s
Epoch 1 training loss: 0.09, acc: 0.96, f1: 0.75, mAP: 0.90
Time taken for testing one epoch: 39.62s
Epoch 1 test loss: 0.09, acc: 0.96, f1: 0.76, mAP: 0.90
Time taken for training one epoch: 35.63s
Epoch 2 training loss: 0.08, acc: 0.97, f1: 0.77, mAP: 0.92
Time taken for testing one epoch: 38.82s
Epoch 2 test loss: 0.09, acc: 0.96, f1: 0.76, mAP: 0.90
Unfreezing base model at epoch 3
Recompiling model at epoch 3 (Optimizer changed)
Time taken for training one epoch: 83.43s
Epoch 3 training loss: 0.13, acc: 0.93, f1: 0.65, mAP: 0.81
Time taken for testing one epoch: 45.94s
Epoch 3 test loss: 1.17, acc: 0.63, f1: 

In [7]:
# Determine the best experiment of the 9 model experiments

df = pd.read_csv(RESULTS_DIR / "model-experiments.csv")
best_id = df.loc[df["test_map"].idxmax(), "id"]

best_model_experiment_config = next(
    exp for exp in experiments["model-experiments"] if exp.id == best_id
)

best_model_experiment_config

ExperimentConfig(id=4, title='inception_v3 pretraining no-warmup', net_name=['inception_v3', 'InceptionV3'], train_from_scratch=False, warm_up=False, batch_size=32, n_epochs=12, last_layer_activation='sigmoid', learning_rate=0.001, loss='binary_crossentropy', classifier_head='default')
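
The best experiment is chosen by `test_map`, the mean average precision (mAP) on the test set. The sketch below shows one common way such a score can be computed: non-interpolated average precision per class, averaged over classes. It is not the notebook's implementation, which is not shown; the function names and the decision to skip classes without positives are assumptions, and tied scores are not treated specially.

```python
def average_precision(labels, scores):
    """Non-interpolated AP for one class: mean precision at each positive's rank."""
    # Rank samples from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precision_sum += hits / rank  # precision among the top `rank` samples
    return precision_sum / hits if hits else 0.0


def mean_average_precision(y_true, y_score):
    """Average the per-class AP over all classes that have at least one positive."""
    n_classes = len(y_true[0])
    aps = []
    for c in range(n_classes):
        labels = [row[c] for row in y_true]
        scores = [row[c] for row in y_score]
        if any(labels):  # assumption: classes with no positives are skipped
            aps.append(average_precision(labels, scores))
    return sum(aps) / len(aps) if aps else 0.0


# Three images, two classes (toy data).
y_true = [[1, 0], [0, 1], [1, 1]]
y_score = [[0.9, 0.1], [0.8, 0.7], [0.3, 0.6]]
print(mean_average_precision(y_true, y_score))
```

Because mAP depends only on how the scores rank the samples, it does not require a decision threshold, unlike accuracy and F1.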

In [9]:
# Run hyperparameter experiments

exp_name = "hyperparameter-experiments"
for exp in experiments[exp_name]:

    # Reuse the best model parameters
    exp = setup_best_config(
        exp, ["net_name", "train_from_scratch", "warm_up"], best_model_experiment_config
    )
    # Create the model
    base_model, model = create_model(exp)

    train_dataset = train_datasets[exp.batch_size]
    test_dataset = test_datasets[exp.batch_size]

    train_and_test(
        model,
        base_model,
        exp_name,
        exp,
        train_dataset,
        test_dataset,
        train_list,
        test_list,
    )

Reusing parameters from best experiment:
	net_name: ['inception_v3', 'InceptionV3']
	train_from_scratch: False
	warm_up: False
Defining model: batch_size: 16, learning_rate: 0.0001
In training loop: batch_size: 16, learning_rate: 0.0001
Recompiling model at epoch 0 (Optimizer changed)
Time taken for training one epoch: 137.30s
Epoch 0 training loss: 0.14, acc: 0.91, f1: 0.61, mAP: 0.78
Time taken for testing one epoch: 81.34s
Epoch 0 test loss: 0.10, acc: 0.96, f1: 0.77, mAP: 0.91
Time taken for training one epoch: 49.61s
Epoch 1 training loss: 0.07, acc: 0.98, f1: 0.80, mAP: 0.93
Time taken for testing one epoch: 66.04s
Epoch 1 test loss: 0.09, acc: 0.96, f1: 0.78, mAP: 0.91
Time taken for training one epoch: 47.41s
Epoch 2 training loss: 0.05, acc: 0.99, f1: 0.87, mAP: 0.96
Time taken for testing one epoch: 67.33s
Epoch 2 test loss: 0.10, acc: 0.95, f1: 0.79, mAP: 0.91
Time taken for training one epoch: 47.31s
Epoch 3 training loss: 0.03, acc: 1.00, f1: 0.91, mAP: 0.98
Time taken for

Time taken for training one epoch: 139.37s
Epoch 0 training loss: 0.20, acc: 0.85, f1: 0.51, mAP: 0.69
Time taken for testing one epoch: 42.18s
Epoch 0 test loss: 0.09, acc: 0.96, f1: 0.76, mAP: 0.90
Time taken for training one epoch: 44.18s
Epoch 1 training loss: 0.07, acc: 0.98, f1: 0.80, mAP: 0.93
Time taken for testing one epoch: 27.58s
Epoch 1 test loss: 0.08, acc: 0.97, f1: 0.79, mAP: 0.92
Time taken for training one epoch: 44.66s
Epoch 2 training loss: 0.05, acc: 0.99, f1: 0.86, mAP: 0.96
Time taken for testing one epoch: 27.63s
Epoch 2 test loss: 0.08, acc: 0.97, f1: 0.79, mAP: 0.92
Time taken for training one epoch: 43.96s
Epoch 3 training loss: 0.03, acc: 1.00, f1: 0.91, mAP: 0.98
Time taken for testing one epoch: 28.30s
Epoch 3 test loss: 0.09, acc: 0.96, f1: 0.80, mAP: 0.92
Time taken for training one epoch: 43.88s
Epoch 4 training loss: 0.02, acc: 1.00, f1: 0.94, mAP: 0.99
Time taken for testing one epoch: 28.36s
Epoch 4 test loss: 0.10, acc: 0.95, f1: 0.79, mAP: 0.91
Time

In [10]:
# Determine the best experiment of the 9 hyperparameter experiments

df = pd.read_csv(RESULTS_DIR / "hyperparameter-experiments.csv")
best_id = df.loc[df["test_map"].idxmax(), "id"]

best_hyperparameter_experiment_config = next(
    exp for exp in experiments["hyperparameter-experiments"] if exp.id == best_id
)

best_hyperparameter_experiment_config

ExperimentConfig(id=12, title='batch_size: 32, learning_rate: 0.0001', net_name=['inception_v3', 'InceptionV3'], train_from_scratch=False, warm_up=False, batch_size=32, n_epochs=12, last_layer_activation='sigmoid', learning_rate=0.0001, loss='binary_crossentropy', classifier_head='default')
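
The hyperparameter cell above carries settings forward from the winning model experiment through `setup_best_config`, whose implementation in `models.py` is not shown. A minimal sketch of the idea, assuming the configuration is a dataclass, could look like this. `Config` is a hypothetical, shortened stand-in for `ExperimentConfig`, and `copy_fields` is a hypothetical name rather than the notebook's function.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Config:
    """Hypothetical, shortened stand-in for the notebook's ExperimentConfig."""
    title: str
    net_name: str
    warm_up: bool
    batch_size: int
    learning_rate: float


def copy_fields(target, field_names, source):
    """Return a copy of `target` whose named fields are taken from `source`."""
    overrides = {name: getattr(source, name) for name in field_names}
    return replace(target, **overrides)


best = Config("inception_v3 pretraining no-warmup", "inception_v3", False, 32, 0.001)
exp = Config("batch_size: 16, learning_rate: 0.0001", "resnet50", True, 16, 0.0001)

# Only the architecture-related fields are overwritten; the
# hyperparameters under test (batch size, learning rate) are kept.
exp = copy_fields(exp, ["net_name", "warm_up"], best)
print(exp)
```

Returning a new object instead of mutating the experiment in place keeps the entries in the experiment list unchanged between runs.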

In [None]:
# Run augmentation experiments

exp_name = "augmentation-experiments"
for exp in experiments[exp_name]:

    # Reuse the best model settings and hyperparameters
    exp = setup_best_config(
        exp, ["net_name", "train_from_scratch", "warm_up"], best_model_experiment_config
    )
    exp = setup_best_config(
        exp, ["batch_size", "learning_rate"], best_hyperparameter_experiment_config
    )
    # Create the model
    base_model, model = create_model(exp)

    train_dataset = train_datasets[exp.batch_size]

    train_dataset = apply_augmentation(train_dataset, exp.augmentation)

    test_dataset = test_datasets[exp.batch_size]

    train_and_test(
        model,
        base_model,
        exp_name,
        exp,
        train_dataset,
        test_dataset,
        train_list,
        test_list,
    )

Defining model: Augmentation: simple
Reusing from best hyperparameter experiment:
	net_name: ['resnet50', 'ResNet50'],
	train_from_scratch: False,
	warm_up: True,
	batch_size: 32,
	learning_rate: 0.001,
	loss: binary_crossentropy,
	last_layer_activation: sigmoid
In training loop: Augmentation: simple
Freezing base model layers for warmup.
Recompiling model at epoch 0 (Optimizer changed)
y true shape (32, 20)
y true shape (32, 20)
Time taken for training one epoch: 30.18s
Epoch 0 training loss: 0.21, acc: 0.76, f1: 0.42, mAP: 0.18
y true shape (32, 20)
Time taken for testing one epoch: 12.15s
Epoch 0 test loss: 0.20, acc: 0.77, f1: 0.34, mAP: 0.20
Training (Augmentation: simple) finished in: 42.34 seconds
Results saved to /root/mai-object-recognition/practicals/p1/data/02_results/augmentation-experiments.csv
History saved to /root/mai-object-recognition/practicals/p1/data/01_histories/resnet50-18-train_loss.csv
History saved to /root/mai-object-recognition/practicals/p1/data/01_historie

In [9]:
# Run imbalance handling experiments

exp_name = "imbalance-experiments"
for exp in experiments[exp_name]:

    # Reuse the best model settings and hyperparameters
    exp = setup_best_config(
        exp, ["net_name", "train_from_scratch", "warm_up"], best_model_experiment_config
    )
    exp = setup_best_config(
        exp, ["batch_size", "learning_rate"], best_hyperparameter_experiment_config
    )
    # Create the model
    base_model, model = create_model(exp)
    if exp.imbalance in ("loss", "all"):
        weighted_loss = create_weighted_binary_crossentropy(train_list)

        # Recompile the model with the weighted loss. The optimizer argument is
        # left out, so Keras falls back to its default optimizer for this
        # compile call.
        model.compile(
            loss=weighted_loss,
            # optimizer=model.optimizer,
            metrics=model.metrics,  # Keep the same metrics
        )

    if exp.imbalance in ("batch", "all"):
        train_dataset = create_balanced_dataset(train_list, is_training=True)
    else:
        train_dataset = train_datasets[exp.batch_size]
    test_dataset = test_datasets[exp.batch_size]

    train_and_test(
        model,
        base_model,
        exp_name,
        exp,
        train_dataset,
        test_dataset,
        train_list,
        test_list,
    )

Defining model: Imbalance handling: loss


In training loop: Imbalance handling: loss
Freezing base model layers for warmup.
Recompiling model at epoch 0 (Optimizer changed)
y true shape (32, 20)
y true shape (32, 20)
Time taken for training one epoch: 30.15s
Epoch 0 training loss: 0.21, acc: 0.76, f1: 0.42, mAP: 0.17
y true shape (32, 20)
Time taken for testing one epoch: 12.29s
Epoch 0 test loss: 0.20, acc: 0.78, f1: 0.37, mAP: 0.19
Training (Imbalance handling: loss) finished in: 42.44 seconds
Results saved to /root/mai-object-recognition/practicals/p1/data/02_results/imbalance-experiments.csv
History saved to /root/mai-object-recognition/practicals/p1/data/01_histories/resnet50-22-train_loss.csv
History saved to /root/mai-object-recognition/practicals/p1/data/01_histories/resnet50-22-train_acc.csv
History saved to /root/mai-object-recognition/practicals/p1/data/01_histories/resnet50-22-train_f1.csv
History saved to /root/mai-object-recognition/practicals/p1/data/01_histories/resnet50-22-train_map.csv
History saved to /root/
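
The `loss` variant of imbalance handling swaps in a weighted binary cross-entropy built from the training labels. The body of `create_weighted_binary_crossentropy` is not shown, so the following is only a framework-free sketch of one common scheme: each class gets a positive weight equal to its ratio of negatives to positives. The function names and the weighting rule are assumptions.

```python
import math


def positive_class_weights(label_rows):
    """Per-class weight = negatives / positives, so rare classes weigh more."""
    n = len(label_rows)
    n_classes = len(label_rows[0])
    weights = []
    for c in range(n_classes):
        pos = sum(row[c] for row in label_rows)
        # Assumption: fall back to 1.0 for a class that never occurs.
        weights.append((n - pos) / pos if pos else 1.0)
    return weights


def weighted_bce(y_true, y_prob, pos_weights, eps=1e-7):
    """Mean binary cross-entropy with the positive term scaled per class."""
    total = 0.0
    count = 0
    for true_row, prob_row in zip(y_true, y_prob):
        for t, p, w in zip(true_row, prob_row, pos_weights):
            p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
            total += -(w * t * math.log(p) + (1 - t) * math.log(1 - p))
            count += 1
    return total / count


# Four images, two classes: class 0 is common, class 1 is rare (toy data).
labels = [[1, 0], [1, 0], [0, 1], [1, 0]]
weights = positive_class_weights(labels)
print(weights)
print(weighted_bce([[0, 1]], [[0.5, 0.5]], weights))
```

With all weights set to 1.0 the function reduces to the ordinary binary cross-entropy, so the weights only change how strongly a missed positive is penalized.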

In [11]:
# Run classifier head experiments

exp_name = "classfier_head-experiments"
for exp in experiments[exp_name]:

    # Reuse the best model settings and hyperparameters
    exp = setup_best_config(
        exp, ["net_name", "train_from_scratch", "warm_up"], best_model_experiment_config
    )
    exp = setup_best_config(
        exp, ["batch_size", "learning_rate"], best_hyperparameter_experiment_config
    )
    # Create the model
    base_model, model = create_model(exp)

    train_dataset = train_datasets[exp.batch_size]
    test_dataset = test_datasets[exp.batch_size]

    train_and_test(
        model,
        base_model,
        exp_name,
        exp,
        train_dataset,
        test_dataset,
        train_list,
        test_list,
    )

Reusing parameters from best experiment:
	net_name: ['inception_v3', 'InceptionV3']
	train_from_scratch: False
	warm_up: False
Reusing parameters from best experiment:
	batch_size: 32
	learning_rate: 0.0001
Defining model: classifier_head: ensemble
In training loop: classifier_head: ensemble
Recompiling model at epoch 0 (Optimizer changed)
Time taken for training one epoch: 134.23s
Epoch 0 training loss: 0.20, acc: 0.85, f1: 0.51, mAP: 0.69
Time taken for testing one epoch: 57.35s
Epoch 0 test loss: 0.09, acc: 0.97, f1: 0.75, mAP: 0.90
Time taken for training one epoch: 45.55s
Epoch 1 training loss: 0.08, acc: 0.98, f1: 0.77, mAP: 0.92
Time taken for testing one epoch: 42.37s
Epoch 1 test loss: 0.08, acc: 0.97, f1: 0.79, mAP: 0.92
Time taken for training one epoch: 47.18s
Epoch 2 training loss: 0.06, acc: 0.99, f1: 0.84, mAP: 0.95
Time taken for testing one epoch: 43.93s
Epoch 2 test loss: 0.08, acc: 0.97, f1: 0.79, mAP: 0.92
Time taken for training one epoch: 46.89s
Epoch 3 training loss: 0.04, acc: 0.99, f1: 0.88, mAP: 0.97
Time taken for testing one epoch: 41.53s
Epoch 3 test loss: 0.09, acc: 0.96, f1: 0.80, mAP: 0.92
Time taken for training one epoch: 44.49s
Epoch 4 training loss: 0.03, acc: 1.00, f1: 0.92, mAP: 0.99
Time taken for testing one epoch: 42.06s
Epoch 4 test loss: 0.09, acc: 0.96, f1: 0.79, mAP: 0.92
Time

Time taken for training one epoch: 135.50s
Epoch 0 training loss: 0.15, acc: 0.91, f1: 0.61, mAP: 0.78
Time taken for testing one epoch: 63.10s
Epoch 0 test loss: 0.09, acc: 0.96, f1: 0.77, mAP: 0.90
Time taken for training one epoch: 42.83s
Epoch 1 training loss: 0.07, acc: 0.98, f1: 0.82, mAP: 0.94
Time taken for testing one epoch: 37.69s
Epoch 1 test loss: 0.09, acc: 0.96, f1: 0.78, mAP: 0.90
Time taken for training one epoch: 44.15s
Epoch 2 training loss: 0.04, acc: 0.99, f1: 0.89, mAP: 0.98
Time taken for testing one epoch: 41.02s
Epoch 2 test loss: 0.09, acc: 0.96, f1: 0.78, mAP: 0.91
Time taken for training one epoch: 43.86s
Epoch 3 training loss: 0.03, acc: 1.00, f1: 0.94, mAP: 0.99
Time taken for testing one epoch: 38.98s
Epoch 3 test loss: 0.10, acc: 0.95, f1: 0.78, mAP: 0.90
Time taken for training one epoch: 44.66s
Epoch 4 training loss: 0.02, acc: 1.00, f1: 0.96, mAP: 1.00
Time taken for testing one epoch: 39.45s
Epoch 4 test loss: 0.11, acc: 0.94, f1: 0.78, mAP: 0.90
Time taken for training one epoch: 45.46s
Epoch 5 training loss: 0.01, acc: 1.00, f1: 0.97, mAP: 1.00
Time 