# Master course in Object Recognition
## Practice 1

### Title: Deep learning advanced architectures

The goal is to practice advanced deep learning architectures for multi-label classification on the [Pascal VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/voc2007/index.html). We specifically evaluate ResNet50, Inception and MobileNet. We will see 1) how a ResNet50 pretrained on ImageNet performs on multi-label images, 2) how to modify the classification head, and 3) how to implement the F1 metric.
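As a rough illustration of the F1 metric mentioned above, the following NumPy sketch computes a micro-averaged F1 score for multi-label predictions. The function name `multilabel_f1`, the 0.5 threshold and the micro averaging are assumptions made for this example; the course code in `train_and_test` may define the metric differently.

```python
import numpy as np


def multilabel_f1(y_true, y_prob, threshold=0.5, eps=1e-7):
    """Micro-averaged F1 over all (image, class) pairs.

    y_true: array of shape (n_images, n_classes) with 0/1 labels.
    y_prob: array of the same shape with sigmoid outputs in [0, 1].
    threshold and eps are illustrative defaults, not values from the course code.
    """
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = (np.asarray(y_prob) >= threshold).astype(np.float64)

    tp = (y_true * y_pred).sum()          # predicted 1, label 1
    fp = ((1.0 - y_true) * y_pred).sum()  # predicted 1, label 0
    fn = (y_true * (1.0 - y_pred)).sum()  # predicted 0, label 1

    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    return float(2.0 * precision * recall / (precision + recall + eps))


# Two images, three classes (toy data)
labels = np.array([[1, 0, 1], [0, 1, 0]])
probs = np.array([[0.9, 0.2, 0.4], [0.1, 0.8, 0.7]])
score = multilabel_f1(labels, probs)
```

Because most of the 20 class labels are negative for any given image, element-wise accuracy can look high even when few positives are found, which is why F1 is reported alongside it.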

### NOTES

- Hyperparameters are modifiable.
- The dataset is PASCAL VOC 2012.
- The code uses the Keras library.
- The code can run in Google Colab.
- Fine-tuning a pretrained model (i.e. freezing the pretrained network and training the head, then fine-tuning everything) is not included.
- No separate validation set has been defined; the test and validation sets are the same.
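The two-phase recipe named in the notes (freeze the pretrained network and train the head, then train everything) can be sketched in Keras roughly as follows. This is a minimal sketch under stated assumptions, not the course's `create_model`: the helper names `build_model` and `compile_phase`, the 96x96 input size and the second-phase learning rate are all hypothetical choices for illustration, and input preprocessing is omitted.

```python
from tensorflow import keras

NUM_CLASSES = 20  # Pascal VOC has 20 object classes


def build_model(weights=None, input_shape=(96, 96, 3)):
    """Return (base_model, model) with a sigmoid multi-label head.

    Pass weights="imagenet" to start from pretrained weights;
    the default None avoids a download and is only for quick checks.
    """
    base_model = keras.applications.MobileNetV2(
        weights=weights, include_top=False, input_shape=input_shape
    )
    inputs = keras.Input(shape=input_shape)
    # training=False keeps BatchNorm layers in inference mode,
    # including after the backbone is unfrozen
    x = base_model(inputs, training=False)
    x = keras.layers.GlobalAveragePooling2D()(x)
    # One independent sigmoid per class, since labels are not mutually exclusive
    outputs = keras.layers.Dense(NUM_CLASSES, activation="sigmoid")(x)
    return base_model, keras.Model(inputs, outputs)


def compile_phase(base_model, model, freeze_base, learning_rate):
    """Freeze or unfreeze the backbone, then re-compile the model."""
    base_model.trainable = not freeze_base
    # Re-compiling makes the change in trainable weights take effect
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss="binary_crossentropy",
    )


# Phase 1: train only the head.
# base_model, model = build_model(weights="imagenet")
# compile_phase(base_model, model, freeze_base=True, learning_rate=1e-3)
# model.fit(train_dataset, epochs=2)
#
# Phase 2: unfreeze and continue with a smaller (assumed) learning rate.
# compile_phase(base_model, model, freeze_base=False, learning_rate=1e-5)
# model.fit(train_dataset, epochs=10)
```

A smaller learning rate in the second phase is a common precaution so that the pretrained features are not overwritten by large early updates.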

In [2]:
# Imports

import pandas as pd
import time

In [3]:
# Importing from .py files

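# Import only the names this notebook uses instead of a wildcard import
from config import BATCH_SIZES, RESULTS_DIR, TEST_TXT, TRAIN_TXT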
from experiment_config import experiments
from train_and_test import train_and_test
from load_data import load_data, create_dataset
from models import create_model

In [None]:
train_list = load_data(TRAIN_TXT)
test_list = load_data(TEST_TXT)

# Create dictionaries to store datasets for different batch sizes
train_datasets = {}
test_datasets = {}

start_time = time.time()
# Iterate over batch sizes and create datasets
for batch_size in BATCH_SIZES:
    train_datasets[batch_size] = create_dataset(
        train_list, batch_size, is_training=True
    )
    test_datasets[batch_size] = create_dataset(test_list, batch_size, is_training=False)
print(f"Time taken to create datasets: {time.time() - start_time} seconds")

I0000 00:00:1741335426.640641 1486723 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10147 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:03:00.0, compute capability: 8.6


Time taken to create datasets: 1.858973503112793 seconds


In [None]:
# Run model experiments
exp_name = "model-experiments"
for exp in experiments[exp_name]:
    
    # Create the model
    base_model, model = create_model(exp, exp_name)

    train_dataset = train_datasets[exp.batch_size]
    test_dataset = test_datasets[exp.batch_size]

    train_and_test(
        model, exp_name, exp, train_dataset, test_dataset, train_list, test_list
    )

Defining model: mobilenet_v2 pretraining no-warmup




In training loop: mobilenet_v2 pretraining no-warmup


I0000 00:00:1741326108.741813  613451 service.cc:148] XLA service 0x200f21a0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1741326108.741865  613451 service.cc:156]   StreamExecutor device (0): NVIDIA GeForce RTX 3080 Ti, Compute Capability 8.6
2025-03-07 05:41:49.108484: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:268] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
I0000 00:00:1741326110.602709  613451 cuda_dnn.cc:529] Loaded cuDNN version 90300
I0000 00:00:1741326129.434566  613451 device_compiler.h:188] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.


Time taken for training one epoch: 54.63s
Epoch 0 training loss: 0.14, acc: 0.92, f1: 0.63, mAP: 0.78
Time taken for testing one epoch: 22.80s
Epoch 0 test loss: 0.55, acc: 0.74, f1: 0.48, mAP: 0.65
Time taken for training one epoch: 44.72s
Epoch 1 training loss: 0.09, acc: 0.97, f1: 0.74, mAP: 0.88
Time taken for testing one epoch: 22.47s
Epoch 1 test loss: 0.45, acc: 0.77, f1: 0.50, mAP: 0.66
Time taken for training one epoch: 44.62s
Epoch 2 training loss: 0.07, acc: 0.98, f1: 0.80, mAP: 0.92
Time taken for testing one epoch: 22.65s
Epoch 2 test loss: 0.45, acc: 0.73, f1: 0.24, mAP: 0.55
Time taken for training one epoch: 44.71s
Epoch 3 training loss: 0.06, acc: 0.99, f1: 0.83, mAP: 0.94
Time taken for testing one epoch: 22.61s
Epoch 3 test loss: 0.27, acc: 0.82, f1: 0.51, mAP: 0.67
Time taken for training one epoch: 44.86s
Epoch 4 training loss: 0.05, acc: 0.99, f1: 0.86, mAP: 0.95
Time taken for testing one epoch: 22.54s
Epoch 4 test loss: 0.38, acc: 0.79, f1: 0.53, mAP: 0.65
Time 

In [4]:
# Determine the best experiment of the 9 model experiments

df = pd.read_csv(RESULTS_DIR / "model-experiments.csv")
best_id = df.loc[df["Test mAP"].idxmax(), "ID"]

best_model_experiment_config = next(
    exp for exp in experiments["model-experiments"] if exp.id == best_id
)

best_model_experiment_config

ExperimentConfig(id=2, title='resnet50 pretraining warmup', net_name=['resnet50', 'ResNet50'], train_from_scratch=False, warm_up=True, batch_size=32, n_epochs=12, last_layer_activation='sigmoid', learning_rate=0.001, loss='binary_crossentropy')
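The selection above ranks experiments by the `Test mAP` column. As a reminder of what that number measures, here is a minimal NumPy sketch of mean average precision for multi-label outputs. It assumes the non-interpolated definition (precision averaged over the ranks of the positive samples, then averaged over classes); the course's `train_and_test` may use a different variant, such as VOC-style interpolation, and the function names here are hypothetical.

```python
import numpy as np


def average_precision(y_true, y_score):
    """Non-interpolated AP for one class (ties broken by input order)."""
    y_true = np.asarray(y_true, dtype=np.float64)
    order = np.argsort(-np.asarray(y_score, dtype=np.float64), kind="stable")
    hits = y_true[order]                 # labels sorted by descending score
    n_pos = hits.sum()
    if n_pos == 0:
        return float("nan")              # AP is undefined without positives
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    # Average the precision values at the ranks where a positive occurs
    return float((precision_at_k * hits).sum() / n_pos)


def mean_average_precision(y_true, y_score):
    """Mean of per-class AP, skipping classes with no positive sample."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    aps = [
        average_precision(y_true[:, c], y_score[:, c])
        for c in range(y_true.shape[1])
    ]
    return float(np.nanmean(aps))


# Toy check: one class, four images
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

Unlike F1, this metric does not depend on a decision threshold, which makes it a reasonable criterion for comparing experiments.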

In [5]:
# Run hyperparameter experiments

exp_name = "hyperparameter-experiments"
for exp in experiments[exp_name]:

    # Create the model
    base_model, model = create_model(exp, exp_name, best_model_experiment_config)

    train_dataset = train_datasets[exp.batch_size]
    test_dataset = test_datasets[exp.batch_size]

    train_and_test(
        model, exp_name, exp, train_dataset, test_dataset, train_list, test_list
    )

Defining model: batch_size: 64, learning_rate: 0.001
Reusing net_name: ['resnet50', 'ResNet50'], train_from_scratch: False, warm_up: True from best model experiment


In training loop: batch_size: 64, learning_rate: 0.001




Time taken for training one epoch: 84.77s
Epoch 0 training loss: 0.17, acc: 0.89, f1: 0.57, mAP: 0.74
Time taken for testing one epoch: 21.74s
Epoch 0 test loss: 0.27, acc: 0.58, f1: 0.00, mAP: 0.20
Time taken for training one epoch: 68.98s
Epoch 1 training loss: 0.08, acc: 0.98, f1: 0.78, mAP: 0.91
Time taken for testing one epoch: 18.13s
Epoch 1 test loss: 0.25, acc: 0.82, f1: 0.12, mAP: 0.54
Unfreezing base model at epoch 2
Time taken for training one epoch: 68.72s
Epoch 2 training loss: 0.05, acc: 0.99, f1: 0.87, mAP: 0.96
Time taken for testing one epoch: 18.50s
Epoch 2 test loss: 0.30, acc: 0.82, f1: 0.35, mAP: 0.62
Time taken for training one epoch: 68.95s
Epoch 3 training loss: 0.04, acc: 0.99, f1: 0.90, mAP: 0.98
Time taken for testing one epoch: 18.42s
Epoch 3 test loss: 0.17, acc: 0.91, f1: 0.67, mAP: 0.81
Time taken for training one epoch: 67.92s
Epoch 4 training loss: 0.03, acc: 1.00, f1: 0.92, mAP: 0.98
Time taken for testing one epoch: 18.50s
Epoch 4 test loss: 0.25, acc: 0.86, f1: 0.56, mAP: 0.72
Time taken for training one epoch: 69.07s
Epoch 5 training loss: 0.02, acc

In [4]:
# Determine the best experiment of the 9 hyperparameter experiments

df = pd.read_csv(RESULTS_DIR / "hyperparameter-experiments.csv")
best_id = df.loc[df["Test mAP"].idxmax(), "ID"]

best_hyperparameter_experiment_config = next(
    exp for exp in experiments["hyperparameter-experiments"] if exp.id == best_id
)

best_hyperparameter_experiment_config

ExperimentConfig(id=15, title='batch_size: 64, learning_rate: 0.001', net_name=['resnet50', 'ResNet50'], train_from_scratch=False, warm_up=True, batch_size=64, n_epochs=12, last_layer_activation='sigmoid', learning_rate=0.001, loss='binary_crossentropy')

In [None]:
# Run augmentation experiments

exp_name = "augmentation-experiments"
for exp in experiments[exp_name]:

    # Create the model
    base_model, model = create_model(exp, exp_name, best_hyperparameter_experiment_config)

    train_dataset = train_datasets[exp.batch_size]
    test_dataset = test_datasets[exp.batch_size]

    train_and_test(
        model, exp_name, exp, train_dataset, test_dataset, train_list, test_list
    )

In [None]:
# Run classifier head experiments

exp_name = "classfier_head-experiments"
for exp in experiments[exp_name]:

    # Create the model
    base_model, model = create_model(exp, exp_name, best_hyperparameter_experiment_config)

    train_dataset = train_datasets[exp.batch_size]
    test_dataset = test_datasets[exp.batch_size]

    train_and_test(
        model, exp_name, exp, train_dataset, test_dataset, train_list, test_list
    )