# 1. Data & Model Loading

This notebook prepares the data and models used by the subsequent optimisation pipeline. It emulates the training and evaluation of an uncompressed model: the model is adapted to a specific dataset and then exported so that it can later be compressed for embedded deployment.

The process is as follows:
* A Torch dataset (already split into training and validation sets) and a model are loaded. Both must target a classification task, but they can be of any modality.
* The model's classification head is adapted to the number of classes in the dataset, trained on the training set while the backbone is frozen, and evaluated on the validation set.
* The whole model (backbone + classification head) is then fine-tuned on the training set with all layers unfrozen, and evaluated on the validation set.
* The adapted model is then exported as a Torch model for later use in the optimisation pipeline.
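
The two training phases listed above (head-only training with a frozen backbone, followed by full-model fine-tuning) can be sketched in plain PyTorch. This is a minimal illustration, not the implementation of `adapt_model_head_to_dataset`: `TinyClassifier`, `replace_head`, `set_backbone_trainable` and `trainable_parameters` are hypothetical names, and the tiny linear layers only stand in for a real pretrained backbone. The learning rates mirror the values used later in this notebook.

```python
import torch
from torch import nn


class TinyClassifier(nn.Module):
    """Hypothetical stand-in for a pretrained backbone plus classification head."""

    def __init__(self, num_features: int = 16, num_classes: int = 1000):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(8, num_features), nn.ReLU())
        self.classifier = nn.Linear(num_features, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


def replace_head(model: TinyClassifier, num_classes: int) -> None:
    """Swap the classification head for a fresh one sized to the dataset."""
    model.classifier = nn.Linear(model.classifier.in_features, num_classes)


def set_backbone_trainable(model: TinyClassifier, trainable: bool) -> None:
    """Freeze (False) or unfreeze (True) every backbone parameter."""
    for param in model.features.parameters():
        param.requires_grad = trainable


def trainable_parameters(model: nn.Module) -> list:
    """Return only the parameters that will receive gradient updates."""
    return [p for p in model.parameters() if p.requires_grad]


model = TinyClassifier()
replace_head(model, num_classes=10)

# Phase 1: only the new head is trainable.
set_backbone_trainable(model, False)
head_optimizer = torch.optim.Adam(trainable_parameters(model), lr=1e-3)
head_param_count = len(trainable_parameters(model))

# Phase 2: unfreeze the backbone and fine-tune everything at a lower rate.
set_backbone_trainable(model, True)
fine_tune_optimizer = torch.optim.Adam(trainable_parameters(model), lr=1e-4)
full_param_count = len(trainable_parameters(model))
```

Building a separate optimizer per phase keeps the frozen parameters out of the first optimizer entirely, and makes it easy to use a smaller learning rate once the pretrained weights start to move.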

Two models are exported:
* An image MobileNetV2 model with a classification head adapted to the CIFAR-10 dataset.
* An audio YAMNet model with a classification head adapted to the ESC-50 dataset.

## Setup

In [1]:
import torch
import torchvision

from nnopt.model.train import adapt_model_head_to_dataset
from nnopt.model.eval import eval_model
from nnopt.model.const import DEVICE, DTYPE
from nnopt.recipes.mobilenetv2_cifar10 import get_cifar10_datasets, save_mobilenetv2_cifar10_model

## MobileNetV2 and CIFAR-10 adaptation

In [2]:
mobilenetv2 = torchvision.models.mobilenet_v2(
    weights=torchvision.models.MobileNet_V2_Weights.DEFAULT
)
cifar10_train_dataset, cifar10_val_dataset, cifar10_test_dataset = get_cifar10_datasets()

# Adapt the MobileNetV2 model to the CIFAR-10 dataset
mobilenetv2_cifar10_baseline = adapt_model_head_to_dataset(
    model=mobilenetv2,
    num_classes=10,  # CIFAR-10 has 10 classes
    train_dataset=cifar10_train_dataset,
    val_dataset=cifar10_val_dataset,
    batch_size=64,  # Adjust batch size as needed
    head_train_epochs=5,  # Epochs of head-only training (backbone frozen)
    fine_tune_epochs=3,  # Epochs of full-model fine-tuning
    optimizer_cls=torch.optim.Adam,  # Use Adam optimizer
    head_train_lr=0.001,  # Learning rate for head training
    fine_tune_lr=0.0001,  # Learning rate for fine-tuning
    use_amp=True,  # Use mixed precision training
    device=DEVICE,
    dtype=DTYPE
)

2025-06-10 20:28:32,432 - nnopt.recipes.mobilenetv2_cifar10 - INFO - Loading existing training and validation datasets...
2025-06-10 20:28:34,011 - nnopt.recipes.mobilenetv2_cifar10 - INFO - Loading existing test dataset...
2025-06-10 20:28:34,329 - nnopt.model.train - INFO - Training head of the model with backbone frozen...
Epoch 1/5 [Training]: 100%|██████████| 704/704 [00:36<00:00, 19.16it/s, acc=0.4712, cpu=2.6%, gpu_mem=3.1/24.0GB (13.1%), gpu_util=37.0%, loss=1.3101, ram=8.7/30.9GB (31.4%), samples/s=363.0]  
Epoch 1/5 [Validation]: 100%|██████████| 79/79 [00:04<00:00, 18.11it/s, acc=0.5394, cpu=3.7%, gpu_mem=3.1/24.0GB (13.1%), gpu_util=35.0%, loss=1.0425, ram=8.7/30.9GB (31.4%), samples/s=1357.4] 


Epoch 1/5, Train Loss: 1.5416, Train Acc: 0.4712, Train Throughput: 3669.42 samples/s | Val Loss: 1.3190, Val Acc: 0.5394, Val Throughput: 4236.98 samples/s | CPU Usage: 10.20% | RAM Usage: 8.4/30.9GB (30.5%) | GPU 0 Util: 35.00% | GPU 0 Mem: 3.1/24.0GB (13.1%)


Epoch 2/5 [Training]: 100%|██████████| 704/704 [00:36<00:00, 19.40it/s, acc=0.5208, cpu=3.0%, gpu_mem=3.1/24.0GB (13.1%), gpu_util=36.0%, loss=2.0801, ram=8.9/30.9GB (32.0%), samples/s=973.1]  
Epoch 2/5 [Validation]: 100%|██████████| 79/79 [00:04<00:00, 18.39it/s, acc=0.5418, cpu=3.6%, gpu_mem=3.2/24.0GB (13.2%), gpu_util=36.0%, loss=1.2500, ram=8.8/30.9GB (31.7%), samples/s=1297.1] 


Epoch 2/5, Train Loss: 1.3702, Train Acc: 0.5208, Train Throughput: 3495.65 samples/s | Val Loss: 1.3087, Val Acc: 0.5418, Val Throughput: 4471.32 samples/s | CPU Usage: 10.60% | RAM Usage: 8.6/30.9GB (31.0%) | GPU 0 Util: 36.00% | GPU 0 Mem: 3.2/24.0GB (13.2%)


Epoch 3/5 [Training]: 100%|██████████| 704/704 [00:36<00:00, 19.53it/s, acc=0.5291, cpu=6.1%, gpu_mem=3.1/24.0GB (13.0%), gpu_util=36.0%, loss=0.6765, ram=8.8/30.9GB (31.8%), samples/s=1005.4] 
Epoch 3/5 [Validation]: 100%|██████████| 79/79 [00:04<00:00, 18.30it/s, acc=0.5620, cpu=3.6%, gpu_mem=3.1/24.0GB (13.0%), gpu_util=35.0%, loss=1.5690, ram=8.8/30.9GB (31.8%), samples/s=1343.6] 


Epoch 3/5, Train Loss: 1.3438, Train Acc: 0.5291, Train Throughput: 3492.29 samples/s | Val Loss: 1.2692, Val Acc: 0.5620, Val Throughput: 4240.50 samples/s | CPU Usage: 13.50% | RAM Usage: 8.6/30.9GB (31.0%) | GPU 0 Util: 35.00% | GPU 0 Mem: 3.1/24.0GB (13.0%)


Epoch 4/5 [Training]: 100%|██████████| 704/704 [00:36<00:00, 19.45it/s, acc=0.5323, cpu=5.9%, gpu_mem=3.1/24.0GB (13.1%), gpu_util=37.0%, loss=1.6912, ram=8.8/30.9GB (31.8%), samples/s=1111.8] 
Epoch 4/5 [Validation]: 100%|██████████| 79/79 [00:04<00:00, 18.57it/s, acc=0.5540, cpu=3.6%, gpu_mem=3.1/24.0GB (13.1%), gpu_util=34.0%, loss=1.4028, ram=8.8/30.9GB (31.7%), samples/s=1414.2] 


Epoch 4/5, Train Loss: 1.3316, Train Acc: 0.5323, Train Throughput: 3471.52 samples/s | Val Loss: 1.2600, Val Acc: 0.5540, Val Throughput: 4314.31 samples/s | CPU Usage: 13.50% | RAM Usage: 8.5/30.9GB (30.9%) | GPU 0 Util: 34.00% | GPU 0 Mem: 3.1/24.0GB (13.1%)


Epoch 5/5 [Training]: 100%|██████████| 704/704 [00:36<00:00, 19.53it/s, acc=0.5351, cpu=2.8%, gpu_mem=3.1/24.0GB (13.0%), gpu_util=34.0%, loss=1.6875, ram=8.7/30.9GB (31.6%), samples/s=996.1]  
Epoch 5/5 [Validation]: 100%|██████████| 79/79 [00:04<00:00, 18.02it/s, acc=0.5558, cpu=7.7%, gpu_mem=3.2/24.0GB (13.1%), gpu_util=16.0%, loss=1.1225, ram=8.7/30.9GB (31.4%), samples/s=1398.6]  
2025-06-10 20:31:57,293 - nnopt.model.train - INFO - Fine-tuning full model...


Epoch 5/5, Train Loss: 1.3277, Train Acc: 0.5351, Train Throughput: 3505.89 samples/s | Val Loss: 1.2570, Val Acc: 0.5558, Val Throughput: 6469.30 samples/s | CPU Usage: 9.20% | RAM Usage: 8.4/30.9GB (30.6%) | GPU 0 Util: 9.00% | GPU 0 Mem: 3.2/24.0GB (13.1%)


Epoch 1/3 [Training]: 100%|██████████| 704/704 [00:36<00:00, 19.47it/s, acc=0.6443, cpu=5.5%, gpu_mem=5.6/24.0GB (23.4%), gpu_util=60.0%, loss=1.3190, ram=8.9/30.9GB (31.9%), samples/s=166.6]  
Epoch 1/3 [Validation]: 100%|██████████| 79/79 [00:04<00:00, 18.77it/s, acc=0.7080, cpu=3.7%, gpu_mem=5.6/24.0GB (23.3%), gpu_util=27.0%, loss=1.0943, ram=8.8/30.9GB (31.9%), samples/s=1426.6]  


Epoch 1/3, Train Loss: 1.0121, Train Acc: 0.6443, Train Throughput: 2022.23 samples/s | Val Loss: 0.8132, Val Acc: 0.7080, Val Throughput: 6701.15 samples/s | CPU Usage: 10.40% | RAM Usage: 8.6/30.9GB (31.0%) | GPU 0 Util: 15.00% | GPU 0 Mem: 5.6/24.0GB (23.3%)


Epoch 2/3 [Training]: 100%|██████████| 704/704 [00:35<00:00, 19.85it/s, acc=0.7195, cpu=5.2%, gpu_mem=5.7/24.0GB (23.6%), gpu_util=64.0%, loss=0.9156, ram=8.6/30.9GB (31.3%), samples/s=506.6]  
Epoch 2/3 [Validation]: 100%|██████████| 79/79 [00:04<00:00, 19.03it/s, acc=0.7596, cpu=7.7%, gpu_mem=5.7/24.0GB (23.6%), gpu_util=28.0%, loss=1.0375, ram=8.6/30.9GB (31.2%), samples/s=1438.0]  


Epoch 2/3, Train Loss: 0.7965, Train Acc: 0.7195, Train Throughput: 2069.50 samples/s | Val Loss: 0.6897, Val Acc: 0.7596, Val Throughput: 6487.11 samples/s | CPU Usage: 11.90% | RAM Usage: 8.4/30.9GB (30.6%) | GPU 0 Util: 36.00% | GPU 0 Mem: 5.7/24.0GB (23.6%)


Epoch 3/3 [Training]: 100%|██████████| 704/704 [00:36<00:00, 19.44it/s, acc=0.7511, cpu=4.8%, gpu_mem=5.6/24.0GB (23.2%), gpu_util=59.0%, loss=0.7101, ram=8.6/30.9GB (31.2%), samples/s=467.2]  
Epoch 3/3 [Validation]: 100%|██████████| 79/79 [00:04<00:00, 18.50it/s, acc=0.7702, cpu=3.2%, gpu_mem=5.6/24.0GB (23.3%), gpu_util=29.0%, loss=0.8009, ram=8.6/30.9GB (31.2%), samples/s=1274.9]  

Epoch 3/3, Train Loss: 0.7048, Train Acc: 0.7511, Train Throughput: 2029.63 samples/s | Val Loss: 0.6467, Val Acc: 0.7702, Val Throughput: 6269.95 samples/s | CPU Usage: 11.90% | RAM Usage: 8.4/30.9GB (30.6%) | GPU 0 Util: 29.00% | GPU 0 Mem: 5.6/24.0GB (23.3%)
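
The progress bars above report 704 training batches and 79 validation batches per epoch at a batch size of 64. The short check below shows that these counts are consistent with a 45,000 / 5,000 split of CIFAR-10's 50,000 training images, with the last partial batch kept. The split sizes are an assumption: the notebook does not state them, and any training-set size from 44,993 to 45,056 would also yield 704 batches.

```python
import math

BATCH_SIZE = 64  # value passed to adapt_model_head_to_dataset in this notebook

# Assumed split sizes (not stated in the notebook).
split_sizes = {"train": 45_000, "val": 5_000}

# A loader that keeps the final partial batch yields ceil(n / batch_size) batches.
batches_per_epoch = {
    name: math.ceil(num_samples / BATCH_SIZE)
    for name, num_samples in split_sizes.items()
}

print(batches_per_epoch)  # {'train': 704, 'val': 79}
```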





In [3]:
# Evaluate the adapted model on the validation and test sets
val_metrics = eval_model(
    model=mobilenetv2_cifar10_baseline,
    test_dataset=cifar10_val_dataset,
    batch_size=64,  # Adjust batch size as needed
    device=DEVICE,
    use_amp=True,
    dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32
)

test_metrics = eval_model(
    model=mobilenetv2_cifar10_baseline,
    test_dataset=cifar10_test_dataset,
    batch_size=64,  # Adjust batch size as needed
    device=DEVICE,
    use_amp=True,
    dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32
)
print(f"Validation accuracy of the adapted MobileNetV2 on CIFAR-10: {val_metrics['accuracy']:.2f}")
print(f"Test accuracy of the adapted MobileNetV2 on CIFAR-10: {test_metrics['accuracy']:.2f}")

2025-06-10 20:33:57,791 - nnopt.model.eval - INFO - Starting evaluation on device: cuda, dtype: torch.bfloat16, batch size: 64
2025-06-10 20:33:57,795 - nnopt.model.eval - INFO - Starting warmup for 5 batches...
[Warmup]: 100%|██████████| 5/5 [00:00<00:00,  7.27it/s]
2025-06-10 20:33:58,580 - nnopt.model.eval - INFO - Warmup complete.
[Evaluation]: 100%|██████████| 79/79 [00:04<00:00, 18.21it/s, acc=0.7796, cpu=3.7%, gpu_mem=5.6/24.0GB (23.4%), gpu_util=42.0%, loss=0.9209, ram=8.7/30.9GB (31.4%), samples/s=1294.3] 
2025-06-10 20:34:02,947 - nnopt.model.eval - INFO - Starting evaluation on device: cuda, dtype: torch.bfloat16, batch size: 64
2025-06-10 20:34:02,950 - nnopt.model.eval - INFO - Starting warmup for 5 batches...


Evaluation Complete: Avg Loss: 0.6432, Accuracy: 0.7796
Throughput: 4038.36 samples/sec | Avg Batch Time: 15.67 ms | Avg Sample Time: 0.25 ms
System Stats: CPU Usage: 11.40% | RAM Usage: 8.4/30.9GB (30.6%) | GPU 0 Util: 42.00% | GPU 0 Mem: 5.6/24.0GB (23.4%)


[Warmup]: 100%|██████████| 5/5 [00:00<00:00, 13.52it/s]
2025-06-10 20:34:03,411 - nnopt.model.eval - INFO - Warmup complete.
[Evaluation]: 100%|██████████| 157/157 [00:03<00:00, 40.01it/s, acc=0.9016, cpu=2.3%, gpu_mem=5.6/24.0GB (23.4%), gpu_util=34.0%, loss=0.1212, ram=8.6/30.9GB (31.3%), samples/s=658.3]  


Evaluation Complete: Avg Loss: 0.2853, Accuracy: 0.9016
Throughput: 7993.78 samples/sec | Avg Batch Time: 7.97 ms | Avg Sample Time: 0.13 ms
System Stats: CPU Usage: 11.60% | RAM Usage: 8.4/30.9GB (30.7%) | GPU 0 Util: 34.00% | GPU 0 Mem: 5.6/24.0GB (23.4%)
Validation accuracy of the adapted MobileNetV2 on CIFAR-10: 0.78
Test accuracy of the adapted MobileNetV2 on CIFAR-10: 0.90
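
The throughput figures in the logs above can be related to the batch counts and average batch times with a little arithmetic. The sketch below assumes that throughput is the number of samples divided by the total batch time; this is a guess at how `eval_model` derives the figure, not something the notebook confirms. The sample counts are also assumptions: 5,000 validation images (inferred from the 79 batches) and 10,000 test images (the standard CIFAR-10 test set size). `estimated_throughput` is a hypothetical helper.

```python
def estimated_throughput(num_samples: int, num_batches: int, avg_batch_time_ms: float) -> float:
    """Samples per second, assuming total time = batches x average batch time."""
    total_seconds = num_batches * avg_batch_time_ms / 1000.0
    return num_samples / total_seconds


# Batch counts and average batch times are copied from the evaluation logs.
val_throughput = estimated_throughput(5_000, 79, 15.67)
test_throughput = estimated_throughput(10_000, 157, 7.97)

# Logged values, for comparison.
logged_val_throughput = 4038.36
logged_test_throughput = 7993.78

val_relative_error = abs(val_throughput - logged_val_throughput) / logged_val_throughput
test_relative_error = abs(test_throughput - logged_test_throughput) / logged_test_throughput
```

Under these assumptions both estimates land within about 0.1% of the logged throughput, with the remaining gap explained by the batch times being rounded to two decimals in the log.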


In [4]:
# Export the adapted model
save_mobilenetv2_cifar10_model(
    model=mobilenetv2_cifar10_baseline,
    metrics_values={
        "val_metrics": val_metrics,
        "test_metrics": test_metrics,
    },
    version="mobilenetv2_cifar10/baseline",
)

2025-06-10 20:34:07,407 - nnopt.recipes.mobilenetv2_cifar10 - INFO - Metadata saved to /home/pbeuran/repos/nnopt/models/mobilenetv2_cifar10/baseline/metadata.json
2025-06-10 20:34:07,408 - nnopt.recipes.mobilenetv2_cifar10 - INFO - Model saved to /home/pbeuran/repos/nnopt/models/mobilenetv2_cifar10/baseline/model.pt
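
For readers without access to the `nnopt` recipes, the export step can be approximated as follows. This is a hedged sketch, not the code of `save_mobilenetv2_cifar10_model`: the helper name `save_model_with_metadata`, the `base_dir` parameter and the JSON layout are hypothetical. Only the `metadata.json` and `model.pt` file names come from the log above. The notebook does not show whether the recipe stores the whole module or just its weights, so this sketch stores the `state_dict`.

```python
import json
import tempfile
from pathlib import Path

import torch
from torch import nn


def save_model_with_metadata(model: nn.Module, metrics_values: dict, version: str, base_dir: str):
    """Write <base_dir>/<version>/metadata.json and model.pt (hypothetical layout)."""
    out_dir = Path(base_dir) / version
    out_dir.mkdir(parents=True, exist_ok=True)

    metadata_path = out_dir / "metadata.json"
    metadata = {"version": version, "metrics_values": metrics_values}
    metadata_path.write_text(json.dumps(metadata, indent=2))

    model_path = out_dir / "model.pt"
    torch.save(model.state_dict(), model_path)  # weights only, by assumption
    return metadata_path, model_path


# Usage with a small stand-in model and the accuracies reported above.
with tempfile.TemporaryDirectory() as tmp_dir:
    model = nn.Linear(4, 10)
    metadata_path, model_path = save_model_with_metadata(
        model=model,
        metrics_values={
            "val_metrics": {"accuracy": 0.7796},
            "test_metrics": {"accuracy": 0.9016},
        },
        version="mobilenetv2_cifar10/baseline",
        base_dir=tmp_dir,
    )

    # Round trip: reload the metadata and the weights.
    saved_metadata = json.loads(metadata_path.read_text())
    restored = nn.Linear(4, 10)
    restored.load_state_dict(torch.load(model_path))
    weights_match = torch.equal(model.weight, restored.weight)
```

Keeping the metrics next to the weights lets later compression steps compare their results against the baseline without re-running the evaluation.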
