# Notebook 01: Final Model Training Tournament

**Project:** Music Genre Classification on GTZAN  
**Authors:** Alessandro Potenza & Camilla Sed  
**Course:** Numerical Analysis for Machine Learning, Politecnico di Milano

---

## Objective

This notebook serves as the primary "computation engine" for our project. Its purpose is to conduct the definitive, comparative training experiment for the three architectures selected to represent our research narrative.

This notebook will:
1.  **Configure the Environment**: Set up GPU resources and mixed-precision training for efficiency.
2.  **Load Pre-processed Data**: Load the clean, augmented, and leak-free datasets created by `00_Setup_and_Data_Preparation.ipynb`.
3.  **Define a Curated `ModelFactory`**: Specify the three key architectures that tell our project's story, from a simple baseline to a sophisticated, paper-inspired model.
4.  **Orchestrate the Training Tournament**: Systematically train and evaluate each of the three models under identical conditions, using a robust training loop.
5.  **Save Results and Models**: Save the final performance summary (`training_summary_final.csv`) and the best weights for each model (`.keras` files) for later use in our analysis notebook (`02_Analysis_and_Publication_Results.ipynb`).

---
## Cell 1: Environment Setup and Data Loading
This cell handles all initial configuration. It sets up global paths, configures TensorFlow to use available GPUs with memory growth, and enables mixed-precision training to accelerate computation and reduce memory usage.
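
To make the precision trade-off concrete, the following NumPy sketch (not part of the training code) shows the limited range of `float16`. It illustrates why the models defined later keep their final softmax layer in `float32` (`dtype='float32'`); the probability value `p` is an arbitrary number chosen for illustration.

```python
import numpy as np

# Range of half precision: large values overflow and tiny values underflow.
f16 = np.finfo(np.float16)
print(f"float16 max: {f16.max}, smallest normal value: {f16.tiny}")

# A small probability that float32 can represent but float16 cannot.
p = 1e-8
p_half = np.float16(p)    # float16 rounds it to zero
p_single = np.float32(p)  # float32 keeps a non-zero value
print(p_half, p_single)
```

A probability that has underflowed to zero would give an infinite cross-entropy term, which is why the numerically sensitive output layer is usually kept in full precision under a `mixed_float16` policy.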

Finally, it loads the NumPy arrays and the `label_encoder` object generated by the pre-processing notebook, preparing all necessary data for the training phase.
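
The cell below also converts the integer labels to one-hot vectors with `to_categorical`. As a rough illustration of that step, here is a minimal NumPy sketch; `one_hot` and `toy_labels` are hypothetical names used only in this example. It shows why passing the class count explicitly matters: the output width stays at 10 even when a split happens to lack the highest class index.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 one-hot matrix."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Set a single 1.0 per row, at the column given by the label.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Hypothetical toy split in which class index 9 never occurs.
toy_labels = [0, 3, 7]
encoded = one_hot(toy_labels, num_classes=10)
print(encoded.shape)
```

Without an explicit class count, `to_categorical` infers the width as `max(label) + 1`, so the toy split above would produce only 8 columns.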

In [None]:
# ===================================================================
# CELL 1: SETUP, IMPORTS, AND DATA LOADING
# ===================================================================

import os
import numpy as np
import pandas as pd
import pickle
import matplotlib.pyplot as plt
import seaborn as sns
import time
import traceback

import tensorflow as tf
import keras
from keras import layers, models, optimizers, callbacks, regularizers
from keras.utils import to_categorical

# --- Global Configuration ---
PROCESSED_DATA_PATH = '../../data/processed/'
MODELS_PATH = '../../models/'
REPORTS_PATH = '../../reports/'
RANDOM_STATE = 42

os.makedirs(MODELS_PATH, exist_ok=True)
os.makedirs(REPORTS_PATH, exist_ok=True)

# 1. GPU and Mixed Precision
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        # Memory growth must be configured before the GPUs are initialized.
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        gpu_names = [tf.config.experimental.get_device_details(g)['device_name'] for g in gpus]
        print(f"✅ GPU(s) found: {gpu_names}")
        policy = keras.mixed_precision.Policy('mixed_float16')
        keras.mixed_precision.set_global_policy(policy)
        print(f"✅ Mixed precision policy set to: {keras.mixed_precision.global_policy().name}")
    except RuntimeError as e:
        print(f"⚠️ Error during GPU initialization: {e}")
else:
    print("❌ NO GPU FOUND. Training will run on the CPU.")

# 2. Load Pre-processed Data
print("\n🔄 Loading pre-processed data...")
try:
    X_train = np.load(os.path.join(PROCESSED_DATA_PATH, 'X_train.npy'))
    y_train = np.load(os.path.join(PROCESSED_DATA_PATH, 'y_train.npy'))
    X_val = np.load(os.path.join(PROCESSED_DATA_PATH, 'X_val.npy'))
    y_val = np.load(os.path.join(PROCESSED_DATA_PATH, 'y_val.npy'))
    X_test = np.load(os.path.join(PROCESSED_DATA_PATH, 'X_test.npy'))
    y_test = np.load(os.path.join(PROCESSED_DATA_PATH, 'y_test.npy'))
    
    with open(os.path.join(PROCESSED_DATA_PATH, 'label_encoder.pkl'), 'rb') as f:
        label_encoder = pickle.load(f)

    # Convert integer labels to one-hot (categorical) format
    num_classes = len(label_encoder.classes_)
    y_train_cat = to_categorical(y_train, num_classes=num_classes)
    y_val_cat = to_categorical(y_val, num_classes=num_classes)
    y_test_cat = to_categorical(y_test, num_classes=num_classes)
    
    print("\n✅ Data loaded successfully.")
    print(f"   - X_train shape: {X_train.shape} | y_train_cat shape: {y_train_cat.shape}")
    print(f"   - Number of classes: {num_classes}")
except FileNotFoundError:
    print("❌ ERROR: Data files not found. Run the notebook '00_Setup_and_Data_Preparation.ipynb' first.")

✅ GPU(s) found: ['NVIDIA GeForce RTX 4070']
✅ Mixed precision policy set to: mixed_float16

🔄 Loading pre-processed data...

✅ Data loaded successfully.
   - X_train shape: (5990, 128, 128, 1) | y_train_cat shape: (5990, 10)
   - Number of classes: 10


---
## Cell 2: Curated Model Factory for Final Narrative
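This cell defines the `ModelFactory`, a class that contains only the three architectures essential to our project's story. Each architecture is registered under a name and built by a static method that takes `input_shape` and `num_classes`, so all three can be trained through the same code path. This curated selection lets us present a clear narrative of architectural evolution and performance gains.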

### The Three Acts of Our Story:
1.  **`Efficient_VGG`**: The simple, robust baseline. Represents the "less-is-more" philosophy.
2.  **`ResSE_AudioCNN`**: The modern standard. A sophisticated classifier using residual connections and attention, representing a typical high-performance approach.
3.  **`UNet_Audio_Classifier`**: The paper-inspired innovator. This model tests our central hypothesis that multi-scale feature learning is superior for this task, achieving the best generalization.
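
Two of these architectures (`Efficient_VGG` and `ResSE_AudioCNN`) use a Squeeze-and-Excitation block for channel attention. The NumPy sketch below shows the squeeze, excite, and scale steps on a single feature map. It is a simplified illustration, not the Keras implementation from the code cell: the weight matrices `w1` and `w2` are random placeholders and biases are omitted.

```python
import numpy as np

def se_block_numpy(x, w1, w2):
    """Channel attention on one feature map x of shape (H, W, C).

    w1 has shape (C, C // ratio) and w2 has shape (C // ratio, C).
    Both are hypothetical weights; biases are left out for brevity.
    """
    squeeze = x.mean(axis=(0, 1))                 # global average pool -> (C,)
    hidden = np.maximum(squeeze @ w1, 0.0)        # ReLU bottleneck -> (C // ratio,)
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2)))  # sigmoid gate per channel -> (C,)
    return x * gates, gates                       # gates broadcast over H and W

rng = np.random.default_rng(0)
channels, ratio = 16, 8
x = rng.normal(size=(4, 4, channels))
w1 = rng.normal(size=(channels, channels // ratio))
w2 = rng.normal(size=(channels // ratio, channels))

scaled, gates = se_block_numpy(x, w1, w2)
print(scaled.shape, gates.shape)
```

Because every gate lies between 0 and 1, the block can only attenuate channels relative to each other; it never changes the spatial layout of the feature map.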

In [14]:
# ===================================================================
# CELL 2: CURATED MODEL FACTORY FOR FINAL NARRATIVE
# ===================================================================
# This cell defines the three key architectures that form the core of
# our project's narrative. The selection tells a clear story, moving from
# a simple baseline, to a modern standard, and finally to an innovative
# model inspired by the reference paper's core concepts.

from keras import layers, models, regularizers

class ModelFactory:
    """
    A curated factory for building and comparing the three key CNN
    architectures selected for our final, focused analysis.
    """

    # -------------------------------------------------------------------
    # Public Methods to Get Model Builders
    # -------------------------------------------------------------------
    @staticmethod
    def get_final_models():
        """
        Returns a dictionary mapping the names of our three final, curated
        models to their respective builder functions. This is the definitive
        list for our final experiment.
        """
        return {
            "Efficient_VGG": ModelFactory.build_efficient_vgg,
            "ResSE_AudioCNN": ModelFactory.build_res_se_audio_cnn,
            "UNet_Audio_Classifier": ModelFactory.build_unet_audio_classifier,
        }

    @staticmethod
    def get_final_model_names():
        """Returns a list of the curated model names."""
        return list(ModelFactory.get_final_models().keys())

    @staticmethod
    def get_builder_by_name(name):
        """
        Retrieves the model-building function for a given model name from
        our curated list.
        """
        builder = ModelFactory.get_final_models().get(name)
        if builder is None:
            raise AttributeError(f"Model '{name}' not found. Available models for the final run: {ModelFactory.get_final_model_names()}")
        return builder

    # -------------------------------------------------------------------
    # Shared Helper Building Blocks
    # -------------------------------------------------------------------

    @staticmethod
    def _se_block(input_tensor, ratio=8, name_prefix=""):
        """Squeeze-and-Excitation block to add channel-wise attention."""
        channels = input_tensor.shape[-1]
        se = layers.GlobalAveragePooling2D(name=f'{name_prefix}_se_squeeze')(input_tensor)
        se = layers.Reshape((1, 1, channels))(se)
        se = layers.Dense(channels // ratio, activation='relu', name=f'{name_prefix}_se_excite_1')(se)
        se = layers.Dense(channels, activation='sigmoid', name=f'{name_prefix}_se_excite_2')(se)
        return layers.Multiply(name=f'{name_prefix}_se_scale')([input_tensor, se])

    # -------------------------------------------------------------------
    # ACT 1: The Efficient Baseline (Our Unexpected Champion)
    # -------------------------------------------------------------------
    @staticmethod
    def build_efficient_vgg(input_shape, num_classes):
        """
        ACT 1: The surprisingly effective baseline. A simple VGG-style
        architecture with SE channel attention, regularized by batch
        normalization and dropout. Represents the "less is more"
        philosophy and sets a high bar for performance.
        """
        inputs = layers.Input(shape=input_shape)
        # Block 1
        x = layers.Conv2D(16, 3, padding='same', use_bias=False)(inputs)
        x = layers.BatchNormalization()(x); x = layers.Activation('relu')(x)
        x = layers.MaxPooling2D(2)(x)
        x = ModelFactory._se_block(x, name_prefix="vgg_b1")
        # Block 2
        x = layers.Conv2D(32, 3, padding='same', use_bias=False)(x)
        x = layers.BatchNormalization()(x); x = layers.Activation('relu')(x)
        x = layers.MaxPooling2D(2)(x)
        x = ModelFactory._se_block(x, name_prefix="vgg_b2")
        # Block 3
        x = layers.Conv2D(64, 3, padding='same', use_bias=False)(x)
        x = layers.BatchNormalization()(x); x = layers.Activation('relu')(x)
        x = layers.MaxPooling2D(2)(x)
        x = ModelFactory._se_block(x, name_prefix="vgg_b3")
        # Classification Head
        x = layers.GlobalAveragePooling2D(name="gap")(x)
        x = layers.Dense(128, activation='relu')(x)
        x = layers.Dropout(0.5)(x)
        outputs = layers.Dense(num_classes, activation='softmax', dtype='float32')(x)
        return models.Model(inputs=inputs, outputs=outputs, name='Efficient_VGG')

    # -------------------------------------------------------------------
    # ACT 2: The Modern Standard (The Sophisticated Challenger)
    # -------------------------------------------------------------------
    @staticmethod
    def _res_se_block(input_tensor, filters, stride=1, name_prefix=""):
        """Residual block using standard convolutions, PReLU, and SE attention."""
        shortcut = input_tensor
        # First convolution
        x = layers.Conv2D(filters, 3, strides=stride, padding='same', use_bias=False, name=f'{name_prefix}_conv1')(input_tensor)
        x = layers.BatchNormalization(name=f'{name_prefix}_bn1')(x)
        x = layers.PReLU(shared_axes=[1, 2], name=f'{name_prefix}_prelu1')(x)
        # Second convolution
        x = layers.Conv2D(filters, 3, padding='same', use_bias=False, name=f'{name_prefix}_conv2')(x)
        x = layers.BatchNormalization(name=f'{name_prefix}_bn2')(x)
        x = ModelFactory._se_block(x, name_prefix=f'{name_prefix}_se')
        # Shortcut connection
        if stride > 1 or shortcut.shape[-1] != filters:
            shortcut = layers.Conv2D(filters, 1, strides=stride, use_bias=False, name=f'{name_prefix}_shortcut_conv')(shortcut)
            shortcut = layers.BatchNormalization(name=f'{name_prefix}_shortcut_bn')(shortcut)
        x = layers.Add(name=f'{name_prefix}_add')([shortcut, x])
        x = layers.PReLU(shared_axes=[1, 2], name=f'{name_prefix}_prelu2')(x)
        return x

    @staticmethod
    def build_res_se_audio_cnn(input_shape, num_classes):
        """
        ACT 2: The modern standard. Combines residual connections
        (ResNet-style) with Squeeze-and-Excitation attention to represent
        a typical high-performance approach to classification.
        """
        inputs = layers.Input(shape=input_shape)
        # Entry block
        x = layers.Conv2D(32, 3, strides=1, padding='same', use_bias=False)(inputs)
        x = layers.BatchNormalization()(x)
        x = layers.PReLU(shared_axes=[1, 2])(x)
        # Residual blocks
        x = ModelFactory._res_se_block(x, 64, stride=2, name_prefix="res_b1")
        x = ModelFactory._res_se_block(x, 128, stride=2, name_prefix="res_b2")
        x = ModelFactory._res_se_block(x, 256, stride=2, name_prefix="res_b3")
        # Classification Head
        x = layers.GlobalAveragePooling2D(name="gap")(x)
        x = layers.Dropout(0.5)(x)
        outputs = layers.Dense(num_classes, activation='softmax', dtype='float32')(x)
        return models.Model(inputs=inputs, outputs=outputs, name='ResSE_AudioCNN')

    # -------------------------------------------------------------------
    # ACT 3: The Paper-Inspired Innovator (The Best Generalizer)
    # -------------------------------------------------------------------
    @staticmethod
    def _unet_encoder_block(input_tensor, filters, pool=True, name_prefix=""):
        """Encoder block for a U-Net, returning both the pooled output and skip connection."""
        x = layers.Conv2D(filters, 3, padding='same', use_bias=False, name=f'{name_prefix}_conv1')(input_tensor)
        x = layers.BatchNormalization(name=f'{name_prefix}_bn1')(x); x = layers.PReLU(shared_axes=[1, 2], name=f'{name_prefix}_prelu1')(x)
        x = layers.Conv2D(filters, 3, padding='same', use_bias=False, name=f'{name_prefix}_conv2')(x)
        x = layers.BatchNormalization(name=f'{name_prefix}_bn2')(x); x = layers.PReLU(shared_axes=[1, 2], name=f'{name_prefix}_prelu2')(x)
        skip_connection = x
        if pool:
            pool_output = layers.MaxPooling2D(2, name=f'{name_prefix}_pool')(x)
            return pool_output, skip_connection
        else: # For the bottleneck
            return x, skip_connection

    @staticmethod
    def build_unet_audio_classifier(input_shape, num_classes):
        """
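        ACT 3: Paper-inspired innovation. Adapts the U-Net encoder for
        classification: three pooled encoder stages feed a bottleneck, and
        only the bottleneck's deep, abstract features reach the
        classification head (the skip connections returned by the encoder
        blocks are not used here). We hypothesize that these features
        generalize best on the test set.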
        """
        inputs = layers.Input(shape=input_shape)
        # Encoder Path
        p1, s1 = ModelFactory._unet_encoder_block(inputs, 32, name_prefix="enc1")
        p2, s2 = ModelFactory._unet_encoder_block(p1, 64, name_prefix="enc2")
        p3, s3 = ModelFactory._unet_encoder_block(p2, 128, name_prefix="enc3")
        # Bottleneck (we only need its primary output for classification)
        bottleneck, _ = ModelFactory._unet_encoder_block(p3, 256, pool=False, name_prefix="bneck")
        # Classification Head
        x = layers.GlobalAveragePooling2D(name="gap")(bottleneck)
        x = layers.Dropout(0.5)(x)
        outputs = layers.Dense(num_classes, activation='softmax', dtype='float32')(x)
        return models.Model(inputs=inputs, outputs=outputs, name='UNet_Audio_Classifier')


print("✅ Curated ModelFactory defined with 3 key architectures for the final analysis.")

✅ Curated ModelFactory defined with 3 key architectures for the final analysis.
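
As a quick sanity check on the three architectures, the sketch below traces the spatial size of a 128x128 input (the `X_train` shape printed in Cell 1) through three downsampling stages. The helper names `pool_valid`, `conv_same`, and `trace` are hypothetical and exist only in this example. It assumes `MaxPooling2D(2)` with its default `'valid'` padding and stride-2 convolutions with `padding='same'`, as in the code above.

```python
import math

def pool_valid(size, pool=2):
    """Output length of a pool-sized max pooling with 'valid' padding."""
    return (size - pool) // pool + 1

def conv_same(size, stride=2):
    """Output length of a strided convolution with padding='same'."""
    return math.ceil(size / stride)

def trace(size, step, stages=3):
    """Apply the same downsampling step `stages` times and record each size."""
    sizes = [size]
    for _ in range(stages):
        sizes.append(step(sizes[-1]))
    return sizes

pooled = trace(128, pool_valid)   # Efficient_VGG and the U-Net-style encoder
strided = trace(128, conv_same)   # ResSE_AudioCNN
print(pooled, strided)
```

Both paths reduce 128 to 16 per side, so all three models hand a 16x16 feature map to global average pooling, with 64 channels for `Efficient_VGG` and 256 channels for the other two.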


---
## Cell 3: The Definitive Training and Evaluation Framework
This is the main execution cell of the notebook. It orchestrates the end-to-end training and evaluation process for our three selected models.
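
The orchestration relies on zero-argument model factories that are created in a loop over the builders. The plain-Python sketch below shows why the cell binds each builder through a default argument (`lambda b=builder: ...`). `make_builder` is a hypothetical stand-in that returns a string instead of a Keras model.

```python
def make_builder(tag):
    """Hypothetical stand-in for a model builder; returns a label, not a model."""
    def build(input_shape, num_classes):
        return f"{tag}:{input_shape}:{num_classes}"
    return build

builders = {"A": make_builder("A"), "B": make_builder("B")}
input_shape, num_classes = (128, 128, 1), 10

# Late binding: each lambda looks up `builder` only when it is called,
# by which time the loop has finished and `builder` is the last entry.
late = {}
for name, builder in builders.items():
    late[name] = lambda: builder(input_shape, num_classes)

# Early binding: the default argument stores the current builder per entry.
bound = {}
for name, builder in builders.items():
    bound[name] = lambda b=builder: b(input_shape, num_classes)

print(late["A"]())   # built by B's builder
print(bound["A"]())  # built by A's builder
```

With late binding, every entry would silently train the last architecture under a different name, which would invalidate the comparison.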

### Key Components:
-   **`spec_augment_tf`**: An on-the-fly data augmentation function that applies frequency and time masking to the training spectrograms to improve model robustness.
-   **`RichLoggerCallback`**: A custom Keras callback for clean, professional logging of training progress, including epoch time and learning rate.
-   **`ModelEvaluator`**: A robust class that iterates through each model factory, compiles the model, and runs the `fit`/`evaluate` cycle with a standardized set of callbacks (`EarlyStopping`, `ReduceLROnPlateau`, `ModelCheckpoint`).
-   **Experiment Execution**: The cell configures the `tf.data` pipelines, instantiates the `ModelEvaluator`, and launches the training tournament. Upon completion, it saves the final results to a CSV file for analysis in the next notebook.
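
The masking idea behind `spec_augment_tf` can be sketched in NumPy as follows. This is an illustrative equivalent under stated assumptions, not the function used in training: `spec_augment_numpy` and its `max_fraction` and `rng` parameters are names introduced for this example, and axis 0 is treated as frequency and axis 1 as time, as in the TensorFlow version.

```python
import numpy as np

def spec_augment_numpy(spec, max_fraction=0.2, rng=None):
    """Zero one random frequency band and one random time band of a (F, T, 1) array."""
    if rng is None:
        rng = np.random.default_rng()
    out = spec.copy()  # leave the input untouched
    n_freq, n_time = out.shape[0], out.shape[1]

    # Frequency masking: band width in [1, f_max), start chosen so the band fits.
    f_max = int(n_freq * max_fraction)
    if f_max > 1:
        f = rng.integers(1, f_max)
        f0 = rng.integers(0, n_freq - f)
        out[f0:f0 + f, :, :] = 0.0

    # Time masking: same procedure along the time axis.
    t_max = int(n_time * max_fraction)
    if t_max > 1:
        t = rng.integers(1, t_max)
        t0 = rng.integers(0, n_time - t)
        out[:, t0:t0 + t, :] = 0.0
    return out

spec = np.ones((128, 128, 1), dtype=np.float32)
masked = spec_augment_numpy(spec, rng=np.random.default_rng(42))
print(masked.shape, float((masked == 0).mean()))
```

With a 20% cap on each band, at most 24 of the 128 rows and 24 of the 128 columns are zeroed, so well over half of each spectrogram always remains visible to the model.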

In [15]:
# ===================================================================
# CELL 3: FINAL, FOCUSED TRAINING FRAMEWORK
# ===================================================================
# This cell orchestrates the definitive training run for our three
# curated candidate architectures. It uses the same robust data pipelines
# and logger, but focuses only on the models relevant to our final narrative.

import os
import pandas as pd
import traceback
import numpy as np
import pickle
import time
import tensorflow as tf
import keras
from keras import optimizers, callbacks
from keras.utils import to_categorical

# -------------------------------------------------------------------
# 0. DATA LOADING AND PREPARATION
# -------------------------------------------------------------------
print("Loading pre-processed data...")
try:
    X_train = np.load(os.path.join(PROCESSED_DATA_PATH, 'X_train.npy'))
    y_train = np.load(os.path.join(PROCESSED_DATA_PATH, 'y_train.npy'))
    X_val = np.load(os.path.join(PROCESSED_DATA_PATH, 'X_val.npy'))
    y_val = np.load(os.path.join(PROCESSED_DATA_PATH, 'y_val.npy'))
    X_test = np.load(os.path.join(PROCESSED_DATA_PATH, 'X_test.npy'))
    y_test = np.load(os.path.join(PROCESSED_DATA_PATH, 'y_test.npy'))
    with open(os.path.join(PROCESSED_DATA_PATH, 'label_encoder.pkl'), 'rb') as f:
        label_encoder = pickle.load(f)

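    # Pass num_classes explicitly so every split gets the same one-hot width,
    # even if a split happens to lack the highest class index.
    num_classes = len(label_encoder.classes_)
    y_train_cat = to_categorical(y_train, num_classes=num_classes)
    y_val_cat = to_categorical(y_val, num_classes=num_classes)
    y_test_cat = to_categorical(y_test, num_classes=num_classes)
    
    print("✅ Data successfully loaded and prepared.")
except FileNotFoundError as err:
    raise RuntimeError("ERROR: Data files not found. Please run '00_Setup_and_Data_Preparation.ipynb' first.") from err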

# -------------------------------------------------------------------
# 1. DATA AUGMENTATION & CUSTOM CALLBACK
# -------------------------------------------------------------------
@tf.function
def spec_augment_tf(spectrogram, label):
    """Applies frequency and time masking to a spectrogram."""
    aug_spec = tf.identity(spectrogram)
    # Frequency Masking
    freq_bins = tf.shape(aug_spec)[0]
    f_param = tf.cast(tf.cast(freq_bins, tf.float32) * 0.2, tf.int32)
    if f_param > 1:
        f = tf.random.uniform(shape=(), minval=1, maxval=f_param, dtype=tf.int32)
        f0 = tf.random.uniform(shape=(), minval=0, maxval=freq_bins - f, dtype=tf.int32)
        mask_freq_values = tf.concat([tf.ones((f0,)), tf.zeros((f,)), tf.ones((freq_bins - f0 - f,))], axis=0)
        freq_mask = tf.reshape(tf.cast(mask_freq_values, aug_spec.dtype), (freq_bins, 1, 1))
        aug_spec *= freq_mask
    # Time Masking
    time_steps = tf.shape(aug_spec)[1]
    t_param = tf.cast(tf.cast(time_steps, tf.float32) * 0.2, tf.int32)
    if t_param > 1:
        t = tf.random.uniform(shape=(), minval=1, maxval=t_param, dtype=tf.int32)
        t0 = tf.random.uniform(shape=(), minval=0, maxval=time_steps - t, dtype=tf.int32)
        mask_time_values = tf.concat([tf.ones((t0,)), tf.zeros((t,)), tf.ones((time_steps - t0 - t,))], axis=0)
        time_mask = tf.reshape(tf.cast(mask_time_values, aug_spec.dtype), (1, time_steps, 1))
        aug_spec *= time_mask
    return aug_spec, label

class RichLoggerCallback(callbacks.Callback):
    """A custom Keras callback for clean, informative, and professional logging."""
    def __init__(self, total_epochs):
        super().__init__()
        self.total_epochs = total_epochs
        self.best_val_accuracy = 0
        self.start_time = 0

    def on_epoch_begin(self, epoch, logs=None):
        self.start_time = time.time()
        
    def on_train_begin(self, logs=None):
        print(f"🚀 Starting training for model: {self.model.name}...")

    def on_epoch_end(self, epoch, logs=None):
        epoch_time = time.time() - self.start_time
        lr = self.model.optimizer.learning_rate
        if isinstance(lr, keras.optimizers.schedules.LearningRateSchedule):
            lr = lr(self.model.optimizer.iterations)
        lr_value = lr.numpy() if hasattr(lr, 'numpy') else lr
        is_best = ""
        if logs['val_accuracy'] > self.best_val_accuracy:
            self.best_val_accuracy = logs['val_accuracy']
            is_best = " ✅"
        log_str = (f"Epoch {epoch + 1:02d}/{self.total_epochs} | "
                   f"Time: {epoch_time:.2f}s | "
                   f"Loss: {logs['loss']:.4f} | Acc: {logs['accuracy']:.4f} | "
                   f"Val Loss: {logs['val_loss']:.4f} | Val Acc: {logs['val_accuracy']:.4f} | "
                   f"LR: {lr_value:.1e}{is_best}")
        print(log_str)

    def on_train_end(self, logs=None):
        print(f"🏁 Finished training. Best Validation Accuracy: {self.best_val_accuracy:.4f}")

# -------------------------------------------------------------------
# 2. EXPERIMENT ORCHESTRATION CLASS
# -------------------------------------------------------------------
class ModelEvaluator:
    """Orchestrates the training and evaluation of multiple models."""
    def __init__(self):
        self.results = []

    def run_experiments(self, model_factories, train_data, val_data, test_data, epochs, patience):
        for model_name, model_factory_fn in model_factories.items():
            print(f"\n{'='*80}\nTRAINING ARCHITECTURE: '{model_name}'\n{'='*80}")
            try:
                # Clear session to ensure a clean slate for each model
                keras.backend.clear_session()
                
                model = model_factory_fn()
                optimizer = keras.optimizers.Adam(learning_rate=1e-3)
                model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
                
                callbacks_list = [
                    RichLoggerCallback(total_epochs=epochs),
                    callbacks.EarlyStopping(monitor='val_accuracy', patience=patience, restore_best_weights=True, verbose=0),
                    callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=patience//2, verbose=1),
                    callbacks.ModelCheckpoint(os.path.join(MODELS_PATH, f"{model_name}_best.keras"), 
                                              monitor='val_accuracy', save_best_only=True, verbose=0)
                ]
                
                history = model.fit(train_data, epochs=epochs, validation_data=val_data, callbacks=callbacks_list, verbose=0)
                
                test_loss, test_acc = model.evaluate(test_data, verbose=0)
                self.results.append({
                    'Model': model_name,
                    'Test_Accuracy': test_acc,
                    'Best_Val_Accuracy': max(history.history.get('val_accuracy', [0])),
                    'Epochs_Run': len(history.history['val_accuracy']),
                })
            except Exception:
                print(f"❌ ERROR during training of [{model_name}]:")
                traceback.print_exc()
        return pd.DataFrame(self.results)

# -------------------------------------------------------------------
# 3. EXPERIMENT CONFIGURATION & EXECUTION
# -------------------------------------------------------------------
AUTOTUNE = tf.data.AUTOTUNE
BATCH_SIZE = 64
EPOCHS = 100
PATIENCE = 20

# Data Pipelines
print("\nConfiguring JIT data pipelines...")
train_pipeline = (tf.data.Dataset.from_tensor_slices((X_train, y_train_cat)).shuffle(len(X_train))
                  .map(spec_augment_tf, num_parallel_calls=AUTOTUNE).batch(BATCH_SIZE).prefetch(AUTOTUNE))
val_pipeline = (tf.data.Dataset.from_tensor_slices((X_val, y_val_cat)).batch(BATCH_SIZE).prefetch(AUTOTUNE))
test_pipeline = (tf.data.Dataset.from_tensor_slices((X_test, y_test_cat)).batch(BATCH_SIZE).prefetch(AUTOTUNE))

# ===================================================================
# Define the model factories for the final run
# ===================================================================
input_shape = X_train.shape[1:]
num_classes = y_train_cat.shape[1]

# 1. Get the dictionary of builder methods from our curated factory
final_model_builders = ModelFactory.get_final_models()

# 2. Create the final dictionary of callable functions for the evaluator.
#    The default argument b=builder binds each builder when its lambda is
#    created, so every entry calls its own builder rather than the last one.
model_factories_to_run = {
    name: (lambda b=builder: b(input_shape, num_classes))
    for name, builder in final_model_builders.items()
}

print(f"\n✅ Ready to run final tournament for the following models: {list(model_factories_to_run.keys())}")

# Execute the comparative analysis
evaluator = ModelEvaluator()
results_df = evaluator.run_experiments(
    model_factories_to_run, train_pipeline, val_pipeline, test_pipeline, EPOCHS, PATIENCE
)

# -------------------------------------------------------------------
# 4. REPORTING
# -------------------------------------------------------------------
if not results_df.empty:
    # Save the definitive results
    results_df.to_csv(os.path.join(REPORTS_PATH, 'training_summary_final.csv'), index=False)
    
    print("\n🎉 FINAL COMPARATIVE ANALYSIS COMPLETED 🎉")
    print("\nFinal Leaderboard (sorted by Best Validation Accuracy):")
    print(results_df.sort_values(by='Best_Val_Accuracy', ascending=False).to_markdown(index=False))

Loading pre-processed data...
✅ Data successfully loaded and prepared.

Configuring JIT data pipelines...

✅ Ready to run final tournament for the following models: ['Efficient_VGG', 'ResSE_AudioCNN', 'UNet_Audio_Classifier']

TRAINING ARCHITECTURE: 'Efficient_VGG'
🚀 Starting training for model: Efficient_VGG...
Epoch 01/100 | Time: 9.71s | Loss: 1.9332 | Acc: 0.2912 | Val Loss: 2.3392 | Val Acc: 0.1005 | LR: 1.0e-03 ✅
Epoch 02/100 | Time: 1.07s | Loss: 1.5779 | Acc: 0.4244 | Val Loss: 3.0182 | Val Acc: 0.1330 | LR: 1.0e-03 ✅
Epoch 03/100 | Time: 1.03s | Loss: 1.4420 | Acc: 0.4766 | Val Loss: 3.3359 | Val Acc: 0.1820 | LR: 1.0e-03 ✅
Epoch 04/100 | Time: 1.08s | Loss: 1.3104 | Acc: 0.5304 | Val Loss: 1.5637 | Val Acc: 0.4395 | LR: 1.0e-03 ✅
Epoch 05/100 | Time: 1.02s | Loss: 1.2198 | Acc: 0.5693 | Val Loss: 2.2346 | Val Acc: 0.3280 | LR: 1.0e-03
Epoch 06/100 | Time: 1.12s | Loss: 1.1540 | Acc: 0.5943 | Val Loss: 1.7913 | Val Acc: 0.4265 | LR: 1.0e-03
Epoch 07/100 | Time: 0.98s | Loss: 1
[... output truncated ...]

Epoch 01/100 | Time: 29.61s | Loss: 1.6398 | Acc: 0.3987 | Val Loss: 3.1155 | Val Acc: 0.1955 | LR: 1.0e-03 ✅
Epoch 02/100 | Time: 3.71s | Loss: 1.2327 | Acc: 0.5631 | Val Loss: 3.1900 | Val Acc: 0.1770 | LR: 1.0e-03
Epoch 03/100 | Time: 3.64s | Loss: 1.0317 | Acc: 0.6449 | Val Loss: 3.8619 | Val Acc: 0.1905 | LR: 1.0e-03
Epoch 04/100 | Time: 3.73s | Loss: 0.8192 | Acc: 0.7200 | Val Loss: 2.2087 | Val Acc: 0.4380 | LR: 1.0e-03 ✅
Epoch 05/100 | Time: 3.68s | Loss: 0.6919 | Acc: 0.7618 | Val Loss: 2.0798 | Val Acc: 0.5020 | LR: 1.0e-03 ✅
Epoch 06/100 | Time: 3.65s | Loss: 0.6094 | Acc: 0.7945 | Val Loss: 1.8721 | Val Acc: 0.4930 | LR: 1.0e-03
Epoch 07/100 | Time: 3.75s | Loss: 0.4699 | Acc: 0.8394 | Val Loss: 1.2563 | Val Acc: 0.6470 | LR: 1.0e-03 ✅
Epoch 08/100 | Time: 3.65s | Loss: 0.3989 | Acc: 0.8648 | Val Loss: 1.9650 | Val Acc: 0.5715 | LR: 1.0e-03
Epoch 09/100 | Time: 3.69s | Loss: 0.3225 | Acc: 0.8908 | Val Loss: 1.4458 | Val Acc: 0.6345 | LR: 1.0e-03
Epoch 10/100 | Time: 3.76s |