# Notebook 01: Addestramento e Valutazione dei Modelli

**Scopo:** Questo notebook carica i dati pre-processati dal Notebook 00, definisce le architetture delle reti neurali, orchestra un ciclo di esperimenti per addestrare e valutare diverse combinazioni di modelli e ottimizzatori, e salva gli artefatti migliori per l'analisi successiva.

**Input:**
- Dati pre-processati da `../data/processed/` (`X_train.npy`, `y_train.npy`, etc.)

**Output (salvati in `../models/` e `../reports/`):**
- I modelli migliori per ogni esperimento (es. `UNet_Lite_Adam.keras`).
- Un file di riepilogo con le metriche di performance (es. `training_summary.csv`).
- (Opzionale) Le storie di training salvate.

In [10]:
# ===================================================================
# CELLA 1: SETUP, IMPORTS E CARICAMENTO DATI
# ===================================================================

import os
import numpy as np
import pandas as pd
import pickle
import matplotlib.pyplot as plt
import seaborn as sns
import time
import traceback

import tensorflow as tf
import keras as keras
from keras import layers, models, optimizers, callbacks, regularizers
from keras.utils import to_categorical

# --- Configurazione Globale ---
PROCESSED_DATA_PATH = '../../data/processed/'
MODELS_PATH = '../../models/ale/'
REPORTS_PATH = '../../reports/'
RANDOM_STATE = 42

os.makedirs(MODELS_PATH, exist_ok=True)
os.makedirs(REPORTS_PATH, exist_ok=True)

# 1. GPU e Mixed Precision
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus: tf.config.experimental.set_memory_growth(gpu, True)
        print(f"✅ GPU(s) Trovata/e: {[tf.config.experimental.get_device_details(g)['device_name'] for g in gpus]}")
        policy = keras.mixed_precision.Policy('mixed_float16')
        keras.mixed_precision.set_global_policy(policy)
        print(f"✅ Politica di Mixed Precision impostata su: {keras.mixed_precision.global_policy().name}")
    except RuntimeError as e: print(f"⚠️ Errore durante l'inizializzazione della GPU: {e}")
else: print("❌ NESSUNA GPU TROVATA. L'allenamento sarà su CPU.")

# 2. Caricamento Dati Pre-processati
print("\n🔄 Caricamento dei dati pre-processati...")
try:
    X_train = np.load(os.path.join(PROCESSED_DATA_PATH, 'X_train.npy'))
    y_train = np.load(os.path.join(PROCESSED_DATA_PATH, 'y_train.npy'))
    X_val = np.load(os.path.join(PROCESSED_DATA_PATH, 'X_val.npy'))
    y_val = np.load(os.path.join(PROCESSED_DATA_PATH, 'y_val.npy'))
    X_test = np.load(os.path.join(PROCESSED_DATA_PATH, 'X_test.npy'))
    y_test = np.load(os.path.join(PROCESSED_DATA_PATH, 'y_test.npy'))
    
    with open(os.path.join(PROCESSED_DATA_PATH, 'label_encoder.pkl'), 'rb') as f:
        label_encoder = pickle.load(f)

    # Conversione in formato categorico
    num_classes = len(label_encoder.classes_)
    y_train_cat = to_categorical(y_train, num_classes=num_classes)
    y_val_cat = to_categorical(y_val, num_classes=num_classes)
    y_test_cat = to_categorical(y_test, num_classes=num_classes)
    
    print("\n✅ Dati caricati con successo.")
    print(f"   - Shape X_train: {X_train.shape} | Shape y_train_cat: {y_train_cat.shape}")
    print(f"   - Numero di classi: {num_classes}")
except FileNotFoundError:
    print("❌ ERRORE: File di dati non trovati. Eseguire prima il notebook '00_Setup_and_Data_Preparation.ipynb'.")

✅ GPU(s) Trovata/e: ['NVIDIA GeForce RTX 4070']
✅ Politica di Mixed Precision impostata su: mixed_float16

🔄 Caricamento dei dati pre-processati...

✅ Dati caricati con successo.
   - Shape X_train: (5990, 128, 128, 1) | Shape y_train_cat: (5990, 10)
   - Numero di classi: 10


In [None]:
# ===================================================================
# CELL 2: COMPREHENSIVE MODEL FACTORY FOR COMPARATIVE ANALYSIS
# ===================================================================
# This cell defines all candidate architectures for our final analysis.
# The selection progresses from simple and efficient baselines to more
# complex residual networks, allowing for a thorough evaluation of
# architectural trade-offs. All code is written in English for clarity.

from keras import layers, models, regularizers

class ModelFactory:
    """
    A comprehensive factory for building and comparing a curated set of CNN architectures.
    This class centralizes our key models for the final comparative experiment.
    """
    
    # -------------------------------------------------------------------
    # Helper Building Blocks (Shared across multiple architectures)
    # -------------------------------------------------------------------

    @staticmethod
    def _se_block(input_tensor, ratio=8):
        """
        Squeeze-and-Excitation block. A lightweight channel-wise attention
        mechanism to recalibrate feature maps by modeling interdependencies
        between channels.
        Ref: Hu et al., "Squeeze-and-Excitation Networks" (2018)
        """
        channels = input_tensor.shape[-1]
        # Squeeze: Global information embedding
        se = layers.GlobalAveragePooling2D(name=f'se_squeeze_{input_tensor.name}')(input_tensor)
        se = layers.Reshape((1, 1, channels))(se)
        # Excitation: Adaptive recalibration
        se = layers.Dense(channels // ratio, activation='relu', name=f'se_excite_1_{input_tensor.name}')(se)
        se = layers.Dense(channels, activation='sigmoid', name=f'se_excite_2_{input_tensor.name}')(se)
        return layers.Multiply(name=f'se_scale_{input_tensor.name}')([input_tensor, se])

    # -------------------------------------------------------------------
    # Model 1: Efficient VGG-style Baseline
    # -------------------------------------------------------------------
    @staticmethod
    def build_efficient_vgg(input_shape, num_classes):
        """
        A memory-efficient VGG-style model. Starts with a small number of
        filters to establish a fast, simple, and memory-safe baseline.
        """
        inputs = layers.Input(shape=input_shape)
        x = layers.Conv2D(16, 3, padding='same', use_bias=False)(inputs)
        x = layers.BatchNormalization()(x); x = layers.Activation('relu')(x)
        x = layers.MaxPooling2D(2)(x); x = ModelFactory._se_block(x)
        x = layers.Conv2D(32, 3, padding='same', use_bias=False)(x)
        x = layers.BatchNormalization()(x); x = layers.Activation('relu')(x)
        x = layers.MaxPooling2D(2)(x); x = ModelFactory._se_block(x)
        x = layers.Conv2D(64, 3, padding='same', use_bias=False)(x)
        x = layers.BatchNormalization()(x); x = layers.Activation('relu')(x)
        x = layers.MaxPooling2D(2)(x); x = ModelFactory._se_block(x)
        x = layers.GlobalAveragePooling2D()(x)
        x = layers.Dense(128, activation='relu')(x)
        x = layers.Dropout(0.5)(x)
        outputs = layers.Dense(num_classes, activation='softmax', dtype='float32')(x)
        return models.Model(inputs=inputs, outputs=outputs, name='Efficient_VGG')

    # -------------------------------------------------------------------
    # Model 2: Paper-Inspired Multi-Scale Model
    # -------------------------------------------------------------------
    @staticmethod
    def build_paper_cnn_lite(input_shape, num_classes):
        """
        A memory-optimized interpretation of the paper's multi-scale feature
        aggregation concept, using PReLU activation as specified.
        """
        inputs = layers.Input(shape=input_shape)
        x1 = layers.Conv2D(16, 3, padding='same', use_bias=False)(inputs)
        x1 = layers.BatchNormalization()(x1); x1 = layers.PReLU(shared_axes=[1, 2])(x1)
        p1 = layers.MaxPooling2D(2)(x1)
        x2 = layers.Conv2D(32, 3, padding='same', use_bias=False)(p1)
        x2 = layers.BatchNormalization()(x2); x2 = layers.PReLU(shared_axes=[1, 2])(x2)
        p2 = layers.MaxPooling2D(2)(x2)
        x3 = layers.Conv2D(64, 3, padding='same', use_bias=False)(p2)
        x3 = layers.BatchNormalization()(x3); x3 = layers.PReLU(shared_axes=[1, 2])(x3)
        gap1 = layers.GlobalAveragePooling2D()(x1)
        gap2 = layers.GlobalAveragePooling2D()(x2)
        gap3 = layers.GlobalAveragePooling2D()(x3)
        merged_features = layers.Concatenate()([gap1, gap2, gap3])
        x = layers.Dense(128, activation='relu')(merged_features)
        x = layers.Dropout(0.5)(x)
        outputs = layers.Dense(num_classes, activation='softmax', dtype='float32')(x)
        return models.Model(inputs=inputs, outputs=outputs, name='PaperCNN_Lite')

    # -------------------------------------------------------------------
    # Model 3: VGG with Attention
    # -------------------------------------------------------------------
    @staticmethod
    def build_se_audio_cnn(input_shape, num_classes):
        """
        A standard VGG-style architecture enhanced with SE blocks. Tests
        the impact of attention on a conventional, non-residual backbone.
        """
        inputs = layers.Input(shape=input_shape)
        x = layers.Conv2D(32, 3, padding='same', use_bias=False)(inputs)
        x = layers.BatchNormalization()(x); x = layers.Activation('relu')(x)
        x = layers.MaxPooling2D(2)(x); x = ModelFactory._se_block(x)
        x = layers.Conv2D(64, 3, padding='same', use_bias=False)(x)
        x = layers.BatchNormalization()(x); x = layers.Activation('relu')(x)
        x = layers.MaxPooling2D(2)(x); x = ModelFactory._se_block(x)
        x = layers.Conv2D(128, 3, padding='same', use_bias=False)(x)
        x = layers.BatchNormalization()(x); x = layers.Activation('relu')(x)
        x = layers.MaxPooling2D(2)(x); x = ModelFactory._se_block(x)
        x = layers.GlobalAveragePooling2D()(x)
        x = layers.Dense(128, activation='relu')(x)
        x = layers.Dropout(0.5)(x)
        outputs = layers.Dense(num_classes, activation='softmax', dtype='float32')(x)
        return models.Model(inputs=inputs, outputs=outputs, name='SE_AudioCNN')

    # -------------------------------------------------------------------
    # Model 4: Residual Network with Separable Convolutions
    # -------------------------------------------------------------------
    @staticmethod
    def _separable_res_se_block(input_tensor, filters, stride=1):
        """Residual block using depthwise separable convolutions for efficiency."""
        shortcut = input_tensor
        x = layers.SeparableConv2D(filters, 3, strides=stride, padding='same', use_bias=False)(input_tensor)
        x = layers.BatchNormalization()(x); x = layers.PReLU(shared_axes=[1, 2])(x)
        x = layers.SeparableConv2D(filters, 3, padding='same', use_bias=False)(x)
        x = layers.BatchNormalization()(x); x = ModelFactory._se_block(x)
        if stride > 1 or shortcut.shape[-1] != filters:
            shortcut = layers.Conv2D(filters, 1, strides=stride, use_bias=False)(shortcut)
            shortcut = layers.BatchNormalization()(shortcut)
        x = layers.Add()([shortcut, x]); x = layers.PReLU(shared_axes=[1, 2])(x)
        return x

    @staticmethod
    def build_separable_res_se_cnn(input_shape, num_classes):
        """A parametrically efficient ResNet-style model."""
        inputs = layers.Input(shape=input_shape)
        x = layers.SeparableConv2D(32, 3, strides=1, padding='same', use_bias=False)(inputs)
        x = layers.BatchNormalization()(x); x = layers.PReLU(shared_axes=[1, 2])(x)
        x = ModelFactory._separable_res_se_block(x, 64, stride=2)
        x = ModelFactory._separable_res_se_block(x, 128, stride=2)
        x = ModelFactory._separable_res_se_block(x, 256, stride=2)
        x = layers.GlobalAveragePooling2D()(x)
        x = layers.Dropout(0.5)(x)
        outputs = layers.Dense(num_classes, activation='softmax', dtype='float32')(x)
        return models.Model(inputs=inputs, outputs=outputs, name='SeparableResSE_CNN')

    # -------------------------------------------------------------------
    # Model 5: Residual Network with Standard Convolutions
    # -------------------------------------------------------------------
    @staticmethod
    def _res_se_block(input_tensor, filters, stride=1):
        """Residual block using standard convolutions."""
        shortcut = input_tensor
        x = layers.Conv2D(filters, 3, strides=stride, padding='same', use_bias=False)(input_tensor)
        x = layers.BatchNormalization()(x); x = layers.PReLU(shared_axes=[1, 2])(x)
        x = layers.Conv2D(filters, 3, padding='same', use_bias=False)(x)
        x = layers.BatchNormalization()(x); x = ModelFactory._se_block(x)
        if stride > 1 or shortcut.shape[-1] != filters:
            shortcut = layers.Conv2D(filters, 1, strides=stride, use_bias=False)(shortcut)
            shortcut = layers.BatchNormalization()(shortcut)
        x = layers.Add()([shortcut, x]); x = layers.PReLU(shared_axes=[1, 2])(x)
        return x

    @staticmethod
    def build_res_se_audio_cnn(input_shape, num_classes):
        """
        Our most powerful stable architecture, combining ResNet principles
        with SE attention and standard convolutions.
        """
        inputs = layers.Input(shape=input_shape)
        x = layers.Conv2D(32, 3, strides=1, padding='same', use_bias=False)(inputs)
        x = layers.BatchNormalization()(x); x = layers.PReLU(shared_axes=[1, 2])(x)
        x = ModelFactory._res_se_block(x, 64, stride=2)
        x = ModelFactory._res_se_block(x, 128, stride=2)
        x = ModelFactory._res_se_block(x, 256, stride=2)
        x = layers.GlobalAveragePooling2D()(x)
        x = layers.Dropout(0.5)(x)
        outputs = layers.Dense(num_classes, activation='softmax', dtype='float32')(x)
        return models.Model(inputs=inputs, outputs=outputs, name='ResSE_AudioCNN')

print("✅ ModelFactory defined with 5 candidate architectures for the final comparative analysis.")

✅ ModelFactory defined with 5 candidate architectures for the final comparative analysis.


In [12]:
# ===================================================================
# CELL 3: DEFINITIVE COMPARATIVE ANALYSIS FRAMEWORK
# ===================================================================
# This cell orchestrates the final comparative analysis. It loads the
# pre-processed data, defines robust data pipelines with corrected
# augmentation, and systematically trains and evaluates all candidate
# architectures using a professional custom logger for clear progress tracking.

import os
import pandas as pd
import traceback
import numpy as np
import pickle
import time
import tensorflow as tf
import keras
from keras import optimizers, callbacks
from keras.utils import to_categorical

# -------------------------------------------------------------------
# 0. DATA LOADING AND PREPARATION
# -------------------------------------------------------------------
print("Loading pre-processed data...")
try:
    X_train = np.load(os.path.join(PROCESSED_DATA_PATH, 'X_train.npy'))
    y_train = np.load(os.path.join(PROCESSED_DATA_PATH, 'y_train.npy'))
    X_val = np.load(os.path.join(PROCESSED_DATA_PATH, 'X_val.npy'))
    y_val = np.load(os.path.join(PROCESSED_DATA_PATH, 'y_val.npy'))
    X_test = np.load(os.path.join(PROCESSED_DATA_PATH, 'X_test.npy'))
    y_test = np.load(os.path.join(PROCESSED_DATA_PATH, 'y_test.npy'))
    with open(os.path.join(PROCESSED_DATA_PATH, 'label_encoder.pkl'), 'rb') as f:
        label_encoder = pickle.load(f)

    y_train_cat = to_categorical(y_train)
    y_val_cat = to_categorical(y_val)
    y_test_cat = to_categorical(y_test)
    
    print("✅ Data successfully loaded and prepared.")
except FileNotFoundError:
    raise RuntimeError("ERROR: Data files not found. Please run the '00_Data_Preprocessing' notebook first.")

# -------------------------------------------------------------------
# 1. DATA AUGMENTATION & CUSTOM CALLBACK
# -------------------------------------------------------------------
@tf.function
def spec_augment_tf(spectrogram, label):
    """
    Applies frequency and time masking to a spectrogram.
    *** BUG FIX: Corrected variable names from 'mask_values' to the
    *** correctly scoped 'mask_freq_values' and 'mask_time_values'.
    """
    aug_spec = tf.identity(spectrogram)
    
    # Frequency Masking
    freq_bins = tf.shape(aug_spec)[0]
    f_param = tf.cast(tf.cast(freq_bins, tf.float32) * 0.2, tf.int32)
    if f_param > 1:
        f = tf.random.uniform(shape=(), minval=1, maxval=f_param, dtype=tf.int32)
        f0 = tf.random.uniform(shape=(), minval=0, maxval=freq_bins - f, dtype=tf.int32)
        mask_freq_values = tf.concat([tf.ones((f0,)), tf.zeros((f,)), tf.ones((freq_bins - f0 - f,))], axis=0)
        freq_mask = tf.reshape(tf.cast(mask_freq_values, aug_spec.dtype), (freq_bins, 1, 1))
        aug_spec *= freq_mask

    # Time Masking
    time_steps = tf.shape(aug_spec)[1]
    t_param = tf.cast(tf.cast(time_steps, tf.float32) * 0.2, tf.int32)
    if t_param > 1:
        t = tf.random.uniform(shape=(), minval=1, maxval=t_param, dtype=tf.int32)
        t0 = tf.random.uniform(shape=(), minval=0, maxval=time_steps - t, dtype=tf.int32)
        mask_time_values = tf.concat([tf.ones((t0,)), tf.zeros((t,)), tf.ones((time_steps - t0 - t,))], axis=0)
        time_mask = tf.reshape(tf.cast(mask_time_values, aug_spec.dtype), (1, time_steps, 1))
        aug_spec *= time_mask
        
    return aug_spec, label

class RichLoggerCallback(callbacks.Callback):
    """A custom Keras callback for clean, informative, and professional logging."""
    def __init__(self, total_epochs):
        super().__init__()
        self.total_epochs = total_epochs
        self.best_val_accuracy = 0
        self.epoch_start_time = 0

    def on_train_begin(self, logs=None):
        print(f"🚀 Starting training for model: {self.model.name}...")

    def on_epoch_end(self, epoch, logs=None):
        epoch_time = time.time() - self.epoch_start_time
        lr = self.model.optimizer.learning_rate
        
        # Handle potential learning rate schedules
        if isinstance(lr, tf.keras.optimizers.schedules.LearningRateSchedule):
            lr = lr(self.model.optimizer.iterations)

        # *** BUG FIX: Convert the learning rate variable to a numpy value before formatting.
        lr_value = lr.numpy() if hasattr(lr, 'numpy') else lr

        is_best = ""
        if logs['val_accuracy'] > self.best_val_accuracy:
            self.best_val_accuracy = logs['val_accuracy']
            is_best = " ✅"

        log_str = (
            f"Epoch {epoch + 1:02d}/{self.total_epochs} | "
            f"Time: {epoch_time:.2f}s | "
            f"Loss: {logs['loss']:.4f} | Acc: {logs['accuracy']:.4f} | "
            f"Val Loss: {logs['val_loss']:.4f} | Val Acc: {logs['val_accuracy']:.4f} | "
            f"LR: {lr_value:.1e}{is_best}"
        )
        print(log_str)

    def on_train_end(self, logs=None):
        print(f"🏁 Finished training. Best Validation Accuracy: {self.best_val_accuracy:.4f}")

# -------------------------------------------------------------------
# 2. EXPERIMENT ORCHESTRATION CLASS
# -------------------------------------------------------------------
class ModelEvaluator:
    """Orchestrates the training and evaluation of multiple models."""
    def __init__(self):
        self.results = []

    def run_experiments(self, model_factories, train_data, val_data, test_data, epochs):
        for model_name, model_factory_fn in model_factories.items():
            print(f"\n{'='*80}\nTRAINING ARCHITECTURE: '{model_name}'\n{'='*80}")
            try:
                model = model_factory_fn()
                optimizer = keras.optimizers.Adam(learning_rate=1e-3)
                model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
                
                callbacks_list = [
                    RichLoggerCallback(total_epochs=epochs),
                    callbacks.EarlyStopping(monitor='val_accuracy', patience=15, restore_best_weights=True, verbose=0),
                    callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, verbose=1),
                    callbacks.ModelCheckpoint(os.path.join(MODELS_PATH, f"{model_name}_best.keras"), 
                                              monitor='val_accuracy', save_best_only=True, verbose=0)
                ]
                
                history = model.fit(train_data, epochs=epochs, validation_data=val_data, callbacks=callbacks_list, verbose=0)
                
                test_loss, test_acc = model.evaluate(test_data, verbose=0)
                self.results.append({
                    'Model': model_name,
                    'Test_Accuracy': test_acc,
                    'Best_Val_Accuracy': max(history.history['val_accuracy']),
                    'Epochs_Run': len(history.history['val_accuracy']),
                })
            except Exception:
                print(f"❌ ERROR during training of [{model_name}]:")
                traceback.print_exc()
        return pd.DataFrame(self.results)

# -------------------------------------------------------------------
# 3. EXPERIMENT CONFIGURATION & EXECUTION
# -------------------------------------------------------------------
AUTOTUNE = tf.data.AUTOTUNE
BATCH_SIZE = 64
EPOCHS = 50

keras.mixed_precision.set_global_policy('float32')

# Data Pipelines (without .cache() to ensure no OOM errors)
print("\nConfiguring JIT data pipelines (without caching)...")
train_pipeline = (tf.data.Dataset.from_tensor_slices((X_train, y_train_cat)).shuffle(len(X_train))
                  .map(spec_augment_tf, num_parallel_calls=AUTOTUNE).batch(BATCH_SIZE).prefetch(AUTOTUNE))
val_pipeline = (tf.data.Dataset.from_tensor_slices((X_val, y_val_cat)).batch(BATCH_SIZE).prefetch(AUTOTUNE))
test_pipeline = (tf.data.Dataset.from_tensor_slices((X_test, y_test_cat)).batch(BATCH_SIZE).prefetch(AUTOTUNE))

# Define the models for the final comparative analysis
input_shape = X_train.shape[1:]
num_classes = y_train_cat.shape[1]
model_factories = {
    'Efficient_VGG': lambda: ModelFactory.build_efficient_vgg(input_shape, num_classes),
    'PaperCNN_Lite': lambda: ModelFactory.build_paper_cnn_lite(input_shape, num_classes),
    'SE_AudioCNN': lambda: ModelFactory.build_se_audio_cnn(input_shape, num_classes),
    'SeparableResSE_CNN': lambda: ModelFactory.build_separable_res_se_cnn(input_shape, num_classes),
    'ResSE_AudioCNN': lambda: ModelFactory.build_res_se_audio_cnn(input_shape, num_classes),
}

# Execute the comparative analysis
evaluator = ModelEvaluator()
results_df = evaluator.run_experiments(
    model_factories, train_pipeline, val_pipeline, test_pipeline, EPOCHS
)

# -------------------------------------------------------------------
# 4. REPORTING
# -------------------------------------------------------------------
if not results_df.empty:
    results_df.to_csv(os.path.join(REPORTS_PATH, 'training_summary_FINAL_COMPARISON.csv'), index=False)
    print("\n🎉 FINAL COMPARATIVE ANALYSIS COMPLETED 🎉")
    print("\nFinal Leaderboard:")
    print(results_df.sort_values(by='Best_Val_Accuracy', ascending=False).to_markdown(index=False))

Loading pre-processed data...
✅ Data successfully loaded and prepared.

Configuring JIT data pipelines (without caching)...

TRAINING ARCHITECTURE: 'Efficient_VGG'
🚀 Starting training for model: Efficient_VGG...
Epoch 01/50 | Time: 1753515507.21s | Loss: 1.9346 | Acc: 0.2965 | Val Loss: 2.2386 | Val Acc: 0.1785 | LR: 1.0e-03 ✅
Epoch 02/50 | Time: 1753515508.27s | Loss: 1.5980 | Acc: 0.4020 | Val Loss: 2.2621 | Val Acc: 0.1930 | LR: 1.0e-03 ✅
Epoch 03/50 | Time: 1753515509.31s | Loss: 1.4283 | Acc: 0.4910 | Val Loss: 2.6916 | Val Acc: 0.1785 | LR: 1.0e-03
Epoch 04/50 | Time: 1753515510.31s | Loss: 1.3383 | Acc: 0.5279 | Val Loss: 2.0470 | Val Acc: 0.2690 | LR: 1.0e-03 ✅
Epoch 05/50 | Time: 1753515511.36s | Loss: 1.2341 | Acc: 0.5644 | Val Loss: 1.9110 | Val Acc: 0.3975 | LR: 1.0e-03 ✅
Epoch 06/50 | Time: 1753515512.46s | Loss: 1.2089 | Acc: 0.5669 | Val Loss: 1.4708 | Val Acc: 0.4855 | LR: 1.0e-03 ✅
Epoch 07/50 | Time: 1753515513.54s | Loss: 1.1083 | Acc: 0.6000 | Val Loss: 1.3865 | Val











Epoch 01/50 | Time: 1753515633.48s | Loss: 1.8611 | Acc: 0.3244 | Val Loss: 2.2303 | Val Acc: 0.1495 | LR: 1.0e-03 ✅
Epoch 02/50 | Time: 1753515635.53s | Loss: 1.5200 | Acc: 0.4518 | Val Loss: 2.3377 | Val Acc: 0.2140 | LR: 1.0e-03 ✅
Epoch 03/50 | Time: 1753515637.49s | Loss: 1.3419 | Acc: 0.5304 | Val Loss: 2.2469 | Val Acc: 0.2480 | LR: 1.0e-03 ✅
Epoch 04/50 | Time: 1753515639.46s | Loss: 1.1928 | Acc: 0.5793 | Val Loss: 2.1272 | Val Acc: 0.3125 | LR: 1.0e-03 ✅
Epoch 05/50 | Time: 1753515641.43s | Loss: 1.1110 | Acc: 0.6147 | Val Loss: 1.8609 | Val Acc: 0.3845 | LR: 1.0e-03 ✅
Epoch 06/50 | Time: 1753515643.41s | Loss: 1.0522 | Acc: 0.6352 | Val Loss: 1.8341 | Val Acc: 0.4095 | LR: 1.0e-03 ✅
Epoch 07/50 | Time: 1753515645.47s | Loss: 1.0046 | Acc: 0.6528 | Val Loss: 1.4228 | Val Acc: 0.5280 | LR: 1.0e-03 ✅
Epoch 08/50 | Time: 1753515647.57s | Loss: 0.9531 | Acc: 0.6698 | Val Loss: 1.2816 | Val Acc: 0.5780 | LR: 1.0e-03 ✅
Epoch 09/50 | Time: 1753515649.53s | Loss: 0.9226 | Acc: 0.6845 




Epoch 01/50 | Time: 1753515766.52s | Loss: 1.8437 | Acc: 0.3301 | Val Loss: 2.3012 | Val Acc: 0.1260 | LR: 1.0e-03 ✅
Epoch 02/50 | Time: 1753515769.75s | Loss: 1.3583 | Acc: 0.5085 | Val Loss: 2.3672 | Val Acc: 0.1205 | LR: 1.0e-03
Epoch 03/50 | Time: 1753515772.74s | Loss: 1.1632 | Acc: 0.5880 | Val Loss: 2.6775 | Val Acc: 0.1115 | LR: 1.0e-03
Epoch 04/50 | Time: 1753515775.80s | Loss: 1.0355 | Acc: 0.6367 | Val Loss: 3.3661 | Val Acc: 0.1940 | LR: 1.0e-03 ✅
Epoch 05/50 | Time: 1753515779.10s | Loss: 0.9242 | Acc: 0.6731 | Val Loss: 2.4726 | Val Acc: 0.3220 | LR: 1.0e-03 ✅
Epoch 06/50 | Time: 1753515782.28s | Loss: 0.8303 | Acc: 0.7073 | Val Loss: 1.8466 | Val Acc: 0.4620 | LR: 1.0e-03 ✅
Epoch 07/50 | Time: 1753515785.53s | Loss: 0.7516 | Acc: 0.7409 | Val Loss: 1.7295 | Val Acc: 0.5090 | LR: 1.0e-03 ✅
Epoch 08/50 | Time: 1753515788.84s | Loss: 0.6754 | Acc: 0.7663 | Val Loss: 1.1834 | Val Acc: 0.6200 | LR: 1.0e-03 ✅
Epoch 09/50 | Time: 1753515792.01s | Loss: 0.6066 | Acc: 0.7940 | Va