# 🌽 MobileNetV3 Training - ULTIMATE V3 (BATCH 64 + MIXED PRECISION)

**Goal: >85% accuracy and >80% recall**

## 🎯 Improvements in V3:
1. ✅ **Batch size 64** (chosen for the A100)
2. ✅ **Mixed precision training** (roughly 2x faster expected)
3. ✅ **80 epochs** (validated in V2)
4. ✅ **No fine-tuning** (avoids the collapse seen in V1)
5. ✅ 384→192 head architecture (validated in V2)
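
To give a sense of how small the 384→192 head is, the sketch below counts its trainable parameters. It assumes the frozen MobileNetV3-Large backbone yields 960 pooled features (its usual width at the default `alpha=1.0`) and uses 4 output classes purely as an illustrative value; the real class count comes from the project's `config`.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def head_params(n_features: int, n_classes: int) -> int:
    """Trainable parameters of Dense(384) -> Dense(192) -> Dense(n_classes)."""
    return (
        dense_params(n_features, 384)
        + dense_params(384, 192)
        + dense_params(192, n_classes)
    )


# Assumptions: 960 backbone features, 4 classes (illustrative only).
print(head_params(960, 4))  # 443716
```

With the backbone frozen, this head is the only part of the network that is trained.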

## 📊 Previous results:
- **V1 (60 epochs, batch 32):** 83.81% accuracy, then a collapse to 58% after fine-tuning
- **V2 (80 epochs, batch 32):** 84.53% accuracy ✅ recall >80% in ALL classes
- **V3 (80 epochs, batch 64 + FP16):** target >85% in ~45-50 min

---

## 🔧 BLOCK 1: Setup and Verification

In [None]:
# 1.1 Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

# 1.2 Clone the repository
!git clone -b main https://github.com/ojgonzalezz/corn-diseases-detection.git
%cd corn-diseases-detection/entrenamiento_modelos

# 1.3 Install dependencies
!pip install -q -r requirements.txt

# 1.4 Create the required directories on Drive
!mkdir -p /content/drive/MyDrive/corn-diseases-detection/models
!mkdir -p /content/drive/MyDrive/corn-diseases-detection/logs
!mkdir -p /content/drive/MyDrive/corn-diseases-detection/mlruns

print("\n✅ Setup complete!")
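
The block title mentions verification, so a small check that the Drive directories actually exist can catch a failed mount early. The helper below is only a sketch; `missing_dirs` is a hypothetical name that is not part of the repository.

```python
from pathlib import Path


def missing_dirs(base, names=("models", "logs", "mlruns")):
    """Return the subdirectories of `base` that do not exist yet."""
    base = Path(base)
    return [name for name in names if not (base / name).is_dir()]


# In Colab the base would be the project folder created in step 1.4:
# missing = missing_dirs("/content/drive/MyDrive/corn-diseases-detection")
# assert not missing, f"Missing directories: {missing}"
```

An empty list means all three output folders are in place before training starts.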

## ⚡ BLOCK 2: Enable Mixed Precision (A100-optimized)

In [None]:
import tensorflow as tf
from tensorflow.keras import mixed_precision

# Enable mixed precision (float16 compute, float32 variables) so the A100 can use its Tensor Cores
policy = mixed_precision.Policy('mixed_float16')
mixed_precision.set_global_policy(policy)

print(f"\n{'='*60}")
print("⚡ MIXED PRECISION ENABLED")
print(f"{'='*60}")
print(f"Compute dtype: {policy.compute_dtype}")
print(f"Variable dtype: {policy.variable_dtype}")
print(f"\n✅ A100 Tensor Cores can now be used (expected speedup ~2x)")
print(f"✅ Expected accuracy impact: negligible (<0.1%)")
print(f"{'='*60}\n")
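
Under `mixed_float16`, very small gradient values can fall below what float16 can represent, which is why loss scaling is normally paired with this policy. The sketch below illustrates the effect using only the standard library; the scale `2**15` and the helper name `to_fp16` are illustrative choices, not values read from this notebook.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("e", struct.pack("e", x))[0]


tiny_grad = 1e-8    # below the smallest positive float16 value (~6e-8)
scale = 2.0 ** 15   # illustrative loss scale

unscaled = to_fp16(tiny_grad)                    # underflows to zero
recovered = to_fp16(tiny_grad * scale) / scale   # scale up, store, scale back

print(unscaled)
print(recovered)
```

When a model is compiled and trained with `model.fit` under this policy, Keras handles the scaling itself, so no manual step like this is needed in the notebook.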

## 🏗️ BLOCK 3: Configuration and Model

In [None]:
import os
import time
import numpy as np
from tensorflow.keras.applications import MobileNetV3Large
from tensorflow.keras.models import Model
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.optimizers.schedules import CosineDecay
from sklearn.utils.class_weight import compute_class_weight

# Import the base configuration
from config import *
from utils import setup_gpu

# ==================== OPTIMIZED V3 CONFIGURATION ====================
BATCH_SIZE = 64  # Doubled from 32 to 64 for the A100
EPOCHS = 80  # Validated in V2
LEARNING_RATE = 0.001  # Initial LR
EARLY_STOPPING_PATIENCE = 25  # Patience for an 80-epoch run

# Configure the GPU
setup_gpu(GPU_MEMORY_LIMIT)

print(f"\n{'='*60}")
print("🚀 ULTIMATE V3 CONFIGURATION")
print(f"{'='*60}")
print(f"Batch Size: {BATCH_SIZE} (2x vs V2)")
print(f"Epochs: {EPOCHS}")
print(f"Learning Rate: {LEARNING_RATE} (Cosine Decay)")
print(f"Mixed Precision: ENABLED (FP16)")
print(f"Fine-tuning: DISABLED")
print(f"\nEstimated time: 45-50 min (vs 146 min in V2)")
print(f"Expected accuracy: >85%")
print(f"{'='*60}\n")
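
The learning rate follows a cosine decay from `LEARNING_RATE` down to a fraction `alpha` of it. Below is a plain-Python sketch of that schedule, using the formula documented for Keras `CosineDecay` (without warmup) and the `alpha=0.1` that the model cell sets; the step counts are made up for illustration.

```python
import math


def cosine_decay_lr(step, initial_lr, decay_steps, alpha=0.0):
    """Learning rate at `step` for a cosine decay without warmup."""
    step = min(step, decay_steps)  # the rate stays at its floor afterwards
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / decay_steps))
    return initial_lr * ((1.0 - alpha) * cosine + alpha)


# Illustrative numbers: 161 steps per epoch for 80 epochs.
total_steps = 161 * 80
for step in (0, total_steps // 2, total_steps):
    print(step, round(cosine_decay_lr(step, 0.001, total_steps, alpha=0.1), 6))
```

The rate starts at 0.001, passes 0.00055 at the midpoint, and ends at 0.0001, which is 10% of the initial value.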

In [None]:
# Create the data generators with BATCH SIZE 64
from tensorflow.keras.preprocessing.image import ImageDataGenerator

print("Creating data generators (batch 64)...\n")

# Rescale only (augmentation was already applied during preprocessing).
# NOTE: Keras MobileNetV3 applies its own input rescaling by default and expects
# pixels in [0, 255]; confirm that rescale=1./255 is intended here.
train_datagen = ImageDataGenerator(
    rescale=1./255,
    validation_split=VAL_SPLIT + TEST_SPLIT
)

val_datagen = ImageDataGenerator(
    rescale=1./255,
    validation_split=VAL_SPLIT + TEST_SPLIT
)

train_gen = train_datagen.flow_from_directory(
    DATA_DIR,
    target_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    subset='training',
    shuffle=True,
    seed=RANDOM_SEED
)

val_gen = val_datagen.flow_from_directory(
    DATA_DIR,
    target_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    subset='validation',
    shuffle=False,
    seed=RANDOM_SEED
)

# NOTE: validation_split yields only two subsets, so test_gen reads the same
# 'validation' images as val_gen; it is not an independent held-out set.
test_gen = val_datagen.flow_from_directory(
    DATA_DIR,
    target_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    subset='validation',
    shuffle=False,
    seed=RANDOM_SEED
)

# len(generator) counts batches, including the final partial batch
print(f"📊 Dataset:")
print(f"  Training:   {train_gen.samples} images ({len(train_gen)} batches)")
print(f"  Validation: {val_gen.samples} images ({len(val_gen)} batches)")
print(f"  Test:       {test_gen.samples} images ({len(test_gen)} batches)")

# Steps per epoch at batch 32 (V2) relative to the current batch size
steps_b32 = int(np.ceil(train_gen.samples / 32))
print(f"\n⚡ Batch size {BATCH_SIZE} = {steps_b32 / len(train_gen):.1f}x fewer steps per epoch than batch 32")

# Compute class weights
class_weights = compute_class_weight(
    class_weight='balanced',
    classes=np.unique(train_gen.classes),
    y=train_gen.classes
)
class_weight_dict = dict(enumerate(class_weights))
print(f"\n⚖️ Class weights: {class_weight_dict}")
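
`ImageDataGenerator` exposes only a `training` and a `validation` subset, so `val_gen` and `test_gen` above read the same images. One way to obtain three disjoint subsets is to split the file list up front. The sketch below does this with the standard library; `three_way_split`, the file names, and the example fractions are hypothetical and not part of the repository.

```python
import random


def three_way_split(items, val_frac, test_frac, seed=42):
    """Shuffle `items` deterministically and cut them into train/val/test."""
    items = sorted(items)                  # stable order before shuffling
    random.Random(seed).shuffle(items)
    n_val = round(len(items) * val_frac)
    n_test = round(len(items) * test_frac)
    val = items[:n_val]
    test = items[n_val:n_val + n_test]
    train = items[n_val + n_test:]
    return train, val, test


# Hypothetical file list standing in for one class folder.
files = [f"img_{i:04d}.jpg" for i in range(1000)]
train, val, test = three_way_split(files, val_frac=0.15, test_frac=0.15)
print(len(train), len(val), len(test))
```

Applying the split separately to each class folder keeps class proportions similar across the three subsets. The resulting lists could then be copied into separate directories or passed to `flow_from_dataframe`.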

In [None]:
# Create the ULTIMATE V3 model (set up for FP16)
def create_ultimate_v3_model(num_classes, image_size, initial_learning_rate, steps_per_epoch, total_epochs):
    """
    ULTIMATE V3 architecture - Batch 64 + Mixed Precision

    Design choices:
    - Dense(384) → Dense(192): validated in V2 (84.53%)
    - Dropout(0.4, 0.35): regularization
    - Compatible with mixed precision (FP16)
    - Batch 64 for lower-variance gradient estimates
    """

    # Load the pretrained base
    base_model = MobileNetV3Large(
        input_shape=(*image_size, 3),
        include_top=False,
        weights='imagenet'
    )

    # Freeze ALL base layers (NO fine-tuning)
    base_model.trainable = False

    # 384 → 192 ARCHITECTURE
    inputs = tf.keras.Input(shape=(*image_size, 3))
    x = base_model(inputs, training=False)  # keep BatchNorm in inference mode
    x = GlobalAveragePooling2D()(x)

    # First dense layer: 384 units
    x = Dense(384, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.001))(x)
    x = Dropout(0.4)(x)

    # Second dense layer: 192 units
    x = Dense(192, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.001))(x)
    x = Dropout(0.35)(x)

    # Output layer (FP32 for numerical stability)
    outputs = Dense(num_classes, activation='softmax', dtype='float32')(x)

    model = Model(inputs, outputs)

    # Cosine decay spanning the full 80-epoch run
    lr_schedule = CosineDecay(
        initial_learning_rate=initial_learning_rate,
        decay_steps=steps_per_epoch * total_epochs,
        alpha=0.1  # Final LR = 10% of the initial LR
    )

    # Compile (under mixed_float16 Keras keeps variables in FP32 and applies
    # loss scaling to the optimizer automatically)
    model.compile(
        optimizer=Adam(learning_rate=lr_schedule),
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )

    return model

# Create the model
print("\n🏗️ Creating the ULTIMATE V3 model...\n")
# Batches per epoch as used by fit(), including the final partial batch
steps_per_epoch = len(train_gen)

model = create_ultimate_v3_model(
    num_classes=NUM_CLASSES,
    image_size=IMAGE_SIZE,
    initial_learning_rate=LEARNING_RATE,
    steps_per_epoch=steps_per_epoch,
    total_epochs=EPOCHS
)

print(f"📐 Total parameters: {model.count_params():,}")
trainable_params = sum([tf.size(w).numpy() for w in model.trainable_weights])
print(f"📐 Trainable parameters: {trainable_params:,}")
print(f"📐 Data/params ratio: {train_gen.samples / trainable_params:.2f}")
print(f"\n⚡ Mixed precision: {policy.compute_dtype} compute, {policy.variable_dtype} variables")
print("\n✅ ULTIMATE V3 model created (Batch 64 + FP16)!")
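
One detail worth checking in this setup: Keras's MobileNetV3 models include an input rescaling layer by default (`include_preprocessing=True`) and are documented to expect pixel values in [0, 255], while the generators in this notebook already divide by 255. The sketch below shows the value range the backbone would then see, assuming the built-in layer computes `x / 127.5 - 1`.

```python
def builtin_rescale(x: float) -> float:
    """Rescaling assumed to be applied inside MobileNetV3: x / 127.5 - 1."""
    return x / 127.5 - 1.0


# Raw pixels in [0, 255] map onto the full [-1, 1] range.
raw_range = (builtin_rescale(0.0), builtin_rescale(255.0))

# Pixels already divided by 255 collapse into a narrow band near -1.
pre_scaled_range = (builtin_rescale(0.0 / 255), builtin_rescale(255.0 / 255))

print(raw_range)
print(pre_scaled_range)
```

If that is what happens here, the usual remedies are to drop `rescale=1./255` from the generators, or to pass `include_preprocessing=False` and scale inputs to [-1, 1]. Either change would alter the comparison with V2, so it is best evaluated as a separate experiment.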

## 🚀 BLOCK 4: Training (80 epochs)

In [None]:
# Callbacks: early stopping and best-model checkpointing on val_accuracy
callbacks = [
    EarlyStopping(
        monitor='val_accuracy',
        patience=EARLY_STOPPING_PATIENCE,
        restore_best_weights=True,
        verbose=1,
        mode='max'
    ),
    ModelCheckpoint(
        str(MODELS_DIR / 'mobilenetv3_ultimate_v3_best.keras'),
        monitor='val_accuracy',
        save_best_only=True,
        verbose=1,
        mode='max'
    )
]

print(f"\n{'='*60}")
print("🚀 STARTING ULTIMATE V3 TRAINING")
print(f"{'='*60}\n")
print("🎯 Goal: >85% accuracy, >80% recall")
print("⚡ Optimizations:")
print(f"   • Batch size 64 (lower-variance gradient estimates)")
print(f"   • Mixed precision FP16 (~2x speed expected)")
print(f"   • 384→192 architecture (validated in V2)")
print(f"   • No fine-tuning (avoids the V1 collapse)")
print(f"\n📊 Previous results:")
print(f"   • V2 (batch 32): 84.53% in 146 min")
print(f"   • V3 (batch 64 + FP16): expected >85% in 45-50 min")
print(f"\n{'='*60}\n")

start_time = time.time()

history = model.fit(
    train_gen,
    epochs=EPOCHS,
    validation_data=val_gen,
    callbacks=callbacks,
    class_weight=class_weight_dict,
    verbose=1
)

training_time = time.time() - start_time
best_val_acc = max(history.history['val_accuracy'])
best_epoch = history.history['val_accuracy'].index(best_val_acc) + 1

print(f"\n{'='*60}")
print("✅ V3 TRAINING COMPLETE")
print(f"{'='*60}")
print(f"⏱️  Time: {training_time/60:.2f} minutes")
# V2 took 146.52 min; training_time is in seconds
print(f"⚡ Speedup vs V2: {146.52/(training_time/60):.2f}x faster")
print(f"📊 Best val accuracy: {best_val_acc:.4f} ({best_val_acc*100:.2f}%) at epoch {best_epoch}")
print(f"📊 Final train accuracy: {history.history['accuracy'][-1]:.4f}")

if best_val_acc >= 0.85:
    print(f"\n🎉 GOAL REACHED! (>=85%)")
    improvement = (best_val_acc - 0.8453) * 100
    print(f"📈 Improvement vs V2: +{improvement:.2f} percentage points")
else:
    gap = (0.85 - best_val_acc) * 100
    print(f"\n⚠️  Fell {gap:.2f} percentage points short of 85%")

print(f"{'='*60}\n")
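
The `class_weight` argument passed to `fit` scales each sample's loss by the weight of its class. scikit-learn's `'balanced'` mode computes these weights as `n_samples / (n_classes * count_per_class)`. Here is a standard-library sketch of that formula with made-up label counts:

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weights n_samples / (n_classes * class_count), keyed by class label."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in sorted(counts.items())}


# Hypothetical label distribution, not the real dataset.
labels = [0] * 50 + [1] * 100 + [2] * 250
weights = balanced_class_weights(labels)
print(weights)
```

Rare classes receive weights above 1 and frequent classes weights below 1, which pushes training toward the per-class recall goal rather than overall accuracy alone.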

## 📊 BLOCK 5: Evaluation and Saving

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report
import json
from datetime import datetime
from utils import evaluate_model, plot_training_history, plot_confusion_matrix, save_training_log

print(f"\n{'='*60}")
print("📊 EVALUATION ON THE TEST SET")
print(f"{'='*60}\n")

# Evaluate the model on test_gen (which reads the 'validation' subset defined earlier)
evaluation_results = evaluate_model(model, test_gen, CLASSES)

test_acc = evaluation_results['test_accuracy']
test_loss = evaluation_results['test_loss']

print(f"\n{'='*60}")
print("📈 FINAL V3 RESULTS")
print(f"{'='*60}")
print(f"Test Accuracy: {test_acc:.4f} ({test_acc*100:.2f}%)")
print(f"Test Loss:     {test_loss:.4f}")

# Comparison with V2
v2_acc = 0.8453
improvement = (test_acc - v2_acc) * 100
print(f"\n📊 Comparison:")
print(f"   V2 (batch 32): {v2_acc*100:.2f}%")
print(f"   V3 (batch 64 + FP16): {test_acc*100:.2f}%")
print(f"   Difference: {improvement:+.2f} percentage points")

# Check the goals
if test_acc >= 0.85:
    print(f"\n🎉 ACCURACY GOAL REACHED! (>=85%)")
else:
    print(f"\n⚠️  Accuracy: {test_acc:.4f} vs goal 0.85")

print(f"\n{'='*60}")
print("📋 PER-CLASS METRICS")
print(f"{'='*60}")

recall_objetivo_alcanzado = True
for class_name in CLASSES:
    metrics = evaluation_results['classification_report'][class_name]
    recall = metrics['recall']
    precision = metrics['precision']
    f1 = metrics['f1-score']

    status = "✅" if recall >= 0.80 else "❌"

    print(f"\n{status} {class_name}:")
    print(f"  Precision: {precision:.4f} ({precision*100:.2f}%)")
    print(f"  Recall:    {recall:.4f} ({recall*100:.2f}%)")
    print(f"  F1-Score:  {f1:.4f} ({f1*100:.2f}%)")

    if recall < 0.80:
        recall_objetivo_alcanzado = False

if recall_objetivo_alcanzado:
    print(f"\n🎉 RECALL GOAL REACHED IN ALL CLASSES! (>=80%)")
else:
    print(f"\n⚠️  Some classes have recall < 80%")

print(f"\n{'='*60}\n")
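
The per-class recall check above can also be derived directly from a confusion matrix: with true labels on the rows, recall for class i is the diagonal entry divided by the row sum. Below is a small sketch with a made-up 3-class matrix; `per_class_recall` is a hypothetical helper, not a function from the project's `utils`.

```python
def per_class_recall(cm):
    """Recall per class for a confusion matrix with true labels on the rows."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(cm)]


# Made-up confusion matrix, for illustration only.
cm = [
    [90, 5, 5],
    [10, 75, 15],
    [2, 3, 95],
]
recalls = per_class_recall(cm)
all_above_target = all(r >= 0.80 for r in recalls)
print(recalls, all_above_target)
```

In this example the second class falls below the 0.80 threshold, so the recall goal would count as not met even though the other two classes pass.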

In [None]:
# Save the results
print("💾 Saving V3 results...\n")

# 1. Training history plot
plot_path = LOGS_DIR / 'mobilenetv3_ultimate_v3_training_history.png'
plot_training_history(history, plot_path)
print(f"✅ Plot saved: {plot_path}")

# 2. Confusion matrix
cm_path = LOGS_DIR / 'mobilenetv3_ultimate_v3_confusion_matrix.png'
cm = plot_confusion_matrix(
    evaluation_results['y_true'],
    evaluation_results['y_pred'],
    CLASSES,
    cm_path
)
print(f"✅ Confusion matrix saved: {cm_path}")

# 3. Final model
model_path = MODELS_DIR / 'mobilenetv3_ultimate_v3_final.keras'
model.save(str(model_path))
print(f"✅ Final model saved: {model_path}")

# 4. Detailed log
hyperparameters = {
    'model_name': 'MobileNetV3-Large ULTIMATE V3',
    'version': 'V3 - Batch 64 + Mixed Precision',
    'architecture': 'Dense(384)->Dense(192)',
    'image_size': IMAGE_SIZE,
    'batch_size': BATCH_SIZE,
    'epochs': EPOCHS,
    'learning_rate': LEARNING_RATE,
    'lr_schedule': 'CosineDecay',
    'optimizer': 'Adam',
    'dropout': [0.4, 0.35],
    'l2_regularization': 0.001,
    'mixed_precision': 'mixed_float16',
    'fine_tuning': 'Disabled',
    'gpu_optimization': 'A100 with Tensor Cores'
}

log_path = LOGS_DIR / 'mobilenetv3_ultimate_v3_training_log.json'

save_training_log(
    log_path,
    'MobileNetV3-Large ULTIMATE V3',
    hyperparameters,
    history,
    evaluation_results,
    cm,
    training_time
)
print(f"✅ Log saved: {log_path}")

# 5. Final comparative summary
print(f"\n{'='*60}")
print("🎉 ULTIMATE V3 TRAINING COMPLETE!")
print(f"{'='*60}")
print(f"\n⏱️  Training times:")
print(f"   • V2 (batch 32):        146.52 min")
print(f"   • V3 (batch 64 + FP16): {training_time/60:.2f} min")
print(f"   • Speedup:              {146.52/(training_time/60):.2f}x faster")

print(f"\n📊 Test Accuracy:")
print(f"   • V1 (60 epochs):       83.81% → 58.02% (collapse)")
print(f"   • V2 (80 epochs):       84.53%")
print(f"   • V3 (batch 64 + FP16): {test_acc*100:.2f}%")

print(f"\n🎯 Goals:")
print(f"   • Accuracy >85%: {'✅ REACHED' if test_acc >= 0.85 else '❌ NOT REACHED'}")
print(f"   • Recall >80%:   {'✅ REACHED' if recall_objetivo_alcanzado else '❌ NOT REACHED'}")

print(f"\n💾 Files saved to:")
print(f"   • Model: {model_path}")
print(f"   • Logs: {LOGS_DIR}")

print(f"\n⚡ V3 optimizations:")
print(f"   • Batch size 64 (lower-variance gradients, fewer steps per epoch)")
print(f"   • Mixed precision FP16 (A100 Tensor Cores)")
print(f"   • No fine-tuning (avoids the collapse seen in V1)")
print(f"{'='*60}\n")