# ü´Ä LSTM para Clasificaci√≥n Binaria Supervisada de ECG

Este notebook implementa un **LSTM puro** para clasificaci√≥n binaria supervisada (ECG normal vs an√≥malo).

**Caracter√≠sticas principales:**
- Arquitectura LSTM pura (sin CNN) para captura de dependencias temporales
- Entrenamiento supervisado con etiquetas (0=normal, 1=an√≥malo)
- Datos preprocesados desde `Datos_supervisados/tensors_200hz` (archivos .pt)
- Integraci√≥n con MLflow para tracking de experimentos
- Orquestaci√≥n con Prefect 2.x
- Soporte autom√°tico para GPU (RTX 5080 compatible)

> ‚ö†Ô∏è **IMPORTANTE EN WINDOWS:** Ejecuta la celda de **Setup DLLs CUDA** (celda 2) **ANTES** de la celda de imports. Esto es necesario para que PyTorch pueda cargar las DLLs de CUDA correctamente.

> ‚ñ∂Ô∏è **Instrucciones:** 
> 1. Ejecuta la celda de **Setup DLLs CUDA** primero
> 2. Configura los par√°metros en la secci√≥n de **CONFIGURACI√ìN GENERAL**
> 3. Ajusta la ruta `DATA_DIR` a tu carpeta de datos (debe apuntar a `Datos_supervisados/tensors_200hz`)
> 4. Ejecuta todas las dem√°s celdas en orden


## üìã √çndice

1. **Setup CUDA y dependencias** - Configuraci√≥n de DLLs y librer√≠as
2. **Configuraci√≥n general** - Imports, semillas, dispositivo, hiperpar√°metros
3. **Carga y preparaci√≥n de datos** - Funciones para cargar desde `Datos_supervisados`
4. **Definici√≥n del modelo LSTM** - Arquitectura del modelo de clasificaci√≥n
5. **Funciones de entrenamiento y evaluaci√≥n** - Loops de entrenamiento y validaci√≥n
6. **Integraci√≥n con MLflow** - Configuraci√≥n y logging
7. **Orquestaci√≥n con Prefect** - Flujo principal con Prefect
8. **Ejecuci√≥n del flujo completo** - Celda final para ejecutar todo


---

## 1. ‚öôÔ∏è Setup CUDA y Dependencias


In [103]:
# ========================================
# üîß Setup RTX 5080 ‚Äî dependencias + CUDA DLL
# Ejecuta una sola vez (o tras actualizar drivers/librer√≠as)
# ========================================
import os
import sys
import subprocess
from pathlib import Path
from textwrap import dedent

print(f"Python: {sys.executable}")
print(f"Working dir: {Path.cwd().resolve()}")

# Rutas candidatas para DLLs de CUDA
CUDA_CANDIDATES = [
    os.environ.get("CUDA_PATH"),
    os.environ.get("CUDA_PATH_V12_8"),
    r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8",
    r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin",
    r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\libnvvp",
    r"C:\Program Files\NVIDIA\CUDNN",
]

# A√±adir rutas DLL en Windows (necesario antes de importar torch)
added = []
if hasattr(os, "add_dll_directory"):
    for candidate in CUDA_CANDIDATES:
        if not candidate:
            continue
        path = Path(candidate)
        if path.is_dir():
            try:
                os.add_dll_directory(str(path))
                added.append(str(path))
            except (FileNotFoundError, OSError):
                pass

if added:
    print("DLL directories a√±adidos:")
    for path in added:
        print(f"  - {path}")

# Instalar dependencias base si no est√°n instaladas
BASE_PACKAGES = [
    "mlflow>=2.16",
    "prefect>=3",
    "scikit-learn",
    "matplotlib",
    "pandas",
    "numpy",
]

def pip_install(spec: str) -> None:
    module_name = spec.split("==")[0].split("[")[0].replace("-", "_")
    try:
        __import__(module_name)
        print(f"‚úî {spec} ya instalado")
    except Exception:
        print(f"‚è≥ Instalando {spec} ...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", spec])

for pkg in BASE_PACKAGES:
    pip_install(pkg)

# Comando para instalar PyTorch nightly con CUDA 12.8 (para RTX 5080)
TORCH_INSTALL_CMD = [
    sys.executable,
    "-m",
    "pip",
    "install",
    "--upgrade",
    "--pre",
    "torch",
    "torchvision",
    "torchaudio",
    "--index-url",
    "https://download.pytorch.org/whl/nightly/cu128",
]

def ensure_torch_cuda() -> "tuple[object | None, dict]":
    """Importa torch, o instala la nightly cu128 si hace falta."""
    info: dict[str, str | float | bool] = {}
    try:
        import torch  # type: ignore
        info["torch_version"] = getattr(torch, "__version__", "desconocida")
        info["cuda_version"] = getattr(getattr(torch, "version", object()), "cuda", "desconocida")
        info["cuda_available"] = bool(torch.cuda.is_available())
        if "cu128" not in info["torch_version"] and not str(info["cuda_version"]).startswith("12.8"):
            raise RuntimeError(
                f"Build {info['torch_version']} no es cu128. Se reinstalar√° la nightly para RTX 5080."
            )
        return torch, info
    except Exception as err:
        print("‚ö†Ô∏è Torch no usable todav√≠a:", err)
        print("   Desinstalando PyTorch corrupto...")
        subprocess.check_call([sys.executable, "-m", "pip", "uninstall", "-y", "torch", "torchvision", "torchaudio"])
        print("   Instalando nightly cu128 desde PyTorch (puede tardar).")
        subprocess.check_call(TORCH_INSTALL_CMD)
        print("\n" + "="*60)
        print("‚ö†Ô∏è IMPORTANTE: PyTorch fue reinstalado.")
        print("   DEBES REINICIAR EL KERNEL DE JUPYTER ahora:")
        print("   Kernel ‚Üí Restart Kernel")
        print("   Luego ejecuta esta celda de nuevo.")
        print("="*60)
        import importlib
        import time
        time.sleep(2)
        importlib.invalidate_caches()
        try:
            import torch  # type: ignore
            info["torch_version"] = getattr(torch, "__version__", "desconocida")
            info["cuda_version"] = getattr(getattr(torch, "version", object()), "cuda", "desconocida")
            info["cuda_available"] = bool(torch.cuda.is_available())
            return torch, info
        except Exception as e2:
            print(f"\n‚ùå No se pudo importar PyTorch despu√©s de reinstalar: {e2}")
            print("   Por favor, REINICIA EL KERNEL y ejecuta esta celda de nuevo.")
            raise RuntimeError("Reinicia el kernel de Jupyter y ejecuta esta celda de nuevo.") from e2

# Intentar importar/instalar PyTorch
torch, torch_info = ensure_torch_cuda()

print("\nTorch info:")
for k, v in torch_info.items():
    print(f"  - {k}: {v}")

if torch_info.get("cuda_available"):
    try:
        gpu_name = torch.cuda.get_device_name(0)
        cc = torch.cuda.get_device_properties(0)
        print(f"GPU detectada: {gpu_name} | SM {cc.major}{cc.minor}")
    except Exception as e:
        print("‚ö†Ô∏è CUDA disponible pero no se pudo consultar GPU:", e)
else:
    print(dedent(
        """
        ‚ö†Ô∏è CUDA sigue inactiva. Revisa drivers / reinicia kernel tras la instalaci√≥n.
        Si el problema contin√∫a, ejecuta manualmente:
          pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
        """
    ))


Python: c:\Python311\python.exe
Working dir: S:\Proyecto final\Books
DLL directories a√±adidos:
  - C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8
  - C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8
  - C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8
  - C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin
  - C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\libnvvp
‚è≥ Instalando mlflow>=2.16 ...
‚è≥ Instalando prefect>=3 ...
‚è≥ Instalando scikit-learn ...
‚úî matplotlib ya instalado
‚úî pandas ya instalado
‚úî numpy ya instalado

Torch info:
  - torch_version: 2.10.0.dev20251121+cu128
  - cuda_version: 12.8
  - cuda_available: True
GPU detectada: NVIDIA GeForce RTX 5080 | SM 120


In [104]:
# ========================================
# Imports y dependencias
# ========================================
# ‚ö†Ô∏è IMPORTANTE: Ejecuta la celda anterior (Setup DLLs) antes de esta celda
# torch ya est√° importado en la celda anterior
import random
import json
import time
from pathlib import Path
from typing import Tuple, Dict, List, Optional

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# torch ya est√° importado en la celda anterior, solo importamos los subm√≥dulos
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader, TensorDataset

from sklearn.metrics import (
    precision_score,
    recall_score,
    f1_score,
    confusion_matrix,
    accuracy_score,
    classification_report,
)
import mlflow
import mlflow.pytorch
from prefect import task, flow
from prefect.tasks import NO_CACHE

print("‚úì Todos los imports completados")


‚úì Todos los imports completados


---

## 2. ‚öôÔ∏è Configuraci√≥n General


In [105]:
# ========================================
# CONFIGURACI√ìN GENERAL
# ========================================

# --- Rutas y nombres ---
DATA_DIR = Path("../data/Datos_supervisados/tensors_200hz")   # TODO: cambiar por la carpeta donde est√°n train/val/test
EXPERIMENT_NAME = "ecg_lstm_supervisado"
RUN_NAME = "lstm_ecg_v1"
OUTPUT_DIR = Path("./outputs")                 # Directorio para guardar artefactos

# --- Datos de entrada ---
N_CHANNELS = 3          # derivaciones de ECG (3 canales)
SEQ_LEN = 2000          # TODO: timesteps por ejemplo (ej: 10 s a 200 Hz => 2000)
INPUT_SIZE = N_CHANNELS # features por timestep (3 si uso las 3 derivaciones como canales)

# --- Arquitectura LSTM ---
HIDDEN_SIZE = 64        # neuronas en la LSTM
NUM_LAYERS = 2          # cantidad de capas LSTM apiladas
DROPOUT = 0.2           # dropout entre capas LSTM (si NUM_LAYERS > 1)
BIDIRECTIONAL = False   # usar LSTM bidireccional o no

# --- Capa totalmente conectada ---
FC_UNITS = 32           # tama√±o de la capa lineal antes de la salida
FC_DROPOUT = 0.3        # dropout en la parte fully-connected

# --- Entrenamiento ---
# ‚ö†Ô∏è‚ö†Ô∏è‚ö†Ô∏è PROBLEMA DE MEMORIA ‚ö†Ô∏è‚ö†Ô∏è‚ö†Ô∏è
# Los archivos .pt cargan TODO en RAM (~6GB para X_train.pt).
# Si tienes errores de memoria:
#   1. Reduce BATCH_SIZE a 8, 4 o incluso 2
#   2. O convierte a HDF5 usando: convert_pt_to_hdf5(DATA_DIR)
#   3. O aumenta la RAM disponible (se recomienda al menos 16GB)
BATCH_SIZE = 8          # ‚ö†Ô∏è MUY REDUCIDO para evitar problemas de memoria
                        # Si a√∫n falla, reduce a 4 o 2
LEARNING_RATE = 1e-3
NUM_EPOCHS = 50
WEIGHT_DECAY = 1e-5     # regularizaci√≥n L2 (0.0 si no se quiere)

# --- Learning Rate Scheduler ---
USE_SCHEDULER = True
SCHEDULER_PATIENCE = 3
SCHEDULER_FACTOR = 0.5
SCHEDULER_MIN_LR = 1e-6
SCHEDULER_MODE = 'max'  # Monitorear val_f1_macro (maximizar)

# --- Gradient Clipping ---
CLIP_GRAD_NORM = 1.0

# --- Otros ---
SEED = 42
USE_CUDA = True         # si hay GPU disponible, usarla

# --- Optimizaciones GPU ---
ENABLE_CUDNN_BENCHMARK = True

# --- MLflow ---
MLFLOW_TRACKING_URI = None  # None = usa el directorio local (sqlite:///mlflow.db)

# Crear directorio de salida
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

# Diccionario de configuraci√≥n (para pasar a funciones)
CONFIG = {
    "DATA_DIR": DATA_DIR,
    "EXPERIMENT_NAME": EXPERIMENT_NAME,
    "RUN_NAME": RUN_NAME,
    "OUTPUT_DIR": OUTPUT_DIR,
    "N_CHANNELS": N_CHANNELS,
    "SEQ_LEN": SEQ_LEN,
    "INPUT_SIZE": INPUT_SIZE,
    "HIDDEN_SIZE": HIDDEN_SIZE,
    "NUM_LAYERS": NUM_LAYERS,
    "DROPOUT": DROPOUT,
    "BIDIRECTIONAL": BIDIRECTIONAL,
    "FC_UNITS": FC_UNITS,
    "FC_DROPOUT": FC_DROPOUT,
    "BATCH_SIZE": BATCH_SIZE,
    "LEARNING_RATE": LEARNING_RATE,
    "NUM_EPOCHS": NUM_EPOCHS,
    "WEIGHT_DECAY": WEIGHT_DECAY,
    "USE_SCHEDULER": USE_SCHEDULER,
    "SCHEDULER_PATIENCE": SCHEDULER_PATIENCE,
    "SCHEDULER_FACTOR": SCHEDULER_FACTOR,
    "SCHEDULER_MIN_LR": SCHEDULER_MIN_LR,
    "SCHEDULER_MODE": SCHEDULER_MODE,
    "CLIP_GRAD_NORM": CLIP_GRAD_NORM,
    "SEED": SEED,
    "USE_CUDA": USE_CUDA,
    "ENABLE_CUDNN_BENCHMARK": ENABLE_CUDNN_BENCHMARK,
    "MLFLOW_TRACKING_URI": MLFLOW_TRACKING_URI,
}

print("‚úì Configuraci√≥n cargada:")
print(json.dumps({k: str(v) if isinstance(v, Path) else v for k, v in CONFIG.items()}, indent=2, ensure_ascii=False))


‚úì Configuraci√≥n cargada:
{
  "DATA_DIR": "..\\data\\Datos_supervisados\\tensors_200hz",
  "EXPERIMENT_NAME": "ecg_lstm_supervisado",
  "RUN_NAME": "lstm_ecg_v1",
  "OUTPUT_DIR": "outputs",
  "N_CHANNELS": 3,
  "SEQ_LEN": 2000,
  "INPUT_SIZE": 3,
  "HIDDEN_SIZE": 64,
  "NUM_LAYERS": 2,
  "DROPOUT": 0.2,
  "BIDIRECTIONAL": false,
  "FC_UNITS": 32,
  "FC_DROPOUT": 0.3,
  "BATCH_SIZE": 8,
  "LEARNING_RATE": 0.001,
  "NUM_EPOCHS": 50,
  "WEIGHT_DECAY": 1e-05,
  "USE_SCHEDULER": true,
  "SCHEDULER_PATIENCE": 3,
  "SCHEDULER_FACTOR": 0.5,
  "SCHEDULER_MIN_LR": 1e-06,
  "SCHEDULER_MODE": "max",
  "CLIP_GRAD_NORM": 1.0,
  "SEED": 42,
  "USE_CUDA": true,
  "ENABLE_CUDNN_BENCHMARK": true,
  "MLFLOW_TRACKING_URI": null
}


In [106]:
# ========================================
# Configuraci√≥n de semillas aleatorias y optimizaciones GPU
# ========================================
def set_seed_everywhere(seed: int = 42, enable_cudnn_benchmark: bool = True) -> None:
    """Fija semillas para reproducibilidad y optimiza GPU."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
        torch.backends.cudnn.deterministic = False  # ‚¨ÜÔ∏è Cambiado a False para mejor rendimiento
        torch.backends.cudnn.benchmark = enable_cudnn_benchmark  # ‚¨ÜÔ∏è NUEVO: Acelera entrenamiento
        # Limpiar cach√© de GPU
        torch.cuda.empty_cache()
        print(f"‚úì cuDNN Benchmark: {'Habilitado' if enable_cudnn_benchmark else 'Deshabilitado'}")

set_seed_everywhere(SEED, enable_cudnn_benchmark=CONFIG.get("ENABLE_CUDNN_BENCHMARK", True))
print(f"‚úì Semilla fijada: {SEED}")


‚úì cuDNN Benchmark: Habilitado
‚úì Semilla fijada: 42


In [107]:
# ========================================
# Configuraci√≥n de dispositivo (GPU/CPU)
# ========================================
def get_device() -> torch.device:
    """Detecta y configura el dispositivo (GPU si est√° disponible)."""
    if USE_CUDA and torch.cuda.is_available():
        device = torch.device("cuda")
        gpu_name = torch.cuda.get_device_name(0)
        print(f"‚úì GPU detectada: {gpu_name}")
        print(f"  CUDA Version: {torch.version.cuda}")
        print(f"  PyTorch Version: {torch.__version__}")
    else:
        device = torch.device("cpu")
        print("‚ö† GPU no disponible, usando CPU")
    return device

DEVICE = get_device()
print(f"Dispositivo seleccionado: {DEVICE}")


‚úì GPU detectada: NVIDIA GeForce RTX 5080
  CUDA Version: 12.8
  PyTorch Version: 2.10.0.dev20251121+cu128
Dispositivo seleccionado: cuda


---

## 3. üìÇ Carga y Preparaci√≥n de Datos


---

## ‚ö†Ô∏è SOLUCI√ìN AL PROBLEMA DE MEMORIA

Si tienes errores de memoria al cargar los datos, **convierte los archivos .pt a HDF5**:

```python
# Ejecuta esto ANTES de entrenar:
convert_pt_to_hdf5(Path("../data/Datos_supervisados/tensors_200hz"))
```

Esto crear√° archivos .h5 en `tensors_200hz/hdf5/` que permiten acceso aleatorio sin cargar todo en memoria.

El c√≥digo detectar√° autom√°ticamente los archivos HDF5 y los usar√° si est√°n disponibles.


In [108]:
# ========================================
# ‚ö†Ô∏è SOLUCI√ìN ALTERNATIVA: Convertir archivos .pt a HDF5 para acceso eficiente
# ========================================
# Si tienes problemas de memoria con archivos .pt, puedes convertir a HDF5
# que permite acceso aleatorio sin cargar todo en memoria.
#
# Para usar esta funci√≥n, ejecuta:
#   convert_pt_to_hdf5(Path("../data/Datos_supervisados/tensors_200hz"))
#
# Luego modifica DATA_DIR para apuntar a la carpeta con archivos .h5

def convert_pt_to_hdf5(data_dir: Path, output_dir: Path = None):
    """
    Convierte archivos .pt a HDF5 para acceso eficiente en memoria.
    
    Args:
        data_dir: Ruta a la carpeta con archivos .pt
        output_dir: Ruta donde guardar archivos .h5 (por defecto: data_dir / "hdf5")
    """
    try:
        import h5py
    except ImportError:
        print("‚ùå h5py no est√° instalado. Inst√°lalo con: pip install h5py")
        return
    
    if output_dir is None:
        output_dir = data_dir / "hdf5"
    output_dir.mkdir(parents=True, exist_ok=True)
    
    print("üîÑ Convirtiendo archivos .pt a HDF5...")
    print(f"   Entrada: {data_dir}")
    print(f"   Salida: {output_dir}")
    
    files_to_convert = ["X_train.pt", "X_val.pt", "X_test.pt", "y_train.pt", "y_val.pt", "y_test.pt"]
    
    for filename in files_to_convert:
        pt_file = data_dir / filename
        h5_file = output_dir / filename.replace(".pt", ".h5")
        
        if not pt_file.exists():
            print(f"  ‚ö†Ô∏è {filename} no existe, saltando...")
            continue
        
        print(f"  Convirtiendo {filename}...")
        
        # Cargar tensor
        tensor = torch.load(pt_file, map_location='cpu')
        
        # Guardar como HDF5
        with h5py.File(h5_file, 'w') as f:
            f.create_dataset('data', data=tensor.numpy(), compression='gzip', compression_opts=4)
            f.attrs['shape'] = tensor.shape
            f.attrs['dtype'] = str(tensor.dtype)
        
        print(f"  ‚úì {filename} convertido a {h5_file.name}")
        del tensor
        torch.cuda.empty_cache() if torch.cuda.is_available() else None
    
    print("‚úÖ Conversi√≥n completada!")
    print(f"   Ahora puedes usar los archivos .h5 en lugar de .pt")
    print(f"   Modifica el c√≥digo para usar HDF5Dataset en lugar de LazyTensorDataset")



In [109]:
# ========================================
# Dataset HDF5 que permite acceso aleatorio sin cargar todo en memoria
# ========================================
class HDF5Dataset(Dataset):
    """
    Dataset que lee desde archivos HDF5 con acceso aleatorio eficiente.
    NO carga todo en memoria, solo lee lo necesario cuando se accede.
    Esta es la soluci√≥n recomendada para archivos grandes.
    """
    def __init__(self, X_file: Path, y_file: Path):
        """
        Args:
            X_file: Ruta al archivo .h5 con los features
            y_file: Ruta al archivo .h5 con las etiquetas (o .pt si a√∫n no est√° convertido)
        """
        try:
            import h5py
        except ImportError:
            raise ImportError("h5py no est√° instalado. Inst√°lalo con: pip install h5py")
        
        self.X_file = Path(X_file)
        self.y_file = Path(y_file)
        
        # Cargar etiquetas (son peque√±as, pueden ser .pt o .h5)
        if self.y_file.suffix == '.h5':
            with h5py.File(self.y_file, 'r') as f:
                self.y = torch.from_numpy(f['data'][:])
        else:
            # Si a√∫n es .pt, cargarlo normalmente (es peque√±o)
            self.y = torch.load(self.y_file, map_location='cpu')
        
        self.len = len(self.y)
        
        # Abrir archivo HDF5 (solo para lectura, no carga todo)
        self.h5_file = h5py.File(self.X_file, 'r')
        self.X_dataset = self.h5_file['data']
        
        print(f"  ‚úì {self.X_file.name} abierto en modo HDF5 (acceso aleatorio, sin cargar todo en memoria)")
        print(f"     Shape: {self.X_dataset.shape}")
    
    def __len__(self):
        return self.len
    
    def __getitem__(self, idx):
        # Leer solo la muestra necesaria (muy eficiente)
        x = torch.from_numpy(self.X_dataset[idx])
        y = self.y[idx]
        return x, y
    
    def __del__(self):
        # Cerrar archivo HDF5 cuando se destruye el dataset
        if hasattr(self, 'h5_file'):
            self.h5_file.close()


# ========================================
# Dataset personalizado que carga datos bajo demanda (lazy loading)
# ‚ö†Ô∏è ADVERTENCIA: Carga TODO en memoria. Usa HDF5Dataset si es posible.
# ========================================
class LazyTensorDataset(Dataset):
    """
    Dataset que carga tensors desde archivos .pt bajo demanda (lazy loading).
    Los datos se cargan solo cuando se accede por primera vez y se mantienen en CPU.
    Los datos se transfieren a GPU solo cuando el DataLoader los necesita.
    Esto reduce el uso de memoria inicial.
    
    ‚ö†Ô∏è ADVERTENCIA: Cuando se carga X por primera vez, se carga TODO el archivo en memoria.
    Para archivos muy grandes (>4GB), esto puede causar problemas de memoria.
    Considera reducir BATCH_SIZE o usar un formato m√°s eficiente (HDF5, etc.).
    """
    def __init__(self, X_file: Path, y_file: Path, load_immediately: bool = False):
        """
        Args:
            X_file: Ruta al archivo .pt con los features
            y_file: Ruta al archivo .pt con las etiquetas
            load_immediately: Si True, carga X inmediatamente (por defecto False para lazy loading)
        """
        self.X_file = Path(X_file)
        self.y_file = Path(y_file)
        
        # Cargar solo las etiquetas primero (son peque√±as)
        print(f"  Cargando {self.y_file.name}...")
        self.y = torch.load(self.y_file, map_location='cpu')
        self.len = len(self.y)
        
        # Inicializar X como None - se cargar√° bajo demanda
        self.X = None
        self._X_loaded = False
        
        # Si se solicita carga inmediata, cargar ahora
        if load_immediately:
            self._load_X()
    
    def _load_X(self):
        """Carga el tensor X solo cuando se necesita por primera vez."""
        if not self._X_loaded:
            file_size_mb = self.X_file.stat().st_size / (1024 * 1024)
            file_size_gb = file_size_mb / 1024
            print(f"  ‚ö†Ô∏è Cargando {self.X_file.name} en CPU...")
            print(f"     Tama√±o del archivo: ~{file_size_gb:.2f} GB ({file_size_mb:.1f} MB)")
            print(f"     ‚ö†Ô∏è ADVERTENCIA: Esto cargar√° TODO el archivo en RAM.")
            print(f"     Esto puede tardar y usar RAM. Los datos se mantendr√°n en CPU.")
            
            try:
                # Intentar cargar con gesti√≥n de memoria
                import gc
                gc.collect()  # Limpiar memoria antes de cargar
                
                # Mostrar memoria disponible si es posible
                try:
                    import psutil
                    mem = psutil.virtual_memory()
                    print(f"     RAM disponible: {mem.available / (1024**3):.2f} GB / {mem.total / (1024**3):.2f} GB")
                    if mem.available < file_size_gb * 1024 * 1024 * 1024 * 1.5:  # Necesitamos ~1.5x el tama√±o del archivo
                        print(f"     ‚ö†Ô∏è ADVERTENCIA: Puede que no haya suficiente RAM disponible!")
                except ImportError:
                    pass  # psutil no est√° instalado, continuar
                
                self.X = torch.load(self.X_file, map_location='cpu')
                self._X_loaded = True
                
                print(f"  ‚úì {self.X_file.name} cargado en CPU (shape: {self.X.shape})")
                print(f"     Memoria usada: ~{self.X.element_size() * self.X.nelement() / (1024**3):.2f} GB")
            except RuntimeError as e:
                if "not enough memory" in str(e):
                    print(f"\n  ‚ùå ERROR CR√çTICO: No hay suficiente memoria para cargar {self.X_file.name}")
                    print(f"     Tama√±o requerido: ~{file_size_gb:.2f} GB")
                    print(f"\n  üîß SOLUCIONES:")
                    print(f"     1. ‚ö° Reduce BATCH_SIZE a 4 u 8 en la configuraci√≥n (actual: {CONFIG.get('BATCH_SIZE', 'N/A')})")
                    print(f"     2. üíæ Cierra otras aplicaciones que usen mucha memoria")
                    print(f"     3. üöÄ Aumenta la RAM disponible (se recomienda al menos {file_size_gb * 1.5:.1f} GB)")
                    print(f"     4. üì¶ Convierte a HDF5 para acceso eficiente:")
                    print(f"        - Ejecuta: convert_pt_to_hdf5(Path('{self.X_file.parent}'))")
                    print(f"        - Luego modifica el c√≥digo para usar HDF5Dataset")
                    print(f"\n  üí° NOTA: Los archivos .pt cargan TODO en memoria. HDF5 permite acceso aleatorio.")
                    raise RuntimeError(
                        f"No hay suficiente memoria para cargar {self.X_file.name} (~{file_size_gb:.2f} GB). "
                        f"Reduce BATCH_SIZE o convierte a HDF5."
                    ) from e
                else:
                    raise
    
    def __len__(self):
        return self.len
    
    def __getitem__(self, idx):
        # Cargar X si a√∫n no est√° cargado (lazy loading)
        if not self._X_loaded:
            self._load_X()
        
        # Retornar datos desde CPU (el DataLoader los transferir√° a GPU si es necesario)
        return self.X[idx], self.y[idx]


# ========================================
# Funci√≥n para cargar informaci√≥n de datos sin cargar X (solo etiquetas)
# ========================================
def load_tensor_data_info(
    data_dir: Path,
) -> Dict:
    """
    Carga solo la informaci√≥n de los datos (estad√≠sticas de etiquetas) sin cargar X.
    Usa la configuraci√≥n para las formas de X.
    
    Args:
        data_dir: Ruta a la carpeta tensors_200hz
        
    Returns:
        Diccionario con informaci√≥n de los datos
    """
    print("="*70)
    print("üìÇ CARGANDO INFORMACI√ìN DE DATOS DESDE tensors_200hz")
    print("="*70)
    print(f"Directorio: {data_dir.resolve()}")
    
    # Cargar solo las etiquetas (son peque√±as) para obtener estad√≠sticas
    print("\n‚è≥ Cargando informaci√≥n de datos (solo etiquetas, sin cargar X)...")
    y_train = torch.load(data_dir / "y_train.pt", map_location='cpu')
    y_val = torch.load(data_dir / "y_val.pt", map_location='cpu')
    y_test = torch.load(data_dir / "y_test.pt", map_location='cpu')
    
    # Usar la configuraci√≥n para las formas de X (no cargamos X para ahorrar memoria)
    # La forma esperada es (n_samples, SEQ_LEN, N_CHANNELS)
    X_shape_train = (len(y_train), CONFIG["SEQ_LEN"], CONFIG["N_CHANNELS"])
    X_shape_val = (len(y_val), CONFIG["SEQ_LEN"], CONFIG["N_CHANNELS"])
    X_shape_test = (len(y_test), CONFIG["SEQ_LEN"], CONFIG["N_CHANNELS"])
    sample_shape = (CONFIG["SEQ_LEN"], CONFIG["N_CHANNELS"])
    
    # Informaci√≥n
    info = {
        "X_shape_train": X_shape_train,
        "X_shape_val": X_shape_val,
        "X_shape_test": X_shape_test,
        "sample_shape": sample_shape,
        "n_train": len(y_train),
        "n_val": len(y_val),
        "n_test": len(y_test),
        "y_train_normales": (y_train == 0).sum().item(),
        "y_train_anomalos": (y_train == 1).sum().item(),
        "y_val_normales": (y_val == 0).sum().item(),
        "y_val_anomalos": (y_val == 1).sum().item(),
        "y_test_normales": (y_test == 0).sum().item(),
        "y_test_anomalos": (y_test == 1).sum().item(),
    }
    
    print(f"\n‚úì Informaci√≥n de datos (usando configuraci√≥n para formas de X):")
    print(f"  X_train: {X_shape_train} | y_train: ({info['n_train']},) (normales: {info['y_train_normales']}, an√≥malos: {info['y_train_anomalos']})")
    print(f"  X_val:   {X_shape_val} | y_val:   ({info['n_val']},) (normales: {info['y_val_normales']}, an√≥malos: {info['y_val_anomalos']})")
    print(f"  X_test:  {X_shape_test} | y_test:  ({info['n_test']},) (normales: {info['y_test_normales']}, an√≥malos: {info['y_test_anomalos']})")
    print("="*70)
    
    return info


def create_dataloaders_from_files(
    data_dir: Path,
    batch_size: int,
    shuffle_train: bool = True,
    load_train_immediately: bool = False,
) -> Tuple[DataLoader, DataLoader, DataLoader, np.ndarray, np.ndarray]:
    """
    Crea DataLoaders desde archivos .pt usando Dataset personalizado.
    Los datos se mantienen en CPU y se transfieren a GPU solo cuando se necesitan.
    
    Args:
        data_dir: Ruta a la carpeta tensors_200hz
        batch_size: Tama√±o del batch
        shuffle_train: Si True, mezcla los datos de entrenamiento
        load_train_immediately: Si True, carga train inmediatamente (por defecto False para lazy loading)
    
    Returns:
        Tuple con (train_loader, val_loader, test_loader, y_val_np, y_test_np)
    """
    print("\nüì¶ Creando datasets (los datos se cargar√°n en CPU bajo demanda)...")
    print("‚ö†Ô∏è  NOTA: Los datos se cargar√°n cuando el DataLoader comience a iterar.")
    print("    Si tienes problemas de memoria, reduce BATCH_SIZE en la configuraci√≥n.")
    
    # Crear datasets (cargar√°n los datos en CPU bajo demanda)
    # Val y test son m√°s peque√±os, as√≠ que podemos cargarlos inmediatamente si queremos
    train_dataset = LazyTensorDataset(
        data_dir / "X_train.pt",
        data_dir / "y_train.pt",
        load_immediately=load_train_immediately,
    )
    val_dataset = LazyTensorDataset(
        data_dir / "X_val.pt",
        data_dir / "y_val.pt",
        load_immediately=False,  # Val es m√°s peque√±o, pero a√∫n as√≠ lazy loading
    )
    test_dataset = LazyTensorDataset(
        data_dir / "X_test.pt",
        data_dir / "y_test.pt",
        load_immediately=False,  # Test es m√°s peque√±o, pero a√∫n as√≠ lazy loading
    )
    
    # Cargar etiquetas para m√©tricas (son peque√±as)
    y_val = torch.load(data_dir / "y_val.pt", map_location='cpu')
    y_test = torch.load(data_dir / "y_test.pt", map_location='cpu')
    y_val_np = y_val.cpu().numpy()
    y_test_np = y_test.cpu().numpy()
    
    # Crear dataloaders
    train_loader = DataLoader(
        train_dataset,
        batch_size=batch_size,
        shuffle=shuffle_train,
        num_workers=0,  # 0 para Windows
        pin_memory=torch.cuda.is_available(),
    )
    val_loader = DataLoader(
        val_dataset,
        batch_size=batch_size,
        shuffle=False,
        num_workers=0,
        pin_memory=torch.cuda.is_available(),
    )
    test_loader = DataLoader(
        test_dataset,
        batch_size=batch_size,
        shuffle=False,
        num_workers=0,
        pin_memory=torch.cuda.is_available(),
    )
    
    print(f"\n‚úì DataLoaders creados:")
    print(f"  Train: {len(train_loader)} batches ({len(train_dataset)} muestras)")
    print(f"  Val:   {len(val_loader)} batches ({len(val_dataset)} muestras)")
    print(f"  Test:  {len(test_loader)} batches ({len(test_dataset)} muestras)")
    
    return train_loader, val_loader, test_loader, y_val_np, y_test_np


In [110]:
# ========================================
# Crear DataLoaders
# ‚¨ÜÔ∏è SIMPLIFICADO: Ahora recibe datasets directamente
# ========================================
def create_dataloaders(
    train_dataset: Dataset,
    val_dataset: Dataset,
    test_dataset: Dataset,
    batch_size: int = None,  # ‚¨ÜÔ∏è Si es None, usa todo el dataset
    shuffle_train: bool = True,
) -> Tuple[DataLoader, DataLoader, DataLoader]:
    """
    Crea DataLoaders para train, val y test.
    Si batch_size es None, procesa todo el dataset de una vez (sin batches).
    
    Args:
        train_dataset, val_dataset, test_dataset: Datasets de PyTorch
        batch_size: Tama√±o de batch (None = todo el dataset de una vez)
    
    Returns:
        Tuple con (train_loader, val_loader, test_loader)
    """
    
    # Si batch_size es None, usar todo el dataset (sin batches)
    train_batch_size = len(train_dataset) if batch_size is None else batch_size
    val_batch_size = len(val_dataset) if batch_size is None else batch_size
    test_batch_size = len(test_dataset) if batch_size is None else batch_size
    
    # Crear dataloaders
    train_loader = DataLoader(
        train_dataset,
        batch_size=train_batch_size,
        shuffle=shuffle_train,
        num_workers=0,  # 0 para Windows
        pin_memory=torch.cuda.is_available(),
    )
    val_loader = DataLoader(
        val_dataset,
        batch_size=val_batch_size,
        shuffle=False,
        num_workers=0,
        pin_memory=torch.cuda.is_available(),
    )
    test_loader = DataLoader(
        test_dataset,
        batch_size=test_batch_size,
        shuffle=False,
        num_workers=0,
        pin_memory=torch.cuda.is_available(),
    )
    
    print(f"\n‚úì DataLoaders creados:")
    if batch_size is None:
        print(f"  Train: 1 batch ({len(train_dataset)} muestras) - SIN DIVISI√ìN EN BATCHES")
        print(f"  Val:   1 batch ({len(val_dataset)} muestras) - SIN DIVISI√ìN EN BATCHES")
        print(f"  Test:  1 batch ({len(test_dataset)} muestras) - SIN DIVISI√ìN EN BATCHES")
    else:
        print(f"  Train: {len(train_loader)} batches ({len(train_dataset)} muestras)")
        print(f"  Val:   {len(val_loader)} batches ({len(val_dataset)} muestras)")
        print(f"  Test:  {len(test_loader)} batches ({len(test_dataset)} muestras)")
    
    return train_loader, val_loader, test_loader


---

## 4. üß† Definici√≥n del Modelo LSTM


In [111]:
# ========================================
# Clase LSTM para Clasificaci√≥n Binaria
# ========================================
class LSTMClassifier(nn.Module):
    """
    LSTM puro para clasificaci√≥n binaria de series temporales (ECG normal vs an√≥malo).
    
    Arquitectura:
    - LSTM apilado (m√∫ltiples capas, opcionalmente bidireccional)
    - Capa fully connected con dropout
    - Salida binaria (sigmoid)
    """
    
    def __init__(
        self,
        input_size: int,
        hidden_size: int,
        num_layers: int,
        dropout: float,
        bidirectional: bool,
        fc_units: int,
        fc_dropout: float,
    ):
        super(LSTMClassifier, self).__init__()
        
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.bidirectional = bidirectional
        
        # Capa LSTM
        self.lstm = nn.LSTM(
            input_size=input_size,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,
            dropout=dropout if num_layers > 1 else 0.0,
            bidirectional=bidirectional,
        )
        
        # Tama√±o de salida de LSTM (doble si es bidireccional)
        lstm_output_size = hidden_size * 2 if bidirectional else hidden_size
        
        # Capa fully connected
        self.fc1 = nn.Linear(lstm_output_size, fc_units)
        self.relu = nn.ReLU()
        self.dropout_fc = nn.Dropout(fc_dropout)
        
        # Capa de salida (binaria)
        self.fc2 = nn.Linear(fc_units, 1)
        self.sigmoid = nn.Sigmoid()
        
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Forward pass.
        
        Args:
            x: Tensor de forma (batch_size, seq_len, input_size)
        
        Returns:
            Tensor de forma (batch_size,) con probabilidades
        """
        # LSTM
        lstm_out, (hidden, cell) = self.lstm(x)
        
        # Usar el √∫ltimo hidden state de la √∫ltima capa
        # Si es bidireccional, concatenar forward y backward
        if self.bidirectional:
            # hidden shape: (num_layers * 2, batch_size, hidden_size)
            # Tomar la √∫ltima capa: forward y backward
            forward_hidden = hidden[-2]  # (batch_size, hidden_size)
            backward_hidden = hidden[-1]  # (batch_size, hidden_size)
            last_hidden = torch.cat([forward_hidden, backward_hidden], dim=1)  # (batch_size, hidden_size * 2)
        else:
            # hidden shape: (num_layers, batch_size, hidden_size)
            last_hidden = hidden[-1]  # (batch_size, hidden_size)
        
        # Fully connected
        out = self.fc1(last_hidden)
        out = self.relu(out)
        out = self.dropout_fc(out)
        
        # Salida binaria
        out = self.fc2(out)
        out = self.sigmoid(out)
        
        return out.squeeze(-1)  # (batch_size,)
    
    def predict_proba(self, x: torch.Tensor) -> torch.Tensor:
        """Devuelve probabilidades (mismo que forward)."""
        return self.forward(x)
    
    def predict(self, x: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
        """Devuelve predicciones binarias."""
        proba = self.forward(x)
        return (proba > threshold).long()


In [112]:
# ========================================
# Instanciar modelo
# ========================================
def create_model(config: Dict) -> LSTMClassifier:
    """Crea e instancia el modelo LSTM."""
    model = LSTMClassifier(
        input_size=config["INPUT_SIZE"],
        hidden_size=config["HIDDEN_SIZE"],
        num_layers=config["NUM_LAYERS"],
        dropout=config["DROPOUT"],
        bidirectional=config["BIDIRECTIONAL"],
        fc_units=config["FC_UNITS"],
        fc_dropout=config["FC_DROPOUT"],
    )
    
    # Contar par√°metros
    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    
    print(f"‚úì Modelo creado:")
    print(f"  Par√°metros totales: {total_params:,} ({total_params / 1e6:.2f}M)")
    print(f"  Par√°metros entrenables: {trainable_params:,}")
    print(f"  LSTM: {config['NUM_LAYERS']} capas, hidden_size={config['HIDDEN_SIZE']}, bidirectional={config['BIDIRECTIONAL']}")
    
    return model


In [113]:
# ========================================
# Funci√≥n de entrenamiento por √©poca
# ========================================
def train_one_epoch(
    model: LSTMClassifier,
    train_loader: DataLoader,
    optimizer: optim.Optimizer,
    criterion: nn.Module,
    device: torch.device,
    clip_grad_norm: Optional[float] = None,
) -> Tuple[float, float]:
    """
    Entrena el modelo por una √©poca.
    
    Returns:
        Tupla con (loss_promedio, accuracy_promedio)
    """
    model.train()
    total_loss = 0.0
    correct = 0
    total = 0
    
    for batch_x, batch_y in train_loader:
        batch_x = batch_x.to(device, non_blocking=True)
        batch_y = batch_y.to(device, non_blocking=True).float()
        
        # Forward pass
        outputs = model(batch_x)
        loss = criterion(outputs, batch_y)
        
        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        
        # Gradient clipping (antes de optimizer.step)
        if clip_grad_norm is not None and clip_grad_norm > 0:
            torch.nn.utils.clip_grad_norm_(model.parameters(), clip_grad_norm)
        
        optimizer.step()
        
        # Acumular m√©tricas
        total_loss += loss.item() * batch_x.size(0)
        predictions = (outputs > 0.5).long()
        correct += (predictions == batch_y.long()).sum().item()
        total += batch_x.size(0)
    
    avg_loss = total_loss / total if total > 0 else 0.0
    avg_accuracy = correct / total if total > 0 else 0.0
    
    return avg_loss, avg_accuracy


In [114]:
# ========================================
# Funci√≥n de evaluaci√≥n
# ========================================
def evaluate(
    model: LSTMClassifier,
    dataloader: DataLoader,
    criterion: nn.Module,
    device: torch.device,
) -> Tuple[float, float, np.ndarray, np.ndarray]:
    """
    Eval√∫a el modelo en un dataloader.
    
    Returns:
        Tupla con (loss, accuracy, y_true, y_pred)
    """
    model.eval()
    total_loss = 0.0
    correct = 0
    total = 0
    
    all_preds = []
    all_labels = []
    
    with torch.no_grad():
        for batch_x, batch_y in dataloader:
            batch_x = batch_x.to(device, non_blocking=True)  # ‚¨ÜÔ∏è non_blocking para mejor rendimiento
            batch_y = batch_y.to(device, non_blocking=True).float()
            
            # Forward pass
            outputs = model(batch_x)
            loss = criterion(outputs, batch_y)
            
            # Acumular m√©tricas
            total_loss += loss.item() * batch_x.size(0)
            predictions = (outputs > 0.5).long()
            correct += (predictions == batch_y.long()).sum().item()
            total += batch_x.size(0)
            
            # Guardar predicciones y etiquetas
            all_preds.append(predictions.cpu().numpy())
            all_labels.append(batch_y.long().cpu().numpy())
    
    avg_loss = total_loss / total if total > 0 else 0.0
    avg_accuracy = correct / total if total > 0 else 0.0
    
    y_true = np.concatenate(all_labels)
    y_pred = np.concatenate(all_preds)
    
    return avg_loss, avg_accuracy, y_true, y_pred


In [115]:
# ========================================
# Funci√≥n para calcular m√©tricas completas
# ========================================
def compute_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> Dict[str, float]:
    """
    Calcula m√©tricas completas de clasificaci√≥n.
    
    Returns:
        Diccionario con todas las m√©tricas
    """
    accuracy = accuracy_score(y_true, y_pred)
    
    # Calcular m√©tricas por clase usando classification_report
    report = classification_report(
        y_true, y_pred,
        target_names=["normal", "anomalo"],
        output_dict=True,
        zero_division=0
    )
    
    # M√©tricas para clase normal (0)
    metrics_normal = report.get("normal", {})
    precision_normal = metrics_normal.get("precision", 0.0)
    recall_normal = metrics_normal.get("recall", 0.0)
    f1_normal = metrics_normal.get("f1-score", 0.0)
    
    # M√©tricas para clase an√≥mala (1)
    metrics_anom = report.get("anomalo", {})
    precision_anom = metrics_anom.get("precision", 0.0)
    recall_anom = metrics_anom.get("recall", 0.0)
    f1_anom = metrics_anom.get("f1-score", 0.0)
    
    # M√©tricas generales (macro avg)
    macro_avg = report.get("macro avg", {})
    precision_macro = macro_avg.get("precision", 0.0)
    recall_macro = macro_avg.get("recall", 0.0)
    f1_macro = macro_avg.get("f1-score", 0.0)
    
    # Matriz de confusi√≥n
    cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
    tn, fp, fn, tp = cm.ravel()
    specificity = tn / max(1, tn + fp)  # TNR
    sensitivity = tp / max(1, tp + fn)   # TPR (recall de clase an√≥mala)
    
    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "precision_normal": precision_normal,
        "recall_normal": recall_normal,
        "f1_normal": f1_normal,
        "precision_anom": precision_anom,
        "recall_anom": recall_anom,
        "f1_anom": f1_anom,
        "precision_macro": precision_macro,
        "recall_macro": recall_macro,
        "f1_macro": f1_macro,
        "confusion_matrix": cm,
    }


---

## 6. üìä Integraci√≥n con MLflow


In [116]:
# ========================================
# Configuraci√≥n de MLflow
# ========================================
def setup_mlflow(config: Dict) -> str:
    """
    Configura MLflow y crea/obtiene el experimento.
    
    Returns:
        ID del experimento
    """
    # Configurar tracking URI
    if config.get("MLFLOW_TRACKING_URI") is not None:
        mlflow.set_tracking_uri(config["MLFLOW_TRACKING_URI"])
    else:
        # Usar sqlite en el directorio padre
        PARENT_DIR = Path.cwd().parent.resolve()
        TRACKING_DB = (PARENT_DIR / "mlflow.db").resolve()
        mlflow.set_tracking_uri(f"sqlite:///{TRACKING_DB.as_posix()}")
        print(f"‚úì MLflow tracking URI: sqlite:///{TRACKING_DB.as_posix()}")
    
    # Crear o obtener experimento
    experiment_name = config["EXPERIMENT_NAME"]
    
    try:
        experiment = mlflow.get_experiment_by_name(experiment_name)
        if experiment is None:
            # Crear directorio de artefactos
            PARENT_DIR = Path.cwd().parent.resolve()
            ARTIFACT_ROOT = (PARENT_DIR / "mlflow_artifacts").resolve()
            ARTIFACT_ROOT.mkdir(parents=True, exist_ok=True)
            experiment_id = mlflow.create_experiment(experiment_name, artifact_location=ARTIFACT_ROOT.as_uri())
            print(f"‚úì Experimento MLflow creado: {experiment_name} (ID: {experiment_id})")
            print(f"  Artifact root: {ARTIFACT_ROOT.as_uri()}")
        else:
            experiment_id = experiment.experiment_id
            print(f"‚úì Experimento MLflow existente: {experiment_name} (ID: {experiment_id})")
    except Exception as e:
        print(f"‚ö† Error al configurar MLflow: {e}")
        experiment_id = mlflow.set_experiment(experiment_name)
    
    return experiment_id


In [117]:
# ========================================
# Funci√≥n para guardar matriz de confusi√≥n como artefacto
# ========================================
def save_confusion_matrix(
    cm: np.ndarray,
    output_dir: Path,
    tag: str,
) -> Tuple[Path, Path]:
    """
    Guarda la matriz de confusi√≥n como PNG y CSV.
    
    Returns:
        Tupla con rutas (png_path, csv_path)
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    
    # Guardar como CSV
    csv_path = output_dir / f"confusion_matrix_{tag}.csv"
    df_cm = pd.DataFrame(cm, index=["Normal", "An√≥malo"], columns=["Normal", "An√≥malo"])
    df_cm.to_csv(csv_path)
    
    # Guardar como PNG
    png_path = output_dir / f"confusion_matrix_{tag}.png"
    fig, ax = plt.subplots(figsize=(6, 5))
    im = ax.imshow(cm, interpolation="nearest", cmap="Blues")
    ax.figure.colorbar(im, ax=ax)
    
    # Etiquetas
    ax.set(xticks=np.arange(cm.shape[1]), yticks=np.arange(cm.shape[0]))
    ax.set_xticklabels(["Normal", "An√≥malo"])
    ax.set_yticklabels(["Normal", "An√≥malo"])
    ax.set_xlabel("Predicci√≥n")
    ax.set_ylabel("Real")
    ax.set_title(f"Matriz de Confusi√≥n - {tag.upper()}")
    
    # A√±adir valores en las celdas
    thresh = cm.max() / 2.0
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(
                j, i, f"{cm[i, j]}",
                ha="center", va="center",
                color="white" if cm[i, j] > thresh else "black"
            )
    
    plt.tight_layout()
    plt.savefig(png_path, dpi=150)
    plt.close()
    
    return png_path, csv_path


In [118]:
# ========================================
# Funci√≥n para guardar gr√°ficos de curvas de entrenamiento
# ========================================
def save_training_curves(
    train_losses: List[float],
    train_accuracies: List[float],
    val_losses: List[float],
    val_f1_scores: List[float],
    output_dir: Path,
    learning_rates: Optional[List[float]] = None,  # ‚¨ÜÔ∏è NUEVO: Curva de LR
) -> Path:
    """
    Guarda gr√°ficos de curvas de entrenamiento.
    
    Returns:
        Ruta del archivo PNG guardado
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    
    # Usar 2x3 si hay learning rates, sino 2x2
    if learning_rates:
        fig, axes = plt.subplots(2, 3, figsize=(18, 10))
    else:
        fig, axes = plt.subplots(2, 2, figsize=(12, 10))
    
    epochs = range(1, len(train_losses) + 1)
    
    # Loss
    axes[0, 0].plot(epochs, train_losses, label="Train Loss", color="blue")
    axes[0, 0].plot(epochs, val_losses, label="Val Loss", color="red")
    axes[0, 0].set_xlabel("√âpoca")
    axes[0, 0].set_ylabel("Loss (BCE)")
    axes[0, 0].set_title("Loss de Entrenamiento y Validaci√≥n")
    axes[0, 0].grid(True, alpha=0.3)
    axes[0, 0].legend()
    
    # Accuracy
    axes[0, 1].plot(epochs, train_accuracies, label="Train Accuracy", color="blue")
    axes[0, 1].set_xlabel("√âpoca")
    axes[0, 1].set_ylabel("Accuracy")
    axes[0, 1].set_title("Accuracy de Entrenamiento")
    axes[0, 1].grid(True, alpha=0.3)
    axes[0, 1].legend()
    
    # F1 Score en validaci√≥n
    if val_f1_scores:
        axes[1, 0].plot(epochs, val_f1_scores, label="Val F1 (macro)", color="red")
        axes[1, 0].set_xlabel("√âpoca")
        axes[1, 0].set_ylabel("F1-Score")
        axes[1, 0].set_title("F1-Score en Validaci√≥n")
        axes[1, 0].grid(True, alpha=0.3)
        axes[1, 0].legend()
    
    # Learning Rate (si est√° disponible)
    if learning_rates:
        axes[0, 2].plot(epochs, learning_rates, label="Learning Rate", color="green")
        axes[0, 2].set_xlabel("√âpoca")
        axes[0, 2].set_ylabel("Learning Rate")
        axes[0, 2].set_title("Learning Rate durante Entrenamiento")
        axes[0, 2].set_yscale('log')  # Escala logar√≠tmica para mejor visualizaci√≥n
        axes[0, 2].grid(True, alpha=0.3)
        axes[0, 2].legend()
        
        # Comparaci√≥n Train vs Val Accuracy (necesitar√≠amos val_accuracies, pero no las tenemos)
        axes[1, 1].plot(epochs, train_accuracies, label="Train Accuracy", color="blue", alpha=0.7)
        axes[1, 1].set_xlabel("√âpoca")
        axes[1, 1].set_ylabel("Accuracy")
        axes[1, 1].set_title("Train Accuracy")
        axes[1, 1].grid(True, alpha=0.3)
        axes[1, 1].legend()
        
        # Loss comparaci√≥n
        axes[1, 2].plot(epochs, train_losses, label="Train Loss", color="blue", alpha=0.7)
        axes[1, 2].plot(epochs, val_losses, label="Val Loss", color="red", alpha=0.7)
        axes[1, 2].set_xlabel("√âpoca")
        axes[1, 2].set_ylabel("Loss")
        axes[1, 2].set_title("Train vs Val Loss")
        axes[1, 2].grid(True, alpha=0.3)
        axes[1, 2].legend()
    else:
        # Comparaci√≥n Train vs Val Accuracy (si no hay LR)
        axes[1, 1].plot(epochs, train_accuracies, label="Train Accuracy", color="blue", alpha=0.7)
        axes[1, 1].set_xlabel("√âpoca")
        axes[1, 1].set_ylabel("Accuracy")
        axes[1, 1].set_title("Accuracy de Entrenamiento")
        axes[1, 1].grid(True, alpha=0.3)
        axes[1, 1].legend()
    
    plt.tight_layout()
    
    png_path = output_dir / "training_curves.png"
    plt.savefig(png_path, dpi=150)
    plt.close()
    
    return png_path


---

## 7. ü™Ñ Orquestaci√≥n con Prefect


In [119]:
# ========================================
# Tarea Prefect: Cargar datos
# ========================================
@task(name="load_data", log_prints=True, cache_policy=NO_CACHE)
def task_load_data(config: Dict):
    """Tarea Prefect para cargar datos desde tensors_200hz."""
    print("üìÇ Cargando datos...")
    
    # Cargar informaci√≥n de datos primero (para verificar formas)
    load_tensor_data_info(config["DATA_DIR"])
    
    # Crear DataLoaders usando Dataset personalizado (mantiene datos en CPU)
    train_loader, val_loader, test_loader, y_val_np, y_test_np = create_dataloaders_from_files(
        config["DATA_DIR"],
        batch_size=config["BATCH_SIZE"],
        shuffle_train=True,
    )
    
    print("‚úì Datos cargados y preparados")
    return train_loader, val_loader, test_loader, y_val_np, y_test_np


In [120]:
# ========================================
# Tarea Prefect: Entrenar modelo
# ========================================
@task(name="train_model", log_prints=True, cache_policy=NO_CACHE)
def task_train_model(
    model: LSTMClassifier,
    train_loader: DataLoader,
    val_loader: DataLoader,
    y_val: np.ndarray,
    config: Dict,
    device: torch.device,
    experiment_id: str,
):
    """Tarea Prefect para entrenar el modelo."""
    print("üèãÔ∏è Iniciando entrenamiento...")
    print(f"  üìä Verificando DataLoaders...")
    print(f"    Train: {len(train_loader)} batches ({len(train_loader.dataset)} muestras)")
    print(f"    Val: {len(val_loader)} batches ({len(val_loader.dataset)} muestras)")
    print(f"    Device: {device}")
    
    # Mover modelo a dispositivo
    print(f"  üîÑ Moviendo modelo a {device}...")
    model = model.to(device)
    print(f"  ‚úì Modelo en {device}")
    
    # Optimizador y criterio
    print(f"  üîÑ Inicializando optimizador y criterio...")
    optimizer = optim.Adam(
        model.parameters(),
        lr=config["LEARNING_RATE"],
        weight_decay=config["WEIGHT_DECAY"],
    )
    criterion = nn.BCELoss()
    print(f"  ‚úì Optimizador y criterio listos")
    
    # ‚¨ÜÔ∏è NUEVO: Learning Rate Scheduler
    scheduler = None
    if config.get("USE_SCHEDULER", False):
        scheduler = optim.lr_scheduler.ReduceLROnPlateau(
            optimizer,
            mode=config.get("SCHEDULER_MODE", "max"),
            factor=config.get("SCHEDULER_FACTOR", 0.5),
            patience=config.get("SCHEDULER_PATIENCE", 5),
            min_lr=config.get("SCHEDULER_MIN_LR", 1e-6),
        )
        print(f"‚úì Learning Rate Scheduler configurado:")
        print(f"  Modo: {config.get('SCHEDULER_MODE', 'max')}")
        print(f"  Patience: {config.get('SCHEDULER_PATIENCE', 5)} √©pocas")
        print(f"  Factor: {config.get('SCHEDULER_FACTOR', 0.5)} (reduce a la mitad)")
        print(f"  LR m√≠nimo: {config.get('SCHEDULER_MIN_LR', 1e-6)}")
    
    # Gradient clipping
    clip_grad_norm = config.get("CLIP_GRAD_NORM", None)
    if clip_grad_norm is not None and clip_grad_norm > 0:
        print(f"‚úì Gradient Clipping habilitado: {clip_grad_norm}")
    
    # Listas para tracking
    train_losses = []
    train_accuracies = []
    val_losses = []
    val_f1_scores = []
    learning_rates = []  # ‚¨ÜÔ∏è NUEVO: Track LR
    best_f1 = 0.0
    best_model_state = None
    
    # Iniciar run de MLflow
    print(f"  üîÑ Iniciando run de MLflow...")
    with mlflow.start_run(experiment_id=experiment_id, run_name=config["RUN_NAME"]):
        print(f"  ‚úì Run de MLflow iniciado")
        # Log hiperpar√°metros
        print(f"  üîÑ Loggeando hiperpar√°metros en MLflow...")
        mlflow.log_params({
            "n_channels": config["N_CHANNELS"],
            "seq_len": config["SEQ_LEN"],
            "input_size": config["INPUT_SIZE"],
            "hidden_size": config["HIDDEN_SIZE"],
            "num_layers": config["NUM_LAYERS"],
            "dropout": config["DROPOUT"],
            "bidirectional": config["BIDIRECTIONAL"],
            "fc_units": config["FC_UNITS"],
            "fc_dropout": config["FC_DROPOUT"],
            "batch_size": config["BATCH_SIZE"],
            "learning_rate": config["LEARNING_RATE"],
            "num_epochs": config["NUM_EPOCHS"],
            "weight_decay": config["WEIGHT_DECAY"],
            "use_scheduler": config.get("USE_SCHEDULER", False),
            "scheduler_patience": config.get("SCHEDULER_PATIENCE", 3),
            "scheduler_factor": config.get("SCHEDULER_FACTOR", 0.5),
            "scheduler_min_lr": config.get("SCHEDULER_MIN_LR", 1e-6),
            "clip_grad_norm": config.get("CLIP_GRAD_NORM", None),
            "cudnn_benchmark": config.get("ENABLE_CUDNN_BENCHMARK", True),
            "seed": config["SEED"],
        })
        print(f"  ‚úì Hiperpar√°metros loggeados")
        
        # Loop de entrenamiento
        print(f"\nüöÄ Iniciando loop de entrenamiento ({config['NUM_EPOCHS']} √©pocas)...")
        print(f"  Sin mensajes de progreso por batch (solo resultados por √©poca)\n")
        for epoch in range(1, config["NUM_EPOCHS"] + 1):
            print(f"\n{'='*60}")
            print(f"üìÖ √âPOCA {epoch}/{config['NUM_EPOCHS']}")
            print(f"{'='*60}")
            
            # Entrenar
            print(f"  üèãÔ∏è Entrenando...")
            train_loss, train_acc = train_one_epoch(
                model, train_loader, optimizer, criterion, device,
                clip_grad_norm=clip_grad_norm,
            )
            train_losses.append(train_loss)
            train_accuracies.append(train_acc)
            print(f"  ‚úì Entrenamiento completado: Loss={train_loss:.4f}, Acc={train_acc:.4f}")
            
            # Validar
            print(f"  üìä Validando...")
            val_loss, val_acc, y_val_true, y_val_pred = evaluate(
                model, val_loader, criterion, device
            )
            val_losses.append(val_loss)
            print(f"  ‚úì Validaci√≥n completada: Loss={val_loss:.4f}, Acc={val_acc:.4f}")
            
            # Calcular m√©tricas de validaci√≥n
            val_metrics = compute_metrics(y_val_true, y_val_pred)
            val_f1 = val_metrics["f1_macro"]
            val_f1_scores.append(val_f1)
            
            # ‚¨ÜÔ∏è NUEVO: Actualizar Learning Rate Scheduler
            current_lr = optimizer.param_groups[0]['lr']
            learning_rates.append(current_lr)
            
            if scheduler is not None:
                # ReduceLROnPlateau usa la m√©trica (val_f1 para maximizar)
                scheduler.step(val_f1)
                new_lr = optimizer.param_groups[0]['lr']
                if new_lr < current_lr:
                    print(f"  ‚¨áÔ∏è Learning Rate reducido: {current_lr:.6f} ‚Üí {new_lr:.6f}")
            
            # Log m√©tricas en MLflow
            mlflow.log_metrics({
                "train_loss": train_loss,
                "train_accuracy": train_acc,
                "val_loss": val_loss,
                "val_accuracy": val_metrics["accuracy"],
                "val_f1_macro": val_f1,
                "val_f1_normal": val_metrics["f1_normal"],
                "val_f1_anom": val_metrics["f1_anom"],
                "val_precision_macro": val_metrics["precision_macro"],
                "val_recall_macro": val_metrics["recall_macro"],
                "learning_rate": current_lr,  # ‚¨ÜÔ∏è NUEVO: Log LR actual
            }, step=epoch)
            
            # Guardar mejor modelo
            if val_f1 > best_f1:
                best_f1 = val_f1
                best_model_state = model.state_dict().copy()
            
            # Print progreso
            if epoch % 5 == 0 or epoch == 1:
                print(
                    f"Epoch {epoch:03d}/{config['NUM_EPOCHS']} | "
                    f"Train Loss: {train_loss:.4f} | Train Acc: {train_acc:.4f} | "
                    f"Val Loss: {val_loss:.4f} | Val Acc: {val_metrics['accuracy']:.4f} | "
                    f"Val F1: {val_f1:.4f} | LR: {current_lr:.6f}"  # ‚¨ÜÔ∏è NUEVO: Mostrar LR
                )
        
        # Cargar mejor modelo
        if best_model_state is not None:
            model.load_state_dict(best_model_state)
        
        # Guardar curvas de entrenamiento
        curves_path = save_training_curves(
            train_losses, train_accuracies, val_losses, val_f1_scores, config["OUTPUT_DIR"],
            learning_rates=learning_rates if learning_rates else None,  # ‚¨ÜÔ∏è NUEVO: Incluir LR
        )
        mlflow.log_artifact(str(curves_path))
        
        # Guardar matriz de confusi√≥n de validaci√≥n
        val_metrics_final = compute_metrics(y_val_true, y_val_pred)
        cm_val_path, _ = save_confusion_matrix(
            val_metrics_final["confusion_matrix"], config["OUTPUT_DIR"], "val"
        )
        mlflow.log_artifact(str(cm_val_path))
        
        # Guardar modelo
        mlflow.pytorch.log_model(model, "model")
        
        print(f"‚úì Entrenamiento completado. Mejor F1 (macro): {best_f1:.4f}")
    
    return model, train_losses, train_accuracies, val_losses, val_f1_scores, best_f1, learning_rates


In [121]:
# ========================================
# Tarea Prefect: Evaluar en test
# ========================================
@task(name="evaluate_test", log_prints=True, cache_policy=NO_CACHE)
def task_evaluate_test(
    model: LSTMClassifier,
    test_loader: DataLoader,
    y_test: np.ndarray,
    device: torch.device,
    config: Dict,
    experiment_id: str,
):
    """Tarea Prefect para evaluar en test."""
    print("üìä Evaluando en conjunto de test...")
    
    model = model.to(device)
    criterion = nn.BCELoss()
    
    # Evaluar
    test_loss, test_acc, y_test_true, y_test_pred = evaluate(
        model, test_loader, criterion, device
    )
    
    # Calcular m√©tricas completas
    test_metrics = compute_metrics(y_test_true, y_test_pred)
    test_metrics["loss"] = test_loss
    test_metrics["accuracy"] = test_acc
    
    # Log en MLflow
    with mlflow.start_run(experiment_id=experiment_id, run_name=config["RUN_NAME"]):
        mlflow.log_metrics({
            "test_loss": test_loss,
            "test_accuracy": test_metrics["accuracy"],
            "test_f1_macro": test_metrics["f1_macro"],
            "test_f1_normal": test_metrics["f1_normal"],
            "test_f1_anom": test_metrics["f1_anom"],
            "test_precision_macro": test_metrics["precision_macro"],
            "test_recall_macro": test_metrics["recall_macro"],
            "test_specificity": test_metrics["specificity"],
            "test_sensitivity": test_metrics["sensitivity"],
        })
        
        # Guardar matriz de confusi√≥n de test
        cm_test_path, _ = save_confusion_matrix(
            test_metrics["confusion_matrix"], config["OUTPUT_DIR"], "test"
        )
        mlflow.log_artifact(str(cm_test_path))
    
    print("‚úì Evaluaci√≥n en test completada:")
    print(f"  Accuracy: {test_metrics['accuracy']:.4f}")
    print(f"  Precision (normal): {test_metrics['precision_normal']:.4f} | Recall: {test_metrics['recall_normal']:.4f} | F1: {test_metrics['f1_normal']:.4f}")
    print(f"  Precision (an√≥malo): {test_metrics['precision_anom']:.4f} | Recall: {test_metrics['recall_anom']:.4f} | F1: {test_metrics['f1_anom']:.4f}")
    print(f"  F1 Macro: {test_metrics['f1_macro']:.4f}")
    
    return test_metrics


In [122]:
# ========================================
# Flujo principal de Prefect
# ========================================
@flow(name="lstm_classification_training_flow", log_prints=True)
def lstm_classification_training_flow(config: Dict = None):
    """
    Flujo principal de Prefect que orquesta todo el proceso:
    1. Carga y preparaci√≥n de datos
    2. Creaci√≥n del modelo
    3. Entrenamiento
    4. Evaluaci√≥n en test
    """
    if config is None:
        config = CONFIG
    
    print("üöÄ Iniciando flujo de entrenamiento LSTM Clasificaci√≥n...")
    print(f"Experimento MLflow: {config['EXPERIMENT_NAME']}")
    
    # Configurar MLflow
    experiment_id = setup_mlflow(config)
    
    # Cargar y preparar datos
    dataloaders = task_load_data(config)
    train_loader, val_loader, test_loader, y_val, y_test = dataloaders
    
    # Crear modelo
    print("üß† Creando modelo...")
    model = create_model(config)
    
    # Entrenar
    model, train_losses, train_accs, val_losses, val_f1_scores, best_f1, learning_rates = task_train_model(
        model, train_loader, val_loader, y_val, config, DEVICE, experiment_id
    )
    
    # Evaluar en test
    test_metrics = task_evaluate_test(
        model, test_loader, y_test, DEVICE, config, experiment_id
    )
    
    print("\n" + "="*60)
    print("‚úÖ FLUJO COMPLETADO")
    print("="*60)
    print(f"Mejor F1 en validaci√≥n: {best_f1:.4f}")
    print(f"F1 en test: {test_metrics['f1_macro']:.4f}")
    print(f"\nRevisa MLflow para ver todos los artefactos y m√©tricas.")
    
    return {
        "model": model,
        "test_metrics": test_metrics,
        "best_f1": best_f1,
    }


---

## 8. üöÄ Ejecuci√≥n del Flujo Completo


In [123]:
# ========================================
# Ejecutar el flujo completo
# ========================================
if __name__ == "__main__":
    results = lstm_classification_training_flow(CONFIG)
    print("\n‚úì Proceso finalizado exitosamente")


2025-11-24 22:55:17 INFO  [prefect.flow_runs] Beginning flow run 'bulky-gharial' for flow 'lstm_classification_training_flow'
2025-11-24 22:55:17 INFO  [prefect.flow_runs] üöÄ Iniciando flujo de entrenamiento LSTM Clasificaci√≥n...
2025-11-24 22:55:17 INFO  [prefect.flow_runs] Experimento MLflow: ecg_lstm_supervisado
2025-11-24 22:55:17 INFO  [prefect.flow_runs] ‚úì MLflow tracking URI: sqlite:///S:/Proyecto final/mlflow.db
2025-11-24 22:55:17 INFO  [prefect.flow_runs] ‚úì Experimento MLflow existente: ecg_lstm_supervisado (ID: 8)
2025-11-24 22:55:17 INFO  [prefect.task_runs] üìÇ Cargando datos...
2025-11-24 22:55:17 INFO  [prefect.task_runs] üìÇ CARGANDO INFORMACI√ìN DE DATOS DESDE tensors_200hz
2025-11-24 22:55:17 INFO  [prefect.task_runs] Directorio: S:\Proyecto final\data\Datos_supervisados\tensors_200hz
2025-11-24 22:55:17 INFO  [prefect.task_runs] 
‚è≥ Cargando informaci√≥n de datos (solo etiquetas, sin cargar X)...
2025-11-24 22:55:17 INFO  [prefect.task_runs] 
‚úì Informaci√

RuntimeError: No hay suficiente memoria para cargar X_train.pt (~6.05 GB). Reduce BATCH_SIZE o convierte a HDF5.

---

## ‚úÖ Checklist Final

Antes de ejecutar el notebook completo:

1. ‚úÖ **Ajusta la ruta `DATA_DIR`** en la secci√≥n de configuraci√≥n (debe apuntar a `Datos_supervisados`)
2. ‚úÖ **Verifica los par√°metros** en la secci√≥n de configuraci√≥n (INPUT_SIZE, SEQ_LEN, etc.)
3. ‚úÖ **Cambia el nombre del experimento MLflow** (`EXPERIMENT_NAME`) si quieres crear uno nuevo
4. ‚úÖ **Ejecuta todas las celdas en orden** (empezando por Setup CUDA)

### üìù Notas importantes:

- El modelo se entrena con **datos supervisados** (etiquetas 0=normal, 1=an√≥malo)
- Los datos se cargan desde `Datos_supervisados/numpy/` (X_train.npy, y_train.npy, etc.)
- Todos los artefactos (modelo, gr√°ficos, matrices de confusi√≥n) se guardan en `OUTPUT_DIR` y en MLflow
- Las m√©tricas se registran en MLflow: `train_loss`, `train_accuracy`, `val_*`, `test_*`
- El modelo usa **BCE Loss** (Binary Cross Entropy) para clasificaci√≥n binaria
