# Module 2.2: RNN/LSTM for Time Series with MLflow

## Theory: Recurrent Neural Networks (RNN)

### What are RNNs?
RNNs are architectures designed to process **sequences** of data, where order matters:
- Time series
- Text and natural language
- Audio and video
- Sequential data in general

### RNN Architectures:

#### 1. **Vanilla RNN**
```
h_t = tanh(W_hh * h_{t-1} + W_xh * x_t + b_h)
y_t = W_hy * h_t + b_y
```
- **Problem**: vanishing/exploding gradients during backpropagation through time
- As a result, it struggles to capture long-term dependencies
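
To make the recurrence above concrete, here is a minimal NumPy sketch of a single vanilla RNN step applied along a short sequence. The layer sizes, random weights, and the helper name `rnn_step` are illustrative assumptions and are not part of the notebook's Keras models.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    """One vanilla RNN step: returns (h_t, y_t)."""
    h_t = np.tanh(W_hh @ h_prev + W_xh @ x_t + b_h)
    y_t = W_hy @ h_t + b_y
    return h_t, y_t

rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim = 1, 4, 1  # illustrative sizes (assumption)

W_xh = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
W_hy = rng.normal(scale=0.5, size=(output_dim, hidden_dim))
b_h = np.zeros(hidden_dim)
b_y = np.zeros(output_dim)

# 6 time steps with 1 feature each
sequence = np.sin(np.linspace(0, 3, 6)).reshape(-1, 1)

h = np.zeros(hidden_dim)  # initial hidden state
outputs = []
for x_t in sequence:
    # The same weights are reused at every time step
    h, y_t = rnn_step(x_t, h, W_xh, W_hh, W_hy, b_h, b_y)
    outputs.append(y_t)
outputs = np.array(outputs)

print(outputs.shape)  # (6, 1)
```

Because `W_hh` is multiplied in at every step, gradients flowing back through many steps are repeatedly scaled by it, which is where the vanishing/exploding behaviour comes from.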

#### 2. **LSTM (Long Short-Term Memory)**
Addresses the vanishing-gradient problem with:
- **Forget Gate**: what to discard from the previous cell state
- **Input Gate**: what new information to add
- **Output Gate**: what part of the cell state to expose as output
- **Cell State**: long-term memory

```
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)     # Forget gate
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)     # Input gate
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)  # Candidate values
C_t = f_t * C_{t-1} + i_t * C̃_t         # New cell state
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)     # Output gate
h_t = o_t * tanh(C_t)                   # New hidden state
```

**Advantages of LSTM**:
- Captures long-term dependencies
- Mitigates vanishing gradients (exploding gradients can still occur and are usually handled with gradient clipping)
- Selective memory
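
The gate equations above can be sketched directly in NumPy. This is a minimal illustration under assumed sizes and random weights; the `params` dictionary and the `lstm_step` helper are hypothetical names, and Keras' `layers.LSTM` organizes its weights differently internally.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step following the gate equations; returns (h_t, c_t)."""
    z = np.concatenate([h_prev, x_t])                      # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])       # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])       # input gate
    c_tilde = np.tanh(params["W_C"] @ z + params["b_C"])   # candidate values
    c_t = f_t * c_prev + i_t * c_tilde                     # new cell state
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])       # output gate
    h_t = o_t * np.tanh(c_t)                               # new hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
input_dim, hidden_dim = 1, 3  # illustrative sizes (assumption)

params = {}
for gate in ("f", "i", "C", "o"):
    params[f"W_{gate}"] = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim + input_dim))
    params[f"b_{gate}"] = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)
c = np.zeros(hidden_dim)
for x_t in np.array([[0.1], [0.5], [0.9]]):
    h, c = lstm_step(x_t, h, c, params)

print(h.shape, c.shape)  # (3,) (3,)
```

Note that the cell state update is additive (`f_t * c_prev + i_t * c_tilde`), which is the path that lets gradients survive over longer spans than in the vanilla recurrence.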

#### 3. **GRU (Gated Recurrent Unit)**
Simplified version of LSTM:
- Fewer parameters
- Faster to train
- Similar performance in many cases
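
As a rough check of the "fewer parameters" claim, the sketch below counts weights for the textbook formulations: four weight blocks for LSTM and three for GRU, each acting on `[h_{t-1}, x_t]` with one bias vector. The function names are hypothetical. Keras' default GRU layer keeps a second bias vector, so `model.summary()` may report a slightly higher GRU count than this formula.

```python
def lstm_param_count(input_dim, units):
    # 4 blocks (forget, input, candidate, output): weights over [h, x] plus one bias each
    return 4 * (units * (units + input_dim) + units)

def gru_param_count(input_dim, units):
    # 3 blocks (update, reset, candidate): weights over [h, x] plus one bias each
    return 3 * (units * (units + input_dim) + units)

lstm_params = lstm_param_count(input_dim=1, units=50)
gru_params = gru_param_count(input_dim=1, units=50)

print(lstm_params)  # 10400
print(gru_params)   # 7800
```

With one input feature and 50 units, the GRU formulation needs three quarters of the LSTM's recurrent parameters.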

### Bidirectional RNN
Processes the sequence in both directions:
```
Forward:  x_1 → x_2 → x_3 → ... → x_n
Backward: x_n → ... → x_3 → x_2 → x_1
```
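Useful when future context also matters. In the forecasting models below, both passes run over the already observed input window, so no future target values are used.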
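
A minimal NumPy sketch of the bidirectional idea follows: one simple tanh RNN reads the sequence left to right, a second one with its own weights reads it right to left, and the per-step states are concatenated (the default merge mode of Keras' `Bidirectional` wrapper). The helpers `run_rnn` and `make_weights` and all sizes are assumptions for illustration.

```python
import numpy as np

def run_rnn(sequence, W_xh, W_hh, b_h):
    """Return the hidden state at every time step of a simple tanh RNN."""
    h = np.zeros(W_hh.shape[0])
    states = []
    for x_t in sequence:
        h = np.tanh(W_hh @ h + W_xh @ x_t + b_h)
        states.append(h)
    return np.array(states)

def make_weights(rng, input_dim, hidden_dim):
    """Random weights for one direction (illustrative initialization)."""
    return (
        rng.normal(scale=0.5, size=(hidden_dim, input_dim)),
        rng.normal(scale=0.5, size=(hidden_dim, hidden_dim)),
        np.zeros(hidden_dim),
    )

rng = np.random.default_rng(0)
seq = np.linspace(0.0, 1.0, 5).reshape(-1, 1)  # 5 time steps, 1 feature

fwd_weights = make_weights(rng, 1, 4)
bwd_weights = make_weights(rng, 1, 4)  # the backward pass has separate weights

forward_states = run_rnn(seq, *fwd_weights)
# Feed the reversed sequence, then flip the states back to the original time order
backward_states = run_rnn(seq[::-1], *bwd_weights)[::-1]

bidirectional_states = np.concatenate([forward_states, backward_states], axis=-1)
print(bidirectional_states.shape)  # (5, 8)
```

The feature dimension doubles, which is why a `Bidirectional(LSTM(64))` layer produces 128 output features.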

## Objectives
- Build RNN/LSTM models for time series
- Multi-step forecasting
- Tracking with MLflow
- Bidirectional architectures
- Stacked LSTMs

In [None]:
import mlflow
import mlflow.keras
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models, callbacks
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
import warnings
warnings.filterwarnings('ignore')

print(f"TensorFlow version: {tf.__version__}")

In [None]:
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("tensorflow-rnn-timeseries")

## 1. Generating a Synthetic Time Series

We generate a time series with several components:
- Trend
- Seasonality
- Noise

In [None]:
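def generate_synthetic_timeseries(n_samples=2000, seed=42):
    """Generate a synthetic series: linear trend + two seasonal components + Gaussian noise."""
    np.random.seed(seed)
    
    time = np.arange(n_samples)
    
    # Linear upward trend starting at 10
    trend = 0.01 * time + 10
    
    # Two seasonal components with periods of 50 and 100 time steps
    seasonal_1 = 5 * np.sin(2 * np.pi * time / 50)
    seasonal_2 = 3 * np.cos(2 * np.pi * time / 100)
    
    # Gaussian noise with standard deviation 1
    noise = np.random.normal(0, 1, n_samples)
    
    series = trend + seasonal_1 + seasonal_2 + noise
    
    return time, series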

time, series = generate_synthetic_timeseries()

plt.figure(figsize=(15, 5))
plt.plot(time, series, linewidth=0.8)
plt.title('Synthetic Time Series')
plt.xlabel('Time')
plt.ylabel('Value')
plt.grid(True, alpha=0.3)
plt.savefig('timeseries_full.png', dpi=150)
plt.show()

print(f"Series length: {len(series)}")
print(f"Mean: {series.mean():.2f}")
print(f"Std: {series.std():.2f}")
print(f"Min: {series.min():.2f}")
print(f"Max: {series.max():.2f}")

## 2. Data Preparation for RNNs

We transform the time series into supervised sequences:
- **X**: window of the `window_size` most recent past values
- **y**: the next value to predict (or the next `forecast_horizon` values)

In [None]:
def create_sequences(data, window_size, forecast_horizon=1):
    """
    Create sequences for RNN training.
    
    Args:
        data: 1D time series
        window_size: Size of the input window
        forecast_horizon: How many steps ahead to predict
    
    Returns:
        X: Input sequences (samples, window_size, 1)
        y: Targets, shape (samples,) if forecast_horizon == 1,
           otherwise (samples, forecast_horizon)
    """
    X, y = [], []
    
    for i in range(len(data) - window_size - forecast_horizon + 1):
        X.append(data[i:i + window_size])
        
        if forecast_horizon == 1:
            y.append(data[i + window_size])
        else:
            y.append(data[i + window_size:i + window_size + forecast_horizon])
    
    X = np.array(X).reshape(-1, window_size, 1)
    y = np.array(y)
    
    return X, y

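# Fit the scaler only on the first 80% of the series (the training portion)
# so that test-set statistics do not leak into the scaling.
n_fit = int(0.8 * len(series))
scaler = MinMaxScaler()
scaler.fit(series[:n_fit].reshape(-1, 1))
series_scaled = scaler.transform(series.reshape(-1, 1)).flatten()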

window_size = 50
forecast_horizon = 1

X, y = create_sequences(series_scaled, window_size, forecast_horizon)

train_size = int(0.8 * len(X))
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

print(f"X_train shape: {X_train.shape}")
print(f"y_train shape: {y_train.shape}")
print(f"X_test shape: {X_test.shape}")
print(f"y_test shape: {y_test.shape}")

print(f"\nExample:")
print(f"Input window (first 5 values): {X_train[0, :5, 0]}")
print(f"Target: {y_train[0]}")
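
A quick way to sanity-check window alignment is to run the same idea on a toy series whose values equal their indices. The sketch below uses NumPy's `sliding_window_view` rather than the notebook's `create_sequences`, so it is an independent cross-check and not the notebook's implementation; the `toy` names are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

toy = np.arange(10)   # values equal their time index: 0, 1, ..., 9
toy_window = 3

# All length-3 windows: [0 1 2], [1 2 3], ..., [7 8 9]
windows = sliding_window_view(toy, toy_window)

# The last window has no following value, so it cannot be an input
X_toy = windows[:-1]
y_toy = toy[toy_window:]

print(X_toy[0], y_toy[0])    # [0 1 2] 3
print(X_toy[-1], y_toy[-1])  # [6 7 8] 9
print(X_toy.shape, y_toy.shape)  # (7, 3) (7,)
```

Each target is the value immediately after its window, and the number of samples is `len(series) - window_size`, which matches the loop bound in `create_sequences` for `forecast_horizon=1`.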

## 3. Simple Model: Basic LSTM

In [None]:
def create_simple_lstm(window_size, units=50):
    model = models.Sequential([
        layers.LSTM(units, input_shape=(window_size, 1)),
        layers.Dense(1)
    ])
    return model

model = create_simple_lstm(window_size)
model.summary()

In [None]:
with mlflow.start_run(run_name="simple_lstm") as run:
    
    model = create_simple_lstm(window_size, units=50)
    
    lr = 0.001
    epochs = 50
    batch_size = 32
    
    mlflow.log_param("model_type", "SimpleLSTM")
    mlflow.log_param("lstm_units", 50)
    mlflow.log_param("window_size", window_size)
    mlflow.log_param("learning_rate", lr)
    mlflow.log_param("epochs", epochs)
    mlflow.log_param("batch_size", batch_size)
    mlflow.log_param("total_parameters", model.count_params())
    
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=lr),
        loss='mse',
        metrics=['mae']
    )
    
    early_stopping = callbacks.EarlyStopping(
        monitor='val_loss',
        patience=10,
        restore_best_weights=True
    )
    
    history = model.fit(
        X_train, y_train,
        epochs=epochs,
        batch_size=batch_size,
        validation_split=0.1,
        callbacks=[early_stopping],
        verbose=1
    )
    
    y_pred_train = model.predict(X_train).flatten()
    y_pred_test = model.predict(X_test).flatten()
    
    train_mse = mean_squared_error(y_train, y_pred_train)
    train_mae = mean_absolute_error(y_train, y_pred_train)
    train_r2 = r2_score(y_train, y_pred_train)
    
    test_mse = mean_squared_error(y_test, y_pred_test)
    test_mae = mean_absolute_error(y_test, y_pred_test)
    test_r2 = r2_score(y_test, y_pred_test)
    
    mlflow.log_metric("train_mse", train_mse)
    mlflow.log_metric("train_mae", train_mae)
    mlflow.log_metric("train_r2", train_r2)
    mlflow.log_metric("test_mse", test_mse)
    mlflow.log_metric("test_mae", test_mae)
    mlflow.log_metric("test_r2", test_r2)
    
    fig, axes = plt.subplots(2, 2, figsize=(15, 10))
    
    axes[0, 0].plot(history.history['loss'], label='Train Loss')
    axes[0, 0].plot(history.history['val_loss'], label='Val Loss')
    axes[0, 0].set_xlabel('Epoch')
    axes[0, 0].set_ylabel('MSE')
    axes[0, 0].set_title('Training History')
    axes[0, 0].legend()
    axes[0, 0].grid(True, alpha=0.3)
    
    axes[0, 1].plot(y_test[:200], label='Actual', alpha=0.7)
    axes[0, 1].plot(y_pred_test[:200], label='Predicted', alpha=0.7)
    axes[0, 1].set_xlabel('Time')
    axes[0, 1].set_ylabel('Value (Normalized)')
    axes[0, 1].set_title('Predictions vs Actual (First 200 points)')
    axes[0, 1].legend()
    axes[0, 1].grid(True, alpha=0.3)
    
    axes[1, 0].scatter(y_test, y_pred_test, alpha=0.5, s=10)
    axes[1, 0].plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--', lw=2)
    axes[1, 0].set_xlabel('Actual')
    axes[1, 0].set_ylabel('Predicted')
    axes[1, 0].set_title(f'Actual vs Predicted (R²={test_r2:.4f})')
    axes[1, 0].grid(True, alpha=0.3)
    
    residuals = y_test - y_pred_test
    axes[1, 1].hist(residuals, bins=50, edgecolor='black', alpha=0.7)
    axes[1, 1].set_xlabel('Residual')
    axes[1, 1].set_ylabel('Frequency')
    axes[1, 1].set_title('Residuals Distribution')
    axes[1, 1].axvline(x=0, color='r', linestyle='--', lw=2)
    axes[1, 1].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.savefig('lstm_evaluation.png', dpi=150)
    mlflow.log_artifact('lstm_evaluation.png')
    plt.show()
    
    mlflow.keras.log_model(model, "lstm_model")
    
    mlflow.set_tag("architecture", "LSTM")
    mlflow.set_tag("task", "time_series_forecasting")
    
    print(f"\nTest Results:")
    print(f"MSE: {test_mse:.6f}")
    print(f"MAE: {test_mae:.6f}")
    print(f"R²: {test_r2:.6f}")
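
To judge whether test metrics like the ones printed above are meaningful, it can help to compare them with a naive persistence forecast, which predicts that the next value equals the last value in the input window. The sketch below is an assumption-laden illustration: `persistence_baseline` is a hypothetical helper, and the small demo series stands in for the notebook's `X_test` and `y_test`.

```python
import numpy as np

def persistence_baseline(X, y):
    """Predict each target as the last value of its input window; return (mse, mae, r2)."""
    y_pred = X[:, -1, 0]
    err = y - y_pred
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    # Same definition as the coefficient of determination: 1 - SS_res / SS_tot
    r2 = 1.0 - float(np.sum(err ** 2) / np.sum((y - y.mean()) ** 2))
    return mse, mae, r2

# Small demo series (hypothetical stand-in for the notebook's test data)
rng = np.random.default_rng(0)
t = np.arange(300)
demo = np.sin(2 * np.pi * t / 50) + rng.normal(0, 0.1, t.size)

w = 20  # demo window size (assumption)
X_demo = np.stack([demo[i:i + w] for i in range(len(demo) - w)])[..., None]
y_demo = demo[w:]

base_mse, base_mae, base_r2 = persistence_baseline(X_demo, y_demo)
print(f"Persistence baseline -> MSE: {base_mse:.4f}, MAE: {base_mae:.4f}, R2: {base_r2:.4f}")
```

In the notebook, calling the same helper as `persistence_baseline(X_test, y_test)` would give reference values on the normalized scale. On a smooth series, persistence can already score a high R², so the LSTM is worth its cost only if it clearly improves on that.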

## 4. Advanced Model: Stacked Bidirectional LSTM

In [None]:
def create_stacked_bidirectional_lstm(window_size):
    model = models.Sequential([
        layers.Bidirectional(
            layers.LSTM(64, return_sequences=True),
            input_shape=(window_size, 1)
        ),
        layers.Dropout(0.2),
        
        layers.Bidirectional(
            layers.LSTM(32, return_sequences=True)
        ),
        layers.Dropout(0.2),
        
        layers.Bidirectional(
            layers.LSTM(16)
        ),
        layers.Dropout(0.2),
        
        layers.Dense(32, activation='relu'),
        layers.Dropout(0.2),
        layers.Dense(1)
    ])
    return model

model_advanced = create_stacked_bidirectional_lstm(window_size)
model_advanced.summary()

In [None]:
with mlflow.start_run(run_name="stacked_bidirectional_lstm") as run:
    
    model = create_stacked_bidirectional_lstm(window_size)
    
    lr = 0.001
    epochs = 50
    batch_size = 32
    
    mlflow.log_param("model_type", "StackedBidirectionalLSTM")
    mlflow.log_param("num_lstm_layers", 3)
    mlflow.log_param("bidirectional", True)
    mlflow.log_param("window_size", window_size)
    mlflow.log_param("learning_rate", lr)
    mlflow.log_param("epochs", epochs)
    mlflow.log_param("batch_size", batch_size)
    mlflow.log_param("total_parameters", model.count_params())
    
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=lr),
        loss='mse',
        metrics=['mae']
    )
    
    early_stopping = callbacks.EarlyStopping(
        monitor='val_loss',
        patience=10,
        restore_best_weights=True
    )
    
    reduce_lr = callbacks.ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,
        patience=5,
        min_lr=1e-6
    )
    
    history = model.fit(
        X_train, y_train,
        epochs=epochs,
        batch_size=batch_size,
        validation_split=0.1,
        callbacks=[early_stopping, reduce_lr],
        verbose=1
    )
    
    y_pred_test = model.predict(X_test).flatten()
    
    test_mse = mean_squared_error(y_test, y_pred_test)
    test_mae = mean_absolute_error(y_test, y_pred_test)
    test_r2 = r2_score(y_test, y_pred_test)
    
    mlflow.log_metric("test_mse", test_mse)
    mlflow.log_metric("test_mae", test_mae)
    mlflow.log_metric("test_r2", test_r2)
    
    mlflow.keras.log_model(model, "stacked_bidirectional_lstm_model")
    
    print(f"\nStacked Bidirectional LSTM Results:")
    print(f"Test R²: {test_r2:.6f}")
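
The MSE and MAE logged in these runs are computed on the MinMax-scaled series, so they are not in the original units. The sketch below shows, with plain NumPy and made-up numbers, how min-max scaling is inverted and how MAE rescales by `(max - min)`. The helper names and demo values are assumptions for illustration.

```python
import numpy as np

def minmax_fit(values):
    """Return the (min, max) used for scaling."""
    return float(values.min()), float(values.max())

def minmax_transform(values, lo, hi):
    return (values - lo) / (hi - lo)

def minmax_inverse(scaled, lo, hi):
    return scaled * (hi - lo) + lo

rng = np.random.default_rng(0)

# Hypothetical targets in original units and a pretend model output in scaled units
y_true_orig = 20 + 5 * np.sin(np.linspace(0, 6, 100))
lo, hi = minmax_fit(y_true_orig)
y_true_scaled = minmax_transform(y_true_orig, lo, hi)
y_pred_scaled = y_true_scaled + rng.normal(0, 0.02, 100)

mae_scaled = np.mean(np.abs(y_true_scaled - y_pred_scaled))

# Map predictions back to original units before computing the error
y_pred_orig = minmax_inverse(y_pred_scaled, lo, hi)
mae_orig = np.mean(np.abs(y_true_orig - y_pred_orig))

print(np.isclose(mae_orig, mae_scaled * (hi - lo)))  # True
```

In the notebook itself, `scaler.inverse_transform(y_pred_test.reshape(-1, 1)).flatten()` would play the role of `minmax_inverse`. R² is unaffected by this linear rescaling, while MSE scales by `(max - min) ** 2`.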

## 5. Multi-Step Forecasting

Predict multiple steps into the future with a single model that outputs all `forecast_horizon` values at once.

In [None]:
forecast_horizon = 10
X_multi, y_multi = create_sequences(series_scaled, window_size, forecast_horizon)

train_size = int(0.8 * len(X_multi))
X_train_multi, X_test_multi = X_multi[:train_size], X_multi[train_size:]
y_train_multi, y_test_multi = y_multi[:train_size], y_multi[train_size:]

print(f"Multi-step forecasting:")
print(f"X_train shape: {X_train_multi.shape}")
print(f"y_train shape: {y_train_multi.shape}")
print(f"Forecast horizon: {forecast_horizon} steps")

In [None]:
def create_multistep_lstm(window_size, forecast_horizon):
    model = models.Sequential([
        layers.LSTM(100, return_sequences=True, input_shape=(window_size, 1)),
        layers.Dropout(0.2),
        layers.LSTM(50),
        layers.Dropout(0.2),
        layers.Dense(50, activation='relu'),
        layers.Dense(forecast_horizon)
    ])
    return model

with mlflow.start_run(run_name="multistep_lstm") as run:
    
    model = create_multistep_lstm(window_size, forecast_horizon)
    
    mlflow.log_param("model_type", "MultistepLSTM")
    mlflow.log_param("forecast_horizon", forecast_horizon)
    mlflow.log_param("window_size", window_size)
    
    model.compile(
        optimizer='adam',
        loss='mse',
        metrics=['mae']
    )
    
    early_stopping = callbacks.EarlyStopping(
        monitor='val_loss',
        patience=15,
        restore_best_weights=True
    )
    
    history = model.fit(
        X_train_multi, y_train_multi,
        epochs=50,
        batch_size=32,
        validation_split=0.1,
        callbacks=[early_stopping],
        verbose=1
    )
    
    y_pred_multi = model.predict(X_test_multi)
    
    test_mse = mean_squared_error(y_test_multi, y_pred_multi)
    test_mae = mean_absolute_error(y_test_multi, y_pred_multi)
    
    mlflow.log_metric("test_mse", test_mse)
    mlflow.log_metric("test_mae", test_mae)
    
    fig, axes = plt.subplots(2, 1, figsize=(15, 10))
    
    for i in range(min(5, len(y_test_multi))):
        axes[0].plot(range(forecast_horizon), y_test_multi[i], 'o-', alpha=0.6, label=f'Actual {i+1}')
        axes[0].plot(range(forecast_horizon), y_pred_multi[i], 'x--', alpha=0.6, label=f'Pred {i+1}')
    axes[0].set_xlabel('Forecast Step')
    axes[0].set_ylabel('Value')
    axes[0].set_title(f'{forecast_horizon}-Step Ahead Forecasts')
    axes[0].legend()
    axes[0].grid(True, alpha=0.3)
    
    step_errors = []
    for step in range(forecast_horizon):
        step_mae = mean_absolute_error(y_test_multi[:, step], y_pred_multi[:, step])
        step_errors.append(step_mae)
    
    axes[1].plot(range(1, forecast_horizon + 1), step_errors, 'o-', linewidth=2)
    axes[1].set_xlabel('Forecast Horizon (steps ahead)')
    axes[1].set_ylabel('MAE')
    axes[1].set_title('Error by Forecast Horizon')
    axes[1].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.savefig('multistep_forecast.png', dpi=150)
    mlflow.log_artifact('multistep_forecast.png')
    plt.show()
    
    mlflow.keras.log_model(model, "multistep_lstm_model")
    
    print(f"Multi-step forecast MSE: {test_mse:.6f}")
    print(f"Multi-step forecast MAE: {test_mae:.6f}")
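
The per-step errors plotted above could also be recorded as individual metrics, which makes them comparable across runs. The sketch below builds a metric dictionary from per-step MAE using only NumPy. The helper names, the `test_mae_step` prefix, and the demo arrays are assumptions, not something the notebook defines.

```python
import numpy as np

def per_step_mae(y_true, y_pred):
    """MAE for each forecast step; inputs have shape (samples, horizon)."""
    return np.mean(np.abs(y_true - y_pred), axis=0)

def as_metric_dict(step_mae, prefix="test_mae_step"):
    # Zero-padded step index keeps metric names in order when sorted by name
    return {f"{prefix}_{k + 1:02d}": float(v) for k, v in enumerate(step_mae)}

rng = np.random.default_rng(0)
horizon = 4  # demo horizon (assumption)

y_true_demo = rng.normal(size=(50, horizon))
# Noise grows with the step index to mimic error accumulating over the horizon
noise_scale = 0.1 * np.arange(1, horizon + 1)
y_pred_demo = y_true_demo + rng.normal(size=(50, horizon)) * noise_scale

metrics = as_metric_dict(per_step_mae(y_true_demo, y_pred_demo))
print(sorted(metrics))
# ['test_mae_step_01', 'test_mae_step_02', 'test_mae_step_03', 'test_mae_step_04']
```

Inside an active run, passing such a dictionary to `mlflow.log_metrics(metrics)` would store one value per horizon step alongside the aggregate `test_mae`.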

## 6. Comparing the RNN Models

In [None]:
experiment = mlflow.get_experiment_by_name("tensorflow-rnn-timeseries")
runs = mlflow.search_runs(experiment_ids=[experiment.experiment_id])

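print("Comparison of RNN/LSTM models:")
# Runs that did not log a given metric or parameter (for example, the
# multi-step run logs no test_r2) appear as NaN in the table.
comparison = runs[[
    'tags.mlflow.runName',
    'metrics.test_r2',
    'metrics.test_mse',
    'metrics.test_mae',
    'params.total_parameters'
]].sort_values('metrics.test_r2', ascending=False)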

print(comparison)

## Summary of Module 2.2

### Key Concepts:

1. **RNN and LSTM**
   - RNNs for sequential data
   - LSTM mitigates vanishing gradients
   - GRU as a simpler alternative

2. **Advanced Architectures**
   - Stacked LSTMs: multiple layers
   - Bidirectional: processes the sequence in both directions
   - Combination of both

3. **Forecasting**
   - Single-step: predict the next value
   - Multi-step: predict multiple steps
   - Error typically grows with the forecast horizon

4. **Data Preparation**
   - Create sequences with a sliding window
   - Normalization is crucial
   - Temporal (not random) train/test split

### MLflow for Time Series:
- Logging `window_size` and `forecast_horizon`
- Tracking metrics by forecast horizon
- Visualizing predictions
- Comparing architectures

### Next Module:
PyTorch with Transfer Learning