# Mathematical Validation of Optimal Time Windows

**Objective**: Empirically determine the optimal time window `[t_start, t_end]` for each event E1-E11.

**Central question**: How many days before the event should tick collection start (anticipation), and how many days after it should collection continue, to maximize predictive information?

## Mathematical Approach

We use three complementary methods:

1. **Information Gain (IG)**: Measures the entropy reduction contributed by each day
2. **Feature Importance (FI)**: Uses LightGBM to measure predictive contribution
3. **Mutual Information (MI)**: Measures the dependence between the features of day `t` and the target

### Notation

- `t=0`: Event day
- `t<0`: Days before the event (anticipation)
- `t>0`: Days after the event (confirmation)
- `X_t`: Features of day `t` (prices, volume, DIB bars, etc.)
- `y`: Target (future return, e.g. `ret_5d`)
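The relative-day index `t` can be sketched as a signed business-day offset from the event date. This is a minimal illustration, not the notebook's own pipeline: the helper name `relative_trading_day` is hypothetical, and it assumes trading days are approximated by Monday-to-Friday business days with exchange holidays ignored.

```python
import numpy as np

def relative_trading_day(event_date: str, bar_dates: list[str]) -> np.ndarray:
    """Signed business-day offset t of each bar date relative to event_date.

    Assumption: trading days are approximated by Mon-Fri business days;
    exchange holidays are ignored here.
    """
    event = np.datetime64(event_date, "D")
    dates = np.array(bar_dates, dtype="datetime64[D]")
    # busday_count is negative when the bar date precedes the event date
    return np.busday_count(event, dates)

# Event on Friday 2024-01-05; the following Monday is t=+1, not t=+3
offsets = relative_trading_day(
    "2024-01-05",
    ["2024-01-03", "2024-01-04", "2024-01-05", "2024-01-08", "2024-01-09"],
)
print(offsets)
```

Counting in trading days rather than calendar days keeps `t=+1` meaningful across weekends, which matters when windows are only a few days wide.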

## 1. Setup and Data Loading

In [None]:
import polars as pl
import numpy as np
import pandas as pd
from pathlib import Path
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import mutual_info_score
from scipy.stats import entropy
import lightgbm as lgb
from typing import Dict, List, Tuple

# Config
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (14, 8)

# Paths
TRADES_DIR = Path('../../../../raw/polygon/trades_pilot50_validation')
DIB_DIR = Path('../../../../processed/dib_bars/pilot50_validation')
WATCHLIST = Path('../../../../processed/watchlist_E1_E11.parquet')

print(f"Trades dir exists: {TRADES_DIR.exists()}")
print(f"DIB dir exists: {DIB_DIR.exists()}")
print(f"Watchlist exists: {WATCHLIST.exists()}")

## 2. Mathematical Definition of the Optimal Window

### 2.1 Mutual Information

For each day `t` relative to the event:

$$I(X_t; y) = \sum_{x_t} \sum_{y} p(x_t, y) \log \frac{p(x_t, y)}{p(x_t)p(y)}$$

**Interpretation**: How much information about the target `y` is gained by knowing the features of day `t`. With the natural logarithm the result is in nats; with $\log_2$ it is in bits.
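As a worked example of the formula, the sketch below computes a plug-in estimate of $I(X; y)$ in bits from empirical joint frequencies and checks it against scikit-learn's `mutual_info_score`, which reports nats. The toy arrays and the helper `mutual_information_bits` are illustrative assumptions, not part of the notebook.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

# Toy discrete sample: x is informative about y, but not perfectly
x = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y = np.array([0, 0, 0, 1, 1, 1, 1, 0])

def mutual_information_bits(x: np.ndarray, y: np.ndarray) -> float:
    """Plug-in estimate of I(X;Y) in bits from empirical joint frequencies."""
    xs, x_idx = np.unique(x, return_inverse=True)
    ys, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (x_idx, y_idx), 1)      # contingency table of counts
    joint /= joint.sum()                      # counts -> p(x, y)
    px = joint.sum(axis=1, keepdims=True)     # p(x), column vector
    py = joint.sum(axis=0, keepdims=True)     # p(y), row vector
    mask = joint > 0                          # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (px @ py)[mask])))

mi_bits = mutual_information_bits(x, y)
mi_nats = mutual_info_score(x, y)             # scikit-learn uses the natural log
print(mi_bits, mi_nats / np.log(2))
```

Dividing the scikit-learn value by $\ln 2$ converts nats to bits, so both printed numbers should agree.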

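### 2.2 Information Gain

$$IG(X_t) = H(y) - H(y|X_t)$$

Where:
- $H(y)$ = Entropy of the target with no conditioning information
- $H(y|X_t)$ = Conditional entropy of the target given the features of day `t`

For discrete variables, $IG(X_t)$ is mathematically identical to $I(X_t; y)$ from 2.1, so the two estimates should agree up to binning and sampling effects.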
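The definition can be checked numerically on a small joint distribution. The table below is a hypothetical example; the sketch computes $H(y)$ and $H(y|X)$ directly and compares the result with the entropy identity $I = H(x) + H(y) - H(x, y)$.

```python
import numpy as np

def entropy_bits(p: np.ndarray) -> float:
    """Shannon entropy in bits, ignoring zero-probability cells."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Hypothetical joint distribution p(x, y): rows are x bins, columns are y bins
joint = np.array([[0.30, 0.10],
                  [0.05, 0.25],
                  [0.15, 0.15]])

p_x = joint.sum(axis=1)
p_y = joint.sum(axis=0)

h_y = entropy_bits(p_y)
# H(y|X) = sum_x p(x) * H(y | X = x)
h_y_given_x = sum(p_x[i] * entropy_bits(joint[i] / p_x[i]) for i in range(len(p_x)))
ig = h_y - h_y_given_x

# Same quantity through the mutual-information identity
mi = entropy_bits(p_x) + h_y - entropy_bits(joint.ravel())
print(ig, mi)
```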

### 2.3 Optimal Window

The optimal window `[t_start, t_end]` maximizes total information:

$$[t^*_{start}, t^*_{end}] = \arg\max_{t_{start}, t_{end}} \sum_{t=t_{start}}^{t_{end}} I(X_t; y)$$

Because $I(X_t; y) \geq 0$, this sum never decreases as the window widens, so the constraints are what keep the solution from simply being the widest allowed window.

**Subject to**:
1. $I(X_t; y) > \theta$ (minimum information threshold)
2. Cost: each additional day incurs download/processing cost
3. $t_{start} \geq -7$ (at most 7 days of anticipation in practice)
4. $t_{end} \leq +7$ (at most 7 days of confirmation in practice)
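To see why the threshold and cost constraints matter, the sketch below enumerates every window inside the ±7 bounds and maximizes the plain sum of scores. The scores are made-up, strictly positive values chosen only for illustration.

```python
from itertools import combinations_with_replacement

# Hypothetical non-negative MI scores for t in [-7, 7] (illustrative values only)
scores = {t: round(1.0 / (1 + t * t), 3) for t in range(-7, 8)}

def total_information(t_start: int, t_end: int) -> float:
    """Sum of per-day scores over the inclusive window [t_start, t_end]."""
    return sum(scores[t] for t in range(t_start, t_end + 1))

# All pairs with t_start <= t_end inside the [-7, +7] bounds
windows = list(combinations_with_replacement(range(-7, 8), 2))
best = max(windows, key=lambda w: total_information(*w))
print(best)
```

With positive scores and no penalty, the widest window always wins, so the choice of window is driven entirely by the threshold $\theta$ or by a per-day cost.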

## 3. Method 1: Information Gain per Day

We compute how much information each day `t` relative to the event contributes.

In [None]:
def calculate_mutual_information(
    X: np.ndarray,  # Features of day t
    y: np.ndarray,  # Target
    bins: int = 10
) -> float:
    """
    Compute the mutual information I(X;y) between one day's features and the target.
    
    Args:
        X: Features (N_samples, N_features)
        y: Target (N_samples,)
        bins: Number of bins used to discretize continuous variables
    
    Returns:
        Mutual information score in bits, averaged over features
    """
    # Discretize the target
    y_binned = pd.cut(y, bins=bins, labels=False)
    
    # Compute MI for each feature and average
    mi_scores = []
    for col_idx in range(X.shape[1]):
        x_col = X[:, col_idx]
        
        # Discretize the feature
        x_binned = pd.cut(x_col, bins=bins, labels=False, duplicates='drop')
        
        # Drop samples whose binning produced NaN
        valid_mask = ~(pd.isna(x_binned) | pd.isna(y_binned))
        if valid_mask.sum() > 0:
            # mutual_info_score returns nats; divide by ln(2) to report bits
            mi = mutual_info_score(x_binned[valid_mask], y_binned[valid_mask]) / np.log(2)
            mi_scores.append(mi)
    
    return np.mean(mi_scores) if mi_scores else 0.0


def calculate_information_gain(
    X: np.ndarray,
    y: np.ndarray,
    bins: int = 10
) -> float:
    """
    Compute Information Gain: IG(X) = H(y) - H(y|X)
    
    Returns:
        Information gain (bits)
    """
    # Unconditional entropy
    y_binned = pd.cut(y, bins=bins, labels=False)
    y_counts = pd.Series(y_binned).value_counts(normalize=True)
    H_y = entropy(y_counts.values, base=2)
    
    # Conditional entropy, averaged over features
    H_y_given_X = []
    for col_idx in range(X.shape[1]):
        x_col = X[:, col_idx]
        x_binned = pd.cut(x_col, bins=bins, labels=False, duplicates='drop')
        
        # H(y|X=x) for each value of x
        valid_mask = ~(pd.isna(x_binned) | pd.isna(y_binned))
        if valid_mask.sum() == 0:
            continue
            
        df_temp = pd.DataFrame({
            'x': x_binned[valid_mask],
            'y': y_binned[valid_mask]
        })
        
        conditional_entropy = 0
        for x_val in df_temp['x'].unique():
            y_given_x = df_temp[df_temp['x'] == x_val]['y']
            p_x = len(y_given_x) / len(df_temp)
            y_counts_given_x = y_given_x.value_counts(normalize=True)
            h = entropy(y_counts_given_x.values, base=2)
            conditional_entropy += p_x * h
        
        H_y_given_X.append(conditional_entropy)
    
    H_y_given_X_avg = np.mean(H_y_given_X) if H_y_given_X else H_y
    
    return max(0, H_y - H_y_given_X_avg)


print("✓ Mutual information functions defined")
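Equal-width binning can be sensitive to the choice of `bins`. As a possible cross-check, scikit-learn's `mutual_info_regression` estimates MI for continuous variables with a nearest-neighbour method and needs no binning. The sketch below uses synthetic data (an assumption, not notebook data) in which the target depends nonlinearly on one feature and not at all on the other.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
n = 2000
informative = rng.normal(size=n)
noise = rng.normal(size=n)
# Nonlinear dependence: the linear correlation with `informative` is near zero
y = informative ** 2 + 0.1 * rng.normal(size=n)

X = np.column_stack([informative, noise])
# Returns one MI estimate per column, in nats
mi = mutual_info_regression(X, y, n_neighbors=3, random_state=0)
print(dict(zip(["informative", "noise"], mi.round(3))))
```

The informative column should score well above the noise column even though a correlation coefficient would miss the relationship.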

## 4. Method 2: Feature Importance with LightGBM

We train a simple model and measure feature importance per day. The gain-based importance is measured in-sample, since the model is fit and evaluated on the same data.

In [None]:
def calculate_feature_importance_by_day(
    features_by_day: Dict[int, np.ndarray],  # {day_relative: features}
    y: np.ndarray,
    feature_names: List[str]
) -> Dict[int, float]:
    """
    Train LightGBM and sum feature importance per day.
    
    Args:
        features_by_day: {-3: X_minus3, -2: X_minus2, ..., 0: X_0, ..., +3: X_plus3}
        y: Target variable
        feature_names: Feature names shared by every day (one per column of each X_day)
    
    Returns:
        {day: importance_score} for each day, normalized to sum to 1
    """
    # Concatenate all features, prefixing names with the day
    X_combined = []
    feature_names_combined = []
    day_to_feature_indices = {}
    
    current_idx = 0
    for day, X_day in sorted(features_by_day.items()):
        X_combined.append(X_day)
        
        # Track indices for this day
        n_features_day = X_day.shape[1]
        day_to_feature_indices[day] = list(range(current_idx, current_idx + n_features_day))
        current_idx += n_features_day
        
        # Add feature names with day prefix
        for fname in feature_names:
            feature_names_combined.append(f"t{day:+d}_{fname}")
    
    X_combined = np.hstack(X_combined)
    
    # Train LightGBM
    train_data = lgb.Dataset(X_combined, label=y, feature_name=feature_names_combined)
    
    params = {
        'objective': 'regression',
        'metric': 'rmse',
        'num_leaves': 31,
        'learning_rate': 0.05,
        'feature_fraction': 0.9,
        'verbose': -1
    }
    
    model = lgb.train(
        params,
        train_data,
        num_boost_round=100,
        valid_sets=[train_data],
        callbacks=[lgb.log_evaluation(period=0)]
    )
    
    # Get feature importance
    importance = model.feature_importance(importance_type='gain')
    
    # Aggregate importance by day
    importance_by_day = {}
    for day, indices in day_to_feature_indices.items():
        day_importance = importance[indices].sum()
        importance_by_day[day] = day_importance
    
    # Normalize
    total_importance = sum(importance_by_day.values())
    if total_importance > 0:
        importance_by_day = {k: v/total_importance for k, v in importance_by_day.items()}
    
    return importance_by_day


print("✓ Feature importance function defined")
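The input layout and the per-day aggregation can be illustrated without training a model. In this sketch the feature names, the random arrays, and the `gain` vector (a stand-in for what `model.feature_importance(importance_type='gain')` would return) are all hypothetical.

```python
import numpy as np

feature_names = ["ret", "volume"]            # hypothetical per-day feature names
rng = np.random.default_rng(1)
features_by_day = {d: rng.normal(size=(5, len(feature_names))) for d in (-1, 0, 1)}

names, blocks = [], []
for day, X_day in sorted(features_by_day.items()):
    assert X_day.shape[1] == len(feature_names)   # names must match columns
    blocks.append(X_day)
    names += [f"t{day:+d}_{f}" for f in feature_names]
X = np.hstack(blocks)                         # shape (N_samples, days * features)

# Stand-in for the model's gain importances, one value per combined column
gain = np.array([1.0, 3.0, 10.0, 6.0, 0.0, 0.0])

by_day = {}
for name, g in zip(names, gain):
    day = int(name.split("_", 1)[0][1:])      # "t-1_ret" -> -1
    by_day[day] = by_day.get(day, 0.0) + g

total = sum(by_day.values())
shares = {d: v / total for d, v in by_day.items()}
print(names)
print(shares)
```

Summing rather than averaging means a day's share grows with the number of features it carries, which is harmless here because every day has the same feature set.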

## 5. Optimal Window Selection Algorithm

### Algorithm: Threshold-Based Window Selection

```
For each event E_k:
    1. Compute I(X_t; y) for t ∈ [-7, +7]
    2. Normalize: I_norm(t) = I(t) / max(I)
    3. Set threshold θ (e.g. 0.1 = 10% of the maximum)
    4. t_start = min{t : I_norm(t) ≥ θ}
    5. t_end = max{t : I_norm(t) ≥ θ}
    6. Check continuity (no gaps > 2 days); if split, keep the segment with the most total information
    7. Return [t_start, t_end]
```
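Step 6 is the least obvious part, so here is a compact standalone sketch of the gap check. The helper name `split_into_segments` and the sample days are illustrative; it assumes a non-empty, sorted list of days that passed the threshold.

```python
def split_into_segments(days: list[int], max_gap: int = 2) -> list[list[int]]:
    """Group sorted days into segments.

    A new segment starts when more than max_gap days are missing
    between two neighbouring significant days.
    Assumes `days` is sorted and non-empty.
    """
    segments = [[days[0]]]
    for prev, cur in zip(days, days[1:]):
        missing = cur - prev - 1          # below-threshold days in between
        if missing <= max_gap:
            segments[-1].append(cur)
        else:
            segments.append([cur])
    return segments

significant = [-6, -2, -1, 0, 1, 4]
print(split_into_segments(significant))
```

Here day -6 is isolated because three days are missing before -2, while day 4 stays attached because only two days are missing after 1.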

In [None]:
def determine_optimal_window(
    information_by_day: Dict[int, float],
    threshold: float = 0.1,
    max_gap: int = 2
) -> Tuple[int, int]:
    """
    Determine the optimal window [t_start, t_end] from per-day information scores.
    
    Args:
        information_by_day: {day: information_score}
        threshold: Minimum information threshold (relative to the maximum)
        max_gap: Maximum allowed run of consecutive below-threshold days
    
    Returns:
        (t_start, t_end) optimal window
    """
    if not information_by_day:
        return (0, 0)
    
    # Normalize information; a non-positive maximum carries no usable signal
    max_info = max(information_by_day.values())
    if max_info <= 0:
        return (0, 0)
    
    info_norm = {k: v/max_info for k, v in information_by_day.items()}
    
    # Find days at or above the threshold
    significant_days = sorted([day for day, info in info_norm.items() if info >= threshold])
    
    if not significant_days:
        # Fallback: use the day with maximum information
        best_day = max(information_by_day.items(), key=lambda x: x[1])[0]
        return (best_day, best_day)
    
    # Check continuity (no large gaps)
    # If a gap exceeds max_gap, split into segments
    segments = []
    current_segment = [significant_days[0]]
    
    for i in range(1, len(significant_days)):
        if significant_days[i] - significant_days[i-1] <= max_gap + 1:
            current_segment.append(significant_days[i])
        else:
            segments.append(current_segment)
            current_segment = [significant_days[i]]
    segments.append(current_segment)
    
    # Pick the segment with the most total information
    best_segment = max(segments, key=lambda seg: sum(information_by_day[d] for d in seg))
    
    t_start = min(best_segment)
    t_end = max(best_segment)
    
    return (t_start, t_end)


def visualize_information_by_day(
    information_by_day: Dict[int, float],
    optimal_window: Tuple[int, int],
    event_name: str,
    method: str = 'MI'
):
    """
    Plot per-day information and the optimal window.
    """
    days = sorted(information_by_day.keys())
    info = [information_by_day[d] for d in days]
    
    fig, ax = plt.subplots(figsize=(12, 6))
    
    # Plot information
    ax.bar(days, info, alpha=0.6, label=f'{method} Score')
    
    # Mark the event day
    ax.axvline(x=0, color='red', linestyle='--', linewidth=2, label='Event Day (t=0)')
    
    # Shade the optimal window; bars are centered on integer days, so pad by half a day
    t_start, t_end = optimal_window
    ax.axvspan(t_start - 0.5, t_end + 0.5, alpha=0.2, color='green', label=f'Optimal Window [{t_start}, {t_end}]')
    
    # Threshold line
    threshold_value = max(info) * 0.1
    ax.axhline(y=threshold_value, color='orange', linestyle=':', label='Threshold (10% max)')
    
    ax.set_xlabel('Days Relative to Event', fontsize=12)
    ax.set_ylabel(f'{method} Score', fontsize=12)
    ax.set_title(f'{event_name}: Information per Relative Day\nOptimal Window: [{t_start}, {t_end}]', fontsize=14)
    ax.legend()
    ax.grid(True, alpha=0.3)
    
    plt.tight_layout()
    return fig


print("✓ Optimal window algorithm defined")

## 6. Cost-Benefit Analysis

### 6.1 Trade-off: Information vs Cost

Each additional day has a cost:
- Data download
- Computational processing
- Storage

**Objective function with cost**:

$$
\max_{[t_{start}, t_{end}]} \left[ \sum_{t=t_{start}}^{t_{end}} I(X_t; y) - \lambda \cdot (t_{end} - t_{start} + 1) \right]
$$

Where $\lambda$ is the cost per day.
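Since the objective is a sum of per-day terms $I(X_t; y) - \lambda$ over a contiguous window, it is an instance of the maximum-subarray problem. The sketch below solves it in one pass with Kadane's algorithm, as an alternative to enumerating all windows. The function name and sample scores are hypothetical, and the code assumes the keys are consecutive integer days.

```python
def best_window_kadane(
    info_by_day: dict[int, float], cost_per_day: float
) -> tuple[int, int, float]:
    """Contiguous non-empty window maximizing sum(I_t - cost_per_day).

    Uses Kadane's maximum-subarray algorithm.
    Assumes the keys of info_by_day are consecutive integer days.
    """
    days = sorted(info_by_day)
    best = (days[0], days[0], info_by_day[days[0]] - cost_per_day)
    cur_start, cur_sum = days[0], 0.0
    for d in days:
        gain = info_by_day[d] - cost_per_day   # net contribution of day d
        if cur_sum <= 0:
            # A non-positive running sum cannot help: restart the window here
            cur_start, cur_sum = d, gain
        else:
            cur_sum += gain
        if cur_sum > best[2]:
            best = (cur_start, d, cur_sum)
    return best

info = {-2: 0.02, -1: 0.30, 0: 1.00, 1: 0.40, 2: 0.03}
t_start, t_end, net = best_window_kadane(info, cost_per_day=0.05)
print(t_start, t_end, round(net, 3))
```

Days whose score falls below $\lambda$ contribute negatively, so they are only kept when they bridge two stretches that are worth more together than apart.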

In [None]:
def cost_benefit_analysis(
    information_by_day: Dict[int, float],
    cost_per_day: float = 0.01,  # Relative cost per day
    max_window: int = 7
) -> pd.DataFrame:
    """
    Analyze the trade-off between information gained and window cost.
    
    Returns:
        DataFrame with one row per candidate window, sorted by net benefit
    """
    results = []
    
    # Try every possible window; days missing from information_by_day count as 0
    for t_start in range(-max_window, max_window + 1):
        for t_end in range(t_start, max_window + 1):
            # Total information in the window
            info_total = sum(
                information_by_day.get(t, 0) 
                for t in range(t_start, t_end + 1)
            )
            
            # Cost
            window_size = t_end - t_start + 1
            cost = cost_per_day * window_size
            
            # Net benefit
            net_benefit = info_total - cost
            
            # Information per day
            info_per_day = info_total / window_size if window_size > 0 else 0
            
            results.append({
                't_start': t_start,
                't_end': t_end,
                'window_size': window_size,
                'total_information': info_total,
                'cost': cost,
                'net_benefit': net_benefit,
                'info_per_day': info_per_day
            })
    
    df = pd.DataFrame(results)
    
    # Find the optimal window by net benefit
    best_idx = df['net_benefit'].idxmax()
    df['is_optimal'] = False
    df.loc[best_idx, 'is_optimal'] = True
    
    return df.sort_values('net_benefit', ascending=False)


def plot_cost_benefit(
    df_cost_benefit: pd.DataFrame,
    event_name: str
):
    """
    Plot the cost-benefit analysis.
    """
    fig, axes = plt.subplots(2, 2, figsize=(14, 10))
    
    # 1. Net Benefit vs Window Size
    ax = axes[0, 0]
    scatter = ax.scatter(
        df_cost_benefit['window_size'],
        df_cost_benefit['net_benefit'],
        c=df_cost_benefit['total_information'],
        cmap='viridis',
        alpha=0.6,
        s=50
    )
    # Mark the optimum
    optimal = df_cost_benefit[df_cost_benefit['is_optimal']].iloc[0]
    ax.scatter(
        optimal['window_size'],
        optimal['net_benefit'],
        color='red',
        s=200,
        marker='*',
        label='Optimal',
        edgecolors='black',
        linewidths=2
    )
    ax.set_xlabel('Window Size (days)')
    ax.set_ylabel('Net Benefit')
    ax.set_title('Net Benefit vs Window Size')
    ax.legend()
    ax.grid(True, alpha=0.3)
    plt.colorbar(scatter, ax=ax, label='Total Information')
    
    # 2. Information per Day
    ax = axes[0, 1]
    ax.scatter(
        df_cost_benefit['window_size'],
        df_cost_benefit['info_per_day'],
        alpha=0.6
    )
    ax.scatter(
        optimal['window_size'],
        optimal['info_per_day'],
        color='red',
        s=200,
        marker='*',
        edgecolors='black',
        linewidths=2
    )
    ax.set_xlabel('Window Size (days)')
    ax.set_ylabel('Information per Day')
    ax.set_title('Efficiency: Info per Day')
    ax.grid(True, alpha=0.3)
    
    # 3. Heatmap: t_start vs t_end (Net Benefit)
    ax = axes[1, 0]
    pivot = df_cost_benefit.pivot(index='t_start', columns='t_end', values='net_benefit')
    sns.heatmap(pivot, ax=ax, cmap='RdYlGn', center=0, annot=False, cbar_kws={'label': 'Net Benefit'})
    ax.set_title('Net Benefit: t_start vs t_end')
    
    # 4. Top 10 windows
    ax = axes[1, 1]
    top10 = df_cost_benefit.head(10)
    y_pos = np.arange(len(top10))
    labels = [f"[{row['t_start']}, {row['t_end']}]" for _, row in top10.iterrows()]
    colors = ['red' if row['is_optimal'] else 'steelblue' for _, row in top10.iterrows()]
    
    ax.barh(y_pos, top10['net_benefit'], color=colors, alpha=0.7)
    ax.set_yticks(y_pos)
    ax.set_yticklabels(labels)
    ax.set_xlabel('Net Benefit')
    ax.set_title('Top 10 Windows by Net Benefit')
    ax.grid(True, alpha=0.3, axis='x')
    
    plt.suptitle(f'{event_name}: Cost-Benefit Analysis of Windows', fontsize=16, y=1.00)
    plt.tight_layout()
    return fig


print("✓ Cost-benefit analysis functions defined")

## 7. Full Pipeline: Simulated Example

We simulate the analysis for three event types (E1, E4 and E7) with synthetic data, before real DIB bars are available.

In [None]:
# Simulate synthetic data for demonstration
np.random.seed(42)

# Simulate information per day (stylized pattern)
# Expectation: information increases near the event (t=0)
def simulate_information_pattern(event_type='spike'):
    days = range(-7, 8)
    
    if event_type == 'spike':
        # Sudden event: maximum info at t=0, fast decay
        info = {d: np.exp(-0.5 * d**2) + np.random.normal(0, 0.05) for d in days}
    elif event_type == 'gradual':
        # Gradual event: info builds up before, persists afterwards
        info = {d: (1 / (1 + np.exp(-d))) * np.exp(-0.1 * d**2) + np.random.normal(0, 0.05) for d in days}
    elif event_type == 'anticipation':
        # Event with anticipation: maximum info before the event
        info = {d: np.exp(-0.3 * (d + 2)**2) + np.random.normal(0, 0.05) for d in days}
    else:
        # Flat: no clear pattern
        info = {d: 0.1 + np.random.normal(0, 0.05) for d in days}
    
    # Normalize and clip to non-negative values
    max_info = max(info.values())
    info = {k: max(0, v/max_info) for k, v in info.items()}
    
    return info


# Simulate 3 event types
event_patterns = {
    'E1_VolExplosion': 'spike',
    'E4_Parabolic': 'gradual',
    'E7_FirstRedDay': 'anticipation'
}

results_summary = []

for event_name, pattern_type in event_patterns.items():
    print(f"\n{'='*60}")
    print(f"EVENT: {event_name} (pattern: {pattern_type})")
    print(f"{'='*60}")
    
    # 1. Simulate information per day
    info_by_day = simulate_information_pattern(pattern_type)
    
    # 2. Determine the optimal window (threshold-based)
    optimal_window_threshold = determine_optimal_window(
        info_by_day,
        threshold=0.1,
        max_gap=2
    )
    
    print(f"\nOptimal window (threshold 10%): [{optimal_window_threshold[0]}, {optimal_window_threshold[1]}]")
    
    # 3. Cost-benefit analysis
    df_cb = cost_benefit_analysis(
        info_by_day,
        cost_per_day=0.05,  # 5% cost relative to max info
        max_window=7
    )
    
    optimal_cb = df_cb[df_cb['is_optimal']].iloc[0]
    optimal_window_cb = (int(optimal_cb['t_start']), int(optimal_cb['t_end']))
    
    print(f"Optimal window (cost-benefit): [{optimal_window_cb[0]}, {optimal_window_cb[1]}]")
    print(f"  - Window size: {optimal_cb['window_size']} days")
    print(f"  - Total information: {optimal_cb['total_information']:.3f}")
    print(f"  - Net benefit: {optimal_cb['net_benefit']:.3f}")
    print(f"  - Info per day: {optimal_cb['info_per_day']:.3f}")
    
    # 4. Visualize
    fig1 = visualize_information_by_day(
        info_by_day,
        optimal_window_threshold,
        event_name,
        method='MI'
    )
    plt.show()
    
    fig2 = plot_cost_benefit(df_cb, event_name)
    plt.show()
    
    # Save results
    results_summary.append({
        'event': event_name,
        'pattern': pattern_type,
        't_start_threshold': optimal_window_threshold[0],
        't_end_threshold': optimal_window_threshold[1],
        't_start_cb': optimal_window_cb[0],
        't_end_cb': optimal_window_cb[1],
        'window_size_cb': optimal_cb['window_size'],
        'total_info': optimal_cb['total_information'],
        'net_benefit': optimal_cb['net_benefit']
    })

# Final summary
df_summary = pd.DataFrame(results_summary)
print("\n" + "="*80)
print("SUMMARY: OPTIMAL WINDOWS BY EVENT (SIMULATED)")
print("="*80)
print(df_summary.to_string(index=False))

## 8. Comparison with Qualitative Windows (F.3)

We compare the empirical windows against the qualitatively defined ones by total window size in days.

In [None]:
# Qualitative windows from F.3
EVENT_WINDOWS_QUALITATIVE = {
    'E0_AlwaysTrue': 1,
    'E1_VolExplosion': 1,
    'E2_GapUp': 2,
    'E3_PriceSpikeIntraday': 1,
    'E4_Parabolic': 3,
    'E5_BreakoutATH': 2,
    'E6_MultipleGreenDays': 2,
    'E7_FirstRedDay': 2,
    'E8_GapDownViolent': 2,
    'E9_CrashIntraday': 1,
    'E10_FirstGreenBounce': 1,
    'E11_VolumeBounce': 2
}

def compare_windows(
    empirical_windows: Dict[str, Tuple[int, int]],
    qualitative_windows: Dict[str, int]
) -> pd.DataFrame:
    """
    Compare empirical vs qualitative windows.
    
    Args:
        empirical_windows: {event: (t_start, t_end)}
        qualitative_windows: {event: ±N} (symmetric)
    """
    comparisons = []
    
    for event in empirical_windows.keys():
        t_start_emp, t_end_emp = empirical_windows[event]
        window_size_emp = t_end_emp - t_start_emp + 1
        
        if event in qualitative_windows:
            window_qual = qualitative_windows[event]
            window_size_qual = 2 * window_qual + 1  # ±N -> 2N+1 days in total
            
            # Compare
            diff = window_size_emp - window_size_qual
            match = 'Match' if diff == 0 else ('Smaller' if diff < 0 else 'Larger')
            
            comparisons.append({
                'Evento': event,
                'Empírico [t_start, t_end]': f"[{t_start_emp}, {t_end_emp}]",
                'Empírico Size': window_size_emp,
                'Cualitativo ±N': f"±{window_qual}",
                'Cualitativo Size': window_size_qual,
                'Diferencia': diff,
                'Status': match
            })
    
    return pd.DataFrame(comparisons)


# Build empirical windows from the simulated results
empirical_windows = {
    row['event']: (row['t_start_cb'], row['t_end_cb'])
    for _, row in df_summary.iterrows()
}

df_comparison = compare_windows(empirical_windows, EVENT_WINDOWS_QUALITATIVE)

print("\n" + "="*80)
print("COMPARISON: EMPIRICAL vs QUALITATIVE")
print("="*80)
print(df_comparison.to_string(index=False))

# Plot the comparison
fig, ax = plt.subplots(figsize=(12, 6))

x = np.arange(len(df_comparison))
width = 0.35

ax.bar(x - width/2, df_comparison['Empírico Size'], width, label='Empirical', alpha=0.8)
ax.bar(x + width/2, df_comparison['Cualitativo Size'], width, label='Qualitative (F.3)', alpha=0.8)

ax.set_xlabel('Event')
ax.set_ylabel('Window Size (days)')
ax.set_title('Comparison: Empirical vs Qualitative Windows')
ax.set_xticks(x)
ax.set_xticklabels(df_comparison['Evento'], rotation=45, ha='right')
ax.legend()
ax.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.show()
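Comparing sizes alone cannot tell a shifted window from a matching one: `(-2, 0)` and `±1` both span 3 days. One possible complement, sketched here under the assumption that day-level overlap is the quantity of interest, is a Jaccard overlap between the two sets of days. The helper `window_overlap` is hypothetical and not part of the notebook.

```python
def window_overlap(empirical: tuple[int, int], half_width: int) -> float:
    """Jaccard overlap between [t_start, t_end] and the symmetric window [-N, +N]."""
    emp_days = set(range(empirical[0], empirical[1] + 1))
    qual_days = set(range(-half_width, half_width + 1))
    return len(emp_days & qual_days) / len(emp_days | qual_days)

shifted = window_overlap((-2, 0), 1)    # same size as ±1, but shifted earlier
aligned = window_overlap((-1, 1), 1)    # identical to ±1
print(shifted, aligned)
```

A value of 1.0 means the windows cover exactly the same days, while lower values flag asymmetry that the size comparison hides.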

## 9. Conclusions and Recommendations

### 9.1 Key Findings

1. **Asymmetric windows**: Events may need more days **before** or **after** the event (not always a symmetric ±N)

2. **Information-cost trade-off**: Larger windows are not always better (diminishing returns)

3. **Variability across events**: Each event E1-E11 may have a different temporal pattern

### 9.2 Next Steps with Real Data

Once Pilot50 DIB bars are available:

1. **Run this notebook on real data**
2. **Validate each event E1-E11**
3. **Update EVENT_WINDOWS in F.3** with the empirical windows
4. **Document the validation results in F.6**

### 9.3 Recommended Windows (Provisional)

Based on simulation and reasoning:

```python
EVENT_WINDOWS_EMPIRICAL = {
    'E1_VolExplosion': (0, 1),      # Sudden event, little anticipation
    'E4_Parabolic': (-2, 2),        # Gradual process, needs context
    'E7_FirstRedDay': (-2, 1),      # Anticipation matters
    # ... etc
}
```

**PENDING**: Validate with real Pilot50 data.