# Decomposer Forecasting Comparison

This notebook compares forecasting performance across different approaches:

1. **Vanilla Prophet** - Baseline Prophet with custom seasonalities
2. **Vanilla NeuralProphet** - NeuralProphet without decomposition
3. **NeuralProphet + HighLowFreqDecomposer** - High/low-frequency decomposition (12h, 24h, and low_freq components)
4. **NeuralProphet + SignalDecomposer (5-band)** - Multi-scale decomposition (sub-daily, daily, weekly, bi-weekly, monthly)
5. **EnsembleForecaster** - Combines specialized high-freq and low-freq NeuralProphet models

The key innovation is the new `SignalDecomposer` / `HighLowFreqDecomposer`, which:
- Uses Prophet to extend the signal forward before decomposition
- Pushes filter edge effects into the extended (future) portion, which is then discarded
- Provides an sklearn-style `fit`/`transform` API
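
The extend-then-crop idea in the list above can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not the `rubin_oracle` implementation: a centered moving average stands in for the real filter, and the true continuation of a synthetic signal stands in for the Prophet forecast. The helper names `moving_average` and `lowpass_with_extension` are hypothetical.

```python
import numpy as np

def moving_average(x, window):
    """Centered moving average; np.convolve zero-pads, which distorts the edges."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="same")

def lowpass_with_extension(y, y_future, window):
    """Filter the extended series, then keep only the part that covers `y`."""
    extended = np.concatenate([y, y_future])
    smoothed = moving_average(extended, window)
    return smoothed[: len(y)]

# Synthetic hourly signal: slow trend plus a daily cycle (30 days).
t = np.arange(24 * 30)
signal = 10 + 0.01 * t + 3 * np.sin(2 * np.pi * t / 24)

n_hist, n_ext, window = 24 * 20, 48, 24
history = signal[:n_hist]
future = signal[n_hist:n_hist + n_ext]   # assumption: stands in for a forecast

# Reference: filter the long series, so index n_hist - 1 is far from any edge.
reference = moving_average(signal, window)[:n_hist]
naive = moving_average(history, window)
extended = lowpass_with_extension(history, future, window)

err_naive = abs(naive[-1] - reference[-1])
err_extended = abs(extended[-1] - reference[-1])
print(f"edge error without extension: {err_naive:.3f}")
print(f"edge error with extension:    {err_extended:.3e}")
```

With the extension in place, the zero-padded boundary of the filter falls inside the discarded samples, so the last retained value matches the reference. In practice the extension is a forecast rather than the true future, so the benefit depends on forecast quality.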

The `EnsembleForecaster` builds on this by:
- Using separate configs for the high-freq and low-freq models
- Optimizing each model for its frequency band
- Combining the forecasts by simple summation

In [None]:
from __future__ import annotations

import warnings
warnings.filterwarnings("ignore")

import logging
logging.disable(logging.INFO)

import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

import sys
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

sys.path.insert(0, str(Path.cwd().parent))

from rubin_oracle import (
    ProphetConfig,
    ProphetForecaster,
    NeuralProphetConfig,
    NeuralProphetForecaster,
    HighLowFreqDecomposer,
    SignalDecomposer,
)

# Paths
PROJECT_ROOT = Path.cwd().parent
DATA_PATH = PROJECT_ROOT / "data" / "temp2024.csv"
CONFIGS_PATH = PROJECT_ROOT / "configs"

# Constants - Using HOURLY frequency
STEPS_PER_HOUR = 1
STEPS_PER_DAY = 24

# Styling
sns.set_context("notebook", font_scale=0.9)
plt.rcParams["figure.figsize"] = (14, 6)

COLORS = {
    "actual": "#2c3e50",
    "prophet": "#95a5a6",
    "np_vanilla": "#3498db",
    "np_decomp": "#e74c3c",
    "np_5band": "#9b59b6",
}

print(f"Project root: {PROJECT_ROOT}")
print(f"Data path: {DATA_PATH}")
print(f"Using HOURLY frequency: {STEPS_PER_DAY} samples/day")

## 1. Load Data

In [None]:
# Load temperature data
df = pd.read_csv(DATA_PATH)
df["ds"] = pd.to_datetime(df["ds"], utc=True).dt.tz_convert("America/Santiago")
df = df[["ds", "y"]].copy()

# Remove timezone for Prophet compatibility
df["ds"] = df["ds"].dt.tz_localize(None)

# Drop duplicates (DST transitions can create duplicates)
df = df.drop_duplicates(subset="ds", keep="first").reset_index(drop=True)

print(f"Raw dataset: {len(df):,} samples ({len(df) / 96:.1f} days at 15-min)")

# Resample to hourly frequency
df = df.set_index("ds").resample("1h").mean().reset_index()
df = df.dropna()

print(f"Resampled to hourly: {len(df):,} samples ({len(df) / STEPS_PER_DAY:.1f} days)")
print(f"Date range: {df['ds'].min()} to {df['ds'].max()}")
print("\nSample data:")
df.head()

## 2. Train/Test Split

In [None]:
# Use 120 days for training, forecast 24h ahead
TRAIN_DAYS = 120
FORECAST_HORIZON = 24  # 24 samples = 24 hours at hourly freq

train_samples = TRAIN_DAYS * STEPS_PER_DAY + 3
df_train = df.iloc[:train_samples].copy()
df_test = df.iloc[train_samples:train_samples + FORECAST_HORIZON].copy()

print(f"Training: {len(df_train):,} samples ({len(df_train) / STEPS_PER_DAY:.0f} days)")
print(f"Test: {len(df_test)} samples ({len(df_test)} hours)")
print(f"\nForecast start: {df_test['ds'].iloc[0]}")

## 3. Vanilla Prophet (Baseline)

In [None]:
print("Fitting Vanilla Prophet...")

# Load config
prophet_config = ProphetConfig.from_yaml(CONFIGS_PATH / "prophet_default.yaml")
print(f"Config: {prophet_config.name}")

# Fit
prophet_model = ProphetForecaster(prophet_config)
prophet_model.fit(df_train)

# Metrics
print("\nProphet Metrics:")
for key, val in prophet_model.metrics_.items():
    print(f"  {key}: {val:.4f}" if isinstance(val, float) else f"  {key}: {val}")

# Forecast
prophet_fc = prophet_model.forecast()
yhat_prophet = prophet_fc["yhat"].values

# Bias correction
bias = df_train["y"].iloc[-1] - yhat_prophet[0]
yhat_prophet_corrected = yhat_prophet + bias

y_true = df_test["y"].values
rmse_prophet = np.sqrt(((y_true - yhat_prophet_corrected) ** 2).mean())
print(f"\nProphet Forecast RMSE (with bias correction): {rmse_prophet:.4f}")

In [None]:
# Plot Prophet
prophet_model.plot(df_test=df_test, window_days=7)

## 4. Vanilla NeuralProphet (No Decomposition)

In [None]:
print("Fitting Vanilla NeuralProphet...")

# Load config (already has freq: 1h)
np_config = NeuralProphetConfig.from_yaml(CONFIGS_PATH / "neuralprophet_default.yaml")

# Override epochs and lag_days for faster testing
np_config_dict = np_config.model_dump()
np_config_dict["epochs"] = 30
np_config_dict["lag_days"] = 1
np_config = NeuralProphetConfig.model_validate(np_config_dict)

print(f"Config: {np_config.name}")
print(f"Freq: {np_config.freq}, Lag days: {np_config.lag_days}, Epochs: {np_config.epochs}")

# Fit
np_vanilla = NeuralProphetForecaster(np_config)
np_vanilla.fit(df_train, verbose=True)

# Metrics
print("\nVanilla NeuralProphet Metrics:")
for key, val in np_vanilla.metrics_.items():
    print(f"  {key}: {val:.4f}" if isinstance(val, float) else f"  {key}: {val}")

# Forecast
np_vanilla_fc = np_vanilla.forecast(df_test, np_vanilla.latest_timestamp)
yhat_np_vanilla = np_vanilla_fc["yhat"].values

rmse_np_vanilla = np.sqrt(((y_true - yhat_np_vanilla) ** 2).mean())
print(f"\nVanilla NeuralProphet Forecast RMSE: {rmse_np_vanilla:.4f}")

In [None]:
# Plot NeuralProphet vanilla
np_vanilla.plot(df_test=df_test, window_days=7)

## 5. NeuralProphet + HighLowFreqDecomposer (External Decomposition)

This approach uses the new `HighLowFreqDecomposer` which:
- Extends signal forward using Prophet before decomposition
- Extracts `y_high_12h`, `y_high_24h`, and `y_low_freq` components
- Pushes edge effects into the future (discarded) portion
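
A toy version of the `fit`/`transform` pattern may help make the data flow concrete. This sketch rests on assumptions: it uses a trailing mean instead of the real filter, produces only two components, and keeps a short history buffer so that a small test window still has context on its left side. `ToyHighLowDecomposer` and its parameters are hypothetical and do not mirror the `HighLowFreqDecomposer` internals.

```python
import numpy as np

class ToyHighLowDecomposer:
    """Hypothetical stand-in for illustration; not the rubin_oracle class."""

    def __init__(self, window=24, buffer=48):
        self.window = window      # trailing-mean length in samples
        self.buffer = buffer      # history samples kept for context
        self.history_ = None

    def fit(self, y):
        # Remember the most recent samples so transform() has left-hand context.
        self.history_ = np.asarray(y, dtype=float)[-self.buffer:]
        return self

    def transform(self, y):
        y = np.asarray(y, dtype=float)
        padded = np.concatenate([self.history_, y])
        kernel = np.ones(self.window) / self.window
        # Trailing mean: output[n] averages padded[n - window + 1 : n + 1].
        low = np.convolve(padded, kernel, mode="full")[: len(padded)]
        low = low[len(self.history_):]          # drop the buffered history rows
        return {"y_high": y - low, "y_low_freq": low}

    def get_feature_names(self):
        return ["y_high", "y_low_freq"]


# Synthetic hourly series: constant level plus a daily cycle.
t = np.arange(24 * 10)
series = 15.0 + 3.0 * np.sin(2 * np.pi * t / 24)
train, test = series[:-24], series[-24:]

toy = ToyHighLowDecomposer().fit(train)
parts = toy.transform(test)
print({name: parts[name].shape for name in toy.get_feature_names()})
```

Because the buffered history is prepended before filtering, even the first test sample is averaged over a full window, and the two components add back up to the input.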

In [None]:
print("Setting up HighLowFreqDecomposer...")

# Create decomposer
decomposer = HighLowFreqDecomposer(
    freq=STEPS_PER_DAY,
    extension_days=2.0,
    history_buffer_days=14.0,
    verbose=True,
)

# Fit and transform training data
print("\nDecomposing training data...")
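decomposer.fit(df_train.tail(14 * STEPS_PER_DAY))
df_train_decomposed = decomposer.transform(df_train)
# Reference decomposition over the full series; transform() (not fit_transform())
# keeps the decomposer fitted on training data only
df_train_decomposed_all = decomposer.transform(df).iloc[:train_samples].copy()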

print(f"\nDecomposed columns: {list(df_train_decomposed.columns)}")
print(f"Feature names: {decomposer.get_feature_names()}")
print(f"Training end: {decomposer.training_end_}")

In [None]:
# Visualize decomposition
fig, axes = plt.subplots(3, 1, figsize=(14, 10), sharex=True)

# Last 7 days of training
n_plot = 7 * STEPS_PER_DAY
df_plot = df_train_decomposed.tail(n_plot)

axes[0].plot(df_plot["ds"], df_plot["y"], "k", lw=0.8)
axes[0].set_ylabel("Original (°C)")
axes[0].set_title("HighLowFreqDecomposer Signal Decomposition", fontsize=14)

# axes[1].plot(df_plot["ds"], df_plot["y_high_12h"], color="#e74c3c", lw=1.2)
# axes[1].set_ylabel("y_high_12h (°C)")
# axes[1].axhline(0, color="gray", linestyle="--", alpha=0.3)

axes[1].plot(df_plot["ds"], df_plot["y_high_24h"], color="#3498db", lw=1.2)
axes[1].set_ylabel("y_high_24h (°C)")
axes[1].axhline(0, color="gray", linestyle="--", alpha=0.3)

axes[2].plot(df_plot["ds"], df_plot["y_low_freq"], color="#27ae60", lw=1.2)
axes[2].set_ylabel("y_low_freq (°C)")
axes[2].set_xlabel("Date")


# Overlay the same 7-day window from the decomposition computed on the full series (grey)
df_plot_all = df_train_decomposed_all.tail(n_plot)

# axes[1].plot(df_plot_all["ds"], df_plot_all["y_high_12h"], color="grey", lw=0.8)
axes[1].plot(df_plot_all["ds"], df_plot_all["y_high_24h"], color="grey", lw=0.8)
axes[2].plot(df_plot_all["ds"], df_plot_all["y_low_freq"], color="grey", lw=0.8)


for ax in axes:
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Variance breakdown
print("\nVariance breakdown:")
total_var = df_train_decomposed["y"].var()
for col in decomposer.get_feature_names():
    var = df_train_decomposed[col].var()
    print(f"  {col}: {var:.4f} ({var/total_var*100:.1f}%)")

In [None]:
print("Fitting NeuralProphet with external decomposition...")

# Load config (already has 1h frequency, no internal decomposition)
np_decomp_config = NeuralProphetConfig.from_yaml(CONFIGS_PATH / "neuralprophet_external_decomp.yaml")

print(f"Config: {np_decomp_config.name}")
print(f"Freq: {np_decomp_config.freq}, Lag days: {np_decomp_config.lag_days}, Epochs: {np_decomp_config.epochs}")
print(f"Internal decomposer: {np_decomp_config.decomposer.method}")

# Fit on decomposed data with explicit regressors
np_decomp = NeuralProphetForecaster(np_decomp_config)
np_decomp.fit(
    df_train_decomposed,
    verbose=True,
    regressors=decomposer.get_feature_names(),  # Explicitly pass decomposed columns
)

# Metrics
print("\nNeuralProphet + HighLowFreq Decomposer Metrics:")
for key, val in np_decomp.metrics_.items():
    print(f"  {key}: {val:.4f}" if isinstance(val, float) else f"  {key}: {val}")

In [None]:
# Transform test data using the decomposer
print("Decomposing test data (with Prophet extension)...")
df_test_decomposed = decomposer.transform(df_test)

print(f"Test decomposed shape: {df_test_decomposed.shape}")
print(f"NaN values: {df_test_decomposed[decomposer.get_feature_names()].isna().sum().sum()}")

# Forecast
np_decomp_fc = np_decomp.forecast(df_test_decomposed, np_decomp.latest_timestamp)
yhat_np_decomp = np_decomp_fc["yhat"].values

# Handle potential length mismatch
n_pred = min(len(y_true), len(yhat_np_decomp))
rmse_np_decomp = np.sqrt(((y_true[:n_pred] - yhat_np_decomp[:n_pred]) ** 2).mean())
print(f"\nNeuralProphet + HighLowFreqDecomposer Forecast RMSE: {rmse_np_decomp:.4f}")
print(f"Predictions: {len(yhat_np_decomp)} / {len(y_true)} expected")

In [None]:
# Plot NeuralProphet with decomposer
np_decomp.plot(df_test=df_test_decomposed, window_days=6)

## 6. NeuralProphet + SignalDecomposer (5-band)

Multi-scale decomposition with 5 frequency bands:
- **Band 0**: Sub-daily (0.1-0.8 days = 2.4-19.2 hours)
- **Band 1**: Daily (0.8-1.5 days = 19.2-36 hours)
- **Band 2**: Weekly (1.5-7 days)
- **Band 3**: Bi-weekly (7-25 days)
- **Band 4**: Monthly (25-60 days)
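
To show how period ranges given in days map onto an hourly series, here is a minimal band-splitting sketch. It rests on assumptions: an FFT mask stands in for whatever filter `SignalDecomposer` actually applies, the function `split_into_bands` is hypothetical, and the three period pairs are a simplified illustrative set rather than the five bands listed above.

```python
import numpy as np

def split_into_bands(y, period_pairs_days, steps_per_day=24):
    """Split `y` into one component per (min_period, max_period) pair, in days."""
    y = np.asarray(y, dtype=float)
    spectrum = np.fft.rfft(y - y.mean())
    # Sample spacing is 1/steps_per_day days, so freqs are in cycles per day.
    freqs = np.fft.rfftfreq(len(y), d=1.0 / steps_per_day)
    bands = []
    for p_min, p_max in period_pairs_days:
        # Keep frequencies whose period lies in [p_min, p_max).
        mask = (freqs > 1.0 / p_max) & (freqs <= 1.0 / p_min)
        bands.append(np.fft.irfft(spectrum * mask, n=len(y)))
    return bands

# 60 days of hourly samples: a daily cycle plus a 10-day cycle.
t = np.arange(24 * 60)
daily = 2.0 * np.sin(2 * np.pi * t / 24)
slow = 1.0 * np.sin(2 * np.pi * t / (24 * 10))

pairs = [(0.1, 0.8), (0.8, 1.5), (1.5, 25.0)]   # illustrative edges, in days
bands = split_into_bands(daily + slow, pairs)

for (p_min, p_max), band in zip(pairs, bands):
    print(f"{p_min:>4}-{p_max:<4} days: variance {band.var():.3f}")
```

The daily cycle has a period of 1 day, so it lands in the 0.8-1.5 day band, while the 10-day cycle lands in the widest band and the sub-daily band stays empty. A real filter such as the Butterworth option configured in this notebook has softer band edges than this hard mask.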

In [None]:
print("Setting up SignalDecomposer (5-band)...")

# 5-band period pairs
PERIOD_PAIRS_5BAND = [
    (0.10, 0.80),   # Band 0: Sub-daily (2.4h - 19.2h)
    (0.80, 1.50),   # Band 1: Daily (19.2h - 36h)
    (1.50, 7.00),   # Band 2: Weekly (1.5 - 7d)
    (7.00, 25.00),  # Band 3: Bi-weekly (7 - 25d)
    (25.00, 60.00), # Band 4: Monthly (25 - 60d)
]

decomposer_5band = SignalDecomposer(
    freq=STEPS_PER_DAY,
    period_pairs=PERIOD_PAIRS_5BAND,
    extension_days=2.0,
    history_buffer_days=14.0,
    filter_type="butterworth",
    verbose=True,
)

# Fit and transform training data
print("\nDecomposing training data (5-band)...")
decomposer_5band.fit(df_train.tail(14 * STEPS_PER_DAY))
df_train_5band = decomposer_5band.transform(df_train)
df_train_5band_all = decomposer_5band.transform(df).iloc[:train_samples].copy()

print(f"\nDecomposed columns: {list(df_train_5band.columns)}")
print(f"Feature names: {decomposer_5band.get_feature_names()}")

In [None]:
# Visualize 5-band decomposition
fig, axes = plt.subplots(6, 1, figsize=(14, 14), sharex=True)

n_plot = 7 * STEPS_PER_DAY
df_plot = df_train_5band.tail(n_plot)
band_names = ["Sub-daily", "Daily", "Weekly", "Bi-weekly", "Monthly"]
band_colors = ["#e74c3c", "#3498db", "#27ae60", "#f39c12", "#9b59b6"]

axes[0].plot(df_plot["ds"], df_plot["y"], "k", lw=0.8)
axes[0].set_ylabel("Original (°C)")
axes[0].set_title("SignalDecomposer 5-Band Decomposition", fontsize=14)

for i, (name, color) in enumerate(zip(band_names, band_colors)):
    col = f"y_band_{i}"
    axes[i+1].plot(df_plot["ds"], df_plot[col], color=color, lw=1.2)
    axes[i+1].set_ylabel(f"{name} (°C)")
    axes[i+1].axhline(0, color="gray", linestyle="--", alpha=0.3)

df_plot = df_train_5band_all.tail(n_plot)
for i, (name, color) in enumerate(zip(band_names, band_colors)):
    col = f"y_band_{i}"
    axes[i+1].plot(df_plot["ds"], df_plot[col], color="grey", lw=0.8, ls='--')
    axes[i+1].set_ylabel(f"{name} (°C)")
    axes[i+1].axhline(0, color="gray", linestyle="--", alpha=0.3)
    lims = [df_plot[col].mean()-3*df_plot[col].std(), df_plot[col].mean()+3*df_plot[col].std()]
    axes[i+1].set_ylim(*lims)

axes[-1].set_xlabel("Date")

for ax in axes:
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Variance breakdown
print("\nVariance breakdown (5-band):")
total_var = df_train_5band["y"].var()
for i, name in enumerate(band_names):
    col = f"y_band_{i}"
    var = df_train_5band[col].var()
    print(f"  {name} ({col}): {var:.4f} ({var/total_var*100:.1f}%)")

In [None]:
print("Fitting NeuralProphet with 5-band decomposition...")

# Use same config as HighLowFreq (already has 1h frequency)
np_5band_config = NeuralProphetConfig.from_yaml(CONFIGS_PATH / "neuralprophet_external_decomp.yaml")

print(f"Config: {np_5band_config.name}")
print(f"Freq: {np_5band_config.freq}, Lag days: {np_5band_config.lag_days}, Epochs: {np_5band_config.epochs}")

# Fit on 5-band decomposed data with explicit regressors
np_5band = NeuralProphetForecaster(np_5band_config)
np_5band.fit(
    df_train_5band,
    verbose=True,
    regressors=decomposer_5band.get_feature_names(),  # Explicitly pass 5 band columns
)

# Metrics
print("\nNeuralProphet + 5-band Decomposer Metrics:")
for key, val in np_5band.metrics_.items():
    print(f"  {key}: {val:.4f}" if isinstance(val, float) else f"  {key}: {val}")

In [None]:
# Transform test data using 5-band decomposer
print("Decomposing test data (5-band with Prophet extension)...")
# np_5band._fit_df = df_train_5band
# np_5band.latest_timestamp = df_train['ds'].max().tz_localize("America/Santiago")

df_test_5band = decomposer_5band.transform(df_test)

print(f"Test decomposed shape: {df_test_5band.shape}")
print(f"NaN values: {df_test_5band[decomposer_5band.get_feature_names()].isna().sum().sum()}")

# Forecast
np_5band_fc = np_5band.forecast(df_test_5band, np_5band.latest_timestamp)
yhat_np_5band = np_5band_fc["yhat"].values

# Handle potential length mismatch
n_pred = min(len(y_true), len(yhat_np_5band))
rmse_np_5band = np.sqrt(((y_true[:n_pred] - yhat_np_5band[:n_pred]) ** 2).mean())
print(f"\nNeuralProphet + 5-band Decomposer Forecast RMSE: {rmse_np_5band:.4f}")
print(f"Predictions: {len(yhat_np_5band)} / {len(y_true)} expected")

In [None]:
np_5band.plot(df_test=df_test_5band, window_days=5, lead_time=-1)

In [None]:
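# Visual check only: the offset uses y_true[0], which a real forecast would not have
offset_5band = y_true[0] - yhat_np_5band[0]
plt.plot(yhat_np_5band + offset_5band, label="NeuralProphet + 5-band (offset)")
plt.plot(y_true, label="Actual")
plt.legend()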

## 7. EnsembleForecaster (High + Low Frequency Models)

The new `EnsembleForecaster` combines two separate NeuralProphet models:
- **High-frequency model**: Optimized for fast-changing components (sub-daily through weekly bands)
- **Low-frequency model**: Optimized for slow-changing components (bi-weekly and monthly bands)

Key benefits:
- Each model has hyperparameters tuned for its frequency band
- Decomposition happens externally (separation of concerns)
- Simple summation combines the forecasts
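
Summation is a natural way to recombine forecasts when the components are treated as additive parts of one signal. A minimal sketch, assuming each model returns a forecast array of the same length; `combine_by_sum` and the two example arrays are hypothetical and are not output from `EnsembleForecaster`.

```python
import numpy as np

def combine_by_sum(forecasts):
    """Sum per-band forecast arrays element-wise (hypothetical helper)."""
    arrays = [np.asarray(f, dtype=float) for f in forecasts]
    shapes = {a.shape for a in arrays}
    if len(shapes) != 1:
        raise ValueError(f"forecasts must share one shape, got {sorted(shapes)}")
    return np.sum(arrays, axis=0)

hours = np.arange(24)
high_freq_fc = 2.0 * np.sin(2 * np.pi * hours / 24)   # made-up daily-cycle forecast
low_freq_fc = 15.0 + 0.05 * hours                     # made-up slow-drift forecast

combined_fc = combine_by_sum([high_freq_fc, low_freq_fc])
print(combined_fc.shape)
```

The explicit shape check guards against silently broadcasting forecasts that cover different horizons.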

In [None]:
from rubin_oracle import EnsembleForecaster, NeuralProphetConfig

print("Setting up EnsembleForecaster...")

# Load configs for high-freq and low-freq models
high_freq_config = NeuralProphetConfig.from_yaml(CONFIGS_PATH / "neuralprophet_high_freq.yaml")
low_freq_config = NeuralProphetConfig.from_yaml(CONFIGS_PATH / "neuralprophet_low_freq.yaml")

print(f"High-freq config: lag_days={high_freq_config.lag_days}, epochs={high_freq_config.epochs}, ar_layers={high_freq_config.ar_layers}")
print(f"Low-freq config: lag_days={low_freq_config.lag_days}, epochs={low_freq_config.epochs}, ar_layers={low_freq_config.ar_layers}")

# Create ensemble with column mappings from 5-band decomposition
# High-freq: bands 0, 1, 2 (sub-daily, daily, weekly)
# Low-freq: bands 3, 4 (bi-weekly, monthly)
ensemble = EnsembleForecaster(
    high_freq_config=high_freq_config,
    low_freq_config=low_freq_config,
    high_freq_cols=["y_band_0", "y_band_1", "y_band_2"],
    low_freq_cols=["y_band_3", "y_band_4"],
    bias_correction=True,
    bias_window_hours=6.0,
)

print(f"\nHigh-freq columns: {ensemble._high_freq_cols}")
print(f"Low-freq columns: {ensemble._low_freq_cols}")

In [None]:
print("Fitting EnsembleForecaster on pre-decomposed data...")

# Use the 5-band decomposed training data (already created earlier)
ensemble.fit(df_train_5band, verbose=True)

# Metrics
print("\nEnsembleForecaster Metrics:")
for key, val in ensemble.metrics_.items():
    print(f"  {key}: {val:.4f}" if isinstance(val, float) else f"  {key}: {val}")

In [None]:
df_test_5band = decomposer_5band.transform(df_test, include_history=True)
df_test_5band

In [None]:
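# Generate ensemble forecast on pre-decomposed test data
# NOTE: this takes the last 25 rows of ensemble.fitted(); only the first
# len(y_true) of them are scored against y_true below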
print("Generating ensemble forecast...")
ensemble_fc = ensemble.fitted().drop_duplicates("ds").groupby("ds").first().reset_index().tail(25)
yhat_ensemble = ensemble_fc["yhat"].values

# Compute RMSE
n_pred = min(len(y_true), len(yhat_ensemble))
rmse_ensemble = np.sqrt(((y_true[:n_pred] - yhat_ensemble[:n_pred]) ** 2).mean())
print(f"\nEnsembleForecaster Forecast RMSE: {rmse_ensemble:.4f}")
print(f"Predictions: {len(yhat_ensemble)} / {len(y_true)} expected")

In [None]:
# Plot ensemble forecast with component breakdown
fig, axs = ensemble.plot(df_test=df_test_5band, window_days=4, show_components=True)

In [None]:
ensemble.plot_components(window_days=12)

## 8. Forecast Comparison

In [None]:
# Visualize all forecasts
fig, ax = plt.subplots(figsize=(16, 7))

hours = np.arange(FORECAST_HORIZON)

# Add ensemble color
COLORS["ensemble"] = "#2ecc71"

# Actual
ax.plot(hours, y_true, "-", color=COLORS["actual"], lw=2.5, label="Actual")

# Prophet
ax.plot(
    hours, yhat_prophet_corrected, "--",
    color=COLORS["prophet"], lw=1.5,
    label=f"Prophet (RMSE={rmse_prophet:.3f})"
)

# Vanilla NeuralProphet
ax.plot(
    hours, yhat_np_vanilla, "-",
    color=COLORS["np_vanilla"], lw=1.5,
    label=f"NeuralProphet Vanilla (RMSE={rmse_np_vanilla:.3f})"
)

# NeuralProphet + HighLow Decomposer
ax.plot(
    hours, yhat_np_decomp, "-",
    color=COLORS["np_decomp"], lw=1.5,
    label=f"NeuralProphet + HighLow (RMSE={rmse_np_decomp:.3f})"
)

# NeuralProphet + 5-band Decomposer
ax.plot(
    hours, yhat_np_5band, "-",
    color=COLORS["np_5band"], lw=1.5,
    label=f"NeuralProphet + 5-band (RMSE={rmse_np_5band:.3f})"
)

# EnsembleForecaster
ax.plot(
    hours, yhat_ensemble[:FORECAST_HORIZON], "-",
    color=COLORS["ensemble"], lw=2,
    label=f"Ensemble (RMSE={rmse_ensemble:.3f})"
)

ax.set_xlabel("Forecast Horizon (hours)", fontsize=12)
ax.set_ylabel("Temperature (°C)", fontsize=12)
ax.set_title("24h Forecast Comparison: Decomposer vs Vanilla Models", fontsize=14)
ax.legend(fontsize=10, loc="best")
ax.grid(True, alpha=0.3)
ax.set_xlim(0, 24)

plt.tight_layout()
plt.show()

## 9. RMSE by Lead Time
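
The curve in this section is a trailing-window RMSE rather than a single-step error. Written out for the `rolling_rmse` helper defined in the next cell, with window length $w$ (default 4) and zero-based forecast step $i$:

$$
\mathrm{RMSE}_i = \sqrt{\frac{1}{i - s_i + 1} \sum_{k = s_i}^{i} \left(y_k - \hat{y}_k\right)^2},
\qquad s_i = \max(0,\; i - w + 1)
$$

For the first $w - 1$ steps the window is shorter than $w$, so the earliest points average over fewer samples and are noisier.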

In [None]:
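# Rolling RMSE over a trailing window of forecast steps
def rolling_rmse(y_true, yhat, window=4):
    """RMSE over the last `window` steps ending at each lead time (shorter at the start)."""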
    rmse = []
    for i in range(len(y_true)):
        start = max(0, i - window + 1)
        rmse.append(np.sqrt(np.mean((y_true[start:i+1] - yhat[start:i+1]) ** 2)))
    return np.array(rmse)

fig, ax = plt.subplots(figsize=(12, 5))

hours = np.arange(FORECAST_HORIZON)

ax.plot(hours, rolling_rmse(y_true, yhat_prophet_corrected), "-",
        color=COLORS["prophet"], lw=2, label="Prophet")
ax.plot(hours, rolling_rmse(y_true, yhat_np_vanilla), "-",
        color=COLORS["np_vanilla"], lw=2, label="NeuralProphet Vanilla")
ax.plot(hours, rolling_rmse(y_true, yhat_np_decomp), "-",
        color=COLORS["np_decomp"], lw=2, label="NeuralProphet + HighLow")
ax.plot(hours, rolling_rmse(y_true, yhat_np_5band), "-",
        color=COLORS["np_5band"], lw=2, label="NeuralProphet + 5-band")
ax.plot(hours, rolling_rmse(y_true, yhat_ensemble), "-",
        color=COLORS["ensemble"], lw=2, label="Ensemble")

ax.axhline(y=1.0, color="gray", linestyle="--", alpha=0.5, label="1°C threshold")
ax.set_xlabel("Forecast Horizon (hours)", fontsize=12)
ax.set_ylabel("Rolling RMSE (°C)", fontsize=12)
ax.set_title("Rolling RMSE by Lead Time", fontsize=14)
ax.legend(fontsize=10)
ax.grid(True, alpha=0.3)
ax.set_xlim(0, 24)
ax.set_ylim(0, 2)

plt.tight_layout()
plt.show()

## 10. Summary

In [None]:
# Summary table
print("=" * 70)
print("SUMMARY: 24h Forecast RMSE")
print("=" * 70)

results = {
    "Model": [
        "Prophet (baseline)",
        "NeuralProphet Vanilla",
        "NeuralProphet + HighLowFreq",
        "NeuralProphet + 5-band",
        "EnsembleForecaster",
    ],
    "RMSE": [rmse_prophet, rmse_np_vanilla, rmse_np_decomp, rmse_np_5band, rmse_ensemble],
    "In-sample R²": [
        prophet_model.metrics_["r2"],
        np_vanilla.metrics_["r2"],
        np_decomp.metrics_["r2"],
        np_5band.metrics_["r2"],
        "N/A",  # Ensemble has component metrics
    ],
}

df_results = pd.DataFrame(results)
df_results["Improvement vs Prophet"] = df_results["RMSE"].apply(
    lambda x: f"{(rmse_prophet - x) / rmse_prophet * 100:+.1f}%"
)

print(df_results.to_string(index=False))

print("\n" + "=" * 70)
best_idx = df_results["RMSE"].idxmin()
best_model = df_results.loc[best_idx, "Model"]
best_rmse = df_results.loc[best_idx, "RMSE"]
print(f"BEST MODEL: {best_model} (RMSE: {best_rmse:.4f})")
print("=" * 70)

## 11. Save/Load Demo

In [None]:
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    save_path = Path(tmpdir) / "np_decomp_model"

    # Save forecaster with decomposer
    print("Saving model with decomposer...")
    np_decomp.save(save_path, decomposer=decomposer)
    print(f"Saved files: {[f.name for f in save_path.iterdir()]}")

    # Load
    print("\nLoading model...")
    loaded_model = NeuralProphetForecaster.load(save_path)
    loaded_decomposer = NeuralProphetForecaster.load_decomposer(save_path)

    # Refit history (needed after loading without history)
    print("Refitting decomposer history...")
    loaded_decomposer.refit_history(df_train)

    # Get latest timestamp from training data
    issue_time = df_train["ds"].max()

    # Verify prediction
    df_test_decomp_loaded = loaded_decomposer.transform(df_test)
    fc_loaded = loaded_model.forecast(df_test_decomp_loaded, issue_time=issue_time)

    print(f"\nLoaded model prediction shape: {fc_loaded.shape}")
    print(f"Loaded model RMSE: {np.sqrt(((y_true - fc_loaded['yhat'].values) ** 2).mean()):.4f}")

## 12. Conclusions

### Key Findings

1. **HighLowFreqDecomposer** separates the signal into 12h, 24h, and low-frequency components
2. **Prophet extension** before decomposition pushes filter edge effects into the discarded future portion rather than the retained signal
3. **External decomposition** keeps decomposition separate from model fitting, which allows more flexibility than NeuralProphet's internal decomposition
4. **EnsembleForecaster** combines specialized models for different frequency bands

### New API Features

- `SignalDecomposer.fit_transform(df)` - Fit and decompose training data
- `SignalDecomposer.transform(df)` - Decompose test data with automatic extension
- `SignalDecomposer.get_feature_names()` - Get decomposed column names
- `EnsembleForecaster(high_freq_config, low_freq_config, ...)` - Combine models for different frequency bands
- `NeuralProphetForecaster.save(path, decomposer=...)` - Save with decomposer
- `NeuralProphetForecaster.load_decomposer(path)` - Load decomposer separately