# 9. Forecasting pipeline

## Setup and imports

In [1]:
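import pandas as pd  # pd.concat is used below; do not rely on the wildcard import for it
from functions import *  # project helpers (load_data, rolling_forecast_*, plot_forecast, ...)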

## Loading and preprocessing the datasets
Both the training and forecasting datasets are resampled to an hourly frequency, and missing values are filled by time-based interpolation.
Time- and weather-related features are then added with the same feature engineering pipeline as in Notebook 8, so the inputs stay consistent with what the ML model expects.
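
To illustrate what the resampling and time-based interpolation do, here is a minimal sketch on a toy series. The values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Toy series with the 02:00 reading missing (illustrative values only).
idx = pd.to_datetime(
    ["2013-07-01 00:00", "2013-07-01 01:00", "2013-07-01 03:00"], utc=True
)
toy = pd.Series([1.0, 2.0, 4.0], index=idx)

# Reindex onto a regular hourly grid; the missing hour becomes NaN.
toy_hourly = toy.asfreq("h")

# Fill gaps in proportion to elapsed time. limit_direction="both" lets
# gaps be filled in either direction rather than only forward.
toy_filled = toy_hourly.interpolate(method="time", limit_direction="both")

print(toy_filled)
```

The 02:00 value lies halfway in time between the readings 2.0 and 4.0, so it is filled with 3.0.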

In [2]:
train_raw = load_data()
train_raw = train_raw.sort_index()
train_raw = train_raw.asfreq("h")
train_raw = train_raw.interpolate(method="time", limit_direction="both")
train_raw = train_raw.reset_index()

train_fe = add_time_weather_features(train_raw)

train_fe = train_fe.round(5)
show_table_info(train_fe)


DATASET SUMMARY
Shape: 8,760 rows × 25 columns
Time span: 2013-07-01 00:00:00+00:00 -> 2014-06-30 23:00:00+00:00

                         Column    Type   NA %
                        pv_mod1 float64  0.00%
                        pv_mod2 float64  0.00%
                        pv_mod3 float64  0.00%
                         demand float64  0.00%
                             pv float64  0.00%
                          price float64  0.00%
                    temperature float64  0.00%
                 pressure (hPa) float64  0.00%
                cloud_cover (%) float64  0.00%
            cloud_cover_low (%) float64  0.00%
            cloud_cover_mid (%) float64  0.00%
           cloud_cover_high (%) float64  0.00%
          wind_speed_10m (km/h) float64  0.00%
     shortwave_radiation (W/m²) float64  0.00%
        direct_radiation (W/m²) float64  0.00%
       diffuse_radiation (W/m²) float64  0.00%
direct_normal_irradiance (W/m²) float64  0.00%
                           hour   int32



In [3]:
forecast_raw = load_forecast_data()
forecast_raw = forecast_raw.sort_index()
forecast_raw = forecast_raw.asfreq("h")
forecast_raw = forecast_raw.interpolate(method="time", limit_direction="both")
forecast_raw = forecast_raw.reset_index()

forecast_fe = add_time_weather_features(forecast_raw)

forecast_fe = forecast_fe.round(5)
show_table_info(forecast_fe)


DATASET SUMMARY
Shape: 168 rows × 22 columns
Time span: 2014-07-01 00:00:00+00:00 -> 2014-07-07 23:00:00+00:00

                         Column    Type   NA %
                         demand float64  0.00%
                             pv float64  0.00%
                          price float64  0.00%
                    temperature float64  0.00%
                 pressure (hPa) float64  0.00%
                cloud_cover (%)   int64  0.00%
            cloud_cover_low (%)   int64  0.00%
            cloud_cover_mid (%)   int64  0.00%
           cloud_cover_high (%)   int64  0.00%
          wind_speed_10m (km/h) float64  0.00%
     shortwave_radiation (W/m²)   int64  0.00%
        direct_radiation (W/m²)   int64  0.00%
       diffuse_radiation (W/m²)   int64  0.00%
direct_normal_irradiance (W/m²) float64  0.00%
                           hour   int32  0.00%
                        weekday   int32  0.00%
                     is_weekend   int64  0.00%
                       hour_sin float64  



## Prepare feature sets
We select the same set of engineered features that was used for XGBoost and the other ML models in Notebook 8.
This ensures the machine learning model receives exactly the same input structure during training and when forecasting on the new, unseen dataset.
Besides the predictors, the list also keeps `timestamp` and `demand` (the forecast target).
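
Because the dataset summaries above list several forecast columns as `int64` where the training data has `float64`, it can be worth checking the input structure explicitly. The sketch below is one way to do that. `select_features` and `match_dtypes` are hypothetical helpers, not part of the project's `functions` module, and the toy frames are illustrative.

```python
import pandas as pd


def select_features(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Return df restricted to `columns`, failing loudly if any are missing."""
    missing = [c for c in columns if c not in df.columns]
    if missing:
        raise KeyError(f"Missing feature columns: {missing}")
    return df.loc[:, columns].copy()


def match_dtypes(reference: pd.DataFrame, other: pd.DataFrame) -> pd.DataFrame:
    """Cast `other` to the column dtypes of `reference` (same columns assumed)."""
    return other.astype(reference.dtypes.to_dict())


# Illustrative frames: the future frame stores cloud cover as integers.
toy_cols = ["temperature", "cloud_cover (%)"]
toy_train = pd.DataFrame(
    {"temperature": [13.5, 13.2], "cloud_cover (%)": [4.0, 27.0], "extra": [1, 2]}
)
toy_future = pd.DataFrame({"temperature": [14.0], "cloud_cover (%)": [30]})

train_sel = select_features(toy_train, toy_cols)
future_sel = match_dtypes(train_sel, select_features(toy_future, toy_cols))

print(future_sel.dtypes)
```

Tree-based models generally tolerate mixed integer and float inputs, so the cast is a precaution rather than a requirement.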

In [4]:
FEATURES = [
    "timestamp","demand","hour_sin","hour_cos","is_weekend",
    "cooling_degree","heating_degree","temperature",
    "pressure (hPa)","cloud_cover (%)","wind_speed_10m (km/h)",
    "shortwave_radiation (W/m²)",
    "direct_radiation (W/m²)","diffuse_radiation (W/m²)",
    "direct_normal_irradiance (W/m²)",
    "price"
]

train_fe = train_fe[FEATURES]
forecast_fe = forecast_fe[FEATURES]

train_fe.head()

Unnamed: 0,timestamp,demand,hour_sin,hour_cos,is_weekend,cooling_degree,heating_degree,temperature,pressure (hPa),cloud_cover (%),wind_speed_10m (km/h),shortwave_radiation (W/m²),direct_radiation (W/m²),diffuse_radiation (W/m²),direct_normal_irradiance (W/m²),price
0,2013-07-01 00:00:00+00:00,0.27,0.0,1.0,0,0.0,4.5,13.5,1011.3,4.0,10.5,0.0,0.0,0.0,0.0,0.01605
1,2013-07-01 01:00:00+00:00,0.23,0.25882,0.96593,0,0.0,4.8,13.2,1010.8,27.0,11.9,0.0,0.0,0.0,0.0,0.00095
2,2013-07-01 02:00:00+00:00,0.26,0.5,0.86603,0,0.0,4.9,13.1,1010.3,33.0,11.6,0.0,0.0,0.0,0.0,0.0006
3,2013-07-01 03:00:00+00:00,0.28,0.70711,0.70711,0,0.0,5.0,13.0,1010.3,28.0,11.2,51.45455,2.0,7.0,30.1,0.00046
4,2013-07-01 04:00:00+00:00,0.29,0.86603,0.5,0,0.0,4.2,13.8,1010.2,16.0,11.7,102.90909,30.0,31.0,252.0,0.00046
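
The cyclical and degree-day columns in the table above are consistent with a 24-hour sine/cosine encoding of the hour and an 18 °C base temperature (for example, 13.5 + 4.5 = 18). The sketch below reproduces those columns under that assumption. It is not the actual `add_time_weather_features` implementation, which lives in the `functions` module and is not shown here. `add_basic_features` and `BASE_TEMP_C` are hypothetical names.

```python
import numpy as np
import pandas as pd

BASE_TEMP_C = 18.0  # assumption inferred from the table: temperature + heating_degree = 18


def add_basic_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add hour encoding, weekend flag and degree-day style features."""
    out = df.copy()
    ts = pd.to_datetime(out["timestamp"])
    hour = ts.dt.hour
    # Map the hour onto a circle so that 23:00 and 00:00 end up close together.
    out["hour_sin"] = np.sin(2 * np.pi * hour / 24)
    out["hour_cos"] = np.cos(2 * np.pi * hour / 24)
    # Monday=0 ... Sunday=6, so 5 and 6 are the weekend.
    out["is_weekend"] = (ts.dt.dayofweek >= 5).astype(int)
    # Degrees above / below the base temperature, never negative.
    out["cooling_degree"] = (out["temperature"] - BASE_TEMP_C).clip(lower=0)
    out["heating_degree"] = (BASE_TEMP_C - out["temperature"]).clip(lower=0)
    return out


# First five timestamps and temperatures from the table above.
demo = pd.DataFrame(
    {
        "timestamp": pd.date_range("2013-07-01 00:00", periods=5, freq="h", tz="UTC"),
        "temperature": [13.5, 13.2, 13.1, 13.0, 13.8],
    }
)
demo_fe = add_basic_features(demo).round(5)
print(demo_fe)
```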


## Run pipeline: 4 models

We run four models in a rolling 7-day forecasting setup (24-hour horizon, 0-hour lead time).
Each model is retrained once per day on all data available up to that day's forecast origin:

- SARIMA(1,1,1)(1,1,1,24) – best statistical model
- XGBoost – best ML model
- Naive baseline – last observed value
- Seasonal naive – value from 24 hours earlier
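
The rolling scheme and the two baselines can be sketched as follows. This is a minimal illustration under stated assumptions, not the code behind `rolling_forecast_naive` or `rolling_forecast_seasonal_naive` (those are defined in the `functions` module and not shown here). All function names below are hypothetical, and the toy data is synthetic.

```python
import numpy as np
import pandas as pd


def naive_forecast(history: pd.Series, horizon: int = 24) -> np.ndarray:
    """Repeat the last observed value over the whole horizon."""
    return np.full(horizon, float(history.iloc[-1]))


def seasonal_naive_forecast(history: pd.Series, horizon: int = 24, lag: int = 24) -> np.ndarray:
    """Predict each step with the value observed `lag` steps earlier."""
    last_cycle = history.iloc[-lag:].to_numpy(dtype=float)
    reps = int(np.ceil(horizon / lag))
    return np.tile(last_cycle, reps)[:horizon]


def rolling_forecast(train: pd.Series, future: pd.Series, forecaster,
                     days: int = 7, horizon: int = 24) -> np.ndarray:
    """Expanding-window forecast: one `horizon`-step forecast per day."""
    history = train.copy()
    preds = []
    for d in range(days):
        # Forecast the next day using only data up to the forecast origin.
        preds.append(forecaster(history, horizon))
        # Then reveal that day's actuals and add them to the history.
        actual = future.iloc[d * horizon:(d + 1) * horizon]
        history = pd.concat([history, actual])
    return np.concatenate(preds)


# Synthetic data: two days of history, two days to forecast.
toy_train = pd.Series(np.arange(48, dtype=float))
toy_future = pd.Series(np.arange(48, 96, dtype=float))

naive_pred = rolling_forecast(toy_train, toy_future, naive_forecast, days=2)
snaive_pred = rolling_forecast(toy_train, toy_future, seasonal_naive_forecast, days=2)
```

For SARIMA and XGBoost the same loop would apply, with the forecaster refitting the model on `history` at each step.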

In [5]:
SARIMA_ORDER = (1,1,1)
SARIMA_SEASONAL = (1,1,1,24)
FORECAST_DAYS = 7
FORECAST_HORIZON = 24

sarima_fc = rolling_forecast_sarima(
    train_df=train_fe,
    future_df=forecast_fe,
    order=SARIMA_ORDER,
    seasonal=SARIMA_SEASONAL,
    days=FORECAST_DAYS,
    horizon=FORECAST_HORIZON
)

xgb_fc = rolling_forecast_xgboost(
    train_df=train_fe,
    future_df=forecast_fe,
    days=FORECAST_DAYS,
    horizon=FORECAST_HORIZON
)

naive_fc = rolling_forecast_naive(
    train_df=train_fe,
    future_df=forecast_fe,
    days=FORECAST_DAYS,
    horizon=FORECAST_HORIZON
)

snaive_fc = rolling_forecast_seasonal_naive(
    train_df=train_fe,
    future_df=forecast_fe,
    days=FORECAST_DAYS,
    horizon=FORECAST_HORIZON,
    lag=24
)

# Combine into a single DataFrame
all_fc = pd.concat([sarima_fc, xgb_fc, naive_fc, snaive_fc], ignore_index=True)
all_fc.head()



Unnamed: 0,timestamp,y_true,y_pred,model,day
0,2014-07-01 00:00:00+00:00,0.25,0.304634,"SARIMA(1, 1, 1)(1, 1, 1, 24)",1
1,2014-07-01 01:00:00+00:00,0.26,0.297205,"SARIMA(1, 1, 1)(1, 1, 1, 24)",1
2,2014-07-01 02:00:00+00:00,0.24,0.300011,"SARIMA(1, 1, 1)(1, 1, 1, 24)",1
3,2014-07-01 03:00:00+00:00,0.25,0.294817,"SARIMA(1, 1, 1)(1, 1, 1, 24)",1
4,2014-07-01 04:00:00+00:00,0.27,0.313512,"SARIMA(1, 1, 1)(1, 1, 1, 24)",1


## Compare models
We compute MAE, RMSE, nRMSE and MAPE for each model across the full 7-day forecast period.
nRMSE is used as the main comparison indicator: it normalises RMSE relative to the demand range, so the error is expressed as a fraction of the observed variation in demand.
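
For reference, the four metrics can be sketched as below. This is an assumed implementation, not the project's `metrics_by_model` (which is not shown here). It takes nRMSE as RMSE divided by the range of the actual values, in line with the description above, and reports MAPE as a fraction rather than a percentage. `forecast_metrics` and `metrics_per_model_sketch` are hypothetical names, and the toy data is illustrative.

```python
import numpy as np
import pandas as pd


def forecast_metrics(y_true, y_pred) -> dict:
    """MAE, RMSE, range-normalised RMSE and MAPE (as a fraction)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    rmse = float(np.sqrt(np.mean(err ** 2)))
    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": rmse,
        # Assumption: normalise by the range (max - min) of the actual values.
        "nRMSE": rmse / float(y_true.max() - y_true.min()),
        # Undefined if any actual value is zero.
        "MAPE": float(np.mean(np.abs(err / y_true))),
    }


def metrics_per_model_sketch(fc: pd.DataFrame) -> pd.DataFrame:
    """One row of metrics per model from a long table (model, y_true, y_pred)."""
    rows = [
        {"model": name, **forecast_metrics(g["y_true"], g["y_pred"])}
        for name, g in fc.groupby("model")
    ]
    return pd.DataFrame(rows)


# Illustrative data: model A misses two points, model B is perfect.
toy_fc = pd.DataFrame(
    {
        "model": ["A"] * 4 + ["B"] * 4,
        "y_true": [1.0, 2.0, 3.0, 5.0] * 2,
        "y_pred": [1.0, 2.0, 4.0, 3.0, 1.0, 2.0, 3.0, 5.0],
    }
)
toy_metrics = metrics_per_model_sketch(toy_fc)
print(toy_metrics.round(4))
```

Because every model is scored against the same actual values, the normalisation constant is the same for all of them, so nRMSE and RMSE rank the models identically.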

In [6]:
metrics_models = metrics_by_model(all_fc).round(4)
metrics_models

Unnamed: 0,model,MAE,RMSE,nRMSE,MAPE
0,Naive (last value),0.1319,0.3005,0.1624,0.1915
1,"SARIMA(1, 1, 1)(1, 1, 1, 24)",0.1949,0.2906,0.1571,0.4365
2,Seasonal naive (24h),0.1845,0.3835,0.2073,0.3473
3,XGBoost,0.1731,0.2733,0.1477,0.3884


## Save table

In [7]:
save_table(metrics_models, "ex9_table_model_comparison.csv")



## nRMSE bar chart
The bar chart compares the nRMSE of the four models.
A lower nRMSE indicates a more accurate forecast.

In [8]:
import plotly.graph_objects as go

fig_bar = go.Figure()
fig_bar.add_trace(go.Bar(
    x=metrics_models["model"],
    y=metrics_models["nRMSE"],
    marker_color=ENERGY_COLORS["grid"]
))

fig_bar.update_layout(
    title="Model comparison (7-day rolling forecast)",
    xaxis_title="Model",
    yaxis_title="nRMSE",
    **PLOT_STYLE
)

save_fig_plotly(fig_bar, "ex9_fig1_model_comparison.svg", 1100, 500)
fig_bar.show()


As the plot shows, XGBoost achieves the lowest nRMSE (0.1477), ahead of SARIMA (0.1571), the naive baseline (0.1624) and the seasonal naive baseline (0.2073).
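
The four metrics need not agree on a ranking, so it can be worth checking which model wins on each one. A minimal sketch, re-entering the values from the comparison table above (`metrics_table` is a hypothetical name used only here):

```python
import pandas as pd

# Values copied from the model comparison table above.
metrics_table = pd.DataFrame(
    {
        "model": [
            "Naive (last value)",
            "SARIMA(1, 1, 1)(1, 1, 1, 24)",
            "Seasonal naive (24h)",
            "XGBoost",
        ],
        "MAE": [0.1319, 0.1949, 0.1845, 0.1731],
        "RMSE": [0.3005, 0.2906, 0.3835, 0.2733],
        "nRMSE": [0.1624, 0.1571, 0.2073, 0.1477],
        "MAPE": [0.1915, 0.4365, 0.3473, 0.3884],
    }
).set_index("model")

# Model with the lowest value for each metric.
best_per_metric = metrics_table.idxmin()

# Rank of every model on every metric (1 = lowest error).
ranks = metrics_table.rank(axis=0, method="min").astype(int)

print(best_per_metric)
print(ranks)
```

XGBoost has the lowest RMSE and nRMSE, while the naive baseline has the lowest MAE and MAPE. This pattern suggests that the naive forecast is close most of the time but makes a few large errors, which the squared-error metrics penalise more heavily.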

## Best model forecast

Below we select the best model based on nRMSE and plot its full 7-day forecast together with the actual demand.

In [9]:
best_model = metrics_models.sort_values("nRMSE").iloc[0]["model"]
print("Best model:", best_model)

best_df = all_fc[ all_fc["model"] == best_model ].sort_values("timestamp")

fig_best = plot_forecast(
    timestamp=best_df["timestamp"],
    y_true=best_df["y_true"],
    y_pred=best_df["y_pred"],
    model_name=best_model,
    filename="ex9_fig2_best_model.svg"
)
fig_best.show()


Best model: XGBoost


## Individual model forecasts

Below we plot the full 7-day forecast for each model separately. This helps compare how well each method follows the actual demand across the forecast horizon.

In [10]:
for name in metrics_models["model"]:
    df_m = all_fc[ all_fc["model"] == name ].sort_values("timestamp")
    fig = plot_forecast(
        df_m["timestamp"],
        df_m["y_true"],
        df_m["y_pred"],
        model_name=name,
        filename=f"ex9_fig_{name.replace(' ','_').replace('(','').replace(')','')}.svg"
    )
    fig.show()

## Combined forecast comparison

This plot overlays the forecasts of all four models and the actual demand over the same 7-day window.

In [11]:
fig_all = go.Figure()

for model_name in metrics_models["model"]:
    df_m = all_fc[ all_fc["model"] == model_name ].sort_values("timestamp")
    fig_all.add_trace(go.Scatter(
        x=df_m["timestamp"],
        y=df_m["y_pred"],
        mode="lines",
        name=model_name,
        line=dict(width=1)
    ))

# Add actuals (y_true is repeated once per model, so keep one row per timestamp)
actual = all_fc.drop_duplicates("timestamp").sort_values("timestamp")
fig_all.add_trace(go.Scatter(
    x=actual["timestamp"],
    y=actual["y_true"],
    mode="lines",
    name="Actual",
    line=dict(width=2, color="black")
))

fig_all.update_layout(
    title="7-day Forecast Comparison – All Models",
    xaxis_title="Time",
    yaxis_title="Demand",
    **PLOT_STYLE
)

save_fig_plotly(fig_all, "ex9_fig3_all_models_overlay.svg", width=1200, height=600)
fig_all.show()

## Conclusion

In this section, a complete 7-day rolling forecasting pipeline was implemented using both statistical and machine learning models.
The results show clear differences in accuracy between the approaches. Ranked by nRMSE, XGBoost achieved the lowest error (0.1477), followed by SARIMA (0.1571) and the naive baseline (0.1624), while the seasonal naive baseline was clearly the weakest (0.2073). The ranking depends on the metric, however: the naive baseline had the lowest MAE (0.1319) and MAPE (0.1915) of the four models.

The combined comparison plot and the individual model forecasts indicate that XGBoost follows the real demand profile most closely, including daily cycles and larger fluctuations.
The pipeline is fully reproducible and will be used as the input for the optimisation tasks in the next chapter.