# Task 7 · Statistical Time-Series Models (ARIMA/SARIMA)

This notebook fits and evaluates ARIMA/SARIMA models for the household demand series, providing diagnostics, evaluation metrics, and artefacts for both the LaTeX report and the interactive dashboard.

In [1]:
from pathlib import Path
import sys

import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from IPython.display import display

# Project imports
# Add the project root to the system path to allow importing from 'src'
ROOT = Path.cwd().resolve()
if not (ROOT / "src").exists():
    ROOT = ROOT.parent
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))

# Import custom modules for modeling and plotting
from src.modeling_stats import (
    acf_pacf,
    stationarity_checks,
    fit_arima,
    forecast_arima,
    evaluate_forecast,
    walk_forward_daily,
)
from src.plotting import (
    plot_acf_pacf,
    plot_forecast_overlay,
    plot_walkforward_panels,
    plot_metrics_bar,
)

In [2]:
pd.options.display.max_rows = 12

# Define paths for input data and output artefacts
FIG_PATH = ROOT / "reports" / "figures"
TABLE_PATH = ROOT / "reports" / "tables"
DATA_PATH = ROOT / "data" / "raw" / "train_252145.csv"

# Ensure output directories exist
FIG_PATH.mkdir(parents=True, exist_ok=True)
TABLE_PATH.mkdir(parents=True, exist_ok=True)


def save_figure(fig: go.Figure, name: str, width: int = 1280, height: int = 720, scale: int = 2) -> None:
    """
    Save a Plotly figure to PNG and PDF formats.
    """
    png = FIG_PATH / f"{name}.png"
    pdf = FIG_PATH / f"{name}.pdf"
    fig.write_image(str(png), width=width, height=height, scale=scale)
    fig.write_image(str(pdf), width=width, height=height, scale=scale)


# Load and preprocess the dataset
df = pd.read_csv(DATA_PATH, parse_dates=["timestamp"]).sort_values("timestamp")
df["Demand"] = pd.to_numeric(df["Demand"], errors="coerce")
df = df.dropna(subset=["Demand"]).set_index("timestamp").sort_index()

# Resample to hourly cadence to guarantee regularity (fill gaps via interpolation)
hourly_demand = df["Demand"].resample("H").mean().interpolate(method="time", limit_direction="both").dropna()
demand_df = hourly_demand.reset_index()  # columns are already named "timestamp" and "Demand"

print(
    f"Demand sample: {demand_df['timestamp'].min()} → {demand_df['timestamp'].max()} | "
    f"Observations: {len(demand_df):,}"
)
display(demand_df.head())

Demand sample: 2013-07-01 00:00:00+00:00 → 2014-06-30 23:00:00+00:00 | Observations: 8,760


Unnamed: 0,timestamp,Demand
0,2013-07-01 00:00:00+00:00,0.27
1,2013-07-01 01:00:00+00:00,0.23
2,2013-07-01 02:00:00+00:00,0.26
3,2013-07-01 03:00:00+00:00,0.28
4,2013-07-01 04:00:00+00:00,0.29
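
The resample-then-interpolate step above can be illustrated on a toy series. This is a minimal sketch with made-up values, not the project data; passing `pd.Timedelta(hours=1)` as the resample rule is a choice made here to sidestep the `"H"`/`"h"` alias difference between pandas versions.

```python
import pandas as pd

# Toy readings with a two-hour gap (02:00 and 03:00 are missing).
raw = pd.Series(
    [1.0, 2.0, 5.0],
    index=pd.to_datetime(
        ["2013-07-01 00:00", "2013-07-01 01:00", "2013-07-01 04:00"], utc=True
    ),
)

# Resampling to an hourly grid exposes the gap as NaN rows.
hourly = raw.resample(pd.Timedelta(hours=1)).mean()

# Time-weighted linear interpolation fills the gap; on a regular grid
# this is ordinary linear interpolation between the neighbouring hours.
filled = hourly.interpolate(method="time", limit_direction="both")
print(filled)
```

Interpolation keeps the grid regular, which seasonal models with a fixed period rely on, but long gaps filled this way are smoother than real demand and can understate variance.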


## Stationarity diagnostics

In [3]:
# Perform stationarity checks (ADF, KPSS) and compute ACF/PACF
nlags = 72
acf_results = acf_pacf(hourly_demand, nlags=nlags)
stationarity_df = stationarity_checks(hourly_demand)

# Save stationarity results
stationarity_df.to_csv(TABLE_PATH / "stationarity_tests.csv", index=False)

# Plot ACF and PACF
fig_acf_pacf = plot_acf_pacf(
    acf_results["acf"],
    acf_results["pacf"],
    title="Demand ACF and PACF (Hourly)",
    style="academic",
)
save_figure(fig_acf_pacf, "stats_acf_pacf", width=1200, height=500)

# Display results
display(stationarity_df)
fig_acf_pacf

InterpolationWarning: The test statistic is outside of the range of p-values available in the look-up table. The actual p-value is smaller than the p-value returned.

  kpss_stat, kpss_p, kpss_lags, *_ = kpss(clean_series, regression="c", nlags="auto")


Support for Kaleido versions less than 1.0.0 is deprecated and will be removed after September 2025.
Please upgrade Kaleido to version 1.0.0 or greater (`pip install 'kaleido>=1.0.0'` or `pip install 'plotly[kaleido]'`).






Unnamed: 0,test,statistic,p_value,lag
0,ADF,-11.773439,1.076326e-21,30
1,KPSS,1.227158,0.01,32


In [4]:
stationarity_df

Unnamed: 0,test,statistic,p_value,lag
0,ADF,-11.773439,1.076326e-21,30
1,KPSS,1.227158,0.01,32
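
ADF and KPSS test opposite null hypotheses (unit root versus level stationarity), so the two p-values are usually read together. The helper below is a hypothetical sketch of one common rule of thumb for combining them; `joint_stationarity_verdict` and the 5% threshold are assumptions for illustration and are not part of `src.modeling_stats`.

```python
def joint_stationarity_verdict(adf_p: float, kpss_p: float, alpha: float = 0.05) -> str:
    """Combine ADF and KPSS p-values into a rough verdict (rule of thumb only)."""
    adf_rejects_unit_root = adf_p < alpha        # ADF null: a unit root is present
    kpss_rejects_stationarity = kpss_p < alpha   # KPSS null: the series is level stationary

    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "non-stationary"
    if adf_rejects_unit_root and kpss_rejects_stationarity:
        # The tests disagree; this case is commonly treated with differencing.
        return "difference-stationary: try differencing"
    # Neither null is rejected; this case is commonly treated by removing a trend.
    return "trend-stationary: try detrending"


# p-values taken from the stationarity table above
verdict = joint_stationarity_verdict(adf_p=1.076326e-21, kpss_p=0.01)
print(verdict)
```

With the values reported above, both nulls are rejected, which is in line with the candidate models below all using d = 1.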


## Candidate model definitions

We explore one non-seasonal ARIMA configuration and two seasonal SARIMA variants motivated by the daily cycle (s = 24). Seasonal differencing is set to D = 1 where required to address residual daily seasonality.
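
To make the role of d = 1 and D = 1 with s = 24 concrete, the sketch below applies both differencing operators to a synthetic series (a linear trend plus a 24-hour sine wave). The series and the helper name `difference` are assumptions chosen for illustration, not the project data or API.

```python
import numpy as np


def difference(y, lag: int = 1) -> np.ndarray:
    """Return y[t] - y[t - lag], dropping the first `lag` points."""
    y = np.asarray(y, dtype=float)
    return y[lag:] - y[:-lag]


# Synthetic hourly series: slow upward trend plus a daily cycle.
t = np.arange(24 * 10)
y = 0.01 * t + np.sin(2 * np.pi * t / 24)

# Seasonal difference (1 - B^24): removes the daily cycle and leaves
# a constant equal to 24 times the hourly trend slope.
seasonal_diff = difference(y, lag=24)

# Adding a first difference (1 - B) removes that remaining constant.
both_diff = difference(seasonal_diff, lag=1)

print(seasonal_diff[:3], both_diff[:3])
```

On real demand data the differenced series is of course not identically zero; the point is only that the seasonal operator targets the 24-hour pattern while the regular one targets slow level changes.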

In [5]:
# Define candidate ARIMA/SARIMA models
# We test a non-seasonal ARIMA and two seasonal SARIMA configurations
MODEL_CANDIDATES = [
    {
        "name": "ARIMA(2,1,2)",
        "order": (2, 1, 2),
        "seasonal_order": (0, 0, 0, 0),
    },
    {
        "name": "SARIMA(1,1,1)(1,1,1,24)",
        "order": (1, 1, 1),
        "seasonal_order": (1, 1, 1, 24),
    },
    {
        "name": "SARIMA(2,1,1)(0,1,1,24)",
        "order": (2, 1, 1),
        "seasonal_order": (0, 1, 1, 24),
    },
]

# Define forecast parameters
forecast_horizon = 24
validation_window_hours = 24

# Define the train/validation split: the cutoff sits 7 days before the last timestamp
# (the walk-forward week), but this initial check scores only the first 24 h after the cutoff
validation_cutoff = demand_df["timestamp"].max() - pd.Timedelta(days=7)
train_mask = demand_df["timestamp"] < validation_cutoff
val_mask = (demand_df["timestamp"] >= validation_cutoff) & (
    demand_df["timestamp"] < validation_cutoff + pd.Timedelta(hours=validation_window_hours)
)

train_series = demand_df.loc[train_mask].set_index("timestamp")["Demand"]
validation_df = demand_df.loc[val_mask].copy()

print(f"Training samples: {len(train_series):,}; validation horizon: {len(validation_df)}")

Training samples: 8,591; validation horizon: 24
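
The sample counts printed above can be reproduced from the timestamps alone. The sketch below rebuilds the hourly index reported earlier (2013-07-01 00:00 to 2014-06-30 23:00 UTC, 8,760 points) and applies the same masks; it uses a synthetic index rather than `demand_df`.

```python
import pandas as pd

# Hourly UTC index matching the span reported for the demand sample.
timestamps = pd.date_range(
    "2013-07-01 00:00", periods=8760, freq=pd.Timedelta(hours=1), tz="UTC"
)

cutoff = timestamps.max() - pd.Timedelta(days=7)
train = timestamps[timestamps < cutoff]
val = timestamps[(timestamps >= cutoff) & (timestamps < cutoff + pd.Timedelta(hours=24))]

print(len(train), len(val), val[0], val[-1])
```

Because the cutoff is derived from the last timestamp (23:00), the 24-hour validation window runs from 23:00 to 22:00 rather than midnight to midnight, which is worth keeping in mind when comparing against calendar-day forecasts.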


## Whole-training split evaluation

In [6]:
single_split_records = []
single_split_predictions = []

# Iterate over each candidate model
for candidate in MODEL_CANDIDATES:
    print(f"Fitting {candidate['name']}...")
    
    # Fit the model on the training series
    result = fit_arima(train_series, order=candidate["order"], seasonal_order=candidate["seasonal_order"])
    if result is None:
        print(f"Failed to fit {candidate['name']}")
        single_split_records.append({"model_name": candidate["name"], "MAE": np.nan, "RMSE": np.nan, "nRMSE": np.nan})
        continue

    # Generate forecast for the validation window
    forecast_index = validation_df["timestamp"].iloc[:forecast_horizon]
    forecast = forecast_arima(result, horizon=forecast_horizon, index=forecast_index)
    
    # Evaluate performance
    metrics = evaluate_forecast(validation_df["Demand"].iloc[:forecast_horizon], forecast.values)
    single_split_records.append({"model_name": candidate["name"], **metrics})

    # Store predictions for visualization
    pred_df = pd.DataFrame(
        {
            "timestamp": forecast_index,
            "y_true": validation_df["Demand"].iloc[:forecast_horizon].values,
            "y_pred": forecast.values,
            "model_name": candidate["name"],
        }
    )
    single_split_predictions.append(pred_df)

# Compile metrics into a DataFrame
single_split_metrics = pd.DataFrame(single_split_records)
single_split_metrics["evaluation"] = "Whole-train split"

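# Save results (guard against an empty list if every fit failed, since pd.concat([]) raises)
if single_split_predictions:
    single_split_predictions_df = pd.concat(single_split_predictions, ignore_index=True)
else:
    single_split_predictions_df = pd.DataFrame(columns=["timestamp", "y_true", "y_pred", "model_name"])
single_split_predictions_df.to_csv(TABLE_PATH / "stats_single_split_predictions.csv", index=False)
single_split_metrics.to_csv(TABLE_PATH / "model_candidates_metrics.csv", index=False)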

# Display metrics sorted by nRMSE (normalized RMSE)
display(single_split_metrics.sort_values("nRMSE"))

Fitting ARIMA(2,1,2)...



No frequency information was provided, so inferred frequency H will be used.



Fitting SARIMA(1,1,1)(1,1,1,24)...




Fitting SARIMA(2,1,1)(0,1,1,24)...




Unnamed: 0,model_name,MAE,RMSE,nRMSE,evaluation
2,"SARIMA(2,1,1)(0,1,1,24)",0.139409,0.2332,0.197627,Whole-train split
1,"SARIMA(1,1,1)(1,1,1,24)",0.139234,0.234366,0.198615,Whole-train split
0,"ARIMA(2,1,2)",0.219628,0.291779,0.247271,Whole-train split


**Metric definition.** We report MAE, RMSE, and normalized RMSE, where
\(\text{nRMSE} = \text{RMSE} / (\max(y) - \min(y))\). This scale-invariant metric follows the specification in Task 7.
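
The three metrics can be computed in a few lines. The function below is a sketch of the definition above, not the project's `evaluate_forecast`; the name `evaluate_forecast_sketch` and the assumption that the range is taken over the observed values in the evaluation window are mine.

```python
import numpy as np


def evaluate_forecast_sketch(y_true, y_pred) -> dict:
    """Compute MAE, RMSE, and range-normalized RMSE for one forecast window."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred

    rmse = float(np.sqrt(np.mean(err ** 2)))
    value_range = float(y_true.max() - y_true.min())  # assumed: range of the observed values
    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": rmse,
        # A flat window has zero range, so nRMSE is undefined there.
        "nRMSE": rmse / value_range if value_range > 0 else float("nan"),
    }


demo_metrics = evaluate_forecast_sketch([1.0, 2.0, 3.0, 5.0], [1.0, 2.0, 4.0, 3.0])
print(demo_metrics)
```

In the toy call the errors are 0, 0, -1, and 2, giving MAE = 0.75, RMSE = sqrt(1.25), and a range of 4 for the normalization.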

## Last-week daily walk-forward

In [7]:
walkforward_predictions_list = []
walkforward_metrics_list = []

# Perform walk-forward validation for each candidate
# This simulates a real-world scenario where the model is retrained daily
for candidate in MODEL_CANDIDATES:
    print(f"Running walk-forward validation for {candidate['name']}...")
    wf_pred, wf_metrics = walk_forward_daily(
        demand_df,
        target="Demand",
        days=7,  # Validate over the last 7 days
        horizon=forecast_horizon,
        order=candidate["order"],
        seasonal_order=candidate["seasonal_order"],
    )
    if wf_pred.empty or wf_metrics.empty:
        continue
    wf_pred["model_name"] = candidate["name"]
    wf_metrics["model_name"] = candidate["name"]
    walkforward_predictions_list.append(wf_pred)
    walkforward_metrics_list.append(wf_metrics)

# Concatenate results
if walkforward_predictions_list:
    walkforward_predictions_df = pd.concat(walkforward_predictions_list, ignore_index=True)
else:
    walkforward_predictions_df = pd.DataFrame(columns=["day_idx", "timestamp", "y_true", "y_pred", "model_name"])

if walkforward_metrics_list:
    walkforward_metrics_df = pd.concat(walkforward_metrics_list, ignore_index=True)
else:
    walkforward_metrics_df = pd.DataFrame(columns=["day_idx", "MAE", "RMSE", "nRMSE", "model_name"])

# Save walk-forward results
walkforward_predictions_df.to_csv(TABLE_PATH / "walkforward_predictions.csv", index=False)
walkforward_metrics_df.to_csv(TABLE_PATH / "walkforward_per_day_metrics.csv", index=False)

# Display the first few rows of the metrics
display(walkforward_metrics_df.head())

Running walk-forward validation for ARIMA(2,1,2)...




Running walk-forward validation for SARIMA(1,1,1)(1,1,1,24)...




Running walk-forward validation for SARIMA(2,1,1)(0,1,1,24)...




Unnamed: 0,day_idx,MAE,RMSE,nRMSE,model_name
0,1,0.219614,0.292238,0.247659,"ARIMA(2,1,2)"
1,2,0.198862,0.247114,0.268602,"ARIMA(2,1,2)"
2,3,0.204163,0.240069,0.300086,"ARIMA(2,1,2)"
3,4,0.394966,0.655675,0.302154,"ARIMA(2,1,2)"
4,5,0.182884,0.205684,0.380896,"ARIMA(2,1,2)"
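
`walk_forward_daily` is defined in `src.modeling_stats` and not shown here. The sketch below illustrates the expanding-window idea only: for each of the last 7 days, everything before that day forms the history and the next 24 hours are scored. A seasonal-naive forecaster (repeat the previous day) stands in for the SARIMA refit so the example stays dependency-light; the function names and the synthetic series are assumptions. In the per-day table above the RMSE/nRMSE ratio changes from day to day (about 1.18 on day 1 and 0.92 on day 2), which is consistent with normalizing by each window's own range, so the sketch does the same; that is an inference, not something confirmed from the helper's source.

```python
import numpy as np
import pandas as pd


def seasonal_naive(history: pd.Series, horizon: int, season: int = 24) -> np.ndarray:
    """Repeat the last full season of `history` to cover `horizon` steps."""
    last_season = history.to_numpy()[-season:]
    reps = int(np.ceil(horizon / season))
    return np.tile(last_season, reps)[:horizon]


def walk_forward_sketch(series: pd.Series, days: int = 7, horizon: int = 24) -> pd.DataFrame:
    """Score `days` consecutive blocks of `horizon` steps at the end of `series`."""
    rows = []
    n = len(series)
    for day_idx in range(1, days + 1):
        # Block `day_idx` starts (days - day_idx + 1) * horizon steps before the end.
        start = n - (days - day_idx + 1) * horizon
        history = series.iloc[:start]                 # data strictly before the block
        actual = series.iloc[start:start + horizon].to_numpy()
        pred = seasonal_naive(history, horizon)       # stand-in for refit + forecast
        err = actual - pred
        rmse = float(np.sqrt(np.mean(err ** 2)))
        value_range = float(actual.max() - actual.min())
        rows.append({
            "day_idx": day_idx,
            "MAE": float(np.mean(np.abs(err))),
            "RMSE": rmse,
            "nRMSE": rmse / value_range if value_range > 0 else np.nan,
        })
    return pd.DataFrame(rows)


# Synthetic, perfectly periodic series: the seasonal-naive error should vanish.
hours = np.arange(24 * 30)
demo = pd.Series(1.0 + np.sin(2 * np.pi * hours / 24))
wf_demo = walk_forward_sketch(demo)
print(wf_demo)
```

A seasonal-naive baseline like this is also a useful sanity check: a SARIMA candidate that cannot beat "same hour yesterday" over the final week is probably not worth the refitting cost.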


## Visual diagnostics

In [8]:
# Select the best model based on lowest nRMSE from the whole-train split
best_model_name = single_split_metrics.sort_values("nRMSE").iloc[0]["model_name"]
print(f"Best model (whole-train nRMSE): {best_model_name}")

best_single_split = single_split_predictions_df[single_split_predictions_df["model_name"] == best_model_name]
fig_forecast = plot_forecast_overlay(
    best_single_split,
    title=f"Validation forecast overlay – {best_model_name}",
    style="academic",
)
save_figure(fig_forecast, "stats_forecast_overlay_best", width=1100, height=600)
fig_forecast


Best model (whole-train nRMSE): SARIMA(2,1,1)(0,1,1,24)





In [9]:
best_wf = walkforward_predictions_df[walkforward_predictions_df["model_name"] == best_model_name]
fig_walkforward = plot_walkforward_panels(best_wf, style="academic")
save_figure(fig_walkforward, "stats_walkforward_panels", width=1400, height=900)
fig_walkforward





In [10]:
# Merge metrics for bar chart (whole split + walk-forward mean)
if not walkforward_metrics_df.empty:
    walkforward_summary = (
        walkforward_metrics_df.groupby("model_name")[["MAE", "RMSE", "nRMSE"]]
        .mean()
        .reset_index()
    )
    walkforward_summary["evaluation"] = "Walk-forward mean"
else:
    walkforward_summary = pd.DataFrame(columns=["model_name", "MAE", "RMSE", "nRMSE", "evaluation"])

whole_split_metrics = single_split_metrics[["model_name", "MAE", "RMSE", "nRMSE"]].copy()
whole_split_metrics["evaluation"] = "Whole-train split"

metrics_long = pd.concat(
    [whole_split_metrics, walkforward_summary],
    ignore_index=True,
)

melted_metrics = metrics_long.melt(
    id_vars=["model_name", "evaluation"],
    value_vars=["MAE", "RMSE", "nRMSE"],
    var_name="metric",
    value_name="value",
)
fig_metrics = plot_metrics_bar(melted_metrics, style="academic")
save_figure(fig_metrics, "stats_metrics_bar", width=1100, height=600)
fig_metrics





## Export metrics tables

In [11]:
single_split_metrics.to_csv(TABLE_PATH / "model_candidates_metrics.csv", index=False)
walkforward_metrics_df.to_csv(TABLE_PATH / "walkforward_per_day_metrics.csv", index=False)
walkforward_summary.to_csv(TABLE_PATH / "walkforward_metrics_summary.csv", index=False)

print("Saved stationarity tests, candidate metrics, walk-forward metrics, predictions, and summaries.")


Saved stationarity tests, candidate metrics, walk-forward metrics, predictions, and summaries.


## Interpretation

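- **Preferred model.** SARIMA(2,1,1)(0,1,1,24) yields the lowest normalized RMSE on the whole-training split (0.198, against 0.199 for SARIMA(1,1,1)(1,1,1,24) and 0.247 for ARIMA(2,1,2)) and also sustains strong walk-forward accuracy, indicating robust short-term forecasting capability. The two seasonal models are nearly tied on the single 24-hour window, so the walk-forward means are the more reliable basis for ranking them.
- **Diagnostics.** ADF rejects a unit root (statistic -11.77, p of about 1.1e-21) while KPSS rejects level stationarity (statistic 1.23, p at or below 0.01), a combination usually read as a case for differencing. The ACF/PACF plots of the hourly series show damped autocorrelation with residual daily periodicity, motivating seasonal terms at s = 24.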
- **Operational insight.** Reliable 24-hour demand forecasts enable proactive battery scheduling and tariff-aware load shifting, particularly when combined with PV generation predictions in subsequent tasks.
