# Process Notebook - Short-Term Electricity Demand Forecasting

## Overview and motivation
This project builds a reliable short-term electricity demand forecast to support operational planning, peak management, and grid stability. We aimed to compare classical time-series modeling with deep learning and ensemble strategies, using a consistent evaluation period and clear error metrics.


## Related work
- The Open Power System Data (OPSD) time series dataset provides publicly available demand data for Europe and is widely used in load-forecasting studies.
- Baseline statistical models such as SARIMA are common for seasonal load series.
- Deep learning models like LSTM are widely used for capturing non-linear temporal dynamics.
- Ensemble and hybrid strategies are frequently reported to improve robustness by combining model strengths.


## Initial questions
**What questions are we trying to answer?**
- How well can we forecast hourly electricity demand with classical vs. deep learning methods?
- Do hybrid or ensemble models improve accuracy over single models?
- Which model handles peak periods most reliably?

**How did the questions evolve?**
- After initial EDA, we focused more on peak-error behavior and seasonal structure.
- We added ensemble variants to reduce error variance across different regimes.

**New questions considered**
- Can switching rules or residual correction outperform fixed-weight ensembles?
- Which errors matter most for operational decisions (median error vs. tail error)?



## Data
- **Source:** OPSD time-series dataset (`opsd-time_series-2020-10-06`).
- **Granularity:** hourly demand series, cleaned and resampled to a uniform 60-minute frequency.
- **Storage:** processed data saved under `scripts/processed_data_60min/` and `saved_model_outputs/preprocessing_outputs/`.

**Preprocessing Steps**
- Parsed timestamps, sorted by time, and aligned to a uniform hourly index.
- Removed duplicates, handled missing values, and filtered invalid records.
- Resampled to 60-minute intervals to match the hourly forecasting resolution.
- Split the data chronologically: 2015-2017 for training and validation, 2018 for testing.
- Saved processed datasets and preprocessing reports for reproducibility.
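
The cleaning and splitting steps above can be sketched with pandas. This is a minimal illustration, not the project's actual script: the column names `timestamp` and `load`, the "non-positive values are invalid" rule, and the three-hour interpolation limit are all assumptions made for the example.

```python
import pandas as pd


def clean_hourly(raw: pd.DataFrame, ts_col: str = "timestamp",
                 y_col: str = "load") -> pd.Series:
    """Return a sorted, de-duplicated, gap-filled hourly series.

    Column names and cleaning thresholds are hypothetical.
    """
    df = raw.copy()
    df[ts_col] = pd.to_datetime(df[ts_col], utc=True)
    # Keep the first record for each timestamp, then order by time.
    df = df.drop_duplicates(subset=ts_col, keep="first").sort_values(ts_col)
    series = df.set_index(ts_col)[y_col].astype(float)
    # Assumption: non-positive demand is an invalid record.
    series = series.where(series > 0)
    # Align to a uniform 60-minute index; missing hours become NaN.
    series = series.resample("60min").mean()
    # Fill at most 3 consecutive missing hours by time-weighted interpolation.
    return series.interpolate(method="time", limit=3)


def split_by_year(series: pd.Series):
    """Chronological split: 2015-2017 for train/validation, 2018 for test."""
    return series.loc["2015":"2017"], series.loc["2018"]


# Tiny demo frame: unsorted, one duplicate timestamp, one missing hour (02:00).
raw = pd.DataFrame({
    "timestamp": ["2017-12-31 22:00", "2017-12-31 23:00", "2017-12-31 23:00",
                  "2018-01-01 01:00", "2018-01-01 00:00", "2018-01-01 03:00"],
    "load": [100.0, 110.0, 999.0, 130.0, 120.0, 150.0],
})
hourly = clean_hourly(raw)
train_val, test = split_by_year(hourly)
```

Splitting by calendar year rather than at random keeps every test observation strictly after the training data, which avoids leaking future information into the fit.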



## Exploratory Data Analysis (EDA)
We explored trend, seasonality, stationarity, peaks, and correlation structure.

**EDA Steps**
- Step 1: Basic data checks and missing value scan.
- Step 1.1-1.2: Coverage and gap analysis before/after gap filling.
- Step 2: Target variable distribution and basic statistics.
- Step 3: Time series visualization for trend and regime shifts.
- Step 4: Seasonality patterns (daily/weekly cycles).
- Step 5: Stationarity checks to guide SARIMA differencing.
- Step 6: Autocorrelation analysis (ACF/PACF).
- Step 7: Calendar effects (weekday/weekend, holidays).
- Step 8: Weather variable exploration.
- Step 9: Lag relationships with demand.
- Step 10: Peak load analysis.
- Step 11: Error sensitivity by regime.
- Step 12: Multicollinearity checks.
- Step 13: Summary and modeling decisions.
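
As an illustration of the seasonality and autocorrelation steps, the sketch below computes the sample autocorrelation at the daily (24 h) and weekly (168 h) lags. The series is synthetic and its amplitudes are made up for the example; the project itself would run this on the processed OPSD demand series.

```python
import numpy as np


def autocorr(x, lag: int) -> float:
    """Sample autocorrelation of x at a positive integer lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))


# Synthetic demand: 8 weeks of hourly data with a daily and a weekly cycle.
hours = np.arange(24 * 7 * 8)
demand = (50_000
          + 8_000 * np.sin(2 * np.pi * hours / 24)
          + 3_000 * np.sin(2 * np.pi * hours / 168))

acf_12 = autocorr(demand, 12)    # half a day: cycles are out of phase
acf_24 = autocorr(demand, 24)    # one day: daily cycle realigns
acf_168 = autocorr(demand, 168)  # one week: both cycles realign
```

Strong positive values at lags 24 and 168, with a negative value at lag 12, are the kind of pattern that points to seasonal terms in a SARIMA specification.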



## Methods considered and design decisions
**Models explored**
- **SARIMA** for seasonal baselines using ACF/PACF and differencing diagnostics.
- **LSTM** to capture nonlinear dependencies and long-range temporal patterns.
- **Hybrid (SARIMA residual + LSTM)** to learn residual structure after seasonal modeling.
- **Ensembles** (mean/median, ridge stacking, NNLS, residual-switching) to combine model strengths.
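
To make the ensemble idea concrete, here is a minimal sketch of the fixed combinations (mean and median) plus one possible switching rule. The switch shown is hypothetical: it uses an alternative forecast wherever a reference forecast exceeds a chosen quantile, and it is not claimed to be the rule behind the project's switch-based variants.

```python
import numpy as np


def combine_mean(preds):
    """Pointwise mean of several forecast arrays of equal length."""
    return np.mean(np.vstack(preds), axis=0)


def combine_median(preds):
    """Pointwise median of several forecast arrays of equal length."""
    return np.median(np.vstack(preds), axis=0)


def combine_switch(base, alt, reference, quantile: float):
    """Hypothetical switch: use `alt` where `reference` is above its own
    `quantile`, otherwise use `base`."""
    base, alt, reference = (np.asarray(a, dtype=float)
                            for a in (base, alt, reference))
    threshold = np.quantile(reference, quantile)
    return np.where(reference > threshold, alt, base)


# Made-up forecasts from three models over four hours.
p1 = np.array([100.0, 200.0, 300.0, 400.0])
p2 = np.array([110.0, 190.0, 330.0, 420.0])
p3 = np.array([90.0, 240.0, 300.0, 500.0])

mean_pred = combine_mean([p1, p2, p3])
median_pred = combine_median([p1, p2, p3])
switch_pred = combine_switch(base=p1, alt=p2, reference=p1, quantile=0.5)
```

The median is less sensitive than the mean to a single model going far off, which matters when one component fits much worse than the others.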

**Why these choices?**
- Strong daily/weekly seasonality suggested a SARIMA baseline.
- Nonlinear effects and regime shifts motivated LSTM.
- Ensemble methods were tested to reduce variance and improve peak accuracy.

**Major changes in ideas**
- The hybrid model underperformed, so we emphasized ensembles and switch-based variants.
- Evaluation shifted to include tail errors (AE_P90/P95) to capture peak risk.
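
A short sketch of the tail-error metrics, assuming that `AE_P90` and `AE_P95` denote the 90th and 95th percentiles of the absolute error (the notebook does not spell out the definition, so treat this reading as an assumption):

```python
import numpy as np


def tail_errors(y_true, y_pred, percentiles=(90, 95)):
    """Percentiles of the absolute forecast error, keyed as AE_P<percentile>."""
    abs_err = np.abs(np.asarray(y_true, dtype=float)
                     - np.asarray(y_pred, dtype=float))
    return {f"AE_P{p}": float(np.percentile(abs_err, p)) for p in percentiles}


# Made-up example: absolute errors are exactly 0, 1, ..., 100.
y_true = np.arange(101, dtype=float)
y_pred = np.zeros(101)
tails = tail_errors(y_true, y_pred)
```

Unlike MAE, these percentiles are driven by the largest errors, so two models with similar MAE can still differ clearly in how badly they miss during peaks.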



## Model results (test period)
Key metrics from saved outputs:

| Model | MAE | RMSE | R2 |
|---|---:|---:|---:|
| SARIMA | 11509.05 | 13345.42 | -0.84 |
| LSTM | 764.96 | 989.54 | 0.9898 |
| Hybrid | 16475.40 | 18460.52 | -2.53 |
| **Best ensemble (final_switch_tuned_q083)** | **577.14** | **740.36** | **0.995** |

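The switch-based ensemble is the most accurate model on the test period, followed by the LSTM. SARIMA and the hybrid model both have negative R2, meaning they fit worse than a constant prediction at the test-period mean in this configuration.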
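
For reference, the three metrics in the table can be computed as in the sketch below, using the standard definitions (R2 as one minus the ratio of residual to total sum of squares). The toy numbers are made up and only show how a systematically biased forecast produces a negative R2.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R2 for two equal-length arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "R2": 1.0 - ss_res / ss_tot,
    }


y_true = np.array([50.0, 60.0, 70.0, 60.0])
# Small alternating errors: R2 stays close to 1.
close = regression_metrics(y_true, y_true + np.array([1.0, -1.0, 1.0, -1.0]))
# Constant offset larger than the spread of the data: R2 drops below 0.
biased = regression_metrics(y_true, y_true + 20.0)
```

R2 falls below zero as soon as the squared errors exceed the variance of the actuals, so a forecast with a large level offset can score negative even when it tracks the shape of the series.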



## Model visualizations

**SARIMA forecasts**

![SARIMA forecasts](saved_model_outputs/sarima_outputs/10_forecasts.png)

**LSTM predictions**

![LSTM predictions](saved_model_outputs/lstm_outputs/predictions_visualization.png)

**Hybrid predictions**

![Hybrid predictions](saved_model_outputs/hybrid_outputs/predictions_visualization_hybrid.png)

**Ensemble predictions**

![Ensemble predictions](saved_model_outputs/ensemble_outputs/01_actual_vs_predictions_full.png)



## Final analysis
**What did we learn about the data?**
- The series shows strong daily/weekly seasonality and clear peak regimes.
- Peaks are harder to predict and dominate tail-error behavior.

**How did we answer the questions?**
- We compared SARIMA, LSTM, hybrid, and multiple ensembles on the same test period.
- The switch-based ensemble (`final_switch_tuned_q083`) achieved the best MAE/RMSE/R2.

**How can we justify these answers?**
- Consistent evaluation metrics across models.
- Visual comparison of predictions vs. actuals.
- Tail-error focus (AE_P90/P95) to assess peak robustness.

**Next questions / future work**
- Incorporate exogenous features (weather/holidays) to improve peak prediction.
- Evaluate the models with rolling-origin backtesting to assess long-term robustness.
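
Rolling-origin backtesting could be set up along these lines. This is a sketch with an expanding training window; the function name and its parameters (`initial_train`, `horizon`, `step`) are hypothetical choices for the example, not part of the existing code.

```python
from typing import Iterator, Tuple


def rolling_origin_splits(n_obs: int, initial_train: int, horizon: int,
                          step: int) -> Iterator[Tuple[range, range]]:
    """Yield (train_idx, test_idx) index ranges with an expanding window.

    Each fold trains on everything before the forecast origin and tests on
    the next `horizon` observations; the origin then advances by `step`.
    """
    if min(initial_train, horizon, step) <= 0:
        raise ValueError("initial_train, horizon, and step must be positive")
    origin = initial_train
    while origin + horizon <= n_obs:
        yield range(0, origin), range(origin, origin + horizon)
        origin += step


# Toy example: 10 observations, start with 4, forecast 2 ahead, move by 2.
folds = list(rolling_origin_splits(n_obs=10, initial_train=4, horizon=2, step=2))
```

Averaging each metric over the folds gives an estimate that depends less on a single test year than the current 2018 holdout does.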

