Machine Learning solutions for multivariate time-series in the Spanish Electrical Grid.
- Josu Viteri
- Gotzon Viteri
- Iker Dominguez
This research project explores advanced machine-learning solutions for multivariate time-series forecasting in energy markets. As electrical infrastructure shifts toward renewable generation, market prices become more volatile, which makes accurate forecasting of market behavior increasingly important.
Using a comprehensive 4-year dataset of Spanish electrical consumption, generation, pricing, and weather data, this study evaluates the impact of weather-driven renewable generation on Day-Ahead Market (DAM) prices.
- Architectural Evolution: Evaluating the transition from traditional Autoregressive (AR) models and Tree-based ensemble methods to state-of-the-art Temporal Fusion Transformers (TFT).
- Extreme Event Mitigation: Addressing the distributional imbalance of price spikes through SMOGN oversampling (synthetic samples generated for the rare, extreme target values) and Quantile Regression.
- Interpretability: Moving beyond "black-box" models by integrating SHAP values and Attention Maps to provide explainable insights for grid operations.
- Robustness: Implementing Walk-Forward Validation to ensure model reliability in real-world scenarios.
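The walk-forward validation mentioned in the list above can be illustrated with a small index splitter. This is a minimal sketch rather than the project's actual implementation: the function name `walk_forward_splits` and its parameters (`initial_train`, `horizon`, `step`, `expanding`) are hypothetical. Each fold trains only on observations that precede its test window, so no future information leaks into training.

```python
from typing import Iterator, Optional, Tuple


def walk_forward_splits(
    n_samples: int,
    initial_train: int,
    horizon: int,
    step: Optional[int] = None,
    expanding: bool = True,
) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) pairs in chronological order.

    All names here are illustrative assumptions, not the project's API.
    expanding=True grows the training window at every fold;
    expanding=False slides a fixed-size window of length `initial_train`.
    """
    if initial_train <= 0 or horizon <= 0:
        raise ValueError("initial_train and horizon must be positive")
    step = horizon if step is None else step
    if step <= 0:
        raise ValueError("step must be positive")

    train_end = initial_train
    # Stop as soon as a full test window no longer fits in the series.
    while train_end + horizon <= n_samples:
        train_start = 0 if expanding else train_end - initial_train
        yield range(train_start, train_end), range(train_end, train_end + horizon)
        train_end += step


if __name__ == "__main__":
    # Toy example: 10 observations, 4 for the first fit, 2-step test windows.
    for train_idx, test_idx in walk_forward_splits(10, initial_train=4, horizon=2):
        print(list(train_idx), "->", list(test_idx))
```

For hourly data, a 24-step `horizon` would correspond to one day-ahead test window per fold. Whether an expanding or a sliding window suits the data better depends on how quickly market conditions drift.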
- Data Sources: Spanish electrical grid (consumption/generation/pricing) + Meteorological data.
- Core Models: SARIMAX, XGBoost, LightGBM, LSTM, Temporal Fusion Transformers (TFT), Chronos, TimesFM.
- Handling Imbalance: Cost-sensitive learning with custom weighted MSE loss.
- Uncertainty Quantification: Quantile Regression for probabilistic forecasting.
- Explainable AI (XAI): SHAP interpretability on tabular models.
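To make the cost-sensitive idea in the list above concrete, here is a minimal sketch of a weighted MSE in plain Python. The project's actual loss is not shown in this README, so the weighting scheme is an assumption: the helpers `spike_weights` and `weighted_mse`, the spike `threshold`, and the `spike_weight` factor are all hypothetical.

```python
from typing import List, Sequence


def spike_weights(
    y_true: Sequence[float], threshold: float, spike_weight: float = 5.0
) -> List[float]:
    """Assign a larger weight to observations at or above `threshold`.

    The threshold rule and the default factor are illustrative assumptions.
    """
    return [spike_weight if y >= threshold else 1.0 for y in y_true]


def weighted_mse(
    y_true: Sequence[float], y_pred: Sequence[float], weights: Sequence[float]
) -> float:
    """Weighted mean squared error, normalised by the sum of the weights."""
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("inputs must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    squared = (w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, weights))
    return sum(squared) / total_weight


if __name__ == "__main__":
    prices = [50.0, 60.0, 200.0]      # the last value plays the role of a spike
    forecast = [55.0, 60.0, 150.0]    # the spike is badly under-predicted
    w = spike_weights(prices, threshold=150.0, spike_weight=4.0)
    print(weighted_mse(prices, forecast, w))
    print(weighted_mse(prices, forecast, [1.0] * len(prices)))
```

With these weights, the under-predicted spike dominates the loss far more than it does under a plain MSE. For tree-based models, a similar effect can often be obtained by passing such weights as `sample_weight` to `fit` in the scikit-learn wrappers of XGBoost and LightGBM.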
The notebooks are expected to be read in the following order:
| # | Notebook | Focus | Key Outputs |
|---|---|---|---|
| 00 | Baselines | EDA & Feature Extraction | Dataset insights, seasonal patterns, correlation analysis, engineered features (calendar, lags, renewable proxies) |
| 01 | SARIMAX | Traditional Statistical Baseline | SARIMAX model optimization and forecasting with fixed-horizon and walk-forward validation |
| 02 | Tabular ML & Deep Learning | XGBoost, LightGBM, Weighted Ensemble and LSTM with Optuna Tuning | Cost-sensitive learning, SHAP interpretability |
| 03 | SOTA (State-of-the-Art) | Foundation & Transformer Models | Chronos (zero-shot, Amazon T5-based), TimesFM (zero-shot, Google decoder-only), TFT (trained supervised transformer with quantile forecasting) |
Each notebook builds upon insights from the previous ones, progressively moving from exploratory analysis → traditional methods → modern ML → cutting-edge transformers.
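As an illustration of the calendar and lag features listed for notebook 00, the sketch below builds a small feature table from an hourly price series using only the standard library. The function `build_features`, the column names, and the chosen lags (24 and 168 hours) are assumptions for illustration. The notebooks may construct these features differently.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Sequence


def build_features(
    timestamps: Sequence[datetime],
    prices: Sequence[float],
    lags: Sequence[int] = (24, 168),
) -> List[Dict[str, float]]:
    """Return one feature row per hour that has all requested lags available.

    Hypothetical helper: column names and default lags are illustrative.
    """
    if len(timestamps) != len(prices):
        raise ValueError("timestamps and prices must have the same length")
    max_lag = max(lags)
    rows: List[Dict[str, float]] = []
    # Skip the first `max_lag` hours, which lack a complete lag history.
    for i in range(max_lag, len(prices)):
        ts = timestamps[i]
        row: Dict[str, float] = {
            "hour": ts.hour,
            "dayofweek": ts.weekday(),            # Monday = 0
            "is_weekend": int(ts.weekday() >= 5),
            "target": prices[i],
        }
        for lag in lags:
            row[f"lag_{lag}"] = prices[i - lag]   # value observed `lag` hours earlier
        rows.append(row)
    return rows


if __name__ == "__main__":
    start = datetime(2024, 1, 1)                  # a Monday
    hours = [start + timedelta(hours=h) for h in range(200)]
    series = [float(h) for h in range(200)]       # synthetic stand-in for prices
    features = build_features(hours, series)
    print(len(features), features[0])
```

Only past values feed the lag columns, which keeps the features usable at prediction time for a day-ahead forecast as long as the shortest lag is at least as long as the forecast horizon.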
This research delivers:
- Comprehensive Baseline Comparisons: From the individual components of classical approaches (MA and AR) and the general ARIMA-based modeling procedure to state-of-the-art transformer architectures.
- Weather-Driven Insights: Integration of meteorological features to improve renewable energy forecasting accuracy.
- Uncertainty Quantification: Probabilistic forecasts (quantile-based) rather than point estimates for operational risk management.
- Interpretability: SHAP-based feature importance and attention mechanism visualization for explainable AI in energy markets.
- Robustness: Walk-forward validation and multi-horizon evaluation (daily 7-day forecasts and hourly 24-hour forecasts).
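The quantile-based forecasts listed above are typically trained and scored with the pinball (quantile) loss. The sketch below shows that loss and an empirical coverage check in plain Python. The function names `pinball_loss` and `interval_coverage` are hypothetical, and the numbers in the example are made up for illustration, not project results.

```python
from typing import Sequence


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], q: float) -> float:
    """Mean pinball loss for quantile level q, with 0 < q < 1.

    Under-prediction is penalised by q and over-prediction by (1 - q),
    so a high quantile such as 0.9 is pushed above most observations.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def interval_coverage(
    y_true: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> float:
    """Fraction of observations that fall inside [lower, upper]."""
    if not (len(y_true) == len(lower) == len(upper)) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, lo, hi in zip(y_true, lower, upper) if lo <= t <= hi)
    return hits / len(y_true)


if __name__ == "__main__":
    actual = [100.0, 120.0, 90.0, 300.0]          # illustrative values only
    q10 = [80.0, 100.0, 70.0, 110.0]
    q90 = [130.0, 150.0, 120.0, 200.0]
    print(pinball_loss(actual, q90, q=0.9))
    print(interval_coverage(actual, q10, q90))
```

If the 10% and 90% quantile forecasts are well calibrated, the coverage of the interval between them should be close to 0.8 on held-out data. A large gap suggests the predicted intervals are too narrow or too wide.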
© 2026 WATT Project - Spanish Energy Market Research