# Electricity Load Forecasting (h+1): XGBoost Machine Learning Model

The goal of this notebook is to build an **XGBoost** (Extreme Gradient Boosting) model that predicts **hourly electricity demand one hour ahead (h+1)** from historical load, weather, and calendar features.

This notebook follows **time-series best practices**:
- No data leakage
- Time-aware train/validation/test splits
- Baseline comparison
- Progressive model complexity
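
To make the time-aware split and the baseline comparison concrete, here is a minimal sketch on synthetic data. The helper names `chronological_split` and `persistence_mae`, the 70/15/15 split fractions, and the choice of a persistence forecast as the baseline are assumptions for illustration, not necessarily what this notebook uses later.

```python
import numpy as np
import pandas as pd


def chronological_split(df, train_frac=0.7, val_frac=0.15):
    """Split rows in time order (no shuffling): train, then validation, then test.

    Hypothetical helper; the fractions are illustrative defaults.
    """
    n = len(df)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return df.iloc[:train_end], df.iloc[train_end:val_end], df.iloc[val_end:]


def persistence_mae(load):
    """MAE of the naive h+1 forecast that predicts load(t+1) = load(t)."""
    errors = (load.shift(-1) - load).abs().dropna()
    return float(errors.mean())


# Synthetic hourly series: load rises by exactly 1 unit per hour.
split_idx = pd.date_range("2024-01-01", periods=200, freq="h")
split_toy = pd.DataFrame({"load": np.arange(200, dtype=float)}, index=split_idx)

train, val, test = chronological_split(split_toy)
baseline_mae = persistence_mae(test["load"])
```

Because every training timestamp precedes every validation timestamp, which in turn precedes every test timestamp, the model is never evaluated on data older than what it was trained on. Any model trained later should beat `baseline_mae` to justify its added complexity.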

## 1. Problem formulation

We formulate the task as a **supervised regression problem** with:
* **Features (X)** available at time t:
    * Load at time t
    * Load lags: t-1, t-24 (same hour one day earlier), t-168 (same hour one week earlier)
    * Temperature at time t
    * Calendar features (hour, weekday, week of year)
* **Target (y)**:
    * Load at time t+1
    
Mathematically:
$$\hat{y}_{t+1} = f(X_t)$$
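
The formulation above can be sketched with `pandas.Series.shift`: positive shifts produce the lagged loads, and a shift of -1 aligns the load at t+1 with the features at t. The function name `make_supervised` and the column names `load` and `temperature` are hypothetical, and the sketch assumes a regular hourly `DatetimeIndex` with no gaps.

```python
import numpy as np
import pandas as pd


def make_supervised(df, horizon=1):
    """Build features known at time t and the target at time t + horizon.

    Assumes `df` has a gap-free hourly DatetimeIndex and hypothetical
    columns `load` and `temperature`.
    """
    out = pd.DataFrame(index=df.index)
    out["load"] = df["load"]
    # Lagged loads: one hour, one day, and one week before t.
    for lag in (1, 24, 168):
        out[f"load_lag_{lag}"] = df["load"].shift(lag)
    out["temperature"] = df["temperature"]
    # Calendar features derived from the timestamp itself.
    out["hour"] = df.index.hour
    out["weekday"] = df.index.dayofweek
    out["weekofyear"] = df.index.isocalendar().week.astype(int)
    # Target: load one step ahead, aligned with the features at time t.
    out["target"] = df["load"].shift(-horizon)
    # Drop rows where a lag or the target is unavailable.
    return out.dropna()


# Synthetic hourly data: load rises by exactly 1 unit per hour.
toy_idx = pd.date_range("2024-01-01", periods=200, freq="h")
toy = pd.DataFrame(
    {"load": np.arange(200, dtype=float), "temperature": 10.0},
    index=toy_idx,
)
supervised = make_supervised(toy)
```

The first 168 rows are dropped because the weekly lag is undefined there, and the last row is dropped because its target lies beyond the end of the series. Shifting on row position is only equivalent to shifting in time if the index has no missing hours.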

## 2. Imports & Setup

In [1]:
from pathlib import Path
import pandas as pd
import numpy as np

import xgboost as xgb

from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_absolute_error, mean_squared_error

## 3. Project Path and Parameters

In [2]:
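# Project root is the parent of the current working directory
PROJECT_ROOT = Path.cwd().parents[0]
# Directory holding the feature-engineered datasets
FEATURED_BASE_PATH = PROJECT_ROOT / "data" / "featured"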

In [3]:
# Parameters
country = "FR"
years = list(range(2015, 2024 + 1))  # 2015 through 2024 inclusive