# Introduction  

**What is AutoGluon?**  
AutoGluon is an open-source AutoML (Automated Machine Learning) library developed by Amazon Web Services (AWS). It enables developers and data scientists to quickly build accurate machine learning models.

[**Website**](https://auto.gluon.ai/)  
[**GitHub**](https://github.com/autogluon/autogluon)

**Dataset info**  
Half-hourly UK electricity demand data (2009-2024) from [National Grid ESO](https://www.neso.energy/), including national demand, renewable generation estimates, and interconnector flows.






# Setup: Libraries and Data

Install the AutoGluon library, import the required libraries, and load the dataset.



In [None]:
# Install the autogluon library
#!pip install -q autogluon

In [None]:
# Import libraries
import pandas as pd
import matplotlib.pyplot as plt

from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor
from autogluon.timeseries.splitter import ExpandingWindowSplitter

In [None]:
# Load the data


# Data Processing and Visualization

We rename the target column, preview the data, plot the target, and assess its basic structure.
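
The steps above can be sketched on a small synthetic frame. This is only an illustration: the raw column names `settlement_date` and `england_wales_demand`, the new names `timestamp` and `target`, and the generated values are assumptions, not the contents of the real file.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the raw file: three days of half-hourly readings.
# The column names below are assumptions made for this sketch.
timestamps = pd.date_range("2024-01-01", periods=3 * 48, freq="30min")
raw = pd.DataFrame({
    "settlement_date": timestamps.strftime("%Y-%m-%d %H:%M:%S"),
    "england_wales_demand": 25_000 + 5_000 * np.sin(2 * np.pi * np.arange(len(timestamps)) / 48),
})

# Parse the timestamp strings into datetime64 values
raw["settlement_date"] = pd.to_datetime(raw["settlement_date"])

# Rename the columns to short, generic names for the modeling steps
df = raw.rename(columns={"settlement_date": "timestamp", "england_wales_demand": "target"})

# Basic structure check: are all rows exactly 30 minutes apart?
steps = df["timestamp"].diff().dropna()
is_regular = bool((steps == pd.Timedelta(minutes=30)).all())

print(df.head())
print("Regular half-hourly grid:", is_regular)
```

A regularity check like this is worth running on the real data too, since gaps or duplicated timestamps (for example around clock changes) would need handling before modeling.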

In [None]:
# Convert the timestamp column to datetime with pd.to_datetime



In [None]:
# Visualize the target variable over time


In [None]:
# Rename the target variable "england_wales_demand" for easier referencing in our modeling steps


# AutoGluon Model - Simple

We specify a prediction length and build a simple AutoGluon time-series model. 
We train it, list the models used, check the leaderboard, and generate predictions.

For a list of time-series forecasting models trained by AutoGluon, see [here](https://auto.gluon.ai/dev/tutorials/timeseries/forecasting-model-zoo.html)  
The portfolio includes:  
- baseline models  
- statistical models  
- deep learning models  
- tabular models  
- pre-trained models 
  

## AutoGluon TimeSeriesDataFrame parameters

  

**id_column**:  
Identifies which time series each row belongs to.  
If the dataset covered two locations separately (England and Wales), it would contain two series with two distinct ids. Here there is only one series, so every row carries the same id.  

**static_features_df**   
Defines attributes that stay constant over time within a series but differ between series (e.g. between England and Wales).   
For this UK dataset, static features aren't very useful since there's only one time series.
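
To make the two parameters concrete, here is a minimal sketch with two made-up series. The ids `england`, `wales`, and `england_wales`, the `region_type` feature, and all values are hypothetical. The conversion helper imports AutoGluon lazily, so the data-preparation part runs without it.

```python
import pandas as pd

# Two hypothetical series, to show what id_column distinguishes
timestamps = pd.date_range("2024-01-01", periods=4, freq="30min")
long_df = pd.concat(
    [
        pd.DataFrame({"item_id": "england", "timestamp": timestamps, "target": [10.0, 11.0, 12.0, 11.5]}),
        pd.DataFrame({"item_id": "wales", "timestamp": timestamps, "target": [3.0, 3.2, 3.1, 3.4]}),
    ],
    ignore_index=True,
)

# Static features: one row per item_id, with values that never change over time
static_df = pd.DataFrame({"item_id": ["england", "wales"], "region_type": ["large", "small"]})

# Single-series case (as in this notebook): every row gets the same id
single_df = long_df[long_df["item_id"] == "england"].assign(item_id="england_wales")


def to_timeseries_dataframe(df, static_features=None):
    """Convert a long-format frame to a TimeSeriesDataFrame (needs AutoGluon installed)."""
    from autogluon.timeseries import TimeSeriesDataFrame

    return TimeSeriesDataFrame.from_data_frame(
        df,
        id_column="item_id",
        timestamp_column="timestamp",
        static_features_df=static_features,
    )


print(long_df["item_id"].nunique(), single_df["item_id"].nunique())
```

With AutoGluon available, `to_timeseries_dataframe(single_df)` would give the single-series frame used in the rest of this notebook, while `to_timeseries_dataframe(long_df, static_df)` shows the multi-series case.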

In [None]:
# Convert raw data into a TimeSeriesDataFrame

from autogluon.timeseries import TimeSeriesDataFrame



In [None]:
# Split the dataset into train and test sets (keep chronological order; do not shuffle a time series)

from sklearn.model_selection import train_test_split



In [None]:
# check train


## Training

Create the predictor, set its parameters, and train it to forecast the target.
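
As a rough sketch of the setup, the snippet below derives the one-week horizon for half-hourly data and holds out the final week chronologically. The series is synthetic, and the names `item_id`, `target`, and `england_wales` as well as `eval_metric="MAPE"` are illustrative assumptions. The predictor is created inside a helper so that the arithmetic can be checked without AutoGluon installed.

```python
import numpy as np
import pandas as pd

STEPS_PER_DAY = 2 * 24                 # half-hourly data
prediction_length = STEPS_PER_DAY * 7  # one week ahead = 336 steps

# Synthetic four-week series standing in for the real demand data
timestamps = pd.date_range("2024-01-01", periods=4 * prediction_length, freq="30min")
df = pd.DataFrame({
    "item_id": "england_wales",
    "timestamp": timestamps,
    "target": 25_000 + 5_000 * np.sin(2 * np.pi * np.arange(len(timestamps)) / STEPS_PER_DAY),
})

# Chronological split: the last week is held out, nothing is shuffled
train_df = df.iloc[:-prediction_length]
holdout_df = df.iloc[-prediction_length:]


def build_predictor():
    """Create an untrained predictor (needs AutoGluon installed)."""
    from autogluon.timeseries import TimeSeriesPredictor

    return TimeSeriesPredictor(
        prediction_length=prediction_length,
        target="target",
        eval_metric="MAPE",  # illustrative choice, not prescribed by the notebook
    )


print(prediction_length, len(train_df), len(holdout_df))
```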

In [None]:
# predict one week ahead = (2*24)*7=336


In [None]:
# Initialize the TimeSeriesPredictor with essential parameters


### Model Parameters

**presets**  
Controls the number and complexity of models AutoGluon will train:

- `high_quality`: 12+ models (most thorough, slowest)
- `medium_quality`: 5–8 models (balanced)
- `fast_training`: fewer models (fastest, least accurate)

**num_val_windows**  
Number of validation windows used to evaluate model performance. Each window has the same length as `prediction_length` (in our case, 1 week). AutoGluon computes the evaluation metric (e.g., MAPE) on each window and averages the results.

Practical guidelines:
- 3 — quick iteration, good for initial experiments
- 5–10 — more robust evaluation, recommended for final models
- >10 — diminishing returns, significantly slower

**val_step_size**  
The offset (in time steps) between consecutive validation windows. With `val_step_size = 96` (2 days of 30-min intervals), consecutive windows are shifted by 2 days.

Note: Only the validation windows are used to compute metrics. All data *before* each window is used for training.
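
The interaction between `num_val_windows` and `val_step_size` can be illustrated with plain index arithmetic. This is a sketch of the idea under the assumption that the most recent window ends at the last observation. It is not AutoGluon's internal code, and `validation_windows` is a hypothetical helper.

```python
def validation_windows(n_steps, prediction_length, num_val_windows, val_step_size):
    """Return (val_start, val_end) index pairs, oldest window first.

    Rows [0, val_start) are available for training in each window.
    """
    windows = []
    for k in range(num_val_windows):
        # The most recent window ends at the last observation;
        # each earlier window is shifted back by val_step_size steps.
        val_end = n_steps - (num_val_windows - 1 - k) * val_step_size
        val_start = val_end - prediction_length
        windows.append((val_start, val_end))
    return windows


windows = validation_windows(n_steps=1000, prediction_length=336, num_val_windows=3, val_step_size=96)
for val_start, val_end in windows:
    print(f"train on [0, {val_start}), validate on [{val_start}, {val_end})")
```

Because `val_step_size` (96) is smaller than `prediction_length` (336) here, the validation windows overlap. Setting the step equal to the prediction length would make them adjacent instead.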

**random_seed**  
Fixes the random number generator to ensure reproducible results across runs.
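
Put together, the parameters above might be passed to `fit` roughly as follows. The values are hypothetical starting points (in particular `time_limit` and the seed), and `train` is a helper invented for this sketch that expects an already constructed `TimeSeriesPredictor` and a `TimeSeriesDataFrame`.

```python
# Hypothetical settings; adjust them to your own time budget
fit_kwargs = {
    "presets": "medium_quality",
    "num_val_windows": 3,
    "val_step_size": 96,  # 2 days of half-hourly steps
    "time_limit": 600,    # seconds
    "random_seed": 42,
}


def train(predictor, train_data):
    """Fit an existing TimeSeriesPredictor and return the names of the trained models."""
    predictor.fit(train_data, **fit_kwargs)
    return predictor.model_names()  # method name as in AutoGluon 1.x


# Sanity check on the step size: 96 half-hour steps should equal 48 hours
hours_between_windows = fit_kwargs["val_step_size"] * 30 / 60
print(hours_between_windows)
```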


In [None]:
# Train the model with specified presets, validation steps, and time limits


In [None]:
# List the model_names


## Monitoring

**leaderboard**:  
Error metrics such as RMSE, MASE, and MAPE are normally "lower is better": a smaller error means a better model. AutoGluon instead reports all metrics in a **"higher is better"** format, so error metrics such as MASE or MAPE are multiplied by -1 and appear as negative values.  


| Raw RMSE | AutoGluon score |
|---|---|
| 3.27 | -3.27 |
| 4.15 | -4.15 |

So -3.27 is better than -4.15 because it is higher (closer to zero).
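
The sign convention can be checked in a few lines. The model names are hypothetical and the error values are the ones from the example above.

```python
# Raw errors: lower is better
raw_rmse = {"model_a": 3.27, "model_b": 4.15}

# AutoGluon-style scores: multiply by -1 so that higher is better
scores = {name: -error for name, error in raw_rmse.items()}

# The best model is the one with the highest (least negative) score
best_model = max(scores, key=scores.get)

print(scores)
print(best_model)
```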

In [None]:
# Show performance (.leaderboard) of trained models


## Prediction

In [None]:
# Generate predictions using the trained model
# By default, predict() uses the model with the best validation score (see leaderboard)



In [None]:
# Plot the actual time series (timeseries_df) along with the model predictions (predictions)


We now have a simple model trained on the data. The seasonal pattern is more or less stable, but the prediction interval becomes quite wide. 
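
One way to quantify how wide the interval gets is to subtract the lower quantile from the upper one at each step. The frame below is synthetic, with uncertainty made to grow with the horizon by construction. It only mimics the layout of AutoGluon's prediction output, which has a `mean` column plus one column per quantile level, such as `0.1` and `0.9`.

```python
import numpy as np
import pandas as pd

horizon = 336
steps = np.arange(horizon)

# Synthetic forecast: a daily cycle with a spread that grows over the horizon
mean = 25_000 + 5_000 * np.sin(2 * np.pi * steps / 48)
spread = 500 + 10 * steps

predictions = pd.DataFrame({"mean": mean, "0.1": mean - spread, "0.9": mean + spread})

# Width of the 10%-90% prediction interval at each step
width = predictions["0.9"] - predictions["0.1"]

print(f"first step: {width.iloc[0]:.0f}, last step: {width.iloc[-1]:.0f}")
```

Applying the same subtraction to the real predictions would show at which horizon the interval becomes too wide to be useful.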

# AutoGluon with Covariates

We load future covariates, convert them into the correct format, and retrain the model by including extra features like event days and day of week.
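
If the covariates were not already available in a file, calendar features like these could be generated directly, since they are known in advance for any future timestamp. The sketch below builds them for the week after a hypothetical last observation. The id `england_wales`, the column names `day_of_week` and `is_event_day`, and the event calendar are all assumptions.

```python
import pandas as pd

prediction_length = 336                            # one week of half-hourly steps
last_observed = pd.Timestamp("2024-12-22 23:30")   # hypothetical end of the training data
event_days = {pd.Timestamp("2024-12-25").date()}   # hypothetical event calendar

# Timestamps of the forecast horizon, starting right after the last observation
future_index = pd.date_range(
    last_observed + pd.Timedelta(minutes=30), periods=prediction_length, freq="30min"
)

future_covariates = pd.DataFrame({
    "item_id": "england_wales",
    "timestamp": future_index,
    "day_of_week": future_index.dayofweek,  # Monday=0 ... Sunday=6
    "is_event_day": [int(ts.date() in event_days) for ts in future_index],
})

print(future_covariates.head())
print("event-day steps:", int(future_covariates["is_event_day"].sum()))
```

In AutoGluon, the covariate column names would be given as `known_covariates_names` when creating the `TimeSeriesPredictor`, and a frame like this one (converted to a `TimeSeriesDataFrame`) would be passed as `known_covariates` to `predict`. The same columns must also be present in the training data.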


In [None]:
# Load future covariates to enrich the forecasting process


In [None]:
# Convert the future covariates DataFrame into a TimeSeriesDataFrame


## Training

In [None]:
# Pass the whole dataset (timeseries_df, train, test) and
# initialize the TimeSeriesPredictor with known covariates


In [None]:
# Train the model using the covariates


In [None]:
# Display the leaderboard for the new model


In [None]:
# Generate predictions using known covariates


In [None]:
# Plot the actual time series and predictions


We have now included extra features to boost the model's understanding of the data and refine the forecasts.

# Interpretability

We generate and visualize the feature importance to see which features have the biggest impact on the model's predictions.
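
A common way to measure this kind of impact is permutation importance: shuffle one feature, re-score the model, and see how much the error grows. The sketch below shows the idea with NumPy on synthetic data. The "model" is a fixed stand-in function rather than an AutoGluon forecaster, and the feature names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Column 0 drives the target, column 1 is irrelevant
X = rng.normal(size=(n, 2))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=n)
feature_names = ["informative", "irrelevant"]


def model(features):
    """Stand-in for a fitted model that only uses column 0."""
    return 3.0 * features[:, 0]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))


baseline_error = mae(y, model(X))

importance = {}
for col, name in enumerate(feature_names):
    shuffled = X.copy()
    # Break the link between this feature and the target
    shuffled[:, col] = rng.permutation(shuffled[:, col])
    # Importance = how much the error increases after shuffling
    importance[name] = mae(y, model(shuffled)) - baseline_error

print(importance)
```

A feature the model relies on gets a large positive value, while a feature it ignores scores zero. Values near or below zero on real models suggest the feature adds little.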

In [None]:
# Extract and store the feature importance metrics


In [None]:
# Display the feature importance DataFrame


In [None]:
# Plot the feature importance for a clear picture
