### ISF Workshop on Deep Learning, Foundational Models and AutoML for Forecasting

Hands-on session, adapted from [\[1\]](https://colab.research.google.com/github/autogluon/autogluon/blob/stable/docs/tutorials/timeseries/forecasting-quick-start.ipynb), [\[2\]](https://colab.research.google.com/github/autogluon/autogluon/blob/stable/docs/tutorials/timeseries/forecasting-indepth.ipynb).

<a target="_blank" href="https://colab.research.google.com/github/canerturkmen/isf-workshop-2024/blob/main/Forecasting_with_AutoGluon_and_Chronos.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

Let's start by installing AutoGluon-TimeSeries and loading some data!

- Always good to have the docs around: [auto.gluon.ai](https://auto.gluon.ai/).
- Chronos models and datasets on [Hugging Face](https://huggingface.co/collections/amazon/chronos-models-and-datasets-65f1791d630a8d57cb718444).
- GluonTS docs on [ts.gluon.ai](https://ts.gluon.ai).

In [None]:
!pip install -q autogluon.timeseries  # use `pip install autogluon` to install all of AutoGluon
!pip install -q datasets

In [None]:
import datasets
import pandas as pd
from matplotlib import pyplot as plt

from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor

To use `autogluon.timeseries`, we will only need the following two classes:

- `TimeSeriesDataFrame` stores a dataset consisting of multiple time series.
- `TimeSeriesPredictor` takes care of fitting, tuning and selecting the best forecasting models, as well as generating new forecasts.

AutoGluon expects time series data in [long format](https://doc.dataiku.com/dss/latest/time-series/data-formatting.html#long-format).
Each row of the data frame contains a single observation (timestep) of a single time series, represented by:

- unique ID of the time series (`"item_id"`) as int or str
- timestamp of the observation (`"timestamp"`) as a `pandas.Timestamp` or compatible format
- numeric value of the time series (`"target"`)
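
As a rough illustration of the long format, the sketch below reshapes a small wide table (one column per series) into the three columns described above. The table contents and the series names `A` and `B` are made up for this example; only the column names `item_id`, `timestamp` and `target` come from the text.

```python
import pandas as pd

# Hypothetical wide table: one row per timestamp, one column per time series.
wide_df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
        "A": [1.0, 2.0, 3.0],
        "B": [10.0, 20.0, 30.0],
    }
)

# Melt into long format: one row per (item_id, timestamp) pair.
long_df = (
    wide_df.melt(id_vars="timestamp", var_name="item_id", value_name="target")
    .sort_values(["item_id", "timestamp"])
    .reset_index(drop=True)
)[["item_id", "timestamp", "target"]]

print(long_df)
```

A frame shaped like `long_df` is what the `TimeSeriesDataFrame` constructor in the next cell takes as input.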

In [None]:
raw_df = pd.read_csv("https://autogluon.s3.amazonaws.com/datasets/timeseries/m4_hourly_tiny/train.csv")

m4_train_data = TimeSeriesDataFrame(
    raw_df,
    id_column="item_id",
)

# A TimeSeriesDataFrame can also be loaded directly from a file
m4_test_data = TimeSeriesDataFrame.from_path("https://autogluon.s3.amazonaws.com/datasets/timeseries/m4_hourly_tiny/test.csv")

m4_train_data

- We refer to each individual time series stored in a `TimeSeriesDataFrame` as an _item_.
- For example, items might correspond to different products in demand forecasting, or to different stocks in financial datasets.

In [None]:
plt.figure(figsize=(10, 3))
plt.plot(
    m4_train_data.loc["H10"]
)

### Basic Training Run with AutoGluon-TimeSeries

We need to define two things:
- The _task_, set when initializing the predictor (`prediction_length`, `eval_metric`).
- _How_ to train the forecaster, set through the parameters of the predictor's `fit` method.

In [None]:
predictor = TimeSeriesPredictor(
    prediction_length=48,
    eval_metric="MASE",
    # path="my-autogluon-model",
    # target="target",
)

predictor.fit(m4_train_data, time_limit=2 * 60)


In [None]:
predictions = predictor.predict(m4_train_data)
predictions.head()

In [None]:
predictor.leaderboard()

In [None]:
# Plot the forecasts against the test data for up to four items
predictor.plot(m4_test_data, predictions, quantile_levels=[0.1, 0.9], max_history_length=200, max_num_item_ids=4);


In [None]:
# The test score is computed using the last
# prediction_length=48 timesteps of each time series in test_data
predictor.leaderboard(m4_test_data)

### Customizing AutoGluon-TimeSeries Training

Let's look at a more realistic time series forecasting scenario, with:

- covariates,
- probabilistic forecasts instead of point forecasts,
- multi-window backtesting.
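
Multi-window backtesting evaluates a model on several hold-out windows at the end of each series rather than on a single one. The sketch below computes the index cutoffs for such windows, assuming non-overlapping windows spaced exactly `prediction_length` apart. The helper name `backtest_windows` and that spacing are assumptions for illustration and not necessarily how AutoGluon splits the data internally.

```python
def backtest_windows(series_length, prediction_length, num_val_windows):
    """Return (train_end, val_end) index pairs, oldest window first.

    For each pair, the model sees series[:train_end] and is scored on
    series[train_end:val_end].
    """
    windows = []
    for k in range(num_val_windows, 0, -1):
        val_end = series_length - (k - 1) * prediction_length
        train_end = val_end - prediction_length
        if train_end <= 0:
            raise ValueError("series too short for the requested windows")
        windows.append((train_end, val_end))
    return windows


windows = backtest_windows(series_length=20, prediction_length=3, num_val_windows=3)
print(windows)  # [(11, 14), (14, 17), (17, 20)]
```

Averaging the metric over these windows gives a less noisy validation score than a single window, at the cost of extra training and prediction time.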

In [None]:
# check out the Chronos datasets on Hugging Face!

features = ["timestamp", "t_mean", "prcp_sum"]

# Load from Hugging Face
raw_df = datasets.load_dataset(
    path="autogluon/chronos_datasets",
    name="monash_temperature_rain",
    split="train[:20]",
).select_columns(
    ["id"] + features
).to_pandas().explode(features).infer_objects()

raw_data = TimeSeriesDataFrame(raw_df, id_column="id")

train_data = raw_data.slice_by_timestep(end_index=-3)
test_data = raw_data

train_data

In [None]:
predictor = TimeSeriesPredictor(
    prediction_length=3,
    path="my-better-autogluon-model",
    eval_metric="WQL",  # let's go probabilistic
    quantile_levels=[0.05, 0.5, 0.95],  # quantile levels to consider
    target="t_mean",
    known_covariates_names=["prcp_sum"],
)

predictor.fit(
    train_data,
    presets="medium_quality",  # see: https://auto.gluon.ai/stable/api/autogluon.timeseries.TimeSeriesPredictor.fit.html
    time_limit=4 * 60,
    num_val_windows=3,  # multi-window backtesting on 3 validation windows
)
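
For intuition about the `"WQL"` metric, the following pure-Python sketch computes a weighted quantile loss for one series, assuming a common definition: twice the summed pinball loss, divided by the sum of absolute target values and averaged over quantile levels. The function names and toy numbers are hypothetical, and AutoGluon's implementation may differ in how it aggregates across items.

```python
def pinball_loss(actual, predicted, q):
    """Quantile (pinball) loss of one quantile prediction at level q."""
    diff = actual - predicted
    return q * diff if diff >= 0 else (q - 1) * diff


def weighted_quantile_loss(actual, quantile_forecasts):
    """Average over quantile levels of 2 * pinball loss / sum(|actual|)."""
    total = 0.0
    for q, preds in quantile_forecasts.items():
        total += sum(pinball_loss(a, p, q) for a, p in zip(actual, preds))
    denominator = sum(abs(a) for a in actual)
    return 2 * total / (len(quantile_forecasts) * denominator)


actual = [10.0, 20.0]
# Hypothetical quantile forecasts at the levels used in the predictor above.
quantile_forecasts = {
    0.05: [8.0, 18.0],
    0.5: [10.0, 22.0],
    0.95: [12.0, 25.0],
}
wql = weighted_quantile_loss(actual, quantile_forecasts)
print(round(wql, 4))
```

Lower values are better: the loss penalizes quantile forecasts that sit far from the observed values, with asymmetric penalties above and below depending on the quantile level.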

In [None]:
predictor.leaderboard(test_data)

### Forecasting with Chronos

In [None]:
predictor = TimeSeriesPredictor(
    prediction_length=48,
    path="my-autogluon-model-with-chronos",
)

predictor.fit(
    m4_train_data,
    presets="chronos_tiny",
    time_limit=60 * 60,
)

In [None]:
predictions = predictor.predict(
    m4_test_data.slice_by_timestep(end_index=-48),
)

predictor.plot(
    data=m4_test_data,
    predictions=predictions,
    quantile_levels=[0.1, 0.9],
    max_num_item_ids=4,
)

In [None]:
predictor = TimeSeriesPredictor(prediction_length=48)

predictor.fit(
    m4_train_data,
    hyperparameters={
        "Chronos": {"model_path": "tiny"},
        "DeepAR": {},
        "RecursiveTabular": {},
    },
    time_limit=5*60,
)

predictions = predictor.predict(
    m4_test_data.slice_by_timestep(end_index=-48),
)

_ = predictor.plot(
    data=m4_test_data,
    predictions=predictions,
    quantile_levels=[0.1, 0.9],
    max_num_item_ids=4,
)