# 2.2 Forecasting using Chronos Foundation Model and sktime

In this notebook we perform zero-shot predictions, with and without covariates, using the following foundation model for time series forecasting: [ChronosForecaster](https://www.sktime.net/en/stable/api_reference/auto_generated/sktime.forecasting.chronos.ChronosForecaster.html)

## 1. Chronos Features

The official source code for [Chronos](https://github.com/amazon-science/chronos-forecasting/blob/main/src/chronos/chronos.py) and [Chronos Bolt](https://github.com/amazon-science/chronos-forecasting/blob/main/src/chronos/chronos_bolt.py) shows that:
- **Context Length**: any positive integer
- **Horizon Length**: recommended <= 64 datapoints

This follows from the model configuration, ```self.model.config.prediction_length = 64```, and from this check in ```def predict(...)```:
```
if prediction_length > self.model.config.prediction_length:
            msg = (
                f"We recommend keeping prediction length <= {self.model.config.prediction_length}. "
                "The quality of longer predictions may degrade since the model is not optimized for it. "
            )
```
The following [notebook](https://github.com/amazon-science/chronos-forecasting/blob/main/notebooks/deploy-chronos-bolt-to-amazon-sagemaker.ipynb) states that:

*Recommended to keep prediction_length <= 64 since larger values will result in inaccurate quantile forecasts. Values above 1000 will raise an error.*
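
The limits quoted above can be turned into a small guard that runs before a forecast is requested. This is a minimal sketch, not part of Chronos or sktime: the function name and the two constants are hypothetical, and the hard limit of 1000 is taken from the SageMaker notebook quote, so it may not apply to other ways of running the model.

```python
import warnings

# Limits quoted above. The constant names are hypothetical.
RECOMMENDED_MAX_HORIZON = 64
HARD_MAX_HORIZON = 1000


def check_horizon(prediction_length: int) -> int:
    """Validate a forecasting horizon against the limits quoted above."""
    if prediction_length < 1:
        raise ValueError("prediction_length must be a positive integer")
    if prediction_length > HARD_MAX_HORIZON:
        raise ValueError(f"prediction_length must be <= {HARD_MAX_HORIZON}")
    if prediction_length > RECOMMENDED_MAX_HORIZON:
        warnings.warn(
            f"prediction_length={prediction_length} exceeds the recommended "
            f"maximum of {RECOMMENDED_MAX_HORIZON}; forecast quality may degrade."
        )
    return prediction_length


# The one-week horizon used in this notebook is well inside the limit.
print(check_horizon(7))
```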

The available versions of the model are listed in the [Chronos collection on Hugging Face](https://huggingface.co/collections/amazon/chronos-models-and-datasets-65f1791d630a8d57cb718444).

In addition, the sktime [ChronosForecaster](https://www.sktime.net/en/stable/api_reference/auto_generated/sktime.forecasting.chronos.ChronosForecaster.html) has the following characteristics:
- **Univariate**: predicts a single dependent variable.
- **Covariates**: allows the inclusion of covariates per entry.

## 1.1. Open problems
1. I have not yet been able to make the forecaster work when the class is configured with ```"torch_dtype": "torch.float16"```.

## 2. Environment configuration
1. **Open PyCharm, or your preferred IDE, with Admin privileges.**
2. Install a compatible version of ```python```. I installed Python 3.10.11, not older, ...
3. Install a compatible version of ```cuda```. I installed CUDA v12.4, not older v12.6, ...
4. Install a compatible version of ```pytorch```. I use Windows and installed it with the following command (see [PyTorch previous versions](https://pytorch.org/get-started/previous-versions/#linux-and-windows)):
    ```
    pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu124
    ```
5. To **import the Python modules that we have created locally in the project** we do the following:

In [1]:
import sys
from pathlib import Path

# path to the notebook
notebook_path = Path().resolve()

# path to the project root (path to ppf)
project_path = notebook_path.parents[0]

# path to the directory that contains the project's source code (ppf/src/ppf/)
code_project_path = project_path / 'src' / 'ppf'

# add that directory to the module search path so local modules can be imported
sys.path.insert(0, str(code_project_path))
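
Re-running the cell above adds the same entry to `sys.path` again each time. A possible variant, sketched here with a hypothetical helper name, checks that the directory exists and only inserts it once:

```python
import sys
from pathlib import Path


def add_to_sys_path(directory: Path) -> bool:
    """Prepend `directory` to sys.path once; return True if it was added."""
    resolved = str(directory.resolve())
    if not directory.is_dir():
        raise FileNotFoundError(f"Not a directory: {resolved}")
    if resolved in sys.path:
        return False  # already on the search path, avoid duplicate entries
    sys.path.insert(0, resolved)
    return True


# Example with the current working directory, which always exists.
print(add_to_sys_path(Path.cwd()))
```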

In [2]:
print("sys.path:")
for p in sys.path:
    print(" ", p)

sys.path:
  /home/michi/Workspaces/Python/ppf/src/ppf
  /home/michi/miniconda3/envs/ppf-chronos-manual/lib/python312.zip
  /home/michi/miniconda3/envs/ppf-chronos-manual/lib/python3.12
  /home/michi/miniconda3/envs/ppf-chronos-manual/lib/python3.12/lib-dynload
  
  /home/michi/miniconda3/envs/ppf-chronos-manual/lib/python3.12/site-packages
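
Before loading any model, it can help to confirm which interpreter is running and whether PyTorch can see a GPU. The sketch below is an illustration with a hypothetical function name; it only uses `torch.__version__` and `torch.cuda.is_available()`, and it skips the PyTorch part when the package is not installed.

```python
import importlib.util
import platform
import sys


def describe_environment() -> dict:
    """Collect interpreter details and, if torch is installed, CUDA status."""
    info = {
        "python": platform.python_version(),
        "executable": sys.executable,
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if info["torch_installed"]:
        import torch  # imported lazily so the check also runs without torch

        info["torch"] = torch.__version__
        info["cuda_available"] = torch.cuda.is_available()
    return info


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```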


## 3. Obtain pollen predictions
### 3.1 Without covariates
#### 3.1.1 Constants

In [3]:
from datetime import date, timedelta
import pandas as pd

FIRST_YEAR = 1993
LAST_YEAR = 2023

# First and last day in the dataset
FIRST_DATE = date(FIRST_YEAR, 1, 1)
LAST_DATE = date(LAST_YEAR, 12, 31)

#### 3.1.2 Arguments

In [4]:
# 1 week of horizon
HORIZON_SIZE = 7 

# 1 year of context
INPUT_SIZE = 365

# 0 years of training (zero-shot, no fitting)
TRAIN_SIZE = 0

# benchmark for 2000, 2001, ..., 2021, 2022 (23 years)
START_YEAR = 2000
END_YEAR = 2022

# half a year of offset days
OFFSET_DAYS = -182

In [5]:
from non_leap_date_utils import calculate_non_leap_offset_dates, add_non_leap_days

# get all the required dates
START_TRAINING, _ = calculate_non_leap_offset_dates(START_YEAR, END_YEAR, INPUT_SIZE, TRAIN_SIZE, OFFSET_DAYS, HORIZON_SIZE)
START_DATE, END_DATE = calculate_non_leap_offset_dates(START_YEAR, END_YEAR, INPUT_SIZE, 0, OFFSET_DAYS, HORIZON_SIZE)
print(START_TRAINING)
print(START_DATE)
print(END_DATE)

if START_TRAINING < FIRST_DATE or LAST_DATE < END_DATE:
    print(f"Data earlier than {FIRST_DATE} or later than {LAST_DATE} cannot be retrieved from the dataset, so only a subset will be used.")
    # number of requested days after the last dataset day (never negative)
    amount_missing_data = max((END_DATE - LAST_DATE).days, 0)
else:
    amount_missing_data = 0

1998-06-27
1998-06-27
2023-01-06
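
The range check in the cell above reduces to a date difference that should never go negative. A minimal sketch of that arithmetic, with a hypothetical helper name and one hypothetical dataset end date for contrast:

```python
from datetime import date


def days_beyond_dataset(requested_end: date, dataset_end: date) -> int:
    """Number of requested days that fall after the last dataset day."""
    return max((requested_end - dataset_end).days, 0)


# Values from this notebook: the requested end lies inside the dataset.
print(days_beyond_dataset(date(2023, 1, 6), date(2023, 12, 31)))  # 0
# Hypothetical dataset that ends one week before the requested end date.
print(days_beyond_dataset(date(2023, 1, 6), date(2022, 12, 30)))  # 7
```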


In [6]:
import os

# Output directory for predictions
PREDICTIONS_DIR = "../outputs/predictions"
# Output directory for model runtimes
TIMING_DIR = "../outputs/timing"
# Create both directories if they do not exist yet
os.makedirs(PREDICTIONS_DIR, exist_ok=True)
os.makedirs(TIMING_DIR, exist_ok=True)

#### 3.1.3 Get the pollen time series

1. Let's load the dataset and check that it has the "date" column:

In [7]:
# Load the dataset
df = pd.read_csv("../datasets/AlnusOurense9322.csv")

if "date" in df.columns:
    # Set the "date" column as the index
    df["date"] = pd.to_datetime(df["date"], format="%d-%m-%Y")
    df.set_index("date", inplace=True)
else:
    raise ValueError("Dataset must contain a 'date' column.")
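
The explicit `format="%d-%m-%Y"` matters because a day-first string can otherwise be read month-first. A short illustration with the standard library, using a made-up date string rather than a value from the dataset:

```python
from datetime import datetime

raw = "03-04-1999"  # illustrative value, not taken from the dataset

day_first = datetime.strptime(raw, "%d-%m-%Y").date()
month_first = datetime.strptime(raw, "%m-%d-%Y").date()

print(day_first)    # 1999-04-03 (3 April)
print(month_first)  # 1999-03-04 (4 March)
```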

2. Let's extract the pollen time series for the required time period (```en```, the endogenous variable):

In [8]:
# Select the "pollen" column as a pandas.Series, restricted to the required date range
en = df[(df.index >= pd.Timestamp(START_TRAINING)) & (df.index <= pd.Timestamp(END_DATE))]["pollen"]

# Force daily frequency
en = en.asfreq("D")

# No exogenous variables (covariates) in this section
ex = None
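
`asfreq("D")` reindexes the series to a complete daily calendar, so any day that is absent from the CSV appears as `NaN`. To see which days those would be, a plain-Python sketch (hypothetical helper, made-up sample dates) can list the gaps:

```python
from datetime import date, timedelta


def missing_days(observed: list[date]) -> list[date]:
    """Calendar days between the first and last observation with no record."""
    if not observed:
        return []
    present = set(observed)
    first, last = min(present), max(present)
    span = (last - first).days
    return [
        first + timedelta(days=offset)
        for offset in range(span + 1)
        if first + timedelta(days=offset) not in present
    ]


# Illustrative dates with a two-day gap (3 and 4 January are absent).
sample = [date(2000, 1, 1), date(2000, 1, 2), date(2000, 1, 5)]
print(missing_days(sample))
```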

#### 3.1.4 Specifying the forecasting horizon

We forecast with a horizon of 1 week, that is, 7 daily steps ahead.

In [9]:
from sktime.forecasting.base import ForecastingHorizon
import numpy as np

fh = ForecastingHorizon(np.arange(1, HORIZON_SIZE+1), is_relative=True)
fh

ForecastingHorizon([1, 2, 3, 4, 5, 6, 7], dtype='int64', is_relative=True)
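
A relative horizon counts steps after the last observed day (the cutoff). The sketch below maps the steps 1 to 7 to calendar dates for a hypothetical cutoff. It uses ordinary calendar arithmetic and does not reproduce the non-leap date handling of this project's `non_leap_date_utils` module.

```python
from datetime import date, timedelta


def horizon_dates(cutoff: date, horizon_size: int) -> list[date]:
    """Map relative steps 1..horizon_size to calendar dates after `cutoff`."""
    return [cutoff + timedelta(days=step) for step in range(1, horizon_size + 1)]


# Hypothetical cutoff: the last day of a context window.
for day in horizon_dates(date(2000, 1, 1), 7):
    print(day)
```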

#### 3.1.5 Specifying the forecasting algorithm

To make forecasts, a forecasting algorithm needs to be specified. This is done using a scikit-learn-like interface. Most importantly, all sktime forecasters follow the same interface, so the preceding and remaining steps are the same, no matter which forecaster is being chosen.

We will work with all the available versions of **Chronos**: [Chronos versions](https://huggingface.co/collections/amazon/chronos-models-and-datasets-65f1791d630a8d57cb718444)

**We'll also be testing the conservative, auto and stochastic model configurations**. 

We set ```torch_dtype``` to ```torch.bfloat16``` because it has a dynamic range similar to that of ```torch.float32```, at the cost of lower precision.
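
The trade-off follows from how each format divides its bits: `bfloat16` keeps the 8 exponent bits of `float32` but only 7 mantissa bits, while `float16` has 5 exponent bits and 10 mantissa bits. The sketch below derives the largest finite value and the machine epsilon from those bit counts in plain Python. The helper names are illustrative and PyTorch is not required.

```python
# (exponent bits, mantissa bits) of each floating-point format
FORMATS = {
    "float32": (8, 23),
    "bfloat16": (8, 7),
    "float16": (5, 10),
}


def max_finite(exponent_bits: int, mantissa_bits: int) -> float:
    """Largest finite value representable by the format."""
    max_exponent = 2 ** (exponent_bits - 1) - 1
    return (2 - 2.0 ** -mantissa_bits) * 2.0 ** max_exponent


def machine_epsilon(mantissa_bits: int) -> float:
    """Gap between 1.0 and the next representable value."""
    return 2.0 ** -mantissa_bits


for name, (e_bits, m_bits) in FORMATS.items():
    print(f"{name:9s} max={max_finite(e_bits, m_bits):.4e} eps={machine_epsilon(m_bits):.2e}")
```

The output shows `bfloat16` reaching roughly the same maximum as `float32`, whereas `float16` tops out at 65504, with `bfloat16` having the coarsest epsilon of the three.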

In [10]:
from sktime.forecasting.chronos import ChronosForecaster
import torch

# Available sizes for each model variant
sizes_by_variant = {
    "t5": ["tiny", "mini", "small", "base", "large"],
    "bolt": ["tiny", "mini", "small", "base"]  # no 'large'
}

# Conservative, non-deterministic (num_samples > 1)
chronos_models = {
    f"chronos.conservative.{variant}.{size}": ChronosForecaster(
        model_path=f"amazon/chronos-{variant}-{size}",
        config={
            "num_samples": 35,
            "temperature": 0.25,
            "top_k": 8,
            "top_p": 0.85,
            "torch_dtype": torch.bfloat16,
            "device_map": "cuda"
        },
        seed=42,
        use_source_package=False,
        ignore_deps=False
    )
    for variant, sizes in sizes_by_variant.items()
    for size in sizes
}

# Auto (sampling parameters left unset, so the model's own defaults apply)
chronos_models.update(
    {
        f"chronos.auto.{variant}.{size}": ChronosForecaster(
            model_path=f"amazon/chronos-{variant}-{size}",
            config={
                "torch_dtype": torch.bfloat16,
                "device_map": "cuda"
            },
            seed=43,
            use_source_package=False,
            ignore_deps=False
        )
        for variant, sizes in sizes_by_variant.items()
        for size in sizes
    }
)

# Stochastic (balanced configuration)
chronos_models.update(
    {
        f"chronos.stochastic.{variant}.{size}": ChronosForecaster(
            model_path=f"amazon/chronos-{variant}-{size}",
            config={
                "num_samples": 50,
                "temperature": 0.6,
                "top_k": 20,
                "top_p": 0.93,
                "torch_dtype": torch.bfloat16,
                "device_map": "cuda"
            },
            seed=44,
            use_source_package=False,
            ignore_deps=False
        )
        for variant, sizes in sizes_by_variant.items()
        for size in sizes
    }
)
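
As a quick sanity check on the dictionary built above, the keys can be regenerated without instantiating any forecaster. This sketch repeats the same variants, sizes, and naming pattern and counts the resulting configurations:

```python
# Same variants and sizes as in the cell above.
sizes_by_variant = {
    "t5": ["tiny", "mini", "small", "base", "large"],
    "bolt": ["tiny", "mini", "small", "base"],
}
configurations = ["conservative", "auto", "stochastic"]

model_names = [
    f"chronos.{configuration}.{variant}.{size}"
    for configuration in configurations
    for variant, sizes in sizes_by_variant.items()
    for size in sizes
]

print(len(model_names))  # 27 = 3 configurations x (5 + 4) sizes
print(model_names[0])    # chronos.conservative.t5.tiny
```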



#### 3.1.6 Defining the splitter

In [11]:
from sktime.forecasting.model_selection import SlidingWindowSplitter

cv = SlidingWindowSplitter(fh=fh, window_length=INPUT_SIZE, step_length=1, start_with_window=True)

#### 3.1.7 Requesting pollen forecasts

In [12]:
from get_predictions import predict

for name, model in chronos_models.items():
    result, fit_time, predict_time = predict(
        model=model,
        model_name=name,
        start_year=START_YEAR,
        end_year=END_YEAR,
        y=en,
        X=None,
        input_size=INPUT_SIZE,
        train_size=TRAIN_SIZE,
        offset_days=OFFSET_DAYS,
        splitter=cv,
        start_date=START_DATE,
        amount_missing_data=amount_missing_data,
        predictions_dir=PREDICTIONS_DIR,
        timing_dir=TIMING_DIR
    )
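
Each model in the loop is evaluated on every window that the splitter produces. The following plain-Python sketch illustrates the sliding-window idea on a toy series. It is not sktime's implementation, and the function name is hypothetical; with a step of 1 it yields `n_obs - window_length - horizon + 1` windows.

```python
def sliding_windows(n_obs: int, window_length: int, horizon: int, step: int = 1):
    """Yield (train_indices, test_indices) for a sliding-window split."""
    start = 0
    while start + window_length + horizon <= n_obs:
        train = range(start, start + window_length)
        test = range(start + window_length, start + window_length + horizon)
        yield train, test
        start += step


# Toy series of 12 observations, 5-step context and 3-step horizon.
splits = list(sliding_windows(n_obs=12, window_length=5, horizon=3))
print(len(splits))  # 5
for train, test in splits[:2]:
    print(list(train), list(test))
```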

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.conservative.t5.tiny_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.conservative.t5.mini_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.conservative.t5.small_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.conservative.t5.base_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.conservative.t5.large_0_365_7_without_covariates.csv

Passing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.48.0. You should pass an instance of `EncoderDecoderCache` instead, e.g. `past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values)`.

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.conservative.bolt.tiny_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.conservative.bolt.mini_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.conservative.bolt.small_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.conservative.bolt.base_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.auto.t5.tiny_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.auto.t5.mini_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.auto.t5.small_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.auto.t5.base_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.auto.t5.large_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.auto.bolt.tiny_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.auto.bolt.mini_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.auto.bolt.small_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.auto.bolt.base_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.stochastic.t5.tiny_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.stochastic.t5.mini_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.stochastic.t5.small_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.stochastic.t5.base_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.stochastic.t5.large_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.stochastic.bolt.tiny_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.stochastic.bolt.mini_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.stochastic.bolt.small_0_365_7_without_covariates.csv

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00
Saved predictions to: ../outputs/predictions/chronos.stochastic.bolt.base_0_365_7_without_covariates.csv
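
The saved file names appear to encode the model name followed by `TRAIN_SIZE`, `INPUT_SIZE`, and `HORIZON_SIZE` (here `0_365_7`). That reading is inferred from the output above, not taken from the `get_predictions` module, so treat the parser below as a hypothetical sketch for collecting results later:

```python
from pathlib import Path


def parse_prediction_filename(path: str) -> dict:
    """Split a predictions file name into the fields it appears to encode."""
    stem = Path(path).stem  # drops only the final ".csv" suffix
    model_name, train_size, input_size, horizon, covariates = stem.split("_", 4)
    _, configuration, variant, size = model_name.split(".")
    return {
        "model_name": model_name,
        "configuration": configuration,
        "variant": variant,
        "size": size,
        "train_size": int(train_size),
        "input_size": int(input_size),
        "horizon": int(horizon),
        "covariates": covariates,
    }


example = "../outputs/predictions/chronos.conservative.t5.tiny_0_365_7_without_covariates.csv"
print(parse_prediction_filename(example))
```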


## References:
- [ChronosForecaster](https://www.sktime.net/en/stable/api_reference/auto_generated/sktime.forecasting.chronos.ChronosForecaster.html): sktime API reference
- [Chronos adapter](https://github.com/sktime/sktime/blob/v0.38.3/sktime/forecasting/chronos.py#L180-L554): sktime source code
- [Chronos variations](https://huggingface.co/collections/amazon/chronos-models-and-datasets-65f1791d630a8d57cb718444): Hugging Face collection
- [Chronos source code](https://github.com/amazon-science/chronos-forecasting/blob/main/src/chronos/chronos.py)
- [Chronos Bolt source code](https://github.com/amazon-science/chronos-forecasting/blob/main/src/chronos/chronos_bolt.py)
- [Chronos notebook](https://github.com/amazon-science/chronos-forecasting/blob/main/notebooks/deploy-chronos-bolt-to-amazon-sagemaker.ipynb): deploying Chronos Bolt to Amazon SageMaker


**➡️ [Next notebook: 02-03_Forecasting_with_NHiTs_one_year_train](../notebooks/02-03_Forecasting_with_NHiTs_one_year_train.ipynb)**