# 2.1 Forecasting using MOIRAI Foundation Model and sktime

In this notebook we perform zero-shot predictions, with and without covariates, using the following foundation model for time series forecasting: [MOIRAIForecaster](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.moirai_forecaster.MOIRAIForecaster.html#rb64756c851d6-1)

## 1. MOIRAI
### 1.1. Features

The official [GitHub repository](https://github.com/SalesforceAIResearch/uni2ts?tab=readme-ov-file#-getting-started) states that:
- **Context Length**: any positive integer
- **Horizon Length**: any positive integer

Furthermore, section *D.4. Computation Costs* (page 23) of this [paper](https://arxiv.org/pdf/2402.02592) reports tests with a **Context Length** and **Prediction Length** of up to 5000 data points.

[Click here](https://huggingface.co/collections/sktime/moirai-variations-66ba3bc9f1dfeeafaed3b974) to check the available versions of the model. There you can see that:
- **Last Update**: Sept, 2024

In addition, the sktime implementation of the model has the following characteristics, which you can consult in the [sktime](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.moirai_forecaster.MOIRAIForecaster.html#rb64756c851d6-1) documentation or in the [Git repo](https://github.com/sktime/sktime/blob/main/sktime/forecasting/moirai_forecaster.py#L21-L653):
- **Univariate**: predicts a single dependent variable.
- **Covariates**: allows the inclusion of covariates per entry.

### 1.2. Errors
1. It does not let me work with a ```target_dim``` of less than 2. In our case there is only 1 target: ```pollen```.
2. If you change either of the parameters ```num_feat_dynamic_real``` or ```num_past_feat_dynamic_real``` from zero to any other value, it starts throwing errors that I don't know how to solve.

## 2. Environment configuration
1. **Open PyCharm, or your preferred IDE, with Admin privileges.**
2. Install a compatible version of ```python```. I installed python 3.10.11, not older, ...
3. Install a compatible version of ```cuda```. I installed cuda v12.4 not older v12.6, ...
4. Install a compatible version of ```pytorch```. I use Windows and installed it with the following command ([pytorch previous-versions](https://pytorch.org/get-started/previous-versions/#linux-and-windows)):
    ```
    pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu124
    ```

5. To **import the Python modules that we have created locally in the project** we do the following:

In [1]:
import sys
from pathlib import Path

# path to the notebook
notebook_path = Path().resolve()

# path to the project root (path to ppf)
project_path = notebook_path.parents[0]

# path to the directory that holds the project's source code (path to ppf/src/ppf/)
code_project_path = project_path / 'src' / 'ppf'

# insert the path at the front of sys.path so that local modules can be imported
sys.path.insert(0, str(code_project_path))

In [2]:
print("sys.path:")
for p in sys.path:
    print(" ", p)

sys.path:
  /home/michi/Workspaces/Python/ppf/src/ppf
  /home/michi/miniconda3/envs/ppf-chronos-manual/lib/python312.zip
  /home/michi/miniconda3/envs/ppf-chronos-manual/lib/python3.12
  /home/michi/miniconda3/envs/ppf-chronos-manual/lib/python3.12/lib-dynload
  
  /home/michi/miniconda3/envs/ppf-chronos-manual/lib/python3.12/site-packages
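
As a small illustration of the path arithmetic used in the first cell, the sketch below applies `parents[0]` and the `/` operator to a made-up location. The path `/home/user/ppf/notebooks` is an assumption for the example, not the real project path, and `PurePosixPath` is used so that nothing touches the file system.

```python
from pathlib import PurePosixPath

# Hypothetical location of the notebooks directory (illustrative only)
notebook_path = PurePosixPath("/home/user/ppf/notebooks")

# parents[0] is the immediate parent directory, i.e. the project root
project_path = notebook_path.parents[0]

# the "/" operator joins path components
code_project_path = project_path / "src" / "ppf"

print(project_path)       # /home/user/ppf
print(code_project_path)  # /home/user/ppf/src/ppf
```

Because the source directory is inserted at position 0 of `sys.path`, it appears first in the listing above and takes precedence over installed packages with the same module names.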


## 3. Obtain pollen predictions
### 3.1 Without covariates
#### 3.1.1 Constants

In [3]:
from datetime import date, timedelta
import pandas as pd

FIRST_YEAR = 1993
LAST_YEAR = 2023

# First and last day in the dataset
FIRST_DATE = date(FIRST_YEAR, 1, 1)
LAST_DATE = date(LAST_YEAR, 12, 31)

#### 3.1.2 Arguments

In [4]:
# 1 week of horizon
HORIZON_SIZE = 7

# 1 year of context
INPUT_SIZE = 365 

# 0 year of training (zero-shot, no fitting)
TRAIN_SIZE = 0

# benchmark for 2000, 2001, ..., 2021, 2022 (23 years)
START_YEAR = 2000
END_YEAR = 2022

# half a year of offset days
OFFSET_DAYS = -182

In [5]:
from non_leap_date_utils import calculate_non_leap_offset_dates, add_non_leap_days

# get all the required dates
START_TRAINING, _ = calculate_non_leap_offset_dates(START_YEAR, END_YEAR, INPUT_SIZE, TRAIN_SIZE, OFFSET_DAYS, HORIZON_SIZE)
START_DATE, END_DATE = calculate_non_leap_offset_dates(START_YEAR, END_YEAR, INPUT_SIZE, 0, OFFSET_DAYS, HORIZON_SIZE)
print(START_TRAINING)
print(START_DATE)
print(END_DATE)

if START_TRAINING < FIRST_DATE or LAST_DATE < END_DATE:
    print(f"Data earlier than {FIRST_DATE} or later than {LAST_DATE} cannot be retrieved from the dataset. Therefore a subset will be obtained.")
    amount_missing_data = (END_DATE - LAST_DATE).days
else:
    amount_missing_data = 0

1998-06-27
1998-06-27
2023-01-06
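
The range check above can be read as "how many requested days lie beyond the end of the dataset?". The following sketch isolates that arithmetic. The helper `missing_days_after` is hypothetical (it is not part of the project's modules), and it clamps the result at zero, which is a choice made for this example so that a request ending before the last available day never yields a negative count.

```python
from datetime import date


def missing_days_after(end_date: date, last_available: date) -> int:
    """Days requested beyond the last available date (0 if none)."""
    return max((end_date - last_available).days, 0)


# Values taken from the cells above: the request ends before LAST_DATE
print(missing_days_after(date(2023, 1, 6), date(2023, 12, 31)))  # 0

# Hypothetical dataset that ends one year earlier
print(missing_days_after(date(2023, 1, 6), date(2022, 12, 31)))  # 6
```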


In [6]:
import os

# Output directory for predictions
PREDICTIONS_DIR = "../outputs/predictions"
# Output directory for models runtime
TIMING_DIR = "../outputs/timing"
# Create the directory
os.makedirs(PREDICTIONS_DIR, exist_ok=True)
os.makedirs(TIMING_DIR, exist_ok=True)

#### 3.1.3 Get the pollen time series
1. Let's load the dataset and check that it has the "date" column:

In [7]:
# Load the dataset
df = pd.read_csv("../datasets/AlnusOurense9322.csv")

if "date" in df.columns:
    # Set the "date" column as the index
    df["date"] = pd.to_datetime(df["date"], format="%d-%m-%Y")
    df.set_index("date", inplace=True)
else:
    raise ValueError("Dataset must contain a 'date' column.")

2. Let's get the pollen time series (```en```, for endogenous) from the dataset over the required time period:

In [8]:
# Get the pollen pandas.Series, filtered to the required date range
en = df[(df.index >= pd.Timestamp(START_TRAINING)) & (df.index <= pd.Timestamp(END_DATE))]["pollen"]

# Force daily frequency
en = en.asfreq("D") 

ex = None
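
`asfreq("D")` places the series on a complete daily calendar, so any day that is absent from the CSV shows up as a missing value rather than being silently skipped. The sketch below mimics that effect in plain Python on made-up values. It only illustrates the idea and is not how pandas implements it.

```python
from datetime import date, timedelta

# Made-up observations with a gap on 2000-01-03
observed = {
    date(2000, 1, 1): 4.0,
    date(2000, 1, 2): 7.0,
    date(2000, 1, 4): 2.0,
}

start, end = min(observed), max(observed)
n_days = (end - start).days + 1

# Build the full daily calendar; missing days get None (pandas would use NaN)
daily = {}
for i in range(n_days):
    day = start + timedelta(days=i)
    daily[day] = observed.get(day)

for day, value in daily.items():
    print(day, value)
```

A regular daily index matters here because the context window and the forecasting horizon are both counted in steps, so a skipped day would shift every later position.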

#### 3.1.4 Specifying the forecasting horizon

We will forecast with a horizon of 1 week, that is, 7 daily steps ahead.

In [9]:
from sktime.forecasting.base import ForecastingHorizon
import numpy as np

fh = ForecastingHorizon(np.arange(1, HORIZON_SIZE+1), is_relative=True)
fh

ForecastingHorizon([1, 2, 3, 4, 5, 6, 7], dtype='int64', is_relative=True)
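
A relative horizon counts steps after the last observed point. For daily data, step `k` therefore refers to the cutoff date plus `k` days. A minimal sketch of that mapping, assuming a made-up cutoff of 2000-01-01:

```python
from datetime import date, timedelta

HORIZON_SIZE = 7
cutoff = date(2000, 1, 1)  # hypothetical last observed day

# Step k of a relative horizon refers to cutoff + k days for daily data
forecast_dates = [cutoff + timedelta(days=k) for k in range(1, HORIZON_SIZE + 1)]

print(forecast_dates[0], forecast_dates[-1])  # 2000-01-02 2000-01-08
```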

#### 3.1.5 Specifying the forecasting algorithm

To make forecasts, a forecasting algorithm needs to be specified. This is done using a scikit-learn-like interface. Most importantly, all sktime forecasters follow the same interface, so the preceding and remaining steps are the same, no matter which forecaster is being chosen.

We will work with all the available versions of **MOIRAI**: [MOIRAI versions](https://huggingface.co/collections/sktime/moirai-variations-66ba3bc9f1dfeeafaed3b974)

**We'll also be testing the deterministic and stochastic model configurations**.

In [10]:
from sktime.forecasting.moirai_forecaster import MOIRAIForecaster

# Moirai models
# Adding deterministic models
moirai_models = {
    f"moirai.deterministic.{size}": MOIRAIForecaster(
        checkpoint_path=f"sktime/moirai-1.0-R-{size}",
        context_length=INPUT_SIZE,
        patch_size=32,
        #num_samples=100,
        #num_feat_dynamic_real=3,
        #num_past_feat_dynamic_real=0,
        map_location=None,
        #target_dim= 1,
        broadcasting=False,
        deterministic=True,
        batch_size=32,
        use_source_package=False
    )
    for size in ["small", "base", "large"]
}

# Adding stochastic models
moirai_models.update({
    f"moirai.stochastic.{size}": MOIRAIForecaster(
        checkpoint_path=f"sktime/moirai-1.0-R-{size}",
        context_length=INPUT_SIZE,
        patch_size=32,
        # num_samples=100,
        #num_feat_dynamic_real=3,
        #num_past_feat_dynamic_real=0,
        map_location=None,
        #target_dim=1,
        broadcasting=False,
        deterministic=False,
        batch_size=32,
        use_source_package=False
    )
    for size in ["small", "base", "large"]
})
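
The two dictionary comprehensions above differ only in the `deterministic` flag. The sketch below enumerates the same six names and checkpoint identifiers in a single comprehension. Plain dictionaries stand in for the `MOIRAIForecaster` instances (an assumption made so that the example runs without downloading any weights).

```python
from itertools import product

modes = {"deterministic": True, "stochastic": False}
sizes = ["small", "base", "large"]

# One entry per (mode, size) pair; each value holds the settings that differ
configs = {
    f"moirai.{mode}.{size}": {
        "checkpoint_path": f"sktime/moirai-1.0-R-{size}",
        "deterministic": flag,
    }
    for (mode, flag), size in product(modes.items(), sizes)
}

for name, cfg in configs.items():
    print(name, cfg)
```

The same pattern could build the real forecasters by passing the shared keyword arguments once and unpacking the per-model settings.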

#### 3.1.6 Defining the splitter

In [11]:
from sktime.forecasting.model_selection import SlidingWindowSplitter

cv = SlidingWindowSplitter(fh=fh, window_length=INPUT_SIZE , step_length=1, start_with_window=True)
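
To make the sliding-window idea concrete, the sketch below generates train and test positions for a toy series. `sliding_window_splits` is a simplified stand-in written for this example, not sktime's implementation. It assumes the first training window starts at position 0 and that each test position is the window end plus a relative horizon step.

```python
def sliding_window_splits(n_obs, window_length, fh, step_length=1):
    """Yield (train_indices, test_indices) pairs for a sliding window."""
    max_step = max(fh)
    start = 0
    # Stop once the furthest horizon step would fall outside the series
    while start + window_length + max_step <= n_obs:
        end = start + window_length  # exclusive end of the training window
        train = list(range(start, end))
        test = [end + k - 1 for k in fh]
        yield train, test
        start += step_length


# Toy sizes instead of INPUT_SIZE=365 and HORIZON_SIZE=7
splits = list(sliding_window_splits(n_obs=10, window_length=4, fh=[1, 2, 3]))

print(len(splits))  # 4
print(splits[0])    # ([0, 1, 2, 3], [4, 5, 6])
print(splits[-1])   # ([3, 4, 5, 6], [7, 8, 9])
```

Under these assumptions, a step length of 1 gives `n_obs - window_length - max(fh) + 1` windows, so every additional day in the series adds one more 7-day forecast to evaluate.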

#### 3.1.7 Requesting pollen forecasts

In [12]:
from get_predictions import predict

for name, model in moirai_models.items():    
    result, fit_time, predict_time = predict(
        model=model,
        model_name=name,
        start_year = START_YEAR,
        end_year = END_YEAR,
        y=en,
        X=None,      
        input_size=INPUT_SIZE,
        train_size=TRAIN_SIZE, 
        offset_days = OFFSET_DAYS,
        splitter=cv,
        start_date = START_DATE,
        amount_missing_data=amount_missing_data,
        predictions_dir=PREDICTIONS_DIR,
        timing_dir=TIMING_DIR
    )

Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00


Saved predictions to: ../outputs/predictions/moirai.deterministic.small_0_365_7_without_covariates.csv


Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00


Saved predictions to: ../outputs/predictions/moirai.deterministic.base_0_365_7_without_covariates.csv


Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00


Saved predictions to: ../outputs/predictions/moirai.deterministic.large_0_365_7_without_covariates.csv


Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00


Saved predictions to: ../outputs/predictions/moirai.stochastic.small_0_365_7_without_covariates.csv


Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00


Saved predictions to: ../outputs/predictions/moirai.stochastic.base_0_365_7_without_covariates.csv


Predicting: 1998-06-27 00:00:00 - 2023-01-06 00:00:00


Saved predictions to: ../outputs/predictions/moirai.stochastic.large_0_365_7_without_covariates.csv
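
The saved file names appear to follow the pattern `<model_name>_<train_size>_<input_size>_<horizon_size>_without_covariates.csv`. This reading is inferred from the output above, not taken from the source of `predict`. The helper below is hypothetical and only reproduces that pattern; the `with_covariates` suffix is likewise an assumption.

```python
def prediction_filename(model_name, train_size, input_size, horizon_size,
                        with_covariates=False):
    """Build a file name following the pattern seen in the output above."""
    suffix = "with_covariates" if with_covariates else "without_covariates"
    return f"{model_name}_{train_size}_{input_size}_{horizon_size}_{suffix}.csv"


# TRAIN_SIZE=0, INPUT_SIZE=365, HORIZON_SIZE=7 as set in the arguments cell
print(prediction_filename("moirai.deterministic.small", 0, 365, 7))
```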


## References:

- [Moirai Adapter](https://github.com/sktime/sktime/blob/main/sktime/forecasting/moirai_forecaster.py#L21-L653) Source Code used by Sktime
- [MoiraiForecaster](https://www.sktime.net/en/latest/api_reference/auto_generated/sktime.forecasting.moirai_forecaster.MOIRAIForecaster.html#rb64756c851d6-1) Sktime API
- [Huggingface Moirai Variations](https://huggingface.co/collections/sktime/moirai-variations-66ba3bc9f1dfeeafaed3b974) 
- [Moirai Paper](https://arxiv.org/pdf/2402.02592)
- [Official GitHub Repo](https://github.com/SalesforceAIResearch/uni2ts?tab=readme-ov-file#-getting-started)

**➡️ [Next notebook: 02-02_Forecasting_with_Chronos](../notebooks/02-02_Forecasting_with_Chronos.ipynb)**