# Quickstart - Run TS Orchestra on GIFT-Eval

This notebook shows how to run TS Orchestra on the GIFT-Eval benchmark.

Make sure you have downloaded the GIFT-Eval benchmark and set the `GIFT_EVAL` environment variable to its location before running this notebook.
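A quick pre-flight check can catch a missing or mistyped path before any model is loaded. The sketch below assumes the variable is named `GIFT_EVAL` (environment variable names cannot contain hyphens) and that it points at a directory. The helper `resolve_benchmark_dir` is a hypothetical name for illustration and is not part of `gift_eval` or TS Orchestra.

```python
import os
from pathlib import Path


def resolve_benchmark_dir(var_name: str = "GIFT_EVAL") -> Path:
    """Return the directory named by an environment variable, or raise.

    Hypothetical helper: it only checks that the variable is set and that
    the path exists; it does not validate the benchmark's contents.
    """
    value = os.environ.get(var_name)
    if not value:
        raise RuntimeError(f"Environment variable {var_name} is not set.")
    path = Path(value).expanduser()
    if not path.is_dir():
        raise RuntimeError(f"{var_name}={value!r} is not an existing directory.")
    return path
```

Call `resolve_benchmark_dir()` after `load_dotenv()` if the variable is defined in a `.env` file.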

We will use the `Dataset` class to load the data and run the model. If you have not already done so, check out the [dataset.ipynb](./dataset.ipynb) notebook to learn more about the `Dataset` class. For brevity, we run the model on a single dataset (`ett1/H`, short term only), but feel free to run it on any dataset by changing the `short_datasets` and `med_long_datasets` variables below.

## Setting up TS Orchestra

Clone the TS Orchestra repository and add it to the Python path.

In [5]:
# TODO: Replace this url with the real TS Orchestra repository when it is made public
# Clone into ./ts-orchestra so the paths used below resolve
# !git clone https://github.com/mpg05883/Private-TS-Orchestra.git ts-orchestra

# !cd ./ts-orchestra && pip install -e .
# !pip install dotted_dict tabulate timecopilot

In [6]:
import sys
import os

# Add the ts-orchestra subdirectory to the path
sys.path.append(os.path.abspath("ts-orchestra"))
sys.path.append(os.path.abspath("ts-orchestra/src"))
print("Added ts-orchestra to Python path")

import tsorchestra
print(f"Imported tsorchestra from: {tsorchestra.__file__}")

Added ts-orchestra to Python path


 See https://github.com/google-research/timesfm/blob/master/README.md for updated APIs.
Imported tsorchestra from: /projects/bcqc/mgee2/gift-eval/ts-orchestra/src/tsorchestra/__init__.py


Specify the datasets to evaluate TS Orchestra on.

In [7]:
import json
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# short_datasets = "m4_yearly m4_quarterly m4_monthly m4_weekly m4_daily m4_hourly electricity/15T electricity/H electricity/D electricity/W solar/10T solar/H solar/D solar/W hospital covid_deaths us_births/D us_births/M us_births/W saugeenday/D saugeenday/M saugeenday/W temperature_rain_with_missing kdd_cup_2018_with_missing/H kdd_cup_2018_with_missing/D car_parts_with_missing restaurant hierarchical_sales/D hierarchical_sales/W LOOP_SEATTLE/5T LOOP_SEATTLE/H LOOP_SEATTLE/D SZ_TAXI/15T SZ_TAXI/H M_DENSE/H M_DENSE/D ett1/15T ett1/H ett1/D ett1/W ett2/15T ett2/H ett2/D ett2/W jena_weather/10T jena_weather/H jena_weather/D bitbrains_fast_storage/5T bitbrains_fast_storage/H bitbrains_rnd/5T bitbrains_rnd/H bizitobs_application bizitobs_service bizitobs_l2c/5T bizitobs_l2c/H"
short_datasets = "ett1/H"

# med_long_datasets = "electricity/15T electricity/H solar/10T solar/H kdd_cup_2018_with_missing/H LOOP_SEATTLE/5T LOOP_SEATTLE/H SZ_TAXI/15T M_DENSE/H ett1/15T ett1/H ett2/15T ett2/H jena_weather/10T jena_weather/H bitbrains_fast_storage/5T bitbrains_rnd/5T bizitobs_application bizitobs_service bizitobs_l2c/5T bizitobs_l2c/H"
med_long_datasets = ""

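# Get the union of short and med_long datasets, preserving their listed order
all_datasets = list(dict.fromkeys(short_datasets.split() + med_long_datasets.split()))

# Load per-dataset metadata (frequency, domain, number of variates)
with open("dataset_properties.json") as f:
    dataset_properties_map = json.load(f)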

Instantiate the metrics to use during evaluation.

In [8]:
from gluonts.ev.metrics import (
    MSE,
    MAE,
    MASE,
    MAPE,
    SMAPE,
    MSIS,
    RMSE,
    NRMSE,
    ND,
    MeanWeightedSumQuantileLoss,
)

metrics = [
    MSE(forecast_type="mean"),
    MSE(forecast_type=0.5),
    MAE(),
    MASE(),
    MAPE(),
    SMAPE(),
    MSIS(),
    RMSE(),
    NRMSE(),
    ND(),
    MeanWeightedSumQuantileLoss(
        quantile_levels=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
    ),
]
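To make the last metric concrete, here is a plain-Python sketch of a mean weighted quantile loss as it is commonly defined in the forecasting literature: the pinball loss at each quantile level, scaled by 2, summed over all points, divided by the sum of absolute actuals, then averaged over the quantile levels. This is an illustration under that stated definition, not GluonTS's implementation, and the function names are hypothetical.

```python
def quantile_loss(actual, predicted, q):
    """Summed pinball loss at quantile level q, scaled by 2."""
    return sum(
        2.0 * abs((y - f) * ((1.0 if y <= f else 0.0) - q))
        for y, f in zip(actual, predicted)
    )


def mean_weighted_quantile_loss(actual, quantile_forecasts):
    """Average the scale-normalised quantile loss over all quantile levels.

    quantile_forecasts maps each quantile level to a list of forecasts
    with the same length as `actual`.
    """
    scale = sum(abs(y) for y in actual)
    losses = [
        quantile_loss(actual, forecast, q) / scale
        for q, forecast in quantile_forecasts.items()
    ]
    return sum(losses) / len(losses)


# Toy data (made up for illustration only)
actual = [10.0, 20.0, 30.0]
quantile_forecasts = {
    0.1: [8.0, 18.0, 28.0],
    0.5: [10.0, 20.0, 30.0],
    0.9: [12.0, 22.0, 32.0],
}
loss = mean_weighted_quantile_loss(actual, quantile_forecasts)
```

With the factor of 2, the loss at quantile level 0.5 reduces to the absolute error, which is why median-based metrics appear with a `[0.5]` suffix.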

Create the ensemble.

In [9]:
from tsorchestra.models.foundation import (
    Moirai,
    Sundial,
    Toto,
    Chronos,
    FlowState,
    TimesFM,
)
from tsorchestra.models.ensembles import SLSQPEnsemble

batch_size = 32

# Metric to optimize during cross-validation
metric = "mae"

# Load models
models = [
    Moirai(batch_size=batch_size),
    Sundial(batch_size=batch_size),
    Toto(batch_size=batch_size),
    Chronos(batch_size=batch_size),
    FlowState(batch_size=batch_size),
    TimesFM(batch_size=batch_size),
]

# Use models to create ensemble forecaster
forecaster = SLSQPEnsemble(
    models=models,
    metric=metric,
)

INFO:p-1649965:t-139854902654784:slsqp.py:__init__:[SLSQPEnsemble] Initializing ensemble with 6 models (Chronos, FlowState, Moirai, Sundial, TimesFM, Toto), metric=mae, n_windows=1
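The internals of `SLSQPEnsemble` are not shown in this notebook. As a rough intuition for what an SLSQP-based ensemble might do, the sketch below fits non-negative weights that sum to one by minimising the MAE of the weighted forecast with SciPy's `minimize(method="SLSQP")`. The constraints, the starting point, and the function name `fit_ensemble_weights` are assumptions made for illustration, not TS Orchestra's actual code.

```python
import numpy as np
from scipy.optimize import minimize


def fit_ensemble_weights(forecasts: np.ndarray, actual: np.ndarray) -> np.ndarray:
    """Find convex combination weights that minimise MAE (illustrative only).

    forecasts has shape (n_models, n_points); actual has shape (n_points,).
    """
    n_models = forecasts.shape[0]

    def mae(weights: np.ndarray) -> float:
        return float(np.mean(np.abs(weights @ forecasts - actual)))

    result = minimize(
        mae,
        x0=np.full(n_models, 1.0 / n_models),  # start from equal weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_models,  # each weight stays in [0, 1]
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return result.x


# Synthetic example: three "models" with increasing noise around the truth
rng = np.random.default_rng(0)
actual = np.sin(np.linspace(0.0, 6.0, 50))
forecasts = np.stack([actual + rng.normal(0.0, s, 50) for s in (0.1, 0.5, 1.0)])
weights = fit_ensemble_weights(forecasts, actual)
```

In a cross-validated setup, `forecasts` and `actual` would come from held-out windows, and the fitted weights would then be applied to the final forecasts.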


Evaluate TS Orchestra.

In [10]:
import csv

from gift_eval.data import Dataset
from gluonts.model import evaluate_model
from gluonts.time_feature import get_seasonality
from pytorch_lightning import seed_everything
from tsorchestra.models.common.gluonts_predictor import GluonTSPredictor

# Set the seed for reproducibility
seed_everything(42, workers=True, verbose=True)


pretty_names = {
    "saugeenday": "saugeen",
    "temperature_rain_with_missing": "temperature_rain",
    "kdd_cup_2018_with_missing": "kdd_cup_2018",
    "car_parts_with_missing": "car_parts",
}


model_name = "tsorchestra"

# set the output directory and CSV file path
output_dir = f"../results/{model_name}"
os.makedirs(output_dir, exist_ok=True)
csv_file_path = os.path.join(output_dir, "all_results.csv")

completed_datasets = set()
# 1. Check if the results file exists and read the completed datasets to allow resuming
if os.path.exists(csv_file_path):
    print(f"'{csv_file_path}' exists. Reading completed datasets...")
    with open(csv_file_path, "r", newline="") as csvfile:
        reader = csv.reader(csvfile)
        next(reader, None)  # Skip the header row (no error if the file is empty)
        for row in reader:
            if row:
                completed_datasets.add(row[0])
    print(f"Found {len(completed_datasets)} completed datasets.")

# 2. If the file doesn't exist, create it and write the header
else:
    with open(csv_file_path, "w", newline="") as csvfile:
        writer = csv.writer(csvfile)

        # Write the header
        writer.writerow(
            [
                "dataset",
                "model",
                "eval_metrics/MSE[mean]",
                "eval_metrics/MSE[0.5]",
                "eval_metrics/MAE[0.5]",
                "eval_metrics/MASE[0.5]",
                "eval_metrics/MAPE[0.5]",
                "eval_metrics/sMAPE[0.5]",
                "eval_metrics/MSIS",
                "eval_metrics/RMSE[mean]",
                "eval_metrics/NRMSE[mean]",
                "eval_metrics/ND[0.5]",
                "eval_metrics/mean_weighted_sum_quantile_loss",
                "domain",
                "num_variates",
            ]
        )

for ds_num, ds_name in enumerate(all_datasets):
    ds_key = ds_name.split("/")[0]
    print(f"Processing dataset: {ds_name} ({ds_num + 1} of {len(all_datasets)})")
    terms = ["short", "medium", "long"]
    for term in terms:
        if (
            term == "medium" or term == "long"
        ) and ds_name not in med_long_datasets.split():
            continue

        if "/" in ds_name:
            ds_key = ds_name.split("/")[0]
            ds_freq = ds_name.split("/")[1]
            ds_key = ds_key.lower()
            ds_key = pretty_names.get(ds_key, ds_key)
        else:
            ds_key = ds_name.lower()
            ds_key = pretty_names.get(ds_key, ds_key)
            ds_freq = dataset_properties_map[ds_key]["frequency"]
        ds_config = f"{ds_key}/{ds_freq}/{term}"

        if ds_config in completed_datasets:
            print(f"Skipping already completed dataset: {ds_config}")
            continue

        # Initialize the dataset; split multivariate targets into univariate series
        to_univariate = (
            Dataset(name=ds_name, term=term, to_univariate=False).target_dim != 1
        )
        dataset = Dataset(name=ds_name, term=term, to_univariate=to_univariate)
        season_length = get_seasonality(dataset.freq)
        print(f"Dataset size: {len(dataset.test_data)}")

        predictor = GluonTSPredictor(forecaster=forecaster)

        # Evaluate the ensemble on the test split
        res = evaluate_model(
            predictor,
            test_data=dataset.test_data,
            metrics=metrics,
            axis=None,
            mask_invalid_label=True,
            allow_nan_forecast=False,
            seasonality=season_length,
        )

        # Append the results to the CSV file
        with open(csv_file_path, "a", newline="") as csvfile:
            writer = csv.writer(csvfile)
            writer.writerow(
                [
                    ds_config,
                    model_name,
                    res["MSE[mean]"].iloc[0],
                    res["MSE[0.5]"].iloc[0],
                    res["MAE[0.5]"].iloc[0],
                    res["MASE[0.5]"].iloc[0],
                    res["MAPE[0.5]"].iloc[0],
                    res["sMAPE[0.5]"].iloc[0],
                    res["MSIS"].iloc[0],
                    res["RMSE[mean]"].iloc[0],
                    res["NRMSE[mean]"].iloc[0],
                    res["ND[0.5]"].iloc[0],
                    res["mean_weighted_sum_quantile_loss"].iloc[0],
                    dataset_properties_map[ds_key]["domain"],
                    dataset_properties_map[ds_key]["num_variates"],
                ]
            )

        print(f"Results for {ds_name} have been written to {csv_file_path}")
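The loop above identifies each run by a key of the form `<dataset>/<frequency>/<term>`. The same naming logic can be pulled into a small pure function, which makes it easy to check without loading any data. The helper name `build_config_key` and the frequency value in the example properties map are hypothetical and shown only for illustration.

```python
PRETTY_NAMES = {
    "saugeenday": "saugeen",
    "temperature_rain_with_missing": "temperature_rain",
    "kdd_cup_2018_with_missing": "kdd_cup_2018",
    "car_parts_with_missing": "car_parts",
}


def build_config_key(ds_name, term, properties, pretty_names=PRETTY_NAMES):
    """Build the '<dataset>/<frequency>/<term>' key used in the results CSV."""
    if "/" in ds_name:
        # The frequency is encoded in the dataset name, e.g. "ett1/H"
        raw_key, ds_freq = ds_name.split("/", 1)
        ds_key = pretty_names.get(raw_key.lower(), raw_key.lower())
    else:
        # Otherwise, look the frequency up in the dataset properties
        ds_key = pretty_names.get(ds_name.lower(), ds_name.lower())
        ds_freq = properties[ds_key]["frequency"]
    return f"{ds_key}/{ds_freq}/{term}"


# Example properties map with a placeholder frequency value
example_properties = {"car_parts": {"frequency": "M"}}
key_with_freq = build_config_key("ett1/H", "short", example_properties)
key_from_map = build_config_key("car_parts_with_missing", "short", example_properties)
```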

INFO:p-1649965:t-139854902654784:seed.py:seed_everything:[rank: 0] Seed set to 42


Processing dataset: ett1/H (1 of 1)
Dataset size: 140


[GluonTSPredictor] Predicting: 100%|██████████| 140/140 [00:00<00:00, 110128.01series/s]
[Moirai] Cross-validating:   0%|          | 0/1 [00:00<?, ?window/s]INFO:p-1649965:t-139854902654784:pandas.py:from_long_dataframe:Indexing data by 'ds'.
INFO:p-1649965:t-139854902654784:pandas.py:from_long_dataframe:Grouping data by 'unique_id'; this may take some time.
INFO:p-1649965:t-139854902654784:forecast_generator.py:log_once:Forecast is not sample based. Ignoring parameter `num_samples` from predict method.
140it [00:00, 180.44it/s]
[Moirai] Cross-validating: 100%|██████████| 1/1 [00:02<00:00,  2.15s/window]
100%|██████████| 5/5 [00:01<00:00,  4.45it/s]
[Sundial] Cross-validating: 100%|██████████| 1/1 [00:04<00:00,  4.73s/window]
100%|██████████| 5/5 [00:29<00:00,  5.86s/it] [00:00<?, ?window/s]
[Toto] Cross-validating: 100%|██████████| 1/1 [00:32<00:00, 32.56s/window]
100%|██████████| 5/5 [00:20<00:00,  4.05s/it]
[Chronos] Cross-validating: 100%|██████████| 1/1 [00:23<00:00, 23.04s/window

Optimization terminated successfully    (Exit mode 0)
            Current function value: 4.735700420273391
            Iterations: 9
            Function evaluations: 64
            Gradient evaluations: 9


140it [00:00, 231.04it/s]
100%|██████████| 5/5 [00:00<00:00,  6.90it/s]
100%|██████████| 5/5 [00:29<00:00,  5.86s/it]
100%|██████████| 5/5 [00:18<00:00,  3.61s/it]
INFO:p-1649965:t-139854902654784:modeling_flowstate.py:__init__:Number of encoder parameters: 7885.8240000000005k
INFO:p-1649965:t-139854902654784:modeling_flowstate.py:__init__:Number of dencoder parameters: 1181.952k (14.99%)
100%|██████████| 5/5 [00:00<00:00,  5.40it/s]
INFO:p-1649965:t-139854902654784:timesfm_2p5_torch.py:_from_pretrained:Downloading checkpoint from Hugging Face repo google/timesfm-2.5-200m-pytorch
INFO:p-1649965:t-139854902654784:timesfm_2p5_torch.py:_from_pretrained:Loading checkpoint from: /u/mgee2/.cache/huggingface/hub/models--google--timesfm-2.5-200m-pytorch/snapshots/1d952420fba87f3c6dee4f240de0f1a0fbc790e3/model.safetensors
INFO:p-1649965:t-139854902654784:timesfm_2p5_torch.py:compile:When compiling, max horizon needs to be multiple of the output patch size 128. Using max horizon = 128 instead.
1

Results for ett1/H have been written to ../results/tsorchestra/all_results.csv
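Because one row is appended per completed configuration, rerunning the evaluation cell picks up where it left off. The resume check can be sketched as a standalone function that returns the set of keys already present in the first column of the results file. The name `read_completed` is hypothetical.

```python
import csv
import os


def read_completed(csv_path: str) -> set[str]:
    """Return the dataset keys already recorded in the results CSV."""
    if not os.path.exists(csv_path):
        return set()
    with open(csv_path, newline="") as csvfile:
        reader = csv.reader(csvfile)
        next(reader, None)  # skip the header; tolerate an empty file
        return {row[0] for row in reader if row}
```

A configuration is skipped when its key, for example `ett1/H/short`, is in the returned set. To force a rerun, delete the corresponding row or the whole file.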





In [11]:
import pandas as pd

df = pd.read_csv(f"../results/{model_name}/all_results.csv")
df

| | dataset | model | eval_metrics/MSE[mean] | eval_metrics/MSE[0.5] | eval_metrics/MAE[0.5] | eval_metrics/MASE[0.5] | eval_metrics/MAPE[0.5] | eval_metrics/sMAPE[0.5] | eval_metrics/MSIS | eval_metrics/RMSE[mean] | eval_metrics/NRMSE[mean] | eval_metrics/ND[0.5] | eval_metrics/mean_weighted_sum_quantile_loss | domain | num_variates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ett1/H/short | tsorchestra | 92.422526 | 92.422526 | 4.757038 | 0.792472 | 0.45415 | 0.250998 | 5.452635 | 9.613664 | 0.448709 | 0.22203 | 0.170857 | Energy | 7 |
