# Running PatchTST-FM on the GIFT-Eval benchmark
**The following notebook is only intended to reproduce GIFT-Eval results.**
Make sure you download the GIFT-Eval benchmark and set the `GIFT_EVAL` environment variable correctly before running this notebook.
We will use the `Dataset` class to load the data and run the model. If you have not already done so, please check out the [dataset.ipynb](./dataset.ipynb) notebook to learn more about the `Dataset` class. For brevity, we run the model on only two datasets, but feel free to run on any dataset by changing the `short_datasets` and `med_long_datasets` variables below.
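
Before loading any data, it can help to confirm that `GIFT_EVAL` actually points at the downloaded benchmark. The following is a minimal sketch: the helper name `resolve_gift_eval_dir` is hypothetical and not part of the gift-eval package, and it only checks that the directory exists, not that its contents are complete.

```python
import os
from pathlib import Path


def resolve_gift_eval_dir(env=None):
    """Return the benchmark directory named by GIFT_EVAL, or raise a clear error.

    Hypothetical helper for illustration; not part of gift-eval.
    """
    env = os.environ if env is None else env
    raw = env.get("GIFT_EVAL")
    if not raw:
        raise RuntimeError("GIFT_EVAL is not set; export it or add it to your .env file.")
    path = Path(raw).expanduser()
    if not path.is_dir():
        raise RuntimeError(f"GIFT_EVAL points to {path}, which is not a directory.")
    return path
```

Calling `resolve_gift_eval_dir()` right after `load_dotenv()` would surface a misconfigured path before any model weights are downloaded.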

Please note: The submitted GIFT-Eval results were generated using the following hardware and library versions:
- GPU Model: NVIDIA RTX Pro 6000 Blackwell 96GB
- CUDA Driver: 570.270, CUDA 12.8
- torch: 2.8.0+cu129
- transformers: 4.53.3

Similar results were also obtained when using an NVIDIA A100-SXM4 80GB GPU.
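
To compare a local setup against the versions listed above, the installed library versions can be read with the standard library alone. This is a minimal sketch: the function name `installed_versions` is illustrative, and the distribution names passed to it are assumed to match the names used by pip.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["torch", "transformers"]).items():
        print(f"{name}: {ver or 'not installed'}")
```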

## TSFM Installation
1. Clone the [GIFT-Eval repository](https://github.com/SalesforceAIResearch/gift-eval).
1. Follow the instructions to set up the GIFT-Eval environment as described [here](https://github.com/SalesforceAIResearch/gift-eval?tab=readme-ov-file#installation).
1. This notebook should be placed in the `notebooks` folder of the cloned repository.
1. Follow the instructions below to install TSFM.

### Installing `granite-tsfm`
The source code will be installed from the [granite-tsfm repository](https://github.com/ibm-granite/granite-tsfm).
Run the following code once to install granite-tsfm in your working Python environment.

In [None]:
import os


if not os.path.exists("granite-tsfm"):
    ! git clone --branch patchtst-fm git@github.com:ibm-granite/granite-tsfm.git
    %cd granite-tsfm
    ! pwd
    # Install the checked-out branch with the `notebooks` extras
    ! pip install ".[notebooks]"
    %cd ..
else:
    print("Folder 'granite-tsfm' already exists. Skipping git clone.")

## Imports

In [None]:
import json
import logging
import os
import sys

import pandas as pd
import torch
from dotenv import load_dotenv
from gift_eval.data import Dataset
from gluonts.ev.metrics import (
    MAE,
    MAPE,
    MASE,
    MSE,
    MSIS,
    ND,
    NRMSE,
    RMSE,
    SMAPE,
    MeanWeightedSumQuantileLoss,
)
from gluonts.model import evaluate_forecasts
from gluonts.time_feature import get_seasonality, norm_freq_str

from tsfm_public import PatchTSTFMForPrediction


load_dotenv()

### Update the python path to include the custom eval predictor

In [None]:
sys.path.append("./granite-tsfm/notebooks/hfdemo/patchtst_fm/")

from patchtst_fm_predictor import PatchTSTFMEvalPredictor

## Set up loggers

In [None]:
class WarningFilter(logging.Filter):
    def __init__(self, text_to_filter):
        super().__init__()
        self.text_to_filter = text_to_filter

    def filter(self, record):
        return self.text_to_filter not in record.getMessage()


logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
gts_logger = logging.getLogger("gluonts.model.forecast")
gts_logger.addFilter(
    WarningFilter("The mean prediction is not stored in the forecast data")
)

## Dataset and metrics configuration

In [None]:
# short_datasets = "m4_yearly m4_quarterly m4_monthly m4_weekly m4_daily m4_hourly electricity/15T electricity/H electricity/D electricity/W solar/10T solar/H solar/D solar/W hospital covid_deaths us_births/D us_births/M us_births/W saugeenday/D saugeenday/M saugeenday/W temperature_rain_with_missing kdd_cup_2018_with_missing/H kdd_cup_2018_with_missing/D car_parts_with_missing restaurant hierarchical_sales/D hierarchical_sales/W LOOP_SEATTLE/5T LOOP_SEATTLE/H LOOP_SEATTLE/D SZ_TAXI/15T SZ_TAXI/H M_DENSE/H M_DENSE/D ett1/15T ett1/H ett1/D ett1/W ett2/15T ett2/H ett2/D ett2/W jena_weather/10T jena_weather/H jena_weather/D bitbrains_fast_storage/5T bitbrains_fast_storage/H bitbrains_rnd/5T bitbrains_rnd/H bizitobs_application bizitobs_service bizitobs_l2c/5T bizitobs_l2c/H"
# med_long_datasets = "electricity/15T electricity/H solar/10T solar/H kdd_cup_2018_with_missing/H LOOP_SEATTLE/5T LOOP_SEATTLE/H SZ_TAXI/15T M_DENSE/H ett1/15T ett1/H ett2/15T ett2/H jena_weather/10T jena_weather/H bitbrains_fast_storage/5T bitbrains_rnd/5T bizitobs_application bizitobs_service bizitobs_l2c/5T bizitobs_l2c/H"
short_datasets = "m4_weekly"
med_long_datasets = "bizitobs_l2c/H"

all_datasets = list(set(short_datasets.split() + med_long_datasets.split()))
pretty_names = {
    "saugeenday": "saugeen",
    "temperature_rain_with_missing": "temperature_rain",
    "kdd_cup_2018_with_missing": "kdd_cup_2018",
    "car_parts_with_missing": "car_parts",
}

# Load the dataset properties, closing the file handle when done
with open("dataset_properties.json") as f:
    dataset_properties_map = json.load(f)

# Instantiate the metrics
metrics = [
    MSE(forecast_type="mean"),
    MSE(forecast_type=0.5),
    MAE(),
    MASE(),
    MAPE(),
    SMAPE(),
    MSIS(),
    RMSE(),
    NRMSE(),
    ND(),
    MeanWeightedSumQuantileLoss(
        quantile_levels=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
    ),
]


model_name = "PatchTST-FM-r1"
ckpt_path = "ibm-research/patchtst-fm-r1"
output_dir = f"../results/{model_name}"

# Define the path for the results CSV file and make sure its folder exists
os.makedirs(output_dir, exist_ok=True)
csv_file_path = os.path.join(output_dir, "all_results.csv")

## Evaluation


Now that we have imported the tsfm-specific classes and configured the metrics and output paths, we will evaluate a PatchTST-FM model on the GIFT-Eval benchmark datasets.
Below, we load the PatchTST-FM zero-shot model and then iterate through all the datasets, evaluating each with the `PatchTSTFMEvalPredictor`.
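
The iteration pairs every dataset with the `short` term and adds `medium` and `long` only for datasets listed in `med_long_datasets`. A minimal sketch of that selection rule follows. It uses a hypothetical helper, `planned_runs`, which is not part of the notebook, and it sorts the names for a reproducible order.

```python
def planned_runs(short_datasets, med_long_datasets):
    """List the (dataset, term) pairs the evaluation loop would visit.

    Both arguments are whitespace-separated dataset names, as in the
    configuration cell. Names are sorted here for a reproducible order.
    """
    med_long = set(med_long_datasets.split())
    names = sorted(set(short_datasets.split()) | med_long)
    runs = []
    for name in names:
        for term in ("short", "medium", "long"):
            # Medium and long terms apply only to the med/long dataset list
            if term != "short" and name not in med_long:
                continue
            runs.append((name, term))
    return runs


print(planned_runs("m4_weekly", "bizitobs_l2c/H"))
```

With the two default datasets this yields four configurations: three terms for `bizitobs_l2c/H` and one for `m4_weekly`.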

We are going to follow the naming conventions explained in the
[README](../README.md) file to store the results in a CSV file
called `all_results.csv` under the `results/PatchTST-FM-r1` folder.

The first column in the CSV file is the dataset config name, which
combines the dataset name, frequency, and term:

```python
f"{dataset_name}/{freq}/{term}"
```
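
As a rough illustration of how that name is assembled: a name such as `saugeenday/D` carries its own frequency, while a name without a slash takes its frequency from `dataset_properties.json`. The helper `make_config_name` and the `example_properties` dictionary below are illustrative stand-ins, since the real values come from the JSON file shipped with GIFT-Eval.

```python
def make_config_name(ds_name, term, pretty_names, properties):
    """Build "{dataset_name}/{freq}/{term}" from a dataset name and a term."""
    if "/" in ds_name:
        # The frequency is embedded in the dataset name
        key, freq = ds_name.split("/", 1)
        key = pretty_names.get(key.lower(), key.lower())
    else:
        # The frequency is looked up in the properties mapping
        key = pretty_names.get(ds_name.lower(), ds_name.lower())
        freq = properties[key]["frequency"]
    return f"{key}/{freq}/{term}"


example_pretty_names = {"saugeenday": "saugeen"}
# Assumed entry for illustration only; the real file may differ.
example_properties = {"m4_weekly": {"frequency": "W"}}

print(make_config_name("saugeenday/D", "short", example_pretty_names, example_properties))
print(make_config_name("m4_weekly", "short", example_pretty_names, example_properties))
```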

### Load model

In [None]:
logger.info(f"Loading model from {ckpt_path}")
# Prefer CUDA, then Apple MPS, then fall back to CPU
device = (
    "cuda"
    if torch.cuda.is_available()
    else ("mps" if torch.backends.mps.is_available() else "cpu")
)
model = PatchTSTFMForPrediction.from_pretrained(ckpt_path, device_map=device)

### Iterate through datasets

In [None]:
all_results = []
for ds_num, ds_name in enumerate(all_datasets):
    ds_key = ds_name.split("/")[0]
    logger.info(f"Processing dataset: {ds_name} ({ds_num + 1} of {len(all_datasets)})")
    terms = ["short", "medium", "long"]

    for term in terms:
        # Medium and long terms are only evaluated for the med/long dataset list
        if term in ("medium", "long") and ds_name not in med_long_datasets.split():
            continue

        if "/" in ds_name:
            ds_key = ds_name.split("/")[0]
            ds_freq = ds_name.split("/")[1]
            ds_key = ds_key.lower()
            ds_key = pretty_names.get(ds_key, ds_key)
        else:
            ds_key = ds_name.lower()
            ds_key = pretty_names.get(ds_key, ds_key)
            ds_freq = dataset_properties_map[ds_key]["frequency"]
        ds_config = f"{ds_key}/{ds_freq}/{term}"

        logger.info(f"config: {ds_config}")
        # Initialize the dataset; multivariate datasets are converted to univariate
        target_dim = Dataset(name=ds_name, term=term, to_univariate=False).target_dim
        dataset = Dataset(name=ds_name, term=term, to_univariate=target_dim != 1)
        season_length = get_seasonality(dataset.freq)
        logger.info(f"Dataset size: {len(dataset.test_data)}")

        predictor = PatchTSTFMEvalPredictor(
            model=model,
            prediction_length=dataset.prediction_length,
            dataset_name=ds_name,
            quantile_levels=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
        )

        forecasts = predictor.predict(
            dataset.test_data.input,
            batch_size=2048,
        )

        res = (
            evaluate_forecasts(
                forecasts,
                test_data=dataset.test_data,
                metrics=metrics,
                axis=None,
                batch_size=1024,
                mask_invalid_label=True,
                allow_nan_forecast=False,
                seasonality=season_length,
            )
            .reset_index(drop=True)
            .to_dict(orient="records")
        )

        all_results.append(
            (
                res,
                ds_config,
                dataset_properties_map[ds_key]["domain"],
                dataset_properties_map[ds_key]["num_variates"],
            )
        )

### Finalize results and save

In [None]:
result_df_rows = []
for result_metrics, ds_config, domain, num_variates in all_results:
    result_metrics = {f"eval_metrics/{k}": v for k, v in result_metrics[0].items()}

    result_df_rows.append(
        {
            "dataset": ds_config,
            "model": model_name,
            **result_metrics,
            "domain": domain,
            "num_variates": num_variates,
        }
    )
results_df = pd.DataFrame(result_df_rows).sort_values(by="dataset")
results_df.to_csv(csv_file_path, index=False)

logger.info(f"Results have been written to {csv_file_path}")

In [None]:
# Results
df = pd.read_csv(f"{output_dir}/all_results.csv")
df = df.sort_values(by="dataset")
display(
    df[
        [
            "dataset",
            "eval_metrics/MASE[0.5]",
            "eval_metrics/NRMSE[mean]",
            "eval_metrics/mean_weighted_sum_quantile_loss",
        ]
    ]
)