# TTM zero-shot and few-shot benchmarking on multiple datasets

**Using TTM-1536-96 model.**

Pre-trained TTM models are fetched from the [Hugging Face TTM Model Repository](https://huggingface.co/ibm-granite/granite-timeseries-ttm-r2).

1. TTM-R1 pre-trained models can be found here: [TTM-R1 Model Card](https://huggingface.co/ibm-granite/granite-timeseries-ttm-r1)
    1. For the 512-96 model, set `TTM_MODEL_REVISION="main"`
    2. For the 1024-96 model, set `TTM_MODEL_REVISION="1024_96_v1"`
2. TTM-R2 pre-trained models can be found here: [TTM-R2 Model Card](https://huggingface.co/ibm-granite/granite-timeseries-ttm-r2)
    1. For the 512-96 model, set `TTM_MODEL_REVISION="main"`
    2. For the 1024-96 model, set `TTM_MODEL_REVISION="1024-96-r2"`
    3. For the 1536-96 model, set `TTM_MODEL_REVISION="1536-96-r2"`

Details about the revisions (R1 and R2) can be found [here](https://huggingface.co/ibm-granite/granite-timeseries-ttm-r2).
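
The revision names listed above can be collected in a small lookup so that the branch is derived from the chosen lengths rather than typed by hand. This is only a minimal sketch: `TTM_REVISIONS` and `get_ttm_revision` are hypothetical names introduced here (they are not part of `tsfm_public`), and the table contains only the combinations listed above.

```python
# Hypothetical helper (not part of tsfm_public): maps the release and the
# (context_length, forecast_length) pair to the revision names listed above.
TTM_REVISIONS = {
    ("r1", 512, 96): "main",
    ("r1", 1024, 96): "1024_96_v1",
    ("r2", 512, 96): "main",
    ("r2", 1024, 96): "1024-96-r2",
    ("r2", 1536, 96): "1536-96-r2",
}


def get_ttm_revision(release: str, context_length: int, forecast_length: int) -> str:
    """Return the Hugging Face revision for one of the listed TTM variants."""
    key = (release.lower(), context_length, forecast_length)
    if key not in TTM_REVISIONS:
        raise KeyError(
            f"No revision listed for TTM-{release.upper()} {context_length}-{forecast_length}"
        )
    return TTM_REVISIONS[key]


print(get_ttm_revision("r2", 1536, 96))  # 1536-96-r2
```

Raising on an unlisted combination, instead of guessing a branch name, keeps a typo from silently loading the wrong model.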

## Imports

In [1]:
import math
import warnings

import matplotlib.pyplot as plt
import pandas as pd
from torch.optim import AdamW
from torch.optim.lr_scheduler import OneCycleLR
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments, set_seed
from transformers.integrations import INTEGRATION_TO_CALLBACK

from tsfm_public import TinyTimeMixerForPrediction, TrackingCallback, count_parameters, load_dataset
from tsfm_public.toolkit.lr_finder import optimal_lr_finder
from tsfm_public.toolkit.visualization import plot_predictions


warnings.filterwarnings("ignore")



## Important arguments

In [2]:
# Set seed
SEED = 42
set_seed(SEED)

# Specify model parameters
context_length = 1536
forecast_length = 96
freeze_backbone = True

# Other args
EPOCHS = 50
NUM_WORKERS = 16

# Make sure all the datasets in the following `list_datasets` are
# saved in the `DATA_ROOT_PATH` folder. Or, change it accordingly.
# Refer to the load_datasets() function
# in notebooks/hfdemo/tinytimemixer/utils/ttm_utils.py
# to see how it is used.
DATA_ROOT_PATH = "/dccstor/tsfm23/datasets/"

# This is where results will be saved
OUT_DIR = f"ttm-r2_results_benchmark_{context_length}_{forecast_length}/"
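
With `freeze_backbone = True`, only the parameters outside the backbone are updated during few-shot fine-tuning. The run logs later in this notebook report 3081120 parameters before freezing and 1054560 after. The arithmetic below shows what share of the model that leaves trainable; the variable names are illustrative only.

```python
# Parameter counts copied from the run logs in this notebook.
params_before_freeze = 3_081_120  # count reported before freezing the backbone
params_after_freeze = 1_054_560   # count reported after freezing the backbone

frozen_backbone_params = params_before_freeze - params_after_freeze
trainable_share = params_after_freeze / params_before_freeze

print(f"Frozen backbone parameters: {frozen_backbone_params}")
print(f"Trainable share after freezing: {trainable_share:.1%}")
```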

## List of benchmark datasets (TTM was not pre-trained on any of these)

In [3]:
list_datasets = [
    "etth1",
    "etth2",
    "ettm1",
    "ettm2",
    "weather",
    "electricity",
    "traffic",
]

## Get model path

In [4]:
# Set the branch name to match context_length and forecast_length
hf_model_path = "ibm-granite/granite-timeseries-ttm-r2"
hf_model_branch = f"{context_length}-{forecast_length}-r2"

## Main benchmarking loop

In [5]:
all_results = {
    "dataset": [],
    "zs_mse": [],
    "fs5_mse": [],
    "zs_eval_time": [],
    "fs5_mean_epoch_time": [],
    "fs5_total_train_time": [],
    "fs5_best_val_metric": [],
}
# Loop over data
for DATASET in list_datasets:
    print()
    print("=" * 100)
    print(
        f"Running zero-shot/few-shot for TTM-{context_length} on dataset = {DATASET}, forecast_len = {forecast_length}"
    )
    print(f"Model will be loaded from {hf_model_path}/{hf_model_branch}")
    SUBDIR = f"{OUT_DIR}/{DATASET}"

    # Set batch size
    if DATASET == "traffic":
        BATCH_SIZE = 8
    elif DATASET == "electricity":
        BATCH_SIZE = 32
    else:
        BATCH_SIZE = 64

    # Data prep: Get dataset
    _, _, dset_test = load_dataset(DATASET, context_length, forecast_length, dataset_root_path=DATA_ROOT_PATH)

    #############################################################
    ##### Use the pretrained model in zero-shot forecasting #####
    #############################################################
    # Load model
    zeroshot_model = TinyTimeMixerForPrediction.from_pretrained(hf_model_path, revision=hf_model_branch)

    # zeroshot_trainer
    zeroshot_trainer = Trainer(
        model=zeroshot_model,
        args=TrainingArguments(
            output_dir=f"{SUBDIR}/zeroshot",
            per_device_eval_batch_size=BATCH_SIZE,
            seed=SEED,
        ),
        eval_dataset=dset_test,
    )

    # evaluate = zero-shot performance
    print("+" * 20, "Test MSE zero-shot", "+" * 20)
    zeroshot_output = zeroshot_trainer.evaluate(dset_test)
    print(zeroshot_output)
    print("+" * 60)
    all_results["zs_eval_time"].append(zeroshot_output["eval_runtime"])

    # Plot
    plot_predictions(
        model=zeroshot_trainer.model,
        dset=dset_test,
        plot_dir=SUBDIR,
        num_plots=10,
        plot_prefix="test_zeroshot",
        channel=0,
    )
    plt.close()

    # write results
    all_results["dataset"].append(DATASET)
    all_results["zs_mse"].append(zeroshot_output["eval_loss"])

    ################################################################
    ##### Use the pretrained model in few-shot 5% forecasting ######
    ################################################################
    for fewshot_percent in [5]:
        # Set learning rate
        learning_rate = None  # `None` value indicates that the optimal_lr_finder() will be used

        print("-" * 20, f"Running few-shot {fewshot_percent}%", "-" * 20)
        # Data prep: Get dataset
        dset_train, dset_val, dset_test = load_dataset(
            DATASET,
            context_length,
            forecast_length,
            fewshot_fraction=fewshot_percent / 100,
            dataset_root_path=DATA_ROOT_PATH,
        )

        # change head dropout to 0.7 for ett datasets
        if "ett" in DATASET:
            finetune_forecast_model = TinyTimeMixerForPrediction.from_pretrained(
                hf_model_path, revision=hf_model_branch, head_dropout=0.7
            )
        else:
            finetune_forecast_model = TinyTimeMixerForPrediction.from_pretrained(
                hf_model_path, revision=hf_model_branch
            )

        if freeze_backbone:
            print(
                "Number of params before freezing backbone",
                count_parameters(finetune_forecast_model),
            )

            # Freeze the backbone of the model
            for param in finetune_forecast_model.backbone.parameters():
                param.requires_grad = False

            # Count params
            print(
                "Number of params after freezing the backbone",
                count_parameters(finetune_forecast_model),
            )

        if learning_rate is None:
            learning_rate, finetune_forecast_model = optimal_lr_finder(
                finetune_forecast_model,
                dset_train,
                batch_size=BATCH_SIZE,
            )
            print("OPTIMAL SUGGESTED LEARNING RATE =", learning_rate)

        print(f"Using learning rate = {learning_rate}")
        finetune_forecast_args = TrainingArguments(
            output_dir=f"{SUBDIR}/fewshot_{fewshot_percent}",
            overwrite_output_dir=True,
            learning_rate=learning_rate,
            num_train_epochs=EPOCHS,
            do_eval=True,
            evaluation_strategy="epoch",
            per_device_train_batch_size=BATCH_SIZE,
            per_device_eval_batch_size=BATCH_SIZE,
            dataloader_num_workers=NUM_WORKERS,
            report_to=None,
            save_strategy="epoch",
            logging_strategy="epoch",
            save_total_limit=1,
            logging_dir=f"{SUBDIR}/fewshot_{fewshot_percent}",  # Make sure to specify a logging directory
            load_best_model_at_end=True,  # Load the best model when training ends
            metric_for_best_model="eval_loss",  # Metric to monitor for early stopping
            greater_is_better=False,  # For loss
            seed=SEED,
        )

        # Create the early stopping callback
        early_stopping_callback = EarlyStoppingCallback(
            early_stopping_patience=10,  # Number of epochs with no improvement after which to stop
            early_stopping_threshold=0.0,  # Minimum improvement required to consider as improvement
        )
        tracking_callback = TrackingCallback()

        # Optimizer and scheduler
        optimizer = AdamW(finetune_forecast_model.parameters(), lr=learning_rate)
        scheduler = OneCycleLR(
            optimizer,
            learning_rate,
            epochs=EPOCHS,
            steps_per_epoch=math.ceil(len(dset_train) / (BATCH_SIZE)),
        )

        finetune_forecast_trainer = Trainer(
            model=finetune_forecast_model,
            args=finetune_forecast_args,
            train_dataset=dset_train,
            eval_dataset=dset_val,
            callbacks=[early_stopping_callback, tracking_callback],
            optimizers=(optimizer, scheduler),
        )
        finetune_forecast_trainer.remove_callback(INTEGRATION_TO_CALLBACK["codecarbon"])

        # Fine tune
        finetune_forecast_trainer.train()

        # Evaluation
        print(
            "+" * 20,
            f"Test MSE after few-shot {fewshot_percent}% fine-tuning",
            "+" * 20,
        )
        fewshot_output = finetune_forecast_trainer.evaluate(dset_test)
        print(fewshot_output)
        print("+" * 60)

        # Plot
        plot_predictions(
            model=finetune_forecast_trainer.model,
            dset=dset_test,
            plot_dir=SUBDIR,
            num_plots=10,
            plot_prefix=f"test_fewshot_{fewshot_percent}",
            channel=0,
        )
        plt.close()

        # write results
        all_results[f"fs{fewshot_percent}_mse"].append(fewshot_output["eval_loss"])
        all_results[f"fs{fewshot_percent}_mean_epoch_time"].append(tracking_callback.mean_epoch_time)
        all_results[f"fs{fewshot_percent}_total_train_time"].append(tracking_callback.total_train_time)
        all_results[f"fs{fewshot_percent}_best_val_metric"].append(tracking_callback.best_eval_metric)

    df_out = pd.DataFrame(all_results).round(3)
    print(df_out[["dataset", "zs_mse", "fs5_mse"]])
    df_out.to_csv(f"{OUT_DIR}/results_zero_few.csv")
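
`OneCycleLR` needs the total number of optimizer steps, which the loop supplies as `epochs` times `steps_per_epoch`. As a worked example, the few-shot etth1 split reported in the logs has 260 training windows and uses a batch size of 64. The sketch below assumes a single device and no gradient accumulation.

```python
import math

EPOCHS = 50
BATCH_SIZE = 64
num_train_samples = 260  # few-shot 5% train size reported for etth1 in the logs

# The last, partially filled batch still counts as one optimizer step.
steps_per_epoch = math.ceil(num_train_samples / BATCH_SIZE)
total_scheduled_steps = EPOCHS * steps_per_epoch

print(steps_per_epoch)        # 5
print(total_scheduled_steps)  # 250
```

If early stopping ends training before epoch 50, the schedule is cut short and the learning rate does not complete its full cycle.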

INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: etth1, context length: 1536, prediction length 96
INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 7009, val = 2785, test = 2785



Running zero-shot/few-shot for TTM-1536 on dataset = etth1, forecast_len = 96
Model will be loaded from ibm-granite/granite-timeseries-ttm-r2/1536-96-r2


config.json:   0%|          | 0.00/1.57k [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/12.3M [00:00<?, ?B/s]

INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


++++++++++++++++++++ Test MSE zero-shot ++++++++++++++++++++


{'eval_loss': 0.3570095896720886, 'eval_model_preparation_time': 0.0024, 'eval_runtime': 1.9963, 'eval_samples_per_second': 1395.071, 'eval_steps_per_second': 22.041}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: etth1, context length: 1536, prediction length 96


-------------------- Running few-shot 5% --------------------


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 260, val = 2785, test = 2785


Number of params before freezing backbone 3081120
Number of params after freezing the backbone 1054560
LR Finder: Running learning rate (LR) finder algorithm. If the suggested LR is very low, we suggest setting the LR manually.
LR Finder: Using GPU:0.


INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


LR Finder: Suggested learning rate = 0.000298364724028334
OPTIMAL SUGGESTED LEARNING RATE = 0.000298364724028334
Using learning rate = 0.000298364724028334


Epoch,Training Loss,Validation Loss
1,0.6123,0.655407
2,0.5939,0.65605
3,0.5191,0.656867
4,0.4808,0.658155
5,0.4316,0.659995
6,0.3847,0.662317
7,0.3553,0.668283
8,0.3088,0.689046
9,0.2656,0.715355
10,0.248,0.734134


[TrackingCallback] Mean Epoch Time = 1.0921787131916394 seconds, Total Train Time = 28.327817678451538
++++++++++++++++++++ Test MSE after few-shot 5% fine-tuning ++++++++++++++++++++


{'eval_loss': 0.3571341633796692, 'eval_runtime': 1.4299, 'eval_samples_per_second': 1947.631, 'eval_steps_per_second': 30.77, 'epoch': 11.0}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: etth2, context length: 1536, prediction length 96
INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 7009, val = 2785, test = 2785


  dataset  zs_mse  fs5_mse
0   etth1   0.357    0.357

Running zero-shot/few-shot for TTM-1536 on dataset = etth2, forecast_len = 96
Model will be loaded from ibm-granite/granite-timeseries-ttm-r2/1536-96-r2


INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


++++++++++++++++++++ Test MSE zero-shot ++++++++++++++++++++


{'eval_loss': 0.2743358612060547, 'eval_model_preparation_time': 0.0019, 'eval_runtime': 0.9901, 'eval_samples_per_second': 2812.989, 'eval_steps_per_second': 44.442}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: etth2, context length: 1536, prediction length 96
INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 260, val = 2785, test = 2785


-------------------- Running few-shot 5% --------------------
Number of params before freezing backbone 3081120
Number of params after freezing the backbone 1054560
LR Finder: Running learning rate (LR) finder algorithm. If the suggested LR is very low, we suggest setting the LR manually.
LR Finder: Using GPU:0.


INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


LR Finder: Suggested learning rate = 0.00020565123083486514
OPTIMAL SUGGESTED LEARNING RATE = 0.00020565123083486514
Using learning rate = 0.00020565123083486514


Epoch,Training Loss,Validation Loss
1,0.4353,0.22963
2,0.3888,0.230058
3,0.3232,0.231052
4,0.3856,0.232311
5,0.2984,0.233664
6,0.2438,0.234015
7,0.2114,0.232407
8,0.1868,0.228532
9,0.1806,0.228105
10,0.1374,0.232864


[TrackingCallback] Mean Epoch Time = 1.0002899671855725 seconds, Total Train Time = 47.401732206344604
++++++++++++++++++++ Test MSE after few-shot 5% fine-tuning ++++++++++++++++++++


{'eval_loss': 0.27716049551963806, 'eval_runtime': 1.386, 'eval_samples_per_second': 2009.436, 'eval_steps_per_second': 31.747, 'epoch': 19.0}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: ettm1, context length: 1536, prediction length 96
INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 32929, val = 11425, test = 11425


  dataset  zs_mse  fs5_mse
0   etth1   0.357    0.357
1   etth2   0.274    0.277

Running zero-shot/few-shot for TTM-1536 on dataset = ettm1, forecast_len = 96
Model will be loaded from ibm-granite/granite-timeseries-ttm-r2/1536-96-r2


INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


++++++++++++++++++++ Test MSE zero-shot ++++++++++++++++++++


{'eval_loss': 0.32653480768203735, 'eval_model_preparation_time': 0.0018, 'eval_runtime': 3.4627, 'eval_samples_per_second': 3299.436, 'eval_steps_per_second': 51.694}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: ettm1, context length: 1536, prediction length 96
INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 1556, val = 11425, test = 11425


-------------------- Running few-shot 5% --------------------
Number of params before freezing backbone 3081120
Number of params after freezing the backbone 1054560
LR Finder: Running learning rate (LR) finder algorithm. If the suggested LR is very low, we suggest setting the LR manually.
LR Finder: Using GPU:0.


INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


LR Finder: Suggested learning rate = 0.00043287612810830566
OPTIMAL SUGGESTED LEARNING RATE = 0.00043287612810830566
Using learning rate = 0.00043287612810830566


Epoch,Training Loss,Validation Loss
1,0.7153,0.400856
2,0.4748,0.420347
3,0.359,0.45263
4,0.3256,0.455598
5,0.2978,0.474598
6,0.2759,0.478588
7,0.2622,0.467313
8,0.248,0.475465
9,0.2346,0.459779
10,0.2255,0.477715


[TrackingCallback] Mean Epoch Time = 1.4104089736938477 seconds, Total Train Time = 44.97520208358765
++++++++++++++++++++ Test MSE after few-shot 5% fine-tuning ++++++++++++++++++++


{'eval_loss': 0.3312471807003021, 'eval_runtime': 2.5794, 'eval_samples_per_second': 4429.24, 'eval_steps_per_second': 69.395, 'epoch': 11.0}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: ettm2, context length: 1536, prediction length 96
INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 32929, val = 11425, test = 11425


  dataset  zs_mse  fs5_mse
0   etth1   0.357    0.357
1   etth2   0.274    0.277
2   ettm1   0.327    0.331

Running zero-shot/few-shot for TTM-1536 on dataset = ettm2, forecast_len = 96
Model will be loaded from ibm-granite/granite-timeseries-ttm-r2/1536-96-r2


INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


++++++++++++++++++++ Test MSE zero-shot ++++++++++++++++++++


{'eval_loss': 0.16795998811721802, 'eval_model_preparation_time': 0.0018, 'eval_runtime': 3.518, 'eval_samples_per_second': 3247.549, 'eval_steps_per_second': 50.881}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: ettm2, context length: 1536, prediction length 96
INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 1556, val = 11425, test = 11425


-------------------- Running few-shot 5% --------------------
Number of params before freezing backbone 3081120
Number of params after freezing the backbone 1054560
LR Finder: Running learning rate (LR) finder algorithm. If the suggested LR is very low, we suggest setting the LR manually.
LR Finder: Using GPU:0.


INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


LR Finder: Suggested learning rate = 0.00011768119524349978
OPTIMAL SUGGESTED LEARNING RATE = 0.00011768119524349978
Using learning rate = 0.00011768119524349978


Epoch,Training Loss,Validation Loss
1,0.4718,0.123267
2,0.3377,0.124431
3,0.2528,0.126874
4,0.176,0.13168
5,0.1358,0.141091
6,0.117,0.147765
7,0.1084,0.156903
8,0.1039,0.162671
9,0.1,0.170844
10,0.097,0.176793


[TrackingCallback] Mean Epoch Time = 1.3997448791157117 seconds, Total Train Time = 45.14118027687073
++++++++++++++++++++ Test MSE after few-shot 5% fine-tuning ++++++++++++++++++++


{'eval_loss': 0.1680709272623062, 'eval_runtime': 2.6241, 'eval_samples_per_second': 4353.841, 'eval_steps_per_second': 68.213, 'epoch': 11.0}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: weather, context length: 1536, prediction length 96
INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 35256, val = 5175, test = 10444


  dataset  zs_mse  fs5_mse
0   etth1   0.357    0.357
1   etth2   0.274    0.277
2   ettm1   0.327    0.331
3   ettm2   0.168    0.168

Running zero-shot/few-shot for TTM-1536 on dataset = weather, forecast_len = 96
Model will be loaded from ibm-granite/granite-timeseries-ttm-r2/1536-96-r2


INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


++++++++++++++++++++ Test MSE zero-shot ++++++++++++++++++++


{'eval_loss': 0.14976251125335693, 'eval_model_preparation_time': 0.0021, 'eval_runtime': 6.5327, 'eval_samples_per_second': 1598.717, 'eval_steps_per_second': 25.104}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: weather, context length: 1536, prediction length 96
INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 1672, val = 5175, test = 10444


-------------------- Running few-shot 5% --------------------
Number of params before freezing backbone 3081120
Number of params after freezing the backbone 1054560
LR Finder: Running learning rate (LR) finder algorithm. If the suggested LR is very low, we suggest setting the LR manually.
LR Finder: Using GPU:0.


INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


LR Finder: Suggested learning rate = 0.00020565123083486514
OPTIMAL SUGGESTED LEARNING RATE = 0.00020565123083486514
Using learning rate = 0.00020565123083486514


Epoch,Training Loss,Validation Loss
1,0.0979,0.393768
2,0.0957,0.397849
3,0.0929,0.40424
4,0.0889,0.411644
5,0.0849,0.410327
6,0.0817,0.414159
7,0.0774,0.41483
8,0.0734,0.416132
9,0.0686,0.428362
10,0.065,0.419456


[TrackingCallback] Mean Epoch Time = 1.9716925404288552 seconds, Total Train Time = 54.368701219558716
++++++++++++++++++++ Test MSE after few-shot 5% fine-tuning ++++++++++++++++++++


{'eval_loss': 0.14924383163452148, 'eval_runtime': 4.6955, 'eval_samples_per_second': 2224.257, 'eval_steps_per_second': 34.927, 'epoch': 11.0}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: electricity, context length: 1536, prediction length 96


   dataset  zs_mse  fs5_mse
0    etth1   0.357    0.357
1    etth2   0.274    0.277
2    ettm1   0.327    0.331
3    ettm2   0.168    0.168
4  weather   0.150    0.149

Running zero-shot/few-shot for TTM-1536 on dataset = electricity, forecast_len = 96
Model will be loaded from ibm-granite/granite-timeseries-ttm-r2/1536-96-r2


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 16781, val = 2537, test = 5165
INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


++++++++++++++++++++ Test MSE zero-shot ++++++++++++++++++++


{'eval_loss': 0.15529614686965942, 'eval_model_preparation_time': 0.0019, 'eval_runtime': 34.5318, 'eval_samples_per_second': 149.572, 'eval_steps_per_second': 4.691}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: electricity, context length: 1536, prediction length 96


-------------------- Running few-shot 5% --------------------


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 748, val = 2537, test = 5165


Number of params before freezing backbone 3081120
Number of params after freezing the backbone 1054560
LR Finder: Running learning rate (LR) finder algorithm. If the suggested LR is very low, we suggest setting the LR manually.
LR Finder: Using GPU:0.


INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


LR Finder: Suggested learning rate = 0.00020565123083486514
OPTIMAL SUGGESTED LEARNING RATE = 0.00020565123083486514
Using learning rate = 0.00020565123083486514


Epoch,Training Loss,Validation Loss
1,0.1437,0.129405
2,0.14,0.12771
3,0.1376,0.126163
4,0.1355,0.124611
5,0.1332,0.123532
6,0.1309,0.122066
7,0.1291,0.121844
8,0.1273,0.120507
9,0.1256,0.119225
10,0.1233,0.119105


[TrackingCallback] Mean Epoch Time = 6.774247827737228 seconds, Total Train Time = 964.9241693019867
++++++++++++++++++++ Test MSE after few-shot 5% fine-tuning ++++++++++++++++++++


{'eval_loss': 0.13803862035274506, 'eval_runtime': 26.8199, 'eval_samples_per_second': 192.581, 'eval_steps_per_second': 6.04, 'epoch': 46.0}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: traffic, context length: 1536, prediction length 96


       dataset  zs_mse  fs5_mse
0        etth1   0.357    0.357
1        etth2   0.274    0.277
2        ettm1   0.327    0.331
3        ettm2   0.168    0.168
4      weather   0.150    0.149
5  electricity   0.155    0.138

Running zero-shot/few-shot for TTM-1536 on dataset = traffic, forecast_len = 96
Model will be loaded from ibm-granite/granite-timeseries-ttm-r2/1536-96-r2


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 10649, val = 1661, test = 3413
INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


++++++++++++++++++++ Test MSE zero-shot ++++++++++++++++++++


{'eval_loss': 0.4634234607219696, 'eval_model_preparation_time': 0.0019, 'eval_runtime': 62.6042, 'eval_samples_per_second': 54.517, 'eval_steps_per_second': 6.821}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Dataset name: traffic, context length: 1536, prediction length 96


-------------------- Running few-shot 5% --------------------


INFO:p-3048562:t-22362052518656:data_handling.py:load_dataset:Data lengths: train = 442, val = 1661, test = 3413


Number of params before freezing backbone 3081120
Number of params after freezing the backbone 1054560
LR Finder: Running learning rate (LR) finder algorithm. If the suggested LR is very low, we suggest setting the LR manually.
LR Finder: Using GPU:0.




LR Finder: Suggested learning rate = 5.590810182512223e-05
OPTIMAL SUGGESTED LEARNING RATE = 5.590810182512223e-05
Using learning rate = 5.590810182512223e-05


INFO:p-3048562:t-22362052518656:base.py:add_job:Adding job tentatively -- it will be properly scheduled when the scheduler starts


Epoch,Training Loss,Validation Loss
1,0.2988,0.391451
2,0.2866,0.393831
3,0.2753,0.394873
4,0.2645,0.396028
5,0.2538,0.400728
6,0.2454,0.40427
7,0.2379,0.408866
8,0.2316,0.409725
9,0.2262,0.410739
10,0.2213,0.412317


[TrackingCallback] Mean Epoch Time = 9.786083113063466 seconds, Total Train Time = 365.34510469436646
++++++++++++++++++++ Test MSE after few-shot 5% fine-tuning ++++++++++++++++++++


{'eval_loss': 0.46613699197769165, 'eval_runtime': 46.0615, 'eval_samples_per_second': 74.097, 'eval_steps_per_second': 9.27, 'epoch': 11.0}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
       dataset  zs_mse  fs5_mse
0        etth1   0.357    0.357
1        etth2   0.274    0.277
2        ettm1   0.327    0.331
3        ettm2   0.168    0.168
4      weather   0.150    0.149
5  electricity   0.155    0.138
6      traffic   0.463    0.466
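
Most few-shot runs above end at `'epoch': 11.0`. In those runs the lowest logged validation loss is the one after the first epoch, and `early_stopping_patience=10` then allows ten further evaluations without improvement. The function below is a simplified re-implementation of that rule for illustration, not the `transformers` code; `stop_epoch` is a hypothetical name, and the loss values are the ten etth1 validation losses printed above.

```python
from typing import Optional, Sequence


def stop_epoch(val_losses: Sequence[float], patience: int, threshold: float = 0.0) -> Optional[int]:
    """Return the 1-based epoch at which training would stop, or None if it would continue."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if best - loss > threshold:
            # Strict improvement over the best loss seen so far: reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return epoch
    return None


etth1_val_losses = [0.655407, 0.65605, 0.656867, 0.658155, 0.659995,
                    0.662317, 0.668283, 0.689046, 0.715355, 0.734134]

print(stop_epoch(etth1_val_losses, patience=10))  # None
print(stop_epoch(etth1_val_losses, patience=9))   # 10
```

With a patience of 10, the ten logged epochs are not yet enough to trigger a stop, which is consistent with the run ending one epoch later, at epoch 11.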


## Benchmarking results*

*Results may differ slightly from those reported in the TTM paper because of differences in training environments.

In [6]:
df_out

| | dataset | zs_mse | fs5_mse | zs_eval_time | fs5_mean_epoch_time | fs5_total_train_time | fs5_best_val_metric |
|---|---|---|---|---|---|---|---|
| 0 | etth1 | 0.357 | 0.357 | 1.996 | 1.092 | 28.328 | 0.655 |
| 1 | etth2 | 0.274 | 0.277 | 0.99 | 1.0 | 47.402 | 0.228 |
| 2 | ettm1 | 0.327 | 0.331 | 3.463 | 1.41 | 44.975 | 0.401 |
| 3 | ettm2 | 0.168 | 0.168 | 3.518 | 1.4 | 45.141 | 0.123 |
| 4 | weather | 0.15 | 0.149 | 6.533 | 1.972 | 54.369 | 0.394 |
| 5 | electricity | 0.155 | 0.138 | 34.532 | 6.774 | 964.924 | 0.113 |
| 6 | traffic | 0.463 | 0.466 | 62.604 | 9.786 | 365.345 | 0.391 |
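
To read the table at a glance, it can help to express the few-shot MSE as a relative change against the zero-shot MSE. The snippet below is a minimal sketch that uses the rounded values from the table; `relative_change_pct` is a name introduced here for illustration.

```python
# (zs_mse, fs5_mse) pairs copied from the results table above.
results = {
    "etth1": (0.357, 0.357),
    "etth2": (0.274, 0.277),
    "ettm1": (0.327, 0.331),
    "ettm2": (0.168, 0.168),
    "weather": (0.150, 0.149),
    "electricity": (0.155, 0.138),
    "traffic": (0.463, 0.466),
}


def relative_change_pct(zs_mse: float, fs_mse: float) -> float:
    """Percent change of few-shot MSE relative to zero-shot (negative means improvement)."""
    return 100.0 * (fs_mse - zs_mse) / zs_mse


for name, (zs, fs) in results.items():
    print(f"{name:12s} {relative_change_pct(zs, fs):+6.1f}%")
```

On these rounded values, electricity is the only dataset with a sizeable change (about 11% lower MSE after 5% few-shot fine-tuning); the others stay within about 1.5% of the zero-shot result.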
