# Getting started with TinyTimeMixer (TTM)

This notebook demonstrates the usage of a pre-trained `TinyTimeMixer` model for several multivariate time series forecasting tasks. For details on the model architecture, refer to the [TTM paper](https://arxiv.org/pdf/2401.03955.pdf).

In this example, we will use a pre-trained TTM-512-96 model. That means the TTM model can take an input of 512 time points (`context_length`) and forecast up to 96 time points (`forecast_length`) into the future. We will use the pre-trained TTM in two settings:
1. **Zero-shot**: The pre-trained TTM will be directly used to evaluate on the `test` split of the target data. Note that the TTM was NOT pre-trained on the target data.
2. **Few-shot**: The pre-trained TTM will be quickly fine-tuned on only 5% of the `train` split of the target data, and subsequently, evaluated on the `test` part of the target data.
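
To make the two lengths concrete, the sketch below cuts one context window and one forecast window out of a toy series. The helper name `split_window` and the toy data are illustrative assumptions and are not part of `tsfm`.

```python
def split_window(series, context_length, forecast_length, end_of_context):
    """Return (context, target) slices around a chosen time index."""
    start = end_of_context - context_length
    stop = end_of_context + forecast_length
    if start < 0 or stop > len(series):
        raise ValueError("not enough points for the requested window")
    return series[start:end_of_context], series[end_of_context:stop]


# Toy series with 700 points (each value is simply its own index).
series = list(range(700))

context, target = split_window(
    series, context_length=512, forecast_length=96, end_of_context=600
)
print(len(context), len(target))  # 512 96
print(context[0], context[-1], target[0], target[-1])  # 88 599 600 695
```

The model sees only `context`; `target` is what a forecast of length 96 would be compared against.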

Note: Alternatively, this notebook can be modified to try any other TTM model from a suite of TTM models. For details, visit the [Hugging Face TTM Model Repository](https://huggingface.co/ibm-granite/granite-timeseries-ttm-r2).

1. IBM Granite TTM-R1 pre-trained models can be found here: [Granite-TTM-R1 Model Card](https://huggingface.co/ibm-granite/granite-timeseries-ttm-r1)
2. IBM Granite TTM-R2 pre-trained models can be found here: [Granite-TTM-R2 Model Card](https://huggingface.co/ibm-granite/granite-timeseries-ttm-r2)
3. Research-use (non-commercial use only) TTM-R2 pre-trained models can be found here: [Research-Use-TTM-R2](https://huggingface.co/ibm-research/ttm-research-r2)

### The get_model() utility
The TTM model card offers a suite of models with varying `context_length` and `prediction_length` combinations.
In this notebook, we use the TSFM `get_model()` utility, which automatically selects the right model based on the given `context_length` and `prediction_length` (and some optional arguments), abstracting away the internal complexity. See the usage examples below in the `zeroshot_eval()` and `fewshot_finetune_eval()` functions. For more details, see the [docstring](https://github.com/ibm-granite/granite-tsfm/blob/main/tsfm_public/toolkit/get_model.py) of the function definition.
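
As a rough illustration of what "selecting the right model" can mean, the sketch below picks a variant from a small catalogue. It is not the selection logic of `get_model()`. The catalogue `AVAILABLE` and the helper `pick_variant` are hypothetical, and the listed length combinations are assumptions to be checked against the model card.

```python
# Hypothetical catalogue of (context_length, prediction_length) variants.
# These values are assumptions for illustration; consult the model card for the real list.
AVAILABLE = [(c, p) for c in (512, 1024, 1536) for p in (96, 192, 336, 720)]


def pick_variant(context_length, prediction_length, available=AVAILABLE):
    """Pick the variant with the longest usable context and the shortest sufficient horizon."""
    candidates = [
        (c, p) for c, p in available if c <= context_length and p >= prediction_length
    ]
    if not candidates:
        raise ValueError("no variant covers the requested lengths")
    # Prefer a longer context first, then the smallest horizon that still covers the request.
    return max(candidates, key=lambda cp: (cp[0], -cp[1]))


print(pick_variant(1536, 204))  # (1536, 336)
print(pick_variant(600, 96))    # (512, 96)
```

Under these assumptions, a request for 1536/204 resolves to a 1536-336 variant, which is consistent with the `1536-336-r2` revision reported in the loading log later in this notebook.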

## Install `tsfm`
**[Optional for Local Run / Mandatory for Google Colab]**  
Run the cell below to install `tsfm`. Skip if already installed.

In [1]:
# Install the tsfm library
! pip install "granite-tsfm[notebooks] @ git+https://github.com/ibm-granite/granite-tsfm.git@v0.3.1"

Collecting granite-tsfm@ git+https://github.com/ibm-granite/granite-tsfm.git@v0.3.1 (from granite-tsfm[notebooks]@ git+https://github.com/ibm-granite/granite-tsfm.git@v0.3.1)
  Cloning https://github.com/ibm-granite/granite-tsfm.git (to revision v0.3.1) to /tmp/pip-install-ahscs4ic/granite-tsfm_4b0dc735163e405f9b3eb780ce819def
  Running command git clone --filter=blob:none --quiet https://github.com/ibm-granite/granite-tsfm.git /tmp/pip-install-ahscs4ic/granite-tsfm_4b0dc735163e405f9b3eb780ce819def
  Running command git checkout -q 16106d70d1fb3244eecd48c8fbbf3a0009fb8751
  Resolved https://github.com/ibm-granite/granite-tsfm.git to commit 16106d70d1fb3244eecd48c8fbbf3a0009fb8751
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting deprecated (from granite-tsfm@ git+https://github.com/ibm-granite/granite-tsfm.git@v0.3.1->granite-tsfm[notebooks]@ git+https://g

## Imports

In [None]:
import math
import os
import tempfile

import pandas as pd
from torch.optim import AdamW
from torch.optim.lr_scheduler import OneCycleLR
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments, set_seed
from transformers.integrations import INTEGRATION_TO_CALLBACK

from tsfm_public.toolkit import TimeSeriesPreprocessor, TrackingCallback, count_parameters, get_datasets
from tsfm_public.toolkit.get_model import get_model
from tsfm_public.toolkit.lr_finder import optimal_lr_finder
from tsfm_public.toolkit.visualization import plot_predictions

In [None]:
import warnings


# Suppress all warnings
warnings.filterwarnings("ignore")

### Important arguments

In [None]:
# Set seed for reproducibility
SEED = 42
set_seed(SEED)

# TTM Model path. The default model path is Granite-R2. Below, you can choose other TTM releases.
TTM_MODEL_PATH = "ibm-granite/granite-timeseries-ttm-r2"
# TTM_MODEL_PATH = "ibm-granite/granite-timeseries-ttm-r1"
# TTM_MODEL_PATH = "ibm-research/ttm-research-r2"

# Context length, i.e., the length of the history.
# Currently supported values are 512/1024/1536 for Granite-TTM-R2 and Research-Use-TTM-R2, and 512/1024 for Granite-TTM-R1.
CONTEXT_LENGTH = 1536

# Granite-TTM-R2 supports forecast lengths up to 720; Granite-TTM-R1 supports forecast lengths up to 96.
PREDICTION_LENGTH = 204

TARGET_DATASET = "etth1"
dataset_path = "https://raw.githubusercontent.com/zhouhaoyi/ETDataset/main/ETT-small/ETTh1.csv"


# Results dir
OUT_DIR = "ttm_finetuned_models/"

# Data processing

In [None]:
# Dataset
TARGET_DATASET = "train"
dataset_path = "./train.csv"
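timestamp_column = "일시"  # timestamp
id_columns = ['건물번호']  # building number; the IDs that uniquely identify a time series.

# Control columns: temperature (°C), precipitation (mm), wind speed (m/s), humidity (%)
control_columns = ["기온(°C)","강수량(mm)","풍속(m/s)","습도(%)"]
# Target column: power consumption (kWh)
target_columns = ["전력소비량(kWh)"]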
split_config = {
    "train": [0, 1536],
    "valid": [1536, 1836],
    "test": [
        1836,
        2040,
    ],
}
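# Understanding the split config: each range is read here as row positions within a single
# building's series (2,040 rows per building). The test range covers 2040 - 1836 = 204 rows,
# which matches PREDICTION_LENGTH.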

data = pd.read_csv(
    dataset_path,
    parse_dates=[timestamp_column],
)

# --- Sort the data explicitly by ID and timestamp ---
data = data.sort_values(by=id_columns + [timestamp_column])
print("DataFrame sorted by ID and timestamp.")

# --- Debugging: check the number of rows for building 1 ---
building1_data = data[data['건물번호'] == 1]
print(f"Number of rows for building 1: {len(building1_data)}")
# Confirm that this value is 2040.

column_specifiers = {
    "timestamp_column": timestamp_column,
    "id_columns": id_columns,
    "target_columns": target_columns,
    "control_columns": control_columns,
}

DataFrame sorted by ID and timestamp.
Number of rows for building 1: 2040
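
The cell above checks the row count for building 1 only, while later cells slice the combined frame by position and therefore rely on every building having the same number of rows. A minimal sketch of a check over all IDs follows. The helper `check_rows_per_id` and the toy ID list are hypothetical. With the real DataFrame, `data.groupby(id_columns).size()` would give the per-building counts.

```python
from collections import Counter


def check_rows_per_id(ids, expected_rows):
    """Return the IDs whose row count differs from expected_rows."""
    counts = Counter(ids)
    return {key: n for key, n in counts.items() if n != expected_rows}


# Toy ID column: buildings 1 and 2 have 4 rows each, building 3 has only 3.
ids = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3]
print(check_rows_per_id(ids, expected_rows=4))  # {3: 3}
```

An empty result means positional slicing per building is safe under this assumption.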


## Zero-shot evaluation method

In [None]:
column_specifiers

{'timestamp_column': '일시',
 'id_columns': ['건물번호'],
 'target_columns': ['전력소비량(kWh)'],
 'control_columns': ['기온(°C)', '강수량(mm)', '풍속(m/s)', '습도(%)']}

In [None]:
def zeroshot_eval(dataset_name, batch_size, context_length=512, forecast_length=96):
    # Get data
    tsp = TimeSeriesPreprocessor(
        **column_specifiers,
        context_length=context_length,
        prediction_length=forecast_length,
        scaling=True,
        encode_categorical=False,
        scaler_type="standard",
    )
    # Load model
    zeroshot_model = get_model(
        TTM_MODEL_PATH,
        context_length=context_length,
        prediction_length=forecast_length,
        freq_prefix_tuning=False,
        freq=None,
        prefer_l1_loss=False,
        prefer_longer_context=True,
    )
    dset_train, dset_valid, dset_test = get_datasets(
        tsp, data, split_config, use_frequency_token=zeroshot_model.config.resolution_prefix_tuning
    )
    print("dset_train=",dset_train)
    print("dset_valid=",dset_valid)
    print("dset_test=",dset_test)
    temp_dir = tempfile.mkdtemp()
    # zeroshot_trainer
    zeroshot_trainer = Trainer(
        model=zeroshot_model,
        args=TrainingArguments(
            output_dir=temp_dir,
            per_device_eval_batch_size=batch_size,
            seed=SEED,
            report_to="none",
        ),
    )
    # evaluate = zero-shot performance
    print("+" * 20, "Test MSE zero-shot", "+" * 20)
    zeroshot_output = zeroshot_trainer.evaluate(dset_test)
    print("zeroshot_output=", zeroshot_output)
    # get predictions
    predictions_dict = zeroshot_trainer.predict(dset_test)
    predictions_np = predictions_dict.predictions[0]
    print("predictions_np.shape=",predictions_np.shape)
    # get backbone embeddings (if needed for further analysis)
    backbone_embedding = predictions_dict.predictions[1]
    print("backbone_embedding.shape=",backbone_embedding.shape)
    # plot
    """
    plot_predictions(
        model=zeroshot_trainer.model,
        dset=dset_test,
        plot_dir=os.path.join(OUT_DIR, dataset_name),
        plot_prefix="test_zeroshot",
        indices=[685, 118, 902, 1984, 894, 967, 304, 57, 265, 1015],
        channel=0,
    )
    """
    # Return zeroshot_trainer and tsp as well
    return dset_train, dset_valid, dset_test, predictions_dict, zeroshot_trainer, tsp

# Zero-shot

In [None]:
# 1. Call the modified zeroshot_eval function
dset_train, dset_valid, dset_test, predictions_dict, zeroshot_trainer, tsp = zeroshot_eval(
    dataset_name=TARGET_DATASET, context_length=CONTEXT_LENGTH, forecast_length=PREDICTION_LENGTH, batch_size=64
)

INFO:/usr/local/lib/python3.12/dist-packages/tsfm_public/toolkit/get_model.py:Loading model from: ibm-granite/granite-timeseries-ttm-r2



INFO:/usr/local/lib/python3.12/dist-packages/tsfm_public/toolkit/get_model.py:Model loaded successfully from ibm-granite/granite-timeseries-ttm-r2, revision = 1536-336-r2.
INFO:/usr/local/lib/python3.12/dist-packages/tsfm_public/toolkit/get_model.py:[TTM] context_length = 1536, prediction_length = 336


dset_train= <tsfm_public.toolkit.dataset.ForecastDFDataset object at 0x789e30b9c830>
dset_valid= <tsfm_public.toolkit.dataset.ForecastDFDataset object at 0x789e30138b90>
dset_test= <tsfm_public.toolkit.dataset.ForecastDFDataset object at 0x789e300eaf00>
++++++++++++++++++++ Test MSE zero-shot ++++++++++++++++++++


zeroshot_output= {'eval_loss': 0.44334670901298523, 'eval_model_preparation_time': 0.0046, 'eval_runtime': 2.4703, 'eval_samples_per_second': 40.482, 'eval_steps_per_second': 0.81}
predictions_np.shape= (100, 204, 5)
backbone_embedding.shape= (100, 5, 12, 384)
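
The leading dimension of 100 in `predictions_np.shape` can be reproduced with simple window arithmetic. The sketch below assumes a stride of 1 and assumes that the test split is given `CONTEXT_LENGTH` rows of history before its first row. Neither assumption is confirmed by the output above, so treat it as a plausibility check. The helper `num_windows` is hypothetical.

```python
def num_windows(n_rows, context_length, forecast_length):
    """Number of sliding windows (stride 1) that fit in n_rows points."""
    return max(0, n_rows - context_length - forecast_length + 1)


CONTEXT_LENGTH = 1536
PREDICTION_LENGTH = 204
test_rows = 2040 - 1836  # rows in the test range of split_config

# Assumption: the test split also receives CONTEXT_LENGTH rows of preceding history.
rows_available = CONTEXT_LENGTH + test_rows
per_building = num_windows(rows_available, CONTEXT_LENGTH, PREDICTION_LENGTH)
print(per_building)        # 1
print(per_building * 100)  # 100
```

One window per building across 100 buildings gives 100 samples. The last dimension of 5 would then correspond to the one target column plus the four control columns.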


In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

def plot_predictions_from_eval(
    data, dset_test, predictions_dict, split_config, building_id, forecast_length, tsp
):
    """
    Visualize the forecast for one building from the combined multi-building data.
    Args:
        data (pd.DataFrame): Original DataFrame containing the data of all buildings.
        dset_test (Dataset): Test dataset returned by zeroshot_eval.
        predictions_dict (PredictionOutput): Prediction results returned by zeroshot_eval.
        split_config (dict): Data split configuration.
        building_id (int): Number of the building to visualize.
        forecast_length (int): Forecast length.
        tsp (TimeSeriesPreprocessor): Data preprocessing object.
    """
    # Compute each building's start and end positions from the total number of rows.
    # Example: building numbers 1-100, 204,000 rows in total
    rows_per_building = data.shape[0] // 100 # assumes 100 buildings in total
    start_index_in_combined = (building_id - 1) * rows_per_building
    end_index_in_combined = building_id * rows_per_building

    # 1. Extract the historical data (context) that was fed to the model
    start_test_idx_in_building = split_config['test'][0]
    context_length = split_config['valid'][0] - split_config['train'][0]

    historical_df = data.iloc[start_index_in_combined + start_test_idx_in_building - context_length :
                              start_index_in_combined + start_test_idx_in_building].copy()

    # 2. Extract the actual values (ground truth) for the forecast period
    end_test_idx_in_building = split_config['test'][1]
    ground_truth_df = data.iloc[start_index_in_combined + start_test_idx_in_building :
                                start_index_in_combined + end_test_idx_in_building].copy()

    # 3. Extract the forecast from predictions_dict
    # The first dimension of the predictions corresponds to the building ID.
    predictions_for_plot = predictions_dict.predictions[0][building_id - 1, :, 0]

    # --- Unscaling ---
    # The keys of `target_scaler_dict` are building IDs.
    target_scaler = tsp.target_scaler_dict[building_id]
    predictions_for_plot = predictions_for_plot.reshape(-1, 1)
    unscaled_predictions = target_scaler.inverse_transform(predictions_for_plot).flatten()

    # 4. Plot the data
    plt.figure(figsize=(15, 6))

    # Concatenate the history and the future ground truth and plot them as the 'Actual' line
    full_ground_truth_df = pd.concat([historical_df, ground_truth_df])
    plt.plot(full_ground_truth_df['일시'], full_ground_truth_df['전력소비량(kWh)'], label='Actual', color='blue')

    # Plot the forecast
    forecast_timestamps = pd.date_range(start=historical_df['일시'].iloc[-1], periods=forecast_length + 1, freq='H')[1:]
    plt.plot(forecast_timestamps, unscaled_predictions, label='Forecast', color='red', linestyle='--')

    # Title and labels
    plt.title(f'Power consumption forecast for building {building_id}', fontsize=16)
    plt.xlabel('Timestamp', fontsize=12)
    plt.ylabel('Power consumption (kWh)', fontsize=12)
    plt.grid(True)
    plt.legend()
    plt.tight_layout()
    plt.show()
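
The unscaling step in the function above maps forecasts from the standardized space back to kWh. The sketch below shows that round trip in plain Python. It assumes standard scaling means subtracting the mean and dividing by the population standard deviation, and the helper names are hypothetical. It does not reproduce the internals of `tsp.target_scaler_dict`.

```python
from statistics import fmean, pstdev


def fit_standard_scaler(values):
    """Return (mean, std) using the population standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    # Guard against constant series, where the standard deviation is 0.
    return mean, (std if std > 0 else 1.0)


def scale(values, mean, std):
    """Map raw values to the standardized space."""
    return [(v - mean) / std for v in values]


def unscale(values, mean, std):
    """Map standardized values back to the original units."""
    return [v * std + mean for v in values]


# Toy target series for one building (kWh); the values are made up.
history = [10.0, 20.0, 30.0, 40.0]
mean, std = fit_standard_scaler(history)

scaled = scale(history, mean, std)
restored = unscale(scaled, mean, std)
print([round(v, 6) for v in restored])  # [10.0, 20.0, 30.0, 40.0]
```

Because each building has its own scaler, the forecast for a building has to be unscaled with that building's mean and standard deviation.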

In [None]:
for i in range(1,101,1):
    plot_predictions_from_eval(
        data=data,
        dset_test=dset_test,
        predictions_dict=predictions_dict,
        split_config=split_config,
        building_id=i,
        forecast_length=PREDICTION_LENGTH,
        tsp=tsp
    )


In [None]:
import numpy as np

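def calculate_smape(y_true, y_pred):
    """
    Compute SMAPE (Symmetric Mean Absolute Percentage Error).
    Args:
        y_true (np.ndarray): Actual values.
        y_pred (np.ndarray): Predicted values.
    Returns:
        float: SMAPE as a percentage.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    numerator = np.abs(y_pred - y_true)
    denominator = (np.abs(y_true) + np.abs(y_pred)) / 2

    # Where the denominator is 0 (both values are 0), count the error as 0 instead of dividing,
    # which avoids NaN values and divide-by-zero warnings.
    smape_values = np.divide(numerator, denominator, out=np.zeros_like(numerator), where=denominator != 0)

    return np.mean(smape_values) * 100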
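
As a quick sanity check of the formula, SMAPE = 100 × mean(|ŷ − y| / ((|y| + |ŷ|) / 2)), here is a small worked example in plain Python. The function `smape_percent` is a hypothetical stand-in written for this illustration, and the numbers are made up.

```python
def smape_percent(y_true, y_pred):
    """SMAPE in percent; pairs where both values are 0 contribute 0."""
    terms = []
    for actual, forecast in zip(y_true, y_pred):
        denominator = (abs(actual) + abs(forecast)) / 2
        terms.append(0.0 if denominator == 0 else abs(forecast - actual) / denominator)
    return 100 * sum(terms) / len(terms)


y_true = [100.0, 200.0, 0.0]
y_pred = [110.0, 180.0, 0.0]
# Per-point terms: 10/105, 20/190 and 0 (both values are 0).
print(round(smape_percent(y_true, y_pred), 2))  # 6.68
```

Note that the zero-denominator convention lowers the average: the third point counts as a perfect forecast rather than being excluded.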

In [None]:
# Compute the number of rows per building from the total number of rows.
rows_per_building = data.shape[0] // 100

# Initialize a dictionary for the per-building SMAPE results, and lists that collect
# all actual and predicted values for the overall SMAPE.
smape_results = {}
all_y_true = []
all_y_pred = []

# Take the start and end of the test period from split_config.
start_test_idx_in_building = split_config['test'][0]
end_test_idx_in_building = split_config['test'][1]

# Loop over the 100 buildings.
for building_id in range(1, 101):
    # Compute the start index of this building's data.
    start_index_in_combined = (building_id - 1) * rows_per_building

    # Extract the actual values (ground truth) for the forecast period
    ground_truth_df = data.iloc[start_index_in_combined + start_test_idx_in_building :
                                start_index_in_combined + end_test_idx_in_building].copy()

    # Extract the array of actual values
    y_true_single = ground_truth_df['전력소비량(kWh)'].values

    # Extract the array of predicted values
    predictions_for_plot = predictions_dict.predictions[0][building_id - 1, :, 0]

    # Unscaling
    target_scaler = tsp.target_scaler_dict[building_id]
    predictions_for_plot = predictions_for_plot.reshape(-1, 1)
    y_pred_single = target_scaler.inverse_transform(predictions_for_plot).flatten()

    # Compute the per-building SMAPE
    smape_single = calculate_smape(y_true_single, y_pred_single)
    smape_results[building_id] = smape_single

    # Append to the lists used for the overall SMAPE
    all_y_true.append(y_true_single)
    all_y_pred.append(y_pred_single)

# Print the results
print("--- SMAPE per building ---")
for building, smape_val in smape_results.items():
    print(f"Building {building:03d} SMAPE: {smape_val:.2f}%")

# Compute SMAPE over all data
all_y_true_combined = np.concatenate(all_y_true)
all_y_pred_combined = np.concatenate(all_y_pred)

overall_smape = calculate_smape(all_y_true_combined, all_y_pred_combined)

print("\n--- Overall SMAPE ---")
print(f"SMAPE over all buildings: {overall_smape:.2f}%")

--- SMAPE per building ---
Building 001 SMAPE: 11.11%
Building 002 SMAPE: 10.29%
Building 003 SMAPE: 7.16%
Building 004 SMAPE: 6.72%
Building 005 SMAPE: 6.75%
Building 006 SMAPE: 16.07%
Building 007 SMAPE: 14.70%
Building 008 SMAPE: 10.97%
Building 009 SMAPE: 6.44%
Building 010 SMAPE: 20.16%
Building 011 SMAPE: 8.74%
Building 012 SMAPE: 7.40%
Building 013 SMAPE: 9.86%
Building 014 SMAPE: 8.31%
Building 015 SMAPE: 10.52%
Building 016 SMAPE: 12.36%
Building 017 SMAPE: 6.50%
Building 018 SMAPE: 10.43%
Building 019 SMAPE: 16.14%
Building 020 SMAPE: 3.60%
Building 021 SMAPE: 5.71%
Building 022 SMAPE: 7.52%
Building 023 SMAPE: 19.54%
Building 024 SMAPE: 10.62%
Building 025 SMAPE: 11.93%
Building 026 SMAPE: 13.64%
Building 027 SMAPE: 10.16%
Building 028 SMAPE: 6.95%
Building 029 SMAPE: 6.71%
Building 030 SMAPE: 0.57%
Building 031 SMAPE: 6.56%
Building 032 SMAPE: 9.69%
Building 033 SMAPE: 17.18%
Building 034 SMAPE: 6.08%
Building 035 SMAPE: 0.69%
Building 036 SMAPE: 1.01%
Building 037 SMAPE: 13.14%
Building 038 SMAPE: 6.58%
Building 039 SMAPE: 7.29%
Building 040 SMAPE: 7.33%
Building 041 SMAPE: 0.60%
Building 042 SMAPE: 5.60%
Building 043 SMAPE: 1.90%
Building 044 SMAPE: 8.62%
Building 045 SMAPE: 11.92%
Building 046 SMAPE: 10.7