# Forecasting with Chronos-2

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/autogluon/autogluon/blob/stable/docs/tutorials/timeseries/forecasting-chronos.ipynb)
[![Open In SageMaker Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/autogluon/autogluon/blob/stable/docs/tutorials/timeseries/forecasting-chronos.ipynb)

AutoGluon-TimeSeries (AG-TS) includes the [Chronos](https://github.com/amazon-science/chronos-forecasting) family of forecasting models. Chronos models are pretrained on a large collection of real and synthetic time series data, enabling accurate out-of-the-box forecasts on new data.

AG-TS provides a robust and user-friendly way to work with Chronos through the familiar `TimeSeriesPredictor` API. It allows users to backtest models, compare them with other forecasting approaches, and ensemble Chronos with other models to build robust forecasting pipelines. This tutorial demonstrates how to:

- Use Chronos-2 in **zero-shot** mode to generate forecasts without dataset-specific training
- **Fine-tune** Chronos-2 on custom data to improve accuracy

:::{note}

**New in v1.5:** AutoGluon now features [Chronos-2](https://arxiv.org/abs/2510.15821), the latest version of Chronos models with _zero-shot_ support for covariates and a [90%+ win-rate](https://huggingface.co/spaces/autogluon/fev-bench) over Chronos-Bolt. The older version of this tutorial with the Chronos-Bolt model is available [here](https://auto.gluon.ai/1.4.0/tutorials/timeseries/forecasting-chronos.html).

:::

In [1]:
# We use uv for faster installation
!pip install uv
!uv pip install -q autogluon.timeseries --system
!uv pip uninstall -q torchaudio torchvision torchtext --system # fix incompatible package versions on Colab



## Getting started with Chronos-2

Being a pretrained model for zero-shot forecasting, Chronos is different from other models available in AG-TS.
Specifically, by default, Chronos models do not actually `fit` time series data. Instead, when `predict` is called, they perform _zero-shot inference_ using the provided context. In this respect, they behave like local statistical models such as ETS or ARIMA, where all computation happens during inference.

AutoGluon supports the original Chronos models (e.g., [`chronos-t5-large`](https://huggingface.co/autogluon/chronos-t5-large)), the Chronos-Bolt models (e.g., [`chronos-bolt-base`](https://huggingface.co/autogluon/chronos-bolt-base)), and the latest Chronos-2 models (e.g., [`chronos-2`](https://huggingface.co/autogluon/chronos-2)). The following table compares the capabilities of the three model families.

| Capability | Chronos | Chronos-Bolt | Chronos-2 |
|------------|---------|--------------|-----------|
| Univariate Forecasting | ✅ | ✅ | ✅ |
| Cross-learning across items | ❌ | ❌ | ✅ |
| Multivariate Forecasting | ❌ | ❌ | ✅ |
| Past-only (real/categorical) covariates | ❌ | ❌ | ✅ |
| Known future (real/categorical) covariates | 🧩 | 🧩 | ✅ |
| Fine-tuning support | ✅ | ✅ | ✅ |
| Max. Context Length | 512 | 2048 | 8192 |
| Max. Prediction Length | 64 | 64 | 1024 |
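
As a quick sanity check, the limits in the table can be encoded in a small lookup. The sketch below is a hypothetical helper (`MODEL_LIMITS` and `within_limits` are illustrative names, not part of AutoGluon). It only reports whether a requested context or horizon exceeds the values listed above, and says nothing about how AG-TS handles longer inputs.

```python
# Limits copied from the capability table above (illustrative helper, not an AutoGluon API).
MODEL_LIMITS = {
    "Chronos": {"max_context": 512, "max_prediction": 64},
    "Chronos-Bolt": {"max_context": 2048, "max_prediction": 64},
    "Chronos-2": {"max_context": 8192, "max_prediction": 1024},
}


def within_limits(family: str, context_length: int, prediction_length: int) -> dict:
    """Report whether the requested lengths fall within the table's maxima."""
    limits = MODEL_LIMITS[family]
    return {
        "context_ok": context_length <= limits["max_context"],
        "prediction_ok": prediction_length <= limits["max_prediction"],
    }


print(within_limits("Chronos-2", context_length=4096, prediction_length=48))
print(within_limits("Chronos-Bolt", context_length=4096, prediction_length=48))
```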


The easiest way to get started with Chronos is through the model-specific presets.

- **(recommended)** The Chronos-2 models can be accessed using the `"chronos2_small"` and `"chronos2"` presets.
- The Chronos-Bolt models can be accessed using the `"bolt_tiny"`, `"bolt_mini"`, `"bolt_small"` and `"bolt_base"` presets.

Alternatively, Chronos models can be combined with other time series models using presets `"medium_quality"`, `"high_quality"` and `"best_quality"`. More details about these presets are available in the documentation for [`TimeSeriesPredictor.fit`](https://auto.gluon.ai/stable/api/autogluon.timeseries.TimeSeriesPredictor.fit.html).


🧩 Chronos/Chronos-Bolt do not natively support future covariates, but they can be combined with external covariate regressors. This only models per-timestep effects, not effects across time. In contrast, Chronos-2 supports all covariate types natively.

## Zero-shot forecasting

### Univariate Forecasting

Let's work with a subset of the [Australian Electricity Demand dataset](https://zenodo.org/records/4659727) to see Chronos-2 in action.

First, we load the dataset as a [TimeSeriesDataFrame](https://auto.gluon.ai/stable/api/autogluon.timeseries.TimeSeriesDataFrame.html).

In [2]:
import pandas as pd
from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor

In [4]:
data = TimeSeriesDataFrame.from_path(
    "/content/data.csv"
)
data.head()

| item_id | timestamp | target |
|---------|-----------|--------|
| tsla | 2010-07-01 | 1.464 |
| tsla | 2010-07-02 | 1.28 |
| tsla | 2010-07-06 | 1.074 |
| tsla | 2010-07-07 | 1.053333 |
| tsla | 2010-07-08 | 1.164 |
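
For reference, the output above follows the long format: one row per (`item_id`, `timestamp`) pair plus a `target` column. The sketch below uses only the standard library to write and re-read a tiny CSV in that layout. The rows are made-up values for illustration; the column names are the ones shown in the output above.

```python
import csv
import io

# Made-up rows in long format: one row per (item_id, timestamp) pair.
rows = [
    {"item_id": "series_a", "timestamp": "2024-01-01", "target": 1.0},
    {"item_id": "series_a", "timestamp": "2024-01-02", "target": 1.5},
    {"item_id": "series_b", "timestamp": "2024-01-01", "target": 10.0},
    {"item_id": "series_b", "timestamp": "2024-01-02", "target": 9.0},
]

# Write the rows to an in-memory CSV file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["item_id", "timestamp", "target"])
writer.writeheader()
writer.writerows(rows)

# Read it back and group the observations by series.
buffer.seek(0)
series = {}
for row in csv.DictReader(buffer):
    series.setdefault(row["item_id"], []).append((row["timestamp"], float(row["target"])))

for item_id, observations in series.items():
    print(item_id, observations)
```

A CSV file with these three columns can be loaded with `TimeSeriesDataFrame.from_path`, as in the cell above.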


Next, we create the [TimeSeriesPredictor](https://auto.gluon.ai/stable/api/autogluon.timeseries.TimeSeriesPredictor.html) and select the `"chronos2"` preset to use the Chronos-2 (120M) model in zero-shot mode.

In [13]:
num_test_windows = 3
prediction_length = 5000
train_data, test_data = data.train_test_split(num_test_windows * prediction_length)

# predictor = TimeSeriesPredictor(eval_metric="WQL", freq="h", prediction_length=prediction_length, quantile_levels = [0.99]).fit(
#     data,
#     enable_ensemble=True,
#     presets="chronos2_ensemble",
#     hyperparameters={"Chronos2": {"fine_tune": True, "fine_tune_mode": "full", "fine_tune_lr": 1e-4, "fine_tune_steps": 2000, "fine_tune_batch_size": 32}},
# )
predictor = TimeSeriesPredictor(
    freq="D",
    prediction_length=prediction_length,
    quantile_levels=[0.5, 0.99],
).fit(
    data,
    presets="chronos2",
)

Beginning AutoGluon training...
AutoGluon will save models to '/content/AutogluonModels/ag-20251228_221812'
AutoGluon Version:  1.5.0
Python Version:     3.12.12
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Thu Oct  2 10:42:05 UTC 2025
CPU Count:          2
Pytorch Version:    2.9.0+cu126
CUDA Version:       12.6
GPU Memory:         GPU 0: 12.07/14.74 GB
Total GPU Memory:   Free: 12.07 GB, Allocated: 2.67 GB, Total: 14.74 GB
GPU Count:          1
Memory Avail:       9.26 GB / 12.67 GB (73.1%)
Disk Space Avail:   72.96 GB / 112.64 GB (64.8%)
Setting presets to: chronos2

Fitting with arguments:
{'enable_ensemble': True,
 'eval_metric': WQL,
 'freq': 'D',
 'hyperparameters': {'Chronos2': {'model_path': 'autogluon/chronos-2'}},
 'known_covariates_names': [],
 'num_val_windows': 1,
 'prediction_length': 5000,
 'quantile_levels': [0.5, 0.99],
 'random_seed': 123,
 'refit_every_n_windows': 1,
 'refit_full': False,
 'skip_model_selection': True,
 'target': '

As promised, Chronos does not take any time to `fit`. The `fit` call merely serves as a proxy for the `TimeSeriesPredictor` to do some of its chores under the hood, such as inferring the frequency of time series and saving the predictor's state to disk.

Let's use the `predict` method to generate forecasts.

In [14]:
predictions = predictor.predict(data)
predictions

data with frequency 'IRREG' has been resampled to frequency 'D'.
Model not specified in predict, will default to the model with the best validation score: Chronos2


| item_id | timestamp | mean | 0.5 | 0.99 |
|---------|-----------|------|-----|------|
| tsla | 2025-11-27 | 398.264282 | 398.264282 | 439.533905 |
| tsla | 2025-11-28 | 404.633362 | 404.633362 | 448.069153 |
| tsla | 2025-11-29 | 276.994751 | 276.994751 | 315.374329 |
| tsla | 2025-11-30 | 288.778900 | 288.778900 | 328.619354 |
| tsla | 2025-12-01 | 393.092804 | 393.092804 | 447.523590 |
| tsla | ... | ... | ... | ... |
| tsla | 2039-08-01 | 203.697388 | 203.697388 | 373.244019 |
| tsla | 2039-08-02 | 204.510559 | 204.510559 | 373.062012 |
| tsla | 2039-08-03 | 203.780548 | 203.780548 | 371.179535 |
| tsla | 2039-08-04 | 204.101669 | 204.101669 | 373.209991 |


In [10]:
# Convert to pandas + save
pred_df = predictions.reset_index()   # columns: item_id, timestamp, mean + quantiles
pred_df.to_csv("predictions.csv", index=False)

# Download from Colab
from google.colab import files
files.download("predictions.csv")



In [37]:
# --- Convert Chronos2 / AutoGluon predictions to MQL5 "EMBEDDED_HEADERS / EMBEDDED_DATA" ---
import pandas as pd

# If you already have `predictions` in memory (from predictor.predict(...)), this will use it.
# Otherwise, load from a CSV you uploaded (adjust filename/path as needed).
if "predictions" in globals():
    pred_df = predictions.copy()
else:
    pred_df = pd.read_csv("/content/predictions.csv")  # <-- change if your file has a different name

# If predictions came as a MultiIndex (item_id, timestamp), normalize to columns
if isinstance(pred_df.index, pd.MultiIndex):
    pred_df = pred_df.reset_index()

# Ensure timestamp is datetime
if "timestamp" not in pred_df.columns:
    raise ValueError("Expected a 'timestamp' column in predictions.")
pred_df["timestamp"] = pd.to_datetime(pred_df["timestamp"])

# (Optional) pick a specific series if you have multiple item_id
if "item_id" in pred_df.columns:
    SERIES_ID = pred_df["item_id"].iloc[0]  # or set manually, e.g. "H1"
    pred_df = pred_df[pred_df["item_id"] == SERIES_ID].copy()

# --- autodetect quantile columns (prefer: all columns AFTER mean) ---
cols = list(pred_df.columns)

# find "mean" column (case-insensitive)
mean_col = next((c for c in cols if str(c).lower() == "mean"), None)

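def is_quant_col(c):
    """Return True if the column name parses as a quantile level in (0, 1)."""
    s = str(c).strip()
    try:
        q = float(s)
    except ValueError:
        # Non-numeric column names such as "timestamp" or "mean"
        return False
    return 0 < q < 1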

if mean_col is not None:
    mean_idx = cols.index(mean_col)
    candidate = cols[mean_idx + 1 :]
    quant_cols = [c for c in candidate if is_quant_col(c)]
else:
    # fallback: any numeric column name between 0 and 1 (excluding timestamp/item_id/etc)
    quant_cols = [c for c in cols if is_quant_col(c)]

if not quant_cols:
    raise ValueError("No quantile columns found. Expected columns like '0.01'...'0.99'.")

# Sort quantiles ascending by their numeric value
quant_cols = sorted(quant_cols, key=lambda c: float(str(c).strip()))

# Map "0.01" -> "P1", "0.10" -> "P10", ..., "0.99" -> "P99"
quant_labels = []
for c in quant_cols:
    q = float(str(c).strip())
    p = int(round(q * 100))
    quant_labels.append(f"P{p}")

# If you truly want to REQUIRE P1..P99, uncomment this check:
# missing = [p for p in range(1, 100) if f"P{p}" not in set(quant_labels)]
# if missing:
#     raise ValueError(f"Missing quantiles: {missing} (your file doesn't contain all P1..P99 columns)")

# Header in your EA style (date + P1..)
header = "date," + ",".join(quant_labels)

def fmt_num(x):
    # nice compact doubles (up to 6 decimals), similar to your sample
    if pd.isna(x):
        return ""
    s = f"{float(x):.6f}".rstrip("0").rstrip(".")
    return s

lines = []
for _, row in pred_df.iterrows():
    dt_str = row["timestamp"].strftime("%m-%d %H:%M")  # e.g. "12-15 07:00"
    vals = [fmt_num(row[c]) for c in quant_cols]
    lines.append(dt_str + "," + ",".join(vals))

# --- Print as MQL5 embedded strings ---
print("//------------------------------ Embedded forecast ------------------------------")
print(f'string EMBEDDED_HEADERS = "{header}";')
print("string EMBEDDED_DATA =")
for ln in lines:
    # NOTE: this prints literal \n (one backslash) into the MQL string
    print(f"\"{ln}\\n\"")
print(";")

# --- Also save to a file you can download from Colab ---
out_path = "/content/embedded_forecast.txt"
with open(out_path, "w", encoding="utf-8") as f:
    f.write("//------------------------------ Embedded forecast ------------------------------\n")
    f.write(f'string EMBEDDED_HEADERS = "{header}";\n')
    f.write("string EMBEDDED_DATA =\n")
    for ln in lines:
        # NOTE: write literal \n (one backslash) into the MQL string
        f.write(f"\"{ln}\\n\"\n")
    f.write(";\n")

print(f"\nSaved: {out_path}")

# Download (Colab)
from google.colab import files
files.download(out_path)


//------------------------------ Embedded forecast ------------------------------
string EMBEDDED_HEADERS = "date,P1,P2,P3,P4,P5,P6,P7,P8,P9,P10,P11,P12,P13,P14,P15,P16,P17,P18,P19,P20,P21,P22,P23,P24,P25,P26,P27,P28,P29,P30,P31,P32,P33,P34,P35,P36,P37,P38,P39,P40,P41,P42,P43,P44,P45,P46,P47,P48,P49,P50,P51,P52,P53,P54,P55,P56,P57,P58,P59,P60,P61,P62,P63,P64,P65,P66,P67,P68,P69,P70,P71,P72,P73,P74,P75,P76,P77,P78,P79,P80,P81,P82,P83,P84,P85,P86,P87,P88,P89,P90,P91,P92,P93,P94,P95,P96,P97,P98,P99";
string EMBEDDED_DATA =
"09-22 17:00,3720.21875,3726.173828,3732.128906,3738.08374,3744.038818,3745.417969,3746.797119,3748.176025,3749.555176,3750.934326,3751.767334,3752.600342,3753.433594,3754.266602,3755.099609,3755.758301,3756.417236,3757.075928,3757.734863,3758.393555,3759.112061,3759.830566,3760.548828,3761.267334,3761.98584,3763.093262,3764.200439,3765.307861,3766.415039,3767.522461,3768.155762,3768.788818,3769.422119,3770.055176,3770.688477,3770.251709,3769.815186,3769.378418,3768.941


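The forecast dataframe contains the point forecast (`mean`) and one column per requested quantile level, which together capture the uncertainty in the forecasts. The predictor above was configured with `quantile_levels=[0.5, 0.99]`; when this argument is omitted, AG-TS produces nine quantiles (0.1 through 0.9). Custom quantile levels can be specified as follows: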
```py
TimeSeriesPredictor(..., quantile_levels=[0.05, 0.1, 0.5, 0.9, 0.95])
```
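
To make the role of quantile forecasts concrete, the sketch below computes the quantile (pinball) loss, which penalizes under- and over-prediction asymmetrically depending on the quantile level. The numbers are made up for illustration, and `pinball_loss` is a hypothetical helper, not an AutoGluon function. Metrics such as WQL build on this loss but apply additional aggregation and scaling that are not reproduced here.

```python
def pinball_loss(y_true: float, y_pred: float, quantile: float) -> float:
    """Quantile (pinball) loss for a single observation."""
    diff = y_true - y_pred
    # Under-prediction (diff > 0) is weighted by q, over-prediction by (1 - q).
    return quantile * diff if diff >= 0 else (quantile - 1) * diff


actual = 100.0
# Made-up forecasts for two quantile levels.
forecasts = {0.5: 95.0, 0.99: 130.0}

for q, pred in forecasts.items():
    print(f"q={q}: loss={pinball_loss(actual, pred, q):.2f}")
```

For a high quantile such as 0.99, predicting above the actual value is cheap while predicting below it is expensive, which is what pushes the 0.99 forecast toward the upper tail of the distribution.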

AG-TS also makes it easy to generate predictions for multiple backtest dates and to visualize the models' predictions.

In [16]:
import matplotlib.pyplot as plt

# Generate predictions for multiple windows
predictions_per_window = predictor.backtest_predictions(test_data, num_val_windows=num_test_windows)

# Plot predictions for the first two time series
item_ids = test_data.item_ids[:2].tolist()
all_predictions = pd.concat(predictions_per_window)
predictor.plot(test_data, all_predictions, max_history_length=300, item_ids=item_ids)

# Optional: Plot the cutoff dates with dashed vertical lines
for cutoff in range(-num_test_windows * prediction_length, 0, prediction_length):
    for i, ax in enumerate(plt.gcf().axes):
        cutoff_timestamp = test_data.loc[item_ids[i]].index[cutoff]
        ax.axvline(cutoff_timestamp, color='gray', linestyle='--')
plt.show()

data with frequency 'IRREG' has been resampled to frequency 'D'.
Model Chronos2 failed to predict with the following exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/dist-packages/autogluon/timeseries/trainer/trainer.py", line 1142, in get_model_pred_dict
    model_pred_dict[model_name] = self._predict_model(
                                  ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/autogluon/timeseries/trainer/trainer.py", line 1070, in _predict_model
    return model.predict(model_inputs, known_covariates=known_covariates)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/autogluon/timeseries/models/abstract/abstract_timeseries_model.py", line 622, in predict
    predictions = self._predict(data=data, known_covariates=known_covariates, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/

RuntimeError: Following models failed to predict: ['Chronos2']
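
The cutoff arithmetic used in the plotting loop above can be sketched in isolation. The helper below is hypothetical (it is not part of AutoGluon) and assumes non-overlapping windows taken from the end of each series, matching `range(-num_test_windows * prediction_length, 0, prediction_length)`. It also checks that the series is long enough to leave some history before the first cutoff, which is worth verifying before backtesting.

```python
def backtest_cutoffs(series_length: int, prediction_length: int, num_windows: int) -> list:
    """Return (cutoff, window_end) index pairs for windows taken from the end of a series."""
    required = num_windows * prediction_length
    if series_length <= required:
        raise ValueError(
            f"Series has {series_length} observations, but {num_windows} windows of "
            f"length {prediction_length} need more than {required}."
        )
    windows = []
    # Negative offsets count back from the end of the series, as in the plotting loop.
    for offset in range(-required, 0, prediction_length):
        cutoff = series_length + offset  # first index of the forecast window
        windows.append((cutoff, cutoff + prediction_length))
    return windows


# Made-up example: 1000 observations, three windows of 48 steps each.
for cutoff, end in backtest_cutoffs(series_length=1000, prediction_length=48, num_windows=3):
    print(f"history: [0, {cutoff})  forecast window: [{cutoff}, {end})")
```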

## Forecasting with covariates

The previous example showed Chronos-2 in action on a univariate forecasting task, i.e., one that uses only the historical data of the target time series to make predictions. However, in real-world scenarios, additional exogenous information related to the target series (e.g., weather forecasts, holidays, promotions) is often available. These exogenous time series, often referred to as covariates, may be observed only in the past (past-only) or also over the forecast horizon (known future). Leveraging this information when making predictions can improve forecast accuracy.

Chronos-2 natively supports (dynamic) covariates, whether past-only or known-future, real-valued or categorical. Let's see how we can use Chronos-2 to forecast with covariates on an **Electrical Load Forecasting** task.

In [None]:
data = TimeSeriesDataFrame.from_path(
    "https://autogluon.s3.amazonaws.com/datasets/timeseries/bull/test.parquet", id_column="id"
)
data.head()

| item_id | timestamp | load | airtemperature | dewtemperature | sealvlpressure |
|---------|-----------|------|----------------|----------------|----------------|
| Bull_education_Magaret | 2016-01-01 00:00:00 | 0.0 | 9.4 | 3.3 | 1028.699951 |
| Bull_education_Magaret | 2016-01-01 01:00:00 | 2.7908 | 8.9 | 2.2 | 1028.800049 |
| Bull_education_Magaret | 2016-01-01 02:00:00 | 3.721 | 8.9 | 2.2 | 1029.599976 |
| Bull_education_Magaret | 2016-01-01 03:00:00 | 2.7908 | 8.3 | 1.7 | 1029.5 |
| Bull_education_Magaret | 2016-01-01 04:00:00 | 9.3025 | 7.8 | 1.7 | 1029.599976 |


The goal is to forecast the load for the next day (24 hours) using the historical load and known weather covariates: air temperature, dew temperature and sea level pressure. Since future weather information is not known in advance, weather forecasts are typically used as known covariates in practice.

In [None]:
prediction_length = 24
train_data, test_data = data.train_test_split(prediction_length=prediction_length)

Sorting the dataframe index before generating the train/test split.


The following code uses Chronos-2 in the TimeSeriesPredictor to forecast the `load` for the next 24 hours. We use the _univariate_ [Chronos-Bolt (Small)](https://huggingface.co/autogluon/chronos-bolt-small) model as a baseline for comparison.

Note that we have specified the target column we are interested in forecasting and the names of known covariates while constructing the TimeSeriesPredictor. Any other columns, if present, will be used as past-only covariates.

In [None]:
predictor = TimeSeriesPredictor(
    prediction_length=prediction_length,
    target="load",
    known_covariates_names=["airtemperature", "dewtemperature", "sealvlpressure"],
    eval_metric="MASE",
).fit(
    train_data,
    hyperparameters={"Chronos": {}, "Chronos2": {}},
    enable_ensemble=False,
    time_limit=60,
)

Beginning AutoGluon training... Time limit = 60s
AutoGluon will save models to '/fsx/ansarnd/repos/autogluon/docs/tutorials/timeseries/AutogluonModels/ag-20251214_125334'
AutoGluon Version:  1.4.1b20250910
Python Version:     3.11.11
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #38~22.04.1-Ubuntu SMP Fri Aug 22 15:44:33 UTC 2025
CPU Count:          96
Pytorch Version:    2.7.1+cu126
CUDA Version:       12.6
GPU Memory:         GPU 0: 39.37/39.38 GB
Total GPU Memory:   Free: 39.37 GB, Allocated: 0.01 GB, Total: 39.38 GB
GPU Count:          1
Memory Avail:       1029.12 GB / 1121.80 GB (91.7%)
Disk Space Avail:   1758.43 GB / 11459.15 GB (15.3%)

Fitting with arguments:
{'enable_ensemble': False,
 'eval_metric': MASE,
 'hyperparameters': {'Chronos': {}, 'Chronos2': {}},
 'known_covariates_names': ['airtemperature',
                            'dewtemperature',
                            'sealvlpressure'],
 'num_val_windows': 1,
 'prediction_length': 24,
 'qua
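
The column roles described above can be summarized with a small sketch. `column_roles` is a hypothetical helper, not an AutoGluon function. It simply mirrors the rule stated in the text: the target is forecast, the listed known covariates are available over the forecast horizon, and every remaining column is treated as a past-only covariate.

```python
def column_roles(columns, target, known_covariates_names):
    """Assign each data column the role described in the text above."""
    roles = {}
    for name in columns:
        if name == target:
            roles[name] = "target"
        elif name in known_covariates_names:
            roles[name] = "known covariate"
        else:
            roles[name] = "past-only covariate"
    return roles


columns = ["load", "airtemperature", "dewtemperature", "sealvlpressure"]

# Configuration used in this tutorial: all weather columns are known in the future.
print(column_roles(columns, "load", ["airtemperature", "dewtemperature", "sealvlpressure"]))

# Alternative: declaring only air temperature as known leaves the rest as past-only.
print(column_roles(columns, "load", ["airtemperature"]))
```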

Once the predictor has been fit, we can evaluate it on the test dataset and generate the leaderboard. We see that Chronos-2, which utilizes covariates, produces a significantly more accurate forecast on the test set compared to Chronos-Bolt, which does not utilize covariates.

Note that all AutoGluon-TimeSeries models report scores in a "higher is better" format, meaning that most forecasting error metrics like MASE are multiplied by -1 when reported.

In [17]:
predictor.leaderboard(test_data)

data with frequency 'IRREG' has been resampled to frequency 'D'.
Additional data provided, testing on additional data. Resulting leaderboard will be sorted according to test score (`score_test`).


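| | model | score_test | score_val | pred_time_test | pred_time_val | fit_time_marginal | fit_order |
|---|-------|------------|-----------|----------------|---------------|-------------------|-----------|
| 0 | Chronos2 | -1.450125 | | 1.867302 | | 0.632699 | 1 |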
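
The "higher is better" sign convention can be illustrated with a hand-rolled MASE computation. This is a simplified sketch on made-up numbers, assuming a non-seasonal naive forecast (seasonal period 1) for the scaling term. AutoGluon's own implementation may differ in details such as the seasonal period and the aggregation across series.

```python
def mase(history, actual, forecast, seasonal_period=1):
    """Mean absolute scaled error with a (seasonal) naive in-sample scale."""
    # Scale: mean absolute error of the naive forecast on the history.
    naive_errors = [
        abs(history[i] - history[i - seasonal_period])
        for i in range(seasonal_period, len(history))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    # Forecast error on the held-out window.
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale


history = [10.0, 12.0, 11.0, 13.0, 12.0]  # made-up past values
actual = [14.0, 13.0]                      # made-up future values
forecast = [13.0, 13.5]                    # made-up predictions

error = mase(history, actual, forecast)
score = -error  # "higher is better" convention used in the leaderboard
print(f"MASE={error:.4f}, reported score={score:.4f}")
```

A score closer to zero therefore indicates a more accurate model, and a MASE below 1 means the forecast beats the naive baseline on average.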


We can also use the predictor to compute feature importances to understand which exogenous features affect the prediction accuracy the most.

In [18]:
predictor.feature_importance(test_data, model="Chronos2", relative_scores=True)

data with frequency 'IRREG' has been resampled to frequency 'D'.
Computing feature importance


| | importance | stdev | n | p99_low | p99_high |
|---|------------|-------|---|---------|----------|


With `relative_scores=True`, this method returns relative (percentage) improvements in the `eval_metric` due to each feature. In this example, the `airtemperature` feature is the most important for accurate forecasting, yielding a ~32% error reduction on the test set.

Note that covariates may not always be useful and using more covariates does not necessarily imply more accurate forecasts. With Chronos-2, AutoGluon makes it easy for users to quickly validate different configurations and find ones that perform best on held-out data.

## Fine-tuning

We have seen above how Chronos-2 models can produce forecasts in zero-shot mode, both with and without covariates. AutoGluon also makes it easy to fine-tune Chronos models on a specific dataset to maximize the predictive accuracy.

The following snippet specifies two settings for the Chronos-2 model: zero-shot and fine-tuned. `TimeSeriesPredictor` will perform a lightweight fine-tuning of the pretrained model on the provided training data. We add name suffixes to easily identify the zero-shot and fine-tuned versions of the model.

:::{note}

If you are fine-tuning on a machine with multiple GPUs, we strongly recommend setting the `CUDA_VISIBLE_DEVICES` environment variable to ensure that only a single GPU is visible.

:::
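
One way to apply this from Python is to set the variable before any CUDA-aware library is imported. The sketch below assumes GPU index `0` is the device you want to expose; the variable can equally be set in the shell before launching the process.

```python
import os

# Expose only the first GPU to this process. Setting the variable before
# importing torch or autogluon is the safe option, because it is read when
# CUDA is initialized.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # assumption: GPU 0 is the one to use

print(os.environ["CUDA_VISIBLE_DEVICES"])
```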

In [None]:
predictor = TimeSeriesPredictor(
    prediction_length=prediction_length,
    target="load",
    known_covariates_names=["airtemperature", "dewtemperature", "sealvlpressure"],
    eval_metric="MASE",
).fit(
    train_data=train_data,
    hyperparameters={
        "Chronos2": [
            # Zero-shot model
            {"ag_args": {"name_suffix": "ZeroShot"}},
            # Fine-tuned model
            {"fine_tune": True, "ag_args": {"name_suffix": "FineTuned"}},
        ]
    },
    time_limit=300,  # time limit in seconds
    enable_ensemble=False,
)

Beginning AutoGluon training... Time limit = 300s
AutoGluon will save models to '/fsx/ansarnd/repos/autogluon/docs/tutorials/timeseries/AutogluonModels/ag-20251214_125418'
AutoGluon Version:  1.4.1b20250910
Python Version:     3.11.11
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #38~22.04.1-Ubuntu SMP Fri Aug 22 15:44:33 UTC 2025
CPU Count:          96
Pytorch Version:    2.7.1+cu126
CUDA Version:       12.6
GPU Memory:         GPU 0: 39.37/39.38 GB
Total GPU Memory:   Free: 39.37 GB, Allocated: 0.01 GB, Total: 39.38 GB
GPU Count:          1
Memory Avail:       1028.93 GB / 1121.80 GB (91.7%)
Disk Space Avail:   1758.40 GB / 11459.15 GB (15.3%)

Fitting with arguments:
{'enable_ensemble': False,
 'eval_metric': MASE,
 'hyperparameters': {'Chronos2': [{'ag_args': {'name_suffix': 'ZeroShot'}},
                                  {'ag_args': {'name_suffix': 'FineTuned'},
                                   'fine_tune': True}]},
 'known_covariates_names': ['airtemp

Here we used the default fine-tuning configuration for Chronos-2 by only specifying `"fine_tune": True`. By default, Chronos-2 is fine-tuned with a low-rank adapter (LoRA) to reduce memory and disk footprint. AutoGluon makes it easy to change other parameters for fine-tuning such as the mode, number of steps or learning rate.
```python
predictor.fit(
    ...,
    hyperparameters={"Chronos2": {"fine_tune": True, "fine_tune_mode": "full", "fine_tune_lr": 1e-4, "fine_tune_steps": 2000, "fine_tune_batch_size": 32}},
)
```

For the full list of fine-tuning options, see the Chronos-2 documentation in [Forecasting Model Zoo](forecasting-model-zoo.md#autogluon.timeseries.models.Chronos2Model).


After fitting, we can evaluate the two model variants on the test data and generate a leaderboard.

In [None]:
predictor.leaderboard(test_data)

Additional data provided, testing on additional data. Resulting leaderboard will be sorted according to test score (`score_test`).


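| | model | score_test | score_val | pred_time_test | pred_time_val | fit_time_marginal | fit_order |
|---|-------|------------|-----------|----------------|---------------|-------------------|-----------|
| 0 | Chronos2FineTuned | -0.677888 | -0.802909 | 2.51046 | 0.773595 | 123.887496 | 2 |
| 1 | Chronos2ZeroShot | -0.696239 | -0.817203 | 2.385682 | 1.586984 | 0.885827 | 1 |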


Fine-tuning resulted in a more accurate model, as shown by the better `score_test` on the test set.

## FAQ


#### How accurate is Chronos-2?

As of December 2025, Chronos-2 is the best-performing time series foundation model across multiple benchmarks, including [fev-bench](https://huggingface.co/spaces/autogluon/fev-bench), [GIFT-Eval](https://huggingface.co/spaces/Salesforce/GIFT-Eval) and [Chronos Bench II](https://arxiv.org/abs/2403.07815). Detailed empirical results can be found in the [Chronos-2 technical report](https://arxiv.org/abs/2510.15821). The accuracy of Chronos-2 often exceeds that of statistical baseline models and task-specific deep learning models such as `DeepAR` and `TemporalFusionTransformer`.

#### Does fine-tuning always improve Chronos-2's forecasting accuracy?

Fine-tuning a foundation model like Chronos-2 involves many hyperparameter choices. AG-TS provides reasonable defaults that performed well in large-scale benchmarking, but they may not be optimal for every use case. We recommend fine-tuning only when you have a reasonable number of time series and sufficient historical data (e.g., >100 time series with a median history length larger than `3 * prediction_length`), as limited data can lead to overfitting or degraded performance. If you observe degraded accuracy, we recommend increasing the size of the training data and experimenting with different fine-tuning hyperparameters.

Alternatively, you can use an ensemble of zero-shot Chronos-2 and fine-tuned Chronos-2 (Small) to construct a robust predictor, available via the `chronos2_ensemble` preset:

```py
predictor = TimeSeriesPredictor(prediction_length=prediction_length).fit(
    ...,
    presets="chronos2_ensemble",
)
```
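
The rule of thumb above (more than 100 series with a median history longer than `3 * prediction_length`) can be checked before enabling fine-tuning. `enough_data_for_fine_tuning` is a hypothetical helper written for this sketch, and the thresholds are the example values from the answer above rather than hard requirements.

```python
from statistics import median


def enough_data_for_fine_tuning(series_lengths, prediction_length, min_series=100, history_factor=3):
    """Check the rule of thumb: many series with sufficiently long histories."""
    if not series_lengths:
        return False
    has_enough_series = len(series_lengths) > min_series
    has_enough_history = median(series_lengths) > history_factor * prediction_length
    return has_enough_series and has_enough_history


# Made-up dataset sizes for illustration.
print(enough_data_for_fine_tuning([500] * 150, prediction_length=24))  # many long series
print(enough_data_for_fine_tuning([500] * 20, prediction_length=24))   # too few series
print(enough_data_for_fine_tuning([60] * 150, prediction_length=24))   # histories too short
```

With a `TimeSeriesDataFrame`, the per-series lengths can be obtained from `data.num_timesteps_per_item()`.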

#### What is the recommended hardware for running Chronos models?

We recommend using a machine with a GPU for best performance, especially for fine-tuning. For reference, we tested the models on AWS `g5.2xlarge` instances with NVIDIA A10G GPUs (24 GiB GPU memory) and 32 GiB of system memory. However, Chronos-2, Chronos-Bolt, and Chronos (up to small size) can also run on consumer GPUs and CPUs with reasonable inference times.

#### Why do my predictions change with the `batch_size`?

By default, AutoGluon enables Chronos-2's `cross_learning` mode, in which the model makes joint predictions across the time series within a batch. This often improves accuracy but also makes results sensitive to the `batch_size`. You can disable this mode with:

```python
predictor.fit(
    ...,
    hyperparameters={"Chronos2": {"cross_learning": False}},
)
```

#### Where can I ask specific questions on Chronos?

Members of the AutoGluon team are among the core developers of Chronos, so you can ask Chronos-related questions on [AutoGluon's GitHub](https://github.com/autogluon/autogluon) or on [Chronos' GitHub](https://github.com/amazon-science/chronos-forecasting/discussions).