Gather data from
https://huggingface.co/datasets/autogluon/chronos_datasets

All data is in the 'train' split

In [24]:
import datasets

ds = datasets.load_dataset("autogluon/chronos_datasets", "monash_covid_deaths", split="train")
ds.set_format("numpy")  # sequences returned as numpy arrays

Each time series has an 'id', an array of timestamps 'timestamp', and an array of values 'target'.

In [None]:
example_timeseries = ds[0]
print("id:", example_timeseries["id"])
print(
    f"timestamp: of shape {example_timeseries['timestamp'].shape} contains entries like:",
    example_timeseries["timestamp"][0],
)
print(f"target of shape {example_timeseries['target'].shape} contains entries like", example_timeseries["target"][0])

id: T000000
timestamp: of shape (212,) contains entries like: 2020-01-22T00:00:00.000
target of shape (212,) contains entries like 0.0


Interestingly, the 'target' array is not the set of prediction targets for the models. Instead, 'target' contains all of the values of the time series.

The prediction length $H$ used for each of the datasets by the Chronos paper is described on page 31 of https://arxiv.org/pdf/2403.07815

$H$ is dataset-specific, and the last $H$ observations of each time series were used for the prediction task.
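
As a minimal sketch of what "the last $H$ observations" means for a single series, the snippet below splits a sequence into a context part and a held-out part. The helper name `split_context_and_horizon`, the toy values, and `H = 2` are illustrative assumptions, not taken from the Chronos code or paper; the real $H$ for each dataset should be read from the paper. The same slicing works on a numpy array such as `example_timeseries["target"]`.

```python
from typing import Sequence, Tuple


def split_context_and_horizon(values: Sequence, horizon: int) -> Tuple[Sequence, Sequence]:
    """Split one series into (context, held-out last `horizon` values).

    Hypothetical helper for illustration; not part of any library.
    """
    if not 0 < horizon < len(values):
        raise ValueError("horizon must be positive and shorter than the series")
    return values[:-horizon], values[-horizon:]


# Toy series standing in for example_timeseries["target"]; values are made up.
series = [0.0, 0.0, 1.0, 3.0, 4.0, 7.0]
H = 2  # placeholder horizon, NOT the value used in the paper

context, future = split_context_and_horizon(series, H)
print(context)  # [0.0, 0.0, 1.0, 3.0]
print(future)   # [4.0, 7.0]
```

A model would see only `context` and be scored on how well it predicts `future`.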

In [None]:
import pandas as pd


# Convert a dataset in this format to a long-format data frame
def to_pandas(ds: datasets.Dataset) -> "pd.DataFrame":
    """Convert dataset to long data frame format."""
    sequence_columns = [col for col in ds.features if isinstance(ds.features[col], datasets.Sequence)]
    return ds.to_pandas().explode(sequence_columns).infer_objects()


print(to_pandas(ds))

          id  timestamp  target
0    T000000 2020-01-22     0.0
0    T000000 2020-01-23     0.0
0    T000000 2020-01-24     0.0
0    T000000 2020-01-25     0.0
0    T000000 2020-01-26     0.0
..       ...        ...     ...
265  T000265 2020-08-16   132.0
265  T000265 2020-08-17   135.0
265  T000265 2020-08-18   141.0
265  T000265 2020-08-19   150.0
265  T000265 2020-08-20   151.0

[56392 rows x 3 columns]
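
The row count above is consistent with 266 series (ids T000000 to T000265) of 212 observations each, since 266 × 212 = 56392, assuming every series has the same length as the first. One way to check per-series lengths in the long format is to group by 'id'. The sketch below uses a small made-up frame in place of `to_pandas(ds)` so that it runs on its own; the ids and values in it are hypothetical.

```python
import pandas as pd

# Toy long-format frame standing in for to_pandas(ds); ids and values are made up.
long_df = pd.DataFrame(
    {
        "id": ["T0", "T0", "T0", "T1", "T1", "T1"],
        "timestamp": pd.to_datetime(["2020-01-22", "2020-01-23", "2020-01-24"] * 2),
        "target": [0.0, 0.0, 1.0, 5.0, 6.0, 8.0],
    }
)

# Number of observations per series.
lengths = long_df.groupby("id").size()

# The total row count should equal the sum of the per-series lengths.
assert lengths.sum() == len(long_df)

# If min and max agree, every series has the same length.
print(lengths.min(), lengths.max())
```

Running the same `groupby("id").size()` on the real data frame would show whether all series share one length.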
