# Geographical and Temporal Aggregation (Tourism)

> Geographical and Temporal Hierarchical Forecasting on Australian Tourism Data

In many applications, a set of time series is hierarchically organized. Examples include the presence of geographic levels, products, or categories that define different types of aggregations. In such scenarios, forecasters are often required to provide predictions for all disaggregate and aggregate series. A natural desire is for those predictions to be **"coherent"**, that is, for the bottom series to add up precisely to the forecasts of the aggregated series.
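To make the coherence requirement concrete, here is a minimal sketch on a hypothetical two-level hierarchy (a total and two bottom series `A` and `B`). The matrix `S`, the helper `is_coherent`, and the numbers are illustrative assumptions, not part of the `Tourism` data or the library.

```python
import numpy as np

# Hypothetical hierarchy: Total = A + B.
# Rows of S: Total, A, B. Columns: bottom series A, B.
S = np.array([
    [1.0, 1.0],  # Total
    [1.0, 0.0],  # A
    [0.0, 1.0],  # B
])

def is_coherent(y_all, S, atol=1e-8):
    """Check that every series equals the sum of its bottom series.

    Assumes the bottom series are the last rows of S (and of y_all).
    """
    n_bottom = S.shape[1]
    y_bottom = y_all[-n_bottom:]
    return bool(np.allclose(S @ y_bottom, y_all, atol=atol))

coherent_fcst = np.array([30.0, 10.0, 20.0])    # 10 + 20 == 30
incoherent_fcst = np.array([33.0, 10.0, 20.0])  # 10 + 20 != 33

print(is_coherent(coherent_fcst, S), is_coherent(incoherent_fcst, S))
```

Independently produced base forecasts typically look like the second vector: each series is plausible on its own, but the totals do not add up.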

In this notebook we present an example of how to use `HierarchicalForecast` to produce coherent forecasts across both geographical levels and temporal levels. We will use the classic Australian Domestic Tourism (`Tourism`) dataset, which contains quarterly time series of the number of visitors to each state of Australia.

We will first load the `Tourism` data and produce base forecasts using an `AutoETS` model from `StatsForecast`. Then, we reconcile the forecasts with several reconciliation algorithms from `HierarchicalForecast` according to the cross-sectional geographical hierarchies. Finally, we reconcile the forecasts in the temporal dimension according to a temporal hierarchy.

You can run these experiments using CPU or GPU with Google Colab.

<a href="https://colab.research.google.com/github/Nixtla/hierarchicalforecast/blob/main/nbs/examples/AustralianDomesticTourismCrossTemporal.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
%%capture
!pip install hierarchicalforecast statsforecast

## 1. Load and Process Data

In this example we will use the [Tourism](https://otexts.com/fpp3/tourism.html) dataset from the [Forecasting: Principles and Practice](https://otexts.com/fpp3/) book.

The dataset only contains the time series at the lowest level, so we need to create the time series for all hierarchies.

In [None]:
import numpy as np
import pandas as pd

In [None]:
Y_df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/tourism.csv')
Y_df = Y_df.rename({'Trips': 'y', 'Quarter': 'ds'}, axis=1)
Y_df.insert(0, 'Country', 'Australia')
Y_df = Y_df[['Country', 'Region', 'State', 'Purpose', 'ds', 'y']]
Y_df['ds'] = Y_df['ds'].str.replace(r'(\d+) (Q\d)', r'\1-\2', regex=True)
Y_df['ds'] = pd.PeriodIndex(Y_df["ds"], freq='Q').to_timestamp()
Y_df.head()

Unnamed: 0,Country,Region,State,Purpose,ds,y
0,Australia,Adelaide,South Australia,Business,1998-01-01,135.07769
1,Australia,Adelaide,South Australia,Business,1998-04-01,109.987316
2,Australia,Adelaide,South Australia,Business,1998-07-01,166.034687
3,Australia,Adelaide,South Australia,Business,1998-10-01,127.160464
4,Australia,Adelaide,South Australia,Business,1999-01-01,137.448533
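The two date-handling lines in the cell above can be hard to read at a glance. The following sketch isolates them on a few hand-written quarter labels (the `quarters` series is a made-up example in the same `"YYYY Qn"` format as the raw `Quarter` column).

```python
import pandas as pd

# Hypothetical quarter labels in the same format as the raw data.
quarters = pd.Series(["1998 Q1", "1998 Q2", "1998 Q3", "1998 Q4"])

# "1998 Q1" -> "1998-Q1", which PeriodIndex can parse as a quarterly period.
quarters = quarters.str.replace(r"(\d+) (Q\d)", r"\1-\2", regex=True)

# Convert each quarterly period to the timestamp of its first day.
ds = pd.PeriodIndex(quarters, freq="Q").to_timestamp()
print(list(ds.strftime("%Y-%m-%d")))
```

Each quarter is mapped to its start date, which is why the `ds` column shows `1998-01-01`, `1998-04-01`, and so on.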


## 2. Cross-sectional reconciliation

### 2a. Aggregating the dataset according to cross-sectional hierarchy

The dataset can be grouped in the following non-strictly hierarchical structure.

In [None]:
spec = [
    ['Country'],
    ['Country', 'State'], 
    ['Country', 'Purpose'], 
    ['Country', 'State', 'Region'], 
    ['Country', 'State', 'Purpose'], 
    ['Country', 'State', 'Region', 'Purpose']
]

Using the `aggregate` function from `HierarchicalForecast` we can get the full set of time series.

In [None]:
from hierarchicalforecast.utils import aggregate

In [None]:
Y_df_cs, S_df_cs, tags_cs = aggregate(Y_df, spec)

In [None]:
Y_df_cs

Unnamed: 0,unique_id,ds,y
0,Australia,1998-01-01,23182.197269
1,Australia,1998-04-01,20323.380067
2,Australia,1998-07-01,19826.640511
3,Australia,1998-10-01,20830.129891
4,Australia,1999-01-01,22087.353380
...,...,...,...
33995,Australia/Western Australia/Experience Perth/V...,2016-10-01,439.699451
33996,Australia/Western Australia/Experience Perth/V...,2017-01-01,356.867038
33997,Australia/Western Australia/Experience Perth/V...,2017-04-01,302.296119
33998,Australia/Western Australia/Experience Perth/V...,2017-07-01,373.442070


In [None]:
S_df_cs.iloc[:5, :5]

Unnamed: 0,unique_id,Australia/ACT/Canberra/Business,Australia/ACT/Canberra/Holiday,Australia/ACT/Canberra/Other,Australia/ACT/Canberra/Visiting
0,Australia,1.0,1.0,1.0,1.0
1,Australia/ACT,1.0,1.0,1.0,1.0
2,Australia/New South Wales,0.0,0.0,0.0,0.0
3,Australia/Northern Territory,0.0,0.0,0.0,0.0
4,Australia/Queensland,0.0,0.0,0.0,0.0
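For intuition about what the summing matrix encodes, the sketch below builds one by hand for a tiny made-up hierarchy. The helper `build_summing_matrix`, the toy table, and `toy_spec` are hypothetical and only illustrate the idea; they are not how `aggregate` is implemented.

```python
import numpy as np
import pandas as pd

# Hypothetical bottom-level keys: one country, two states, three regions.
bottom = pd.DataFrame({
    "Country": ["Australia"] * 3,
    "State": ["ACT", "NSW", "NSW"],
    "Region": ["Canberra", "Sydney", "Hunter"],
})
toy_spec = [["Country"], ["Country", "State"], ["Country", "State", "Region"]]

def build_summing_matrix(bottom, spec):
    """Return a 0/1 DataFrame: rows are all series, columns are bottom series."""
    def join_ids(cols):
        return bottom[cols].apply(lambda row: "/".join(row), axis=1)

    bottom_ids = join_ids(spec[-1])
    rows = {}
    for level in spec:
        level_ids = join_ids(level)
        for uid in level_ids.unique():
            # A bottom series contributes to `uid` if it shares the same key.
            rows[uid] = (level_ids == uid).astype(float).to_numpy()
    return pd.DataFrame(
        np.vstack(list(rows.values())), index=list(rows), columns=list(bottom_ids)
    )

S_toy = build_summing_matrix(bottom, toy_spec)
print(S_toy)
```

The row for `Australia` is all ones, while the row for `Australia/NSW` has ones only in the columns of its two regions, mirroring the structure of `S_df_cs`.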


### 2b. Split Train/Test sets

We use the final two years (8 quarters) as the test set, so the forecast horizon is 8.

In [None]:
horizon = 8

In [None]:
Y_test_df_cs = Y_df_cs.groupby("unique_id", as_index=False).tail(horizon)
Y_train_df_cs = Y_df_cs.drop(Y_test_df_cs.index)
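The split relies on two properties of the frame: rows are sorted by `ds` within each series, and the index is unique so that `drop` removes exactly the test rows. A small sketch on made-up data (the `toy` frame and `h` are illustrative):

```python
import pandas as pd

# Hypothetical panel: two series with five quarterly observations each.
toy = pd.DataFrame({
    "unique_id": ["a"] * 5 + ["b"] * 5,
    "ds": list(pd.date_range("2020-01-01", periods=5, freq="QS")) * 2,
    "y": range(10),
})
h = 2

# Last h rows of every series form the test set; the rest is training data.
test = toy.groupby("unique_id", as_index=False).tail(h)
train = toy.drop(test.index)

print(len(train), len(test))
```

If the data were not sorted by date, sorting with `sort_values(["unique_id", "ds"])` and resetting the index first would be a reasonable precaution.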

### 2c. Computing base forecasts

The following cell computes the **base forecasts** for each time series in `Y_train_df_cs` using the `AutoETS` model. Observe that `Y_hat_df_cs` contains the forecasts but they are not coherent.

In [None]:
from statsforecast.models import AutoETS
from statsforecast.core import StatsForecast

In [None]:
fcst = StatsForecast(models=[AutoETS(season_length=4, model='ZZA')], 
                     freq='QS', n_jobs=-1)
Y_hat_df_cs = fcst.forecast(df=Y_train_df_cs, h=horizon, fitted=True)
Y_fitted_df_cs = fcst.forecast_fitted_values()

### 2d. Reconcile forecasts

The following cell makes the previous forecasts coherent using the `HierarchicalReconciliation` class. Since the hierarchy structure is not strict, we can't use methods such as `TopDown` or `MiddleOut`. In this example we use `BottomUp` and `MinTrace`.

In [None]:
from hierarchicalforecast.methods import BottomUp, MinTrace
from hierarchicalforecast.core import HierarchicalReconciliation

In [None]:
reconcilers = [
    BottomUp(),
    MinTrace(method='mint_shrink'),
    MinTrace(method='ols')
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
Y_rec_df_cs = hrec.reconcile(Y_hat_df=Y_hat_df_cs, Y_df=Y_fitted_df_cs, S=S_df_cs, tags=tags_cs)

The dataframe `Y_rec_df_cs` contains the reconciled forecasts.

In [None]:
Y_rec_df_cs.head()

Unnamed: 0,unique_id,ds,AutoETS,AutoETS/BottomUp,AutoETS/MinTrace_method-mint_shrink,AutoETS/MinTrace_method-ols
0,Australia,2016-01-01,25990.068004,24381.911737,25428.089783,25894.399067
1,Australia,2016-04-01,24458.490282,22903.895964,23914.2714,24357.301898
2,Australia,2016-07-01,23974.055984,22412.265739,23428.462394,23865.910647
3,Australia,2016-10-01,24563.454495,23127.349578,24089.845955,24470.782393
4,Australia,2017-01-01,25990.068004,24518.118006,25545.358678,25901.362283
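To see what the reconcilers do numerically, here is a minimal NumPy sketch of the textbook bottom-up and OLS formulas on a hypothetical three-series hierarchy. The matrix `S` and the forecasts are made up, and the code is an illustration of the formulas rather than the library's implementation.

```python
import numpy as np

# Hypothetical hierarchy: Total = A + B. Rows: Total, A, B.
S = np.array([
    [1.0, 1.0],
    [1.0, 0.0],
    [0.0, 1.0],
])
y_hat = np.array([33.0, 10.0, 20.0])  # incoherent base forecasts

# Bottom-up: keep the bottom forecasts and re-aggregate them.
y_bu = S @ y_hat[1:]

# OLS: project the base forecasts onto the coherent subspace spanned by S,
# i.e. y_tilde = S (S'S)^{-1} S' y_hat.
P = np.linalg.solve(S.T @ S, S.T)
y_ols = S @ (P @ y_hat)

print(y_bu)   # coherent, ignores the forecast for Total
print(y_ols)  # coherent, uses information from all three forecasts
```

In this toy case the bottom-up result is `[30, 10, 20]`, whereas OLS spreads the discrepancy of 3 across all series and yields `[32, 11, 21]`.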


## 3. Temporal reconciliation

Next, we aim to reconcile our forecasts also in the temporal domain.

### 3a. Aggregating the dataset according to temporal hierarchy

We first define the temporal aggregation spec. You can use string aliases of timestamp attributes to compute temporal aggregations. For Pandas, see an overview of allowable attributes [here](https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.html). 

In this example, we choose a temporal aggregation of `["year"]` and the bottom level `["year", "ds"]`. The bottom level timesteps have a quarterly frequency.

In [None]:
spec_temporal = [
    ["year"],
    ["year", "ds"],
]   

We next compute the temporally aggregated train and test sets using the `aggregate_temporal` function. Note that we have different aggregation matrices `S` for the train and test sets, as the test set contains temporal hierarchies that are not included in the train set.

In [None]:
from hierarchicalforecast.utils import aggregate_temporal

In [None]:
Y_train_df_te, S_train_df_te, tags_te_train = aggregate_temporal(df=Y_train_df_cs, freq="QS", spec=spec_temporal)
Y_test_df_te, S_test_df_te, tags_te_test = aggregate_temporal(df=Y_test_df_cs, freq="QS", spec=spec_temporal)


In [None]:
S_train_df_te.iloc[:5, :5]

Unnamed: 0,temporal_id,year-1998/1998-01-01,year-1998/1998-04-01,year-1998/1998-07-01,year-1998/1998-10-01
0,year-1998,1.0,1.0,1.0,1.0
1,year-1999,0.0,0.0,0.0,0.0
2,year-2000,0.0,0.0,0.0,0.0
3,year-2001,0.0,0.0,0.0,0.0
4,year-2002,0.0,0.0,0.0,0.0


In [None]:
S_test_df_te.iloc[:5, :5]

Unnamed: 0,temporal_id,year-2016/2016-01-01,year-2016/2016-04-01,year-2016/2016-07-01,year-2016/2016-10-01
0,year-2016,1.0,1.0,1.0,1.0
1,year-2017,0.0,0.0,0.0,0.0
2,year-2016/2016-01-01,1.0,0.0,0.0,0.0
3,year-2016/2016-04-01,0.0,1.0,0.0,0.0
4,year-2016/2016-07-01,0.0,0.0,1.0,0.0
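The temporal summing matrix has the same meaning as the cross-sectional one, with timestamps playing the role of bottom series. The sketch below rebuilds a matrix of that shape by hand for eight quarters; the naming scheme is copied from the output above, and the construction itself is an illustrative assumption.

```python
import numpy as np
import pandas as pd

# Eight quarterly timestamps covering 2016 and 2017.
ds = pd.date_range("2016-01-01", periods=8, freq="QS")
bottom_ids = [f"year-{d.year}/{d.date()}" for d in ds]
years = sorted({d.year for d in ds})

# One row per year with ones for the quarters it contains,
# followed by an identity block for the quarters themselves.
year_rows = np.array([[float(d.year == y) for d in ds] for y in years])
S_te = pd.DataFrame(
    np.vstack([year_rows, np.eye(len(ds))]),
    index=[f"year-{y}" for y in years] + bottom_ids,
    columns=bottom_ids,
)
print(S_te.iloc[:5, :4])
```

Each yearly row sums to 4 because a year aggregates four quarters.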


If you don't have a test set available, as is usually the case when you're making forecasts, it is necessary to create a future dataframe that holds the correct bottom-level unique_ids and timestamps so that they can be temporally aggregated. We can use the `make_future_dataframe` helper function for that.

In [None]:
from hierarchicalforecast.utils import make_future_dataframe

In [None]:
Y_test_df_te_new = make_future_dataframe(Y_train_df_te, freq="QS", h=horizon)
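For intuition, a future frame of this kind can be sketched by hand as every `unique_id` crossed with the next `h` timestamps. The toy `train` frame and the construction below are assumptions for illustration and do not reproduce the helper's internals.

```python
import pandas as pd

# Hypothetical training panel ending in 2015-10-01.
train = pd.DataFrame({
    "unique_id": ["a"] * 3 + ["b"] * 3,
    "ds": list(pd.date_range("2015-04-01", periods=3, freq="QS")) * 2,
})
h = 2

# Next h quarterly timestamps after the last observed one.
last_ds = train["ds"].max()
future_ds = pd.date_range(last_ds, periods=h + 1, freq="QS")[1:]

# Cross every series id with every future timestamp.
future = pd.MultiIndex.from_product(
    [train["unique_id"].unique(), future_ds], names=["unique_id", "ds"]
).to_frame(index=False)
print(future)
```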

`Y_test_df_te_new` can then be used in `aggregate_temporal` to construct the temporally aggregated structures:

In [None]:
Y_test_df_te_new, S_test_df_te_new, tags_te_test_new = aggregate_temporal(df=Y_test_df_te_new, freq="QS", spec=spec_temporal)


And we can verify that we have the same temporally aggregated test set, except that `Y_test_df_te_new` doesn't contain the ground truth values `y`.

In [None]:
Y_test_df_te

Unnamed: 0,temporal_id,unique_id,y,ds
0,year-2016,Australia,101484.586551,2016-01-01
1,year-2016,Australia/ACT,2457.401367,2016-01-01
2,year-2016,Australia/ACT/Business,754.139245,2016-01-01
3,year-2016,Australia/ACT/Canberra,2457.401367,2016-01-01
4,year-2016,Australia/ACT/Canberra/Business,754.139245,2016-01-01
...,...,...,...,...
4245,year-2017/2017-10-01,Australia/Western Australia/Experience Perth/O...,87.494916,2017-10-01
4246,year-2017/2017-10-01,Australia/Western Australia/Experience Perth/V...,455.316702,2017-10-01
4247,year-2017/2017-10-01,Australia/Western Australia/Holiday,1026.285985,2017-10-01
4248,year-2017/2017-10-01,Australia/Western Australia/Other,161.087339,2017-10-01


In [None]:
Y_test_df_te_new

Unnamed: 0,temporal_id,unique_id,ds
0,year-2016,Australia,2016-01-01
1,year-2016,Australia/ACT,2016-01-01
2,year-2016,Australia/ACT/Business,2016-01-01
3,year-2016,Australia/ACT/Canberra,2016-01-01
4,year-2016,Australia/ACT/Canberra/Business,2016-01-01
...,...,...,...
4245,year-2017/2017-10-01,Australia/Western Australia/Experience Perth/O...,2017-10-01
4246,year-2017/2017-10-01,Australia/Western Australia/Experience Perth/V...,2017-10-01
4247,year-2017/2017-10-01,Australia/Western Australia/Holiday,2017-10-01
4248,year-2017/2017-10-01,Australia/Western Australia/Other,2017-10-01


### 3b. Computing base forecasts

Now, we need to compute base forecasts for each temporal aggregation. The following cell computes the **base forecasts** for each temporal aggregation in `Y_train_df_te` using the `AutoETS` model. Observe that `Y_hat_df_te` contains the forecasts but they are not coherent.

Note also that both frequency and horizon are different for each temporal aggregation. In this example, the lowest level has a quarterly frequency, and a horizon of `8` (constituting `2` years). The `year` aggregation thus has a yearly frequency with a horizon of `2`.
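The relationship between the two horizons, and the way a frequency can be inferred from the timestamps, can be checked on a small example. The date ranges below are made up to match the quarterly setup of this notebook.

```python
import pandas as pd

# Hypothetical quarterly test timestamps: 8 quarters = 2 years.
test_ds = pd.Series(pd.date_range("2016-01-01", periods=8, freq="QS"))
horizon_quarter = test_ds.nunique()
horizon_year = test_ds.dt.year.nunique()

# Inferring the frequency from (hypothetical) quarterly training timestamps.
# The exact alias string can differ between pandas versions.
train_ds = pd.date_range("1998-01-01", periods=12, freq="QS")
freq_quarter = pd.infer_freq(train_ds)

print(horizon_quarter, horizon_year, freq_quarter)
```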

It is of course possible to choose a different model for each level in the temporal aggregation - you can be as creative as you like!

In [None]:
Y_hat_dfs_te = []
id_cols = ["unique_id", "temporal_id", "ds", "y"]
# We will train a model for each temporal level
for level, temporal_ids_train in tags_te_train.items():
    # Filter the data for the level
    Y_level_train = Y_train_df_te.query("temporal_id in @temporal_ids_train")
    temporal_ids_test = tags_te_test[level]
    Y_level_test = Y_test_df_te.query("temporal_id in @temporal_ids_test")
    # For each temporal level we have a different frequency and forecast horizon
    freq_level = pd.infer_freq(Y_level_train["ds"].unique())
    horizon_level = Y_level_test["ds"].nunique()
    # Train a model and create forecasts
    fcst = StatsForecast(models=[AutoETS(model='ZZZ')], freq=freq_level, n_jobs=-1)
    Y_hat_df_te_level = fcst.forecast(df=Y_level_train[["ds", "unique_id", "y"]], h=horizon_level)
    # Add the test set to the forecast
    Y_hat_df_te_level = Y_hat_df_te_level.merge(Y_level_test, on=["ds", "unique_id"], how="left")
    # Put cols in the right order (for readability)
    Y_hat_cols = id_cols + [col for col in Y_hat_df_te_level.columns if col not in id_cols]
    Y_hat_df_te_level = Y_hat_df_te_level[Y_hat_cols]
    # Append the forecast to the list
    Y_hat_dfs_te.append(Y_hat_df_te_level)

Y_hat_df_te = pd.concat(Y_hat_dfs_te, ignore_index=True)


### 3c. Reconcile forecasts

We can again use the `HierarchicalReconciliation` class to reconcile the forecasts. In this example we use `BottomUp` and `MinTrace`. Note that we have to set `temporal=True` in the `reconcile` function.

Note that temporal reconciliation currently isn't supported for in-sample reconciliation methods, such as `MinTrace(method='mint_shrink')`.

In [None]:
reconcilers = [
    BottomUp(),
    MinTrace(method='ols')
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
Y_rec_df_te = hrec.reconcile(Y_hat_df=Y_hat_df_te, S=S_test_df_te, tags=tags_te_test, temporal=True)

## 4. Evaluation 

The `HierarchicalForecast` package includes the `evaluate` function to evaluate the different hierarchies.

In [None]:
from hierarchicalforecast.evaluation import evaluate
from utilsforecast.losses import rmse

### 4a. Cross-sectional evaluation

We first evaluate the forecasts _across all cross-sectional aggregations_.

In [None]:
eval_tags = {}
eval_tags['Total'] = tags_cs['Country']
eval_tags['Purpose'] = tags_cs['Country/Purpose']
eval_tags['State'] = tags_cs['Country/State']
eval_tags['Regions'] = tags_cs['Country/State/Region']
eval_tags['Bottom'] = tags_cs['Country/State/Region/Purpose']

evaluation = evaluate(df = Y_rec_df_te.drop(columns = 'temporal_id'),
                      tags = eval_tags,
                      metrics = [rmse])

evaluation.columns = ['level', 'metric', 'Base', 'BottomUp', 'MinTrace(ols)']
numeric_cols = evaluation.select_dtypes(include="number").columns
evaluation[numeric_cols] = evaluation[numeric_cols].map('{:.2f}'.format).astype(np.float64)

In [None]:
evaluation

Unnamed: 0,level,metric,Base,BottomUp,MinTrace(ols)
0,Total,rmse,4089.09,4482.04,4083.23
1,Purpose,rmse,1167.85,1284.27,1114.9
2,State,rmse,631.81,540.32,614.16
3,Regions,rmse,100.14,105.58,96.51
4,Bottom,rmse,32.39,33.34,31.62
5,Overall,rmse,79.58,81.68,77.37


As can be seen, `MinTrace(ols)` achieves the lowest RMSE at every cross-sectional level except `State`, where `BottomUp` performs best.
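To clarify what a per-level score means, the sketch below computes an RMSE per series and then averages it within each level. The toy data, the `toy_tags` dictionary, and the helper `rmse_by_level` are hypothetical, and averaging per-series RMSEs is an assumption made for illustration; `evaluate` may aggregate differently.

```python
import numpy as np
import pandas as pd

# Hypothetical actuals and forecasts for three series over two periods.
df = pd.DataFrame({
    "unique_id": ["Total"] * 2 + ["A"] * 2 + ["B"] * 2,
    "y":     [30.0, 40.0, 10.0, 15.0, 20.0, 25.0],
    "model": [33.0, 44.0, 10.0, 18.0, 20.0, 21.0],
})
toy_tags = {"Top": ["Total"], "Bottom": ["A", "B"]}

def rmse_by_level(df, tags, model_col="model"):
    """RMSE per series, then the mean over the series in each level."""
    sq_err = (df["y"] - df[model_col]) ** 2
    per_series = np.sqrt(sq_err.groupby(df["unique_id"]).mean())
    return {level: float(per_series.loc[ids].mean()) for level, ids in tags.items()}

scores = rmse_by_level(df, toy_tags)
print(scores)
```

Because errors scale with the size of a series, scores at aggregate levels such as `Total` are naturally much larger than at the bottom level and should be compared across methods, not across levels.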

### 4b. Temporal evaluation

We then evaluate the temporally aggregated forecasts _across all temporal aggregations_.

In [None]:
eval_tags = {}
eval_tags['Year'] = tags_te_test['year']
eval_tags['Quarter'] = tags_te_test['year/ds']

evaluation = evaluate(df = Y_rec_df_te.drop(columns = 'unique_id'),
                      tags = eval_tags,
                      metrics = [rmse],
                      id_col="temporal_id")

evaluation.columns = ['level', 'metric', 'Base', 'BottomUp', 'MinTrace(ols)']
numeric_cols = evaluation.select_dtypes(include="number").columns
evaluation[numeric_cols] = evaluation[numeric_cols].map('{:.2f}'.format).astype(np.float64)

In [None]:
evaluation

Unnamed: 0,level,metric,Base,BottomUp,MinTrace(ols)
0,Year,rmse,480.85,581.18,491.84
1,Quarter,rmse,168.02,168.02,152.3
2,Overall,rmse,230.58,250.65,220.21


Again, `MinTrace(ols)` is the best overall method, scoring the lowest `rmse` on the `Quarter` aggregated forecasts, and being slightly worse than the `Base` forecasts on the `Year` aggregated forecasts.

### 4c. Cross-temporal evaluation

Finally, we evaluate cross-temporally. To do so, we first need to obtain the combination of cross-sectional and temporal hierarchies, for which we can use the `get_cross_temporal_tags` helper function.


In [None]:
from hierarchicalforecast.utils import get_cross_temporal_tags

In [None]:
Y_rec_df_te, tags_ct = get_cross_temporal_tags(Y_rec_df_te, tags_cs=tags_cs, tags_te=tags_te_test)

As we can see, we now have a tag `Country//year` that contains `Australia//year-2016` and `Australia//year-2017`, indicating the cross-sectional hierarchy `Australia` at the temporal hierarchies `2016` and `2017`.

In [None]:
tags_ct

{'Country//year': ['Australia//year-2016', 'Australia//year-2017'],
 'Country//year/ds': ['Australia//year-2016/2016-01-01',
  'Australia//year-2016/2016-04-01',
  'Australia//year-2016/2016-07-01',
  'Australia//year-2016/2016-10-01',
  'Australia//year-2017/2017-01-01',
  'Australia//year-2017/2017-04-01',
  'Australia//year-2017/2017-07-01',
  'Australia//year-2017/2017-10-01'],
 'Country/State//year': ['Australia/ACT//year-2016',
  'Australia/ACT//year-2017',
  'Australia/New South Wales//year-2016',
  'Australia/New South Wales//year-2017',
  'Australia/Northern Territory//year-2016',
  'Australia/Northern Territory//year-2017',
  'Australia/Queensland//year-2016',
  'Australia/Queensland//year-2017',
  'Australia/South Australia//year-2016',
  'Australia/South Australia//year-2017',
  'Australia/Tasmania//year-2016',
  'Australia/Tasmania//year-2017',
  'Australia/Victoria//year-2016',
  'Australia/Victoria//year-2017',
  'Australia/Western Australia//year-2016',
  'Australia/Wes

We now have our dataset and cross-temporal tags ready for evaluation.
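The naming scheme of the cross-temporal identifiers can be reproduced by hand: a cross-sectional id and a temporal id joined by `//`. The sketch below mimics that scheme on toy tags; the `*_toy` names are hypothetical and the code only illustrates the id format shown in the output above.

```python
import pandas as pd

# Hypothetical cross-sectional and temporal tags.
tags_cs_toy = {"Country": ["Australia"], "Country/State": ["Australia/ACT"]}
tags_te_toy = {"year": ["year-2016", "year-2017"]}

# Every cross-sectional level is combined with every temporal level.
tags_ct_toy = {
    f"{key_cs}//{key_te}": [f"{uid}//{tid}" for uid in ids_cs for tid in ids_te]
    for key_cs, ids_cs in tags_cs_toy.items()
    for key_te, ids_te in tags_te_toy.items()
}

# The same join applied row-wise to a frame with both id columns.
toy = pd.DataFrame({
    "unique_id": ["Australia", "Australia", "Australia/ACT"],
    "temporal_id": ["year-2016", "year-2017", "year-2016"],
})
toy["cross_temporal_id"] = toy["unique_id"] + "//" + toy["temporal_id"]

print(tags_ct_toy["Country//year"])
```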

We again define a set of eval_tags, and now we split each cross-sectional aggregation also by each temporal aggregation.

In [None]:
eval_tags = {}
eval_tags['TotalByYear'] = tags_ct['Country//year']
eval_tags['RegionsByYear'] = tags_ct['Country/State/Region//year']
eval_tags['BottomByYear'] = tags_ct['Country/State/Region/Purpose//year']
eval_tags['TotalByQuarter'] = tags_ct['Country//year/ds']
eval_tags['RegionsByQuarter'] = tags_ct['Country/State/Region//year/ds']
eval_tags['BottomByQuarter'] = tags_ct['Country/State/Region/Purpose//year/ds']


evaluation = evaluate(df = Y_rec_df_te.drop(columns=['unique_id', 'temporal_id']),
                      tags = eval_tags,
                      id_col = 'cross_temporal_id',
                      metrics = [rmse])

evaluation.columns = ['level', 'metric', 'Base', 'BottomUp', 'MinTrace(ols)']
numeric_cols = evaluation.select_dtypes(include="number").columns
evaluation[numeric_cols] = evaluation[numeric_cols].map('{:.2f}'.format).astype(np.float64)
                      

In [None]:
evaluation

Unnamed: 0,level,metric,Base,BottomUp,MinTrace(ols)
0,TotalByYear,rmse,7148.99,8243.06,7367.8
1,RegionsByYear,rmse,151.96,175.69,153.94
2,BottomByYear,rmse,46.98,50.78,46.9
3,TotalByQuarter,rmse,2060.77,2060.77,1876.54
4,RegionsByQuarter,rmse,57.07,57.07,53.99
5,BottomByQuarter,rmse,19.42,19.42,18.74
6,Overall,rmse,43.14,45.27,42.01


We find that the best method is the cross-temporally reconciled method `AutoETS/MinTrace_method-ols`, which achieves the lowest overall RMSE, although the base forecasts are slightly better on `TotalByYear` and `RegionsByYear`.

### References
- [Hyndman, R.J., & Athanasopoulos, G. (2021). "Forecasting: principles and practice, 3rd edition: Chapter 11: Forecasting hierarchical and grouped series.". OTexts: Melbourne, Australia. OTexts.com/fpp3 Accessed on July 2022.](https://otexts.com/fpp3/hierarchical.html)
- [Rob Hyndman, Alan Lee, Earo Wang, Shanika Wickramasuriya, and Maintainer Earo Wang (2021). "hts: Hierarchical and Grouped Time Series". URL https://CRAN.R-project.org/package=hts. R package version 0.3.1.](https://cran.r-project.org/web/packages/hts/index.html)
- [Mitchell O’Hara-Wild, Rob Hyndman, Earo Wang, Gabriel Caceres, Tim-Gunnar Hensel, and Timothy Hyndman (2021). "fable: Forecasting Models for Tidy Time Series". URL https://CRAN.R-project.org/package=fable. R package version 6.0.2.](https://CRAN.R-project.org/package=fable)