# HierarchicalForecast with GluonTS Example Notebook


This example notebook shows how HierarchicalForecast's reconciliation capabilities can be combined with other popular machine learning libraries, in this case GluonTS.

It trains a GluonTS `DeepAREstimator` on the TourismLarge hierarchical dataset, then uses the `samples_to_quantiles_df` utility function to convert the sampled forecasts into a dataframe compatible with HierarchicalForecast's reconciliation functions.

## 1. Installing packages

In [None]:
%%capture
!pip install gluonts
!pip install pytorch_lightning
!pip install datasetsforecast
!pip install git+https://github.com/Nixtla/hierarchicalforecast.git

In [None]:
# Install the CUDA 11.2 build of MXNet and check how many GPUs are visible.
!pip install mxnet-cu112
import mxnet as mx
mx.context.num_gpus()

In [None]:
from datasetsforecast.hierarchical import HierarchicalData
from gluonts.dataset.pandas import PandasDataset
from gluonts.mx.model.deepar import DeepAREstimator
from gluonts.mx.trainer import Trainer
from gluonts.evaluation import make_evaluation_predictions

from hierarchicalforecast.methods import BottomUp, MinTrace
from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.evaluation import scaled_crps
from hierarchicalforecast.utils import samples_to_quantiles_df

import pandas as pd
import numpy as np

## 2. Load hierarchical dataset


This detailed Australian tourism dataset comes from the National Visitor Survey, managed by Tourism Research Australia. It comprises 555 monthly series from 1998 to 2016, organized by geography and by purpose of travel. The natural geographical hierarchy comprises seven states, divided further into 27 zones and 76 regions. The purpose-of-travel categories are holiday, visiting friends and relatives (VFR), business, and other. The dataset has been used in several hierarchical forecasting studies, including MinT (Wickramasuriya et al., 2019). It can be accessed on the [MinT reconciliation webpage](https://robjhyndman.com/publications/mint/), although other sources are available.

| Geographical division | Series per division           | Series per division and purpose | Total |
|          ---          |               ---             |              ---             |  ---  |
|  Australia            |              1                |               4              |   5   |
|  States               |              7                |              28              |  35   |
|  Zones                |             27                |              108             |  135  |
|  Regions              |             76                |              304             |  380  |
|  Total                |            111                |              444             |  555  |
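
The totals in the table follow from crossing every geographical node with the four travel purposes. A quick sanity check, using only the counts quoted above:

```python
# Node counts per geographical level, taken from the table above.
geo_levels = {"Australia": 1, "States": 7, "Zones": 27, "Regions": 76}
n_purposes = 4  # holiday, VFR, business, other

# Series aggregated over all purposes: one per geographical node.
n_geo = sum(geo_levels.values())

# Series split by purpose: one per (node, purpose) pair.
n_geo_purpose = n_geo * n_purposes

n_total = n_geo + n_geo_purpose
print(n_geo, n_geo_purpose, n_total)  # 111 444 555
```

The 304 region-by-purpose series (76 regions times 4 purposes) are the most disaggregated ones in the table.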


In [None]:
dataset = 'TourismLarge'
# Y_df: long-format series, S_df: summing matrix, tags: series ids per hierarchy level
Y_df, S_df, tags = HierarchicalData.load(directory="./data", group=dataset)
Y_df['ds'] = pd.to_datetime(Y_df['ds'])

In [None]:
def sort_hier_df(Y_df, S_df):
    # Sort by unique_id in the row order of S_df (not lexicographically),
    # then by date, so the series line up with the summing matrix.
    Y_df.unique_id = Y_df.unique_id.astype('category')
    Y_df.unique_id = Y_df.unique_id.cat.set_categories(S_df.index)
    Y_df = Y_df.sort_values(by=['unique_id', 'ds'])
    return Y_df

Y_df = sort_hier_df(Y_df, S_df)
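
Sorting by a categorical whose categories follow `S_df.index` puts the series in the same order as the rows of the summing matrix, which is generally not alphabetical order. The idea can be sketched in plain Python; the identifiers and values below are made up for illustration:

```python
# Hypothetical row order of a summing matrix: total first, then its children.
s_index = ["total", "north", "south"]

# Long-format records (unique_id, ds, y) in arbitrary order.
records = [
    ("south", "2020-02", 3.0),
    ("total", "2020-01", 9.0),
    ("north", "2020-01", 5.0),
    ("south", "2020-01", 4.0),
    ("total", "2020-02", 8.0),
    ("north", "2020-02", 5.0),
]

# Rank each id by its position in the summing matrix, then sort by (rank, ds).
rank = {uid: i for i, uid in enumerate(s_index)}
sorted_records = sorted(records, key=lambda r: (rank[r[0]], r[1]))

ordered_ids = [r[0] for r in sorted_records]
print(ordered_ids)  # ['total', 'total', 'north', 'north', 'south', 'south']
```

An alphabetical sort would place `north` before `total`, which would misalign the forecasts with the rows of the summing matrix.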

In [None]:
horizon = 12

# Hold out the last `horizon` months of every series for evaluation.
Y_test_df = Y_df.groupby('unique_id').tail(horizon)
Y_train_df = Y_df.drop(Y_test_df.index)
Y_train_df
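
The split holds out the last `horizon` observations of every series as the test set and keeps the rest for training. A dependency-free sketch of the same rule, with toy series that are assumptions for illustration:

```python
# Toy series keyed by id; values are in chronological order.
series = {
    "a": [1, 2, 3, 4, 5],
    "b": [10, 20, 30, 40, 50],
}
toy_horizon = 2  # the notebook uses 12 months

# The last `toy_horizon` points go to the test set, everything before to training.
train = {uid: ys[:-toy_horizon] for uid, ys in series.items()}
test = {uid: ys[-toy_horizon:] for uid, ys in series.items()}

print(train["a"], test["a"])  # [1, 2, 3] [4, 5]
```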

In [None]:
ds = PandasDataset.from_long_dataframe(Y_train_df, target="y", item_id="unique_id")

## 3. Fit and Predict Model


In [None]:
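estimator = DeepAREstimator(
    freq="M",
    prediction_length=horizon,
    # Train on the GPU when one is available, otherwise fall back to the CPU.
    trainer=Trainer(ctx=mx.context.gpu() if mx.context.num_gpus() > 0 else mx.context.cpu(),
                    epochs=20),
)
predictor = estimator.train(ds)

# Draw 1000 sample paths per series over the forecast horizon.
forecast_it = predictor.predict(ds, num_samples=1000)

# Stack the samples into one array of shape (n_series, num_samples, horizon).
forecasts = list(forecast_it)
forecasts = np.array([forecast.samples for forecast in forecasts])
forecasts.shape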

## 4. Reconciliation
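
Base forecasts are produced independently for each series, so they usually do not add up across the hierarchy. Reconciliation restores that coherence. As a minimal sketch of the simplest method, bottom-up, consider a hypothetical two-leaf hierarchy with made-up numbers; it illustrates the idea only and is not HierarchicalForecast's implementation:

```python
# Summing matrix S for total = a + b; rows: total, a, b; columns: a, b.
S = [
    [1, 1],
    [1, 0],
    [0, 1],
]

# Made-up base forecasts for (total, a, b): incoherent because 10 != 4 + 5.
y_hat = [10.0, 4.0, 5.0]

# Bottom-up keeps only the bottom-level forecasts ...
n_bottom = len(S[0])
bottom = y_hat[-n_bottom:]

# ... and rebuilds every level as the matrix-vector product of S and bottom.
y_rec = [sum(s * b for s, b in zip(row, bottom)) for row in S]
print(y_rec)  # [9.0, 4.0, 5.0]
```

Bottom-up discards the base forecast for the total. The OLS variant of MinTrace instead uses the base forecasts from all levels, projecting them onto the set of coherent forecasts.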


In [None]:
# Prediction-interval levels (in percent) to extract from the samples.
level = np.arange(1, 100, 2)

# Transform the output of DeepAREstimator into a form compatible with HierarchicalForecast.
quantiles, forecast_df = samples_to_quantiles_df(
    samples=forecasts,
    unique_ids=S_df.index,
    dates=Y_test_df['ds'].unique(),
    level=level,
    model_name='DeepAREstimator',
)

# Reconcile the forecasts.
reconcilers = [
    BottomUp(),
    MinTrace('ols')
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)

forecast_rec = hrec.reconcile(Y_hat_df=forecast_df, S=S_df, tags=tags, level=level)
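
Each prediction-interval level corresponds to a pair of symmetric quantiles around the median: a level of 90 is bounded by the 0.05 and 0.95 quantiles. The sketch below shows that mapping in plain Python. The helper name `level_to_quantiles` is hypothetical and not part of HierarchicalForecast.

```python
def level_to_quantiles(levels):
    """Map interval levels in percent to sorted lower and upper quantiles."""
    lower = [(100 - lv) / 200 for lv in levels]
    upper = [(100 + lv) / 200 for lv in levels]
    return sorted(lower + upper)

print(level_to_quantiles([80, 90]))  # [0.05, 0.1, 0.9, 0.95]

# The notebook's levels 1, 3, ..., 99 give 50 intervals, i.e. 100 quantiles.
notebook_levels = list(range(1, 100, 2))
print(len(level_to_quantiles(notebook_levels)))  # 100
```

These 100 interval bounds, plus the median, are consistent with the evaluation step dropping the first entry of `quantiles` before scoring.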

In [None]:
forecast_rec

## 5. Evaluation

In [None]:
rec_model_names = ['DeepAREstimator/MinTrace_method-ols', 'DeepAREstimator/BottomUp']

quantiles = np.array(quantiles[1:])  # drop the first quantile (the median)
n_quantiles = len(quantiles)
n_series = len(S_df)

for name in rec_model_names:
    # Select the quantile columns that belong to this reconciled model.
    quantile_columns = [col for col in forecast_rec.columns if (name+'-') in col]
    y_rec  = forecast_rec[quantile_columns].values
    y_test = Y_test_df['y'].values

    # Reshape to (n_series, horizon, n_quantiles) and (n_series, horizon).
    y_rec  = y_rec.reshape(n_series, horizon, n_quantiles)
    y_test = y_test.reshape(n_series, horizon)
    scrps  = scaled_crps(y=y_test, y_hat=y_rec, quantiles=quantiles)
    print("{:<40} {:.5f}".format(name+":", scrps))
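
The score printed above is a scaled CRPS: the continuous ranked probability score, approximated from the quantile forecasts and normalized by the magnitude of the observations. A simplified sketch of that idea via the quantile (pinball) loss follows. It is an illustration under stated assumptions, not the exact computation inside `scaled_crps`, and `approx_scaled_crps` is a hypothetical name.

```python
def pinball(y, y_q, q):
    """Quantile loss of forecast y_q for quantile q against observation y."""
    return q * (y - y_q) if y >= y_q else (1 - q) * (y_q - y)

def approx_scaled_crps(y, y_hat, quantiles):
    """y: observations; y_hat[i][k]: forecast of quantile k for observation i."""
    total = 0.0
    for y_i, row in zip(y, y_hat):
        # CRPS is approximated as twice the average pinball loss over quantiles.
        losses = [pinball(y_i, y_q, q) for y_q, q in zip(row, quantiles)]
        total += 2 * sum(losses) / len(losses)
    # Scale by the total absolute value of the observations.
    return total / sum(abs(v) for v in y)

# Made-up example: one observation and three quantile forecasts.
score = approx_scaled_crps([10.0], [[8.0, 10.0, 12.0]], [0.25, 0.5, 0.75])
print(round(score, 4))  # 0.0667
```

Lower values are better: the score shrinks as the quantile forecasts concentrate around the observed values.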