# Non-Negative MinTrace

Large collections of time series organized into structures at different aggregation levels often require their forecasts to respect the aggregation constraints and to be nonnegative. This creates the need for reconciliation algorithms that produce forecasts that are both coherent and nonnegative.

The `HierarchicalForecast` package provides a wide collection of Python implementations of hierarchical forecasting algorithms, including methods that perform nonnegative hierarchical reconciliation.

In this notebook, we will show how to use the `HierarchicalForecast` package to perform nonnegative reconciliation of forecasts on the `Wiki2` dataset.

You can run these experiments using CPU or GPU with Google Colab.

<a href="https://colab.research.google.com/github/Nixtla/hierarchicalforecast/blob/main/nbs/examples/NonNegativeReconciliation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
%%capture
!pip install hierarchicalforecast statsforecast datasetsforecast

## 1. Load Data

In this example we will use the `Wiki2` dataset. The following cell gets the time series for the different levels in the hierarchy, the summing dataframe `S_df`, which recovers the full dataset from the bottom-level series, and the indices of each level of the hierarchy, denoted by `tags`.

In [None]:
import numpy as np
import pandas as pd

from datasetsforecast.hierarchical import HierarchicalData

In [None]:
Y_df, S_df, tags = HierarchicalData.load('./data', 'Wiki2')
Y_df['ds'] = pd.to_datetime(Y_df['ds'])

In [None]:
Y_df.head()

Unnamed: 0,unique_id,ds,y
0,Total,2016-01-01,156508
1,Total,2016-01-02,129902
2,Total,2016-01-03,138203
3,Total,2016-01-04,115017
4,Total,2016-01-05,126042


<font color='gold'>This is a representation of the hierarchy. A 1 means that the column (a bottom-level item, a drug in our case) contributes to the aggregate named by the row; for example, every column belongs to the Total row. There is one row for each node at each level of the hierarchy.</font>

In [None]:
S_df.iloc[:5, :5]

Unnamed: 0,Month,Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,Дальневосточный ФО - ALMONT - Almont FC tabs 10 mg #28,Дальневосточный ФО - AMBROBENE - Ambrobene oral solution 7.5 mg/ml 100 ml #1,Дальневосточный ФО - AMBROBENE - Ambrobene syrup 15 mg/5ml #100 bottle
Total,1,1,1,1,1
Month,1,0,0,0,0
Дальневосточный ФО,0,1,1,1,1
ALMAGEL,0,1,0,0,0
Almagel susp 170 ml #1,0,1,0,0,0
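To make the role of the summing matrix concrete, here is a minimal sketch on a toy hierarchy. The node names (`north`, `south`, and so on) are made up for illustration and are not part of the dataset. Multiplying the matrix by the bottom-level values recovers the value of every node in the hierarchy.

```python
import numpy as np

# Hypothetical toy hierarchy: total -> {north, south}, north -> {a, b}, south -> {c}
rows = ["total", "north", "south", "north_a", "north_b", "south_c"]

# One row per node, one column per bottom-level series
S = np.array([
    [1, 1, 1],  # total   = a + b + c
    [1, 1, 0],  # north   = a + b
    [0, 0, 1],  # south   = c
    [1, 0, 0],  # north_a
    [0, 1, 0],  # north_b
    [0, 0, 1],  # south_c
])

y_bottom = np.array([10.0, 5.0, 7.0])

# Aggregating the bottom-level values yields every level of the hierarchy
y_all = S @ y_bottom
toy_levels = dict(zip(rows, y_all))
print(toy_levels)
```

A set of forecasts is coherent exactly when it can be written as `S @ b` for some vector `b` of bottom-level values.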


In [None]:
df.columns

Index(['Month', 'Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1',
       'Дальневосточный ФО - ALMONT - Almont FC tabs 10 mg #28',
       'Дальневосточный ФО - AMBROBENE - Ambrobene oral solution 7.5 mg/ml 100 ml #1',
       'Дальневосточный ФО - AMBROBENE - Ambrobene syrup 15 mg/5ml #100 bottle',
       'Дальневосточный ФО - AMBROBENE - Ambrobene tabs 30 mg #20',
       'Дальневосточный ФО - AMLODIPINE-TEVA - Amlodipine-Teva tabs 10 mg #30',
       'Дальневосточный ФО - AMLODIPINE-TEVA - Amlodipine-Teva tabs 5 mg #30',
       'Дальневосточный ФО - ANASTROSOLE - Anastrozole-Teva FC tabs 1 mg #28',
       'Дальневосточный ФО - ATORVASTATIN-TEVA - Atorvastatin-Teva FC tabs 10 mg #30',
       ...
       'Южный ФО - TENZOTRAN - Tenzotran FC tabs 0.2 mg #28',
       'Южный ФО - TENZOTRAN - Tenzotran FC tabs 0.4 mg #28',
       'Южный ФО - TEVAGRASTIME - Tevagrastime solution for inj 30 MIU #1',
       'Южный ФО - TIZANIDINE-TEVA - Tizanidine-Teva tabs 2 mg #30',
       'Южный ФО - TI

In [None]:
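# Build the summing dataframe S_df for the drugs data.
# Note: `df` is the wide dataframe loaded from the Excel file in a later cell.

columns = df.columns

# Start with a single 'Total' row that includes every column
S_df = pd.DataFrame(0, index=['Total'], columns=columns)

# Total row
S_df.loc['Total'] = 1

# Each column name encodes its hierarchy levels separated by ' - '.
# Add one row per level name and flag the columns that belong to it.
for col in columns:
    levels = col.split(" - ")
    for level in levels:
        if level not in S_df.index:
            S_df.loc[level] = 0
        S_df.at[level, col] = 1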

S_df


Unnamed: 0,Month,Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,Дальневосточный ФО - ALMONT - Almont FC tabs 10 mg #28,Дальневосточный ФО - AMBROBENE - Ambrobene oral solution 7.5 mg/ml 100 ml #1,Дальневосточный ФО - AMBROBENE - Ambrobene syrup 15 mg/5ml #100 bottle,Дальневосточный ФО - AMBROBENE - Ambrobene tabs 30 mg #20,Дальневосточный ФО - AMLODIPINE-TEVA - Amlodipine-Teva tabs 10 mg #30,Дальневосточный ФО - AMLODIPINE-TEVA - Amlodipine-Teva tabs 5 mg #30,Дальневосточный ФО - ANASTROSOLE - Anastrozole-Teva FC tabs 1 mg #28,Дальневосточный ФО - ATORVASTATIN-TEVA - Atorvastatin-Teva FC tabs 10 mg #30,...,Южный ФО - TENZOTRAN - Tenzotran FC tabs 0.2 mg #28,Южный ФО - TENZOTRAN - Tenzotran FC tabs 0.4 mg #28,Южный ФО - TEVAGRASTIME - Tevagrastime solution for inj 30 MIU #1,Южный ФО - TIZANIDINE-TEVA - Tizanidine-Teva tabs 2 mg #30,Южный ФО - TIZANIDINE-TEVA - Tizanidine-Teva tabs 4 mg #30,Южный ФО - TOPSAVER - Topsaver FC tabs 25 mg #28,Южный ФО - TOPSAVER - Topsaver FC tabs 50 mg #28,Южный ФО - TROXEVASIN - Troxevasin gel 2% 40 g #1,Южный ФО - VINCRISTINE-TEVA - Vincristine-Teva lyoph for inf 1 mg/ml 1 ml #1,Южный ФО - ZOLEDRONAT-TEVA - Zoledronate-Teva concentrate for inf 4 mg/5ml 5 ml #1
Total,1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1
Month,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Дальневосточный ФО,0,1,1,1,1,1,1,1,1,1,...,0,0,0,0,0,0,0,0,0,0
ALMAGEL,0,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Almagel susp 170 ml #1,0,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
VALZ,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Valz Combi FC tabs 10 mg + 160 mg #28,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Южный ФО,0,0,0,0,0,0,0,0,0,0,...,1,1,1,1,1,1,1,1,1,1
Fosicard tabs 20 mg #28,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
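The rows above are labelled with the individual level names (`ALMAGEL`, `Almagel susp 170 ml #1`). As a minimal sketch of an alternative, the hypothetical helper below labels each row with its cumulative path (`region`, `region - category`, `region - category - product`), which keeps nodes with the same short name in different regions apart. It assumes the input contains only bottom-level columns (no `Month` column), and it does not handle any row-ordering convention the package may expect.

```python
import pandas as pd

def build_summing_df(bottom_columns, sep=" - ", total_label="total"):
    """Hypothetical helper: summing dataframe with one row per node,
    where each node is labelled by its cumulative path."""
    bottom_columns = list(bottom_columns)

    # Collect node labels in order of first appearance
    nodes = [total_label]
    for col in bottom_columns:
        parts = col.split(sep)
        for depth in range(1, len(parts) + 1):
            node = sep.join(parts[:depth])
            if node not in nodes:
                nodes.append(node)

    S = pd.DataFrame(0, index=nodes, columns=bottom_columns)
    S.loc[total_label, :] = 1

    # Flag every ancestor path (and the series itself) of each bottom column
    for col in bottom_columns:
        parts = col.split(sep)
        for depth in range(1, len(parts) + 1):
            S.loc[sep.join(parts[:depth]), col] = 1
    return S

# Toy column names in the same 'region - category - product' style
S_toy = build_summing_df(["X - A - p1", "X - A - p2", "Y - B - p3"])
print(S_toy)
```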


In [None]:
df = pd.read_excel('data/Quarterly_smoothing.xlsx')

# Keep only the columns that contain no zeros
save_shape = df.shape
df = df.loc[:, (df != 0).all()]

print("Removed", save_shape[1] - df.shape[1], "Columns with at least one 0 in them")

df.head()

Removed 1440 Columns with at least one 0 in them


Unnamed: 0,Month,Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,Дальневосточный ФО - ALMONT - Almont FC tabs 10 mg #28,Дальневосточный ФО - AMBROBENE - Ambrobene oral solution 7.5 mg/ml 100 ml #1,Дальневосточный ФО - AMBROBENE - Ambrobene syrup 15 mg/5ml #100 bottle,Дальневосточный ФО - AMBROBENE - Ambrobene tabs 30 mg #20,Дальневосточный ФО - AMLODIPINE-TEVA - Amlodipine-Teva tabs 10 mg #30,Дальневосточный ФО - AMLODIPINE-TEVA - Amlodipine-Teva tabs 5 mg #30,Дальневосточный ФО - ANASTROSOLE - Anastrozole-Teva FC tabs 1 mg #28,Дальневосточный ФО - ATORVASTATIN-TEVA - Atorvastatin-Teva FC tabs 10 mg #30,...,Южный ФО - TENZOTRAN - Tenzotran FC tabs 0.2 mg #28,Южный ФО - TENZOTRAN - Tenzotran FC tabs 0.4 mg #28,Южный ФО - TEVAGRASTIME - Tevagrastime solution for inj 30 MIU #1,Южный ФО - TIZANIDINE-TEVA - Tizanidine-Teva tabs 2 mg #30,Южный ФО - TIZANIDINE-TEVA - Tizanidine-Teva tabs 4 mg #30,Южный ФО - TOPSAVER - Topsaver FC tabs 25 mg #28,Южный ФО - TOPSAVER - Topsaver FC tabs 50 mg #28,Южный ФО - TROXEVASIN - Troxevasin gel 2% 40 g #1,Южный ФО - VINCRISTINE-TEVA - Vincristine-Teva lyoph for inf 1 mg/ml 1 ml #1,Южный ФО - ZOLEDRONAT-TEVA - Zoledronate-Teva concentrate for inf 4 mg/5ml 5 ml #1
0,2018-03-01,948,35,1038,1081,4010,8478,5214,5496,4027,...,2705,36670,254,70,364,2432,3,97,400,9689
1,2018-04-01,1036,35,3219,9207,10351,21449,6639,5506,7811,...,4647,40633,1339,1955,452,10310,3029,526,2160,11704
2,2018-05-01,1245,93,4522,10998,13750,29521,9022,23342,11919,...,5632,42338,1660,4128,990,11333,3048,1650,3245,15232
3,2018-06-01,1786,183,5125,25764,19210,37830,13728,23052,13595,...,4419,9781,2270,4512,1522,10333,7050,1986,3235,6585
4,2018-07-01,4131,183,3633,20549,20002,27987,54209,23239,25310,...,2888,13232,2980,3594,1600,5583,5153,2519,5085,6034


In [None]:
def prep_data_for_scikit_hts(df):
    """Add region- and category-level aggregate columns to the wide dataframe
    and return it together with a dict that maps each node to its children."""
    aggregated_df = pd.DataFrame()

    # Split the column names into hierarchical levels on ' - '
    columns_split = [col.split(' - ') for col in df.columns]

    # Get the unique top-level classes (regions)
    regions = list(set([col[0] for col in columns_split if len(col) > 1]))

    # Every node except the bottom-most ones needs its own aggregate column.
    # The hierarchy dict maps each node to its children, starting with 'total'.
    hierarchy = {'total': regions}

    # Iterate through regions, e.g. 'Дальневосточный ФО'
    for region in regions:
        # Drug categories, e.g. 'ADRIANOL', 'AGALATES', 'ALMAGEL', 'ALMONT', 'AMBROBENE'
        categories = list(set([col[1] for col in columns_split if len(col) > 1 and col[0] == region]))
        region_key = region
        hierarchy[region_key] = [f'{region} - {category}' for category in categories]

        # Aggregate at the region level
        region_columns = [col for col in df.columns if col.startswith(f'{region} - ')]
        aggregated_df[region_key] = df[region_columns].sum(axis=1)

        # Iterate through drug categories
        for category in categories:
            category_key = f'{region} - {category}'
            products = [col for col in df.columns if col.startswith(f'{region} - {category} - ')]
            hierarchy[category_key] = products

            # Aggregate at the category level
            aggregated_df[category_key] = df[products].sum(axis=1)

    # Concatenate the aggregated columns with the original DataFrame
    df_with_aggregates = pd.concat([df, aggregated_df], axis=1)

    # Add the "total" column by summing across all columns. Note that this
    # includes the region and category aggregates added above, not only the
    # bottom-level series.
    df_with_aggregates['total'] = df_with_aggregates.sum(axis=1)

    return df_with_aggregates, hierarchy

In [None]:
df_with_aggregates, hierarchy = prep_data_for_scikit_hts(df)

In [None]:
df_with_aggregates.head(5)

Unnamed: 0,Month,Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,Дальневосточный ФО - ALMONT - Almont FC tabs 10 mg #28,Дальневосточный ФО - AMBROBENE - Ambrobene oral solution 7.5 mg/ml 100 ml #1,Дальневосточный ФО - AMBROBENE - Ambrobene syrup 15 mg/5ml #100 bottle,Дальневосточный ФО - AMBROBENE - Ambrobene tabs 30 mg #20,Дальневосточный ФО - AMLODIPINE-TEVA - Amlodipine-Teva tabs 10 mg #30,Дальневосточный ФО - AMLODIPINE-TEVA - Amlodipine-Teva tabs 5 mg #30,Дальневосточный ФО - ANASTROSOLE - Anastrozole-Teva FC tabs 1 mg #28,Дальневосточный ФО - ATORVASTATIN-TEVA - Atorvastatin-Teva FC tabs 10 mg #30,...,Центральный ФО - OMEPRAZOLE OTC,Центральный ФО - DOLOBENE,Центральный ФО - VELAFAX,Центральный ФО - MELOXICAM-TEVA,Центральный ФО - DOMPERIDONE-TEVA,Центральный ФО - TENZOTRAN,Центральный ФО - DOXAZOSIN-TEVA,Центральный ФО - CLOPIDOGREL-TEVA,Центральный ФО - PHEZAM,total
0,2018-03-01,948,35,1038,1081,4010,8478,5214,5496,4027,...,820,2,1348,2016,380,29601,1938,28288,382,16325139
1,2018-04-01,1036,35,3219,9207,10351,21449,6639,5506,7811,...,823,82,2357,3166,2163,112057,2553,47638,547,32543250
2,2018-05-01,1245,93,4522,10998,13750,29521,9022,23342,11919,...,1009,185,3003,3322,3135,155979,2845,89369,1254,43660485
3,2018-06-01,1786,183,5125,25764,19210,37830,13728,23052,13595,...,309,463,2307,4927,4481,293422,3645,82069,977,44176836
4,2018-07-01,4131,183,3633,20549,20002,27987,54209,23239,25310,...,308,435,2396,10060,4134,507653,4093,82099,1296,40029174
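Because the `total` column is computed after the region and category aggregates have been appended, it is worth checking that it still equals the sum of the bottom-level series. The sketch below uses a hypothetical helper, `total_is_coherent`, and made-up toy columns to show how the two ways of computing the total differ.

```python
import pandas as pd

def total_is_coherent(wide_df, bottom_columns, total_column="total"):
    """Hypothetical check: does the total equal the sum of the bottom-level series?"""
    bottom_sum = wide_df[list(bottom_columns)].sum(axis=1)
    return bool((wide_df[total_column] == bottom_sum).all())

# Toy wide dataframe with two bottom-level series and their aggregates
bottom = ["X - A - p1", "X - A - p2"]
toy = pd.DataFrame({"X - A - p1": [1, 2], "X - A - p2": [3, 4]})
toy["X - A"] = toy[bottom].sum(axis=1)
toy["X"] = toy["X - A"]

# Total computed from the bottom-level columns only
toy["total"] = toy[bottom].sum(axis=1)
ok_bottom_only = total_is_coherent(toy, bottom)

# Total computed from every column, aggregates included
toy["total"] = toy.drop(columns="total").sum(axis=1)
ok_all_columns = total_is_coherent(toy, bottom)

print(ok_bottom_only, ok_all_columns)
```

In the toy data, summing every column counts each bottom-level value once per level of the hierarchy, so the second total is three times the first.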


<font color='cyan'>HierarchicalForecast expects long-format data with one row per series and date (Drug | Date | Sales), rather than one column per drug name.</font>

In [None]:
# Melt the DataFrame: turn the column names into rows to match the input format of HierarchicalForecast
melted_df = df_with_aggregates.melt(id_vars=['Month'], var_name='Drug', value_name='Sales')

# Reorder the columns and rename them to the names the package expects
melted_df = melted_df[['Drug', 'Month', 'Sales']].rename(
    columns={'Drug': 'unique_id', 'Month': 'ds', 'Sales': 'y'}
)


melted_df


Unnamed: 0,unique_id,ds,y
0,Дальневосточный ФО - ALMAGEL - Almagel susp 17...,2018-03-01,948
1,Дальневосточный ФО - ALMAGEL - Almagel susp 17...,2018-04-01,1036
2,Дальневосточный ФО - ALMAGEL - Almagel susp 17...,2018-05-01,1245
3,Дальневосточный ФО - ALMAGEL - Almagel susp 17...,2018-06-01,1786
4,Дальневосточный ФО - ALMAGEL - Almagel susp 17...,2018-07-01,4131
...,...,...,...
74551,total,2022-07-01,47586666
74552,total,2022-08-01,51155616
74553,total,2022-09-01,51086598
74554,total,2022-10-01,59657496


In [None]:
type(hierarchy)

dict

In [None]:
print(Y_df.head(15))
print(Y_df.tail(5))

   unique_id         ds       y
0      Total 2016-01-01  156508
1      Total 2016-01-02  129902
2      Total 2016-01-03  138203
3      Total 2016-01-04  115017
4      Total 2016-01-05  126042
5      Total 2016-01-06  137474
6      Total 2016-01-07  162134
7      Total 2016-01-08  123693
8      Total 2016-01-09  148475
9      Total 2016-01-10  132045
10     Total 2016-01-11  162284
11     Total 2016-01-12  122782
12     Total 2016-01-13  134154
13     Total 2016-01-14  121805
14     Total 2016-01-15  120531
            unique_id         ds   y
72829  zh_MOB_AAG_138 2016-12-27  69
72830  zh_MOB_AAG_138 2016-12-28  64
72831  zh_MOB_AAG_138 2016-12-29  94
72832  zh_MOB_AAG_138 2016-12-30  99
72833  zh_MOB_AAG_138 2016-12-31  76


In [None]:
tags

{'Views': array(['Total'], dtype=object),
 'Views/Country': array(['de', 'en', 'fr', 'ja', 'ru', 'zh'], dtype=object),
 'Views/Country/Access': array(['de_AAC', 'de_DES', 'de_MOB', 'en_AAC', 'en_DES', 'en_MOB',
        'fr_AAC', 'fr_DES', 'fr_MOB', 'ja_AAC', 'ja_DES', 'ja_MOB',
        'ru_AAC', 'ru_DES', 'ru_MOB', 'zh_AAC', 'zh_DES', 'zh_MOB'],
       dtype=object),
 'Views/Country/Access/Agent': array(['de_AAC_AAG', 'de_AAC_SPD', 'de_DES_AAG', 'de_MOB_AAG',
        'en_AAC_AAG', 'en_AAC_SPD', 'en_DES_AAG', 'en_MOB_AAG',
        'fr_AAC_AAG', 'fr_AAC_SPD', 'fr_DES_AAG', 'fr_MOB_AAG',
        'ja_AAC_AAG', 'ja_AAC_SPD', 'ja_DES_AAG', 'ja_MOB_AAG',
        'ru_AAC_AAG', 'ru_AAC_SPD', 'ru_DES_AAG', 'ru_MOB_AAG',
        'zh_AAC_AAG', 'zh_AAC_SPD', 'zh_DES_AAG', 'zh_MOB_AAG'],
       dtype=object),
 'Views/Country/Access/Agent/Topic': array(['de_AAC_AAG_001', 'de_AAC_AAG_010', 'de_AAC_AAG_014',
        'de_AAC_AAG_045', 'de_AAC_AAG_063', 'de_AAC_AAG_100',
        'de_AAC_AAG_110', 'de_AAC

We split the dataframe into train and test sets, holding out the last 7 observations of each series for testing.

In [None]:
Y_df = melted_df

In [None]:
Y_test_df = Y_df.groupby('unique_id').tail(7)  # last 7 observations of each series
Y_train_df = Y_df.drop(Y_test_df.index)

In [None]:
Y_test_df = Y_test_df.set_index('unique_id')
Y_train_df = Y_train_df.set_index('unique_id')

In [None]:
Y_test_df

Unnamed: 0_level_0,ds,y
unique_id,Unnamed: 1_level_1,Unnamed: 2_level_1
Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,2022-05-01,1439
Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,2022-06-01,1468
Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,2022-07-01,2363
Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,2022-08-01,1154
Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,2022-09-01,1582
...,...,...
total,2022-07-01,47586666
total,2022-08-01,51155616
total,2022-09-01,51086598
total,2022-10-01,59657496


In [None]:
Y_train_df

Unnamed: 0_level_0,ds,y
unique_id,Unnamed: 1_level_1,Unnamed: 2_level_1
Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,2018-03-01,948
Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,2018-04-01,1036
Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,2018-05-01,1245
Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,2018-06-01,1786
Дальневосточный ФО - ALMAGEL - Almagel susp 170 ml #1,2018-07-01,4131
...,...,...
total,2021-12-01,88459056
total,2022-01-01,79444518
total,2022-02-01,63095439
total,2022-03-01,50230020
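The split above relies on `groupby(...).tail(h)` picking the last `h` rows of each series. A minimal sketch on a made-up two-series dataframe shows the pattern and a quick check that no test date leaks into the training set; it assumes the rows of each series are already sorted by date.

```python
import pandas as pd

# Hypothetical long-format data: two series with five monthly observations each
toy = pd.DataFrame({
    "unique_id": ["a"] * 5 + ["b"] * 5,
    "ds": list(pd.date_range("2022-01-01", periods=5, freq="MS")) * 2,
    "y": range(10),
})

horizon = 2
toy_test = toy.groupby("unique_id").tail(horizon)   # last `horizon` rows per series
toy_train = toy.drop(toy_test.index)                # everything else

# Every training date should precede every test date
last_train_date = toy_train["ds"].max()
first_test_date = toy_test["ds"].min()
print(len(toy_train), len(toy_test), last_train_date < first_test_date)
```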


## 2. Base Forecasts

The following cell computes the *base forecast* for each time series using the `ETS` and `Naive` models. Observe that `Y_hat_df` contains the forecasts, but they are not coherent.

In [None]:
%%capture
from statsforecast.models import ETS, Naive
from statsforecast.core import StatsForecast

In [None]:
%%capture
fcst = StatsForecast(
    df=Y_train_df,
    models=[ETS(season_length=7, model='ZAA'), Naive()],
    freq='M',
    n_jobs=-1
)
Y_hat_df = fcst.forecast(h=7)

Observe that the ETS model computes negative forecasts for some series.

In [None]:
Y_hat_df.query('ETS < 0')

Unnamed: 0_level_0,ds,ETS,Naive
unique_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Дальневосточный ФО - BISOPROLOL-TEVA - Bisoprolol-Teva FC tabs 5 mg #50,2022-04-30,-125.188553,2008.0
Дальневосточный ФО - CETIRIZINE - Cetirizine-Teva FC tabs 10 mg #10,2022-04-30,-1413.106079,15910.0
Дальневосточный ФО - CETIRIZINE - Cetirizine-Teva FC tabs 10 mg #10,2022-05-31,-1595.040161,15910.0
Дальневосточный ФО - CETIRIZINE - Cetirizine-Teva FC tabs 10 mg #10,2022-06-30,-1035.791870,15910.0
Дальневосточный ФО - CETIRIZINE - Cetirizine-Teva FC tabs 10 mg #10,2022-07-31,-2624.863281,15910.0
...,...,...,...
Южный ФО - VINCRISTINE-TEVA,2022-06-30,-1496.507690,577.0
Южный ФО - VINCRISTINE-TEVA,2022-07-31,-993.848206,577.0
Южный ФО - VINCRISTINE-TEVA - Vincristine-Teva lyoph for inf 1 mg/ml 1 ml #1,2022-05-31,-369.803589,577.0
Южный ФО - VINCRISTINE-TEVA - Vincristine-Teva lyoph for inf 1 mg/ml 1 ml #1,2022-06-30,-1496.507690,577.0
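One might be tempted to simply clip negative forecasts at zero. A minimal sketch on a hypothetical three-series hierarchy (one total and two bottom-level series, with made-up values) shows why that is not enough: clipping each series independently generally breaks coherence.

```python
import numpy as np

# Rows: total, a, b. Columns: bottom-level series a, b.
S = np.array([[1, 1],
              [1, 0],
              [0, 1]])

# Coherent forecasts in which one bottom-level series is negative
y_hat_bottom = np.array([-3.0, 10.0])
y_hat = S @ y_hat_bottom            # total = 7, a = -3, b = 10

# Clip every series at zero independently
clipped = np.clip(y_hat, 0, None)   # total = 7, a = 0, b = 10

# The total no longer equals the sum of its children
is_coherent = bool(np.isclose(clipped[0], clipped[1:].sum()))
print(clipped, is_coherent)
```

This is why nonnegativity has to be imposed inside the reconciliation step rather than applied afterwards.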


## 3. Non-Negative Reconciliation

The following cell makes the previous forecasts coherent and nonnegative using the `HierarchicalReconciliation` class.

In [None]:
from hierarchicalforecast.methods import MinTrace, BottomUp
from hierarchicalforecast.core import HierarchicalReconciliation

In [None]:
%%capture
reconcilers = [
    BottomUp(),
    MinTrace(method='ols'),
    MinTrace(method='ols', nonnegative=True)
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
Y_rec_df = hrec.reconcile(Y_hat_df=Y_hat_df, Y_df=Y_train_df,
                          S=S_df, tags=tags)

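Reconciliation combines three inputs that must refer to the same series labels: the forecasts, the rows of `S_df`, and the values in `tags`. In this notebook the `S_df` built for the drugs data uses labels such as `Total` and `ALMAGEL`, the long dataframe uses `total` and `Дальневосточный ФО - ALMAGEL`, and `tags` was loaded from `Wiki2`, so a mismatch is plausible. The helper below, `find_label_mismatches`, is a hypothetical sketch (not part of the package) for spotting such mismatches before calling `reconcile`; it assumes `unique_id` is the index of the forecast dataframe, as it is here.

```python
import numpy as np
import pandas as pd

def find_label_mismatches(Y_hat_df, S_df, tags):
    """Hypothetical helper: report series labels that do not line up
    across the forecasts, the summing dataframe, and the tags."""
    forecast_ids = set(Y_hat_df.index.unique())
    s_ids = set(S_df.index)
    tag_ids = {uid for ids in tags.values() for uid in ids}
    return {
        "in_forecasts_not_in_S": sorted(forecast_ids - s_ids),
        "in_S_not_in_forecasts": sorted(s_ids - forecast_ids),
        "in_tags_not_in_S": sorted(tag_ids - s_ids),
    }

# Toy inputs with a deliberate 'total' vs 'Total' mismatch
toy_hat = pd.DataFrame(
    {"ETS": [1.0, 2.0, 3.0]},
    index=pd.Index(["total", "a", "b"], name="unique_id"),
)
toy_S = pd.DataFrame([[1, 1], [1, 0], [0, 1]],
                     index=["Total", "a", "b"], columns=["a", "b"])
toy_tags = {"top": np.array(["Total"]), "bottom": np.array(["a", "b"])}

report = find_label_mismatches(toy_hat, toy_S, toy_tags)
print(report)
```

All three lists should be empty before the inputs are passed to the reconciler.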

Observe that the nonnegative reconciliation method obtains nonnegative forecasts.

In [None]:
Y_rec_df.query('`ETS/MinTrace_method-ols_nonnegative-True` < 0')

Unnamed: 0_level_0,ds,ETS,Naive,ETS/MinTrace_method-ols,Naive/MinTrace_method-ols,ETS/MinTrace_method-ols_nonnegative-True,Naive/MinTrace_method-ols_nonnegative-True
unique_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1


The unconstrained reconciliation method, by contrast, produces negative forecasts.

In [None]:
Y_rec_df.query('`ETS/MinTrace_method-ols` < 0')

Unnamed: 0_level_0,ds,ETS,Naive,ETS/MinTrace_method-ols,Naive/MinTrace_method-ols,ETS/MinTrace_method-ols_nonnegative-True,Naive/MinTrace_method-ols_nonnegative-True
unique_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
de_DES,2016-12-25,-2553.932861,495.0,-3468.745214,495.0,2.262540e-15,495.0
de_DES,2016-12-26,-2155.228271,495.0,-2985.587125,495.0,1.356705e-30,495.0
de_DES,2016-12-27,-2720.993896,495.0,-3698.680055,495.0,6.857413e-30,495.0
de_DES,2016-12-29,-3429.432617,495.0,-2965.207609,495.0,2.456449e+02,495.0
de_DES,2016-12-30,-3963.202637,495.0,-3217.360371,495.0,3.646790e+02,495.0
...,...,...,...,...,...,...,...
zh_MOB_AAG_036,2016-12-26,75.298317,115.0,-165.799776,115.0,3.207772e-14,115.0
zh_MOB_AAG_036,2016-12-27,72.895554,115.0,-134.340626,115.0,2.308198e-14,115.0
zh_MOB_AAG_138,2016-12-25,94.796623,65.0,-47.009813,65.0,3.116938e-14,65.0
zh_MOB_AAG_138,2016-12-26,71.293983,65.0,-169.804110,65.0,0.000000e+00,65.0
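The difference between the two columns can be reproduced on a toy example. The sketch below assumes the OLS variant, where reconciliation is a least-squares fit of the base forecasts onto the column space of the summing matrix, and the nonnegative variant adds the constraint that the bottom-level values are at least zero. The constrained problem is solved here with plain projected gradient descent, chosen for brevity; it is not necessarily the solver the package uses, and the numbers are made up.

```python
import numpy as np

# Rows: total, a, b. Columns: bottom-level series a, b.
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Incoherent base forecasts with a negative value for series a
y_hat = np.array([4.0, -3.0, 6.0])

# Unconstrained OLS reconciliation: b = argmin ||y_hat - S b||^2
b_ols = np.linalg.solve(S.T @ S, S.T @ y_hat)
y_ols = S @ b_ols

# Nonnegative variant: same objective subject to b >= 0,
# solved with projected gradient descent (step = 1 / largest eigenvalue of S'S)
step = 1.0 / np.linalg.eigvalsh(S.T @ S).max()
b_nn = np.zeros(S.shape[1])
for _ in range(200):
    gradient = S.T @ (S @ b_nn - y_hat)
    b_nn = np.maximum(0.0, b_nn - step * gradient)
y_nn = S @ b_nn

print("OLS:        ", y_ols)
print("Nonnegative:", y_nn)
```

Both results are coherent because they are of the form `S @ b`, but only the constrained one is guaranteed to have no negative entries.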


## 4. Evaluation

The `HierarchicalForecast` package includes the `HierarchicalEvaluation` class to evaluate forecasts at the different levels of the hierarchy. It can also compute scaled metrics relative to a benchmark model.

In [None]:
from hierarchicalforecast.evaluation import HierarchicalEvaluation

In [None]:
def mse(y, y_hat):
    return np.mean((y-y_hat)**2)

evaluator = HierarchicalEvaluation(evaluators=[mse])
evaluation = evaluator.evaluate(
        Y_hat_df=Y_rec_df, Y_test_df=Y_test_df,
        tags=tags, benchmark='Naive'
)
evaluation.filter(like='ETS', axis=1).T

level,Overall,Views,Views/Country,Views/Country/Access,Views/Country/Access/Agent,Views/Country/Access/Agent/Topic
metric,mse-scaled,mse-scaled,mse-scaled,mse-scaled,mse-scaled,mse-scaled
ETS,1.011585,0.7358,1.190354,1.103657,1.089515,1.397139
ETS/MinTrace_method-ols,0.979163,0.698355,1.062521,1.143277,1.113349,1.354041
ETS/MinTrace_method-ols_nonnegative-True,0.945075,0.677892,1.004639,1.184719,1.141442,1.158672


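Observe that, overall, the nonnegative reconciliation method performs better than its unconstrained counterpart (scaled MSE of 0.945 versus 0.979), although it is slightly worse at the `Views/Country/Access` and `Views/Country/Access/Agent` levels.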
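To help read the `mse-scaled` values, here is a minimal sketch of a scaled metric, assuming it is the ratio of a model's MSE to the benchmark's MSE on the same series. The arrays are made up for illustration, and the exact computation inside `HierarchicalEvaluation` may differ.

```python
import numpy as np

def mse(y, y_hat):
    return np.mean((y - y_hat) ** 2)

# Hypothetical test values and two sets of forecasts
y_true = np.array([10.0, 12.0, 11.0, 13.0])
y_model = np.array([11.0, 12.0, 10.0, 14.0])
y_naive = np.array([9.0, 9.0, 9.0, 9.0])

# Values below 1 mean the model beats the benchmark; values above 1 mean it is worse
scaled = mse(y_true, y_model) / mse(y_true, y_naive)
print(scaled)
```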

### References
- [Hyndman, R.J., & Athanasopoulos, G. (2021). "Forecasting: principles and practice, 3rd edition: Chapter 11: Forecasting hierarchical and grouped series". OTexts: Melbourne, Australia. OTexts.com/fpp3. Accessed on July 2022.](https://otexts.com/fpp3/hierarchical.html)
- [Wickramasuriya, S.L., Athanasopoulos, G., & Hyndman, R.J. (2019). "Optimal forecast reconciliation for hierarchical and grouped time series through trace minimization". Journal of the American Statistical Association, 114, 804–819. doi:10.1080/01621459.2018.1448825.](https://robjhyndman.com/publications/mint/)
- [Wickramasuriya, S.L., Turlach, B.A., & Hyndman, R.J. (2020). "Optimal non-negative forecast reconciliation". Stat Comput 30, 1167–1182. https://doi.org/10.1007/s11222-020-09930-0](https://robjhyndman.com/publications/nnmint/)