In [77]:
%load_ext autoreload
%autoreload 2



In [78]:
import warnings

warnings.filterwarnings("ignore")

In [79]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Forecasting, hierarchical data, tuning, and more

In this notebook, we cover more advanced forecasting topics, with a focus on hierarchical data, hyperparameter tuning, and reconciliation.
We will use sales data from [this kaggle dataset](https://www.kaggle.com/datasets/utathya/future-volume-prediction?resource=download), which contains sales data for different products (SKUs) and agencies.

The notebook will be divided into the following sections:

1. Data preparation for hierarchical forecasting
2. Simple forecasting with builtin parallelization
3. Tuning with Optuna
4. Tuning individually for each time series
5. Reconciliation
6. Benchmarking


# 1. Loading and preparing the data

The dataset is a 3-level hierarchical time series, with the following levels:

1. Total sales for all SKUs and agencies
2. Sales for each agency
3. Sales for each SKU in each agency


```mermaid
graph TD
    Root["__total"] --> Agency_01
    Root --> Agency_02
    Root --> Agency_60
    
    Agency_01 --> SKU_01_A01["SKU_01"]
    Agency_01 --> SKU_02_A01["SKU_02"]
    Agency_01 --> SKU_11_A01["SKU_11"]
    Agency_01 --> Agency_01_Total["__total"]
    
    Agency_02 --> SKU_01_A02["SKU_01"]
    Agency_02 --> SKU_02_A02["SKU_02"]
    Agency_02 --> SKU_03_A02["SKU_03"]
    Agency_02 --> Agency_02_Total["__total"]
    
    Agency_60 --> SKU_01_A60["SKU_01"]
    Agency_60 --> SKU_02_A60["SKU_02"]
    Agency_60 --> SKU_23_A60["SKU_23"]
    Agency_60 --> Agency_60_Total["__total"]

```

In sktime, a hierarchy is represented with a pandas `MultiIndex`: each index level corresponds to one level of the hierarchy, and the last level is reserved for the time index.
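To make the index layout concrete, here is a minimal sketch that builds a tiny frame with the same three index levels as our data. The agency and SKU names, the number of months, and the `Volume` values are all made up for illustration.

```python
import pandas as pd

# Toy data: 2 agencies x 2 SKUs x 3 months (names and values are made up).
months = pd.period_range("2013-01", periods=3, freq="M")
index = pd.MultiIndex.from_product(
    [["Agency_01", "Agency_02"], ["SKU_01", "SKU_02"], months],
    names=["Agency", "SKU", "YearMonth"],
)
toy = pd.DataFrame({"Volume": range(len(index))}, index=index)

# The last level is time; the remaining levels identify one series each.
n_levels = toy.index.nlevels
time_level_name = toy.index.names[-1]
n_series = len(toy.index.droplevel(-1).unique())
print(n_levels, time_level_name, n_series)
```

Every combination of the non-time levels identifies one time series, which is how the forecasters later know where one series ends and the next begins.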

In [80]:
# TODO: put in a function
dataset = pd.read_csv("historical_volume.csv")
dataset["YearMonth"] = pd.to_datetime(dataset["YearMonth"], format="%Y%m").dt.to_period("M")
y = dataset.set_index(["Agency", "SKU", "YearMonth"]).sort_index()
y

Agency,SKU,YearMonth,Volume
Agency_01,SKU_01,2013-01,80.676
Agency_01,SKU_01,2013-02,98.064
Agency_01,SKU_01,2013-03,133.704
Agency_01,SKU_01,2013-04,147.312
Agency_01,SKU_01,2013-05,175.608
...,...,...,...
Agency_60,SKU_23,2017-08,1.980
Agency_60,SKU_23,2017-09,1.260
Agency_60,SKU_23,2017-10,0.990
Agency_60,SKU_23,2017-11,0.090


## 1.1 Some useful pandas multiindex operations

Multiindex is a powerful tool in pandas, and knowing its operations can be very useful when working with hierarchical data.


* `df.index.get_level_values(level)`: returns the values of a specific level in the multiindex.
* `df.index.droplevel(level)`: drops a specific level from the multiindex.
* `df.loc["value"]`: selects rows whose index level 0 is "value".
* `df.loc[pd.IndexSlice[:, "value"], :]`: selects rows whose index level 1 is "value".

In sktime, for example, one can use `df.index.droplevel(-1).unique()` to list the identifiers of the individual time series in the dataset.
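The selection operations listed above can be tried on a small toy frame. This is only a sketch: the index values and volumes are invented, and the index is built already sorted, which slicing with `pd.IndexSlice` relies on.

```python
import pandas as pd

months = pd.period_range("2013-01", periods=2, freq="M")
index = pd.MultiIndex.from_product(
    [["Agency_01", "Agency_02"], ["SKU_01", "SKU_02"], months],
    names=["Agency", "SKU", "YearMonth"],
)
toy = pd.DataFrame(
    {"Volume": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]}, index=index
)

# Rows whose level 0 equals "Agency_01" (the selected level is dropped).
agency_rows = toy.loc["Agency_01"]

# Rows whose level 1 equals "SKU_02", across all agencies (all levels kept).
sku_rows = toy.loc[pd.IndexSlice[:, "SKU_02"], :]

# One entry per time series: drop the time level and deduplicate.
series_ids = toy.index.droplevel(-1).unique()

print(agency_rows)
print(sku_rows)
print(list(series_ids))
```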

In [81]:
y.index.get_level_values(0)

Index(['Agency_01', 'Agency_01', 'Agency_01', 'Agency_01', 'Agency_01',
       'Agency_01', 'Agency_01', 'Agency_01', 'Agency_01', 'Agency_01',
       ...
       'Agency_60', 'Agency_60', 'Agency_60', 'Agency_60', 'Agency_60',
       'Agency_60', 'Agency_60', 'Agency_60', 'Agency_60', 'Agency_60'],
      dtype='object', name='Agency', length=21000)

In [82]:
y.index.get_level_values(-1)

PeriodIndex(['2013-01', '2013-02', '2013-03', '2013-04', '2013-05', '2013-06',
             '2013-07', '2013-08', '2013-09', '2013-10',
             ...
             '2017-03', '2017-04', '2017-05', '2017-06', '2017-07', '2017-08',
             '2017-09', '2017-10', '2017-11', '2017-12'],
            dtype='period[M]', name='YearMonth', length=21000)

In [83]:
y.index.droplevel(-1).unique()

MultiIndex([('Agency_01', 'SKU_01'),
            ('Agency_01', 'SKU_02'),
            ('Agency_01', 'SKU_03'),
            ('Agency_01', 'SKU_04'),
            ('Agency_01', 'SKU_05'),
            ('Agency_01', 'SKU_11'),
            ('Agency_02', 'SKU_01'),
            ('Agency_02', 'SKU_02'),
            ('Agency_02', 'SKU_03'),
            ('Agency_02', 'SKU_04'),
            ...
            ('Agency_59', 'SKU_05'),
            ('Agency_59', 'SKU_07'),
            ('Agency_59', 'SKU_17'),
            ('Agency_60', 'SKU_01'),
            ('Agency_60', 'SKU_02'),
            ('Agency_60', 'SKU_03'),
            ('Agency_60', 'SKU_04'),
            ('Agency_60', 'SKU_05'),
            ('Agency_60', 'SKU_07'),
            ('Agency_60', 'SKU_23')],
           names=['Agency', 'SKU'], length=350)

## 1.2 Aggregating and visualizing the data

Since the dataset does not come with totals for each level, we need to add them.
This is easily done with the `Aggregator` transformer from sktime.
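To see what this aggregation amounts to, here is a plain-pandas sketch that adds `__total` rows to a toy frame by hand. It is not how `Aggregator` is implemented, only an illustration of the result we expect; the toy values are made up.

```python
import pandas as pd

months = pd.period_range("2013-01", periods=2, freq="M")
index = pd.MultiIndex.from_product(
    [["Agency_01", "Agency_02"], ["SKU_01", "SKU_02"], months],
    names=["Agency", "SKU", "YearMonth"],
)
bottom = pd.DataFrame(
    {"Volume": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]}, index=index
)
levels = ["Agency", "SKU", "YearMonth"]

# Agency totals: sum over SKUs within each agency and month.
agency_totals = bottom.groupby(level=["Agency", "YearMonth"]).sum()
agency_totals["SKU"] = "__total"
agency_totals = agency_totals.set_index("SKU", append=True).reorder_levels(levels)

# Grand total: sum over all agencies and SKUs for each month.
grand_total = bottom.groupby(level="YearMonth").sum()
grand_total["Agency"] = "__total"
grand_total["SKU"] = "__total"
grand_total = grand_total.set_index(["Agency", "SKU"], append=True).reorder_levels(levels)

full = pd.concat([bottom, agency_totals, grand_total]).sort_index()
print(full)
```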

In [84]:
from sktime.transformations.hierarchical.aggregate import Aggregator

y = Aggregator().fit_transform(y)
y

Agency,SKU,YearMonth,Volume
Agency_01,SKU_01,2013-01,80.676000
Agency_01,SKU_01,2013-02,98.064000
Agency_01,SKU_01,2013-03,133.704000
Agency_01,SKU_01,2013-04,147.312000
Agency_01,SKU_01,2013-05,175.608000
...,...,...,...
__total,__total,2017-08,599553.665250
__total,__total,2017-09,556966.701300
__total,__total,2017-10,542554.007475
__total,__total,2017-11,457914.412950


### Train-test split

In [85]:
from sktime.forecasting.model_selection import temporal_train_test_split
y_train, y_test = temporal_train_test_split(y, test_size=18)

test_timeindex = y_test.index.get_level_values(-1).unique()
test_timeindex

PeriodIndex(['2016-07', '2016-08', '2016-09', '2016-10', '2016-11', '2016-12',
             '2017-01', '2017-02', '2017-03', '2017-04', '2017-05', '2017-06',
             '2017-07', '2017-08', '2017-09', '2017-10', '2017-11', '2017-12'],
            dtype='period[M]', name='YearMonth')
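As a rough sketch of what the temporal split does on hierarchical data, the snippet below holds out the last `test_size` periods of every series in a toy frame. It assumes that all series share the same time index; the exact behaviour of `temporal_train_test_split` for series of unequal length may differ. All names and values are made up.

```python
import pandas as pd

months = pd.period_range("2013-01", periods=6, freq="M")
index = pd.MultiIndex.from_product(
    [["A", "B"], months], names=["series", "YearMonth"]
)
toy = pd.DataFrame({"Volume": range(12)}, index=index)

test_size = 2
time = toy.index.get_level_values(-1)

# First period that belongs to the test set.
cutoff = time.unique().sort_values()[-test_size]

toy_train = toy[time < cutoff]
toy_test = toy[time >= cutoff]
print(toy_train.index.get_level_values(-1).max(), cutoff)
```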

In [86]:
from utils import display_hierarchical_timeseries

display_hierarchical_timeseries(y_train, y_test)


## Parallelization

Instead of manually iterating over the series, we can use the built-in parallelization to handle this 🙂.

When a univariate forecasting model is fitted to a hierarchical time series, one copy of the model is created for each series in the hierarchy and fitted separately. All copies share the same hyperparameters.
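Conceptually, this is equivalent to the loop below. The sketch uses a hypothetical `MeanForecaster` class (not part of sktime) as a stand-in for a univariate model, and made-up data, just to show the "one copy per series, same hyperparameters" idea.

```python
import pandas as pd


class MeanForecaster:
    """Hypothetical univariate model: predicts the mean of the last `window` points."""

    def __init__(self, window=3):
        self.window = window

    def fit(self, y):
        self.level_ = float(y.tail(self.window).mean())
        return self

    def predict(self, steps):
        return [self.level_] * steps


months = pd.period_range("2013-01", periods=6, freq="M")
index = pd.MultiIndex.from_product(
    [["A", "B"], months], names=["series", "YearMonth"]
)
toy = pd.DataFrame({"Volume": [float(v) for v in range(12)]}, index=index)

# One independent copy per series, all created with the same hyperparameters.
fitted = {
    key: MeanForecaster(window=2).fit(group["Volume"])
    for key, group in toy.groupby(level="series")
}

predictions = {key: model.predict(steps=3) for key, model in fitted.items()}
print(predictions)
```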

In [87]:
from sktime.forecasting.exp_smoothing import ExponentialSmoothing
from sktime.forecasting.fbprophet import Prophet


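# NOTE: the series are monthly (period[M]); freq="Q" looks like a leftover
# from a quarterly dataset, and freq="M" is probably what is intended here.
model = Prophet(freq="Q")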
model.fit(y_train)

test_predictions = model.predict(fh=test_timeindex)
test_predictions


21:20:16 - cmdstanpy - INFO - Chain [1] start processing
21:20:16 - cmdstanpy - INFO - Chain [1] done processing

Agency,SKU,YearMonth,Volume
Agency_01,SKU_01,2016-07,34.567985
Agency_01,SKU_01,2016-08,82.860599
Agency_01,SKU_01,2016-09,62.245607
Agency_01,SKU_01,2016-10,76.752518
Agency_01,SKU_01,2016-11,13.102124
...,...,...,...
__total,__total,2017-08,549435.467869
__total,__total,2017-09,485786.930209
__total,__total,2017-10,522700.999557
__total,__total,2017-11,474188.210622


In [88]:
model.forecasters_

Unnamed: 0,Unnamed: 1,forecasters
Agency_01,SKU_01,Prophet(freq='Q')
Agency_01,SKU_02,Prophet(freq='Q')
Agency_01,SKU_03,Prophet(freq='Q')
Agency_01,SKU_04,Prophet(freq='Q')
Agency_01,SKU_05,Prophet(freq='Q')
...,...,...
Agency_60,SKU_05,Prophet(freq='Q')
Agency_60,SKU_07,Prophet(freq='Q')
Agency_60,SKU_23,Prophet(freq='Q')
Agency_60,__total,Prophet(freq='Q')


## Tuning hyperparameters with Optuna

Optuna is a hyperparameter optimization framework that supports many sampling strategies. The default is Tree of Parzen Estimators (TPE), which is a Bayesian-like optimization algorithm.

We need to define the search space, whose form depends on the nature of each hyperparameter (integer, categorical, or continuous).
Below, we tune a few hyperparameters for demonstration purposes, with very few evaluations to keep the run short.
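Each candidate set of hyperparameters is scored with cross-validation on an expanding training window. The sketch below is a simplified re-implementation of that idea, not sktime's own code: the function name `expanding_window_splits` is ours, positions are plain integers, and the horizon `[1, 2, 3]` and the series length of 60 are chosen only for illustration.

```python
def expanding_window_splits(n_obs, fh, initial_window, step_length):
    """Yield (train_idx, test_idx) integer positions for an expanding window.

    The training window always starts at position 0 and grows by
    `step_length` observations per split; the test positions are the
    horizons in `fh`, counted from the last training observation.
    """
    train_end = initial_window  # exclusive end of the training window
    while train_end - 1 + max(fh) < n_obs:
        train_idx = list(range(train_end))
        test_idx = [train_end - 1 + h for h in fh]
        yield train_idx, test_idx
        train_end += step_length


splits = list(
    expanding_window_splits(n_obs=60, fh=[1, 2, 3], initial_window=36, step_length=12)
)
for train_idx, test_idx in splits:
    print(f"train: 0..{train_idx[-1]}, test: {test_idx}")
```

The tuner fits the candidate model on each training window, scores it on the matching test positions, and compares candidates on their average score.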

In [89]:
from sktime.forecasting.model_selection import ForecastingOptunaSearchCV
from sktime.split import ExpandingWindowSplitter
from optuna.distributions import CategoricalDistribution, IntUniformDistribution, LogUniformDistribution

cv = ExpandingWindowSplitter(fh=[0, 1, 2, 3], initial_window=36, step_length=12)

tuning_model = ForecastingOptunaSearchCV(
    forecaster=Prophet(freq="Q"),
    param_grid={
        "n_changepoints": IntUniformDistribution(2, 20),
        "yearly_seasonality": CategoricalDistribution([True, False]),
        "seasonality_mode": CategoricalDistribution(["additive", "multiplicative"]),
        "changepoint_prior_scale": LogUniformDistribution(0.0001, 0.01),
        "seasonality_prior_scale": LogUniformDistribution(0.0001, 10),
    },
    cv=cv,
    n_evals=2,  # deliberately tiny, to keep the demonstration fast
)

tuning_model.fit(y_train)



[I 2024-08-15 21:21:38,079] A new study created in memory with name: no-name-1ea282b7-69fc-404a-b690-ac88617a489a
21:21:39 - cmdstanpy - INFO - Chain [1] start processing
21:21:39 - cmdstanpy - INFO - Chain [1] done processing

In [90]:
tuning_model.best_params_

{'n_changepoints': 15,
 'yearly_seasonality': True,
 'seasonality_mode': 'multiplicative',
 'changepoint_prior_scale': 0.0030201994026721564,
 'seasonality_prior_scale': 0.0013604763579329925}

In [91]:
tuning_model.best_forecaster_

## Tuning each series individually

In the example above, we selected the single set of hyperparameters that performs best on average across all time series.
However, the best hyperparameters may well differ from one series to another.

We will use `ForecastByLevel` to apply the tuning separately to each series in the hierarchy.
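The difference between the two strategies can be illustrated with a small table of cross-validation errors. The numbers below are made up: rows are series, columns are candidate hyperparameter sets.

```python
import numpy as np

# Made-up CV errors: 3 series (rows) x 3 candidate hyperparameter sets (columns).
cv_errors = np.array(
    [
        [1.0, 2.0, 3.0],
        [3.0, 2.0, 1.0],
        [2.5, 2.0, 2.6],
    ]
)

# Global tuning: one candidate for all series, best on average.
global_best = int(cv_errors.mean(axis=0).argmin())
global_error = float(cv_errors[:, global_best].mean())

# Per-series tuning: each series picks its own best candidate.
local_best = cv_errors.argmin(axis=1)
local_error = float(cv_errors.min(axis=1).mean())

print(global_best, global_error)
print(local_best.tolist(), local_error)
```

On the validation data, the per-series error can never be worse than the global one. The price is a much larger number of fits, and a higher risk of overfitting the validation folds when each series is short.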

In [92]:
from sktime.forecasting.compose import ForecastByLevel


tune_by_level = ForecastByLevel(
    forecaster=tuning_model.clone().set_params(n_evals=1),  # clone, so that tuning_model itself is not modified
    groupby="local"
)

tune_by_level.fit(y_train)

tuned_by_level_predictions = tune_by_level.predict(fh=test_timeindex)

[I 2024-08-15 21:26:57,921] A new study created in memory with name: no-name-b0573d4b-6982-43e2-97be-fc34076ba27e
21:26:57 - cmdstanpy - INFO - Chain [1] start processing
21:26:58 - cmdstanpy - INFO - Chain [1] done processing

In [93]:
tune_by_level.forecasters_

Unnamed: 0,Unnamed: 1,forecasters
Agency_01,SKU_01,ForecastByLevel(forecaster=ForecastingOptunaSe...
Agency_01,SKU_02,ForecastByLevel(forecaster=ForecastingOptunaSe...
Agency_01,SKU_03,ForecastByLevel(forecaster=ForecastingOptunaSe...
Agency_01,SKU_04,ForecastByLevel(forecaster=ForecastingOptunaSe...
Agency_01,SKU_05,ForecastByLevel(forecaster=ForecastingOptunaSe...
...,...,...
Agency_60,SKU_05,ForecastByLevel(forecaster=ForecastingOptunaSe...
Agency_60,SKU_07,ForecastByLevel(forecaster=ForecastingOptunaSe...
Agency_60,SKU_23,ForecastByLevel(forecaster=ForecastingOptunaSe...
Agency_60,__total,ForecastByLevel(forecaster=ForecastingOptunaSe...


In [94]:
tune_by_level.get_fitted_params()

IndexError: invalid index to scalar variable.

In [17]:
tune_by_level.forecasters_.forecasters.apply(lambda x: x.forecaster_.best_params_).to_frame("Best params")

Unnamed: 0,Unnamed: 1,Best params
Agency_01,SKU_01,"{'n_changepoints': 16, 'seasonality_mode': 'mu..."
Agency_01,SKU_02,"{'n_changepoints': 10, 'seasonality_mode': 'mu..."
Agency_01,SKU_03,"{'n_changepoints': 6, 'seasonality_mode': 'add..."
Agency_01,SKU_04,"{'n_changepoints': 17, 'seasonality_mode': 'mu..."
Agency_01,SKU_05,"{'n_changepoints': 9, 'seasonality_mode': 'add..."
...,...,...
Agency_60,SKU_05,"{'n_changepoints': 8, 'seasonality_mode': 'mul..."
Agency_60,SKU_07,"{'n_changepoints': 17, 'seasonality_mode': 'ad..."
Agency_60,SKU_23,"{'n_changepoints': 20, 'seasonality_mode': 'ad..."
Agency_60,__total,"{'n_changepoints': 14, 'seasonality_mode': 'mu..."


In [18]:
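# NOTE: plot_australian_tourism_widget is not imported in this notebook; it is
# presumably a plotting helper defined alongside display_hierarchical_timeseries.
plot_australian_tourism_widget(y_train, y_test, {"Simple Prophet": test_predictions, "Tuned model": tuned_by_level_predictions})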


## Reconciliation

Most likely, your forecasts will not be _coherent_ with respect to the hierarchy: the sum of the forecasts of the children will not equal the forecast of their parent.

This means two things:

1. By definition, at least one of them is wrong.
2. The users of the forecasts will not be happy.

In [115]:
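def check_unconsistency(preds):
    """Relative gap between the top-level forecast and the sum of the bottom-level forecasts."""
    # Forecast made directly for the grand total
    total_level_predictions = preds.loc[("__total", "__total")]
    # Sum of the bottom-level (Agency, SKU) forecasts for each time point
    is_bottom_level = preds.index.get_level_values(1) != "__total"
    total_from_bottom_level_predictions = preds.loc[is_bottom_level].groupby(level=-1).sum()

    difference = total_level_predictions - total_from_bottom_level_predictions
    return difference / total_from_bottom_level_predictions

check_unconsistency(tuned_by_level_predictions)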

YearMonth,Volume
2016-07,0.00353
2016-08,0.030269
2016-09,-0.116733
2016-10,0.002905
2016-11,-0.094061
2016-12,0.070343
2017-01,-0.107199
2017-02,-0.041882
2017-03,0.028123
2017-04,0.054637


Fortunately, there are techniques to fix this. We call them _reconciliation_ techniques.

The hierarchy constraints, such as the requirement that the children sum to their parent,
form a set of linear constraints, and there are several strategies to satisfy them:

1. Bottom-up (`bu`): we forecast the series at the lowest level, and then aggregate them to the higher levels.
2. Forecast proportions (`td_fcst`): we use the forecasts at the bottom levels to estimate their proportions with respect to the total, and then multiply these proportions by the total forecast.
3. Orthogonal projection (`ols`): a.k.a. ordinary least squares. The linear constraints are used to build a projection matrix that maps the forecasts onto the hyperplane that satisfies the constraints.
4. Oblique projection (`wls_str`): a weighted least squares projection that accounts for the scale of each series, here through the number of bottom-level series it aggregates, when computing the projection.
5. MinT shrink (`mint`): a more advanced technique that uses the training data to estimate the covariance matrix of the forecast errors, and then uses it to choose the oblique projection that minimizes the mean squared error of the reconciled forecasts.
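To make the projection idea concrete, here is a small NumPy sketch of three of these strategies (bottom-up, OLS, and structural WLS) for a single time point. It is not sktime's implementation. The toy hierarchy, the forecast values, and the summing matrix `S` (one row per series, one column per bottom-level series) are assumptions made for illustration.

```python
import numpy as np

# Toy hierarchy (made up): total -> {A, B}, A -> {A1, A2}, B -> {B1, B2}.
# Rows of S: total, A, B, A1, A2, B1, B2; columns: the four bottom series.
S = np.array(
    [
        [1, 1, 1, 1],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
    ],
    dtype=float,
)

# Incoherent base forecasts for one time point (made-up numbers).
y_hat = np.array([105.0, 48.0, 50.0, 20.0, 30.0, 25.0, 28.0])


def is_coherent(y):
    """True if every series equals the aggregation of the bottom-level values."""
    return bool(np.allclose(y, S @ y[3:]))


def bottom_up(y):
    # Keep only the bottom-level forecasts and aggregate them upwards.
    return S @ y[3:]


def ols(y):
    # Orthogonal projection of y onto the column space of S.
    bottom = np.linalg.solve(S.T @ S, S.T @ y)
    return S @ bottom


def wls_str(y):
    # Weighted projection; each weight is the number of bottom series in a node.
    w_inv = np.diag(1.0 / S.sum(axis=1))
    bottom = np.linalg.solve(S.T @ w_inv @ S, S.T @ w_inv @ y)
    return S @ bottom


for name, method in [("bu", bottom_up), ("ols", ols), ("wls_str", wls_str)]:
    reconciled = method(y_hat)
    print(name, np.round(reconciled, 3), is_coherent(reconciled))
```

Bottom-up discards the upper-level forecasts entirely, while the two projections adjust all levels at once. Forecasts that are already coherent are left unchanged by all three.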


In [114]:
from sktime.transformations.hierarchical.reconcile import Reconciler

reconciler = Reconciler(method="ols") # mint, bu, td_fcst
reconciled_predictions = reconciler.fit_transform(tuned_by_level_predictions)

In [116]:
check_unconsistency(reconciled_predictions)

YearMonth,Volume
2016-07,0.0
2016-08,-4.261373e-16
2016-09,-1.008975e-15
2016-10,4.390313e-16
2016-11,-7.454851e-16
2016-12,5.887017e-16
2017-01,2.602412e-16
2017-02,-3.560775e-16
2017-03,2.086543e-16
2017-04,0.0


## Using pipelines to reconcile

In [117]:
from sktime.forecasting.compose import TransformedTargetForecaster

model_with_reconciler = TransformedTargetForecaster(
    steps=[
        ("forecaster", model),
        ("reconciler", Reconciler(method="ols"))
    ]
)

model_with_reconciler.fit(y_train)
reconciled_predictions = model_with_reconciler.predict(fh=test_timeindex)

22:09:43 - cmdstanpy - INFO - Chain [1] start processing
22:09:43 - cmdstanpy - INFO - Chain [1] done processing

In [17]:
from sktime.transformations.hierarchical.reconcile import Reconciler


model_with_reconciler = tune_by_level.clone() * Reconciler(method="ols")

model_with_reconciler.fit(y_train)
reconciled_predictions = model_with_reconciler.predict(fh=test_timeindex)

[I 2024-08-14 21:07:44,394] A new study created in memory with name: no-name-11d12df0-4d5b-47ab-a4e0-7b8f7b320d83
21:07:44 - cmdstanpy - INFO - Chain [1] start processing
21:07:44 - cmdstanpy - INFO - Chain [1] done processing

In [18]:
plot_australian_tourism_widget(y_train, y_test, {"Simple Prophet": test_predictions, "Tuned model": tuned_by_level_predictions,
                                                 "Reconciled": reconciled_predictions})



In [10]:
from sktime.performance_metrics.forecasting import MeanSquaredScaledError

metric = MeanSquaredScaledError(multilevel="raw_values")
metric(y_train=y_train, y_true=y_test, y_pred=test_predictions.loc[y_test.index])

Unnamed: 0,Unnamed: 1,MeanSquaredScaledError
ACT,Business,0.278996
ACT,Holiday,2.33236
ACT,Other,0.676287
ACT,Visiting,0.404921
ACT,__total,0.75346
New South Wales,Business,5.327788
New South Wales,Holiday,1.010906
New South Wales,Other,3.258773
New South Wales,Visiting,2.135785
New South Wales,__total,9.278747


In [11]:
metric = MeanSquaredScaledError(multilevel="uniform_average")
metric(y_train=y_train, y_true=y_test, y_pred=test_predictions.loc[y_test.index])

1.7884450867238293
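For reference, the scaled error used above can be sketched for a single series as follows. This is our reading of the usual definition (test-set MSE divided by the in-sample MSE of a naive forecast with lag `sp`), not a copy of sktime's code; the function name and the numbers are made up.

```python
import numpy as np


def mean_squared_scaled_error(y_true, y_pred, y_train, sp=1):
    """Test MSE scaled by the in-sample MSE of a lag-`sp` naive forecast."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    y_train = np.asarray(y_train, dtype=float)

    mse = np.mean((y_true - y_pred) ** 2)
    # In-sample error of predicting each training point by the one sp steps earlier.
    naive_mse = np.mean((y_train[sp:] - y_train[:-sp]) ** 2)
    return float(mse / naive_mse)


score = mean_squared_scaled_error(
    y_true=[8.0, 10.0], y_pred=[10.0, 10.0], y_train=[1.0, 3.0, 5.0, 7.0]
)
print(score)
```

Values below 1 mean the forecast beats the in-sample naive benchmark on this scale. Because the error is scaled per series, scores of series with very different volumes can be averaged meaningfully.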

In [33]:
from sktime.performance_metrics.forecasting import MeanSquaredScaledError

metric = MeanSquaredScaledError(multilevel="uniform_average_time")
(metric(y_train=y_train, y_true=y_test, y_pred=tuned_by_level_predictions.loc[y_test.index]),
 metric(y_train=y_train, y_true=y_test, y_pred=Reconciler(method="ols").fit_transform(tuned_by_level_predictions.loc[y_test.index])))

(1.275161072258068, 1.2747852332586802)

In [12]:
metric = MeanSquaredScaledError(multilevel="uniform_average_time")
metric(y_train=y_train, y_true=y_test, y_pred=test_predictions.loc[y_test.index])

0.6060018160798033



In [41]:
best_tuning_forecaster = tuning_model.best_forecaster_

# Forecast the same monthly test horizon as the other models
new_test_predictions = best_tuning_forecaster.predict(fh=test_timeindex)

plot_australian_tourism_widget(y_train, y_test, {"Simple Prophet": test_predictions, "Tuned Prophet": new_test_predictions})
