Is it possible to use GridSearchCV for hierarchical data with Reconciler? #4101
Replies: 6 comments 28 replies
-
Strange, this should work. In theory (and this is my expectation), you should be able to combine the above-mentioned wrappers freely: metrics, grid search, reconciler, and individual forecasters should all work out of the box with hierarchical data. There was a bug with

In case something breaks for you, we would appreciate a short, self-contained code example with dummy data that reproduces the error you are experiencing, so we can debug. FYI, some people who come to mind as having worked recently on hierarchical functionality: @ciaran-g, @danbartl, @KishManani.
-
Regarding your second question:
If you want to do a grid search for each series separately, you need to wrap the grid search in If you have the I'm not sure whether there is currently a way to both (a) fit grid search parameter by series and (b) tune by using metrics that are computed after reconciliation. It will be possible though once @VyomkeshVyas finishes the
-
Here is an example of my use case:

from sktime.utils._testing.hierarchical import _make_hierarchical
from sktime.forecasting.exp_smoothing import ExponentialSmoothing
from sktime.forecasting.trend import PolynomialTrendForecaster
from sktime.forecasting.model_selection import ForecastingGridSearchCV, ExpandingWindowSplitter
from sktime.transformations.hierarchical.aggregate import Aggregator
from sktime.forecasting.reconcile import ReconcilerForecaster
from sktime.forecasting.compose import TransformedTargetForecaster
y = _make_hierarchical()
agg = Aggregator()
y_agg = agg.fit_transform(y)
param_grid = [{"forecaster": [ExponentialSmoothing()],
"forecaster__trend": ['add','mul']
},
{"forecaster": [PolynomialTrendForecaster()],
"forecaster__degree": [1,2]}
]
pipe = TransformedTargetForecaster(steps=[
("forecaster", ExponentialSmoothing())])
N_cv_fold = 2
step_cv = 1
fh = [1,2]
initial_window_cv_len = len(y_agg.index.get_level_values(2).unique()) - (N_cv_fold - 1) * step_cv - fh[-1]
cv = ExpandingWindowSplitter(
initial_window = initial_window_cv_len,
step_length = step_cv,
fh = fh)
reconciler = ReconcilerForecaster(pipe, method="ols")
gscv = ForecastingGridSearchCV(
forecaster=reconciler,
param_grid=param_grid,
cv=cv,
n_jobs=-1,
verbose = 1
)
gscv.fit(y_agg)

However, in this case, as you said, the grid search computes the aggregate score and tries to find a single parameter setting that is best for all series together, not the best forecaster for each series in the hierarchy (which is my goal).

Another approach would be to use a for loop over the most granular hierarchy index and store the best forecaster for each series in a dictionary (although this does not use the benefits of vectorization), as in the code below. It would then be ideal if we could configure the forecasters for each series, for example in the dataframe shown in the image below, taken from a tutorial on the sktime site:

param_grid = [{"forecaster": [ExponentialSmoothing()],
"forecaster__trend": ['add','mul']
},
{"forecaster": [PolynomialTrendForecaster()],
"forecaster__degree": [1,2]}
]
N_cv_fold = 2
step_cv = 1
fh = [1,2]
initial_window_cv_len = len(y_agg.index.get_level_values(2).unique()) - (N_cv_fold - 1) * step_cv - fh[-1]
hierarchy = list(set(list(zip(y_agg.index.get_level_values(0),
y_agg.index.get_level_values(1)))))
forecaster_dict = {}
for ts in hierarchy:
    y_ts = y_agg[(y_agg.index.get_level_values(0) == ts[0]) &
                 (y_agg.index.get_level_values(1) == ts[1])]
    pipe = TransformedTargetForecaster(steps=[
        ("forecaster", ExponentialSmoothing())])
    cv = ExpandingWindowSplitter(
        initial_window = initial_window_cv_len,
        step_length = step_cv,
        fh = fh)
    gscv = ForecastingGridSearchCV(
        forecaster=pipe,
        param_grid=param_grid,
        cv=cv,
        n_jobs=-1,
        verbose = 1
    )
    gscv.fit(y_ts)
    forecaster_dict[ts] = gscv.best_forecaster_
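
The initial window formula above is chosen so that the splitter produces exactly N_cv_fold folds. A small pure-Python check of that arithmetic is sketched here. The helper expanding_window_folds is a hypothetical, simplified stand-in for an expanding window splitter (not sktime's implementation), and the series length of 12 is an assumed value.

```python
def expanding_window_folds(n_timepoints, initial_window, step_length, fh):
    """Yield (train_positions, test_positions) for a simplified expanding window.

    Hypothetical helper for illustration only; it is not sktime's splitter.
    """
    train_end = initial_window  # exclusive end of the first training window
    # Keep going while the furthest forecast step still lies inside the series.
    while train_end - 1 + max(fh) <= n_timepoints - 1:
        train = list(range(train_end))
        test = [train_end - 1 + h for h in fh]
        yield train, test
        train_end += step_length


n_timepoints = 12  # assumed series length
N_cv_fold, step_cv, fh = 2, 1, [1, 2]

# Same formula as in the post: leave room for the last fold's horizon
# and for the (N_cv_fold - 1) steps taken between folds.
initial_window_cv_len = n_timepoints - (N_cv_fold - 1) * step_cv - fh[-1]

folds = list(
    expanding_window_folds(n_timepoints, initial_window_cv_len, step_cv, fh)
)
```

With these values the initial window is 9 points, and the last fold's test positions end at the final observation, so no data at the end of the series goes unused.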
-
Hm, what you did here is very interesting:

best_forecasters = {}
for ts in gscv_bylevel.get_fitted_params()["forecasters"].index:
    best_forecasters[ts] = gscv_bylevel.get_fitted_params()["forecasters"] \
        .loc[(ts[0], ts[1]),'forecasters'] \
        .get_fitted_params()["forecaster__best_forecaster__forecaster"]
best_forecasters

That is, the data frame is filled with nested parameters if called with the name of the nested parameter. Perhaps this should just happen by default if you call
-
Can you explain what you mean here by "it would be ideal"?
-
Hello, I have tried to reproduce the code in this thread (see below) and had an issue when trying to access the

I have come up with a "partial solution", which is fitting the forecaster again, in addition to the previous reconciler fit. That is: Step 1: Calling

I have tried this with sktime 0.17.1, 0.17.2 and the latest 0.18 versions, and all had the same issue. Is there any way to access

Has anyone been able to execute this code without errors? I understand the solution to my question is what is proposed in the thread, but I am having the issue listed above. The code I have tried is the one below:

from sktime.utils._testing.hierarchical import _make_hierarchical
from sktime.forecasting.exp_smoothing import ExponentialSmoothing
from sktime.forecasting.trend import PolynomialTrendForecaster
from sktime.forecasting.model_selection import ForecastingGridSearchCV, ExpandingWindowSplitter
from sktime.transformations.hierarchical.aggregate import Aggregator
from sktime.forecasting.reconcile import ReconcilerForecaster
from sktime.forecasting.compose import TransformedTargetForecaster
from sktime.forecasting.compose import ForecastByLevel
y = _make_hierarchical()
agg = Aggregator()
y_agg = agg.fit_transform(y)
param_grid = [{"forecaster": [ExponentialSmoothing()],
"forecaster__trend": ['add','mul']
},
{"forecaster": [PolynomialTrendForecaster()],
"forecaster__degree": [1,2]}
]
pipe = TransformedTargetForecaster(steps=[
("forecaster", ExponentialSmoothing())])
N_cv_fold = 2
step_cv = 1
fh = [1,2]
initial_window_cv_len = len(y_agg.index.get_level_values(2).unique()) - (N_cv_fold - 1) * step_cv - fh[-1]
cv = ExpandingWindowSplitter(
initial_window = initial_window_cv_len,
step_length = step_cv,
fh = fh)
gscv = ForecastingGridSearchCV(
forecaster=pipe,
param_grid=param_grid,
cv=cv,
n_jobs=-1,
verbose = 1
)
gscv_bylevel = ForecastByLevel(gscv, 'local')
reconciler = ReconcilerForecaster(gscv_bylevel, method="ols")
reconciler.fit(y_agg)
# reconciler.forecaster.fit(y_agg) # If this line is uncommented, the error disappears
reconciler.forecaster.get_fitted_params() # here we get the error when accessing ".get_fitted_params()"

Thank you.
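
One possible explanation, offered as an assumption and not as a confirmed diagnosis: wrappers in the scikit-learn style usually leave the constructor argument untouched and fit a clone stored under a trailing-underscore attribute. If ReconcilerForecaster follows that convention, reconciler.forecaster would be the unfitted blueprint and reconciler.forecaster_ the fitted copy. The toy classes below are hypothetical and only illustrate the convention; they are not sktime code.

```python
import copy


class ToyForecaster:
    """Hypothetical forecaster that remembers the mean of its training data."""

    def __init__(self):
        self.is_fitted = False

    def fit(self, y):
        self.mean_ = sum(y) / len(y)
        self.is_fitted = True
        return self

    def get_fitted_params(self):
        if not self.is_fitted:
            raise RuntimeError("forecaster has not been fitted yet")
        return {"mean": self.mean_}


class ToyWrapper:
    """Hypothetical wrapper that fits a clone, leaving the original untouched."""

    def __init__(self, forecaster):
        self.forecaster = forecaster  # blueprint, never fitted by the wrapper

    def fit(self, y):
        # The fitted copy lives under the trailing-underscore name.
        self.forecaster_ = copy.deepcopy(self.forecaster).fit(y)
        return self


wrapper = ToyWrapper(ToyForecaster()).fit([1.0, 2.0, 3.0])

blueprint_fitted = wrapper.forecaster.is_fitted        # blueprint stays unfitted
fitted_params = wrapper.forecaster_.get_fitted_params()  # fitted clone works
```

Under this convention, refitting the blueprint by hand (as in the commented-out line above) would make the error disappear, but the object inspected afterwards would be a separate fit from the one the wrapper used.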
-
Hi,
I have a hierarchical time series dataframe, and for each time series I want to find the best forecaster from a list by running a GridSearchCV over a parameter grid. Then I want to reconcile the result with the ReconcilerForecaster class.

If I insert the reconciler forecaster into the forecaster parameter of ForecastingGridSearchCV, I get a message that this is not supported.

Another way, I think, is to fit the ForecastingGridSearchCV for each series separately. But then how can I assign the best forecaster I found for each time series to the ReconcilerForecaster?