[ENH] `HierarchyEnsembleForecaster` for level- or node-wise application of forecasters on panel/hierarchical data #3905

VyomkeshVyas · 2022-12-08T15:39:21Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

`HierarchyEnsembleForecaster` aggregates panel-type data and applies different univariate forecasters to the aggregated data, by hierarchical level or node. For aggregation, it uses sktime's built-in `Aggregator` class.

Does your contribution introduce a new dependency? If yes, which one?

No

What should a reviewer concentrate their feedback on?

A reviewer should concentrate their feedback on the forecaster's ability to:

- Fit a separate forecaster on each hierarchical level of the panel/hierarchical data, with and without exogenous data.
- Fit a separate forecaster on each hierarchical node of the panel/hierarchical data, with and without exogenous data.
- Fit a 'default' forecaster (if passed as argument) on nodes/levels not mentioned in the 'forecasters' argument.
- Make predictions by each hierarchical level of the fitted aggregated panel data.
- Make predictions by each hierarchical node of the fitted aggregated panel data.

Any other comments?

PR checklist

For all contributions

I've added myself to the list of contributors.
Optionally, I've updated sktime's CODEOWNERS to receive notifications about future changes to these files.
I've added unit tests and made sure they pass locally.
The PR title starts with either [ENH], [MNT], [DOC], or [BUG] indicating whether the PR topic is related to enhancement, maintenance, documentation, or bug.

For new estimators

I've added the estimator to the online documentation.
I've updated the existing example notebooks or provided a new one to showcase how my estimator works.


fkiraly · 2022-12-10T23:21:54Z

neat! let us know when you would like a review

VyomkeshVyas · 2022-12-20T12:00:46Z

> neat! let us know when you would like a review

Hi @fkiraly, this PR is ready for review now. Thanks.

fkiraly · 2022-12-20T13:33:26Z

excellent! Will start the CI and we'll see if anything fails.

fkiraly · 2022-12-22T00:05:16Z

the doctest is failing, since it checks the actual printout against the expected one.

When you run the line
>>> forecaster.fit(y, fh=[1, 2, 3])

the printout starts with HierarchyEnsembleForecaster(

You can catch that by adding the line

HierarchyEnsembleForecaster(...) (like this, with three dots, and without the three >) directly after it.
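The doctest advice above can be sketched with the standard library alone. `ToyForecaster` is a hypothetical stand-in, not the estimator from this PR, and the sketch assumes doctest's `ELLIPSIS` option is enabled, which it sets explicitly here.

```python
import doctest


class ToyForecaster:
    """Hypothetical stand-in for an estimator with a long repr."""

    def __init__(self, by="level", default=None):
        self.by = by
        self.default = default

    def fit(self, y, fh=None):
        # fit returns self, so an interactive session prints the repr
        return self

    def __repr__(self):
        return f"ToyForecaster(by={self.by!r}, default={self.default!r})"


# Docstring-style example: the expected-output line uses "..." in place
# of the full parameter printout.
DOC = """
>>> forecaster = ToyForecaster(by="node")
>>> forecaster.fit([1, 2, 3], fh=[1, 2, 3])
ToyForecaster(...)
"""

test = doctest.DocTestParser().get_doctest(
    DOC, {"ToyForecaster": ToyForecaster}, "fit_example", None, 0
)
# ELLIPSIS lets "..." match any substring of the actual output
runner = doctest.DocTestRunner(optionflags=doctest.ELLIPSIS, verbose=False)
results = runner.run(test)
```

Without the expected-output line, the second example would be compared against empty output and fail, which matches the failure described above.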

fkiraly

Great, looks good!

Some small things:

can you kindly fix the doctest? Explanation is above.
there is a merge conflict with the contributors file, kindly fix

fkiraly · 2022-12-23T02:31:35Z

hm, these look like genuine failures when you are trying to hash your node dict
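For context, a minimal sketch of how an "unhashable type" error typically arises in Python. The node values and the helper `try_as_key` are hypothetical, and this is not the code from the PR: it only shows that mutable containers such as lists and dicts cannot be dict keys, whereas tuples of hashable values can.

```python
# Hypothetical node specification; names are illustrative only.
node_as_list = ["x", "y"]


def try_as_key(node):
    """Return True if `node` can be used as a dict key, False otherwise."""
    try:
        {node: "some_forecaster"}
    except TypeError:
        # raised for unhashable types such as list or dict
        return False
    return True


list_ok = try_as_key(node_as_list)          # lists are mutable, hence unhashable
dict_ok = try_as_key({"node": ("x", "y")})  # dicts are unhashable as well
tuple_ok = try_as_key(tuple(node_as_list))  # tuples of hashable items are fine
```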

fkiraly · 2022-12-23T02:31:48Z

recommendation: run your tests locally!


VyomkeshVyas · 2023-01-10T12:17:03Z

> recommendation: run your tests locally!

It's strange, but the tests didn't fail locally. Fixed the docstring error, but I am still not able to recreate the unhashable type failure.

fkiraly · 2023-01-10T12:59:27Z

how odd. Should we try to run them again? I'll restart.

There is also a merge conflict in the contributors file, kindly update from main.


fkiraly

These are genuine failures now.

Have you run check_estimator locally and tried to debug?

I would recommend that, see here:
https://www.sktime.org/en/stable/developer_guide/add_estimators.html

VyomkeshVyas · 2023-01-20T21:01:44Z

> These are genuine failures now.
>
> Have you run check_estimator locally and tried to debug?
>
> I would recommend that, see here: https://www.sktime.org/en/stable/developer_guide/add_estimators.html

Hi @fkiraly, I have updated the code and the majority of the issues are fixed. But some tests (around 12) are still failing due to a common error: ValueError('Length of names must match number of levels in MultiIndex.').

Given how the check_estimator tests are currently designed, this error is inevitable for particular test instances. Here is why I think so.
I'll give a very brief overview of HierarchyEnsembleForecaster. The hier-ensm forecaster first aggregates the data and then fits a separate forecaster either by level or by node. For that, it takes three arguments: 'by', 'forecasters' and 'default'.
'by' can be 'level' or 'node'; 'forecasters' can be a list of tuples (name, BaseForecaster, level/node) or a single BaseForecaster; and 'default' is a BaseForecaster (None if not specified).

The above error occurs in all test instances with by = 'node'. Ideally, the length of each node passed in the 'forecasters' argument should be N-1, where N is the number of levels of the MultiIndex data (the last level is assumed to be the time index, hence N-1). The way the tests are designed, the hier-ensm forecaster requires nodes of different lengths in the 'forecasters' argument for different categories of data (e.g., univariate and panel data). Since nodes can only be specified once before running the tests, the tests fail for the other categories.
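The N-1 rule described above can be sketched in plain Python. The helper names and index level names are hypothetical and chosen for illustration; this is not the forecaster's actual validation code.

```python
def expected_node_length(index_names):
    """Number of hierarchy levels, assuming the last index level is time."""
    return len(index_names) - 1


def node_matches(node, index_names):
    """Check whether a node tuple has one entry per non-time index level."""
    return len(node) == expected_node_length(index_names)


# hypothetical index level names for two categories of test data
hierarchical_index = ["A", "B", "C", "time"]  # N = 4, nodes need length 3
panel_index = ["A", "time"]                   # N = 2, nodes need length 1

node = ("x", "y", "z")  # a node specification fixed once in the test parameters

fits_hierarchical = node_matches(node, hierarchical_index)
fits_panel = node_matches(node, panel_index)
```

A node that fits the hierarchical case cannot fit the panel case at the same time, which is why a single fixed test parameter set fails for some data categories.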

Could you please give some suggestions on how to handle this issue?


fkiraly · 2023-01-22T08:42:41Z

@VyomkeshVyas, sorry for the delay in the reply. I was fixing some merge conflicts and doctest errors in this PR so the tests would run through to the failures that you are referring to (hope that's ok). Once I can see the failures, I'll have a look.

fkiraly · 2023-01-23T00:13:27Z

> Could you please give some suggestions on how to handle this issue?

Thanks for the explanation!
I think this is precisely the issue. It's a specific instance where the forecaster requires assumptions on the data format that are stronger than the input contract.

Indeed, as the test framework is designed, all inputs are passed to all forecasters, which in this case upsets the new forecaster, as its parameters need to match the levels.

I see multiple solutions:

1. test the node case separately, not using the default framework. I.e., skip the appropriate framework tests by not including the parameters in get_test_params or skipping tests via tests/_config, and add manual tests instead
2. modify the forecaster so it does something sensible for data that doesn't match the node specification (e.g., nodes not present are simply ignored or similar)
3. an extension to the testing framework or estimators that allows compatibility checks between parameters and data - I have been thinking about this, but it would require a bigger design (e.g., STEP), written by me, you or someone else, so it may be overkill for the problem at hand

VyomkeshVyas · 2023-01-24T11:13:15Z

@fkiraly Thanks for the suggestions! That's very helpful.
I have added new functionality for the case where the length of an individual node passed does not match the number of levels of the multi-index data. The fix seems to work well, as all the tests pass now, but I would like your opinion on whether that's the right solution. For example, if I have data with multi-index (A, B, C, D), with D being the time index, and the 'forecasters' passed is ('f', F, [(x, y)]), then the forecaster F will be fitted to the data with multi-index (A, B) == (x, y), which previously would have required (A, B, C) == (x, y, z).
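A minimal sketch of the matching rule in this example, in plain Python. The helper `rows_for_node` and the index values are hypothetical; the PR's implementation works on the aggregated pandas data and may differ in detail.

```python
def rows_for_node(index_tuples, node):
    """Select index entries whose leading levels equal `node`.

    The last entry of each index tuple is treated as the time point.
    """
    k = len(node)
    return [idx for idx in index_tuples if idx[:k] == tuple(node)]


# hypothetical index entries with levels (A, B, C, D), D being the time point
index_tuples = [
    ("x", "y", "z1", 0),
    ("x", "y", "z2", 0),
    ("x", "w", "z1", 0),
    ("u", "y", "z1", 0),
]

short_node = rows_for_node(index_tuples, ("x", "y"))       # matches on (A, B)
full_node = rows_for_node(index_tuples, ("x", "y", "z1"))  # matches on (A, B, C)
```

With the shorter node, every entry below (x, y) is selected, so one forecaster covers all nodes that extend the given prefix.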

VyomkeshVyas · 2023-01-24T12:37:34Z

I forgot to add aggregation levels in X, y in update and predict functions. I am working on it.

fkiraly · 2023-01-24T14:42:09Z

i.e., you went with option 2, right?
Makes sense, with the "ignore nodes" option.

> I forgot to add aggregation levels in X, y in update and predict functions. I am working on it.

Let us know when you think this is ready.

VyomkeshVyas · 2023-01-24T16:14:25Z

> i.e., you went with option 2, right? Makes sense, with the "ignore nodes" option.

yea, with option 2. But instead of ignoring the mismatched node, I am grouping all the nodes which are supersets of the mismatched node.

> > I forgot to add aggregation levels in X, y in update and predict functions. I am working on it.
>
> Let us know when you think this is ready.

It's ready now. Thanks again.

fkiraly

Excellent! Impressive for a first contribution!

To be frank, when I saw the first version and the specs, I thought, "well this might end in sweat and tears", because it was very ambitious - hierarchical data, _HeterogenousMetaEstimator which is not easy to inherit from, the issue with "data must fit the parameters" which I also didn't fully see how to solve, a docstring which is difficult to write with formal accuracy, etcetera.

But none of that blocked you!
Absolutely impressive!!

Welcome to sktime, @VyomkeshVyas!
Way to make an entrance.

VyomkeshVyas · 2023-01-27T15:04:30Z

@fkiraly Thank you very much !!
It's been a great learning experience for me and I totally enjoyed it. A big shout out to @ciaran-g for continuous guidance, without which it might actually have ended in "sweat and tears".

fkiraly · 2023-01-27T17:13:22Z

> A big shout out to @ciaran-g for continuous guidance, without which it might actually have ended in "sweat and tears".

Well, that's why sktime is a community of contributors - to help reach the best of one's potential :-)

VyomkeshVyas added 4 commits December 8, 2022 14:11

added hier-ensm frcstr

c110b41

Merge branch 'hier-ensm' of https://github.com/VyomkeshVyas/sktime in…

f215034

…to hier-ensm

updated exogenous var functionality

477683b

updated contributors

2644d5a

added unit test and updated get_test_params()

6efa86c

VyomkeshVyas marked this pull request as ready for review December 13, 2022 20:12

VyomkeshVyas requested review from fkiraly, aiwalter and GuzalBulatova as code owners December 13, 2022 20:12

updated level == 1 functionality

ef5a605

fkiraly requested changes Dec 22, 2022


VyomkeshVyas and others added 2 commits December 22, 2022 16:11

fixed doctest and contributors file

ca475a2

Merge branch 'main' into pr/3905

ec2479b

fkiraly changed the title ~~[ENH] HierarchyEnsembleForecaster() for panel/hierarchical data~~ [ENH] HierarchyEnsembleForecaster for panel/hierarchical data Dec 31, 2022

VyomkeshVyas added 2 commits January 10, 2023 12:08

docstring error fixed

eb58435

Merge branch 'hier-ensm' of https://github.com/VyomkeshVyas/sktime in…

8f2d349

…to hier-ensm

VyomkeshVyas closed this Jan 10, 2023

VyomkeshVyas force-pushed the hier-ensm branch from 8f2d349 to c2eb974 Compare January 10, 2023 13:04

VyomkeshVyas added 3 commits January 10, 2023 13:31

Add files via upload

f10408c

added hier-ensm and init files

Add files via upload

0ce5b95

added test file

updated src file

8a9dae6

VyomkeshVyas reopened this Jan 10, 2023

VyomkeshVyas added 3 commits January 10, 2023 14:46

Merge branch 'hier-ensm' of https://github.com/VyomkeshVyas/sktime in…

67ab1e4

…to hier-ensm

fixed unhashable error

dbaf4b3

some minor updates

f053f0e

fkiraly reviewed Jan 11, 2023

docs/source/api_reference/forecasting.rst (outdated, resolved)

fkiraly reviewed Jan 11, 2023

sktime/forecasting/compose/_hierarchy_ensemble.py (outdated, resolved)

fkiraly requested changes Jan 11, 2023

fkiraly mentioned this pull request Jan 13, 2023

[DOC] add newer features to the hierarchical forecasting tutorial - grid search, metrics, ensembles, etc #4106

Open

fixed test issues

b709a3f

VyomkeshVyas and others added 8 commits January 20, 2023 21:17

conflict resolve

14f3fd6

Merge branch 'main' into pr/3905

9dfd7c2

fix merge conflict

815bee8

Update .all-contributorsrc

b709e84

Update .all-contributorsrc

4ee519b

fix typo in doctest

254f23b

docsttring error fixed

d9db84b

Merge branch 'hier-ensm' of https://github.com/VyomkeshVyas/sktime in…

e098022

…to hier-ensm

Node length issue fixed

0ae76eb

Updated update and predict

6883982

fkiraly approved these changes Jan 27, 2023


fkiraly changed the title ~~[ENH] HierarchyEnsembleForecaster for panel/hierarchical data~~ [ENH] HierarchyEnsembleForecaster for level- or node-wise application of forecasters on panel/hierarchical data Jan 28, 2023

fkiraly merged commit eb2ca69 into sktime:main Jan 28, 2023
