[ENH] extending reducers to hierarchical data #2396

danbartl · 2022-04-05T19:45:37Z

This is the final step towards enabling global forecasting through an efficient application of make_reduction, using the new transformers argument.

I still plan to do some refactoring, add checks, etc., but I am posting this now to discuss the implementation strategy; see below.

Example use case

# -*- coding: utf-8 -*-
"""Test extraction of features across (shifted) windows."""
__author__ = ["danbartl"]

import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline

from sktime.datasets import load_airline
from sktime.datatypes import get_examples
from sktime.forecasting.compose import make_reduction
from sktime.forecasting.model_selection import temporal_train_test_split
from sktime.transformations.series.summarize import WindowSummarizer

# Load data that will be the basis of tests
y = load_airline()
y_pd = get_examples(mtype="pd.DataFrame", as_scitype="Series")[0]
y_series = get_examples(mtype="pd.Series", as_scitype="Series")[0]
y_multi = get_examples(mtype="pd-multiindex", as_scitype="Panel")[0]
# y_train will be a univariate data set
y_train, y_test = temporal_train_test_split(y)

# Create Panel sample data
mi = pd.MultiIndex.from_product([[0], y.index], names=["instances", "timepoints"])
y_group1 = pd.DataFrame(y.values, index=mi, columns=["y"])

mi = pd.MultiIndex.from_product([[1], y.index], names=["instances", "timepoints"])
y_group2 = pd.DataFrame(y.values, index=mi, columns=["y"])

y_grouped = pd.concat([y_group1, y_group2])

# Get different WindowSummarizer parameter sets
kwargs = WindowSummarizer.get_test_params()[0]
kwargs_alternames = WindowSummarizer.get_test_params()[1]
kwargs_variant = WindowSummarizer.get_test_params()[2]

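# Wrap the regressor in a scikit-learn pipeline
regressor = make_pipeline(
    RandomForestRegressor(),
)

# Reduce forecasting to tabular regression, passing the new transformers argument
forecaster = make_reduction(
    regressor,
    scitype="tabular-regressor",
    transformers=[WindowSummarizer(**kwargs)],
    window_length=10,
)

# Fit on the panel data and predict one step ahead
forecaster.fit(y_grouped, fh=1)

y_pred = forecaster.predict(fh=1)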
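To make the idea of global forecasting via reduction concrete, here is a minimal sketch that uses only pandas and NumPy. It is not the sktime implementation: the toy panel, the single lag feature, and the least-squares model are all assumptions chosen for illustration. It builds the lag feature separately for each instance, pools the rows of all instances into one table, fits one model on that table, and then forecasts one step ahead per instance.

```python
import numpy as np
import pandas as pd

# Hypothetical toy panel: two instances, eight time points each.
idx = pd.MultiIndex.from_product(
    [[0, 1], range(8)], names=["instances", "timepoints"]
)
values = np.concatenate([np.arange(1.0, 9.0), np.arange(10.0, 18.0)])
y_panel = pd.DataFrame({"y": values}, index=idx)

# Step 1: build the lag feature per instance, so the first row of
# instance 1 does not see the last value of instance 0.
lag_1 = y_panel.groupby(level="instances")["y"].shift(1)
table = pd.DataFrame({"lag_1": lag_1, "target": y_panel["y"]}).dropna()

# Step 2: fit ONE model on the pooled rows of all instances
# (ordinary least squares stands in for any tabular regressor).
X = np.column_stack([table["lag_1"].to_numpy(), np.ones(len(table))])
coef, *_ = np.linalg.lstsq(X, table["target"].to_numpy(), rcond=None)

# Step 3: one-step-ahead forecast per instance from its last observation.
last_obs = y_panel.groupby(level="instances")["y"].last()
y_next = coef[0] * last_obs + coef[1]
```

Both toy series increase by one per step, so the pooled model recovers a slope and an intercept of approximately one, and the forecasts continue each series from its own last value.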

Open questions:

What kind of arguments do we want to accept in transformers? So far I can only think of WindowSummarizer, but we could of course apply any function that calls for a grouped application. Currently only the first transformer in the list is applied (I will extend this once we have resolved the open implementation question).
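To illustrate what "grouped application" means in this question, here is a minimal pandas-only sketch. It does not use WindowSummarizer, and the helper name grouped_window_features and the feature names are hypothetical. It computes a lag and a shifted rolling mean separately within each instance of a pd-multiindex frame.

```python
import numpy as np
import pandas as pd

# Hypothetical toy panel: two instances, six time points each.
idx = pd.MultiIndex.from_product(
    [[0, 1], range(6)], names=["instances", "timepoints"]
)
y_panel = pd.DataFrame({"y": np.arange(12, dtype=float)}, index=idx)


def grouped_window_features(df, lag=1, window=3):
    """Compute a lag and a shifted rolling mean within each instance."""
    grouped = df.groupby(level="instances")["y"]
    out = pd.DataFrame(index=df.index)
    # Lagged value; NaN at the start of every instance.
    out[f"y_lag_{lag}"] = grouped.shift(lag)
    # Mean over `window` values ending `lag` steps before the current row.
    out[f"y_mean_{lag}_{window}"] = grouped.transform(
        lambda s: s.shift(lag).rolling(window).mean()
    )
    return out


features = grouped_window_features(y_panel)
```

Because the shift and the rolling window run inside each group, the first rows of instance 1 are NaN instead of borrowing values from the end of instance 0.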

What is the current best practice for applying an Imputer across the different y time series grouped together in pd-multiindex? Should that also be covered here?
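As one possible point of reference for this question, the sketch below uses plain pandas rather than the sktime Imputer, and forward-filling is only an assumed imputation strategy. It fills gaps within each instance, so values are never carried over from one series into the next.

```python
import numpy as np
import pandas as pd

# Hypothetical toy panel with gaps: two instances, four time points each.
idx = pd.MultiIndex.from_product(
    [[0, 1], range(4)], names=["instances", "timepoints"]
)
y_gappy = pd.DataFrame(
    {"y": [1.0, np.nan, 3.0, np.nan, np.nan, 20.0, np.nan, 40.0]},
    index=idx,
)

# Forward-fill within each instance only.
y_imputed = y_gappy.groupby(level="instances").ffill()

# For comparison: an ungrouped fill leaks the last value of instance 0
# into the first row of instance 1.
y_leaky = y_gappy.ffill()
```

In the grouped result the leading gap of instance 1 stays missing, whereas the ungrouped fill replaces it with a value from instance 0.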

fkiraly · 2022-04-05T19:57:08Z

The notebook is failing. Is this related to the bug in get_time_index that was fixed in #2380?


fkiraly and others added 2 commits April 3, 2022 18:26

fixed get_time_index

ddc5ae2

ReducerCheck

7ef15d1

danbartl requested review from fkiraly, aiwalter and TonyBagnall as code owners April 5, 2022 19:45

danbartl added 11 commits April 6, 2022 16:10

Merge https://github.com/alan-turing-institute/sktime into make_reduction_hier

cae72a3

Merge commit 'refs/pull/2380/head' of https://github.com/alan-turing-institute/sktime into make_reduction_hier

0008e3a

GetTimeIndexFix

6bd4232

ReworkedChanges

6187cdd

ReworkedChanges

6cf3acf

GeneralRework

0c3bbca

FixingBugs

7fd1ff3

Merge https://github.com/alan-turing-institute/sktime into make_reduction_hier

cfc2549

Alltestspassed

8a0d74d

Merge https://github.com/alan-turing-institute/sktime into make_reduction_hier

1a7224c

All

c1a6776

danbartl requested a review from mloning as a code owner April 9, 2022 15:56

danbartl added 5 commits April 9, 2022 17:57

All

1f3bd57

FixedGroupingAgain

1407ac7

Merge https://github.com/alan-turing-institute/sktime into make_reduction_hier

fb44131

FixedTages

be8656d

MovedOneTag

2f52c0a

fkiraly changed the title from "ReducerCheck" to "[ENH] extending reducers to hierarchical data" Apr 10, 2022

TestFix

47b2732

fkiraly merged commit 999810f into sktime:main Apr 10, 2022

This was referenced Apr 14, 2022

[ENH] window summaries and lagged reducer #1612

Closed

[ENH] design make_reduction to include window summaries, specify end state #1685

Closed
