[ENH] by-horizon forecaster, for different estimator/parameter per horizon #4811

Merged
merged 7 commits into main from fhplex on Jul 12, 2023

Conversation

fkiraly
Collaborator

@fkiraly fkiraly commented Jul 2, 2023

Fixes #3018

Implements a forecaster, FhPlexForecaster, that allows specifying different parameters per forecasting horizon.

To specify different forecasters, combine with MultiplexForecaster or MultiplexTransformer.
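The by-horizon idea can be sketched without sktime. The following is only a toy illustration, not the FhPlexForecaster implementation: `ToyNaive` and `ByHorizon` are hypothetical names, and the assumption is that one copy of the wrapped forecaster is fitted per horizon, each with its own parameter overrides.

```python
from copy import deepcopy


class ToyNaive:
    """Hypothetical stand-in forecaster: predicts the last value or the mean."""

    def __init__(self, strategy="last"):
        self.strategy = strategy

    def fit(self, y):
        if self.strategy == "last":
            self.value_ = y[-1]
        else:  # "mean"
            self.value_ = sum(y) / len(y)
        return self

    def predict(self, horizon):
        # A constant forecast, regardless of the horizon asked for.
        return self.value_


class ByHorizon:
    """Fit one copy of `forecaster` per horizon, each with its own parameters."""

    def __init__(self, forecaster, fh_params):
        self.forecaster = forecaster
        self.fh_params = fh_params  # fh_params[i] applies to the i-th horizon

    def fit(self, y, fh):
        if len(fh) != len(self.fh_params):
            raise ValueError("need exactly one parameter dict per horizon")
        self.forecasters_ = {}
        for horizon, params in zip(fh, self.fh_params):
            clone = deepcopy(self.forecaster)
            for name, value in params.items():
                setattr(clone, name, value)  # override parameters for this horizon
            self.forecasters_[horizon] = clone.fit(y)
        return self

    def predict(self):
        # Each horizon is answered by the copy that was configured for it.
        return {h: f.predict(h) for h, f in self.forecasters_.items()}


y = [1.0, 2.0, 6.0]
model = ByHorizon(ToyNaive(), fh_params=[{}, {"strategy": "mean"}])
result = model.fit(y, fh=[1, 2]).predict()
print(result)
```

Here horizon 1 keeps the default "last" strategy and horizon 2 switches to "mean", so the two horizons are forecast by differently configured copies of the same estimator.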

Potentially useful in #4776 (comment).

FYI @davidgilbertson, would appreciate feedback.

Features it currently lacks (but perhaps should have?):

  • multiple fh indices per forecaster
  • inheritance from _HeterogenousMetaEstimator - I already prepared the _forecasters attr for that but didn't yet test it with inheritance. Hopefully works out of the box.

@fkiraly fkiraly added the "implementing algorithms", "module:forecasting", and "enhancement" labels Jul 2, 2023
@fkiraly fkiraly marked this pull request as ready for review July 3, 2023 14:54
@davidgilbertson
Contributor

I don't have any strong opinions on this. It does look like it could solve the problem I have, but requires quite a lot of user code.

I think/hope I might be confused, because I thought what I'm doing was pretty much stock-standard time series analysis and would be how the default, simple, inflexible option would work. Can you confirm whether or not this particular use case should currently be possible?

  • I have X and y time series data, with multiple instances.
  • I want to put, say, 2 lags of y data into my X dataframe (two new columns) and pass that to a particular estimator's .fit() method.
  • I want to predict, say, 14 timesteps in the future
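The tabulation described in this list can be sketched in plain Python. This is a minimal illustration under stated assumptions, not sktime code: the helper `add_target_lags`, the row layout, and the column names are all hypothetical. Only y is lagged, the exogenous column `x` passes through unchanged, and lags never cross instance boundaries.

```python
def add_target_lags(rows, n_lags):
    """Append y_lag_1..y_lag_n to each row, computed within each instance.

    `rows` is a list of dicts with keys "instance", "time", "y", plus any
    exogenous columns, which are passed through unlagged.
    """
    history = {}  # instance -> y values seen so far, oldest first
    out = []
    for row in sorted(rows, key=lambda r: (r["instance"], r["time"])):
        past = history.setdefault(row["instance"], [])
        new_row = dict(row)
        for k in range(1, n_lags + 1):
            # None marks lags that reach before the start of the series.
            new_row[f"y_lag_{k}"] = past[-k] if len(past) >= k else None
        out.append(new_row)
        past.append(row["y"])
    return out


rows = [
    {"instance": "a", "time": 0, "x": 10, "y": 1.0},
    {"instance": "a", "time": 1, "x": 11, "y": 2.0},
    {"instance": "a", "time": 2, "x": 12, "y": 3.0},
    {"instance": "b", "time": 0, "x": 20, "y": 7.0},
    {"instance": "b", "time": 1, "x": 21, "y": 8.0},
]
lagged = add_target_lags(rows, n_lags=2)
for r in lagged:
    print(r)
```

This covers the fit side only. When predicting more than one step ahead, the lagged y values are not yet observed, so they have to come either from earlier predictions (a recursive strategy) or from a separate model per horizon (a direct strategy).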

Everything I've tried fails in some way:

  • make_reduction by default lags every column in X - a flag to control whether the user wants to lag X, y, both, or some named columns would solve this.
  • make_reduction passing transformers and window_length=None can only do 'global' calculations, not one-estimator-per-instance.
  • YfromX leaks future values of y into the dataframe passed to my estimator during .predict()
  • DirectReductionForecaster takes far too long to run (more than 30x slower)

So is this use case (X and y data, want to put lags of y into X) actually impossible at the moment, and this PR resolves that? Or is there something I've missed?

@fkiraly
Collaborator Author

fkiraly commented Jul 3, 2023

> actually impossible at the moment, and this PR resolves that? Or is there something I've missed.

Yes - the idea is to make it possible first (the combination of unlagged X, lagged y, and global pooling, which for some odd reason does not seem to be supported) and then optimize.

Tbh the off-the-shelf reducers should work here.

Btw, it would be great if you could paste some code for DirectReductionForecaster in this issue that shows a long runtime: #3224 - I have been mostly diagnosing the RecursiveReductionForecaster.

@davidgilbertson
Contributor

OK I'm glad you agree it's odd that it's not supported, that means I'm not going crazy :)

Actually it seems I've just discovered a way. I think I can wrap my whole pipeline in ForecastByLevel, and then use the reduction method that only works with global pooling:

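from sklearn.ensemble import HistGradientBoostingRegressor
from sktime.forecasting.compose import make_reduction
from sktime.transformations.series.summarize import WindowSummarizer

# target_lags and n_jobs are assumed to be defined beforehand
make_reduction(
    HistGradientBoostingRegressor(),
    transformers=[
        WindowSummarizer(
            # lags 1..target_lags
            lag_feature={"lag": range(1, target_lags + 1)},
            truncate="bfill",
            n_jobs=n_jobs,
        ),
    ],
    pooling="global",
    window_length=None,
)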

Haven't tested thoroughly, and it still has a tiny leak due to bfill reaching into the future, but it seems that it at least allows me to use sktime.
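The bfill leak mentioned above can be illustrated with a small standalone sketch. It assumes that bfill behaves like a plain backward fill of the missing leading lag values; `lag` and `bfill` are hypothetical helpers written for this illustration, not sktime functions.

```python
def lag(values, k):
    """Shift a series k steps into the past; the first k entries are unknown."""
    return [None] * k + values[: len(values) - k]


def bfill(values):
    """Replace each None with the next non-missing value that follows it."""
    filled = list(values)
    next_known = None
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = next_known
        else:
            next_known = filled[i]
    return filled


y = [5.0, 6.0, 7.0, 8.0]
lag_2 = lag(y, 2)             # the two leading entries are missing
lag_2_filled = bfill(lag_2)   # missing entries are filled from later rows

print(lag_2)
print(lag_2_filled)
# The feature in row 0 now equals the target in row 0, which a true
# lag-2 feature could not have known at that time.
print(lag_2_filled[0] == y[0])
```

Only the first few rows of each series are affected, which matches the description of the leak as tiny. Dropping those rows instead of filling them avoids the leak at the cost of a slightly shorter training set.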

@fkiraly
Collaborator Author

fkiraly commented Jul 3, 2023

> OK I'm glad you agree it's odd that it's not supported, that means I'm not going crazy :)

No, not at all.

It is not uncommon for features to be added to estimators one after another (e.g., global, use of X), so that combinations of them don't work properly.

That's why we wanted to start with a re-design in 2022, which has the "right user interface" imo. It ended up working, but being computationally too expensive. So the next step is keeping the interface but making it faster.

@fkiraly fkiraly merged commit 534d21a into main Jul 12, 2023
24 checks passed
@fkiraly fkiraly deleted the fhplex branch July 12, 2023 00:21
Development

Successfully merging this pull request may close these issues.

[ENH] forecasting horizon conditional parameter wrapper
2 participants