[ENH] Tracking issue for make_reduction outstanding issues #5736

Open
fkiraly opened this issue Jan 13, 2024 · 3 comments
Labels: enhancement (Adding new functionality), module:forecasting (forecasting module: forecasting, incl probabilistic and hierarchical forecasting)

Comments

@fkiraly
Collaborator

fkiraly commented Jan 13, 2024

This issue is to track work items related to the reducers in forecasting.compose._reduce.

current reducer

simplification

  • the local/global branches seem duplicative. These should be split up or otherwise refactored
  • the same holds for point vs proba predictions. This should also undergo simplification: [ENH] support for probabilistic regressors (skpro) in make_reduction, direct reduction #5536
  • specifically, numpy preallocation and its consistent use should be revisited, for both the global and the proba case.
  • the split of logic between the reducer classes and _BaseWindowForecaster does not seem consistent - this should be revisited.

bugs

  • the global setting was not covered by tests and fails them, see here: [BUG] fix global pooling in forecasting reducers #5722
  • X is also lagged; this is unexpected for the user and should be changed. The best way is probably a lag_X parameter, then changing the default via deprecation.

improvements

  • many sklearn regressors expose diagnostics, e.g., feature importances referenced by feature name. These are not being passed on.
  • docstrings of private functions are unclear, as are their guarantees. This should be improved with clear docstrings, and potentially asserts at the end. This would probably also help as a supportive measure in bugfixing and refactoring.
  • numpy writes raise warnings; these should be addressed.
  • proper math in the docstrings. It is obscure to a technical reader which algorithm exactly is implemented (e.g., treatment of X, behaviour in the global case, cutting of windows, which parts of the time series are exactly used in fit/predict, etc); a sketch of the standard formulation is given after this list.
    • could be transferred from the new direct reducer
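For reference, the textbook reduction construction that the docstrings should spell out (whether the current implementation matches it in every detail, in particular regarding X and pooling, is exactly the part that needs documenting): with window length $w$ and training series $y_1, \dots, y_T$, the recursive strategy fits a tabular regressor $\hat{g}$ on the pairs

$$\big( (y_{t-w}, \dots, y_{t-1}),\ y_t \big), \quad t = w+1, \dots, T,$$

and forecasts by feeding predictions back in as inputs, i.e., $\hat{y}_{T+1} = \hat{g}(y_{T-w+1}, \dots, y_T)$, $\hat{y}_{T+2} = \hat{g}(y_{T-w+2}, \dots, y_T, \hat{y}_{T+1})$, and so on; the direct strategy instead fits one regressor per step $h$ of the forecasting horizon, on pairs $\big( (y_{t-w}, \dots, y_{t-1}),\ y_{t+h-1} \big)$.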

extensions

  • probabilistic predictions for the recursive strategy - it is not obvious what these should be, though
  • direct multioutput - the old reducer does not support this, but the new one does
  • allowing the user to specify whether X gets lagged or not, similar to the new direct reducer - see the sketch below
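Roughly, the interface difference looks like this (a sketch only - import path and parameter names as in recent versions around 0.25.0, so double-check against the installed release; X_treatment is the new direct reducer's parameter):

```python
from sklearn.ensemble import GradientBoostingRegressor
from sktime.forecasting.compose import DirectReductionForecaster, make_reduction

regressor = GradientBoostingRegressor()

# "old" reducer: how X enters the design matrix is implicit - exogenous columns
# are lagged together with y, which is what the bug section above refers to
old_style = make_reduction(regressor, window_length=12, strategy="direct")

# "new" direct reducer: the treatment of X is an explicit user choice
new_style = DirectReductionForecaster(
    regressor, window_length=12, X_treatment="shifted"
)
```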

Related issues or PRs:

@fkiraly added the module:forecasting and enhancement labels Jan 13, 2024
@fkiraly
Collaborator Author

fkiraly commented Jan 13, 2024

FYI @benHeid

@mlisovyi

mlisovyi commented Jan 14, 2024

We have been trying to start using reduction forecasters for our use case (including exogenous features X). After some unexpected results, we built a minimal reproducible example to inspect the data being fed to the sklearn estimator by the classes created by make_reduction, as well as by DirectReductionForecaster/RecursiveReductionForecaster, as of version 0.25.0.
As a result, we do not see a way to use the "old" classes (via make_reduction) for the case where X values are not available in the future. Technically, a model spits out results, but they are based on totally wrong values for X (and even differently wrong for different pooling schemes). This use case seems to work in the new implementation of DirectReductionForecaster (using X_treatment="shifted"). Would you be interested in the results, or is there already enough information? If yes, then here, in #3224, or in a separate issue?
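For illustration, a stripped-down sketch of the kind of inspection we did (not our exact reproducer; SpyRegressor is just an illustrative helper that prints the design matrix the reducer builds):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sktime.forecasting.compose import make_reduction


class SpyRegressor(LinearRegression):
    """Prints the tabular data the reducer feeds to the inner sklearn estimator."""

    def fit(self, X, y):
        # the forecaster clones the regressor internally, so print here
        # instead of storing the data on the original instance
        print("design matrix passed to the inner regressor:")
        print(pd.DataFrame(X).head())
        return super().fit(X, y)


# toy data: a trend series plus one exogenous column
y = pd.Series(np.arange(20.0), name="y")
X = pd.DataFrame({"x": np.arange(100.0, 120.0)})

forecaster = make_reduction(SpyRegressor(), window_length=3, strategy="recursive")
forecaster.fit(y, X=X)
# the printout shows how (and whether) X has been lagged alongside y
```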

I wonder why X_treatment is not available in RecursiveReductionForecaster. Is there a conceptual or technical issue? Is it in the making, or does it simply lack manpower? Or is there some alternative way to achieve it that I overlook (I see #3224 mentions it, but I fail to understand the point :( )?

@fkiraly
Collaborator Author

fkiraly commented Jan 14, 2024

@mlisovyi, thanks for the feedback! Yes, the results might be interesting.

In a nutshell, the biggest problem is lack of contributor time on these estimators, as fixing some of the known issues is unpleasant, and volunteers tend to gravitate elsewhere.
(if we had the means to hire someone the reducers would be high on the list)

The current situation:

  • "old" make_reduction has been worked on by a dozen or so people, so it is not very well structured, and it is hard to get into this part of the codebase. Making changes to it - such as introducing X_treatment - is tedious.
    • as this issue mentions, there is a secondary problem: lack of documentation of what it does to X.
    • I have been trying to make gradual improvements, but it proves tedious...
  • The "new" reducers were meant as a replacement, but they end up having worse runtime; we do not fully understand why. The diagnostic PR is here: DRAFT [ENH] runtime optimizations for reducer forecasters #4976
    • an option would be to take the DirectReductionForecaster and help with the runtime issue. Once that is solved, it can be more easily extended to RecursiveReductionForecaster.

why is X_treatment not available in RecursiveReductionForecaster? Is there a conceptual or technical issue?

Mostly a time investment issue. Would you perhaps be willing to help, if your use cases require it?

Is it in the making, or does it simply lack manpower?

Yes, or womanpower, anyone who can help with the coding.
