Design/implement forecast interval predictions #97

Closed
mloning opened this issue Jul 23, 2019 · 5 comments
Labels: API design, feature request, implementing algorithms, implementing framework

Comments

@mloning
Contributor

mloning commented Jul 23, 2019

Linked to evaluation framework #64

@mloning mloning added feature request New feature or request API design API design & software architecture implementing framework Implementing or improving framework for learning tasks, e.g., base class functionality must - high priority implementing algorithms Implementing algorithms, estimators, objects native to sktime labels Jul 23, 2019
@mloning mloning self-assigned this Jul 23, 2019
@mloning mloning added this to To do in Use case 3: supervised forecasting via automation Jul 23, 2019
@fkiraly
Collaborator

fkiraly commented Aug 1, 2019

This is a can of worms...

Why: this is a different "kind" of prediction, essentially a third option next to deterministic and probabilistic supervised prediction or forecast.

Prediction intervals can hence appear in any case when the predicted object has a continuous/numeric type.

To my knowledge there is no package that has a clean interface for prediction intervals, but what comes closest is ...
https://github.com/alan-turing-institute/skpro
... which is currently looking for a maintainer or v2 re-design.

As a hack, one could make PI boundaries part of a dict return object, similar to mean/var in sklearn, but it wouldn't account for:

  • the possibility of producing PI boundaries at different levels of prediction confidence (e.g., 75%, 90%)
  • deriving these from fully probabilistic predictions
  • evaluation via losses specific for PI

As a slightly less hacky option, one could attach traits to each method about which kinds of predictions it can make: PI mean/var; PI, quantile-based, fully probabilistic, etc.
The return object of predict would be a dictionary with pre-defined fields, and there would be input fields to specify the quantiles one wants for PI.
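A minimal sketch of what such a dictionary return with capability traits could look like. Everything here is hypothetical: `NaiveMeanForecaster`, the `capabilities` attribute, the `coverage` argument and the field names are illustrative and are not part of sktime or skpro. The intervals assume Gaussian errors around the training mean, purely to keep the example short.

```python
from statistics import NormalDist, mean, stdev


class NaiveMeanForecaster:
    """Hypothetical forecaster used only for illustration (not an sktime class)."""

    # Trait declaring which kinds of prediction this method can produce.
    capabilities = {"point", "interval"}

    def fit(self, y):
        self._mean = mean(y)
        self._std = stdev(y)
        return self

    def predict(self, n_steps, coverage=()):
        """Return a dict with the pre-defined fields "point" and "interval".

        coverage: nominal interval coverages to compute, e.g. (0.75, 0.9).
        """
        result = {"point": [self._mean] * n_steps, "interval": {}}
        for c in coverage:
            # Assumption: Gaussian errors, so use the two-sided normal quantile.
            z = NormalDist().inv_cdf(0.5 + c / 2)
            lower = self._mean - z * self._std
            upper = self._mean + z * self._std
            result["interval"][c] = [(lower, upper)] * n_steps
        return result


forecaster = NaiveMeanForecaster().fit([1.0, 2.0, 3.0, 4.0, 5.0])
prediction = forecaster.predict(2, coverage=(0.75, 0.9))
```

This covers several coverage levels in one call, but it still leaves open how PI-specific losses would consume the dictionary.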

Anything "nicer" (e.g., along the lines of MLJ's ontology of det/prob/etc) would require an interface redesign and/or further work on skpro, in my opinion.

@fkiraly
Collaborator

fkiraly commented Aug 1, 2019

@frthjf's opinion might also be helpful, if he's still following GitHub discussions...

@mloning
Contributor Author

mloning commented Aug 1, 2019

Maybe we should just adopt a quick, hacky statsmodels/pmdarima-style solution where one can optionally compute confidence intervals for a user-given alpha value (related to #104). The fully probabilistic interface requires much more work.
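As a rough sketch of that style, the hypothetical helper below returns point forecasts and, optionally, intervals for a given alpha. The function name `naive_predict` is made up for illustration; the `return_conf_int` and `alpha` arguments are loosely modelled on pmdarima's `predict`. It assumes a naive last-value forecast with Gaussian random-walk errors, so the interval half-width grows with the square root of the horizon, and it needs at least three training points to estimate the step standard deviation.

```python
from math import sqrt
from statistics import NormalDist, stdev


def naive_predict(y_train, n_steps, return_conf_int=False, alpha=0.05):
    """Hypothetical helper: last-value forecast with optional intervals."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    y_pred = [y_train[-1]] * n_steps
    if not return_conf_int:
        return y_pred
    # Assumption: one-step changes are i.i.d. Gaussian (random walk),
    # so the h-step standard deviation is sigma * sqrt(h).
    diffs = [b - a for a, b in zip(y_train, y_train[1:])]
    sigma = stdev(diffs)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    conf_int = [
        (p - z * sigma * sqrt(h), p + z * sigma * sqrt(h))
        for h, p in enumerate(y_pred, start=1)
    ]
    return y_pred, conf_int


y_pred, conf_int = naive_predict(
    [1.0, 2.0, 4.0, 3.0, 5.0], n_steps=3, return_conf_int=True, alpha=0.1
)
```

The return type changes with `return_conf_int`, which is part of what makes this approach quick but hacky.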

@mloning mloning closed this as completed in f1ea2ce Aug 6, 2019
Workstream: forecasting and series transformers automation moved this from To do to Done Aug 6, 2019
Use case 3: supervised forecasting automation moved this from To do to Done Aug 6, 2019
@frthjf

frthjf commented Aug 6, 2019

A bit late to the party, so please ignore this if it is no longer applicable now that the issue is closed. I might lack context of the wider project here, but from what I understand this does indeed seem like the type of problem that the skpro API tried to address. I'd like to think of the problem as a 'speculative API' problem, where the issue is that we try to define an intermediate return object without knowing how it is going to be used. In skpro, the object was a distribution, and the issue is that the user might be interested in something as complex as the pdf() but might just as well only care about the mean. The return type somehow has to represent mean, pdf() and any other conceivable property of interest, but ideally only the ones that are actually used, to avoid a lot of wasted computation. Here, we have a similar problem, where the general return object is a set of intervals, ranging from simple boundary values to something like get_interval(confidence=0.9).

In skpro, the solution paradigm for this issue was lazy execution: the computation of mean, pdf() etc. is described but only executed when accessed and actually used. This has the problem that one still needs to define how all the potential properties of interest would be computed for each and every model, so the computational burden of full eager execution is transformed into an implementation burden that is not very user friendly.
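A small sketch of the lazy-execution idea, not skpro's actual implementation: `LazyNormalPrediction` and its constructor arguments are hypothetical, and a normal predictive distribution is assumed. Each property is described up front as a callable but only evaluated, and then cached, on first access.

```python
from functools import cached_property
from statistics import NormalDist


class LazyNormalPrediction:
    """Hypothetical lazy prediction object (illustration only, not skpro)."""

    def __init__(self, mean_fn, std_fn):
        # Store *how* to compute each quantity; nothing is evaluated yet.
        self._mean_fn = mean_fn
        self._std_fn = std_fn

    @cached_property
    def mean(self):
        return self._mean_fn()

    @cached_property
    def std(self):
        return self._std_fn()

    def pdf(self, x):
        return NormalDist(self.mean, self.std).pdf(x)

    def interval(self, confidence=0.9):
        # Central interval under the assumed normal distribution.
        z = NormalDist().inv_cdf(0.5 + confidence / 2)
        return self.mean - z * self.std, self.mean + z * self.std


calls = []
pred = LazyNormalPrediction(
    mean_fn=lambda: calls.append("mean") or 0.0,
    std_fn=lambda: calls.append("std") or 1.0,
)
point = pred.mean  # only the mean is computed here; std is still untouched
```

The implementation burden mentioned above shows up in the constructor: every model has to supply a callable for every property the object exposes.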

The other option is something like a predictive API, where you say: well, we assume most people are only interested in alpha value intervals, so let's hack that in, and if it turns out people actually care about other things, then we have to go back and implement something else. (I think that's the way to go as a quick fix.)

Now, I believe there is a better way to solve this, because we shouldn't be forced to speculate about what is actually needed if we could look at the entire computation ahead of the execution of each step and truncate intermediate objects that are not needed for the end result (a bit like in TensorFlow's compute graph). Unfortunately, as @fkiraly recognized, this is a true can of worms and would require a huge amount of work. In any case, maybe this perspective helps with the further development.
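To make the graph idea concrete, here is a toy sketch under stated assumptions: the graph format (node name mapped to dependency names and a function) and the helpers `required_nodes` and `run` are invented for illustration and have nothing to do with TensorFlow's actual API. The whole computation is declared first, and only the nodes that the requested outputs depend on are ever executed.

```python
def required_nodes(graph, targets):
    """Collect the targets and everything they transitively depend on."""
    needed, stack = set(), list(targets)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(graph[name][0])
    return needed


def run(graph, targets):
    """Evaluate only the nodes required for `targets`.

    graph maps a node name to (dependency names, function of their values).
    """
    needed = required_nodes(graph, targets)
    values = {}

    def compute(name):
        if name not in values:
            deps, fn = graph[name]
            values[name] = fn(*[compute(dep) for dep in deps])
        return values[name]

    for name in needed:
        compute(name)
    return {name: values[name] for name in targets}


# Hypothetical prediction graph; "density_grid" stands in for an
# expensive property that the caller may never ask for.
graph = {
    "mean": ((), lambda: 10.0),
    "std": ((), lambda: 2.0),
    "interval": (("mean", "std"), lambda m, s: (m - 1.645 * s, m + 1.645 * s)),
    "density_grid": (("mean", "std"), lambda m, s: [m + k * s for k in range(-3, 4)]),
}
result = run(graph, ["interval"])  # "density_grid" is pruned and never evaluated
```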

@mloning mloning reopened this Aug 11, 2019
Workstream: forecasting and series transformers automation moved this from Done to In progress Aug 11, 2019
Use case 3: supervised forecasting automation moved this from Done to In progress Aug 11, 2019
@mloning
Contributor Author

mloning commented Jan 30, 2020

See #218 and corresponding API design document

@mloning mloning closed this as completed Jan 30, 2020
Workstream: forecasting and series transformers automation moved this from In progress to Done Jan 30, 2020
Use case 3: supervised forecasting automation moved this from In progress to Done Jan 30, 2020

3 participants