[ENH] `quantile` method for distributions, default implementation of forecaster `predict_quantiles` if `predict_proba` is present #4513

fkiraly · 2023-04-27T16:59:38Z

This PR introduces a quantile method to all sktime distributions.

This complements ppf (which also returns quantiles) in that it returns quantiles in the same format as forecasters' predict_quantiles, and broadcasts quantile points in the same way as in forecasters.

This PR also adds a default implementation of predict_quantiles if predict_proba is present, by calling quantile on the predict_proba return.

This also enables a common usage pattern beyond 0.18.0: calling quantile on the predict_proba return, similar to the tensorflow method but in an sktime-compatible mtype.
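For readers unfamiliar with the distinction, here is a minimal sketch of the difference between an elementwise ppf and a quantile method that returns results keyed by (variable, alpha), in the spirit of predict_quantiles. `ToyNormal` and all of its arguments are hypothetical illustration names, not the sktime `Normal` class, and plain dicts stand in for the tabular return type.

```python
from statistics import NormalDist


class ToyNormal:
    """Toy stand-in for a distribution object; NOT the sktime Normal class."""

    def __init__(self, mu, sigma, index, var_name="y"):
        # mu, sigma: one value per forecast horizon entry in `index`
        self.mu = list(mu)
        self.sigma = list(sigma)
        self.index = list(index)
        self.var_name = var_name

    def ppf(self, p):
        """Elementwise quantile function: one probability, one value per row."""
        return {
            t: NormalDist(m, s).inv_cdf(p)
            for t, m, s in zip(self.index, self.mu, self.sigma)
        }

    def quantile(self, alpha):
        """Quantiles keyed by (variable, alpha), predict_quantiles-style."""
        # broadcast: a single float is treated like a one-element list
        alphas = [alpha] if isinstance(alpha, float) else list(alpha)
        return {(self.var_name, a): self.ppf(a) for a in alphas}


dist = ToyNormal(mu=[10.0, 12.0], sigma=[1.0, 2.0], index=[1, 2])
quantiles = dist.quantile([0.1, 0.5, 0.9])
median = quantiles[("y", 0.5)]  # maps each horizon entry to its median
```

In this sketch, ppf answers "one probability, one value per row", while quantile takes a float or a list of probabilities and produces one keyed column per requested alpha.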

fkiraly · 2023-04-27T16:59:56Z

FYI @yarnabrina, re probabilistic predictions - would appreciate a review!

yarnabrina · 2023-04-28T04:09:49Z

Looks okay, but two questions:

1. How do I try/test these? Which estimators have this capability as of now? I got none (just Normal and TFNormal distributions) with this: pytest --co -k "test_quantile" sktime/proba/tests/test_all_distrs.py.
2. Since _predict_interval can work with _predict_quantiles, and you added support for _predict_quantiles using _predict_proba, can we extend _predict_proba support to _predict_interval as well?

fkiraly · 2023-04-28T11:58:57Z

> How do I try/test these? Which estimators have this capability as of now? I got none (just Normal and TFNormal distributions) with this: pytest --co -k "test_quantile" sktime/proba/tests/test_all_distrs.py.

This is a capability of distributions, not of estimators. Currently there are indeed only Normal and TFNormal, plus the abstract tensorflow-based adapter _BaseTFDistribution (which is tested through TFNormal).

> Since _predict_interval can work with _predict_quantiles, and you added support for _predict_quantiles using _predict_proba, can we extend _predict_proba support to _predict_interval as well?

That happens indirectly. My thinking is that the preferred default for _predict_interval is _predict_quantiles, not _predict_proba. My reasoning for this priority is that _predict_interval and _predict_quantiles are content-wise the same, except for how the columns are parametrized and named.

To see that, in the current PR, we always have a default as long as one of the methods is implemented, consider two cases in which _predict_interval is not implemented.

Case 1: _predict_quantiles is implemented; _predict_proba may or may not be implemented.
Then _predict_interval gets the default from _predict_quantiles.

Case 2: only _predict_proba is implemented, not _predict_quantiles.
Then _predict_interval gets the default from _predict_quantiles, which in turn gets it from _predict_proba.

These cases are distinct and exhaustive, assuming at least one of the other methods is implemented.
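The two cases can be sketched with a toy class hierarchy. This is only an illustration of the fallback chain described above, not the code in sktime's `_base.py`: `ToyBaseForecaster`, `QuantilesOnly` and `ProbaOnly` are hypothetical names, the forecast is reduced to a single time point, and a standard-library `NormalDist` stands in for the predict_proba return.

```python
from statistics import NormalDist


class ToyBaseForecaster:
    """Toy base class; the fallback chain is a sketch, not sktime's code."""

    def _predict_proba(self):
        raise NotImplementedError("no probabilistic method implemented")

    def _predict_quantiles(self, alpha):
        # default: read the quantiles off the predictive distribution
        dist = self._predict_proba()
        return {a: dist.inv_cdf(a) for a in alpha}

    def _predict_interval(self, coverage):
        # default: a symmetric interval with coverage c spans the
        # quantiles at 0.5 - c/2 and 0.5 + c/2
        lower, upper = 0.5 - coverage / 2, 0.5 + coverage / 2
        q = self._predict_quantiles([lower, upper])
        return (q[lower], q[upper])


class QuantilesOnly(ToyBaseForecaster):
    # Case 1: _predict_quantiles is overridden
    def _predict_quantiles(self, alpha):
        return {a: 100.0 * a for a in alpha}


class ProbaOnly(ToyBaseForecaster):
    # Case 2: only _predict_proba is overridden
    def _predict_proba(self):
        return NormalDist(mu=0.0, sigma=1.0)


case1 = QuantilesOnly()._predict_interval(coverage=0.8)
case2 = ProbaOnly()._predict_interval(coverage=0.8)
```

In both cases `_predict_interval` only ever calls `_predict_quantiles`; whether the quantiles come from an override or from the `_predict_proba` fallback is invisible to it.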

Do you think we should do this differently? Or, same logic, but implement it differently?

fkiraly · 2023-04-28T17:57:59Z

@yarnabrina, merging to have some fix in place for the release action, but happy to change the logic later on

scheduled deprecation and change actions for the 0.18.0 release

* remove `VectorizedDF.get_iloc_indexer`
* switch default of `legacy_interface` in `predict_proba` to `False`

For deprecation of the `predict_proba` legacy interface, depends on:

* #4513
* #4514

yarnabrina · 2023-04-29T03:44:26Z

> Do you think we should do this differently? Or, same logic, but implement it differently?

I completely agree with the logic, but I think I'm missing how predict_interval would work if both _predict_interval and _predict_quantiles are undefined (in the sense of not overridden at estimator level) and only _predict_proba is overridden.

If the actual (non-base) estimator does not define _predict_interval or _predict_quantiles, the checks with _has_implementation_of will be False, but the _predict_proba implementation will make can_do_proba True. So, it goes past L1995, and also goes past L2003. My question/doubt is how L2030 will work then, because I don't see pred_int being defined anywhere else in the function.

Do we not need something like this here in _predict_interval as well? Of course the logic needs to be a bit different to conform to the output format.

sktime/sktime/forecasting/base/_base.py

Line 2106 in 195e517

elif implements_proba:

If I have missed some logic which will call _predict_quantiles of BaseForecaster from _predict_interval of BaseForecaster to make use of _predict_proba in actual estimator in such cases, please let me know.
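The concern can be reproduced in toy form. The function below is a hypothetical reduction of the control flow as described in this comment, not the actual sktime source; the flag names only echo the ones mentioned above, and `with_proba_branch` is an illustration switch that turns an `elif implements_proba`-style branch on or off.

```python
def toy_interval_default(implements_quantiles, implements_proba, with_proba_branch):
    """Toy control flow: returns a label for where the intervals come from."""
    # guard: passes as soon as any probabilistic method is available
    if not (implements_quantiles or implements_proba):
        raise NotImplementedError("no probabilistic method available")

    if implements_quantiles:
        pred_int = "intervals built from _predict_quantiles"
    elif with_proba_branch and implements_proba:
        pred_int = "intervals built from _predict_proba"

    # without the proba branch, a proba-only estimator reaches this line
    # with pred_int never assigned, which raises UnboundLocalError
    return pred_int


try:
    toy_interval_default(False, True, with_proba_branch=False)
    raised = False
except UnboundLocalError:
    raised = True

fixed = toy_interval_default(False, True, with_proba_branch=True)
```

If the real method has the same shape, either an explicit proba branch or a call into the base _predict_quantiles default would close the gap.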

fkiraly · 2023-04-29T12:31:47Z

@yarnabrina, moved discussion here:
#4528

(I've merged the above as a fix to a blocker in the deprecation which was blocking the release, but we should get this sorted out!)

fkiraly added 2 commits April 27, 2023 17:47

Update base.py

ac102aa

default for predict_quantiles

6c576fb

rename method to quantile

c5b2c9a

fkiraly mentioned this pull request Apr 27, 2023

[MNT] 0.18.0 deprecation actions #4510

Merged

testing

195e517

fkiraly merged commit 91290af into main Apr 28, 2023
22 checks passed

fkiraly deleted the proba-quantiles branch April 28, 2023 17:58

fkiraly mentioned this pull request Apr 29, 2023

[ENH] review probabilistic forecasting methods default dispatch logic #4528

Open
