[ENH] Add skforecast forecaster autoreg adapter #5447

yarnabrina · 2023-10-17T12:26:31Z

Adds a new compose forecaster SkforecastForecasterAutoreg, which is comparable to make_reduction with strategy='recursive' and uses ForecasterAutoreg from skforecast package in the backend.

TODO's for future (may or may not be in scope for this PR):

support for ForecasterAutoregDirect which is comparable to strategy='direct'
1. add a strategy parameter similar to make_reduction and accordingly make the call in _create_forecaster
2. add a new SkforecastForecasterAutoregDirect forecaster
support for ForecasterAutoregMultiSeries which is comparable to a single global model for all series in panel/hierarchical dataset with series identifier is a feature itself
- As of now, different values for exogenous features for diifferent series is not supported (ref. Use of multiple exogenous time series with multiple endogenous time series JoaquinAmatRodrigo/skforecast#546)
support for ForecasterAutoregMultiVariate which is comparable to multiple global models for all series in panel/hierarchical dataset, without series identifier is a feature but different targets
- As of now, different values for exogenous features for diifferent series is not supported (ref. Use of multiple exogenous time series with multiple endogenous time series JoaquinAmatRodrigo/skforecast#546) (irrelevant as per first comment below)
switch to predict_quantiles from predict_bootstrapping + numpy.quantile in _predict_quantiles (ref. support more than 2 percentiles to be passed for predict_interval JoaquinAmatRodrigo/skforecast#572)
support feature importance
support missing values based on regressor - how?
support passing bootstrapping arguments - during init?
fix test errors - MUST for this PR

JavierEscobarOrtiz · 2023-10-17T13:05:05Z

Hello @yarnabrina

Regarding the ForecasterAutoregMultiVariate TODO, the model will only be able to predict one series at a time, even if it is learning for all of them. So the exog variables need to be related only to the target series.

Adding more series to the training matrix causes it to grow horizontally (adding more columns). The exog variables will be related to all series. However, since you are only predicting your target series, the exogenous information needs to be related to that series.

https://skforecast.org/latest/user_guides/dependent-multi-series-multivariate-forecasting

sktime/forecasting/compose/_skforecast_reduce.py

fkiraly · 2023-10-25T20:33:01Z

am I interpreting the multiple issues raised in skforecast correctly that running it through the sktime test framework dredges up various bugs in skforecast that we either have to patch on the sktime side, or directly fix in skforecast?

yarnabrina · 2023-10-26T09:31:25Z

Yes I'm expecting a quick bugfix and enhancements in skforecast from @JavierEscobarOrtiz. These seem major enough issues to me so it's worth fixing there directly instead of a sktime patch, but I don't know how long that may take though, but I see a new branch there for v11 already.

Other than those, there may be a few changes needed in sktime as well, or some more checks to satisfy the tests. I am trying to isolate exactly what's causing those, not clear so far.

fkiraly · 2023-10-26T11:45:03Z

my interpretation is that there are at least two issues:

NaN are created somewhere in the pipeline, sklearn does not like this
the predict_interval returns are non-conformant, as variable names are coerced to string even if they are integer. This is likely a side effect of sklearn under the hood of skforecast.

yarnabrina · 2023-10-27T05:27:52Z

as variable names are coerced to string even if they are integer

This is probably not from sklearn or skforecast. I am doing this before passing to fit or predict using _coerce_column_names static method myself. Before I added this, there were a lot more failures, and then I noted that make_reduction also uses this with _coerce_col_str function.

(There can still be side effect though, I am not saying that does not exist. But I'm doing myself as well.)

NaN are created somewhere in the pipeline, sklearn does not like this

This is probably the index issue I created in skforecast. If I am not wrong, sktime adds no restriction on values of indices whatsoever as long as they are date/time or integers and montonous (please correct me), but as of now it seems skforecast needs them to start from 0 if it's not range index. @JavierEscobarOrtiz confirmed it's a bug, but in the time it's fixed in skforecast, do you have any suggestion to handle in sktime, if you faced it before for some other estimators?

fkiraly · 2023-10-27T07:28:38Z

do you have any suggestion to handle in sktime, if you faced it before for some other estimators?

The canonical solution if an interfaced estimator returns garbage indices is to overwrite them with the expected indices - as long as we are really certain that the forecasts are what we want (bad indices can indicate bad forecasts...)

The code is templateable, for _predict it is in lines 175-178 of the forecasting_supersimple extension template.

I was considering to add sth similar for _predict_interval etc, but we don't have it.
Rows are the same, columns should use self._y_index and from_product to form the expected multi-index (alpha-s and coverage-s should retain their original order).

…thub.com/yarnabrina/sktime-fork into pr/5447

fkiraly

I can spot two problems remaining:

skforecast cannot deal with Index typed index, expecting RangeIndex or data index type. This fails the input check with an uninformative message (new_index accessed before assigned)
requested/returned data in the probabilistic case seem non-matching

I would put both down to issues with non-contiguous fh. I would suggest to address these by requesting contiguous fh, then subsetting the result.

yarnabrina · 2024-02-09T15:44:55Z

I have these tags set already specifically for those reasons:

"enforce_index_type": [pandas.DatetimeIndex, pandas.RangeIndex]
"capability:insample": False
"capability:pred_int": True
"capability:pred_int:insample": False

I thought the first will deal with index issue, and the rest with probabilistic predictions. Can you please tell what do these tags mean then?

Also, I don't know what data is being passed in the test, but this works for me locally. I am predict with fh where values are not-consecutive, and probablistic predictions also appeared to be in correct format. Am I overlooking something?

>>> 
>>> from sklearn.ensemble import RandomForestRegressor
>>> from sktime.datasets import load_longley
>>> from sktime.forecasting.compose import SkforecastAutoreg
>>> 
>>> y, X = load_longley()
>>> 
>>> y = y.reset_index(drop=True)
>>> X = X.reset_index(drop=True)
>>> 
>>> y_train = y.head(n=12)
>>> 
>>> X_train = X.head(n=12)
>>> X_test = X.tail(n=4)
>>> 
>>> forecaster = SkforecastAutoreg(RandomForestRegressor(), [2, 4])
>>> 
>>> forecaster.fit(y_train, X=X_train)
SkforecastAutoreg(lags=[2, 4], regressor=RandomForestRegressor())
>>> 
>>> forecaster.predict(fh=[1, 4], X=X_test)
12    67405.08
15    67487.88
Name: TOTEMP, dtype: float64
>>> 
>>> forecaster.predict_interval(fh=[3, 2], X=X_test, coverage=[0.9, 0.95])
      TOTEMP                              
        0.90                0.95          
       lower     upper     lower     upper
13  66817.64  66817.64  66817.64  66817.64
14  66817.64  66817.64  66817.64  66817.64
>>> 
>>> forecaster.predict_quantiles(fh=range(1, 5), X=X_test, alpha=[0.8, 0.3, 0.2, 0.7])
      TOTEMP                              
         0.8       0.3       0.2       0.7
12  67859.08  67859.08  67859.08  67859.08
13  67859.08  67859.08  67859.08  67859.08
14  67859.08  67859.08  67859.08  67859.08
15  67941.88  67941.88  67941.88  67941.88
>>>

The predictions themselves are super suspicious though - why are these constant for a given horizon?

fkiraly · 2024-02-09T18:12:09Z

I have these tags set already specifically for those reasons:

"enforce_index_type": [pandas.DatetimeIndex, pandas.RangeIndex]
"capability:insample": False
"capability:pred_int": True
"capability:pred_int:insample": False

I thought the first will deal with index issue, and the rest with probabilistic predictions. Can you please tell what do these tags mean then?

enforce_index_type: originally meant to coerce index types (and restore them in predict, this is unfortunatey not linked to any logic in the base class right now
capability:insample user facing for lookup - there is no logic that would raise an error
capability:pred_int: user facing, also controls error message if predict_interval etc are called, and tests
capability:pred_int:insample: user facing, also controls some tests

fkiraly · 2024-02-09T18:14:52Z

Also, I don't know what data is being passed in the test, but this works for me locally. I am predict with fh where values are not-consecutive, and probablistic predictions also appeared to be in correct format. Am I overlooking something?

My suspicion is, the index ends up Index (with object dtype) and not MultiIndex as it should - you can get those from _get_columns if you want

fkiraly · 2024-02-09T18:17:39Z

The predictions themselves are super suspicious though - why are these constant for a given horizon?

Though suspicion is cast on a third party, no?

Abhay-Lejith · 2024-05-29T06:29:25Z

@fkiraly @yarnabrina Once I finish other tasks and want to start working on this, do I make another pr? Otherwise I think I will need push access to @yarnabrina's fork for my commits to be visible here right?

yarnabrina · 2024-05-29T14:41:31Z

@fkiraly @yarnabrina Once I finish other tasks and want to start working on this, do I make another pr? Otherwise I think I will need push access to @yarnabrina's fork for my commits to be visible here right?

Yes, a separate PR seems better. What you can do is similar to what @fnhirwa did with darts PR -> in a new branch of your fork, cherry pick commits from my fork, and continue thereupon.

fkiraly · 2024-06-30T14:56:05Z

superseded by #6531

Continuing with work done in #5447. #### What does this implement/fix? Explain your changes. An adapter for the skforecast `ForecasterAutoreg` . #### Does your contribution introduce a new dependency? If yes, which one? skforecast

yarnabrina added interfacing algorithms Interfacing existing algorithms/estimators from third party packages module:forecasting forecasting module: forecasting, incl probabilistic and hierarchical forecasting enhancement Adding new functionality labels Oct 17, 2023

yarnabrina mentioned this pull request Oct 17, 2023

support more than 2 percentiles to be passed for predict_interval JoaquinAmatRodrigo/skforecast#572

Closed

yarnabrina force-pushed the add-skforecast-ForecasterAutoreg-adapter branch from 545a30e to 107df7f Compare October 22, 2023 05:55

fkiraly reviewed Oct 22, 2023

View reviewed changes

sktime/forecasting/compose/_skforecast_reduce.py Outdated Show resolved Hide resolved

fkiraly reviewed Oct 22, 2023

View reviewed changes

sktime/forecasting/compose/_skforecast_reduce.py Outdated Show resolved Hide resolved

This was referenced Oct 23, 2023

ForecasterAutoreg fails to fit when exog do not have string column names JoaquinAmatRodrigo/skforecast#575

Open

ForecasterAutoreg fails to fit when index do not start from 0 JoaquinAmatRodrigo/skforecast#576

Closed

yarnabrina force-pushed the add-skforecast-ForecasterAutoreg-adapter branch from b317c63 to 0022853 Compare October 25, 2023 16:50

yarnabrina force-pushed the add-skforecast-ForecasterAutoreg-adapter branch from 0022853 to c4ba9b7 Compare November 18, 2023 05:33

yarnabrina mentioned this pull request Nov 18, 2023

[BUG] behaviour of new CI changed suddenly #5559

Closed

added ForecasterAutoreg adapter from skforecast

0135c5b

yarnabrina force-pushed the add-skforecast-ForecasterAutoreg-adapter branch from c4ba9b7 to 0135c5b Compare December 9, 2023 05:27

fkiraly and others added 4 commits February 8, 2024 22:07

Update _skforecast_reduce.py

2d1ff70

Merge branch 'main' into pr/5447

2feaf59

added ForecasterAutoreg adapter from skforecast

63be5d3

added clone of estimators (suggested by @fkiraly on discord)

8a583f4

yarnabrina force-pushed the add-skforecast-ForecasterAutoreg-adapter branch from 0135c5b to 8a583f4 Compare February 9, 2024 05:21

fkiraly added 2 commits February 9, 2024 11:19

Merge branch 'add-skforecast-ForecasterAutoreg-adapter' of https://gi…

7301207

…thub.com/yarnabrina/sktime-fork into pr/5447

Update _skforecast_reduce.py

ee26420

fkiraly requested changes Feb 9, 2024

View reviewed changes

Merge branch 'main' into pr/5447

9e87dff

yarnabrina mentioned this pull request May 3, 2024

[ENH] coordination discussion on foundation models, deep learning, backends #6381

Open

fkiraly assigned Abhay-Lejith May 30, 2024

Abhay-Lejith mentioned this pull request Jun 4, 2024

&yarnabrina [ENH] skforecast ForecasterAutoreg adapter #6531

Merged

4 tasks

fkiraly closed this Jun 30, 2024

yarnabrina deleted the add-skforecast-ForecasterAutoreg-adapter branch July 3, 2024 18:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ENH] Add skforecast forecaster autoreg adapter #5447

[ENH] Add skforecast forecaster autoreg adapter #5447

yarnabrina commented Oct 17, 2023 •

edited

Loading

JavierEscobarOrtiz commented Oct 17, 2023

fkiraly commented Oct 25, 2023 •

edited

Loading

yarnabrina commented Oct 26, 2023

fkiraly commented Oct 26, 2023

yarnabrina commented Oct 27, 2023

fkiraly commented Oct 27, 2023

fkiraly left a comment

yarnabrina commented Feb 9, 2024

fkiraly commented Feb 9, 2024

fkiraly commented Feb 9, 2024

fkiraly commented Feb 9, 2024

Abhay-Lejith commented May 29, 2024

yarnabrina commented May 29, 2024

fkiraly commented Jun 30, 2024

[ENH] Add skforecast forecaster autoreg adapter #5447

[ENH] Add skforecast forecaster autoreg adapter #5447

Conversation

yarnabrina commented Oct 17, 2023 • edited Loading

JavierEscobarOrtiz commented Oct 17, 2023

fkiraly commented Oct 25, 2023 • edited Loading

yarnabrina commented Oct 26, 2023

fkiraly commented Oct 26, 2023

yarnabrina commented Oct 27, 2023

fkiraly commented Oct 27, 2023

fkiraly left a comment

Choose a reason for hiding this comment

yarnabrina commented Feb 9, 2024

fkiraly commented Feb 9, 2024

fkiraly commented Feb 9, 2024

fkiraly commented Feb 9, 2024

Abhay-Lejith commented May 29, 2024

yarnabrina commented May 29, 2024

fkiraly commented Jun 30, 2024

yarnabrina commented Oct 17, 2023 •

edited

Loading

fkiraly commented Oct 25, 2023 •

edited

Loading