[ENH] Add skforecast forecaster autoreg adapter #5447
Hello @yarnabrina Regarding the Adding more series to the training matrix causes it to grow horizontally (adding more columns). The exog variables will be related to all series. However, since you are only predicting your target series, the exogenous information needs to be related to that series. https://skforecast.org/latest/user_guides/dependent-multi-series-multivariate-forecasting
am I interpreting the multiple issues raised in
Yes, I'm expecting a quick bugfix and enhancements in skforecast from @JavierEscobarOrtiz. These issues seem major enough to me that they are worth fixing there directly instead of patching them in sktime. I don't know how long that may take, but I see a new branch there for v11 already. Other than those, a few changes may be needed in sktime as well, or some more checks to satisfy the tests. I am trying to isolate exactly what is causing those failures; it is not clear so far.
my interpretation is that there are at least two issues:
This is probably not from (There can still be side effect though, I am not saying that does not exist. But I'm doing myself as well.)
This is probably the index issue I created in
The canonical solution if an interfaced estimator returns garbage indices is to overwrite them with the expected indices, as long as we are really certain that the forecasts are what we want (bad indices can indicate bad forecasts...). The code is templateable, for I was considering to add sth similar for
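The index-overwrite idea above can be sketched as follows. This is a minimal illustration, not the adapter's actual code: `overwrite_index` is a hypothetical helper name, and it assumes the interfaced estimator returned one value per requested step, in the right order, so that only the labels are wrong.

```python
import pandas as pd


def overwrite_index(y_pred: pd.Series, expected_index: pd.Index) -> pd.Series:
    """Return a copy of y_pred that carries expected_index (hypothetical helper)."""
    # A length mismatch suggests the forecasts themselves are off, so refuse
    # to relabel silently in that case.
    if len(y_pred) != len(expected_index):
        raise ValueError("prediction length does not match the expected index")
    y_fixed = y_pred.copy()
    y_fixed.index = expected_index
    return y_fixed


# Toy data: predictions labelled 0..2 that should be labelled 12..14.
raw = pd.Series([1.0, 2.0, 3.0], index=[0, 1, 2])
fixed = overwrite_index(raw, pd.RangeIndex(12, 15))
```

The length check is the cheap part of "being certain the forecasts are what we want"; it does not catch values that are returned in the wrong order.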
I can spot two problems remaining:
- skforecast cannot deal with `Index` typed index, expecting `RangeIndex` or data index type. This fails the input check with an uninformative message (`new_index` accessed before assigned)
- requested/returned data in the probabilistic case seem non-matching

I would put both down to issues with non-contiguous `fh`. I would suggest to address these by requesting contiguous `fh`, then subsetting the result.
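The suggestion of requesting a contiguous horizon and then subsetting could look roughly like this. It is a sketch under stated assumptions: `predict_on_contiguous_fh` and `predict_fn` are hypothetical names, the index is integer-valued, `fh` contains only positive (out-of-sample) relative steps, and `predict_fn(steps)` returns one value for each of the steps `1..steps`.

```python
import numpy as np
import pandas as pd


def predict_on_contiguous_fh(predict_fn, fh, cutoff):
    """Predict steps 1..max(fh), then keep only the requested steps.

    predict_fn and cutoff are illustrative stand-ins for the wrapped
    estimator's predict call and the last training index.
    """
    fh = np.asarray(list(fh), dtype=int)
    steps = int(fh.max())
    # Ask the backend for the full contiguous horizon.
    values = np.asarray(predict_fn(steps))
    full = pd.Series(values, index=cutoff + np.arange(1, steps + 1))
    # Subset to the requested horizon, in ascending order.
    return full.loc[cutoff + np.sort(fh)]


# Toy backend: the prediction for step k is 10 * k.
def toy_predict(steps):
    return np.arange(1, steps + 1) * 10.0


result = predict_on_contiguous_fh(toy_predict, fh=[3, 2], cutoff=11)
```

With a cutoff of 11, the requested steps 2 and 3 map to index labels 13 and 14, which matches the labels in the console session further down.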
I have these tags set already specifically for those reasons:
I thought the first will deal with the index issue, and the rest with probabilistic predictions. Can you please tell me what these tags mean then? Also, I don't know what data is being passed in the test, but this works for me locally. I am predict with
>>>
>>> from sklearn.ensemble import RandomForestRegressor
>>> from sktime.datasets import load_longley
>>> from sktime.forecasting.compose import SkforecastAutoreg
>>>
>>> y, X = load_longley()
>>>
>>> y = y.reset_index(drop=True)
>>> X = X.reset_index(drop=True)
>>>
>>> y_train = y.head(n=12)
>>>
>>> X_train = X.head(n=12)
>>> X_test = X.tail(n=4)
>>>
>>> forecaster = SkforecastAutoreg(RandomForestRegressor(), [2, 4])
>>>
>>> forecaster.fit(y_train, X=X_train)
SkforecastAutoreg(lags=[2, 4], regressor=RandomForestRegressor())
>>>
>>> forecaster.predict(fh=[1, 4], X=X_test)
12 67405.08
15 67487.88
Name: TOTEMP, dtype: float64
>>>
>>> forecaster.predict_interval(fh=[3, 2], X=X_test, coverage=[0.9, 0.95])
TOTEMP
0.90 0.95
lower upper lower upper
13 66817.64 66817.64 66817.64 66817.64
14 66817.64 66817.64 66817.64 66817.64
>>>
>>> forecaster.predict_quantiles(fh=range(1, 5), X=X_test, alpha=[0.8, 0.3, 0.2, 0.7])
TOTEMP
0.8 0.3 0.2 0.7
12 67859.08 67859.08 67859.08 67859.08
13 67859.08 67859.08 67859.08 67859.08
14 67859.08 67859.08 67859.08 67859.08
15 67941.88 67941.88 67941.88 67941.88
>>>
The predictions themselves are super suspicious though - why are these constant for a given horizon?
My suspicion is, the index ends up
Though suspicion is cast on a third party, no?
@fkiraly @yarnabrina Once I finish other tasks and want to start working on this, do I make another PR? Otherwise I think I will need push access to @yarnabrina's fork for my commits to be visible here, right?
Yes, a separate PR seems better. What you can do is similar to what @fnhirwa did with
superseded by #6531
Continuing with work done in #5447.

#### What does this implement/fix? Explain your changes.
An adapter for the skforecast `ForecasterAutoreg`.

#### Does your contribution introduce a new dependency? If yes, which one?
skforecast
Adds a new compose forecaster `SkforecastForecasterAutoreg`, which is comparable to `make_reduction` with `strategy='recursive'` and uses `ForecasterAutoreg` from `skforecast` package in the backend.

TODO's for future (may or may not be in scope for this PR):
- `ForecasterAutoregDirect` which is comparable to `strategy='direct'` `make_reduction` and accordingly make the call in `_create_forecaster`
- `SkforecastForecasterAutoregDirect` forecaster
- `ForecasterAutoregMultiSeries` which is comparable to a single global model for all series in panel/hierarchical dataset with series identifier is a feature itself
- `ForecasterAutoregMultiVariate` which is comparable to multiple global models for all series in panel/hierarchical dataset, without series identifier is a feature but different targets
- As of now, different values for exogenous features for different series is not supported (ref. Use of multiple exogenous time series with multiple endogenous time series JoaquinAmatRodrigo/skforecast#546) (irrelevant as per first comment below)
- `predict_quantiles` from `predict_bootstrapping` + `numpy.quantile` in `_predict_quantiles` (ref. support more than 2 percentiles to be passed for `predict_interval` JoaquinAmatRodrigo/skforecast#572)
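The last TODO item, deriving quantiles from bootstrapped predictions with `numpy.quantile`, can be illustrated with the sketch below. It does not call skforecast: the `boot` frame is random stand-in data, assumed to have one row per forecast step and one column per bootstrap draw, and the output layout is simplified compared with what sktime's `predict_quantiles` returns.

```python
import numpy as np
import pandas as pd

# Stand-in for bootstrapped predictions: 4 forecast steps, 200 draws each.
rng = np.random.default_rng(0)
boot = pd.DataFrame(rng.normal(size=(4, 200)), index=[12, 13, 14, 15])

alpha = [0.2, 0.5, 0.8]

# np.quantile over the draws axis gives shape (len(alpha), n_steps).
q = np.quantile(boot.to_numpy(), alpha, axis=1)

# Transpose to one row per forecast step and one column per quantile level.
quantiles = pd.DataFrame(q.T, index=boot.index, columns=alpha)
```

Because the quantiles are computed from the draws directly, any number of levels can be requested in one call, which is the limitation the referenced skforecast issue is about.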