[FEATURE] Non-continuous index in predict #50

MichalChromcak · 2020-11-15T12:39:38Z

Is your feature request related to a problem? Please describe.
Sometimes it would be handy to pass an index with gaps to the predict methods, e.g. when only the forecasts for 1, 3, and 7 days ahead are wanted.

Describe the solution you'd like

import pandas as pd

X_pred = pd.DataFrame(index=pd.DatetimeIndex(["2020-01-01", "2020-01-03", "2020-01-07"]))
preds = model.predict(X_pred)
assert X_pred.index.to_list() == preds.index.to_list()

Some models support this natively. For others, the only way might be to predict for the full range and then select only the X_pred dates from the results.

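# Inside the wrapper's predict(): forecast the full daily range, then keep only the requested dates.
full_ind = pd.date_range(X_pred.index.min(), X_pred.index.max())  # daily frequency by default
# how="left" keeps every date of the full range; the default inner join would drop the gap dates again.
preds = model.predict(pd.DataFrame(index=full_ind).merge(X_pred, how="left", left_index=True, right_index=True))
return preds.loc[X_pred.index.to_list()]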
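
As a self-contained illustration of the post-filtering idea, here is a minimal sketch. `NaiveDailyModel` and `predict_with_gaps` are hypothetical names made up for this example and are not part of HCrystalBall; daily frequency is also an assumption.

```python
import pandas as pd


class NaiveDailyModel:
    """Hypothetical stand-in for a model that only predicts contiguous daily ranges."""

    def predict(self, X: pd.DataFrame) -> pd.DataFrame:
        # Toy forecast: the day of the month, so the result is easy to check.
        return pd.DataFrame({"prediction": X.index.day.to_numpy()}, index=X.index)


def predict_with_gaps(model, X_pred: pd.DataFrame) -> pd.DataFrame:
    """Predict on the full daily range, then keep only the requested dates."""
    full_ind = pd.date_range(X_pred.index.min(), X_pred.index.max(), freq="D")
    # reindex keeps the columns of X_pred and fills the added dates with NaN.
    X_full = X_pred.reindex(full_ind)
    preds = model.predict(X_full)
    return preds.loc[X_pred.index]


X_pred = pd.DataFrame(index=pd.DatetimeIndex(["2020-01-01", "2020-01-03", "2020-01-07"]))
preds = predict_with_gaps(NaiveDailyModel(), X_pred)
```

The model still computes all seven days here, which is the "wasted" computation this approach trades for a single implementation.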

SklearnWrapper might just pick the exact steps (and thus support a list of concrete lags), which is especially handy when optimize_for_horizon is set to True.

Describe alternatives you've considered
Make it uniform for all models: just one implementation, at the cost of "wasting" some computational time.

Additional context
Request partially raised in order to be compliant with sktime, but we could also discuss its usefulness in general.


MichalChromcak · 2020-11-15T12:41:07Z

@pavelkrizek Could you provide your opinion on the feature? The result of our discussion will determine the next steps for HCrystalBallForecaster in sktime, implemented in this PR.

pavelkrizek · 2020-11-16T18:42:12Z

@MichalChromcak Is this necessary for the sktime wrapper? In principle, somebody who cares only about a particular horizon could achieve the same thing by passing a custom metric to CV that, e.g., weighs the error terms exponentially based on time. That way they get the best model for that particular horizon and can filter the results themselves.
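
A custom metric of that kind could look roughly like the sketch below. The function name, the `target_step` and `decay` parameters, and the choice of a weighted MAE are all assumptions made for illustration; how such a metric is plugged into the CV is not shown here.

```python
import numpy as np


def horizon_weighted_mae(y_true, y_pred, target_step, decay=1.0):
    """MAE whose weights decay exponentially with distance from target_step (0-based)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    steps = np.arange(len(y_true))
    # Weight 1.0 at the step of interest, shrinking exponentially for steps further away.
    weights = np.exp(-decay * np.abs(steps - target_step))
    return float(np.sum(weights * np.abs(y_true - y_pred)) / np.sum(weights))
```

With a large `decay`, model selection is driven almost entirely by the error at `target_step`; with `decay=0` the metric reduces to the plain MAE.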

I also see bigger changes needed than just filtering out the index in the predict method. To make it useful, we would also need to change things in cross-validation: e.g. our Splitter currently does not support returning just one sample at an arbitrary horizon (horizon 10 gives you 10 samples for predict), and our CV plotting will probably not be very meaningful with these settings. So a better starting point would be a clearer picture of how many things need to change to make this usable (I would propose changing one wrapper and trying to use it in different contexts to see what breaks...). Some scouting of methods that already support this out of the box would also be useful, to see whether there are performance benefits in passing the filtered data to predict compared with just post-filtering, and to decide whether one uniform solution "fits all".

MichalChromcak · 2020-11-17T15:21:11Z

@pavelkrizek At least in the test suite, there are some cases where data is passed with indices that include gaps. We might raise NotImplementedError there, or maybe better directly in HCrystalBall, if such things occur.
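
Such a guard could be sketched as follows. `check_continuous_index` is a hypothetical helper, not an existing HCrystalBall or sktime function, and it assumes a non-empty index with a known frequency (daily by default).

```python
import pandas as pd


def check_continuous_index(index: pd.DatetimeIndex, freq: str = "D") -> None:
    """Raise NotImplementedError if `index` skips any period of `freq`."""
    # The gap-free range spanning the same first and last timestamps.
    expected = pd.date_range(index.min(), index.max(), freq=freq)
    if not index.sort_values().equals(expected):
        raise NotImplementedError(
            "Predicting on an index with gaps is not supported."
        )
```

Calling it at the start of predict would turn the gap cases from the test suite into an explicit, easy-to-skip failure.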

@mloning Would it be acceptable for sktime to raise NotImplementedError in such a case?

mloning · 2020-11-18T18:13:35Z

Hi @MichalChromcak, yes, that would be okay. We may have to ignore any failing unit tests in sktime for the HCrystalBall wrapper.

MichalChromcak added the enhancement New feature or request label Nov 15, 2020

MichalChromcak closed this as completed Dec 29, 2020
