Fix ARIMA not accounting for gap in prediction from end of training data #3884
Codecov Report
@@ Coverage Diff @@
## main #3884 +/- ##
=======================================
+ Coverage 99.7% 99.7% +0.1%
=======================================
Files 346 346
Lines 36304 36325 +21
=======================================
+ Hits 36167 36188 +21
Misses 137 137
@@ -142,7 +142,25 @@ def _match_indices(self, X, y):
    def _set_forecast(self, X):
        from sktime.forecasting.base import ForecastingHorizon

        fh_ = ForecastingHorizon([i + 1 for i in range(len(X))], is_relative=True)
        # we can only calculate the difference if the indices are of the same type
Set this as a fallback to the previous behavior in case we ever receive inconsistent indices.
Is the index type being different indicative of something?
It should only be the result of inconsistent user behavior, and this is a safeguard against that!
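The safeguard discussed above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `choose_horizon` and `fallback_horizon` are hypothetical helper names, the indices are plain Python values rather than pandas indices, and date indices are assumed to be spaced one day apart.

```python
from datetime import date


def fallback_horizon(n_rows):
    # Previous behavior: assume prediction starts right after training ends.
    return [i + 1 for i in range(n_rows)]


def choose_horizon(last_train_index, predict_index):
    """Hypothetical helper: build a relative horizon, falling back on type mismatch."""
    first = predict_index[0]
    # The difference is only meaningful if both index values share a type.
    if type(last_train_index) is not type(first):
        return fallback_horizon(len(predict_index))
    gap = first - last_train_index
    # Assumption: date indices are daily, so the gap in days is the step count.
    units_diff = gap.days if hasattr(gap, "days") else int(gap)
    return [units_diff + i for i in range(len(predict_index))]


mismatched = choose_horizon(date(2021, 1, 10), [0, 1, 2])
gapped = choose_horizon(date(2021, 1, 10), [date(2021, 1, 15), date(2021, 1, 16)])
```

With mismatched index types the sketch silently returns the old contiguous horizon, which matches the "fallback to previous behavior" intent described in the thread.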
Looks pretty good, just had a few questions! Thanks for taking care of this.
units_diff = len(dates_diff) - 1
fh_ = ForecastingHorizon(
    [units_diff + i for i in range(len(X))],
Seems like we needed an off-by-one offset in the non-gap case. Should that still be the case here as well?
We need the + 1 since range(len(X)) starts at 0. In the gap case units_diff should always be > 0, so we don't need it!
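The off-by-one reasoning can be checked with plain lists. The sketch below uses two hypothetical helpers, `contiguous_horizon` and `gap_horizon`, that mirror the two list comprehensions in the diff. It assumes `units_diff` counts the steps from the last training index to the first prediction index, so a value of 1 means no gap.

```python
def contiguous_horizon(n_rows):
    # Old formula: range() starts at 0, so + 1 makes the first step 1.
    return [i + 1 for i in range(n_rows)]


def gap_horizon(units_diff, n_rows):
    # New formula: units_diff is already >= 1, so no extra offset is needed.
    return [units_diff + i for i in range(n_rows)]


no_gap = gap_horizon(1, 3)    # prediction starts one step after training
with_gap = gap_horizon(5, 3)  # prediction starts five steps after training
```

When `units_diff` is 1 the two formulas produce the same horizon, so the new expression covers the non-gap case without a separate `+ 1`.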
@@ -139,6 +140,7 @@ def test_set_forecast(ts_data):
    )

    clf = ARIMARegressor()
    clf.last_X_index = X.index[-1]
Why was this necessary?
We never ran fit in this test, so last_X_index wasn't being set!
@pytest.mark.parametrize("use_covariates", [True, False])
def test_arima_regressor_can_forecast_arbitrary_dates(use_covariates, ts_data):
It might be useful here to highlight how this is arbitrary dates in a comment or something - how long is X_test here, so how big of a gap is it between the training data and what we're asking to predict on?
will add!
    assert (
        arima.predict(X_test).tail(5).tolist() == arima.predict(X_test_last_5).tolist()
    )
Does this fail when run on current main?
yes!
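To see why the assertion would fail without gap handling, here is a toy sketch. It does not use ARIMA or the project's classes: `toy_forecast` is a made-up stand-in whose output depends only on how many steps ahead it is asked to predict, and the test set is assumed to be 10 rows that immediately follow the training data.

```python
def toy_forecast(horizon):
    # Stand-in for a fitted model: the value depends only on the step number.
    return [100.0 + 2.0 * step for step in horizon]


def naive_horizon(n_rows):
    # Behavior before the fix: always start at step 1.
    return [i + 1 for i in range(n_rows)]


def gap_aware_horizon(units_diff, n_rows):
    # Behavior after the fix: start at the true distance from training.
    return [units_diff + i for i in range(n_rows)]


# Full test set: 10 rows starting one step after training.
# Last-5 subset: 5 rows starting six steps after training.
full_old = toy_forecast(naive_horizon(10))
last5_old = toy_forecast(naive_horizon(5))
full_new = toy_forecast(gap_aware_horizon(1, 10))
last5_new = toy_forecast(gap_aware_horizon(6, 5))
```

With the naive horizon, the last-5 subset is forecast as steps 1 to 5 instead of 6 to 10, so it disagrees with the tail of the full prediction. The gap-aware horizon makes the two agree.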
Just some small things and questions. Little nit to approve.
Fixes #3853.