ENH: support some integer indexes in tsa models #8488
@ChadFulton - Thanks for the feedback! I changed the code based on your recommendation in #8487 and added a warning and a test.
Thanks for updating this PR! It looks good to me. I just left a couple of easy comments.
# Fit the forecaster and ensure that the right warning is raised.
with pytest.warns(ValueWarning, match=warning_text):
    forecaster = sarimax.SARIMAX(endog=y, k_factors=1, factor_order=1)
`k_factors` and `factor_order` aren't arguments of `SARIMAX`. I think you want `order`?
Haha yes, I think I was using UnobservedComponents or DynamicFactor in the notebook where I was testing things. Fixed now.
# Fit the forecaster and ensure that the right warning is raised.
with pytest.warns(ValueWarning, match=warning_text):
    forecaster = sarimax.SARIMAX(endog=y, k_factors=1, factor_order=1)
    forecaster = forecaster.fit()
Using `fit` can be problematic in tests (due to numerical instability across platforms or due to the time it takes to perform the fitting), so it's better to just use a fixed set of parameters, e.g. something like:
forecaster = sarimax.SARIMAX(endog=y, order=(1, 0, 0))
forecaster = forecaster.smooth([0.5, 1.0])
statsmodels/tsa/base/tsa_model.py
_index = RangeIndex(index[0], index[-1] + 1)
if _index.equals(index):
    index = _index
    _index_generated = True
Due to the way the `_index_generated` flag is used in the rest of the codebase, we should keep it as `False` even though we are replacing the integer Index with a RangeIndex here.
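The equivalence check in the diff above can be sketched in isolation as follows. The helper name `as_range_index` is hypothetical and this is not the code in `tsa_model.py`; it only illustrates how an integer index with step 1 can be swapped for an equal `RangeIndex`, leaving any flag handling to the caller.

```python
import pandas as pd


def as_range_index(index):
    """Return an equal RangeIndex if `index` holds consecutive integers.

    Hypothetical helper for illustration; otherwise `index` is returned as-is.
    """
    if len(index) == 0 or not pd.api.types.is_integer_dtype(index):
        return index
    # Candidate range spanning the first to the last label with step 1.
    candidate = pd.RangeIndex(index[0], index[-1] + 1)
    # Only swap when the candidate matches the original labels exactly.
    return candidate if candidate.equals(index) else index


consecutive = as_range_index(pd.Index([5, 6, 7, 8]))
gapped_input = pd.Index([0, 2, 4])
gapped = as_range_index(gapped_input)
```

An index with gaps or in descending order fails the `equals` check and is passed through unchanged.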
@ChadFulton - Thanks for the review! I have addressed the feedback now. One point that I'm not sure you would agree with is that I now split the unsupported (but valid) indexes into 2 lists, where one holds the integer indexes (with step 1) not starting from zero, to allow for testing the different warning messages.
There's one failing test in the CI: statsmodels/tsa/statespace/tests/test_dynamic_factor.py::TestDynamicFactor_ar2_errors::test_mle
That failure is pretty common on CI/Python 3.9 now. Need to fix before release, or skip.
Notes:
The PR is fixing the issue and not breaking any tests in statsmodels/tsa/statespace/tests/test_simulate.py.
I will also add a test for this edge case before making this ready for review.