Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: support some integer indexes in tsa models #8488

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

ltsaprounis
Copy link

@ltsaprounis ltsaprounis commented Oct 31, 2022

Notes:

The PR is fixing the issue and not breaking any tests in statsmodels/tsa/statespace/tests/test_simulate.py.
I will also add a test for this edge case before making this ready for review.

@ltsaprounis ltsaprounis marked this pull request as ready for review November 6, 2022 19:54
@ltsaprounis
Copy link
Author

@ChadFulton - Thanks for the feedback! I changed the code based on your recommendation in #8487 and added a warning and a test.

Copy link
Member

@ChadFulton ChadFulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating this PR! It looks good to me. I just left a couple of easy comments.


# Fit the forecaster and ensure that the right warning is raised.
with pytest.warns(ValueWarning, match=warning_text):
forecaster = sarimax.SARIMAX(endog=y, k_factors=1, factor_order=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k_factors and factor_order aren't arguments of SARIMAX. I think you want order?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha yes - I think I was using UnobservedComponents or DynamicFactor in the notebook I was testing things.
fixed now.

# Fit the forecaster and ensure that the right warning is raised.
with pytest.warns(ValueWarning, match=warning_text):
forecaster = sarimax.SARIMAX(endog=y, k_factors=1, factor_order=1)
forecaster = forecaster.fit()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using fit can be problematic in tests (due to numerical instability across platforms or due to the time it takes to perform the fitting), so it's better to just use a fixed set of parameters, e.g. something like:

forecaster = sarimax.SARIMAX(endog=y, order=(1, 0, 0))
forecaster = forecaster.smooth([0.5, 1.0])

_index = RangeIndex(index[0], index[-1] + 1)
if _index.equals(index):
index = _index
_index_generated = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the way the _index_generated flag is used in the rest of the codebase, we should keep it as False even though we are replacing the integer Index with a RangeIndex here.

@ChadFulton ChadFulton changed the title Simulate integer index fix ENH: support some integer indexes in tsa models Nov 11, 2022
@ltsaprounis
Copy link
Author

ltsaprounis commented Nov 20, 2022

@ChadFulton - Thanks for the review!

I have addressed the feedback now. One point that I'm not sure if you would agree with is that I now split the unsupported (but valid) indexes into 2 lists where one is the integer index (with step 1) not starting from zero to allow for testing the different warning messages and index_generated flags. The other option I can think of is to add this type of index into the supported ones now.

There's one failing test in the CI: statsmodels/tsa/statespace/tests/test_dynamic_factor.py::TestDynamicFactor_ar2_errors::test_mle
The test is failing on a tolerance issue that seems unrelated to this PR, but I'm not sure!
I can't reproduce the failure on my machine.
image

@bashtage
Copy link
Member

bashtage commented Dec 8, 2022

That failure is pretty common on CI/Python 3.9 now. Need to fix before release, or skip.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: time index from simulate method not correct for integer index not starting from zero
3 participants