
ENH: Improvements to new ARIMA-type estimators #6159

Open
4 of 15 tasks
ChadFulton opened this issue Sep 11, 2019 · 12 comments

@ChadFulton
Member

ChadFulton commented Sep 11, 2019

Collection of follow-ups to #5827. These can/should be broken out into individual PRs. Many are relatively straightforward and would make good first PRs.

General

  • Documentation (none was added in the original PR).
  • Release notes.
  • Example notebook.
  • Double-check how sm.tsa.arima.ARIMA works with fix_params (it should fail except when the fit method is statespace).
  • Estimators that do not support seasonal models per se should support models where the only seasonal part is seasonal differencing (a small sketch of the kind of specification this targets follows this list).
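
For context, here is a minimal sketch of the kind of specification that last item targets; the simulated series is purely illustrative and not part of the original issue:

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Illustrative series; only the specification matters here
endog = np.random.normal(size=120)

# The only seasonal component is a seasonal difference (D=1, no seasonal AR/MA terms)
mod = ARIMA(endog, order=(1, 1, 1), seasonal_order=(0, 1, 0, 12))

# The state space method already handles this; the enhancement is for the other
# estimators (e.g. method='hannan_rissanen') to accept it too, since the seasonal
# differencing can simply be applied to the series up front.
res = mod.fit(method='statespace')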

GLS

  • Add support for fixed parameters
  • Improve "Returns" documentation for other_results.
  • Add documentation for why we have e.g. include_constant but not other trend specifications (e.g. that it is to maintain consistency with estimation methods whose assumptions require demeaned series).
  • Fix the following test and put it back into the GLS test suite:
@pytest.mark.low_precision('Test against Example 6.6.3 in Brockwell and Davis'
                           ' (2016)')
# @pytest.mark.xfail(reason="Source appears to find suboptimal parameters")
def test_brockwell_davis_example_663():
    # TODO: the parameters described by BD appear to be suboptimal (based on
    # llf computed from state space form), so that this test fails. Should try
    # to confirm with ITSM2000 (i.e. see if we can get it to find better
    # parameters closer to what we find, or compare llf, or something).
    # TODO: quite a slow test, and xfail anyway due to finding better
    # parameters...
    # Get the data, perform seasonal differencing
    endog = sbl.diff(12).iloc[12:]

    exog = pd.Series((sbl.index > '1983-01-01').astype(int),
                     index=sbl.index).diff(12).iloc[12:]

    res, _ = gls(endog, exog, order=(0, 0, 12), max_iter=3)

    assert_allclose(res.exog_params, -328.45, atol=1e-2)
    assert_allclose(res.ma_params,
                    [.219, .098, .031, .064, .069, .111, .081,
                     .057, .092, -0.28, .183, -.672], atol=1e-3)
    assert_allclose(res.sigma2, 12581, atol=1)

Hannan-Rissanen

  • Better warnings / errors when series are short relative to lag length
  • Add support for fixed parameters
  • Seems like we could add support for seasonal parameters in this model.
  • Tests for the case with the bias-corrected estimator.

Innovations MLE

  • Add support for fixed parameters

Innovations algorithm

  • Add support for ARMA models; see Brockwell and Davis (2016) p.154 and Example 5.1.6. This estimator should be feasible given the Cython versions of the general innovations algorithm that were introduced in PERF: Cythonize innovations algo and filter #5947.
@rajathpatel23

@ChadFulton I would like to work on this enhancement, and would appreciate some guidance on getting started.

Thank you

@ChadFulton
Member Author

@rajathpatel23 that would be great, thanks!

There are a number of things here that I think should be pretty self-contained, including:

  • Example notebooks: it would be great to add a notebook showing how to use the new models.
    • There are lots of options here. One would be to replicate some of the "Examples" sections from Brockwell and Davis' book.
  • Tests: we could use more unit tests in general, and there are two specific issues I called out above:
    • GLS test: the test I put in the issue, above, fails because we find better parameters than are reported in Brockwell and Davis' book. Some more investigation on this case would be useful, especially comparing against their ITSM2000 program (can be found at http://extras.springer.com/2002/978-0-387-21657-7/ITSM2000, only runs on Windows).
    • We don't have any tests for the HR estimator with the bias correction, because I couldn't find any packages that implemented it. However, I notice that Gomez and Maravall (e.g. in their chapter "Automatic Modeling Methods for Univariate Series") have also implemented this bias correction in TRAMO/SEATS, so I was thinking that it is probably available in X13-ARIMA/SEATS, and we might be able to test against that package somehow.
    • In general, it would be great to add more unit tests for any of the methods against ITSM2000 or X13-ARIMA/SEATS.
  • Support for fixed parameters: GLS, innovations MLE, and Hannan-Rissanen should each be able to support fixed parameters. If you wanted to try this, I would suggest picking just one to start with.

If you have a particular interest, I can go into more details.

@emilmirzayev
Contributor

Hi Chad,

How can one add example notebooks?
Which directory should I put them in?

Thanks,

@bashtage
Member

Three steps to add a notebook:

  1. Add your notebook to examples/notebooks
  2. Add a block to the json file that contains the metadata for notebooks.
  3. Add a nice screenshot for your notebook that makes it look interesting in docs/_static/images

I like the format that @ChadFulton uses with the title. I used it here: https://github.com/statsmodels/statsmodels/blob/master/docs/source/_static/images/rolling_ls.png

@bashtage
Member

NB: The json file determines where the example notebook appears in the docs, so please add in the correct section.

@ChadFulton ChadFulton modified the milestones: Someday, 0.11 Oct 6, 2019
@bashtage bashtage modified the milestones: 0.11, 0.12 Jan 24, 2020
@ChadFulton ChadFulton modified the milestones: 0.12, 0.13 Oct 27, 2020
@madhushree14

Hi Chad,
I would like to pick up the "Support for fixed parameters" item. Would you please share some more details on this enhancement?

@ChadFulton
Member Author

That would be much appreciated, thanks!

First, a little background (maybe you already know this, but just in case you haven't run into this feature before). In our models, all of the parameters are typically estimated by maximum likelihood. But in some cases, you might want to "fix" one of the parameters to a particular value, and then estimate the other parameters by maximum likelihood. A simple example is:

import numpy as np
import statsmodels.api as sm

endog = np.random.normal(size=100)

# AR(1) model with an intercept
mod = sm.tsa.SARIMAX(endog, order=(1, 0, 0), trend='c')

# Suppose we want to estimate the AR(1) coefficient, but fix the intercept at 0.5
with mod.fix_params({'intercept': 0.5}):
    res = mod.fit()
print(res.summary())

This results in:

                               SARIMAX Results                                
==============================================================================
Dep. Variable:                      y   No. Observations:                  100
Model:               SARIMAX(1, 0, 0)   Log Likelihood                -146.547
Date:                Thu, 10 Dec 2020   AIC                            297.094
Time:                        19:44:35   BIC                            302.305
Sample:                             0   HQIC                           299.203
                                - 100                                         
Covariance Type:                  opg                                         
=====================================================================================
                        coef    std err          z      P>|z|      [0.025      0.975]
-------------------------------------------------------------------------------------
intercept (fixed)     0.5000        nan        nan        nan         nan         nan
ar.L1                -0.1595      0.096     -1.655      0.098      -0.348       0.029
sigma2                1.0972      0.144      7.631      0.000       0.815       1.379
===================================================================================
Ljung-Box (L1) (Q):                   0.13   Jarque-Bera (JB):                 2.30
Prob(Q):                              0.72   Prob(JB):                         0.32
Heteroskedasticity (H):               1.26   Skew:                            -0.37
Prob(H) (two-sided):                  0.50   Kurtosis:                         2.97
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

A more complete set of examples can be found in this notebook: https://www.statsmodels.org/stable/examples/notebooks/generated/statespace_fixed_params.html.

This feature is available in state space models (and also in the new ETSModel class), but it's not yet been implemented for all of the new ARIMA-type estimators referenced here (including Innovations MLE, Hannan-Rissanen, and GLS).

I think the easiest place to get started would be adding fixed parameters to the Hannan-Rissanen estimator, which is the hannan_rissanen function in the file statsmodels/tsa/arima/estimators/hannan_rissanen.py. The basic idea here is that ultimately the parameters are estimated by least squares (see lines 121 and 141), and so fixing certain parameter values is straightforward.

To see why it is straightforward with OLS, consider the following regression equation:

y = a + b x_1 + c x_2 + e

If we want to fix b = 2, then the equation is y = a + 2 x_1 + c x_2 + e, or we can rewrite it as a different regression

z = a + c x_2 + e

where we have created the new variable z according to z = y - 2 x_1.
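
To make the algebra concrete, here is a minimal sketch of the same trick using statsmodels' OLS; the simulated data and variable names are just for illustration:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)

# Fix b = 2 by moving the fixed term to the left-hand side...
z = y - 2.0 * x1

# ...and regress the adjusted series on the remaining variables only
res = sm.OLS(z, sm.add_constant(x2)).fit()
print(res.params)  # estimates of a and c, with b held fixed at 2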

So basically, what I suggest is the following:

  1. Create a new GitHub issue (e.g. ENH: Fixed parameters in Hannan-Rissanen) as a more convenient location for discussion and advice going forward.

  2. Add a new argument fixed_params to hannan_rissanen, which should accept a dictionary with parameter names as keys and the fixed numbers as values.

  3. In each case (lines 121 and 141), if there are fixed parameters, you need to remove the columns that correspond to those parameters from the exog arrays being passed to OLS (e.g. in line 121, the lagged_endog variable is passed as the exog argument), multiply them by the given fixed parameter values, and subtract that from the endogenous variable to create a new one (i.e. like the z I mentioned above; a rough sketch of this transformation is included below).

Also, for the first attempt, I would just raise a NotImplementedError if unbiased=True (since I haven't really thought about fixed parameter values in this case).
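
For what it's worth, here is a rough, hypothetical sketch of the transformation described in step 3; the helper name and the idea of passing the parameter names alongside the design matrix are my own assumptions, not the existing hannan_rissanen code:

import numpy as np

def apply_fixed_params(endog, exog, param_names, fixed_params):
    # Hypothetical helper: remove the columns of `exog` that correspond to
    # fixed parameters and fold their contribution into a new endogenous
    # variable (i.e. the z = y - 2 x_1 trick above).
    # `param_names` gives the parameter name for each column of `exog`;
    # `fixed_params` maps a subset of those names to fixed values.
    endog = np.asarray(endog, dtype=float).copy()
    exog = np.asarray(exog, dtype=float)

    keep = []
    for i, name in enumerate(param_names):
        if name in fixed_params:
            endog = endog - fixed_params[name] * exog[:, i]
        else:
            keep.append(i)

    free_names = [param_names[i] for i in keep]
    return endog, exog[:, keep], free_names

The OLS step would then run on the returned endog/exog, and the fixed values would be re-inserted into the full parameter vector afterwards.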

Thanks!

@madhushree14

Thank you Chad for this nice illustration. I will start working accordingly.

@madhushree14

Hi Chad, could you please help me understand how to identify the corresponding column when removing a parameter from the 2-d array?

@jackzyliu
Contributor

Hi @ChadFulton, first time contributing here. I would like to help out if this issue is still open and active.

I can pick up where @madhushree14 left off on #7202, or I can go on to work on GLS. Either way, I'd be curious to investigate the result difference on Brockwell and Davis example 6.6.3 next.

@ChadFulton
Member Author

Hi @jackzyliu, thanks, that would be much appreciated!

I think we need fixed parameters for HR and/or innovations MLE before we can support fixed parameters for GLS, so I will ping #7202 / #7355 to see what the status is.

In the meantime, if you have time/interest to take a look at the implementation of Hannan-Rissanen, that would be a great way to get into things here.

Thanks again!

@jtimko16

jtimko16 commented Oct 9, 2022

Hello,

I am looking for a good issue for my very first contribution to any Python module. As I have used ARIMA several times before and have a background in econometrics, this could be a good choice.

Is there any open part of this task where I could contribute? Thanks for any guidance.
