arma with exog #14

Closed
wesm opened this Issue Aug 25, 2011 · 9 comments

Projects

None yet

3 participants

@wesm
Member
wesm commented Aug 25, 2011

Original Launchpad bug 800011: https://bugs.launchpad.net/statsmodels/+bug/800011
Reported by: josef-pktd (joep).

is this a bug?

import numpy as np
import scikits.statsmodels.api as sm

exog = sm.add_constant(np.random.randn(100, 2), prepend=True)
endog = exog.sum(1) + 0.2 * np.random.randn(100)
modarma = sm.tsa.ARMA(endog, exog)
resarma = modarma.fit(order=(1,1))

Optimization terminated successfully.
         Current function value: 5.533772
         Iterations 12
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sm_overview.py", line 36, in <module>
    resarma = modarma.fit(order=(1,1))
  File "E:\Josef\eclipsegworkspace\statsmodels-git\statsmodels-josef\scikits\statsmodels\tsa\arima_model.py", line 359, in fit
    start_params = self._fit_start_params((k_ar,k_ma,k))
  File "E:\Josef\eclipsegworkspace\statsmodels-git\statsmodels-josef\scikits\statsmodels\tsa\arima_model.py", line 76, in _fit_start_params
    start_params[:k] = ols_params
ValueError: shape mismatch: objects cannot be broadcast to a single shape
@wesm
Member
wesm commented Aug 25, 2011

[ LP comment 1 by: Skipper Seabold, on 2011-06-21 02:46:19.321966+00:00 ]

There is already a constant in exog, and then you try to add one with trend = 'c' in fit. The way, add_trend/add_constant are written, they will not add another one, but the model sets k_trend = 1. I guess the error message could be more informative. Maybe the helper functions should raise a warning or an error, "X already contains a constant"?

But I get a singular matrix error from the optimization if I try to fit it with trend='nc'. This could be the case when the model is badly mis-specified rather than a bug, but I'm not sure.

@wesm
Member
wesm commented Aug 25, 2011

[ LP comment 2 by: Skipper Seabold, on 2011-06-21 02:49:59.614825+00:00 ]

The docstring of add_trend says that it doesn't check for this, but it does check for an existing constant if trend=='c'. I guess this should be made more robust. A check for a trend would be something like

if np.any(np.all(np.diff(X,1,0)==1,0)), then has_trend.

@wesm
Member
wesm commented Aug 25, 2011

[ LP comment 3 by: Skipper Seabold, on 2011-06-21 02:51:47.696926+00:00 ]

Yeah, it just doesn't like the mis-specified model. Maybe we could catch these LinAlgErrors and warn that the model is probably mis-specified?

[~/]
[36]: resarma = modarma.fit(order=(1,0),trend='nc', disp=0)                                          
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            4     M =           12
 This problem is unconstrained.

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N   Tit  Tnf  Tnint  Skip  Nact     Projg        F
    4   14   18      1     0     0   1.137D-05  -2.906D+01

CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH             

 Total User time 1.500E-01 seconds.


[~/]
[37]: resarma.params
[37]: array([ 0.98864285,  1.00004556,  1.00576764, -0.0356923 ])
@wesm
Member
wesm commented Aug 25, 2011

[ LP comment 4 by: joep, on 2011-06-21 03:21:28.247903+00:00 ]

I was just checking if I can run all models in the same or similar way

modols = sm.OLS(endog, exog)
modglsar = sm.GLSAR(endog, exog, rho=2)
modrlm = sm.RLM(endog, exog)
modlog = sm.Logit(endogbin, exog)
modglm = sm.GLM(endogbin, exog, family=sm.families.Binomial())
modar = sm.tsa.AR(endog)
modarma = sm.tsa.ARMA(endog)
modvar = sm.tsa.VAR(np.column_stack((endog, exog[:,1:])))

resols = modols.fit()
resglsar = modglsar.fit()
resrlm = modrlm.fit()
reslog = modlog.fit()
resglm = modglm.fit()
resar = modar.fit()
resarma = modarma.fit(order=(1,0))
resvar = modvar.fit()

ARMA is the only one without a default for everything in fit() but GenericLikelihood might also require something

I haven't looked at any details

instead of if np.any(np.all(np.diff(X,1,0)==1,0)), then has_trend.
maybe a check that the np.diff(X,1,0).var() < epsilon

for polynomial trends, I switched to rescaling trend to [-1, 1] or something like this, trend = np.linspace(-1,1,nobs)

@josef-pkt
Member

setting to prio-high as reminder to check whether this is still relevant

@jseabold
Member
jseabold commented May 1, 2013

Should we do something about this or just let the user figure out that they shouldn't add a constant twice?

@josef-pkt
Member

updated the example, still the original issue

I don't know why the constant is not treated as any other exog.
If a constant is detected already, then we should raise a ValueError

>>> exog = sm.add_constant(np.random.randn(100, 2), prepend=True)
>>> endog = exog.sum(1) + 0.2 * np.random.randn(100)
>>> modarma = sm.tsa.ARMA(endog, exog=exog, order=(1,1))
>>> resarma = modarma.fit()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new2\statsmodels\statsmodels\tsa\arima_model.py", line 828, in fit
    start_params = self._fit_start_params((k_ar,k_ma,k), method)
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new2\statsmodels\statsmodels\tsa\arima_model.py", line 453, in _fit_start_params
    start_params = self._fit_start_params_hr(order)
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new2\statsmodels\statsmodels\tsa\arima_model.py", line 395, in _fit_start_params_hr
    start_params[:k] = ols_params

ValueError: shape mismatch: objects cannot be broadcast to a single shape
>>> modarma = sm.tsa.ARMA(endog, exog=exog[:, 1:], order=(1,1))
>>> resarma = modarma.fit()
@josef-pkt
Member

removing prio-high tag again

@jseabold
Member
jseabold commented Feb 6, 2014

I'm going to close this. Issue is just that exog shouldn't contain a constant. This is specified in the documentation.

@jseabold jseabold closed this Feb 6, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment