Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arma with exog #14

Closed
wesm opened this issue Aug 25, 2011 · 9 comments

Comments

Projects
None yet
3 participants
@wesm
Copy link
Member

commented Aug 25, 2011

Original Launchpad bug 800011: https://bugs.launchpad.net/statsmodels/+bug/800011
Reported by: josef-pktd (joep).

is this a bug?

import numpy as np
import scikits.statsmodels.api as sm

exog = sm.add_constant(np.random.randn(100, 2), prepend=True)
endog = exog.sum(1) + 0.2 * np.random.randn(100)
modarma = sm.tsa.ARMA(endog, exog)
resarma = modarma.fit(order=(1,1))

Optimization terminated successfully.
         Current function value: 5.533772
         Iterations 12
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sm_overview.py", line 36, in <module>
    resarma = modarma.fit(order=(1,1))
  File "E:\Josef\eclipsegworkspace\statsmodels-git\statsmodels-josef\scikits\statsmodels\tsa\arima_model.py", line 359, in fit
    start_params = self._fit_start_params((k_ar,k_ma,k))
  File "E:\Josef\eclipsegworkspace\statsmodels-git\statsmodels-josef\scikits\statsmodels\tsa\arima_model.py", line 76, in _fit_start_params
    start_params[:k] = ols_params
ValueError: shape mismatch: objects cannot be broadcast to a single shape
@wesm

This comment has been minimized.

Copy link
Member Author

commented Aug 25, 2011

[ LP comment 1 by: Skipper Seabold, on 2011-06-21 02:46:19.321966+00:00 ]

There is already a constant in exog, and then you try to add one with trend = 'c' in fit. The way, add_trend/add_constant are written, they will not add another one, but the model sets k_trend = 1. I guess the error message could be more informative. Maybe the helper functions should raise a warning or an error, "X already contains a constant"?

But I get a singular matrix error from the optimization if I try to fit it with trend='nc'. This could be the case when the model is badly mis-specified rather than a bug, but I'm not sure.

@wesm

This comment has been minimized.

Copy link
Member Author

commented Aug 25, 2011

[ LP comment 2 by: Skipper Seabold, on 2011-06-21 02:49:59.614825+00:00 ]

The docstring of add_trend says that it doesn't check for this, but it does check for an existing constant if trend=='c'. I guess this should be made more robust. A check for a trend would be something like

if np.any(np.all(np.diff(X,1,0)==1,0)), then has_trend.

@wesm

This comment has been minimized.

Copy link
Member Author

commented Aug 25, 2011

[ LP comment 3 by: Skipper Seabold, on 2011-06-21 02:51:47.696926+00:00 ]

Yeah, it just doesn't like the mis-specified model. Maybe we could catch these LinAlgErrors and warn that the model is probably mis-specified?

[~/]
[36]: resarma = modarma.fit(order=(1,0),trend='nc', disp=0)                                          
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            4     M =           12
 This problem is unconstrained.

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N   Tit  Tnf  Tnint  Skip  Nact     Projg        F
    4   14   18      1     0     0   1.137D-05  -2.906D+01

CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH             

 Total User time 1.500E-01 seconds.


[~/]
[37]: resarma.params
[37]: array([ 0.98864285,  1.00004556,  1.00576764, -0.0356923 ])
@wesm

This comment has been minimized.

Copy link
Member Author

commented Aug 25, 2011

[ LP comment 4 by: joep, on 2011-06-21 03:21:28.247903+00:00 ]

I was just checking if I can run all models in the same or similar way

modols = sm.OLS(endog, exog)
modglsar = sm.GLSAR(endog, exog, rho=2)
modrlm = sm.RLM(endog, exog)
modlog = sm.Logit(endogbin, exog)
modglm = sm.GLM(endogbin, exog, family=sm.families.Binomial())
modar = sm.tsa.AR(endog)
modarma = sm.tsa.ARMA(endog)
modvar = sm.tsa.VAR(np.column_stack((endog, exog[:,1:])))

resols = modols.fit()
resglsar = modglsar.fit()
resrlm = modrlm.fit()
reslog = modlog.fit()
resglm = modglm.fit()
resar = modar.fit()
resarma = modarma.fit(order=(1,0))
resvar = modvar.fit()

ARMA is the only one without a default for everything in fit() but GenericLikelihood might also require something

I haven't looked at any details

instead of if np.any(np.all(np.diff(X,1,0)==1,0)), then has_trend.
maybe a check that the np.diff(X,1,0).var() < epsilon

for polynomial trends, I switched to rescaling trend to [-1, 1] or something like this, trend = np.linspace(-1,1,nobs)

@josef-pkt

This comment has been minimized.

Copy link
Member

commented Oct 6, 2011

setting to prio-high as reminder to check whether this is still relevant

@jseabold

This comment has been minimized.

Copy link
Member

commented May 1, 2013

Should we do something about this or just let the user figure out that they shouldn't add a constant twice?

@josef-pkt

This comment has been minimized.

Copy link
Member

commented Aug 15, 2013

updated the example, still the original issue

I don't know why the constant is not treated as any other exog.
If a constant is detected already, then we should raise a ValueError

>>> exog = sm.add_constant(np.random.randn(100, 2), prepend=True)
>>> endog = exog.sum(1) + 0.2 * np.random.randn(100)
>>> modarma = sm.tsa.ARMA(endog, exog=exog, order=(1,1))
>>> resarma = modarma.fit()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new2\statsmodels\statsmodels\tsa\arima_model.py", line 828, in fit
    start_params = self._fit_start_params((k_ar,k_ma,k), method)
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new2\statsmodels\statsmodels\tsa\arima_model.py", line 453, in _fit_start_params
    start_params = self._fit_start_params_hr(order)
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new2\statsmodels\statsmodels\tsa\arima_model.py", line 395, in _fit_start_params_hr
    start_params[:k] = ols_params

ValueError: shape mismatch: objects cannot be broadcast to a single shape
>>> modarma = sm.tsa.ARMA(endog, exog=exog[:, 1:], order=(1,1))
>>> resarma = modarma.fit()
@josef-pkt

This comment has been minimized.

Copy link
Member

commented Aug 15, 2013

removing prio-high tag again

@jseabold

This comment has been minimized.

Copy link
Member

commented Feb 6, 2014

I'm going to close this. Issue is just that exog shouldn't contain a constant. This is specified in the documentation.

@jseabold jseabold closed this Feb 6, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.