ARIMA - The computed initial AR coefficients are not stationary #1155

Closed
feathj opened this Issue Oct 25, 2013 · 10 comments

@feathj
feathj commented Oct 25, 2013

I hate to pepper this repo with "that's not what happens in R!" type issues, but I am converting an R prototype into Python, and that is what I am up against.

Given the following input set:
6.287,6.416,6.418,6.301,6.494,6.701,6.974,7.128,7.398,7.72,7.859,7.674,7.636,7.684,7.921,8.236,8.346,8.427,8.617,8.762,8.99,9.09,9.271,9.485,9.661,9.998,10.257,10.577,10.876,10.954,11.19,11.39,11.515

With order:
4,0,1

I am given the "computed initial AR coefficients are not stationary" error. For reference, R produces the following:

11.9341392,12.20679073,12.42277805,12.76083382,12.83398086,12.99070015,13.28943618,13.44647747,13.81164735,14.15876664

Any direction on this would be appreciated. Thanks!

@josef-pkt
Member

What are the parameters that R is estimating?

Similar case as in the other issue:

Your data is not stationary, so you need to decide what you want:

You could add a linear trend. (trend stationary)
you could use ARIMA with differencing (4, 1, 1) (difference stationary)

ARMA models are not designed for estimating non-stationary processes (not in Box-Jenkins); ARIMA models are.

@jseabold
Member

I also get a warning from R, so it looks like you're getting junk again. What does a unit root test tell you?

> y = c(6.287,6.416,6.418,6.301,6.494,6.701,6.974,7.128,7.398,7.72,7.859,7.674,7.636,7.684,7.921,8.236,8.346,8.427,8.617,8.762,8.99,9.09,9.271,9.485,9.661,9.998,10.257,10.577,10.876,10.954,11.19,11.39,11.515)
> mod <- arima(y, c(4,0,1))
Warning message:
In arima(y, c(4, 0, 1)) : possible convergence problem: optim gave code=1
@feathj
feathj commented Oct 25, 2013

You could add a linear trend. (trend stationary)
you could use ARIMA with differencing (4, 1, 1) (difference stationary)

@josef-pkt I tried (4, 1, 1) as well and I got the same result. Not sure what you mean by "adding a linear trend". My lack of understanding of this domain is really starting to show through :)

What does a unit root test tell you?

@jseabold You will have to bear with me, I am not sure what this means either. I don't have much of a stats background.

@jseabold
Member

The presence of a unit root suggests that you might need to difference your data to induce stationarity.

You might need to difference this twice. Even with ARIMA(4,1,1) the roots are still very close to the unit circle.

Where does this data come from? What's the generating process?
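
For readers unfamiliar with differencing, here is a minimal pure-Python sketch of what taking first and second differences of the series from the original post looks like. The helper name `difference` is made up for illustration and is not a statsmodels function. This is not a unit root test, only a look at the transformation itself.

```python
# Series from the original post.
y = [6.287, 6.416, 6.418, 6.301, 6.494, 6.701, 6.974, 7.128, 7.398, 7.72,
     7.859, 7.674, 7.636, 7.684, 7.921, 8.236, 8.346, 8.427, 8.617, 8.762,
     8.99, 9.09, 9.271, 9.485, 9.661, 9.998, 10.257, 10.577, 10.876, 10.954,
     11.19, 11.39, 11.515]


def difference(series, d=1):
    """Return the d-th difference of a sequence (hypothetical helper)."""
    out = list(series)
    for _ in range(d):
        # Each pass replaces the series with y_t - y_{t-1}.
        out = [b - a for a, b in zip(out, out[1:])]
    return out


dy = difference(y, 1)   # what the "1" in ARIMA(4, 1, 1) applies
d2y = difference(y, 2)  # differencing twice, i.e. d=2

# Each round of differencing shortens the series by one observation.
print(len(y), len(dy), len(d2y))
print("mean of first differences:", sum(dy) / len(dy))
print("mean of second differences:", sum(d2y) / len(d2y))
```

If the first differences still wander or trend when plotted, that is a hint that one round of differencing may not be enough.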

@josef-pkt
Member

@jseabold Is the mean in ARIMA the drift? After differencing, I think this should be stationary and work.

"adding a linear trend"

ARMA with trend assumes we have a deterministic trend plus an ARMA error y = a * t + u_t, u_t is ARMA

The fit method only allows trend to be a constant
http://statsmodels.sourceforge.net/devel/generated/statsmodels.tsa.arima_model.ARMA.fit.html
then the trend would have to be added through the exog when creating the model, e.g. exog=np.arange(len(endog))
endog is Y, exog would be your linear trend.
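
A rough sketch of the deterministic-trend idea, in pure Python. It fits only the line const + a * t by ordinary least squares and prints the leftover u_t. It does not fit the ARMA part, and it is not how statsmodels estimates the model (there the trend would go in through exog and be estimated together with the ARMA parameters). The variable names are illustrative.

```python
# Series from the original post.
y = [6.287, 6.416, 6.418, 6.301, 6.494, 6.701, 6.974, 7.128, 7.398, 7.72,
     7.859, 7.674, 7.636, 7.684, 7.921, 8.236, 8.346, 8.427, 8.617, 8.762,
     8.99, 9.09, 9.271, 9.485, 9.661, 9.998, 10.257, 10.577, 10.876, 10.954,
     11.19, 11.39, 11.515]

# Time index 0, 1, 2, ...; the same idea as exog=np.arange(len(endog)).
t = list(range(len(y)))

n = len(y)
t_mean = sum(t) / n
y_mean = sum(y) / n

# Ordinary least squares for y_t = intercept + slope * t + u_t.
slope = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
         / sum((ti - t_mean) ** 2 for ti in t))
intercept = y_mean - slope * t_mean

# u_t: what is left after removing the linear trend. Under the
# trend-stationary view, this is the part an ARMA model would describe.
resid = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]

print("slope per period:", slope)
print("intercept:", intercept)
print("largest absolute residual:", max(abs(r) for r in resid))
```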

@jseabold
Member

Though you may be okay with something like this

mod = sm.tsa.ARIMA(y, (4,1,1)).fit(trend='nc')
@jseabold
Member

I think it's not the drift, given the way the model is written. I think we have the following, but I'd have to double check (and spend some time thinking about it). This could probably use some review.

\phi(L)(y_t - const) = \theta(L)\epsilon_t

Partially related to #274.

@josef-pkt
Member

But is y_t in this the differenced series in ARIMA?

\phi(L)((1-L) y_t - const) = \theta(L)\epsilon_t

in this case const would be the drift (or stochastic trend)

It should be easy to verify by checking whether the forecast of an ARIMA(1, 1, 1) continues along a drift trend.

Plotting the data, it looks like there might be structural breaks or large multiperiod shocks that interrupt the trend.
I would raise a warning "flag" for automatic forecasting in cases like these, saying that they have to be "really" checked by comparing to other methods and/or by a human.
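
To make the drift interpretation concrete, here is a pure-Python sketch of the simplest possible case: no AR or MA terms at all, so (1-L) y_t = const + \epsilon_t. This is a deliberately reduced stand-in, not the ARIMA(1, 1, 1) or ARIMA(4, 1, 1) fit discussed here, and const is estimated as the plain mean of the first differences.

```python
# Series from the original post.
y = [6.287, 6.416, 6.418, 6.301, 6.494, 6.701, 6.974, 7.128, 7.398, 7.72,
     7.859, 7.674, 7.636, 7.684, 7.921, 8.236, 8.346, 8.427, 8.617, 8.762,
     8.99, 9.09, 9.271, 9.485, 9.661, 9.998, 10.257, 10.577, 10.876, 10.954,
     11.19, 11.39, 11.515]

# First differences (1 - L) y_t.
dy = [b - a for a, b in zip(y, y[1:])]

# With no AR/MA terms, const is just the mean of the differenced series.
drift = sum(dy) / len(dy)

# Each step ahead adds one more unit of drift to the last observation,
# so the forecast is a straight line with slope equal to const.
horizon = 10
forecast = [y[-1] + h * drift for h in range(1, horizon + 1)]

print("estimated drift:", round(drift, 6))
for h, value in enumerate(forecast, start=1):
    print(h, round(value, 3))
```

If const in the fitted ARIMA model plays the same role, the long-horizon forecast should settle onto a line with that slope once the AR and MA effects die out.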

@feathj
feathj commented Oct 25, 2013

For our particular use case, the mantra is "an incorrect forecast is better than no forecast". Human interaction will be required to check most of the sets that are forecasted. R does indeed give a warning with the original set of data that I passed in, but it goes through with the forecast notwithstanding.

Does it make sense for us to somehow allow the forecast to go through with warnings instead of halting calculation with the exceptions? Maybe by passing a flag to the model?

@jseabold to answer your question about the source of this data, it is economic data. This particular set of data is employment numbers from IMF for Australia.

@jseabold
Member
  1. I would try to make sure you have the right model first. Try to make your data stationary if that's the problem, or include some exogenous variable to account for different regimes maybe.
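  2. If you know you're going to have weird data, then I would put the fit in a try/except block and catch the error we return. Something like this
try:
    mod = ARMA(...).fit()
except ValueError:
    # Assumes the non-stationary start-values error is raised as a ValueError;
    # retry with explicit starting values (these are only placeholders).
    mod = ARMA(...).fit(start_params=[1, .1, .1, .1])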

or whatever, if the problem is that you're getting the "initial AR coefficients are not stationary" error. I'm not sure about including a flag to skip this check, because it will just lead to convergence problems down the road. If you really want to force it to go through, you can try

mod = ARMA(...).fit(transparams=False)

It should at least give you something, but I'd add in some checks for stationarity if you're doing forecasting.
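
For a batch job where "an incorrect forecast is better than no forecast", the try/except idea can be generalized to a chain of fallbacks that also records which attempts failed, so a human can review them later. The sketch below is pure Python and the names (`fit_with_fallbacks`, `default_fit`, `fit_with_start_params`) are hypothetical. The two fit functions are stand-ins for real calls such as ARMA(...).fit() and ARMA(...).fit(start_params=...), and the sketch assumes the failure surfaces as a ValueError.

```python
def fit_with_fallbacks(attempts):
    """Try each zero-argument callable in order.

    Returns the first result that does not raise ValueError, together
    with the messages of the attempts that failed before it.
    """
    errors = []
    for attempt in attempts:
        try:
            return attempt(), errors
        except ValueError as exc:
            # Keep the message so the series can be flagged for review.
            errors.append(str(exc))
    raise ValueError("all fit attempts failed: " + "; ".join(errors))


# Stand-in for a default fit that fails the stationarity check.
def default_fit():
    raise ValueError("The computed initial AR coefficients are not stationary")


# Stand-in for a retry with user-supplied starting values.
def fit_with_start_params():
    return "fitted with explicit start_params"


result, problems = fit_with_fallbacks([default_fit, fit_with_start_params])
print(result)
print("needs review:", problems)
```

Catching only ValueError, rather than using a bare except, keeps unrelated bugs from being silently swallowed.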

@jseabold jseabold closed this Oct 27, 2013