FAQ: ARIMAX versus Stata #2474

josef-pkt · 2015-06-27T17:45:54Z

When estimating an ARIMAX(p, 1, q) model, Stata also differences the exog variables. The statsmodels version does not difference the exog.

To replicate Stata's behavior, we need to difference the exog ourselves.
Note: when we difference, we need to keep the initial observation, which is nan, because it will be truncated during estimation.

NumPy's np.diff drops the invalid initial observation, so the result is one row shorter. pandas DataFrame.diff keeps the initial observation as missing (nan), so the result has the same length as the original.
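A minimal sketch of the length difference between the two diff functions, using made-up numbers (the column name `x` and the values are illustrative, not taken from the data set in this issue):

```python
import numpy as np
import pandas as pd

# Hypothetical toy series in levels, only for illustration
levels = pd.DataFrame({"x": [1.0, 3.0, 6.0, 10.0]})

# np.diff returns n - 1 values: the undefined first difference is dropped
d_np = np.diff(levels["x"].to_numpy())

# DataFrame.diff returns n rows: the first row is kept as nan
d_pd = levels.diff()

# If only an ndarray is available, the nan row can be padded back by hand
d_padded = np.concatenate([[np.nan], d_np])

print(len(d_np), len(d_pd), len(d_padded))
```

The pandas result (or the padded array) stays aligned row by row with the original series, which is what the exog argument needs here.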

The following replicates the Stata results. (I'm using ndarrays in the model, but the same should work with pandas.)

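import numpy as np
import statsmodels.api as sm

# Assumed: data is the full DataFrame, data_sample the estimation subsample
mod111 = sm.tsa.ARIMA(np.asarray(data_sample['loginv']), (1, 1, 1),
                      # exog=np.asarray(data_sample[['loggdp', 'logcons']]))  # exog in levels
                      exog=np.asarray(data_sample[['loggdp', 'logcons']].diff()))  # differenced exog, first row is nan

res111 = mod111.fit(disp=1, solver='bfgs', maxiter=5000)
# Prediction also needs the differenced exog for the forecast periods
exog_full_d = data[['loggdp', 'logcons']].diff()
res111.predict(start=197, end=202, exog=exog_full_d.values[197:])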

josef-pkt · 2015-06-27T17:46:27Z

@jseabold @ChadFulton Did you run into this before?

jseabold · 2015-06-27T18:12:59Z

I used gretl for all validation for time series.

josef-pkt added comp-tsa FAQ labels Jun 27, 2015
