ref:
http://www.econ.uiuc.edu/~econ508/R/e-ta6_R.html




## Tests for Autocorrelated Errors
### Background:


If you run a regression without lagged dependent variables and detect autocorrelation, your OLS estimators remain unbiased and consistent, but they are inefficient and the usual standard errors are incorrect.

If you include lagged dependent variables among the covariates and still detect autocorrelation, you are in bigger trouble: the OLS estimators are inconsistent.
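
A small simulation can make the inconsistency concrete. The sketch below is not part of the original exercise: the data-generating process (a coefficient of 0.5 on the lagged dependent variable, AR(1) errors with parameter 0.7), the sample size, and all variable names are assumptions chosen purely for illustration.

```r
# Illustrative simulation (assumed parameters, not from the course data):
# y_t = 0.5 * y_{t-1} + u_t, with AR(1) errors u_t = 0.7 * u_{t-1} + e_t
set.seed(508)
n <- 5000
e <- rnorm(n)

# recursive filter builds u_t = e_t + 0.7 * u_{t-1}
u <- as.numeric(stats::filter(e, filter = 0.7, method = "recursive"))

y <- numeric(n)
y[1] <- u[1]
for (t in 2:n) y[t] <- 0.5 * y[t - 1] + u[t]

# OLS of y_t on y_{t-1}: y[-1] is y_2..y_n, y[-n] is y_1..y_{n-1}
fit_lag <- lm(y[-1] ~ y[-n])
b_lag <- unname(coef(fit_lag)[2])
b_lag  # compare with the true value 0.5
```

Under these assumed parameters the estimated slope stays well above 0.5 even with a large sample, because the lagged dependent variable is correlated with the autocorrelated error.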

To test for the presence of autocorrelation, you have a large menu of options. Here we suggest the `Breusch-Godfrey` test.

In [2]:
#require(zoo)
#install.packages("dyn", repos= "https://cran.rstudio.com" )
library(dyn) # for time series, dealing with lag or NA problems 


In [3]:
auto<-read.table("http://www.econ.uiuc.edu/~econ508/data/AUTO2.txt",header=T)
head(auto)

quarter,gas,price,income,miles
1959.1,-8.015248,4.67575,-4.50524,2.647592
1959.2,-8.01106,4.691292,-4.492739,2.647592
1959.3,-8.019878,4.689134,-4.498873,2.647592
1959.4,-8.012581,4.722338,-4.491904,2.647592
1960.1,-8.016769,4.70747,-4.490103,2.647415
1960.2,-7.976376,4.699136,-4.489107,2.647238


In [10]:
# generate the squared-price and price-income interaction terms
auto$price2<-auto$price^2
auto$princ<-auto$price*auto$income

In [14]:
gas<-ts(auto$gas,start=1959,frequency=4)
price<-ts(auto$price,start=1959,frequency=4)
income<-ts(auto$income,start=1959,frequency=4)
miles<-ts(auto$miles,start=1959,frequency=4)
price2<-price^2
princ<-price*income
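
The `ts` objects above carry time stamps, which matters once lags enter a regression. As a hedged illustration (the toy series and the names `x`, `x_lag`, and `aligned` are assumptions, not part of the original data), `stats::lag()` only shifts the time index and leaves the values untouched, so the series must be aligned by time before the lag is usable as a regressor:

```r
# Toy quarterly series (assumed values, for illustration only)
x <- ts(c(1, 4, 9, 16, 25, 36), start = 1959, frequency = 4)

# lag(x, -1) moves the time stamps forward by one quarter; the values are unchanged
x_lag <- stats::lag(x, -1)
same_values <- identical(as.numeric(x), as.numeric(x_lag))

# cbind() on ts objects aligns by time and pads the ends with NA
aligned <- cbind(x, x_lag)
aligned
```

Plain `lm()` ignores these time stamps, which is why the cells below call `dyn$lm()` whenever `lag()` appears in a formula.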


### Test: Breusch-Godfrey

Run OLS on your original equation:

$$gas_{t} = \beta_{0} + \beta_{1} income_{t}+ \beta_{2} price_{t} + \beta_{3} (price_{t})^{2} + \beta_{4} (price_{t}*income_{t}) + u_{t}$$

Obtain the estimated residuals (`uhat`).

Regress the estimated residuals (`uhat`) on the explanatory variables of the original model (`income`, `price`, `price2`, `princ`, and a constant) and on the lagged residuals (`L.uhat`). Call this the `auxiliary regression`.
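
Written out with one lagged residual, the auxiliary regression would look roughly as follows. The coefficient names $\gamma_{j}$ and $\rho_{1}$ and the error term $v_{t}$ are notation assumed here for illustration; they do not appear in the original notes.

$$\hat{u}_{t} = \gamma_{0} + \gamma_{1} income_{t} + \gamma_{2} price_{t} + \gamma_{3} (price_{t})^{2} + \gamma_{4} (price_{t}*income_{t}) + \rho_{1} \hat{u}_{t-1} + v_{t}$$

In this notation, the null hypothesis of no first-order autocorrelation corresponds to $\rho_{1} = 0$.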


From the auxiliary regression above, obtain the R-squared and multiply it by the number of included observations:

In [11]:
model<-lm(gas~income+price+price2+princ, auto) 

In [12]:
uhat<-model$resid

In [13]:
uhat<- ts(uhat,start=1959,frequency=4)

In [15]:
model.adj<-dyn$lm(uhat~lag(uhat,-1)+income+price+price2+princ) 

In [16]:
R2<-summary(model.adj)$r.squared
R2

Under the null hypothesis of no autocorrelation, the test statistic NR2 converges asymptotically to a Chi-squared with s degrees of freedom, where s is the number of lags of the residuals included in the auxiliary regression. In the case above, s=1, and we have:

In [17]:
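N<-127 # observations included in the auxiliary regression

# (model$df)+length(model$coef) gives the sample size of the original regression instead; the auxiliary regression has fewer observations because of the lag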

N*R2

In the example above, NR2 = 115.46 > 3.84 = Chi2 (1, 5%). Hence, we reject the null hypothesis of no autocorrelation in the disturbances.
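
The steps above can also be sketched as a small base-R helper and tried on simulated data. Everything in this block is an assumption made for illustration: the helper name `bg_nr2`, the simulated regression, and the AR(1) parameter of 0.8. The helper also assumes the fitted model has an intercept stored as the first column of its model matrix. It is a sketch of the NR2 procedure described here, not a replacement for a packaged test.

```r
# Hypothetical helper: NR^2 Breusch-Godfrey statistic with s lagged residuals
bg_nr2 <- function(fit, s = 1) {
  uhat <- as.numeric(resid(fit))
  n <- length(uhat)
  # embed(): column 1 holds u_t, columns 2..(s+1) hold u_{t-1}, ..., u_{t-s}
  lags <- embed(uhat, s + 1)
  # original regressors without the constant (lm() adds its own),
  # trimmed to the rows that have all s lags available
  X <- model.matrix(fit)[(s + 1):n, -1, drop = FALSE]
  aux <- lm(lags[, 1] ~ X + lags[, -1, drop = FALSE])
  stat <- nrow(lags) * summary(aux)$r.squared
  c(statistic = stat, df = s,
    p.value = pchisq(stat, df = s, lower.tail = FALSE))
}

# Simulated example with AR(1) errors (assumed parameters)
set.seed(508)
n <- 200
x <- rnorm(n)
u <- as.numeric(stats::filter(rnorm(n), filter = 0.8, method = "recursive"))
y <- 1 + 2 * x + u

bg_result <- bg_nr2(lm(y ~ x), s = 1)
bg_result
```

Returning the p-value alongside the statistic avoids looking up the critical value by hand; comparing the statistic with `qchisq(0.95, df = s)` gives the same decision at the 5% level.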

### Test for ARCH Errors
To test for ARCH errors, you can use an LM test as follows:

Run OLS on your original equation:

```
model2<-lm(gas~income+price+price2+princ) # same specification as `model` above, whose residuals are reused below
```

Generate the residuals and the squared residuals.

```
uhat2<-(model$resid)^2
uhat2<-ts(uhat2,start=1959,frequency=4)
```

Regress the squared residuals on the explanatory variables of the original model (`income`, `price`, `price2`, `princ`, and a constant) and on the lagged squared residuals. Call this an auxiliary regression.

```
f<-dyn$lm(uhat2~lag(uhat2,-1)+lag(uhat2,-2)+ lag(uhat2,-3)+lag(uhat2,-4)+price+ income+price2+princ)
```

From the auxiliary regression, calculate NR2 and compare with a Chi-squared (q, 5%), where q is the number of included lags of the squared residuals:

```
R2<-summary(f)$r.squared
n<-(model$df)+length(model$coef) 
n*R2
```



In [21]:
qchisq(.95, df=4) 

Under the null hypothesis of no ARCH errors, the test statistic NR2 converges asymptotically to a Chi-squared with q degrees of freedom, where q is the number of lags of the squared residuals included in the auxiliary regression. In the case above, q=4, and NR2=91.93 > 9.49 = Chi-squared(4, 5%). Therefore, we reject the null hypothesis of no ARCH and conclude that the regression errors exhibit time-varying conditional variance.
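
To see the procedure work on data where the answer is known, the sketch below simulates a regression with ARCH(1) errors and applies the same LM steps in base R. The simulated model, the ARCH parameters (0.2 and 0.5), the sample size, and all variable names are assumptions made for illustration; none of them come from the gasoline data.

```r
# Simulated regression with ARCH(1) errors (assumed parameters)
set.seed(508)
n <- 1000
z <- rnorm(n)
e <- numeric(n)
e[1] <- z[1]
# conditional variance depends on the previous squared error
for (t in 2:n) e[t] <- z[t] * sqrt(0.2 + 0.5 * e[t - 1]^2)

x <- rnorm(n)
y <- 1 + 2 * x + e
fit <- lm(y ~ x)

# Auxiliary regression: squared residuals on q of their own lags and on x
q <- 4
u2 <- embed(as.numeric(resid(fit))^2, q + 1)  # col 1 = u2_t, cols 2..5 = lags 1..4
xs <- x[(q + 1):n]                            # regressor trimmed to matching rows
aux <- lm(u2[, 1] ~ u2[, -1] + xs)

arch_stat <- nrow(u2) * summary(aux)$r.squared
arch_crit <- qchisq(0.95, df = q)
c(statistic = arch_stat, critical = arch_crit)
```

Here the number of observations is taken from the auxiliary regression itself (`nrow(u2)`), which is one way to read "the number of included observations".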

In [None]:
d.d<-read.table("http://www.econ.uiuc.edu/~econ508/data/CPS.txt",header=T)
head(d.d)