# Robust Estimator of Covariance Matrix

## A dataset without heteroskedasticity

In [1]:
library(lmtest)
library(sandwich)

set.seed(1)
N <- 50
# generate a linear regression relationship
# with homoskedastic errors
b <- 0.1

x <- sin(1:N)
y <- 1 + b * x + 5 * rnorm(N)
# fit the model; summary() reports the classical (non-robust) standard errors
lm.fit <- lm(y ~ x)

summary(lm.fit)

Loading required package: zoo

Attaching package: ‘zoo’

The following objects are masked from ‘package:base’:

    as.Date, as.Date.numeric




Call:
lm(formula = y ~ x)

Residuals:
     Min       1Q   Median       3Q      Max 
-11.2556  -2.5615   0.1541   3.0105   7.2307 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   1.5016     0.5931   2.532   0.0147 *
x            -0.2225     0.8368  -0.266   0.7914  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.194 on 48 degrees of freedom
Multiple R-squared:  0.001471,	Adjusted R-squared:  -0.01933 
F-statistic: 0.07073 on 1 and 48 DF,  p-value: 0.7914


### With heteroskedasticity

In [2]:
set.seed(1)

# the error standard deviation now depends on x: 5 * (1 + x)
y <- 1 + b * x + 5 * rnorm(N, mean = 0, sd = 1 + x)

We can fit the linear model, but the reported standard errors will be incorrect because they assume that the variance of the observations is constant.
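
To make the reason concrete, the covariance of the least-squares estimate can be written in its usual textbook form. The notation here is an assumption and not part of the original notebook: $X$ is the design matrix (a column of ones and $x$), and $\sigma_i^2$ is the error variance of observation $i$.

$$
\operatorname{Var}(\hat\beta \mid X) = (X^\top X)^{-1}\, X^\top \Omega X \,(X^\top X)^{-1},
\qquad \Omega = \operatorname{diag}(\sigma_1^2, \dots, \sigma_N^2)
$$

When every $\sigma_i^2$ equals a common $\sigma^2$, this collapses to $\sigma^2 (X^\top X)^{-1}$, which is the form that summary() relies on. In the simulated data above, $\sigma_i = 5(1 + x_i)$, so the simplification does not hold.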

In [3]:

lm.fit <- lm(y ~ x)
summary(lm.fit)


Call:
lm(formula = y ~ x)

Residuals:
     Min       1Q   Median       3Q      Max 
-22.5372  -1.0318  -0.1718   2.8359  10.3111 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   1.3395     0.7375   1.816   0.0756 .
x             0.2561     1.0406   0.246   0.8066  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.215 on 48 degrees of freedom
Multiple R-squared:  0.00126,	Adjusted R-squared:  -0.01955 
F-statistic: 0.06057 on 1 and 48 DF,  p-value: 0.8066
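
Before switching to robust standard errors, it can help to check formally whether the residual variance depends on the regressor. The following is a minimal sketch, assuming the Breusch-Pagan test from lmtest (bptest()) is an acceptable diagnostic here; the notebook itself does not use it, and the name bp is hypothetical. The data are regenerated so the block runs on its own.

```r
library(lmtest)

# Recreate the heteroskedastic data from the cells above
set.seed(1)
N <- 50
b <- 0.1
x <- sin(1:N)
y <- 1 + b * x + 5 * rnorm(N, mean = 0, sd = 1 + x)
lm.fit <- lm(y ~ x)

# Breusch-Pagan test: the null hypothesis is constant error variance.
# By default the variance is modelled with the same regressors as the fit.
bp <- bptest(lm.fit)
print(bp)
```

A small p-value would be evidence against constant variance. A plot of residuals(lm.fit) against x is a useful visual complement.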


### Test with "robust" standard errors
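Using the coeftest() function with vcov. = vcovHC, we can calculate "robust" standard errors (vcovHC defaults to the HC3 estimator). Note that both are larger than the ones calculated from the (incorrect) linear model, which assumes that the variance of every observation is equal; the difference is most visible for the slope (1.280 versus 1.041).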

In [4]:
coeftest(lm.fit, vcov. = vcovHC)


t test of coefficients:

            Estimate Std. Error t value Pr(>|t|)  
(Intercept)  1.33954    0.76165  1.7587   0.0850 .
x            0.25609    1.27954  0.2001   0.8422  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
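
To show what vcovHC computes, the sandwich estimator can be assembled by hand from the design matrix, residuals and leverages. This is a minimal sketch that assumes the standard HC0 and HC3 definitions; the names X, e, h, bread, vc_hc0 and vc_hc3 are hypothetical and do not appear in the original notebook. The data are regenerated so the block runs on its own.

```r
library(sandwich)

# Recreate the heteroskedastic data and fit from the cells above
set.seed(1)
N <- 50
b <- 0.1
x <- sin(1:N)
y <- 1 + b * x + 5 * rnorm(N, mean = 0, sd = 1 + x)
lm.fit <- lm(y ~ x)

X <- model.matrix(lm.fit)   # N x 2 design matrix (intercept, x)
e <- residuals(lm.fit)      # OLS residuals
h <- hatvalues(lm.fit)      # leverages h_i

bread <- solve(crossprod(X))            # (X'X)^{-1}

# HC0: the "meat" is X' diag(e_i^2) X
vc_hc0 <- bread %*% crossprod(X * e) %*% bread

# HC3: residuals are inflated by 1 / (1 - h_i) before squaring
vc_hc3 <- bread %*% crossprod(X * e / (1 - h)) %*% bread

# Compare the hand-built standard errors with those from sandwich
print(cbind(manual_HC3   = sqrt(diag(vc_hc3)),
            sandwich_HC3 = sqrt(diag(vcovHC(lm.fit, type = "HC3")))))
```

The two columns should agree up to floating-point error. Since HC3 is the default type, they should also match the standard errors that coeftest() reports above.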
