# **Econometrics 2**

# PC-Lab Session 2: Asymptotic Variance

**Author:** [Anthony Strittmatter](http://www.anthonystrittmatter.com)

We estimate the OLS coefficients and their asymptotic variance using matrix algebra. First, we load the required packages.

In [None]:
########################  Load Packages  ########################

# List of required packages
pkgs <- c('psych', 'ggplot2', 'dplyr')

# Load packages
for(pkg in pkgs){
    library(pkg, character.only = TRUE)
}

print('All packages successfully loaded.')

## Data Generating Process (DGP)

We generate an artificial dataset ($N=200$). We consider the linear model,
\begin{equation*}
y= \beta_0 + \beta_1 x_1 + u,
\end{equation*}
with $\beta_0=0$, $\beta_1=1$, and $x_1,u \sim N(0,1)$.

In [None]:
############## Data Generating Process (DGP) ##############
set.seed(1001)

N <- 200 # sample size

# Generate variables
x0 <- matrix(1, nrow = N, ncol = 1) # intercept (vector of ones)
x1 <- matrix(rnorm(N), nrow = N, ncol = 1) # standard normal distributed covariate
X <- cbind(x0,x1) # matrix of covariates
u <- matrix(rnorm(N), nrow = N, ncol = 1) # standard normal distributed error term
y <- x1 + u # outcome variable
# the true effect of x1 on y is 1
# the true intercept is 0

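# Name the columns explicitly; cbind() of unnamed matrices would yield V1, V2
dataset <- data.frame(y = as.vector(y), x1 = as.vector(x1)) # dataframe will be needed later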

print('Data is generated.')

## Descriptive Statistics

In [None]:
############## Descriptive Statistics ##############

round(describe(dataset), digits=3)

## Scatter Plot

In [None]:
############## Scatter Plot ##############

dataset %>%
 ggplot(aes(x = x1, y = y)) +
 geom_point(colour = "red") 

## OLS Results Using Off-the-Shelf Code

In [None]:
############## Off-the-Shelf OLS estimator ##############

# data has to be in a dataframe (and not matrix) to use the lm command
lmodel <- lm(y ~ x1, data = dataset)
summary(lmodel)

# Student Exercises

1. Use matrix algebra to calculate the OLS coefficients:
\begin{equation*}
\hat{\beta} = (X'X)^{-1} X'y.
\end{equation*}




The following R commands might be useful:
- *%\*%*: matrix multiplication
- *solve()*: calculates the inverse
- *t()*: calculates the transpose
- *diag(a)*: builds an identity matrix of dimension a
- *for (i in c(1:10)) { }*: loops i in { } from 1 to 10
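
The commands listed above can be tried out on a small toy example before they are applied to the simulated data. The following is only a minimal sketch: the matrices `A` and `b` and all other object names are hypothetical and are not used elsewhere in this notebook.

```r
# Toy 2 x 2 example (hypothetical values, unrelated to the simulated data)
A <- matrix(c(2, 0, 0, 4), nrow = 2, ncol = 2)
b <- matrix(c(1, 2), nrow = 2, ncol = 1)

A_inv <- solve(A)   # inverse of A
I2    <- diag(2)    # 2 x 2 identity matrix
Ab    <- A %*% b    # matrix product (2 x 1)
bt    <- t(b)       # transpose of b (1 x 2)

# Accumulate the sum of 1, ..., 10 with a for loop
total <- 0
for (i in c(1:10)) {
    total <- total + i
}

print(A %*% A_inv)  # should reproduce the identity matrix
print(total)
```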

In [None]:
############## Put your code here ##############

# Apply OLS formula


#################################################
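
One possible way to approach Exercise 1 is sketched below. To keep the block self-contained it regenerates the data with the same seed and the same order of random draws as the DGP cell; the object names `XtX_inv` and `beta_hat` are illustrative assumptions, not names prescribed by the exercise.

```r
# Regenerate the data as in the DGP section (same seed, same order of draws)
set.seed(1001)
N  <- 200
x1 <- matrix(rnorm(N), nrow = N, ncol = 1)  # covariate
X  <- cbind(1, x1)                          # intercept column and covariate
u  <- matrix(rnorm(N), nrow = N, ncol = 1)  # error term
y  <- x1 + u                                # true intercept 0, true slope 1

# OLS formula: beta_hat = (X'X)^{-1} X'y
XtX_inv  <- solve(t(X) %*% X)
beta_hat <- XtX_inv %*% t(X) %*% y

print(beta_hat)  # first row: intercept, second row: slope
```

The result can be compared with the coefficients reported by `lm()`; the two should agree up to numerical precision.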

2. Calculate the variance and standard errors of $\hat{\beta}_0$ and $\hat{\beta}_1$. Use the homoskedastic variance formula
\begin{equation*}
Var(\hat{\beta}) = \hat{\sigma}^2 (X'X)^{-1},
\end{equation*}
with $\displaystyle \hat{\sigma}^2 = \frac{1}{N-2} \sum_{i=1}^{N} \hat{u}_i^2$.

In [None]:
############## Put your code here ##############

# Calculate the residuals

# Calculate sigma-squared with degrees-of-freedom adjustment

# Inverse of X'X

#################################################
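
A minimal sketch of one way to approach Exercise 2 follows. It regenerates the data so that it runs on its own; the names `u_hat`, `sigma2_hat`, `var_homo`, and `se_homo` are illustrative assumptions.

```r
# Regenerate the data as in the DGP section (same seed, same order of draws)
set.seed(1001)
N  <- 200
x1 <- matrix(rnorm(N), nrow = N, ncol = 1)
X  <- cbind(1, x1)
u  <- matrix(rnorm(N), nrow = N, ncol = 1)
y  <- x1 + u

# OLS coefficients and residuals
XtX_inv  <- solve(t(X) %*% X)
beta_hat <- XtX_inv %*% t(X) %*% y
u_hat    <- y - X %*% beta_hat

# Residual variance with degrees-of-freedom adjustment (N - 2)
sigma2_hat <- as.numeric(t(u_hat) %*% u_hat) / (N - 2)

# Homoskedastic variance matrix and standard errors
var_homo <- sigma2_hat * XtX_inv
se_homo  <- sqrt(diag(var_homo))

print(var_homo)
print(se_homo)
```

These standard errors should match the "Std. Error" column of `summary(lm(...))`, which relies on the same homoskedastic formula.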

3. Calculate the variance and standard errors of $\hat{\beta}_0$ and $\hat{\beta}_1$ using the heteroskedasticity-robust Eicker–Huber–White variance formula
\begin{equation*}
Var(\hat{\beta}) = (X'X)^{-1} (X' \mathrm{diag}(\hat{u}_1^2, \ldots, \hat{u}_N^2) X )(X'X)^{-1}.
\end{equation*}
How would you make the degrees-of-freedom adjustment?


In [None]:
############## Put your code here ##############

# Calculate diagonal matrix of squared residuals (using a loop)


# Variance calculation

#################################################
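
The sandwich formula in Exercise 3 could be sketched as follows. The block regenerates the data so that it is self-contained, and all object names are illustrative assumptions. For the degrees-of-freedom adjustment it scales the variance by $N/(N-2)$, which is one common convention (often labelled HC1) and not necessarily the adjustment intended by the exercise.

```r
# Regenerate the data as in the DGP section (same seed, same order of draws)
set.seed(1001)
N  <- 200
x1 <- matrix(rnorm(N), nrow = N, ncol = 1)
X  <- cbind(1, x1)
u  <- matrix(rnorm(N), nrow = N, ncol = 1)
y  <- x1 + u

# OLS coefficients and residuals
XtX_inv  <- solve(t(X) %*% X)
beta_hat <- XtX_inv %*% t(X) %*% y
u_hat    <- y - X %*% beta_hat

# Diagonal matrix of squared residuals, filled with a loop
D <- matrix(0, nrow = N, ncol = N)
for (i in c(1:N)) {
    D[i, i] <- u_hat[i]^2
}

# Sandwich formula without adjustment
meat    <- t(X) %*% D %*% X
var_hc0 <- XtX_inv %*% meat %*% XtX_inv

# Assumed degrees-of-freedom adjustment: scale by N / (N - 2)
var_hc1 <- (N / (N - 2)) * var_hc0

se_hc0 <- sqrt(diag(var_hc0))
se_hc1 <- sqrt(diag(var_hc1))
print(se_hc0)
print(se_hc1)
```

Because the errors in this DGP are homoskedastic, the robust standard errors should be close to, but not identical to, the homoskedastic ones.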

4. Bootstrap the variance and standard errors of $\hat{\beta}_0$ and $\hat{\beta}_1$ (with 9,999 bootstrap replications).

Helpful R commands:
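- *srswr(n, N)*: draws a simple random sample of size n with replacement from N units and returns how often each unit was drawn (from the *sampling* package, which is not among the packages loaded above)
- *rep(x, times = a)*: repeats the elements of x; if a is a vector, element i of x is repeated a[i] times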

In [None]:
############## Put your code here ##############
set.seed(1001)
reps <- 9999 # number of bootstrap replications (named reps to avoid masking rep())

# Loop with bootstrap resamples

#################################################
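
A nonparametric pairs bootstrap for Exercise 4 might look like the sketch below. Note the assumptions: it draws row indices with base R's `sample()` instead of `srswr()`, so that no additional package is needed, and the names `B`, `idx`, `beta_boot`, `var_boot`, and `se_boot` are illustrative. The data are regenerated so that the block runs on its own.

```r
# Regenerate the data as in the DGP section (same seed, same order of draws)
set.seed(1001)
N  <- 200
x1 <- matrix(rnorm(N), nrow = N, ncol = 1)
X  <- cbind(1, x1)
u  <- matrix(rnorm(N), nrow = N, ncol = 1)
y  <- x1 + u

B <- 9999                                    # number of bootstrap replications
beta_boot <- matrix(NA, nrow = B, ncol = 2)  # one row of coefficients per resample

for (b in c(1:B)) {
    # Draw N row indices with replacement (resampling (y, x) pairs)
    idx <- sample(1:N, size = N, replace = TRUE)
    X_b <- X[idx, ]
    y_b <- y[idx, , drop = FALSE]
    # Re-estimate OLS on the bootstrap sample
    beta_boot[b, ] <- solve(t(X_b) %*% X_b) %*% t(X_b) %*% y_b
}

# Bootstrap variance matrix and standard errors
var_boot <- var(beta_boot)
se_boot  <- sqrt(diag(var_boot))

print(var_boot)
print(se_boot)
```

The bootstrap standard errors can be compared with the analytical ones from Exercises 2 and 3; with $N=200$ and homoskedastic errors, all three should be of similar magnitude.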