# Instrumental variable estimation in R

R Package requirements:
* `AER`
* `sampleSelection`
* `knitr`
* `tidyverse`
* `ivpack`
* `ivmodel`

Reference: Principles of Econometrics with `R` available at https://bookdown.org/ccolonescu/RPoE4/

In `R`, you can use the function `ivreg()` from the package *AER: Applied Econometrics with R*. The package also provides diagnostic statistics for weak instruments and over-identification.

You may need to first install AER:

`install.packages("AER", dependencies = TRUE)`

`?ivreg` 

tells you a bit about the function. Follow the same procedure to install all other necessary packages.

Other useful packages for more advanced analysis include `ivmodel` and `ivpack`; install and load them in the same way.

To use `ivreg()` you first need to load the AER package with `library(AER)`, as done in the first code cell below.

### `Mroz87` dataset (from {sampleSelection} package)

`Mroz87` data frame contains data about 753 married women. These data are collected within the "Panel Study of Income Dynamics" (PSID). Of the 753 observations, the first 428 are for women with positive hours worked in 1975, while the remaining 325 observations are for women who did not work for pay in 1975. A more complete discussion of the data is found in Mroz (1987), Appendix 1.

References: Mroz, T.A. (1987). The Sensitivity of an Empirical Model of Married Women's Hours of Work to Economic and Statistical Assumptions. *Econometrica*, **55**, 765–799.

Consider the following wage model for the `Mroz87` dataset:
$$
\log(\textrm{wage})=\beta_1+\beta_2 \textrm{educ}+\beta_3 \textrm{exper}+\beta_4 \textrm{exper}^2+\epsilon
$$
The problem with this model is that the error term may include some unobserved attributes, such as personal ability, that determine both `wage` and `educ`. Hence the regressor `educ` is correlated with the error term, that is, it is endogenous.

So we need an instrument to address the endogeneity of `educ`. We can try with `motheduc`. It is reasonable to assume that it does not directly influence the daughter’s wage, but it influences her education.
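
As a simplified illustration (assuming a single regressor $x$ and a single instrument $z$, which is not the full wage model above), the following limits show why OLS is inconsistent when $\textrm{Cov}(x,\epsilon)\neq 0$ and why an instrument helps when $\textrm{Cov}(z,\epsilon)=0$ and $\textrm{Cov}(z,x)\neq 0$:

$$
\hat{\beta}_{OLS} \xrightarrow{p} \beta+\frac{\textrm{Cov}(x,\epsilon)}{\textrm{Var}(x)}, \qquad
\hat{\beta}_{IV}=\frac{\widehat{\textrm{Cov}}(z,y)}{\widehat{\textrm{Cov}}(z,x)} \xrightarrow{p} \beta+\frac{\textrm{Cov}(z,\epsilon)}{\textrm{Cov}(z,x)}
$$

The second expression also shows why a weak instrument is a problem: when $\textrm{Cov}(z,x)$ is close to zero, even a small correlation between $z$ and $\epsilon$ is magnified.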

#### First stage
We first estimate the model by two-stage least squares (2SLS) with a single instrument, `motheduc`. The first stage regresses `educ` on the other regressors and the instrument:
$$
\textrm{educ}=\delta_1+\delta_2 \textrm{exper}+\delta_3 \textrm{exper}^2+\theta_1 \textrm{motheduc}+\nu
$$


In [2]:
library(AER)
library(sampleSelection)

In [3]:
data(Mroz87)
mroz1 <- Mroz87[Mroz87$lfp==1,] #restricts sample to lfp=1
educ.ols <- lm(educ~exper+I(exper^2)+motheduc, data=mroz1)

In [8]:
library(knitr)
library(tidyverse)
library(broom)

In [9]:
# Tabulate the first-stage regression fitted above
educ.ols %>%
  tidy() %>%
  kable(col.names = c("Predictor", "Coefficients", "Std error", "t-stat", "p-value"),
          digits=4, align='c',caption= "First stage in the 2SLS model for the 'wage' equation") 
         #kable_classic(full_width = F, html_font = "Cambria")



Table: First stage in the 2SLS model for the 'wage' equation

|  Predictor  | Coefficients | Std error | t-stat  | p-value |
|:-----------:|:------------:|:---------:|:-------:|:-------:|
| (Intercept) |    9.7751    |  0.4239   | 23.0605 | 0.0000  |
|    exper    |    0.0489    |  0.0417   | 1.1726  | 0.2416  |
| I(exper^2)  |   -0.0013    |  0.0012   | -1.0290 | 0.3040  |
|  motheduc   |    0.2677    |  0.0311   | 8.5992  | 0.0000  |

The p-value for `motheduc` is very low, indicating a strong correlation between this instrument and the endogenous variable `educ` **even after controlling for the other variables**.

#### Second stage
The second stage in the two-stage procedure is to create the fitted values of `educ` from the first stage and plug them into the model of interest to replace the original variable `educ`.

In [10]:
educHat <- fitted(educ.ols)
wage.2sls <- lm(log(wage)~educHat+exper+I(exper^2), data=mroz1)
kable(tidy(wage.2sls), col.names = c("Predictor", "Coefficients", "Std error", "t-stat", "p-value"),
      digits=4, align='c',caption=
  "Second stage in the 2SLS model for the 'wage' equation")



Table: Second stage in the 2SLS model for the 'wage' equation

|  Predictor  | Coefficients | Std error | t-stat  | p-value |
|:-----------:|:------------:|:---------:|:-------:|:-------:|
| (Intercept) |    0.1982    |  0.4933   | 0.4017  | 0.6881  |
|   educHat   |    0.0493    |  0.0391   | 1.2613  | 0.2079  |
|    exper    |    0.0449    |  0.0142   | 3.1668  | 0.0017  |
| I(exper^2)  |   -0.0009    |  0.0004   | -2.1749 | 0.0302  |

These are the results of the explicit 2SLS procedure. Note that the standard errors calculated in this way are **incorrect**, because the second-stage residuals are computed from `educHat` rather than from the observed `educ`. The correct approach in `R` is to estimate the instrumental variable model with the `ivreg` function.
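
To make the standard-error issue concrete, here is a minimal base-R sketch on simulated data (the variables `x`, `w`, `z`, `u` and all coefficients are hypothetical, not the Mroz sample). It keeps the second-stage coefficients but recomputes the residual variance from the observed regressor, which is the usual textbook correction; it is an illustration under these assumptions, not the code that `ivreg` runs internally.

```r
# Simulated data (hypothetical): x is endogenous, z is the instrument
set.seed(1)
n <- 1000
z <- rnorm(n)                          # instrument
w <- rnorm(n)                          # exogenous regressor
u <- rnorm(n)                          # unobserved confounder ("ability")
x <- 0.8 * z + 0.5 * w + u + rnorm(n)  # endogenous regressor
y <- 1 + 0.5 * x + 0.3 * w + u + rnorm(n)

# First stage: fitted values of the endogenous regressor
xhat <- fitted(lm(x ~ w + z))

# Second stage by OLS: correct coefficients, naive standard errors
stage2 <- lm(y ~ xhat + w)
b <- coef(stage2)

# Corrected residuals use the observed x, not xhat
X    <- cbind(1, x, w)
Xhat <- cbind(1, xhat, w)
res  <- as.vector(y - X %*% b)
sigma2 <- sum(res^2) / (n - ncol(X))

# 2SLS covariance matrix: sigma^2 * (Xhat'Xhat)^{-1}
se_correct <- sqrt(diag(sigma2 * solve(crossprod(Xhat))))
se_naive   <- sqrt(diag(vcov(stage2)))

round(cbind(estimate = b, se_naive, se_correct), 4)
```

The two sets of standard errors differ only by a common scale factor, the ratio of the two residual standard deviations. This matches the pattern in the tables of this section, where the naive and `ivreg` standard errors differ by roughly the same proportion for every coefficient.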

Let's compare some models:

In [11]:
mroz1.ols <- lm(log(wage)~educ+exper+I(exper^2), data=mroz1)
kable(tidy(mroz1.ols), col.names = c("Predictor", "Coefficients", "Std error", "t-stat", "p-value"),
          digits=4, align='c',caption= "OLS model") 

mroz1.iv1 <- ivreg(log(wage)~educ+exper+I(exper^2)|
            exper+I(exper^2)+motheduc, data=mroz1)
kable(tidy(mroz1.iv1), col.names = c("Predictor", "Coefficients", "Std error", "t-stat", "p-value"),
          digits=4, align='c',caption= "2SLS model motheduc as instrument")

mroz1.iv2 <- ivreg(log(wage)~educ+exper+I(exper^2)|
            exper+I(exper^2)+motheduc+fatheduc, data=mroz1) 
kable(tidy(mroz1.iv2), col.names = c("Predictor", "Coefficients", "Std error", "t-stat", "p-value"),
          digits=4, align='c',caption= "2SLS model motheduc & fatheduc as instruments")



Table: OLS model

|  Predictor  | Coefficients | Std error | t-stat  | p-value |
|:-----------:|:------------:|:---------:|:-------:|:-------:|
| (Intercept) |   -0.5220    |  0.1986   | -2.6282 | 0.0089  |
|    educ     |    0.1075    |  0.0141   | 7.5983  | 0.0000  |
|    exper    |    0.0416    |  0.0132   | 3.1549  | 0.0017  |
| I(exper^2)  |   -0.0008    |  0.0004   | -2.0628 | 0.0397  |



Table: 2SLS model motheduc as instrument

|  Predictor  | Coefficients | Std error | t-stat  | p-value |
|:-----------:|:------------:|:---------:|:-------:|:-------:|
| (Intercept) |    0.1982    |  0.4729   | 0.4191  | 0.6754  |
|    educ     |    0.0493    |  0.0374   | 1.3159  | 0.1889  |
|    exper    |    0.0449    |  0.0136   | 3.3039  | 0.0010  |
| I(exper^2)  |   -0.0009    |  0.0004   | -2.2690 | 0.0238  |



Table: 2SLS model motheduc & fatheduc as instruments

|  Predictor  | Coefficients | Std error | t-stat  | p-value |
|:-----------:|:------------:|:---------:|:-------:|:-------:|
| (Intercept) |    0.0481    |  0.4003   | 0.1202  | 0.9044  |
|    educ     |    0.0614    |  0.0314   | 1.9530  | 0.0515  |
|    exper    |    0.0442    |  0.0134   | 3.2883  | 0.0011  |
| I(exper^2)  |   -0.0009    |  0.0004   | -2.2380 | 0.0257  |

The explicit 'Second Stage' model above and the IV model with only `motheduc` instrument yield the same coefficients (the `educ` in the IV model is equivalent to the `educHat` in the second stage), but the standard errors are different. The correct ones are those provided by the IV model.

#### Notes:
1. Some of the individuals are not in the labor force, so their wages are zero and the log cannot be calculated. These observations are excluded by using only those for which `lfp==1`.
2. The instrument list in the command `ivreg` includes the instrument `motheduc` and all exogenous regressors. 
3. The vertical bar `|` separates the regressor list from the instrument list.

### Specification & Weak Instruments

Let's test for weak instruments in the `wage` equation. We can test the joint significance of the instruments in an `educ` model as shown earlier

$$
\textrm{educ}=\delta_1+\delta_2 \textrm{exper}+\delta_3 \textrm{exper}^2+\theta_1 \textrm{motheduc}+\theta_2 \textrm{fatheduc}+\nu
$$

Essentially, we can run an F-test on the instruments in the first-stage regression and use the Stock and Yogo (2004) rule of thumb that the F-stat$>10$ or, for only one instrument, a t-stat$>3.16$ (since $3.16^2 \approx 10$), to make sure that we have strong instruments:

In [12]:
educ2.ols <- lm(educ~exper+I(exper^2)+motheduc+fatheduc, data=mroz1)
tab <- tidy(educ2.ols) # tabulate the two-instrument first stage, not educ.ols
kable(tab, col.names = c("Predictor", "Coefficients", "Std error", "t-stat", "p-value"), digits=3,
      caption="The 'educ' first-stage equation")
linearHypothesis(educ2.ols, c("motheduc=0", "fatheduc=0"))



Table: The 'educ' first-stage equation

|Predictor   | Coefficients| Std error| t-stat| p-value|
|:-----------|------------:|---------:|------:|-------:|
|(Intercept) |        9.103|     0.427| 21.340|   0.000|
|exper       |        0.045|     0.040|  1.124|   0.262|
|I(exper^2)  |       -0.001|     0.001| -0.839|   0.402|
|motheduc    |        0.158|     0.036|  4.391|   0.000|
|fatheduc    |        0.190|     0.034|  5.615|   0.000|

| Model | Res.Df |   RSS    | Df | Sum of Sq |    F    |    Pr(>F)    |
|:-----:|:------:|:--------:|:--:|:---------:|:-------:|:------------:|
|   1   |  425   | 2219.216 |    |           |         |              |
|   2   |  423   | 1758.575 | 2  | 460.6411  | 55.4003 | 4.268909e-22 |


The null hypothesis that both the `motheduc` and `fatheduc` coefficients are zero is rejected. This means that **at least one instrument is strong**.

Note that this is not the same as looking at the overall F-stat in the OLS regression output. That F-stat tests the joint hypothesis on all regressors, not just the instruments.

In [13]:
summary(educ2.ols)


Call:
lm(formula = educ ~ exper + I(exper^2) + motheduc + fatheduc, 
    data = mroz1)

Residuals:
    Min      1Q  Median      3Q     Max 
-7.8057 -1.0520 -0.0371  1.0258  6.3787 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  9.102640   0.426561  21.340  < 2e-16 ***
exper        0.045225   0.040251   1.124    0.262    
I(exper^2)  -0.001009   0.001203  -0.839    0.402    
motheduc     0.157597   0.035894   4.391 1.43e-05 ***
fatheduc     0.189548   0.033756   5.615 3.56e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.039 on 423 degrees of freedom
Multiple R-squared:  0.2115,	Adjusted R-squared:  0.204 
F-statistic: 28.36 on 4 and 423 DF,  p-value: < 2.2e-16
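
The instrument-only F-statistic can also be obtained by comparing a restricted first stage (without the instruments) to the unrestricted one. The sketch below uses simulated data with hypothetical variables `w`, `z1`, `z2`, and `x`, and base R only; it is meant to show the arithmetic, not to replace `linearHypothesis`.

```r
# Simulated first stage (hypothetical data): two instruments z1, z2 and one control w
set.seed(2)
n  <- 500
w  <- rnorm(n)
z1 <- rnorm(n)
z2 <- rnorm(n)
x  <- 1 + 0.5 * w + 0.4 * z1 + 0.3 * z2 + rnorm(n)

unrestricted <- lm(x ~ w + z1 + z2)  # first stage with the instruments
restricted   <- lm(x ~ w)            # first stage with the instruments excluded

rss_u   <- sum(resid(unrestricted)^2)
rss_r   <- sum(resid(restricted)^2)
n_restr <- 2                          # number of excluded instruments
df_u    <- df.residual(unrestricted)

# F = [(RSS_r - RSS_u) / q] / [RSS_u / df_u]
F_manual <- ((rss_r - rss_u) / n_restr) / (rss_u / df_u)

# The same statistic from a nested-model comparison
F_anova <- anova(restricted, unrestricted)$F[2]

c(F_manual = F_manual, F_anova = F_anova)
```

Applying the same formula to the residual sums of squares reported for the `educ` equation gives $(460.6411/2)/(1758.575/423) \approx 55.40$, the value shown by `linearHypothesis`.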


#### Wu-Hausman and Sargan tests

Let's test whether the variable of concern is indeed endogenous. This is addressed by the Hausman test for endogeneity, where the null hypothesis is H$_0: \textrm{Cov}(x,\epsilon)=0$. If we reject the null hypothesis, the regressor is endogenous and instrumental variables are needed.

The Sargan test checks the validity of the instruments. It tests whether the instruments are correlated with the error term, and can only be performed for the extra instruments, those that are in excess of the number of endogenous variables. This is also called a test for **overidentifying restrictions**. The null hypothesis is that the covariance between the instrument and the error term is zero, i.e., H$_0: \textrm{Cov}(z,\epsilon)=0$. Rejecting the null indicates that at least one of the extra instruments is not valid.

`R` performs these two tests and the weak instrument test when `summary()` is called on an `ivreg` object with `diagnostics=TRUE`, and reports the results in the output.
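
For intuition, both tests can be sketched with auxiliary regressions in base R. The example below uses simulated data (all variable names and coefficients are hypothetical) with one endogenous regressor and two instruments. It follows the standard textbook regression forms of the tests and is offered as an illustration only, not as the exact computation behind the `ivreg` diagnostics.

```r
# Simulated data (hypothetical): x endogenous through u, instruments z1 and z2
set.seed(3)
n  <- 1000
z1 <- rnorm(n)
z2 <- rnorm(n)
w  <- rnorm(n)                         # exogenous regressor
u  <- rnorm(n)                         # unobserved confounder
x  <- 0.6 * z1 + 0.6 * z2 + 0.5 * w + u + rnorm(n)
y  <- 1 + 0.5 * x + 0.3 * w + u + rnorm(n)

# Wu-Hausman (regression form): add the first-stage residuals to the
# structural equation and test whether their coefficient is zero
first <- lm(x ~ w + z1 + z2)
vhat  <- resid(first)
aug   <- lm(y ~ x + w + vhat)
wu_p  <- summary(aug)$coefficients["vhat", "Pr(>|t|)"]

# Sargan: regress the 2SLS residuals on all exogenous variables;
# n * R^2 is compared with a chi-square distribution
xhat     <- fitted(first)
b        <- coef(lm(y ~ xhat + w))                 # 2SLS coefficients
e_iv     <- as.vector(y - cbind(1, x, w) %*% b)    # residuals use observed x
aux      <- lm(e_iv ~ w + z1 + z2)
sargan   <- n * summary(aux)$r.squared
sargan_p <- pchisq(sargan, df = 1, lower.tail = FALSE)  # df = 2 instruments - 1 endogenous

c(wu_hausman_p = wu_p, sargan_stat = sargan, sargan_p = sargan_p)
```

With a single endogenous regressor, the t-test on `vhat` is equivalent to the one-degree-of-freedom F-test. The Sargan degrees of freedom equal the number of instruments minus the number of endogenous regressors, which is why the test is unavailable in the exactly identified case.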

In [14]:
summary(mroz1.iv2, diagnostics=TRUE)


Call:
ivreg(formula = log(wage) ~ educ + exper + I(exper^2) | exper + 
    I(exper^2) + motheduc + fatheduc, data = mroz1)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.0986 -0.3196  0.0551  0.3689  2.3493 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)   
(Intercept)  0.0481003  0.4003281   0.120  0.90442   
educ         0.0613966  0.0314367   1.953  0.05147 . 
exper        0.0441704  0.0134325   3.288  0.00109 **
I(exper^2)  -0.0008990  0.0004017  -2.238  0.02574 * 

Diagnostic tests:
                 df1 df2 statistic p-value    
Weak instruments   2 423    55.400  <2e-16 ***
Wu-Hausman         1 423     2.793  0.0954 .  
Sargan             1  NA     0.378  0.5386    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.6747 on 424 degrees of freedom
Multiple R-Squared: 0.1357,	Adjusted R-squared: 0.1296 
Wald test: 8.141 on 3 and 424 DF,  p-value: 2.787e-05 


With one instrument:

In [15]:
summary(mroz1.iv1, diagnostics=TRUE)


Call:
ivreg(formula = log(wage) ~ educ + exper + I(exper^2) | exper + 
    I(exper^2) + motheduc, data = mroz1)

Residuals:
     Min       1Q   Median       3Q      Max 
-3.10804 -0.32633  0.06024  0.36772  2.34351 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)   
(Intercept)  0.1981861  0.4728772   0.419  0.67535   
educ         0.0492630  0.0374360   1.316  0.18891   
exper        0.0448558  0.0135768   3.304  0.00103 **
I(exper^2)  -0.0009221  0.0004064  -2.269  0.02377 * 

Diagnostic tests:
                 df1 df2 statistic p-value    
Weak instruments   1 424    73.946  <2e-16 ***
Wu-Hausman         1 423     2.968  0.0856 .  
Sargan             0  NA        NA      NA    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.6796 on 424 degrees of freedom
Multiple R-Squared: 0.1231,	Adjusted R-squared: 0.1169 
Wald test: 7.348 on 3 and 424 DF,  p-value: 8.228e-05 


The test results for the wage equation can be interpreted as:

1. Weak instruments test: rejects the null of weak instruments, meaning that at least one instrument is strong.
2. (Wu-)Hausman test for endogeneity: rejects the null that `educ` is uncorrelated with the error term only at the 10% level, indicating that `educ` is marginally endogenous.
3. Sargan test of overidentifying restrictions: does not reject the null, so there is no evidence that the extra instrument is correlated with the error term.

#### More Weak Instrument Tests

In [33]:
library(ivmodel)

#### With one instrument

In [40]:
ivmodel1 <- ivmodel(log(mroz1$wage), mroz1$educ, mroz1$motheduc, mroz1$exper+I(mroz1$exper^2), intercept = TRUE,
        beta0 = 0, alpha = 0.05, k = c(0, 1),
        manyweakSE = FALSE, heteroSE = FALSE, clusterID = NULL,
        deltarange = NULL, na.action = na.omit)
summary(ivmodel1)


Call:
ivmodel(Y = log(mroz1$wage), D = mroz1$educ, Z = mroz1$motheduc, 
    X = mroz1$exper + I(mroz1$exper^2), intercept = TRUE, beta0 = 0, 
    alpha = 0.05, k = c(0, 1), manyweakSE = FALSE, heteroSE = FALSE, 
    clusterID = NULL, deltarange = NULL, na.action = na.omit)
sample size: 428
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

First Stage Regression Result:

F=74.37008, df1=1, df2=425, p-value is < 2.22e-16
R-squared=0.1489278,   Adjusted R-squared=0.1469253
Residual standard error: 2.112016 on 426 degrees of freedom
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

Coefficients of k-Class Estimators:

             k Estimate Std. Error t value Pr(>|t|)    
OLS    0.00000  0.11019    0.01427   7.724 8.17e-14 ***
Fuller 0.99765  0.05294    0.03741   1.415    0.158    
TSLS   1.00000  0.05204    0.03768   1.381    0.168    
LIML   1.00000  0.05204    0.03768   1.381    0.168    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
_ _ _ 

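Note that `mroz1$exper+I(mroz1$exper^2)` adds the two vectors element-wise, so the call above controls for the single covariate `exper + exper^2` rather than for `exper` and `exper^2` separately. This is why the TSLS estimate (0.05204) and the first-stage F (74.37) differ slightly from the `ivreg` results (0.0493 and 73.946). Passing `X = cbind(mroz1$exper, mroz1$exper^2)` should reproduce the `ivreg` specification.

`AR.test` computes the Anderson-Rubin (1949) test for the `ivmodel` object as well as the associated confidence interval.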

Arguments into AR.test():
- ivmodel = ivmodel object
- beta0 = Null value β0 for testing null hypothesis H0 : β = β0 in ivmodel. Default is 0.
- alpha = The significance level for hypothesis testing. Default is 0.05.

In [41]:
 AR.test(ivmodel1, beta0 = 0, alpha = 0.05)

|    lower    |   upper   |
|:-----------:|:---------:|
| -0.02786563 | 0.1244509 |
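
The idea behind the Anderson-Rubin test can be sketched in base R: under H0 : β = β0, the instruments should have no explanatory power for `y - beta0 * x` once the exogenous covariates are controlled for. The example below uses simulated data and a hypothetical helper `ar_test()`. It is a rough illustration of the logic, not the `ivmodel` implementation, and its confidence interval comes from a simple grid search.

```r
# Simulated data (hypothetical): true coefficient on x is 0.5
set.seed(4)
n <- 1000
z <- rnorm(n)                          # instrument
w <- rnorm(n)                          # exogenous covariate
u <- rnorm(n)                          # unobserved confounder
x <- 0.8 * z + 0.5 * w + u + rnorm(n)  # endogenous regressor
y <- 1 + 0.5 * x + 0.3 * w + u + rnorm(n)

# Hypothetical helper: F-test that z is irrelevant for y - beta0 * x
ar_test <- function(beta0) {
  y0      <- y - beta0 * x
  full    <- lm(y0 ~ w + z)
  reduced <- lm(y0 ~ w)
  tab     <- anova(reduced, full)
  c(F = tab$F[2], p.value = tab[["Pr(>F)"]][2])
}

ar_test(0)     # H0: beta = 0
ar_test(0.5)   # H0: beta equals the value used in the simulation

# Confidence set by test inversion: keep every beta0 not rejected at 5%
grid <- seq(-1, 2, by = 0.01)
keep <- sapply(grid, function(b0) ar_test(b0)[["p.value"]] > 0.05)
range(grid[keep])
```

Because the test never relies on the first-stage estimate, its size does not depend on instrument strength. This is the main reason to report it alongside the usual 2SLS interval.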


It is also possible to construct confidence intervals for the IV estimates, with the Anderson-Rubin CI displayed among others.

Use `confint` to return a matrix of two columns, where each row is a confidence interval for a different IV approach, including the k-class estimators, Anderson and Rubin (AR), and CLR (Moreira 2003).

In [42]:
confint(ivmodel1)

|Estimator |        2.5%|     97.5%|
|:---------|-----------:|---------:|
|OLS       |  0.08214757| 0.1382262|
|Fuller    | -0.02058514| 0.1264713|
|TSLS      | -0.02202749| 0.1261006|
|LIML      | -0.02202749| 0.1261006|
|AR        | -0.02786563| 0.1244509|
|CLR       | -0.02786596| 0.1244511|


In [45]:
confint(ivmodel1,level=0.90)

|Estimator |           5%|       95%|
|:---------|------------:|---------:|
|OLS       |    0.0866713| 0.1337024|
|Fuller    | -0.008722431| 0.1146086|
|TSLS      | -0.010078334| 0.1141515|
|LIML      | -0.010078334| 0.1141515|
|AR        | -0.013889445| 0.1127801|
|CLR       |  -0.01388978| 0.1127804|


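#### With two instruments

Note that in the call below `mroz1$motheduc+mroz1$fatheduc` is the element-wise sum of the two variables, so `ivmodel` sees one combined instrument (the output reports `df1=1`), and the two experience terms are likewise summed into one covariate. To use the two instruments separately, as in `mroz1.iv2`, pass `Z = cbind(mroz1$motheduc, mroz1$fatheduc)` and `X = cbind(mroz1$exper, mroz1$exper^2)`.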

In [20]:
ivmodel(log(mroz1$wage), mroz1$educ, mroz1$motheduc+mroz1$fatheduc, mroz1$exper+I(mroz1$exper^2), intercept = TRUE,
        beta0 = 0, alpha = 0.05, k = c(0, 1),
        manyweakSE = FALSE, heteroSE = FALSE, clusterID = NULL,
        deltarange = NULL, na.action = na.omit)


Call:
ivmodel(Y = log(mroz1$wage), D = mroz1$educ, Z = mroz1$motheduc + 
    mroz1$fatheduc, X = mroz1$exper + I(mroz1$exper^2), intercept = TRUE, 
    beta0 = 0, alpha = 0.05, k = c(0, 1), manyweakSE = FALSE, 
    heteroSE = FALSE, clusterID = NULL, deltarange = NULL, na.action = na.omit)
sample size: 428
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

First Stage Regression Result:

F=111.3931, df1=1, df2=425, p-value is < 2.22e-16
R-squared=0.2076707,   Adjusted R-squared=0.2058063
Residual standard error: 2.037825 on 426 degrees of freedom
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

Coefficients of k-Class Estimators:

             k Estimate Std. Error t value Pr(>|t|)    
OLS    0.00000  0.11019    0.01427   7.724 8.17e-14 ***
Fuller 0.99765  0.06384    0.03155   2.023   0.0436 *  
TSLS   1.00000  0.06331    0.03170   1.997   0.0464 *  
LIML   1.00000  0.06331    0.03170   1.997   0.0464 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’