# How Violations of the Monotonicity and First-Stage Assumptions Affect IV Estimates

In [1]:
## Set seed and parameters
library(lfe) # for OLS (easier robust SE)
library(AER) # for IV
library(data.table)
set.seed(20015)
N <- 50000


Let's look at the effect of confidence on test scores. Since both confidence and scores are influenced by the student's underlying talent, we need an instrument to experimentally vary their confidence. We therefore expose half of them to a priming treatment to increase their confidence. The priming effects are heterogeneous. For now, we'll assume they are monotonic and non-zero in expectation. Later, we'll violate these assumptions.
* *talent* is an unobserved (latent) variable that drives both confidence and score
* *primed* is a binary indicator for whether a student was primed
* *confidence* is how well the student expects to do after being exposed to the prime or placebo
* *score* is how well the student actually does

In [2]:
# Note that this is a data.table, which will make adding variables much easier after this
d <- data.table(talent = runif(N, 0, 100),
                primed = rbinom(N, size=1, prob=0.5))

First-stage effects of priming on confidence are heterogeneous, ranging from 0 to 10

In [3]:
d[, stage.1.effect := 10*runif(N, 0, 1)]

Confidence equals talent plus the stage.1.effect (if student was primed) plus noise

In [4]:
d[, confidence := talent + stage.1.effect * primed + runif(N, -10, 10)]

A student's score is 75% the direct result of talent and 25% the result of confidence

In [5]:
d[, score := talent*0.75 + confidence * 0.25]
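
Collecting the cells above, the simulated data-generating process can be written compactly. The symbols $\pi_i$ (the student-specific first-stage effect, `stage.1.effect` in the code) and $\varepsilon_i$ (the noise term) are notation introduced here for illustration; they do not appear in the code:

$$
\begin{aligned}
\text{talent}_i &\sim U(0, 100), \qquad \text{primed}_i \sim \text{Bernoulli}(0.5), \\
\pi_i &\sim U(0, 10), \qquad \varepsilon_i \sim U(-10, 10), \\
\text{confidence}_i &= \text{talent}_i + \pi_i \, \text{primed}_i + \varepsilon_i, \\
\text{score}_i &= 0.75 \, \text{talent}_i + 0.25 \, \text{confidence}_i .
\end{aligned}
$$

The coefficient 0.25 on confidence is the effect the IV estimator is meant to recover.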

A peek at the first stage - priming clearly affects confidence (effect is about 5)

In [6]:
d[,mean(confidence), by = primed]

primed,V1
<int>,<dbl>
0,49.96717
1,54.93231
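
The same first-stage check can be run as a regression of the treatment on the instrument, which also gives an F-statistic (a common rule of thumb treats F above 10 as a reasonably strong instrument). This is a minimal sketch that rebuilds the data from the cells above so it runs on its own; the names `first.stage` and `diff.in.means` are introduced here for illustration.

```r
library(data.table)

# Rebuild the simulated data exactly as in the cells above
set.seed(20015)
N <- 50000
d <- data.table(talent = runif(N, 0, 100),
                primed = rbinom(N, size = 1, prob = 0.5))
d[, stage.1.effect := 10 * runif(N, 0, 1)]
d[, confidence := talent + stage.1.effect * primed + runif(N, -10, 10)]

# First stage: regress the treatment (confidence) on the instrument (primed)
first.stage <- lm(confidence ~ primed, data = d)

# With a binary instrument, the slope equals the difference in group means
diff.in.means <- d[primed == 1, mean(confidence)] - d[primed == 0, mean(confidence)]

print(coef(first.stage)[["primed"]])
print(summary(first.stage)$fstatistic[["value"]])
```

With a single binary instrument, the regression slope and the difference in means are the same number, so the regression adds only the standard error and the F-statistic.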


A peek at the reduced form - priming clearly affects score (effect is about 1.1)

In [7]:
d[,mean(score), by = primed]

primed,V1
<int>,<dbl>
0,50.00487
1,51.14293
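
Dividing the reduced-form difference by the first-stage difference gives the Wald (ratio) form of the IV estimate. The sketch below computes it directly from the simulated data, rebuilding the data so it runs on its own. The names `first.stage`, `reduced.form`, `wald`, and `cov.ratio` are introduced here for illustration.

```r
library(data.table)

# Rebuild the simulated data exactly as in the cells above
set.seed(20015)
N <- 50000
d <- data.table(talent = runif(N, 0, 100),
                primed = rbinom(N, size = 1, prob = 0.5))
d[, stage.1.effect := 10 * runif(N, 0, 1)]
d[, confidence := talent + stage.1.effect * primed + runif(N, -10, 10)]
d[, score := talent * 0.75 + confidence * 0.25]

# First stage: effect of the instrument on the treatment
first.stage <- d[primed == 1, mean(confidence)] - d[primed == 0, mean(confidence)]

# Reduced form: effect of the instrument on the outcome
reduced.form <- d[primed == 1, mean(score)] - d[primed == 0, mean(score)]

# Wald estimate: reduced form divided by first stage
wald <- reduced.form / first.stage

# Equivalent covariance form of the same estimator
cov.ratio <- d[, cov(score, primed) / cov(confidence, primed)]

print(c(first.stage = first.stage, reduced.form = reduced.form,
        wald = wald, cov.ratio = cov.ratio))
```

With one binary instrument and no covariates, the ratio of differences in means and the ratio of covariances are algebraically identical.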


1.1/5 = 0.22. Thus, the IV estimator (the reduced-form effect divided by the first-stage effect) correctly estimates that confidence is responsible for about a quarter of a student's score. Let's see if the 2SLS estimator gives us the same thing.

In [9]:
IV <- ivreg(score ~ confidence | primed , data = d)
# conventional SEs; summary() for ivreg objects takes robust SEs via vcov. = sandwich, not robust=TRUE
summary(IV)


Call:
ivreg(formula = score ~ confidence | primed, data = d)

Residuals:
     Min       1Q   Median       3Q      Max 
-38.7172 -19.1978  -0.2206  19.2539  38.8447 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 38.55188    2.10490  18.315  < 2e-16 ***
confidence   0.22921    0.04009   5.718 1.09e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 22.25 on 49998 degrees of freedom
Multiple R-Squared: 0.4078,	Adjusted R-squared: 0.4078 
Wald test: 32.69 on 1 and 49998 DF,  p-value: 1.086e-08 
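
A note on standard errors: in AER, `summary()` for `ivreg` objects takes an alternative covariance estimator through its `vcov.` argument. The sketch below requests heteroskedasticity-robust (sandwich) standard errors that way, rebuilding the data from the earlier cells so it runs on its own. The names `conventional` and `robust` are introduced here for illustration.

```r
library(AER)         # ivreg(); also attaches sandwich
library(data.table)

# Rebuild the simulated data exactly as in the cells above
set.seed(20015)
N <- 50000
d <- data.table(talent = runif(N, 0, 100),
                primed = rbinom(N, size = 1, prob = 0.5))
d[, stage.1.effect := 10 * runif(N, 0, 1)]
d[, confidence := talent + stage.1.effect * primed + runif(N, -10, 10)]
d[, score := talent * 0.75 + confidence * 0.25]

IV <- ivreg(score ~ confidence | primed, data = d)

# Conventional (homoskedastic) standard errors
conventional <- summary(IV)$coefficients

# Heteroskedasticity-robust standard errors via the sandwich estimator
robust <- summary(IV, vcov. = sandwich)$coefficients

print(cbind(estimate        = conventional[, "Estimate"],
            conventional.se = conventional[, "Std. Error"],
            robust.se       = robust[, "Std. Error"]))
```

The point estimates are the same under both choices; only the standard errors, test statistics, and p-values change.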


Now let's see what happens when there are defiers in the population. The following function lets you vary the proportion of defiers by shifting the range of first-stage effects downwards from [0,10] (no defiers) all the way down to [-10,0] (all defiers).

In [10]:
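change_first_stage <- function(prop.defiers){
    # copy() so the global data.table d is not modified by reference
    e <- copy(d)
    # shift the first-stage effects so that a share prop.defiers of them is negative
    e[, stage.1.effect := 10*runif(N, 0-prop.defiers, 1-prop.defiers)]
    e[, confidence := talent + stage.1.effect*primed + runif(N, -10, 10)]
    e[, score := talent*0.75 + confidence * 0.25]
    IV <- ivreg(score ~ confidence | primed , data = e)
    s <- summary(IV)
    # for the confidence coefficient, return the estimate (column 1) and p-value (column 4)
    return(c("prop defiers" = prop.defiers, round(s$coefficients[2,c(1,4)],3)))
}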

In [11]:
values = seq(0,1,0.1)
lapply( values, change_first_stage)

Note that as the proportion of defiers increases, the estimates become downwardly biased and the variance gets worse. At the point where half the sample are defiers, the first-stage assumption fails: on average, the instrument (priming) no longer has any effect on the treatment (confidence). As the proportion of defiers surpasses 0.5, however, there is once again a correlation between instrument and treatment, this time a negative one. The variance is still high but begins to drop. The bias is now upward, but it is shrinking as well. Eventually, when the entire population consists of defiers, the bias vanishes and the estimate is as precise as when we started.

Moral: "Defier" is in the eye of the beholder. One researcher's defier is another researcher's complier. What matters is how close the first stage is to being monotonic (either upwards or downwards), as this affects both bias and variance.
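
To see why the first stage disappears at a defier share of 0.5, it helps to look at the first-stage effects in isolation. With effects drawn as `10*runif(N, 0-p, 1-p)`, the expected first-stage effect is `10*(0.5 - p)`. The sketch below tabulates that expectation next to a simulated mean and the simulated share of negative effects. It covers only the first-stage draws, not the IV estimates, and the names `sim` and `first.stage` are introduced here for illustration.

```r
set.seed(20015)
N <- 50000
prop.defiers <- seq(0, 1, 0.1)

# For each defier share p, draw first-stage effects as in the function above
# and record their mean and the share that is negative (the defiers)
sim <- t(sapply(prop.defiers, function(p) {
    effects <- 10 * runif(N, 0 - p, 1 - p)
    c(simulated.mean = mean(effects), share.negative = mean(effects < 0))
}))

# Expected first-stage effect is the midpoint of the shifted range: 10 * (0.5 - p)
first.stage <- data.frame(prop.defiers,
                          expected.mean = 10 * (0.5 - prop.defiers),
                          sim)
print(round(first.stage, 2))
```

The expected first stage is symmetric around a defier share of 0.5: it has the same magnitude at shares `p` and `1 - p`, with opposite sign.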