# Using Instrumental Variables for Treatment Effects in Quasi-Experiments

This is an example from Chapter 11 of the book [Methods Matter: Improving Causal Inference in Educational and Social Science](https://www.amazon.com/Methods-Matter-Improving-Inference-Educational-ebook/dp/B00HNSNBO4) by Richard Murnane and John Willett. This chapter explains how we can use Instrumental Variables Estimation to get a treatment effect using observational data.

In this example, we use IV to recover the treatment effect in cases where random assignment is applied to an offer to participate, where not everyone who receives the offer participates (compliers AND never-takers allowed!), and where other people participate through some other means (always-takers). Specifically, we would like to evaluate the effect of a financial aid offer on the likelihood of a student finishing 8th grade in Bogotá, Colombia.

The data is sampled from [Vouchers for Private Schooling in Colombia: Evidence from a Randomized Natural Experiment](http://www.nber.org/papers/w8343) (2002) by Joshua Angrist and others.

Python implementation credit goes to J. Nathan Matias, who provides implementations of these methods in a [GitHub repository](https://github.com/natematias/research_in_python).

In [1]:
import pandas as pd
import math
import statsmodels.formula.api as smf  # for doing statistical regression
import statsmodels.api as sm      # access to the wider statsmodels library, including R datasets


In [2]:
# Helper functions
def coefficient_to_odds_ratio(coefficient):
    """
    Interpret a coefficient from a logistic regression model as an odds ratio.

    Given:
    coefficient -- coefficient from logistic regression output (log-odds units)

    Prints:
    the odds ratio, i.e. exp(coefficient)
    """
    odds = math.exp(coefficient)
    print("The odds of finishing the 8th grade are {:.2f} times as high for those using financial aid.".format(odds))
    
def ate_odds_ratio(prob0, prob1):
    """
    This function calculates the average treatment effect as an odds ratio
    when the outcome is binary.

    Given:
    prob0 -- probability given no treatment
    prob1 -- probability given treatment

    Prints:
    odds ratio, i.e. [prob1/(1-prob1)] / [prob0/(1-prob0)]
    """
    # An odds ratio is the ratio of the two odds, not their difference
    odds = (prob1/(1-prob1)) / (prob0/(1-prob0))
    print("Odds ratio: {}".format(odds))
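
As a quick check of the arithmetic behind these helpers, here is a minimal sketch that works through one example by hand. The two probabilities (0.50 and 0.60) are made-up values chosen for illustration, not estimates from the voucher data, and the variable names are hypothetical.

```python
import math

# Hypothetical completion probabilities, chosen only for illustration
prob_untreated = 0.50
prob_treated = 0.60

# Odds are p / (1 - p); the odds ratio divides one set of odds by the other
odds_untreated = prob_untreated / (1 - prob_untreated)
odds_treated = prob_treated / (1 - prob_treated)
odds_ratio = odds_treated / odds_untreated

# In a logistic regression with a single binary predictor, the coefficient
# is the log of this ratio, so exponentiating it recovers the odds ratio.
log_odds_coefficient = math.log(odds_ratio)
recovered = math.exp(log_odds_coefficient)

print("odds ratio: {:.2f}, recovered from log-odds: {:.2f}".format(odds_ratio, recovered))
```

Note that an odds ratio of 1.5 does not mean treated students are 1.5 times as likely to finish: here the probability only rises from 0.50 to 0.60.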

## Importing the Dataset

The dataset includes the following variables:
- `finish8th`: did the student finish 8th grade or not (outcome variable)
- `won_lottry`: won the lottery to receive offer of financial aid
- `use_fin_aid`: did the student use financial aid of any kind (not exclusive to the lottery) or not
- `base_age`: student age
- `male`: is the student male or not

In [3]:
# Import data set
voucher_df = pd.read_sas('colvoucher.sas7bdat') # reading in sas file
voucher_df.head()

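| | id | won_lottry | male | base_age | finish8th | use_fin_aid |
|---|---|---|---|---|---|---|
| 0 | 3.0 | 1.0 | 0.0 | 11.0 | 1.0 | 1.0 |
| 1 | 4.0 | 0.0 | 1.0 | 11.0 | 1.0 | 1.0 |
| 2 | 5.0 | 0.0 | 1.0 | 11.0 | 1.0 | 0.0 |
| 3 | 6.0 | 0.0 | 0.0 | 9.0 | 0.0 | 0.0 |
| 4 | 10.0 | 1.0 | 1.0 | 11.0 | 1.0 | 1.0 |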


## Summary Statistics

In [None]:
print("==============================================================================")
print("                              OVERALL SUMMARY"                                 )
print("==============================================================================")

print(voucher_df.describe())

for i in range(2):
    print("==============================================================================")
    print("                         USE FINANCIAL AID = %(i)d" % {"i":i}                  )
    print("==============================================================================")
    print(voucher_df[voucher_df['use_fin_aid']==i].describe())

## Average Treatment Effect

We can calculate the ATE empirically using the summary statistics above. Note, however, that this is an unconditional average treatment effect: it does not condition on any covariates.

Use the fact that

$ ATE = E[Y|T=1] - E[Y|T=0] $.

In the case of a binary outcome, this becomes

$ ATE = Pr[Y=1|T=1] - Pr[Y=1|T=0] $

and we substitute the conditional expectations/probabilities with their empirical estimates.
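
To make the "empirical estimates" step concrete, here is a minimal sketch on a tiny hand-made table. The eight rows are toy values, not the Colombia sample, and `toy_df`, `prob_by_group` and `toy_ate` are hypothetical names; only the column names are borrowed from the dataset description above.

```python
import pandas as pd

# Toy data, NOT the Colombia sample: 4 treated and 4 untreated students
toy_df = pd.DataFrame({
    "use_fin_aid": [1, 1, 1, 1, 0, 0, 0, 0],
    "finish8th":   [1, 1, 1, 0, 1, 0, 1, 0],
})

# The empirical Pr[Y=1 | T=t] is the mean of the binary outcome within group t
prob_by_group = toy_df.groupby("use_fin_aid")["finish8th"].mean()

# Difference in proportions between the treated and untreated groups
toy_ate = prob_by_group.loc[1] - prob_by_group.loc[0]

print(prob_by_group)
print("Difference in proportions:", toy_ate)
```

In this toy table, 3 of 4 treated students and 2 of 4 untreated students finish, so the difference is 0.75 - 0.50 = 0.25.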

In [None]:
# Calculate unconditional average treatment effect
ate = ...
ate

In [None]:
# Calculate this in terms of odds ratio
ate_odds_ratio(..., ...)

## A Naive Logistic Regression

In [None]:
print("==============================================================================")
print("                            LOGISTIC REGRESSION"                               )
print("==============================================================================")
result = smf.glm(formula = "...", 
                 data=voucher_df,
                 family=sm.families.Binomial()).fit()
print(result.summary())

In [None]:
# Log-odds units to odds ratio units
coefficient_to_odds_ratio(...)

## Two Stage Least Squares Estimation

Recall that using 2SLS estimation for the instrumental variables model requires estimating two equations:

__Stage 1:__ 

$ T = Z\alpha + X\psi + \nu $

__Stage 2:__

$ Y = {\hat{T}} \gamma + X\beta + \epsilon $
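
The two stages can be sketched on simulated data. This is only an illustration under stated assumptions: every number in the data-generating process is made up, `ols` is a hypothetical helper, there are no covariates $X$, and both stages use ordinary least squares to match the linear equations above (the exercise cells in this notebook use `smf.glm` with a binomial family instead).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Simulated data (all coefficients are hypothetical)
z = rng.integers(0, 2, size=n)     # instrument, randomly assigned
u = rng.normal(size=n)             # unobserved confounder
t = (0.5 * z + 0.5 * u + rng.normal(size=n) > 0.5).astype(float)  # treatment
true_effect = 2.0
y = true_effect * t + 1.5 * u + rng.normal(size=n)                # outcome

def ols(design, target):
    """Return least-squares coefficients for target ~ design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

const = np.ones(n)

# Naive regression of Y on T: biased, because u drives both T and Y
naive = ols(np.column_stack([const, t]), y)[1]

# Stage 1: regress T on the instrument and keep the fitted values
stage1_design = np.column_stack([const, z])
t_hat = stage1_design @ ols(stage1_design, t)

# Stage 2: regress Y on the fitted treatment
iv_estimate = ols(np.column_stack([const, t_hat]), y)[1]

print("naive: {:.2f}, 2SLS: {:.2f}, truth: {:.2f}".format(naive, iv_estimate, true_effect))
```

The naive estimate should land well above the true effect, while the two-stage estimate should be close to it. Standard errors taken from a hand-run second stage like this one are not valid without a correction, so the sketch reports point estimates only.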

The instrument we choose must satisfy the two conditions:
 1. __Instrument relevance:__ $ Cov(Z, T) \neq 0 $
 2. __Instrument exogeneity:__ $ Cov(Z, \epsilon) = 0 $
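
Of these two conditions, only relevance can be checked directly from observed data, because $\epsilon$ is never observed. The sketch below illustrates both on simulated data where the unobserved term is known by construction. The take-up rule and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

z = rng.integers(0, 2, size=n)     # hypothetical lottery indicator
u = rng.normal(size=n)             # unobserved factor, drawn independently of z
# Hypothetical take-up rule: the offer raises take-up, and u matters too
t = (z + u > 0.5).astype(float)

# Relevance: Cov(Z, T) should be clearly non-zero
cov_zt = np.cov(z, t)[0, 1]

# Exogeneity: Cov(Z, u) is near zero here only because we simulated it that way;
# with real data u is unobserved and this line could not be computed
cov_zu = np.cov(z, u)[0, 1]

# First-stage strength expressed as the gap in take-up rates
take_up_gap = t[z == 1].mean() - t[z == 0].mean()

print("Cov(Z,T): {:.3f}, Cov(Z,u): {:.3f}, take-up gap: {:.3f}".format(cov_zt, cov_zu, take_up_gap))
```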
 
Why is there endogeneity and what instrument is available in our dataset to combat this?

In [None]:
print("==============================================================================")
print("                                  FIRST STAGE"                                 )
print("==============================================================================")
result = smf.glm(formula = "...", 
                 data=voucher_df,
                 family=sm.families.Binomial()).fit()
voucher_df['use_fin_aid_fitted']= result.predict()
print(result.summary())

In [None]:
print()
print()
print("==============================================================================")
print("                                  SECOND STAGE"                                )
print("==============================================================================")
result = smf.glm(formula = " ...", 
                 data=voucher_df,
                 family=sm.families.Binomial()).fit()
print(result.summary())

In [None]:
# Transforming the log-odds units into odds ratio
coefficient_to_odds_ratio(...)

## Interpreting the Local Average Treatment Effect

When we use IV to get a causal effect, what are our results actually telling us? According to Murnane and Willett, "an estimate of a treatment effect obtained by IV methods should be regarded as an __estimated local average treatment effect (LATE)__." As mentioned earlier in the lecture slides, the LATE estimate can depend on your choice of instruments.

As stated in the book:

- Compliers "are willing to have their behavior determined by the outcomes of the lottery, regardless of the particular experimental conditions to which they were assigned" (278).
- Always-takers "are families who will find and make use of financial aid to pay private-school fees" regardless of the lottery. They may find aid outside the lottery.
- Never-takers are the mirror image of always-takers: "they will not make use of financial aid to pay childrens' fees at a private secondary school under any circumstances" (278).
- (There are other possible groups, like "defiers" (Gennetian et al., 2005), who always do the opposite of what investigators ask them to do, but we make the assumption of "no defiers" in this dataset.)

In this context, the IV estimate of the __local average treatment effect (LATE) for this quasi-experiment applies only to "compliers"__, and not to never-takers or always-takers.
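
A small simulation can illustrate why the estimate is "local". The sketch below builds a population of compliers, always-takers and never-takers (no defiers) and computes the simple IV ratio of the offer's effect on the outcome to its effect on take-up. The group shares, baseline probabilities and effect sizes are all hypothetical values, not taken from the voucher study.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Hypothetical population shares: 50% compliers, 20% always-takers, 30% never-takers
kind = rng.choice(["complier", "always", "never"], size=n, p=[0.5, 0.2, 0.3])
z = rng.integers(0, 2, size=n)     # randomized offer

# Take-up follows each group's definition: compliers do what the offer says
t = np.where(kind == "always", 1, np.where(kind == "never", 0, z))

# Hypothetical baseline completion probabilities and treatment effects by group
base = np.select([kind == "complier", kind == "always"], [0.5, 0.7], default=0.4)
effect = np.select([kind == "complier", kind == "always"], [0.20, 0.05], default=0.0)
y = (rng.random(n) < base + effect * t).astype(float)

# IV (Wald) ratio: effect of the offer on the outcome / effect of the offer on take-up
reduced_form = y[z == 1].mean() - y[z == 0].mean()
first_stage = t[z == 1].mean() - t[z == 0].mean()
wald = reduced_form / first_stage

# Naive comparison of users and non-users mixes the three groups together
naive_gap = y[t == 1].mean() - y[t == 0].mean()

print("IV ratio: {:.3f}, naive gap: {:.3f}".format(wald, naive_gap))
```

Under these assumptions the IV ratio should sit near 0.20, the effect assigned to compliers, while the always-takers' effect of 0.05 does not enter it because the offer never changes their take-up.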

__Testing instrument validity and instrument strength is an established line of research that we can point you to if you are interested!__