# Lab 4: Instrumental Variables

This lab covers instrumental variables (IV) estimation through two applications:

- **Part 1**: Children's Television and Educational Performance (Sesame Street experiment)
- **Part 2**: Do Institutions Cause Growth? (Acemoglu, Johnson & Robinson, 2001)

We estimate the Intent-to-Treat (ITT) effect, the Local Average Treatment Effect (LATE) via the Wald estimator, and two-stage least squares (2SLS).

In [1]:
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf
from linearmodels.iv import IV2SLS

## Part 1: Children's Television and Educational Performance

In the early 1970s, the Educational Testing Service conducted an experiment to evaluate the educational impact of Sesame Street. Children were randomly assigned to receive encouragement to watch the show, but compliance was imperfect: some encouraged children did not watch, and some non-encouraged children did.

- **Instrument** ($Z$): `encouraged` (random assignment to encouragement to watch)
- **Treatment** ($D$): `watched` (actually watching Sesame Street)
- **Outcome** ($Y$): `letters` (post-test literacy score, 0-63)

### Question 1: Data Exploration

In [2]:
sesame = pd.read_csv('../data/lab4/iv_part1.csv')

print(f'Shape: {sesame.shape}')
print(f'\nSummary:')
sesame.describe()

Shape: (240, 3)

Summary:


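| | encouraged | watched | letters |
|-------|------------|---------|---------|
| count | 240.0 | 240.0 | 240.0 |
| mean | 0.633333 | 0.775 | 26.741667 |
| std | 0.482902 | 0.418455 | 13.375176 |
| min | 0.0 | 0.0 | 0.0 |
| 25% | 0.0 | 1.0 | 15.0 |
| 50% | 1.0 | 1.0 | 23.0 |
| 75% | 1.0 | 1.0 | 39.25 |
| max | 1.0 | 1.0 | 63.0 |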


### Question 2: Unit Types

In the IV framework with imperfect compliance, we distinguish four types:

| Type | $D(Z=1)$ | $D(Z=0)$ | Description |
|------|-----------|-----------|-------------|
| **Compliers** | Watch | Don't watch | Respond to encouragement |
| **Always-takers** | Watch | Watch | Would watch regardless |
| **Never-takers** | Don't watch | Don't watch | Would not watch regardless |
| **Defiers** | Don't watch | Watch | Do the opposite (assumed away) |

### Question 3: Compliance Analysis

Examine the cross-tabulation of assignment and actual treatment to understand the extent of non-compliance.

In [3]:
# Cross-tabulation
print('=== Counts ===')
ct = pd.crosstab(sesame['encouraged'], sesame['watched'], margins=True)
ct.index = ['Not encouraged', 'Encouraged', 'Total']
ct.columns = ['Not watched', 'Watched', 'Total']
print(ct)

print('\n=== Proportions (by row) ===')
pt = pd.crosstab(sesame['encouraged'], sesame['watched'], normalize='index')
pt.index = ['Not encouraged', 'Encouraged']
pt.columns = ['Not watched', 'Watched']
print(pt.round(4))

=== Counts ===
                Not watched  Watched  Total
Not encouraged           40       48     88
Encouraged               14      138    152
Total                    54      186    240

=== Proportions (by row) ===
                Not watched  Watched
Not encouraged       0.4545   0.5455
Encouraged           0.0921   0.9079


Non-compliance is two-sided: some encouraged children did not watch (never-takers in the encouraged group, assuming no defiers), and some non-encouraged children watched anyway (always-takers in the non-encouraged group).
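
The row proportions above are enough to back out the share of each unit type. The sketch below is illustrative only: it hard-codes the two proportions from the cross-tabulation, assumes monotonicity (no defiers) and random assignment, and uses variable names that are not part of the lab's own code.

```python
# Row proportions copied from the cross-tabulation above.
p_watch_encouraged = 0.9079      # P(D = 1 | Z = 1)
p_watch_not_encouraged = 0.5455  # P(D = 1 | Z = 0)

# Under monotonicity and random assignment:
# - anyone who watches without encouragement is an always-taker;
# - anyone who does not watch despite encouragement is a never-taker;
# - compliers are the remainder.
share_always_takers = p_watch_not_encouraged
share_never_takers = 1 - p_watch_encouraged
share_compliers = 1 - share_always_takers - share_never_takers

print(f'Always-takers: {share_always_takers:.4f}')
print(f'Never-takers:  {share_never_takers:.4f}')
print(f'Compliers:     {share_compliers:.4f}')
```

The complier share obtained this way is the same quantity that Question 4 computes as a difference in watching rates.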

### Question 4: Proportion of Compliers

Under the monotonicity assumption (no defiers):

$$\pi_c = E[D_i | Z_i = 1] - E[D_i | Z_i = 0]$$

In [4]:
d_z1 = sesame.loc[sesame['encouraged'] == 1, 'watched'].mean()
d_z0 = sesame.loc[sesame['encouraged'] == 0, 'watched'].mean()
proportion_compliers = d_z1 - d_z0

print(f'P(watched | encouraged):     {d_z1:.4f}')
print(f'P(watched | not encouraged): {d_z0:.4f}')
print(f'Proportion of compliers:     {proportion_compliers:.4f}')

P(watched | encouraged):     0.9079
P(watched | not encouraged): 0.5455
Proportion of compliers:     0.3624


### Question 5: Intent-to-Treat (ITT) Effect

The ITT estimates the causal effect of *assignment* (not treatment) on the outcome:

$$\text{ITT} = E[Y_i | Z_i = 1] - E[Y_i | Z_i = 0]$$

In [5]:
y_z1 = sesame.loc[sesame['encouraged'] == 1, 'letters'].mean()
y_z0 = sesame.loc[sesame['encouraged'] == 0, 'letters'].mean()
itt = y_z1 - y_z0

print(f'Mean letters (encouraged):     {y_z1:.4f}')
print(f'Mean letters (not encouraged): {y_z0:.4f}')
print(f'ITT:                           {itt:.4f}')

print('\nVerification via OLS:')
itt_model = smf.ols('letters ~ encouraged', data=sesame).fit()
print(itt_model.summary().tables[1])

Mean letters (encouraged):     27.7961
Mean letters (not encouraged): 24.9205
ITT:                           2.8756

Verification via OLS:
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     24.9205      1.421     17.536      0.000      22.121      27.720
encouraged     2.8756      1.786      1.610      0.109      -0.642       6.393


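The ITT averages the effect of watching over everyone assigned to encouragement, so it is diluted by always-takers and never-takers, whose viewing does not respond to assignment. The point estimate suggests that being *encouraged* to watch Sesame Street raises literacy scores by about 2.9 points, but the estimate is imprecise: the 95% confidence interval (-0.642, 6.393) includes zero ($p = 0.109$).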

### Questions 6-7: Local Average Treatment Effect (LATE)

The LATE estimates the effect of actually *watching* for compliers, using the Wald estimator:

$$\text{LATE} = \frac{\text{ITT}}{\pi_c} = \frac{E[Y_i|Z_i=1] - E[Y_i|Z_i=0]}{E[D_i|Z_i=1] - E[D_i|Z_i=0]}$$

In [6]:
late = itt / proportion_compliers

print(f'ITT:                     {itt:.4f}')
print(f'Proportion of compliers: {proportion_compliers:.4f}')
print(f'LATE (Wald estimator):   {late:.4f}')

ITT:                     2.8756
Proportion of compliers: 0.3624
LATE (Wald estimator):   7.9340


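The LATE is substantially larger than the ITT because it divides the ITT by the complier share (0.3624), undoing the dilution from always-takers and never-takers. For compliers, the children who watch only when encouraged, watching Sesame Street increased literacy scores by approximately 7.9 points. Because the ITT itself is imprecisely estimated, so is this ratio.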
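
To see why the Wald ratio recovers the complier effect while a naive comparison of watchers and non-watchers does not, a small simulation can help. Everything in this sketch is hypothetical: the type shares, baseline scores, and the 8-point effect are made-up values chosen for illustration, not estimates from the Sesame Street data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical population: 35% compliers, 55% always-takers, 10% never-takers.
types = rng.choice(['complier', 'always', 'never'], size=n, p=[0.35, 0.55, 0.10])
z = rng.integers(0, 2, size=n)  # randomized encouragement (0 or 1)

# Take-up implied by each type: always-takers watch, never-takers do not,
# compliers follow their assignment.
d = np.where(types == 'always', 1, np.where(types == 'never', 0, z))

# Baseline scores differ by type (selection); watching adds 8 points.
baseline = np.select(
    [types == 'complier', types == 'always', types == 'never'],
    [20.0, 30.0, 15.0],
)
y = baseline + 8.0 * d + rng.normal(0, 10, size=n)

# Wald estimator: ITT divided by the first-stage difference in take-up.
itt_sim = y[z == 1].mean() - y[z == 0].mean()
pi_c_sim = d[z == 1].mean() - d[z == 0].mean()
wald = itt_sim / pi_c_sim

# The same estimate written as a ratio of sample covariances.
iv_cov = np.cov(y, z)[0, 1] / np.cov(d, z)[0, 1]

# Naive comparison of watchers and non-watchers, contaminated by selection.
naive = y[d == 1].mean() - y[d == 0].mean()

print(f'Wald estimate:        {wald:.3f}')
print(f'Covariance-ratio IV:  {iv_cov:.3f}')
print(f'Naive watched gap:    {naive:.3f}')
```

With a binary instrument, the Wald ratio and the covariance ratio coincide exactly, which is why 2SLS with a single binary instrument and no covariates reproduces the Wald estimate.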

### Question 8: ITT vs. LATE

The **LATE** is of greater interest to Sesame Street's producers because it isolates the causal effect of *actually watching* the show, whereas the ITT conflates the treatment effect with non-compliance rates.

### Question 9: Exclusion Restriction

The exclusion restriction requires that encouragement affects literacy *only through* its effect on watching behavior. This is plausible: being told to watch Sesame Street should not directly improve literacy unless the child actually watches. However, one could argue that encouragement might prompt parents to engage in other educational activities, potentially violating the restriction.

---

## Part 2: Do Institutions Cause Growth?

**Acemoglu, D., Johnson, S. & Robinson, J.A. (2001).** *The Colonial Origins of Comparative Development: An Empirical Investigation.* American Economic Review, 91(5), 1369-1401.

AJR argue that European colonial settlers established different types of institutions depending on local disease environments. Where settler mortality was high, extractive institutions were established; where it was low, settlers replicated inclusive European institutions.

- **Outcome** ($Y$): `GDP` (log GDP per capita)
- **Treatment** ($D$): `Exprop` (protection against expropriation, a proxy for institutional quality)
- **Instrument** ($Z$): `logMort` (log settler mortality)

In [7]:
ajr = pd.read_csv('../data/lab4/iv_part2.csv')

print(f'Shape: {ajr.shape}')
ajr.describe()

Shape: (64, 11)


| | GDP | Exprop | Mort | Latitude | Neo | Africa | Asia | Namer | Samer | logMort | Latitude2 |
|-------|-----|--------|------|----------|-----|--------|------|-------|-------|---------|-----------|
| count | 64.0 | 64.0 | 64.0 | 64.0 | 64.0 | 64.0 | 64.0 | 64.0 | 64.0 | 64.0 | 64.0 |
| mean | 8.0625 | 6.516094 | 245.911094 | 0.190483 | 0.0625 | 0.421875 | 0.140625 | 0.21875 | 0.171875 | 4.646749 | 0.057002 |
| std | 1.043701 | 1.468841 | 472.623943 | 0.145075 | 0.243975 | 0.497763 | 0.350382 | 0.416667 | 0.380254 | 1.252543 | 0.086039 |
| min | 6.11 | 3.5 | 8.55 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.145931 | 0.0 |
| 25% | 7.3025 | 5.6175 | 68.9 | 0.0889 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.232656 | 0.007903 |
| 50% | 7.95 | 6.475 | 78.15 | 0.16115 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.35863 | 0.026 |
| 75% | 8.8525 | 7.3525 | 240.0 | 0.2671 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 5.480639 | 0.071343 |
| max | 10.22 | 10.0 | 2940.0 | 0.6667 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 7.986165 | 0.444489 |


### Question 2: OLS Estimates (Biased)

The OLS estimate of institutional quality on GDP is likely biased due to reverse causality and omitted variables.

In [8]:
ols1 = smf.ols('GDP ~ Exprop', data=ajr).fit()
ols2 = smf.ols('GDP ~ Exprop + Africa + Asia + Namer + Samer', data=ajr).fit()

print('=== OLS without covariates ===')
print(ols1.summary().tables[1])
print(f'\n=== OLS with regional controls ===')
print(ols2.summary().tables[1])

=== OLS without covariates ===
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      4.6609      0.409     11.402      0.000       3.844       5.478
Exprop         0.5220      0.061      8.527      0.000       0.400       0.644

=== OLS with regional controls ===
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      5.9879      0.612      9.792      0.000       4.764       7.212
Exprop         0.4234      0.058      7.313      0.000       0.307       0.539
Africa        -1.1363      0.397     -2.862      0.006      -1.931      -0.342
Asia          -0.8496      0.405     -2.100      0.040      -1.659      -0.040
Namer         -0.2230      0.396     -0.563      0.576      -1.016       0.570
Samer         -0.2125      0.402     -0.528      0.600      -1.0

### Question 3: Reduced Form

The reduced form regresses the outcome directly on the instrument. It is the analogue of the ITT in Part 1: the total effect of log settler mortality on log GDP per capita, through whatever channel.

In [9]:
rf1 = smf.ols('GDP ~ logMort', data=ajr).fit()
rf2 = smf.ols('GDP ~ logMort + Africa + Asia + Namer + Samer', data=ajr).fit()

print('=== Reduced form without covariates ===')
print(rf1.summary().tables[1])
print(f'\n=== Reduced form with regional controls ===')
print(rf2.summary().tables[1])

=== Reduced form without covariates ===
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     10.6943      0.373     28.644      0.000       9.948      11.441
logMort       -0.5664      0.078     -7.297      0.000      -0.722      -0.411

=== Reduced form with regional controls ===
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     10.6631      0.476     22.401      0.000       9.710      11.616
logMort       -0.4093      0.098     -4.165      0.000      -0.606      -0.213
Africa        -1.0687      0.536     -1.992      0.051      -2.142       0.005
Asia          -0.9002      0.501     -1.795      0.078      -1.904       0.103
Namer         -0.3143      0.498     -0.632      0.530      -1.310       0.682
Samer         -0.3047      0.503     -0.606   

### Question 4: First Stage and F-test for Weak Instruments

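The first stage regresses the endogenous treatment on the instrument (and any covariates). The F-statistic on the excluded instrument tests instrument relevance; a common rule of thumb treats $F < 10$ as a sign of a weak instrument. Without covariates this is the overall regression F; with covariates it is the partial F comparing the first stage with and without `logMort`.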

In [10]:
# First stage without covariates
fs1 = smf.ols('Exprop ~ logMort', data=ajr).fit()
print('=== First stage (no covariates) ===')
print(fs1.summary().tables[1])
print(f'F-statistic: {fs1.fvalue:.2f}')

print()

# First stage with covariates
fs_restricted = smf.ols('Exprop ~ Africa + Asia + Namer + Samer', data=ajr).fit()
fs_full = smf.ols('Exprop ~ logMort + Africa + Asia + Namer + Samer', data=ajr).fit()

print('=== First stage (with covariates) ===')
print(fs_full.summary().tables[1])

# Partial F-test for the instrument: compare the models with and without logMort
n = len(ajr)
k_full = len(fs_full.params)
k_restricted = len(fs_restricted.params)
df_num = k_full - k_restricted  # number of excluded instruments (here 1)
df_den = n - k_full             # residual degrees of freedom in the full model
f_stat = ((fs_restricted.ssr - fs_full.ssr) / df_num) / (fs_full.ssr / df_den)
f_pval = stats.f.sf(f_stat, df_num, df_den)  # upper-tail p-value

print(f'\nF-test for instrument (with covariates): F = {f_stat:.2f}, p = {f_pval:.4f}')
if f_stat < 10:
    print('F < 10 suggests the instrument may be weak when controlling for region.')

=== First stage (no covariates) ===
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      9.3659      0.611     15.339      0.000       8.145      10.586
logMort       -0.6133      0.127     -4.831      0.000      -0.867      -0.360
F-statistic: 23.34

=== First stage (with covariates) ===
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      9.7945      0.843     11.623      0.000       8.108      11.481
logMort       -0.4381      0.174     -2.518      0.015      -0.786      -0.090
Africa        -1.5053      0.950     -1.585      0.118      -3.406       0.396
Asia          -0.8999      0.888     -1.014      0.315      -2.676       0.877
Namer         -1.2620      0.881     -1.433      0.157      -3.025       0.501
Samer         -1.1913      0.890     
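
With a single excluded instrument and conventional (homoskedastic) standard errors, the partial F-statistic equals the square of the instrument's t-statistic. The sketch below applies that identity to the two t-statistics on `logMort` reported in the tables above; the values are copied by hand, so the results are accurate only up to rounding.

```python
# t-statistics on logMort copied from the two first-stage tables above.
t_stats = {
    'no covariates': -4.831,
    'regional controls': -2.518,
}

# One excluded instrument: F for the exclusion restriction is t squared.
f_stats = {spec: t ** 2 for spec, t in t_stats.items()}

for spec, f_value in f_stats.items():
    verdict = 'above' if f_value > 10 else 'below'
    print(f'{spec:<18} F = {f_value:.2f} ({verdict} the rule-of-thumb threshold of 10)')
```

The no-covariates value can be checked against the `F-statistic` line printed by the first-stage cell.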

### Question 5: Two-Stage Least Squares (2SLS)

Estimate the LATE using 2SLS via the `linearmodels` package.
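
With one instrument and one endogenous regressor, 2SLS is numerically the same as indirect least squares: the reduced-form coefficient on the instrument divided by the first-stage coefficient. The sketch below applies that ratio to the `logMort` coefficients reported in Questions 3 and 4. The numbers are copied by hand from those tables, so the results hold only up to rounding, and the dictionary names are illustrative.

```python
# Coefficients on logMort copied from the reduced-form and first-stage tables.
reduced_form = {'no covariates': -0.5664, 'regional controls': -0.4093}
first_stage = {'no covariates': -0.6133, 'regional controls': -0.4381}

# Indirect least squares: effect of Z on Y divided by effect of Z on D.
ils = {spec: reduced_form[spec] / first_stage[spec] for spec in reduced_form}

for spec, estimate in ils.items():
    print(f'{spec:<18} reduced form / first stage = {estimate:.4f}')
```

The no-covariates ratio can be compared with the `Exprop` coefficient that `IV2SLS` reports in the next cell.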

In [11]:
# 2SLS without covariates
iv1 = IV2SLS(
    dependent=ajr['GDP'],
    exog=pd.DataFrame({'const': 1}, index=ajr.index),
    endog=ajr[['Exprop']],
    instruments=ajr[['logMort']]
).fit(cov_type='unadjusted')

print('=== 2SLS without covariates ===')
print(iv1.summary.tables[1])

print()

# 2SLS with covariates
exog_vars = ajr[['Africa', 'Asia', 'Namer', 'Samer']].copy()
exog_vars['const'] = 1

iv2 = IV2SLS(
    dependent=ajr['GDP'],
    exog=exog_vars,
    endog=ajr[['Exprop']],
    instruments=ajr[['logMort']]
).fit(cov_type='unadjusted')

print('=== 2SLS with regional controls ===')
print(iv2.summary.tables[1])

=== 2SLS without covariates ===
                             Parameter Estimates                              
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
const          2.0448     0.9837     2.0786     0.0377      0.1167      3.9728
Exprop         0.9235     0.1499     6.1590     0.0000      0.6296      1.2174

=== 2SLS with regional controls ===
                             Parameter Estimates                              
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
Africa         0.3376     0.9361     0.3606     0.7184     -1.4972      2.1724
Asia          -0.0595     0.7093    -0.0839     0.9331     -1.4498      1.3307
Namer          0.8647     0.7926     1.0909     0.2753     -0.6888      2.4182
Samer          0.8083     0.7770     1.0403     0.2982     -0.

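The IV estimates suggest a large and statistically significant causal effect of institutions on income, provided the exclusion restriction holds. Note that the outcome is the level of log GDP per capita, not its growth rate. In the specification without covariates, a one-unit increase in protection against expropriation raises log GDP per capita by about 0.92, compared with an OLS estimate of 0.52. IV estimates larger than OLS are consistent with OLS being attenuated by measurement error in institutional quality. The specification with regional controls should be read with more caution: the first-stage t-statistic on `logMort` is -2.518, which implies a partial $F$ of about 6.3, below the rule-of-thumb threshold of 10.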
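
The attenuation argument can be illustrated with simulated data. This is a minimal sketch under made-up assumptions (a true coefficient of 1, classical measurement error, and a valid instrument); it shows the mechanism only and says nothing about how much measurement error is actually present in the AJR data.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
beta = 1.0  # hypothetical true effect of the regressor on the outcome

z = rng.normal(size=n)                 # instrument
d_true = 0.6 * z + rng.normal(size=n)  # true regressor, driven partly by z
y = beta * d_true + rng.normal(size=n) # outcome depends on the true regressor
d_obs = d_true + rng.normal(size=n)    # regressor observed with classical noise

# OLS on the noisy regressor is biased toward zero (attenuation).
ols_slope = np.cov(y, d_obs)[0, 1] / np.var(d_obs, ddof=1)

# IV uses only the variation in d_obs driven by z, which is unrelated to the noise.
iv_slope = np.cov(y, z)[0, 1] / np.cov(d_obs, z)[0, 1]

print(f'True effect: {beta:.3f}')
print(f'OLS slope:   {ols_slope:.3f}')
print(f'IV slope:    {iv_slope:.3f}')
```

In this setup OLS understates the true coefficient while IV recovers it, which is the pattern the comparison of the OLS and 2SLS estimates above is consistent with.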