<h1>ECON 140R Class 22</h1>

<b>Regression discontinuity (RD)</b> is an elegant, artful method of causal inference that relies on visualization and the search for a <i>discontinuous jump</i> in some outcome when a treatment switches on at one or more <b>cutoff</b> points in a <b>running variable</b>. It is the subject of Chapter 4 in <i>Mastering Metrics</i>.

Learning objectives:

1. Running a basic RD on the minimum legal drinking age (MLDA) data, using a dataset that has already been set up
2. Adding some extensions, such as quadratic terms and interactions

In [None]:
library(haven)
library(dplyr)

Now let us load the dataset `AEJfigs.dta` that Angrist and Pischke examine in Section 4.1. These data are similar to the minimum legal drinking age panel data we saw in Chapter 5.

In [None]:
AEJfigs <- read_dta("AEJfigs.dta")
head(AEJfigs)

Note how the `agecell` variable looks a little odd. Each value is close to age in years plus half of 1/12, something like the midpoint of a month. It is only approximately that, probably because months differ in length.

We will need to generate some variables for the RD estimation.

In [None]:
# Create a recentered "age" variable that measures
# years (in roughly monthly steps) before or after age 21
AEJfigs <- mutate(AEJfigs, age = agecell - 21)

# Create an indicator variable for over age 21
AEJfigs <- mutate(AEJfigs, over21 = as.integer(agecell >= 21))

# Age-squared, a quadratic term
AEJfigs <- mutate(AEJfigs, age2 = age^2)

# Age interacted with over-21
AEJfigs <- mutate(AEJfigs, over_age = over21*age)

# Age-squared interacted with over-21
AEJfigs <- mutate(AEJfigs, over_age2 = over21*age2)

# "Other external causes," a residual shown in the 5th row
# of Table 4.1
AEJfigs <- mutate(AEJfigs, ext_oth = external - homicide - suicide - mva)

AEJfigs

The dataset already appears to contain fitted values for the "dueling quadratic" specification, where the observations below and above the cutoff are each allowed their own quadratic.

In [None]:
plot(AEJfigs$agecell, AEJfigs$all, ylim = c(80,115))
lines(AEJfigs$agecell, AEJfigs$allfitted, col = "red")

Below is the basic RD estimation of equation (4.2), which appears on page 152:

$$
\bar{M}_{a} = \alpha + \rho \ D_a + \gamma \ a + e_a
$$

In addition to $D_a$, the indicator for age being 21 and over, we have a linear term in age and a constant term. Because the code below uses age recentered at 21, the constant is the fitted death rate just below the cutoff, before the jump $\rho$ is added. In the text on page 152, Angrist and Pischke report an estimate of $\rho = 7.7$ against an average death rate of about 95.

In [None]:
rd_reg1 <- lm(all ~ age + over21,
             data = AEJfigs)
summary(rd_reg1)
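
To see why the coefficient on `over21` is read as the jump at the cutoff, it can help to run the same regression on data where the answer is known. The sketch below uses simulated data only: `sim_lin`, `true_jump`, and every number in it are made up for illustration and are not taken from `AEJfigs.dta` or from the text.

```r
# Simulated illustration only; none of these numbers come from AEJfigs.dta
set.seed(22)

true_jump <- 8                                   # assumed size of the discontinuity
age       <- seq(-1.96, 1.96, length.out = 48)   # years relative to the cutoff
over21    <- as.integer(age >= 0)                # treatment switches on at the cutoff

# Linear trend in age, plus a jump at the cutoff, plus noise
y <- 92 + true_jump * over21 - 1.0 * age + rnorm(length(age), sd = 1)

sim_lin <- data.frame(y = y, age = age, over21 = over21)

# Same specification as equation (4.2): constant, indicator, linear age term
fit_lin <- lm(y ~ age + over21, data = sim_lin)

coef(fit_lin)                   # the over21 coefficient estimates the jump
confint(fit_lin)["over21", ]    # 95% confidence interval for the jump
```

Because the simulated `age` is centered at the cutoff, the intercept estimates the outcome just below the cutoff, and the coefficient on `over21` should land near `true_jump`, up to sampling noise.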

And here is the "dueling quadratics" or "quadratic on each side" specification:

$$
\bar{M}_{a} = \alpha + \rho \ D_a + \gamma_1 (a - a_0) + \gamma_2 (a - a_0)^2
+ \delta_1 \left[ (a - a_0) D_a
\right]
+ \delta_2 \left[ (a - a_0)^2 D_a
\right]
+ e_a
$$

In [None]:
rd_reg_q1 <- lm(all ~ age + age2 + over21 +
                over_age + over_age2,
                data = AEJfigs)
summary(rd_reg_q1)
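
The hand-built interaction columns can also be generated by R's formula syntax. The sketch below, again on simulated data (`sim_q` and all of its numbers are made up for illustration), fits the "quadratic on each side" specification both ways and checks that the estimated jump is the same. It also shows that the jump equals the gap between the two fitted curves evaluated exactly at the cutoff.

```r
# Simulated illustration only; none of these numbers come from AEJfigs.dta
set.seed(140)

age    <- seq(-1.96, 1.96, length.out = 48)   # years relative to the cutoff
over21 <- as.integer(age >= 0)
y      <- 92 + 8 * over21 - 1.0 * age + 0.5 * age^2 -
          2.0 * over21 * age + rnorm(length(age), sd = 1)

sim_q <- data.frame(y = y, age = age, over21 = over21)

# Hand-built terms, mirroring the variables created earlier in this notebook
sim_q$age2      <- sim_q$age^2
sim_q$over_age  <- sim_q$over21 * sim_q$age
sim_q$over_age2 <- sim_q$over21 * sim_q$age2

fit_hand <- lm(y ~ age + age2 + over21 + over_age + over_age2, data = sim_q)

# Same model via the formula interface: `*` expands to main effects
# plus interactions, and I() protects the arithmetic age^2
fit_formula <- lm(y ~ over21 * (age + I(age^2)), data = sim_q)

jump_hand    <- unname(coef(fit_hand)["over21"])
jump_formula <- unname(coef(fit_formula)["over21"])
all.equal(jump_hand, jump_formula)

# The jump is also the difference in fitted values at age = 0
at_cutoff <- predict(fit_formula,
                     newdata = data.frame(age = 0, over21 = c(0, 1)))
jump_pred <- unname(at_cutoff[2] - at_cutoff[1])
all.equal(jump_hand, jump_pred)
```

The formula version avoids creating extra columns, while the hand-built version keeps the coefficient names aligned with the notation in the equation. Either choice should give the same estimate of the jump.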

<div style="text-align: right"> <span style="font-family:Papyrus; ">And they lived happily ever after. The End.</span></div>