In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

# )

From 1573 to 1812, the Spanish government in Peru and Bolivia ran a forced labor system (the mita) that required over 200 indigenous communities to send one-seventh of their adult male population to work in the Potosí silver and Huancavelica mercury mines. This contribution scheme had a border discontinuity: on one side of the boundary, all communities sent the same percentage of their population, while on the other side, all communities were exempt. This is a 'sharp' discontinuity design, since $D_i$ is a deterministic function of $Z_i$. In a 'fuzzy' discontinuity design, by contrast, the probability of treatment would jump at the border without going from zero to one. For example, some people on the control side of the border would also be sent to work in the mines (and the closer to the border, the more likely they would be sent).
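To make the sharp/fuzzy distinction concrete, here is a minimal simulation sketch. Everything in it is hypothetical: the running variable `z`, the cut-off at zero, and the treatment probabilities are made-up numbers chosen only to illustrate the definitions, and they are not taken from the mita data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical running variable: signed distance to the border (cut-off at 0).
z = rng.uniform(-1.0, 1.0, size=20_000)

# Sharp design: treatment is a deterministic function of the running variable.
d_sharp = (z >= 0).astype(int)

# Fuzzy design: the treatment probability jumps at the cut-off, but by less than one.
# On the control side it rises from 0.1 to 0.3 as one approaches the border.
p_treat = np.where(z >= 0, 0.8, 0.1 + 0.2 * (z + 1.0))
d_fuzzy = rng.binomial(1, p_treat)


def jump_at_cutoff(d, z, bandwidth=0.05):
    """Difference in treatment shares just right and just left of the cut-off."""
    right = d[(z >= 0) & (z < bandwidth)].mean()
    left = d[(z < 0) & (z > -bandwidth)].mean()
    return right - left


sharp_jump = jump_at_cutoff(d_sharp, z)
fuzzy_jump = jump_at_cutoff(d_fuzzy, z)
print(f"sharp jump: {sharp_jump:.2f}, fuzzy jump: {fuzzy_jump:.2f}")
```

In the sharp case the treatment share jumps by exactly one at the cut-off. In the fuzzy case it jumps by a fraction strictly between zero and one.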

# 2)

![image.png](attachment:image.png)

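a)

$$
\begin{align}
\lim_{z \to z_0^+} P(Y_{ij} \leq r \mid Z_i = z) & = \lim_{z \to z_0^-} P(Y_{ij} \leq r \mid Z_i = z) \\
\lim_{z \to z_0^+} F_{Y_{ij}}(r \mid Z_i = z) & = \lim_{z \to z_0^-} F_{Y_{ij}}(r \mid Z_i = z) \\
\lim_{z \to z_0^+} dF_{Y_{ij}}(r \mid Z_i = z) & = \lim_{z \to z_0^-} dF_{Y_{ij}}(r \mid Z_i = z) \\
\lim_{z \to z_0^+} \int r \, dF_{Y_{ij}}(r \mid Z_i = z) & = \lim_{z \to z_0^-} \int r \, dF_{Y_{ij}}(r \mid Z_i = z)
\end{align}
$$
The last line integrates $r$ against the conditional distribution (assuming the limit and the integral can be interchanged), which is the conditional expectation. We therefore get:
$$
\lim_{z \to z_0^+} \mathbb{E}[Y_{ij}|Z_i = z] = \lim_{z \to z_0^-} \mathbb{E}[Y_{ij}|Z_i = z]
$$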

b)

$$
Y_i = \alpha D_i + Y_{0i}
$$

Taking limits and conditional expectations on $Z_i$:

$$ 
\begin{align}
\lim_{z \rightarrow z_0^+} \mathbb{E}[Y_i | Z_i = z] & = \alpha \lim_{z \rightarrow z_0^+} \mathbb{E}[D_i|Z_i = z] + \lim_{z \rightarrow z_0^+} \mathbb{E} [Y_{0i} | Z_i =z] \\
\lim_{z \rightarrow z_0^-} \mathbb{E}[Y_i | Z_i = z] & = \alpha \lim_{z \rightarrow z_0^-} \mathbb{E}[D_i|Z_i = z] + \lim_{z \rightarrow z_0^-} \mathbb{E} [Y_{0i} | Z_i =z]
\end{align}
$$

I then subtract the two equations from one another:

$$
\lim_{z \rightarrow z_0^+} \mathbb{E}[Y_i | Z_i = z] -\lim_{z \rightarrow z_0^-} \mathbb{E}[Y_i | Z_i = z] = \alpha \lim_{z \rightarrow z_0^+} \mathbb{E}[D_i|Z_i = z] -\alpha \lim_{z \rightarrow z_0^-} \mathbb{E}[D_i|Z_i = z]
$$

The other two terms cancel because of the continuity condition derived in a), applied to the untreated outcome $Y_{0i}$:
$$\lim_{z\rightarrow z_0^+} \mathbb{E}[Y_{0i}| Z_i = z] =  \lim_{z\rightarrow z_0^-} \mathbb{E}[Y_{0i}| Z_i = z]$$

Rearranging our equation, we get: 

$$
\alpha = \frac{\lim_{z \rightarrow z_0^+} \mathbb{E}[Y_i | Z_i = z] -\lim_{z \rightarrow z_0^-} \mathbb{E}[Y_i | Z_i = z]}{\lim_{z \rightarrow z_0^+} \mathbb{E}[D_i|Z_i = z] - \lim_{z \rightarrow z_0^-} \mathbb{E}[D_i|Z_i = z]}
 = \lim_{z \rightarrow z_0^+} \mathbb{E}[Y_i | Z_i = z] -\lim_{z \rightarrow z_0^-} \mathbb{E}[Y_i | Z_i = z]
$$

The denominator is equal to one because, in a sharp design, the probability of treatment jumps from zero to one at the cut-off.
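The identification result above can be checked on simulated data. The sketch below is an illustration under stated assumptions, not part of the original exercise: the data-generating process, the true effect `alpha_true`, the bandwidth, and the helper `one_sided_limit` are all hypothetical. Each one-sided limit is approximated by the intercept of a local linear fit on one side of the cut-off.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 20_000
z0 = 0.0
alpha_true = 2.0  # assumed homogeneous treatment effect

z = rng.uniform(-1.0, 1.0, size=n)
d = (z >= z0).astype(float)  # sharp design: D_i = 1{Z_i >= z_0}
y0 = 1.0 + 0.5 * z + rng.normal(scale=0.5, size=n)  # E[Y_0 | Z] is continuous at z_0
y = alpha_true * d + y0


def one_sided_limit(v, z, side, z0=0.0, bandwidth=0.2):
    """Approximate lim E[V | Z = z] as z -> z0 from one side with a local linear fit."""
    if side == "right":
        mask = (z >= z0) & (z < z0 + bandwidth)
    else:
        mask = (z < z0) & (z > z0 - bandwidth)
    # np.polyfit returns the highest power first: [slope, intercept].
    slope, intercept = np.polyfit(z[mask] - z0, v[mask], deg=1)
    return intercept  # fitted value at z = z0


num = one_sided_limit(y, z, "right") - one_sided_limit(y, z, "left")
den = one_sided_limit(d, z, "right") - one_sided_limit(d, z, "left")
alpha_hat = num / den
print(f"denominator: {den:.3f}, alpha_hat: {alpha_hat:.3f}")
```

Because the design is sharp, the estimated denominator is one (up to floating-point error), so the ratio collapses to the jump in the outcome.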




# )

![image.png](attachment:image.png)

The author claims that households’ potential consumption outcome is a function of agricultural and social factors. The former are measured by elevation and slope, the latter by variables such as the proportion of indigenous people and log 1572 tribute rates. Table 1 shows that, for these variables, the null hypothesis of equality across the boundary cannot be rejected in the overwhelming majority of cases. Especially near the border (i.e. at the limit), there are no significant differences. Therefore, we cannot reject that
$$
\lim_{z \to z_0^+} \mathbb{E}[Y_{ij}|Z_i = z] = \lim_{z \to z_0^-} \mathbb{E}[Y_{ij}|Z_i = z]
$$
As we saw in 2b), this condition is crucial for deriving $\alpha$. It states that all relevant factors besides treatment must vary smoothly at the mita boundary. In other words, relevant factors other than the RD variable $Z_i$ must not shift the conditional distribution of the outcome variable at the cut-off $z_0$. This is intuitive, because only under this assumption do individuals just outside the mita represent a valid counterfactual for those just inside the mita.
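The logic of such a balance check can be sketched on synthetic data. This is an illustration only: the two covariates, their magnitudes, and the helper `balance_t_stat` are hypothetical and are not taken from Table 1. A covariate that merely trends smoothly in the running variable shows a difference in means that fades as the window around the border narrows, whereas a covariate that jumps at the border stays imbalanced.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 10_000
z = rng.uniform(-100.0, 100.0, size=n)  # hypothetical distance to the border in km
inside = z <= 0                         # inside the treated region

# Made-up covariates: one trends smoothly in z, the other jumps at the border.
smooth = 4.0 + 0.002 * z + rng.normal(scale=0.5, size=n)
jumpy = 4.0 + 0.5 * inside + rng.normal(scale=0.5, size=n)


def balance_t_stat(x, inside, z, bandwidth):
    """Welch t-statistic for equal covariate means on both sides within a bandwidth."""
    near = np.abs(z) < bandwidth
    a, b = x[near & inside], x[near & ~inside]
    diff = a.mean() - b.mean()
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return diff / se


bandwidths = (100, 50, 5)
t_smooth = {bw: balance_t_stat(smooth, inside, z, bw) for bw in bandwidths}
t_jumpy = {bw: balance_t_stat(jumpy, inside, z, bw) for bw in bandwidths}

for bw in bandwidths:
    print(f"<{bw} km: smooth t = {t_smooth[bw]:.2f}, jumpy t = {t_jumpy[bw]:.2f}")
```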

# )

![image.png](attachment:image.png)

a)

The coefficient of interest $\alpha_{RD}$ represents the average treatment effect for individuals at the threshold. In Dell (2010), $f(\text{geographic location}_d)$ is referred to as the regression discontinuity (RD) polynomial. For Panel C of Table 2, for example, it is a cubic polynomial in the absolute value of $Z_d$:

$$
f_{z_0}(Z_d)=\beta_1|Z_d|+\beta_2|Z_d|^2+\beta_3|Z_d|^3
$$

where $Z_d$ measures district $d$’s distance to the mita border and the $\beta_k$ are coefficients estimated jointly with $\alpha_{RD}$. In the context of RD, this corresponds to what we refer to as the control function. It owes its name to the idea that it “controls” for selection bias in the case of heterogeneous treatment effects. Including $f(\text{geographic location}_d)$ is therefore important for an unbiased estimation of $\alpha_{RD}$.
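As a rough illustration of how the RD polynomial enters the regression, the sketch below builds a design matrix with a treatment dummy and a cubic in absolute distance and estimates it by least squares. The data are simulated, and the true effect `alpha_true` and the polynomial coefficients are made-up values, so this shows only the mechanics of the specification, not Dell's estimates.

```python
import numpy as np

rng = np.random.default_rng(3)

n = 5_000
alpha_true = -0.25                    # made-up treatment effect
z = rng.uniform(-1.0, 1.0, size=n)    # signed distance to the border (hypothetical units)
d = (z <= 0).astype(float)            # treated side has non-positive distance
dist = np.abs(z)

# Outcome: treatment effect plus a smooth cubic function of absolute distance plus noise.
y = (alpha_true * d
     + 0.3 * dist - 0.2 * dist**2 + 0.1 * dist**3
     + rng.normal(scale=0.3, size=n))

# Design matrix: constant, treatment dummy, and the cubic RD polynomial in |Z|.
X = np.column_stack([np.ones(n), d, dist, dist**2, dist**3])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha_hat = coef[1]
print(f"alpha_hat: {alpha_hat:.3f}")
```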

b)

For Panel C, the cutoff $z_0$ is the mita border itself: $Z_d = z_0 = 0$ indicates that the district lies right on the mita border. Inspection of the data sets (`delldata_consumption.dta` and `delldata_childstunt.dta`) shows that negative values of $Z_d$ mean that district $d$ is within the mita zone. In the data sets, $Z_d$ corresponds to the column `dist_mita_brdr`. Hence we can let $Z_d \leq z_0$ denote that individual $i$ in district $d$ is inside the mita region, and define $D_i = \mathbb{1}\{Z_i \leq z_0\}$. (Alternatively, we can multiply $Z_d$ by $-1$ and adopt the same notation as in class, namely $D_i = \mathbb{1}\{Z_i \geq z_0\}$.)

c)

In [2]:
# Loading the two datasets:
stunt = pd.read_stata('../data/delldata_childstunt.dta')
consumption = pd.read_stata('../data/delldata_consumption.dta')

In [3]:
def prepare_dataset(dataset):
    """
    This function takes as input a dataset (stunt or consumption) and outputs the same dataset processed 
    and the adequate selection of covariates.
    """
    # I am creating the 2nd & 3rd degree polynomial as indicated in the footnote of Table 2 (with absolute values):
    dataset['dist'] = dataset.dist_mita_brdr.abs()
    dataset['dist2'] = dataset.dist**2
    dataset['dist3'] = dataset.dist**3
    # I create a mita dummy: 1 if the variable dist_mita_brdr is non-positive (inside the mita zone), 0 otherwise:
    dataset['mita'] = (dataset.dist_mita_brdr <= 0).astype(int)
    # Also as indicated in the paper, in order to get conservative estimates, Dell did not use the Cusco data.
    # Therefore I am kicking them out as well:
    dataset = dataset[dataset.cusco!=1]
    # Defining the covariates for the regression:
    covariates = [
        "mita",
        "dist",
        "dist2",
        "dist3",
        "elv_sh",
        "slope",
        "bfe4_1",
        "bfe4_2",
        "bfe4_3"
    ]
    if 'hh_numb_infs' in dataset.columns:
        covariates += ["hh_numb_infs", "hh_numb_chd", "hh_numb_adts"]
    return dataset, covariates

In [7]:
def regression(dataset, covariates):
    """
    This function takes as input a processed dataset (stunt or consumption) and the adequate selection of covariates
    and prints regression outputs.
    """
    for dist_filter in [100, 75, 50]:
        cons_filtered = dataset[dataset.dist_mita_brdr.abs() < dist_filter]
        # Define districts to use for clustered standard errors:
        districts = cons_filtered.distr.values
        # Declaring dep. and indep. variables:
        target = 'stunt' if 'stunt' in cons_filtered else 'lhhequiv'
        y = cons_filtered[target]
        X = cons_filtered[covariates]
        # Keep the DataFrame (rather than .values) so that coefficients can be looked up by name:
        X = sm.add_constant(X)
        # Running the regression with standard errors clustered by district:
        results = sm.OLS(y, X).fit(cov_type='cluster', cov_kwds={'groups': districts})
        # Printing the results for the mita dummy:
        print(f'<{dist_filter}km from mita border:')
        print(f"coefficient: {round(results.params['mita'], 3)}")
        print(f"standard error: {round(results.bse['mita'], 3)}")
        print('\n')

In [8]:
consumption, covariates = prepare_dataset(consumption)
regression(consumption, covariates)

<100km from mita border:
coefficient: -0.277
standard error: 0.078


<75km from mita border:
coefficient: -0.23
standard error: 0.089


<50km from mita border:
coefficient: -0.224
standard error: 0.092




In [9]:
stunt, covariates = prepare_dataset(stunt)
regression(stunt, covariates)

<100km from mita border:
coefficient: 0.073
standard error: 0.023


<75km from mita border:
coefficient: 0.061
standard error: 0.022


<50km from mita border:
coefficient: 0.064
standard error: 0.023




The coefficients and the standard errors are identical to those reported in the paper (compare below).

![image.png](attachment:image.png)
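The regressions above cluster standard errors by district. As a rough sketch of what a cluster-robust (sandwich) variance estimator computes, the following NumPy-only function may help. It is an illustration under assumptions: the helper `cluster_robust_se` and the simulated data are hypothetical, and the finite-sample correction shown is a commonly used one that may differ from a given package's default.

```python
import numpy as np


def cluster_robust_se(X, y, groups, small_sample=True):
    """OLS coefficients and cluster-robust (sandwich) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta

    # "Meat" of the sandwich: sum over clusters of (X_g' u_g)(X_g' u_g)'.
    labels = np.unique(groups)
    meat = np.zeros((k, k))
    for g in labels:
        idx = groups == g
        score = X[idx].T @ resid[idx]
        meat += np.outer(score, score)

    cov = xtx_inv @ meat @ xtx_inv
    if small_sample:
        n_groups = labels.size
        # A commonly used finite-sample correction; software defaults may differ.
        cov *= n_groups / (n_groups - 1) * (n - 1) / (n - k)
    return beta, np.sqrt(np.diag(cov))


# Simulated example: treatment varies at the district level and errors share a district shock.
rng = np.random.default_rng(4)
n_districts, per_district = 60, 20
district = np.repeat(np.arange(n_districts), per_district)
d = np.repeat(rng.integers(0, 2, size=n_districts), per_district).astype(float)
shock = np.repeat(rng.normal(size=n_districts), per_district)
y = 1.0 - 0.25 * d + shock + rng.normal(size=district.size)
X = np.column_stack([np.ones(district.size), d])

beta, se_cluster = cluster_robust_se(X, y, district)
# Treating every observation as its own cluster reproduces White (HC0) standard errors.
_, se_white = cluster_robust_se(X, y, np.arange(district.size), small_sample=False)
print(f"clustered SE: {se_cluster[1]:.3f}, heteroskedasticity-robust SE: {se_white[1]:.3f}")
```

When errors are correlated within districts and the regressor of interest varies only across districts, ignoring the clustering tends to understate the standard error, which is why the clustered version is the more conservative choice here.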

# )
![image.png](attachment:image.png)

The first main difference with respect to 2b) is that, since we are dealing with heterogeneous treatment effects, our equation looks like: 

$$
Y_i = \alpha_i D_i + Y_{0i}
$$

Thus we can only hope to estimate an average treatment effect at the cut-off, $\alpha_{RD} \equiv \mathbb{E}[\alpha_i|Z_i = z_0]$.

In the case of sharp discontinuities, the estimation of the parameter is still fairly simple. With $D_i = \mathbb{1}\{Z_i \geq z_0\}$:

$$
\mathbb{E}[Y_i|Z_i = z] = \mathbb{E}[\alpha_i| Z_i = z]D_i + \mathbb{E}[Y_{0i}| Z_i = z]
$$

If we add and subtract $\mathbb{E}[\alpha_i | Z_i = z_0]D_i$, we get:

$$
\begin{align}
\mathbb{E}[Y_i|Z_i = z] & = \mathbb{E}[\alpha_i| Z_i = z]D_i + \mathbb{E}[Y_{0i}| Z_i = z] + \mathbb{E}[\alpha_i | Z_i = z_0]D_i - \mathbb{E}[\alpha_i | Z_i = z_0]D_i \\
& = \alpha_{RD}D_i + \mathbb{E}[Y_{0i}|Z_i=z] + (\mathbb{E}[\alpha_i|Z_i = z] - \mathbb{E}[\alpha_i | Z_i = z_0])D_i
\end{align}
$$

Let $k_{z_0}(z) \equiv \mathbb{E}[Y_{0i}|Z_i=z] + (\mathbb{E}[\alpha_i|Z_i = z] - \mathbb{E}[\alpha_i | Z_i = z_0])D_i$ and we get:

$$
\mathbb{E}[Y_i|Z_i = z] = \alpha_{RD}D_i + k_{z_0}(z)
$$
Provided $k_{z_0}(z)$ is continuous at $z_0$, estimating this equation by OLS with a flexible specification for $k_{z_0}(z)$ (the control function approach) yields $\alpha_{RD} = \mathbb{E}[\alpha_i|Z_i = z_0]$.
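The control function argument can be illustrated with a small simulation. The sketch below assumes a made-up data-generating process in which $\mathbb{E}[\alpha_i | Z_i = z] = 1 + 2z$ and the cut-off is $z_0 = 0$, so the target is $\mathbb{E}[\alpha_i | Z_i = z_0] = 1$. The control function is approximated by a linear term in $z - z_0$ and its interaction with $D_i$, which is an assumption that happens to be exact for this simulated example.

```python
import numpy as np

rng = np.random.default_rng(5)

n = 20_000
z0 = 0.0
z = rng.uniform(-1.0, 1.0, size=n)
d = (z >= z0).astype(float)

# Heterogeneous effects whose conditional mean drifts with z: E[alpha_i | Z_i = z] = 1 + 2z.
alpha_i = 1.0 + 2.0 * z + rng.normal(scale=0.5, size=n)
y0 = 0.5 + 1.0 * z + rng.normal(scale=0.5, size=n)
y = alpha_i * d + y0

# Control function k(z) = E[Y_0 | z] + (E[alpha | z] - E[alpha | z_0]) * D,
# approximated here by a linear term in (z - z_0) and its interaction with D.
zc = z - z0
X = np.column_stack([np.ones(n), d, zc, zc * d])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha_rd_hat = coef[1]

# For comparison: a raw difference in means that ignores the control function.
naive = y[d == 1].mean() - y[d == 0].mean()
print(f"alpha_RD estimate: {alpha_rd_hat:.3f}, naive difference in means: {naive:.3f}")
```

The coefficient on $D_i$ lands close to the effect at the cut-off, while the raw difference in means mixes in both the trend in $Y_{0i}$ and the drift in $\alpha_i$ away from $z_0$.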








Now, in the case of a fuzzy RD design one can, as Dell did, assume conditional independence around $z_0$:

$$
(Y_{1i}, Y_{0i}) \perp\!\!\!\perp D_i \mid Z_i = z
$$

Now we derive $\alpha$ in the same way we did in 2b):



$$
Y_i = \alpha D_i + Y_{0i}
$$

Taking limits and conditional expectations on $Z_i$:

$$ 
\begin{align}
\lim_{z \rightarrow z_0^+} \mathbb{E}[Y_i | Z_i = z] & = \alpha \lim_{z \rightarrow z_0^+} \mathbb{E}[D_i|Z_i = z] + \lim_{z \rightarrow z_0^+} \mathbb{E} [Y_{0i} | Z_i =z] \\
\lim_{z \rightarrow z_0^-} \mathbb{E}[Y_i | Z_i = z] & = \alpha \lim_{z \rightarrow z_0^-} \mathbb{E}[D_i|Z_i = z] + \lim_{z \rightarrow z_0^-} \mathbb{E} [Y_{0i} | Z_i =z]
\end{align}
$$

I then subtract the two equations from one another:

$$
\lim_{z \rightarrow z_0^+} \mathbb{E}[Y_i | Z_i = z] -\lim_{z \rightarrow z_0^-} \mathbb{E}[Y_i | Z_i = z] = \alpha \lim_{z \rightarrow z_0^+} \mathbb{E}[D_i|Z_i = z] -\alpha \lim_{z \rightarrow z_0^-} \mathbb{E}[D_i|Z_i = z]
$$

The other two terms cancel because of the continuity condition derived in 2a), applied to the untreated outcome $Y_{0i}$:
$$\lim_{z\rightarrow z_0^+} \mathbb{E}[Y_{0i}| Z_i = z] =  \lim_{z\rightarrow z_0^-} \mathbb{E}[Y_{0i}| Z_i = z]$$

Rearranging our equation, we get: 

$$
\alpha = \frac{\lim_{z \rightarrow z_0^+} \mathbb{E}[Y_i | Z_i = z] -\lim_{z \rightarrow z_0^-} \mathbb{E}[Y_i | Z_i = z]}{\lim_{z \rightarrow z_0^+} \mathbb{E}[D_i|Z_i = z] - \lim_{z \rightarrow z_0^-} \mathbb{E}[D_i|Z_i = z]}
$$

Since we are now in the *fuzzy* case, the denominator does not equal 1: the probability of treatment jumps at $z_0$ by less than one, so the jump in the outcome has to be rescaled by the jump in the treatment probability.
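A short simulation can illustrate the ratio for the fuzzy case. All numbers below are hypothetical: the treatment probability is assumed to jump from 0.2 to 0.7 at the cut-off, the true effect `alpha_true` is 1.5, and treatment is drawn independently of the untreated outcome given $Z_i$, in line with the conditional independence assumption. The helper `jump` approximates each one-sided limit with a local linear fit.

```python
import numpy as np

rng = np.random.default_rng(6)

n = 50_000
z0 = 0.0
alpha_true = 1.5
z = rng.uniform(-1.0, 1.0, size=n)

# Fuzzy design: the treatment probability jumps from 0.2 to 0.7 at the cut-off.
p = np.where(z >= z0, 0.7, 0.2)
d = rng.binomial(1, p).astype(float)
y = alpha_true * d + 1.0 + 0.5 * z + rng.normal(scale=0.5, size=n)


def jump(v, z, z0=0.0, bandwidth=0.25):
    """Right limit minus left limit of E[V | Z = z] at z0, via local linear fits."""
    right = (z >= z0) & (z < z0 + bandwidth)
    left = (z < z0) & (z > z0 - bandwidth)
    # For deg=1, np.polyfit returns [slope, intercept]; the intercept is the limit at z0.
    right_limit = np.polyfit(z[right] - z0, v[right], deg=1)[1]
    left_limit = np.polyfit(z[left] - z0, v[left], deg=1)[1]
    return right_limit - left_limit


jump_y = jump(y, z)
jump_d = jump(d, z)
alpha_hat = jump_y / jump_d
print(f"jump in Y: {jump_y:.3f}, jump in D: {jump_d:.3f}, alpha_hat: {alpha_hat:.3f}")
```

The jump in the outcome alone understates the effect here, because only part of the population changes treatment status at the cut-off. Dividing by the jump in the treatment probability rescales it.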

Other authors question the assumption $(Y_{1i}, Y_{0i}) \perp\!\!\!\perp D_i \mid Z_i$ and instead use instrumental variables to overcome this difficulty.

![image.png](attachment:image.png)

# )
![image.png](attachment:image.png)

Maimonides’ rule prescribes a maximum class size of 40 students and was originally proposed by the twelfth-century rabbinic scholar Maimonides. The rule induces a discontinuity in the relationship between total grade enrollment and class size, as Angrist and Lavy (1999) demonstrate in the chart below, which is taken from their paper. The chart also shows that the induced RD is fuzzy in practice (the solid line does not exactly match the discontinuous dashed line). This is because there are other drivers of variation in Israeli class sizes, as Angrist and Lavy (1999) point out.

![image.png](attachment:image.png)
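The class size implied by the rule can be written as a small function. This is a sketch assuming that a cohort is split evenly into the smallest number of classes that keeps each class at or below the cap. The function name `maimonides_class_size` and the parameter `max_size` are illustrative choices, not names from the paper.

```python
import numpy as np


def maimonides_class_size(enrollment, max_size=40):
    """Predicted class size when a cohort is split evenly into the fewest classes of at most max_size."""
    enrollment = np.asarray(enrollment, dtype=float)
    n_classes = np.floor((enrollment - 1) / max_size) + 1
    return enrollment / n_classes


for e in (40, 41, 80, 81, 120, 121):
    print(e, round(float(maimonides_class_size(e)), 2))
```

Predicted class size drops sharply each time enrollment crosses a multiple of 40 (from 40 to 20.5 at 41 students, for example), which produces the sawtooth pattern of the dashed line in the chart.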

# )
![image.png](attachment:image.png)

# )
![image.png](attachment:image.png)