Alex kappes <br>
Problem Set 7 <br>
EconS 512

**(1)** We can try this two ways. From the provided linear model

\begin{equation*}
Y_i = \beta_0 +B_1T_i + \mu_i + \xi_i
\end{equation*}

the estimator for $\boldsymbol{\beta}$ is 
\begin{equation*}
\hat{\boldsymbol{\beta}} = \left(\mathbf{T}'\mathbf{T}\right)^{-1}\mathbf{T}'\mathbf{Y} = \left(\mathbf{T}'\mathbf{T}\right)^{-1}\mathbf{T}'\left(\mathbf{T}\boldsymbol{\beta} + \boldsymbol{\mu} + \boldsymbol{\xi}\right).
\end{equation*}

Given the statistical information provided in the question, it follows that

\begin{align*}
\text{E}[\hat{\boldsymbol{\beta}}] &= \boldsymbol{\beta} + \text{E}[\left(\mathbf{T}'\mathbf{T}\right)^{-1}\mathbf{T}'\ \big\vert\ \boldsymbol{\mu}] + \text{E}[\left(\mathbf{T}'\mathbf{T}\right)^{-1}\mathbf{T}']\text{E}[\boldsymbol{\xi}] \\ 
&\Rightarrow \boldsymbol{\beta} + \text{E}[\left(\mathbf{T}'\mathbf{T}\right)^{-1}\mathbf{T}'\ \big\vert\ \boldsymbol{\mu}] \\
&\Rightarrow \boldsymbol{\beta} + Pr(T_i = 1\ \big\vert\ X_i < x^*, \mu_i = 0) + Pr(T_i = 1\ \big\vert\ X_i \geq x^*, \mu_i = 0) + Pr(T_i = 1\ \big\vert\ X_i < x^*, \mu_i = 1) + Pr(T_i = 1\ \big\vert\ X_i  \geq x^*, \mu_i = 1) \\
&= \boldsymbol{\beta} + \underbrace{(\frac{1}{2}p_0 + \frac{1}{2}p_1)}_{\text{bias}},
\end{align*}

where the bias persists in the probability limit, providing the expectation of $\hat{\beta}_1 \in \hat{\boldsymbol{\beta}}$ above. I do not think it is possible to get a consistent estimate of $\beta_1$ given that the bias persists in the limit, thus not $\rightarrow 0$ as $n \rightarrow \infty$. Should the estimator be unbiased, and $Cov[\boldsymbol{\beta}] \rightarrow 0$, the estimator would then be considered consistent. To identify the treatment, we can use the kernel estimator procedures below, or any of the other valid regression discontinuity methodologies.

**(2)** The following shows artificial data generation and evidence of regression discontinuity in the corresponding scatter plot.

In [10]:
import numpy as np
import pandas as pd
from scipy import stats
import plotly
import plotly.plotly as plt
import plotly.graph_objs as go

x_vec = stats.uniform.rvs(0, 10, 500)

def f1(x):
    return np.power(x, 1.5) + stats.norm.rvs(scale=0.5, size=len(x))

def f2(x):
    return np.log(np.power(x, 9))   + stats.norm.rvs(scale=0.5, size=len(x))

y1 = f1(x_vec[x_vec <= 5])
y2 = f2(x_vec[x_vec >= 5])

fx_trace0 = go.Scatter(
    x=x_vec[x_vec <= 5],
    y=y1,
    mode='markers',
    name='f(x1)'
)

fx_trace1 = go.Scatter(
    x=x_vec[x_vec >= 5],
    y=y2,
    mode='markers',
    name='f(x2)'
)

layout = {'shapes': [{'type': 'line',
                          'x0': 5, 'x1': 5,
                          'y0': 0, 'y1': 25,
                          'line': {'dash': 'dot'}
                      }],
          }

data = [fx_trace0, fx_trace1]
mapping = {'data' : data,
           'layout': layout}
plt.iplot(mapping, filename='kernel_mapping.html')

In [4]:
# estimation class
class estimate:
    def __init__(self, dep, indep):
        self.dep = np.asmatrix(dep)
        self.indep = np.asmatrix(indep)

    def params(self):
        return np.linalg.inv(self.indep.T * self.indep) * self.indep.T * self.dep

    def pred(self, b):
        return self.indep * b

Kernel estimates follow. Bandwidths are defined as the epsilon distance $(0.25, 0.75, 1.25)$ from $x^* = 5$.

In [21]:
# kernel estimation
eps_dist = [0.25, 0.5, 0.75]
x_star = 5

d_kern_low = pd.DataFrame()
for i in eps_dist:
    d_kern_low[i] = np.where((x_vec < x_star) &
                             (x_vec >= x_star - i),
                             1, 0)

d_kern_high = pd.DataFrame()
for i in eps_dist:
    d_kern_high[i] = np.where((x_vec > x_star) &
                              (x_vec <= x_star + i),
                              1, 0)

kern_l = pd.concat([pd.DataFrame(np.ones(len(x_vec))), d_kern_low], axis=1).rename(columns={0: 'ones'})
kern_h = pd.concat([pd.DataFrame(np.ones(len(x_vec))), d_kern_high], axis=1).rename(columns={0: 'ones'})
y = pd.concat([pd.DataFrame(y1), pd.DataFrame(y2)]).rename(columns={0: 'y'}
                                                          ).reset_index().drop(columns='index')

# sample means method for right and left x* limits used to compute kernel estimates
def get_kerns(epsilon):
    lim_x_right = np.sum(np.multiply(kern_h[epsilon], y['y'])) / np.sum(kern_h[epsilon])
    lim_x_left = np.sum(np.multiply(kern_l[epsilon], y['y'])) / np.sum(kern_l[epsilon])
    return lim_x_right - lim_x_left

kern_df = pd.DataFrame({eps_dist[0]: get_kerns(eps_dist[0]),
                        eps_dist[1]: get_kerns(eps_dist[1]),
                        eps_dist[2]: get_kerns(eps_dist[2])
                        }, index=['Kernels']).round(3)

kern_df

Unnamed: 0,0.25,0.5,0.75
Kernels,3.143,2.244,1.297


Local linear regression method follows.

In [22]:
# local linear estimation
ones_col = pd.DataFrame({'ones': np.ones(len(y))})

X = pd.concat([ones_col, pd.DataFrame(x_vec)], axis=1).rename(columns={0: 'x'})

def get_loclin(df, epsilon):
    loclist = df[df[epsilon] == 1][epsilon].index.tolist()
    loclin_y = y.loc[loclist]
    loclin_x_l = X.loc[loclist]
    return estimate(loclin_y, loclin_x_l).params()

# differencing (a, b) estimators from the right and left of x* 
loclin_est1 = get_loclin(kern_h, eps_dist[0])[1, 0] - get_loclin(kern_l, eps_dist[0])[1, 0]
loclin_est2 = get_loclin(kern_h, eps_dist[1])[1, 0] - get_loclin(kern_l, eps_dist[1])[1, 0]
loclin_est3 = get_loclin(kern_h, eps_dist[2])[1, 0] - get_loclin(kern_l, eps_dist[2])[1, 0]

loclin_df = pd.DataFrame({eps_dist[0]: loclin_est1,
                          eps_dist[1]: loclin_est2,
                          eps_dist[2]: loclin_est3
                          }, index=['local_lin']).round(3)

loclin_df

Unnamed: 0,0.25,0.5,0.75
local_lin,34.955,15.477,6.431


Polynomial (only out to $x^4$) regression follows.

In [23]:
# polynomial reg
X['xsq'] = np.power(X['x'], 2)
X['xcub'] = np.power(X['x'], 3)
X['xquart'] = np.power(X['x'], 4)
X['T'] = np.where(X['x'] > 5, 1, 0)

pol2_t_param = estimate(y, X[['ones', 'T', 'x', 'xsq']]).params()[1, 0]
pol3_t_param = estimate(y, X[['ones', 'T', 'x', 'xsq', 'xcub']]).params()[1, 0]
pol4_t_param = estimate(y, X[['ones', 'T', 'x', 'xsq', 'xcub', 'xquart']]).params()[1, 0]

poly_df = pd.DataFrame({'squared': pol2_t_param,
                        'cubed': pol3_t_param,
                        'quartic': pol4_t_param
                        }, index=['poly_treat']).round(3)

poly_df

Unnamed: 0,squared,cubed,quartic
poly_treat,2.11,2.184,2.255


The kernel estimates show a larger treatment effect for the smallest bandwidth specification, with decreases in effect as bandwidth size increases. Local linear treatment estimates display the same treatment effect progression. Polynomial specification results in a treatment effect that increases alongside bandwidth size. 