In [None]:
import stata_setup
stata_setup.config("C:/Program Files/Stata17/", "mp")

# Chapter 12

Load the data set.

In [None]:
from pystata import stata
stata.run('''
* Clear memory and load the data
clear
use ../Data/Card1995
''',quietly=True)

Construct the variables used in the analysis.

In [None]:
stata.run('''
set more off
* Potential experience: age minus years of schooling minus 6
gen exp = age76 - ed76 - 6
* Squared terms, divided by 100
gen exp2 = (exp^2)/100
gen age2 = (age76^2)/100
* Dropping observations with missing wage
drop if lwage76==.
''',quietly=True)
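For readers who want to check the variable construction without Stata, the same steps can be mirrored in NumPy. This is only a sketch: the four observations below are made-up values, not rows from Card1995, and `NaN` is assumed to play the role of Stata's missing value `.`.

```python
import numpy as np

# Hypothetical toy values, NOT taken from the Card1995 data
age76 = np.array([29.0, 27.0, 34.0, 25.0])
ed76 = np.array([12.0, 16.0, 11.0, 14.0])
lwage76 = np.array([6.31, np.nan, 6.18, 6.48])  # NaN stands in for Stata's "."

# Same formulas as the Stata cell above
exp = age76 - ed76 - 6      # potential experience
exp2 = exp**2 / 100
age2 = age76**2 / 100

# Mirror `drop if lwage76==.`: keep only rows with an observed wage
keep = ~np.isnan(lwage76)
exp, exp2, age2 = exp[keep], exp2[keep], age2[keep]

print(exp)   # potential experience for the three retained rows
```

The second toy observation has a missing wage, so three rows remain after the drop.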

The _structural model_ of interest is

$$
\texttt{lwage76} = \beta_1\texttt{ed76} + \beta_2\texttt{exp} + \beta_3\texttt{exp2} + \beta_4\texttt{black} + \beta_5\texttt{reg76r} + \beta_6\texttt{smsa76r} + \beta_7 + e,
$$

where

* $\texttt{lwage76}$: log of weekly earnings.
* $\texttt{ed76}$: years of schooling.
* $\texttt{exp}$: (potential) years of work experience.
* $\texttt{exp2}$: $\texttt{exp}^2/100$.
* $\texttt{black}$: indicator for being African American.
* $\texttt{reg76r}$: indicator for residence in the southern region of the U.S.
* $\texttt{smsa76r}$: indicator for residence in a standard metropolitan statistical area.

Candidate instruments to address the potential endogeneity of $\texttt{ed76}$ (and, in Scenario 2, of $\texttt{exp}$ and $\texttt{exp2}$):

* $\texttt{nearc4}$: indicator for growing up in the same county as a 4-year college.
* $\texttt{nearc4a}$: indicator for growing up in the same county as a 4-year _public_ college.
* $\texttt{nearc4b}$: indicator for growing up in the same county as a 4-year _private_ college.
* $\texttt{age76}$: age in 1976 in years.
* $\texttt{age2}$: $\texttt{age76}^2/100$.

## (Estimated) Reduced Forms

### Scenario 1 - Options 1 & 2

**S1_lwage76_O1**: Regression of $\texttt{lwage76}$ on $\texttt{exp}$, $\texttt{exp2}$, $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4}$.

**S1_ed76_O1**: Regression of $\texttt{ed76}$ on $\texttt{exp}$, $\texttt{exp2}$, $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4}$.

**S1_lwage76_O2**: Regression of $\texttt{lwage76}$ on $\texttt{exp}$, $\texttt{exp2}$, $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4a}$, $\texttt{nearc4b}$.

**S1_ed76_O2**: Regression of $\texttt{ed76}$ on $\texttt{exp}$, $\texttt{exp2}$, $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4a}$, $\texttt{nearc4b}$.

In [None]:
stata.run('''
quietly reg lwage76 exp exp2 black reg76r smsa76r nearc4, robust
estimates store S1_lwage76_O1
quietly reg ed76 exp exp2 black reg76r smsa76r nearc4, robust
estimates store S1_ed76_O1
quietly reg lwage76 exp exp2 black reg76r smsa76r nearc4a nearc4b, robust
estimates store S1_lwage76_O2
quietly reg ed76 exp exp2 black reg76r smsa76r nearc4a nearc4b, robust
estimates store S1_ed76_O2
''',quietly=True)
%stata estimates table S1_lwage76_O1 S1_ed76_O1 S1_lwage76_O2 S1_ed76_O2, b(%9.3f)
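One way to read the Option 1 columns of this table: with a single instrument and a single endogenous regressor, the ratio of the two reduced-form coefficients on the instrument equals the IV estimate of the coefficient on the endogenous regressor (indirect least squares). The sketch below illustrates this on synthetic data. The data-generating process and the names `z`, `w`, `x`, `y` are assumptions made for illustration (standing in for `nearc4`, a control, `ed76`, and `lwage76`), not the Card1995 sample.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Synthetic stand-ins (assumed data-generating process, not Card1995)
w = rng.normal(size=n)                                # exogenous control
z = rng.binomial(1, 0.5, size=n).astype(float)        # instrument (role of nearc4)
u = rng.normal(size=n)                                # unobserved confounder
x = 12 + 0.5 * z + 0.3 * w + u + rng.normal(size=n)   # endogenous regressor (role of ed76)
y = 0.1 * x + 0.2 * w + u + rng.normal(size=n)        # outcome (role of lwage76)

def ols(dep, regressors):
    """Least-squares coefficients of dep on regressors."""
    return np.linalg.lstsq(regressors, dep, rcond=None)[0]

# Reduced forms: regress outcome and endogenous regressor on instrument + controls
Z = np.column_stack([z, w, np.ones(n)])
rf_y = ols(y, Z)
rf_x = ols(x, Z)

# Indirect least squares: ratio of the coefficients on the instrument
beta_ils = rf_y[0] / rf_x[0]

# Just-identified IV estimator (Z'X)^{-1} Z'y for comparison
X = np.column_stack([x, w, np.ones(n)])
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)

print(beta_ils, beta_iv[0])  # the two numbers agree up to rounding error
```

Applied to the table above, the analogous calculation would divide the `nearc4` coefficient in S1_lwage76_O1 by the `nearc4` coefficient in S1_ed76_O1.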

### Scenario 2 - Options 1 & 2

**S2_lwage76_O1**: Regression of $\texttt{lwage76}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4}$, $\texttt{age76}$, $\texttt{age2}$.

**S2_ed76_O1**: Regression of $\texttt{ed76}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4}$, $\texttt{age76}$, $\texttt{age2}$.

**S2_exp_O1**: Regression of $\texttt{exp}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4}$, $\texttt{age76}$, $\texttt{age2}$.

**S2_exp2_O1**: Regression of $\texttt{exp2}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4}$, $\texttt{age76}$, $\texttt{age2}$.


In [None]:
stata.run('''
quietly reg lwage76 black reg76r smsa76r nearc4 age76 age2, robust
estimates store S2_lwage76_O1
quietly reg ed76 black reg76r smsa76r nearc4 age76 age2, robust
estimates store S2_ed76_O1
quietly reg exp black reg76r smsa76r nearc4 age76 age2, robust
estimates store S2_exp_O1
quietly reg exp2 black reg76r smsa76r nearc4 age76 age2, robust
estimates store S2_exp2_O1
''',quietly=True)
%stata estimates table S2_lwage76_O1 S2_ed76_O1 S2_exp_O1 S2_exp2_O1, b(%9.3f)
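Because `exp` is constructed as `age76 - ed76 - 6` and `age76` is among the regressors, the reduced form for `exp` is mechanically tied to the one for `ed76`: each coefficient is the negative of the `ed76` coefficient, except that the `age76` coefficient is one minus it and the constant is shifted by -6. The sketch below checks this identity on synthetic data. The data-generating process is an assumption for illustration only, and `z` stands in for `nearc4`.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Synthetic stand-ins (assumed values, not Card1995)
age = rng.integers(24, 35, size=n).astype(float)   # role of age76
z = rng.binomial(1, 0.6, size=n).astype(float)     # role of nearc4
ed = 10 + 0.4 * z + 0.1 * age + rng.normal(scale=2.0, size=n)  # role of ed76
exper = age - ed - 6                               # potential experience, as in the notebook

# Regressors of the reduced forms: instrument, age, age^2/100, constant
Z = np.column_stack([z, age, age**2 / 100, np.ones(n)])
g_ed = np.linalg.lstsq(Z, ed, rcond=None)[0]
g_exp = np.linalg.lstsq(Z, exper, rcond=None)[0]

# Coefficients implied by the identity exper = age - ed - 6
implied = np.array([-g_ed[0], 1 - g_ed[1], -g_ed[2], -6 - g_ed[3]])

print(np.round(g_exp, 3))
print(np.round(implied, 3))  # matches the line above
```

This gives a quick consistency check on the S2_ed76_O1 and S2_exp_O1 columns. No such linear identity holds for `exp2`, since it is a nonlinear function of `ed76`.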

**S2_lwage76_O2**: Regression of $\texttt{lwage76}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4a}$, $\texttt{nearc4b}$, $\texttt{age76}$, $\texttt{age2}$.

**S2_ed76_O2**: Regression of $\texttt{ed76}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4a}$, $\texttt{nearc4b}$, $\texttt{age76}$, $\texttt{age2}$.

**S2_exp_O2**: Regression of $\texttt{exp}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4a}$, $\texttt{nearc4b}$, $\texttt{age76}$, $\texttt{age2}$.

**S2_exp2_O2**: Regression of $\texttt{exp2}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4a}$, $\texttt{nearc4b}$, $\texttt{age76}$, $\texttt{age2}$.

In [None]:
stata.run(
'''
quietly reg lwage76 black reg76r smsa76r nearc4a nearc4b age76 age2, robust
estimates store S2_lwage76_O2
quietly reg ed76 black reg76r smsa76r nearc4a nearc4b age76 age2, robust
estimates store S2_ed76_O2
quietly reg exp black reg76r smsa76r nearc4a nearc4b age76 age2, robust
estimates store S2_exp_O2
quietly reg exp2 black reg76r smsa76r nearc4a nearc4b age76 age2, robust
estimates store S2_exp2_O2
''',quietly=True)

%stata estimates table S2_lwage76_O2 S2_ed76_O2 S2_exp_O2 S2_exp2_O2, b(%9.3f)

## Instrumental Variable Estimation ($\ell = k$)

In [None]:
stata.run(
'''
quietly ivregress 2sls lwage76 exp exp2 black reg76r smsa76r (ed76 = nearc4), robust
estimates store S1_O1
quietly ivregress 2sls lwage76 black reg76r smsa76r (ed76 exp exp2 = nearc4 age76 age2), robust perfect
estimates store S2_O1
''',quietly=True)

%stata estimates table S1_O1 S2_O1, b(%9.3f)
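The $\ell = k$ and $\ell > k$ labels in the section headings can be checked by counting variables. The sketch below assumes the convention that $k$ counts all regressors in the structural equation and $\ell$ counts all instruments, where included exogenous regressors (and the constant) serve as their own instruments. The variable lists are copied from the `ivregress` calls in this notebook; `_cons` is a placeholder name for the constant that Stata adds automatically.

```python
# Exogenous controls shared by all specifications ("_cons" = constant)
controls = ["black", "reg76r", "smsa76r", "_cons"]

specs = {
    "S1_O1": {"exog": ["exp", "exp2"] + controls,
              "endog": ["ed76"],
              "excluded": ["nearc4"]},
    "S1_O2": {"exog": ["exp", "exp2"] + controls,
              "endog": ["ed76"],
              "excluded": ["nearc4a", "nearc4b"]},
    "S2_O1": {"exog": controls,
              "endog": ["ed76", "exp", "exp2"],
              "excluded": ["nearc4", "age76", "age2"]},
    "S2_O2": {"exog": controls,
              "endog": ["ed76", "exp", "exp2"],
              "excluded": ["nearc4a", "nearc4b", "age76", "age2"]},
}

def order_condition(spec):
    """Return (k, l, status) for one specification."""
    k = len(spec["exog"]) + len(spec["endog"])       # regressors
    ell = len(spec["exog"]) + len(spec["excluded"])  # instruments
    if ell == k:
        status = "just identified"
    elif ell > k:
        status = "overidentified"
    else:
        status = "underidentified"
    return k, ell, status

summary = {name: order_condition(spec) for name, spec in specs.items()}
for name, (k, ell, status) in summary.items():
    print(f"{name}: k = {k}, l = {ell} ({status})")
```

Under this counting, the Option 1 specifications are just identified, while the Option 2 specifications, which split `nearc4` into `nearc4a` and `nearc4b`, have one more instrument than regressors.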

## Two-Stage Least Squares ($\ell>k$)

In [None]:
stata.run(
'''
quietly ivregress 2sls lwage76 exp exp2 black reg76r smsa76r (ed76 = nearc4a nearc4b), robust
estimates store S1_O2
quietly ivregress 2sls lwage76 black reg76r smsa76r (ed76 exp exp2 = nearc4a nearc4b age76 age2), robust perfect
estimates store S2_O2
''',quietly=True)

%stata estimates table S1_O2 S2_O2, b(%9.3f)
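To make the name "two-stage least squares" concrete, the sketch below computes the estimator both as two literal OLS stages and via the closed-form expression $(X'P_Z X)^{-1}X'P_Z y$ with $P_Z = Z(Z'Z)^{-1}Z'$, and confirms that the point estimates coincide. The data are synthetic and the data-generating process is an assumption for illustration: `z1` and `z2` stand in for `nearc4a` and `nearc4b`, `x` for `ed76`, and `y` for `lwage76`.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Synthetic stand-ins (assumed data-generating process, not Card1995)
w = rng.normal(size=n)                             # exogenous control
z1 = rng.binomial(1, 0.4, size=n).astype(float)    # role of nearc4a
z2 = rng.binomial(1, 0.3, size=n).astype(float)    # role of nearc4b
u = rng.normal(size=n)                             # unobserved confounder
x = 12 + 0.5 * z1 + 0.8 * z2 + 0.3 * w + u + rng.normal(size=n)
y = 0.1 * x + 0.2 * w + u + rng.normal(size=n)

X = np.column_stack([x, w, np.ones(n)])        # regressors (k = 3)
Z = np.column_stack([z1, z2, w, np.ones(n)])   # instruments (l = 4 > k)

# Stage 1: project every regressor on the instruments
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
# Stage 2: regress the outcome on the fitted regressors
beta_two_stage = np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Closed form: (X' P_Z X)^{-1} X' P_Z y, without forming the n-by-n matrix P_Z
A = X.T @ Z @ np.linalg.inv(Z.T @ Z) @ Z.T
beta_closed = np.linalg.solve(A @ X, A @ y)

print(np.round(beta_two_stage, 4))
print(np.round(beta_closed, 4))  # same coefficients as the two-stage route
```

Only the point estimates agree between the two routes. Standard errors taken from a manual second-stage OLS would be incorrect, which is one reason to rely on `ivregress` for inference.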

In [None]:
%stata estimates table S1_O2 S2_O2, b(%9.3f) se