In [1]:
import stata_setup
stata_setup.config("C:/Program Files/Stata17/", "mp")


  ___  ____  ____  ____  ____ ®
 /__    /   ____/   /   ____/      17.0
___/   /   /___/   /   /___/       MP—Parallel Edition

 Statistics and Data Science       Copyright 1985-2021 StataCorp LLC
                                   StataCorp
                                   4905 Lakeway Drive
                                   College Station, Texas 77845 USA
                                   800-STATA-PC        https://www.stata.com
                                   979-696-4600        stata@stata.com

Stata license: Single-user 4-core  perpetual
Serial number: 501706303466
  Licensed to: David Tomas Jacho-Chavez
               Emory University

Notes:
      1. Unicode is supported; see help unicode_advice.
      2. More than 2 billion observations are allowed; see help obs_advice.
      3. Maximum number of variables is set to 5,000; see help set_maxvar.


# Chapter 12

Load the Card (1995) data set.

In [2]:
from pystata import stata
stata.run('''
* Clear memory and load the data
clear
use "https://www.ssc.wisc.edu/~bhansen/econometrics/Card1995.dta", clear
''',quietly=True)




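Construct the variables used in the analysis: potential experience (`exp`), its square divided by 100 (`exp2`), and age squared divided by 100 (`age2`). Observations with a missing log wage are dropped.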

In [3]:
stata.run('''
set more off
* Potential experience: age minus years of schooling minus 6
gen exp = age76 - ed76 - 6
gen exp2 = (exp^2)/100
gen age2 = (age76^2)/100
* Drop observations with a missing log wage
drop if missing(lwage76)
''',quietly=True)
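As a quick sanity check on the definitions above, the same arithmetic can be traced by hand for a single person. The sketch below is plain Python and does not touch the Stata data; the helper `construct` and the values `age76=30`, `ed76=12` are hypothetical and chosen only for illustration.

```python
def construct(age76, ed76):
    """Mirror the Stata `gen` statements for one hypothetical observation."""
    exp = age76 - ed76 - 6          # potential experience
    return {
        "exp": exp,
        "exp2": exp ** 2 / 100,     # experience squared, divided by 100
        "age2": age76 ** 2 / 100,   # age squared, divided by 100
    }

# Hypothetical person: 30 years old in 1976 with 12 years of schooling
example = construct(age76=30, ed76=12)
print(example)
```

Dividing the squared terms by 100 only rescales their coefficients, which keeps them readable at three decimals in the tables.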




The _structural model_ of interest is

$$
\texttt{lwage76} = \beta_1\texttt{ed76} + \beta_2\texttt{exp} + \beta_3\texttt{exp2} + \beta_4\texttt{black} + \beta_5\texttt{reg76r} + \beta_6\texttt{smsa76r} + \beta_7 + e,
$$

where

* $\texttt{lwage76}$: log of weekly earnings.
* $\texttt{ed76}$: years of schooling.
* $\texttt{exp}$: (potential) years of work experience.
* $\texttt{exp2}$: $\texttt{exp}^2/100$.
* $\texttt{black}$: indicator for being African American.
* $\texttt{reg76r}$: indicator for residence in the southern region of the U.S.
* $\texttt{smsa76r}$: indicator for residence in a standard metropolitan statistical area.

Instruments available to address the potential endogeneity of schooling and experience:

* $\texttt{nearc4}$: indicator for having grown up in the same county as a 4-year college.
* $\texttt{nearc4a}$: indicator for having grown up in the same county as a 4-year _public_ college.
* $\texttt{nearc4b}$: indicator for having grown up in the same county as a 4-year _private_ college.
* $\texttt{age76}$: age in 1976 in years.
* $\texttt{age2}$: $\texttt{age76}^2/100$.
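The section headings further down distinguish $\ell = k$ from $\ell > k$, and the count behind that distinction can be laid out explicitly. The tally below is a sketch read off the `ivregress` calls in this notebook. It assumes that $k$ counts every regressor in the structural equation, including the constant, and that $\ell$ counts every instrument, meaning the excluded instruments plus the exogenous regressors, which serve as their own instruments. Scenario 1 treats only $\texttt{ed76}$ as endogenous. Scenario 2 treats $\texttt{ed76}$, $\texttt{exp}$ and $\texttt{exp2}$ as endogenous.

$$
\begin{array}{llcc}
\text{Specification} & \text{Excluded instruments} & k & \ell \\
\hline
\text{Scenario 1, Option 1} & \texttt{nearc4} & 7 & 6 + 1 = 7 \\
\text{Scenario 1, Option 2} & \texttt{nearc4a},\ \texttt{nearc4b} & 7 & 6 + 2 = 8 \\
\text{Scenario 2, Option 1} & \texttt{nearc4},\ \texttt{age76},\ \texttt{age2} & 7 & 4 + 3 = 7 \\
\text{Scenario 2, Option 2} & \texttt{nearc4a},\ \texttt{nearc4b},\ \texttt{age76},\ \texttt{age2} & 7 & 4 + 4 = 8
\end{array}
$$

Under this count, Option 1 is just identified in both scenarios and Option 2 is overidentified by one instrument.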

## (Estimated) Reduced Forms

### Scenario 1 - Options 1 & 2

**S1_lwage76_O1**: Regression of $\texttt{lwage76}$ on $\texttt{exp}$, $\texttt{exp2}$, $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4}$.

**S1_ed76_O1**: Regression of $\texttt{ed76}$ on $\texttt{exp}$, $\texttt{exp2}$, $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4}$.

**S1_lwage76_O2**: Regression of $\texttt{lwage76}$ on $\texttt{exp}$, $\texttt{exp2}$, $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4a}$, $\texttt{nearc4b}$.

**S1_ed76_O2**: Regression of $\texttt{ed76}$ on $\texttt{exp}$, $\texttt{exp2}$, $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4a}$, $\texttt{nearc4b}$.

In [4]:
stata.run('''
quietly reg lwage76 exp exp2 black reg76r smsa76r nearc4, robust
estimates store S1_lwage76_O1
quietly reg ed76 exp exp2 black reg76r smsa76r nearc4, robust
estimates store S1_ed76_O1
quietly reg lwage76 exp exp2 black reg76r smsa76r nearc4a nearc4b, robust
estimates store S1_lwage76_O2
quietly reg ed76 exp exp2 black reg76r smsa76r nearc4a nearc4b, robust
estimates store S1_ed76_O2
''',quietly=True)
%stata estimates table S1_lwage76_O1 S1_ed76_O1 S1_lwage76_O2 S1_ed76_O2, b(%9.3f)



--------------------------------------------------------------
    Variable | S1_lwag~1   S1_ed76~1   S1_lwag~2   S1_ed76~2  
-------------+------------------------------------------------
         exp |     0.053      -0.410       0.053      -0.413  
        exp2 |    -0.219       0.073      -0.215       0.093  
       black |    -0.264      -1.006      -0.264      -1.006  
      reg76r |    -0.143      -0.291      -0.138      -0.267  
     smsa76r |     0.185       0.404       0.184       0.400  
      nearc4 |     0.045       0.337                          
     nearc4a |                             0.064       0.430  
     nearc4b |                            -0.000       0.123  
       _cons |     5.957      16.659       5.956      16.657  
--------------------------------------------------------------


### Scenario 2 - Options 1 & 2

**S2_lwage76_O1**: Regression of $\texttt{lwage76}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4}$, $\texttt{age76}$, $\texttt{age2}$.

**S2_ed76_O1**: Regression of $\texttt{ed76}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4}$, $\texttt{age76}$, $\texttt{age2}$.

**S2_exp_O1**: Regression of $\texttt{exp}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4}$, $\texttt{age76}$, $\texttt{age2}$.

**S2_exp2_O1**: Regression of $\texttt{exp2}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4}$, $\texttt{age76}$, $\texttt{age2}$.


In [5]:
stata.run('''
quietly reg lwage76 black reg76r smsa76r nearc4 age76 age2, robust
estimates store S2_lwage76_O1
quietly reg ed76 black reg76r smsa76r nearc4 age76 age2, robust
estimates store S2_ed76_O1
quietly reg exp black reg76r smsa76r nearc4 age76 age2, robust
estimates store S2_exp_O1
quietly reg exp2 black reg76r smsa76r nearc4 age76 age2, robust
estimates store S2_exp2_O1
''',quietly=True)
%stata estimates table S2_lwage76_O1 S2_ed76_O1 S2_exp_O1 S2_exp2_O1, b(%9.3f)



--------------------------------------------------------------
    Variable | S2_lwag~1   S2_ed76~1   S2_exp_O1   S2_exp2~1  
-------------+------------------------------------------------
       black |    -0.239      -1.468       1.468       0.282  
      reg76r |    -0.142      -0.460       0.460       0.112  
     smsa76r |     0.186       0.835      -0.835      -0.176  
      nearc4 |     0.032       0.347      -0.347      -0.073  
       age76 |     0.182       1.061      -0.061      -0.555  
        age2 |    -0.249      -1.876       1.876       1.313  
       _cons |     3.101      -1.869      -4.131       6.099  
--------------------------------------------------------------
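The $\texttt{ed76}$ and $\texttt{exp}$ columns above are mirror images of each other, and this is mechanical rather than empirical. Because $\texttt{exp} = \texttt{age76} - \texttt{ed76} - 6$, and both $\texttt{age76}$ and the constant are among the regressors, each coefficient in the $\texttt{exp}$ regression equals minus the corresponding $\texttt{ed76}$ coefficient, plus 1 for $\texttt{age76}$ and minus 6 for the constant. The sketch below checks this in plain Python against the rounded numbers copied from the table; the variable names are illustrative.

```python
# Reduced-form coefficients copied from the Scenario 2, Option 1 table
ed76_rf = {"black": -1.468, "reg76r": -0.460, "smsa76r": 0.835,
           "nearc4": 0.347, "age76": 1.061, "age2": -1.876, "_cons": -1.869}
exp_rf = {"black": 1.468, "reg76r": 0.460, "smsa76r": -0.835,
          "nearc4": -0.347, "age76": -0.061, "age2": 1.876, "_cons": -4.131}

# Coefficients implied by the identity exp = age76 - ed76 - 6
implied = {name: -coef for name, coef in ed76_rf.items()}
implied["age76"] += 1.0   # age76 enters exp with coefficient +1
implied["_cons"] -= 6.0   # the constant shifts by -6

max_gap = max(abs(implied[name] - exp_rf[name]) for name in exp_rf)
print(f"largest discrepancy: {max_gap:.3f}")
```

The discrepancy is zero at the three decimals reported, so the $\texttt{exp}$ reduced form carries no information beyond the $\texttt{ed76}$ one.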


**S2_lwage76_O2**: Regression of $\texttt{lwage76}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4a}$, $\texttt{nearc4b}$, $\texttt{age76}$, $\texttt{age2}$.

**S2_ed76_O2**: Regression of $\texttt{ed76}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4a}$, $\texttt{nearc4b}$, $\texttt{age76}$, $\texttt{age2}$.

**S2_exp_O2**: Regression of $\texttt{exp}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4a}$, $\texttt{nearc4b}$, $\texttt{age76}$, $\texttt{age2}$.

**S2_exp2_O2**: Regression of $\texttt{exp2}$ on $\texttt{black}$, $\texttt{reg76r}$, $\texttt{smsa76r}$, $\texttt{nearc4a}$, $\texttt{nearc4b}$, $\texttt{age76}$, $\texttt{age2}$.

In [6]:
stata.run(
'''
quietly reg lwage76 black reg76r smsa76r nearc4a nearc4b age76 age2, robust
estimates store S2_lwage76_O2
quietly reg ed76 black reg76r smsa76r nearc4a nearc4b age76 age2, robust
estimates store S2_ed76_O2
quietly reg exp black reg76r smsa76r nearc4a nearc4b age76 age2, robust
estimates store S2_exp_O2
quietly reg exp2 black reg76r smsa76r nearc4a nearc4b age76 age2, robust
estimates store S2_exp2_O2
''',quietly=True)

%stata estimates table S2_lwage76_O2 S2_ed76_O2 S2_exp_O2 S2_exp2_O2



------------------------------------------------------------------
    Variable | S2_lwage~2   S2_ed76_O2   S2_exp_O2    S2_exp2_O2  
-------------+----------------------------------------------------
       black | -.23859215   -1.4681378    1.4681378     .2820042  
      reg76r | -.13758305   -.42700612    .42700612    .10354738  
     smsa76r |  .18523367     .8284784    -.8284784   -.17380163  
     nearc4a |  .05073151    .46941488   -.46941488   -.10327114  
     nearc4b | -.00943173    .06598527   -.06598527   -.00180781  
       age76 |    .181765    1.0612174   -.06121738   -.55446262  
        age2 | -.24908546   -1.8771542    1.8771542     1.313498  
       _cons |  3.1007129    -1.869194    -4.130806    6.0993322  
------------------------------------------------------------------


## Instrumental Variable Estimation ($\ell = k$)

In [7]:
stata.run(
'''
quietly ivregress 2sls lwage76 exp exp2 black reg76r smsa76r (ed76 = nearc4), robust
estimates store S1_O1
quietly ivregress 2sls lwage76 black reg76r smsa76r (ed76 exp exp2 = nearc4 age76 age2), robust perfect
estimates store S2_O1
''',quietly=True)

%stata estimates table S1_O1 S2_O1, b(%9.3f)



--------------------------------------
    Variable |   S1_O1       S2_O1    
-------------+------------------------
        ed76 |     0.132       0.133  
         exp |     0.107       0.056  
        exp2 |    -0.228      -0.080  
       black |    -0.131      -0.103  
      reg76r |    -0.105      -0.098  
     smsa76r |     0.131       0.108  
       _cons |     3.753       4.066  
--------------------------------------
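In the just-identified Scenario 1 case, with one endogenous regressor and one excluded instrument, the IV coefficient on $\texttt{ed76}$ equals the ratio of the two reduced-form coefficients on $\texttt{nearc4}$ (indirect least squares). The sketch below uses the rounded values from the S1_lwage76_O1 and S1_ed76_O1 columns of the reduced-form table, so it reproduces the S1_O1 estimate only up to rounding error in the inputs.

```python
# Rounded reduced-form coefficients on nearc4 (Scenario 1, Option 1)
rf_lwage76 = 0.045   # from the regression of lwage76 on nearc4 and controls
rf_ed76 = 0.337      # from the regression of ed76 on nearc4 and controls

# Indirect least squares: ratio of the two reduced-form coefficients
ils_ed76 = rf_lwage76 / rf_ed76
print(f"indirect least squares estimate: {ils_ed76:.3f}")
```

The ratio is about 0.134, against 0.132 in the S1_O1 column. A gap of this size is what three-decimal rounding of the two inputs would produce.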


## Two-Stage Least Squares ($\ell>k$)

In [8]:
stata.run(
'''
quietly ivregress 2sls lwage76 exp exp2 black reg76r smsa76r (ed76 = nearc4a nearc4b), robust
estimates store S1_O2
quietly ivregress 2sls lwage76 black reg76r smsa76r (ed76 exp exp2 = nearc4a nearc4b age76 age2), robust perfect
estimates store S2_O2
''',quietly=True)

%stata estimates table S1_O2 S2_O2, b(%9.3f)



--------------------------------------
    Variable |   S1_O2       S2_O2    
-------------+------------------------
        ed76 |     0.161       0.160  
         exp |     0.119       0.047  
        exp2 |    -0.231      -0.032  
       black |    -0.102      -0.064  
      reg76r |    -0.095      -0.086  
     smsa76r |     0.116       0.083  
       _cons |     3.268       3.748  
--------------------------------------
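To show what a 2SLS fit with more instruments than endogenous regressors computes, the sketch below builds the estimator by hand in NumPy on simulated data. It does not use the Card data or any Stata call. The data-generating process, sample size, and coefficient values are all assumptions made for illustration. It compares the literal two-stage procedure with the closed-form expression $(X'P_ZX)^{-1}X'P_Zy$, where $P_Z = Z(Z'Z)^{-1}Z'$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulated data (assumed design): two excluded instruments, one endogenous regressor
z = rng.normal(size=(n, 2))
u = rng.normal(size=n)                                   # unobserved confounder
x = z @ np.array([0.5, 0.3]) + u + rng.normal(size=n)    # endogenous: depends on u
y = 1.0 + 0.15 * x + u + rng.normal(size=n)              # true slope is 0.15

X = np.column_stack([np.ones(n), x])   # regressors (k = 2)
Z = np.column_stack([np.ones(n), z])   # instruments (l = 3 > k)

# Stage 1: project the regressors on the instruments
first_stage = np.linalg.lstsq(Z, X, rcond=None)[0]
X_hat = Z @ first_stage

# Stage 2: regress y on the fitted regressors
beta_2sls = np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Closed form: (X' P_Z X)^{-1} X' P_Z y, without forming the n x n projection
A = X.T @ Z @ np.linalg.inv(Z.T @ Z) @ Z.T
beta_closed = np.linalg.solve(A @ X, A @ y)

# OLS for comparison; it is biased upward here because x and u are correlated
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

print("2SLS (two stages):", beta_2sls)
print("2SLS (closed form):", beta_closed)
print("OLS:", beta_ols)
```

The two 2SLS computations agree to numerical precision. The standard errors from a manual second stage would be wrong, however, because they use residuals based on the fitted regressors. That is one reason to rely on `ivregress` for inference.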


In [9]:
%stata estimates table S1_O2 S2_O2, b(%9.3f) se


--------------------------------------
    Variable |   S1_O2       S2_O2    
-------------+------------------------
        ed76 |     0.161       0.160  
             |     0.040       0.041  
         exp |     0.119       0.047  
             |     0.018       0.025  
        exp2 |    -0.231      -0.032  
             |     0.037       0.127  
       black |    -0.102      -0.064  
             |     0.044       0.061  
      reg76r |    -0.095      -0.086  
             |     0.022       0.026  
     smsa76r |     0.116       0.083  
             |     0.026       0.041  
       _cons |     3.268       3.748  
             |     0.682       0.484  
--------------------------------------
                          Legend: b/se
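The `b/se` table can be turned into t-ratios and approximate confidence intervals directly. The sketch below does this for the $\texttt{ed76}$ coefficient using the rounded values printed above. It assumes a normal approximation with critical value 1.96, so the intervals are only as accurate as the three-decimal inputs.

```python
# Point estimates and robust standard errors for ed76, copied from the table
estimates = {"S1_O2": (0.161, 0.040), "S2_O2": (0.160, 0.041)}

Z_CRIT = 1.96  # normal critical value for a two-sided 95% interval (assumption)

results = {}
for name, (b, se) in estimates.items():
    t_ratio = b / se
    ci = (b - Z_CRIT * se, b + Z_CRIT * se)
    results[name] = (t_ratio, ci)
    print(f"{name}: t = {t_ratio:.2f}, 95% CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
```

Under both scenarios the interval for the return to schooling excludes zero, although it is wide compared with the point estimate.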
