# **Wooldridge Notes on Treatment Effects Estimation**

This notebook provides a summary of the [slides](https://www.dropbox.com/sh/zj91darudf2fica/AADWlJIH9SI3XvtADXgYma0ka/ESTIMATE_DiD?dl=0&preview=slides_0_estimate_did_202112_v2.pdf&subfolder_nav_tracking=1) on treatment effects estimation made publicly available by Jeff Wooldridge, along with practical Stata examples of some of the estimation approaches presented.

For these practical examples, I use data from the Stata `teffects` command documentation, which is based on Cattaneo (2010).

**Notation:**
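- $y$: outcome of interest, `bweight` (infant birthweight in grams)
- $X$: covariates, `mmarried` (mother is married), `mage` (mother's age), `prenatal1` (first prenatal visit in the first trimester), `fbaby` (first baby)
- $D$: treatment indicator, `mbsmoke` (mother smoked during pregnancy)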

Set up the environment:

In [1]:
%%capture
import stata_setup
stata_setup.config("/Applications/Stata/", "mp")

**Prepare data:**

Here I also create new variables that follow the notation in Wooldridge more closely.

In [6]:
%%stata
use "https://www.stata-press.com/data/r17/cattaneo2" ,  clear

gen y  = bweight
gen D  = mbsmoke
gen x1 = mmarried
gen x2 = mage
gen x3 = prenatal1
gen x4 = fbaby


. use "https://www.stata-press.com/data/r17/cattaneo2" ,  clear


(Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138–154)

. 
. gen y  = bweight

. gen D  = mbsmoke

. gen x1 = mmarried

. gen x2 = mage

. gen x3 = prenatal1

. gen x4 = fbaby

. 


## Regression Adjustment (RA)

#### **Estimation using the `teffects` command:**



In [7]:
%%stata
teffects ra (y x1-x4) (D), atet
di "Estimated ATT is: " "`=strofreal(_b[ATET:r1vs0.D])'" 


. teffects ra (y x1-x4) (D), atet



Iteration 0:   EE criterion =  2.425e-23  
Iteration 1:   EE criterion =  1.596e-26  

Treatment-effects estimation                    Number of obs     =      4,642
Estimator      : regression adjustment
Outcome model  : linear
Treatment model: none
------------------------------------------------------------------------------
             |               Robust
           y | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
           D |
   (1 vs 0)  |  -223.3017    22.7422    -9.82   0.000    -267.8755   -178.7278
-------------+----------------------------------------------------------------
POmean       |
           D |
          0  |   3360.961   12.75749   263.45   0.000     3335.957    3385.966
------------------------------------------------------------------------------

. di "Estimated ATT is: " "`=strofreal(_b[ATET:r1vs0.D])'" 
Estimated ATT is: -223.3017

. 


#### **Estimation by manual imputation of counterfactual:**
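
In the notation above, the imputation steps in the following cell can be summarized as below. This is a sketch: the symbols $\hat{\alpha}_0$, $\hat{\beta}_0$ (OLS intercept and slopes fitted on the untreated observations only) and $N_1$ (number of treated observations) are introduced here for illustration and are not taken from the slides.

$$
\hat{y}_i(0) = \hat{\alpha}_0 + x_i \hat{\beta}_0 ,
\qquad
\hat{\tau}_{ATT} = \frac{1}{N_1} \sum_{i:\, D_i = 1} \left[ y_i - \hat{y}_i(0) \right]
$$

That is, the untreated regression is used to impute the missing counterfactual outcome for each treated unit, and the ATT estimate is the average gap between the observed and imputed outcomes among the treated.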

In [8]:
%%stata
// Estimate parameters using non-treated obs only
qui: reg y x1-x4 if D==0

// Predict counter-factual
predict y_0 if D==1

// Estimate ATT by averaging differences
gen te_manual = y - y_0 if D==1
sum te_manual
di "Estimated ATT is: " "`=strofreal(r(mean))'" 


. // Estimate parameters using non-treated obs only
. qui: reg y x1-x4 if D==0

. 
. // Predict counter-factual
. predict y_0 if D==1
(option xb assumed; fitted values)
(3,778 missing values generated)

. 
. // Estimate ATT by averaging differences
. gen te_manual = y - y_0 if D==1
(3,778 missing values generated)

. sum te_manual

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
   te_manual |        864   -223.3016    563.9532  -3025.057   1778.878

. di "Estimated ATT is: " "`=strofreal(r(mean))'" 
Estimated ATT is: -223.3016

. 
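
The same imputation logic can be sketched outside Stata. The following is a minimal pure-Python illustration with a single covariate and made-up toy data; the function names `ols_line` and `ra_att` are hypothetical helpers written for this note, not part of any library, and the numbers have nothing to do with the Cattaneo data.

```python
from statistics import mean


def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def ra_att(y, d, x):
    """Regression-adjustment ATT with one covariate (illustrative helper)."""
    # Step 1: fit the outcome model on untreated observations only
    x0 = [xi for xi, di in zip(x, d) if di == 0]
    y0 = [yi for yi, di in zip(y, d) if di == 0]
    a0, b0 = ols_line(x0, y0)
    # Step 2: impute the counterfactual y(0) for each treated observation
    # Step 3: average observed minus imputed outcomes among the treated
    effects = [yi - (a0 + b0 * xi) for yi, di, xi in zip(y, d, x) if di == 1]
    return mean(effects)


# Toy data (made up): untreated outcomes lie exactly on y = 2x
x = [1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 5.0]
d = [0, 0, 0, 0, 1, 1, 1]
y = [2.0, 4.0, 6.0, 8.0, 3.0, 5.0, 9.0]

att = ra_att(y, d, x)
print(att)
```

In this toy example every treated outcome sits one unit below the untreated line, so the estimated ATT is the same for each treated unit. With several covariates, as in the Stata cell above, step 1 would be a multiple regression, but steps 2 and 3 are unchanged.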


#### **Estimation by pooled OLS proposed by Wooldridge:**

Center the covariates around their treatment-group means:

In [5]:
%%stata
cap drop x*_c	
foreach x of varlist x1-x4 {
	qui: sum `x' if D==1
	gen `x'_c = `x' - r(mean)
}


. cap drop x*_c   

. foreach x of varlist x1-x4 {
  2.         qui: sum `x' if D==1
  3.         gen `x'_c = `x' - r(mean)
  4. }

. 


Obtain ATT with pooled OLS regression:

Note that the reported standard errors are not valid, since they do not account for the sampling variation in the treatment-group means used to center the covariates.
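
To see why the coefficient on $D$ in this regression reproduces the imputation estimate, it may help to write the pooled model out. This is a sketch: the symbols $\alpha$, $\tau$, $\beta$, $\delta$ and $\bar{x}_1$ (the covariate means among the treated) are labels introduced here for illustration.

$$
y_i = \alpha + \tau D_i + x_i \beta + D_i \,(x_i - \bar{x}_1)\, \delta + u_i
$$

Because the model is fully interacted, OLS gives the same fit as two separate regressions: $(\hat{\alpha}_0, \hat{\beta}_0) = (\hat{\alpha}, \hat{\beta})$ for the untreated and $(\hat{\alpha}_1, \hat{\beta}_1) = (\hat{\alpha} + \hat{\tau} - \bar{x}_1 \hat{\delta},\ \hat{\beta} + \hat{\delta})$ for the treated. Solving for $\hat{\tau}$, and using the fact that an OLS fit with an intercept passes through the sample means of the treated group ($\hat{\alpha}_1 + \bar{x}_1 \hat{\beta}_1 = \bar{y}_1$):

$$
\hat{\tau}
= (\hat{\alpha}_1 + \bar{x}_1 \hat{\beta}_1) - (\hat{\alpha}_0 + \bar{x}_1 \hat{\beta}_0)
= \bar{y}_1 - (\hat{\alpha}_0 + \bar{x}_1 \hat{\beta}_0)
= \frac{1}{N_1} \sum_{i:\, D_i = 1} \left[ y_i - (\hat{\alpha}_0 + x_i \hat{\beta}_0) \right]
$$

The last expression is the average of observed minus imputed outcomes among the $N_1$ treated observations.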

In [30]:
%%stata 
reg y D x1-x4 c.D#c.(x1_c-x4_c)
di "Estimated ATT is: " "`=strofreal(_b[D])'"


. reg y D x1-x4 c.D#c.(x1_c-x4_c)

      Source |       SS           df       MS      Number of obs   =     4,642
-------------+----------------------------------   F(9, 4632)      =     33.78
       Model |    95778474         9  10642052.7   Prob > F        =    0.0000
    Residual |  1.4591e+09     4,632  315005.562   R-squared       =    0.0616
-------------+----------------------------------   Adj R-squared   =    0.0598
       Total |  1.5549e+09     4,641  335032.156   Root MSE        =    561.25

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           D |  -223.3016   22.13099   -10.09   0.000    -266.6889   -179.9144
          x1 |   160.9513   24.51695     6.56   0.000     112.8864    209.0162
          x2 |   2.546828   1.932518     1.32   0.188    -1.241829    6.335484
          x3 | 

To obtain valid standard errors, use the original (uncentered) covariates and the `margins` command:


In [31]:
%%stata
qui: reg y i.D x1-x4 i.D#c.(x1-x4) , vce(robust)
margins , dydx(i.D) subpop(D) vce(uncond)
matrix b = r(b)
di "Estimated ATT is: " "`=strofreal(b[1, 2])'"



. qui: reg y i.D x1-x4 i.D#c.(x1-x4) , vce(robust)

. margins , dydx(i.D) subpop(D) vce(uncond)



Average marginal effects                               Number of obs   = 4,642
                                                       Subpop. no. obs =   864

Expression: Linear prediction, predict()
dy/dx wrt:  1.D

------------------------------------------------------------------------------
             |            Unconditional
             |      dy/dx   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           D |
          0  |          0  (empty)
          1  |  -223.3017   22.76673    -9.81   0.000    -267.9353    -178.668
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

. matrix b = r(b)

. di "Estimated ATT is: " "`=strofreal(b[1, 2])'"
Estimated ATT is: -223.3017

. 
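
The `margins` result can be read as an average of unit-level effects over the treated observations. As a sketch, with $\tau_0$ and $\delta$ as labels introduced here for illustration, the regression with uncentered interactions is

$$
y_i = \alpha + \tau_0 D_i + x_i \beta + D_i \, x_i \, \delta + u_i ,
$$

so the discrete change in the linear prediction from $D = 0$ to $D = 1$ for unit $i$ is $\hat{\tau}_0 + x_i \hat{\delta}$. Averaging over the subpopulation with $D_i = 1$ gives

$$
\hat{\tau}_{ATT} = \frac{1}{N_1} \sum_{i:\, D_i = 1} \left( \hat{\tau}_0 + x_i \hat{\delta} \right) = \hat{\tau}_0 + \bar{x}_1 \hat{\delta} ,
$$

where $\bar{x}_1$ is the vector of covariate means among the $N_1$ treated observations. The `vce(uncond)` option treats the covariates as sampled rather than fixed, so the standard error reflects that $\bar{x}_1$ is itself estimated; it requires the robust variance estimator requested in the `reg` call. The resulting standard error (22.77) is close to the one reported by `teffects ra` (22.74).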
