# **Wooldridge (2021) on Diff-in-Diff Estimation**

This notebook provides a summary of the Supplemental Material in Wooldridge (2021), which goes over the various ways of implementing Differences-in-Differences estimators with panel data in Stata.

For the examples, I use data from the Stata `xtdidregress` command documentation, which comes from Moser and Voena (2012).

**Notation:**
- $y$: outcome of interest is `uspatents`
- $x$: time-invariant covariate is `x1`, simulated in `prep_data`
- $w$: dynamic treatment variable is `gotpatent`
- $t$: time variable is `year`
- $i$: panel unit identifier is `classid`

TODO: instead of using patent data, simulate dataset with known treatment effects

**Set up the environment:**

In [23]:
%%capture
import stata_setup
stata_setup.config("/Applications/Stata/", "mp")

**Prepare data:**

Define a program that prepares the data for estimation. Note that I create new variables with names that follow the notation in Wooldridge (2021) more closely.

In [24]:
%%stata -qui
cap program drop prep_data
program prep_data
	// Use example dataset from the xtdidregress documentation
	clear all
	use "https://www.stata-press.com/data/r17/patents", clear

	// Create variables following notation
	gen  y  = uspatents
	gen  w  = gotpatent
	gen  i  = classid
	gen  t  = year
	egen D  = max(w), by(i)
	
	// Create time-invariant covariate and update outcome variable to account for its effect on y
	gen x1 = rnormal(10, 2)
	bys i (t): replace x1 = x1[1]
	replace y = y + .07*x1
end




## **DiD estimator with two time periods (no covariates)**

Let's first restrict the panel to two time periods, 1915 and 1930 (the intervention started in 1919).

In [25]:
%%stata -qui
prep_data
keep if inlist(year, 1915, 1930)
replace t = 1 if year==1915
replace t = 2 if year==1930
xtset i t
gen D_y = D.y




Estimand of interest:
$$ \tau_{2, att} \equiv E[Y_2(1) - Y_2(0) | D= 1]$$

The **differences-in-differences estimator** is the difference in average changes between the treated and control groups:
$$ \hat\tau_{2, DD} = \Delta\bar Y_{treated} - \Delta\bar Y_{control}$$
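
As a cross-check outside Stata, here is a minimal Python sketch of the same estimator on a simulated two-period panel where the true ATT is known. Everything in it is an assumption made for illustration (the sample size, the 20% treatment share, the common trend of 0.3, the effect size, and all variable names); none of it comes from the patent data.

```python
import random
from statistics import mean

random.seed(0)
N = 5000          # number of panel units (hypothetical)
TRUE_ATT = 0.5    # treatment effect built into the simulation

changes = []      # one (D, delta_y) pair per unit
for _ in range(N):
    d = 1 if random.random() < 0.2 else 0      # ever-treated indicator D
    c = random.gauss(0, 1) + 0.5 * d           # unit effect, correlated with D
    y1 = c + random.gauss(0, 1)                # pre-period outcome
    y2 = c + 0.3 + TRUE_ATT * d + random.gauss(0, 1)  # post-period outcome
    changes.append((d, y2 - y1))

# Difference in average changes between treated and control units
dy_treated = mean(dy for d, dy in changes if d == 1)
dy_control = mean(dy for d, dy in changes if d == 0)
tau_dd = dy_treated - dy_control

print(f"DD estimate: {tau_dd:.3f} (true ATT = {TRUE_ATT})")
```

The unit effect `c` differs between groups on purpose: it cancels in the within-unit change, which is why the DD estimate should land near the true effect even though treated and control units differ in levels.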


In [26]:
%%stata
qui: sum D.y if D==1
loc Dy_bar1 = r(mean)
qui: sum D.y if D==0
loc Dy_bar0 = r(mean)
di "DD estimator of ATT is: " "`=strofreal(`Dy_bar1' - `Dy_bar0')'"


. qui: sum D.y if D==1

. loc Dy_bar1 = r(mean)

. qui: sum D.y if D==0

. loc Dy_bar0 = r(mean)

. di "DD estimator of ATT is: " "`=strofreal(`Dy_bar1' - `Dy_bar0')'"
DD estimator of ATT is: .1864459

. 


Now let's see the other ways in which we can obtain the same estimator in Stata.

#### **OLS using cross-section of changes**:


In [27]:
%%stata
reg D.y D , robust
di "CS OLS estimator of ATT is: " "`=strofreal(_b[D])'" 


. reg D.y D , robust

Linear regression                               Number of obs     =      7,248
                                                F(1, 7246)        =       7.38
                                                Prob > F          =     0.0066
                                                R-squared         =     0.0008
                                                Root MSE          =     1.4159

------------------------------------------------------------------------------
             |               Robust
         D.y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           D |   .1864459   .0686422     2.72   0.007     .0518873    .3210046
       _cons |   .1141493   .0171383     6.66   0.000     .0805532    .1477454
------------------------------------------------------------------------------

. di "CS OLS estimator of ATT is: " "`=strofreal(_b[D])'" 
CS OLS estimator of ATT is: .1864459

#### **OLS using panel:**

In [28]:
%%stata
reg y w D 2.t  , vce(cluster i)
di "Pooled OLS estimator of ATT is: " "`=strofreal(_b[w])'"


. reg y w D 2.t  , vce(cluster i)

Linear regression                               Number of obs     =     14,496
                                                F(3, 7247)        =      27.86
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0025
                                                Root MSE          =      1.463

                                  (Std. err. adjusted for 7,248 clusters in i)
------------------------------------------------------------------------------
             |               Robust
           y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           w |   .1864459   .0686445     2.72   0.007     .0518826    .3210093
           D |  -.2535119   .0526064    -4.82   0.000    -.3566357   -.1503882
         2.t |   .1141493   .0171389     6.66   0.000     

#### **OLS on panel including time-invariant controls:**

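The coefficient on `w` does not change! Only the other coefficients and the standard errors move slightly, because in a balanced two-period panel the part of `w` not explained by `D` and the period dummy is orthogonal to any time-invariant variable.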
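
A sketch of why the coefficient on $w$ cannot move, using the Frisch-Waugh-Lovell theorem. I assume a balanced panel with $w_{it} = D_i \cdot post_t$, where $post_t = 1[t = 2]$ is notation introduced here for illustration. The residual from regressing $w_{it}$ on a constant, $D_i$, and $post_t$ is

$$ \tilde w_{it} = (D_i - \bar D)(post_t - \tfrac{1}{2}) $$

Since $\sum_t (post_t - \tfrac{1}{2}) = 0$ for every unit, this residual is orthogonal to any time-invariant regressor $X_i$:

$$ \sum_i \sum_t \tilde w_{it} X_i = \sum_i (D_i - \bar D) X_i \sum_t (post_t - \tfrac{1}{2}) = 0 $$

So partialling out $X_i$ as well leaves $\tilde w_{it}$ unchanged, and with or without $X_i$ the coefficient on $w$ is

$$ \hat\tau = \frac{\sum_i \sum_t \tilde w_{it} Y_{it}}{\sum_i \sum_t \tilde w_{it}^2} = \frac{\sum_i (D_i - \bar D)\Delta Y_i}{\sum_i (D_i - \bar D)^2} = \Delta\bar Y_{treated} - \Delta\bar Y_{control} $$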

In [29]:
%%stata
reg y w D 2.t x1 , vce(cluster i)
di "Estimate is the same when adding time-invariant control to pooled OLS: " "`=strofreal(_b[w])'"


. reg y w D 2.t x1 , vce(cluster i)

Linear regression                               Number of obs     =     14,496
                                                F(4, 7247)        =      53.87
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0134
                                                Root MSE          =      1.455

                                  (Std. err. adjusted for 7,248 clusters in i)
------------------------------------------------------------------------------
             |               Robust
           y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           w |   .1864459   .0686469     2.72   0.007      .051878    .3210139
           D |  -.2544483   .0515706    -4.93   0.000    -.3555417   -.1533548
         2.t |   .1141493   .0171395     6.66   0.000   

#### **Two-way Fixed Effects using panel:**

In [30]:
%%stata
xtreg y w i.t, fe vce(cluster i)
di "TWFE estimator of ATT is: " "`=strofreal(_b[w])'" 


. xtreg y w i.t, fe vce(cluster i)

Fixed-effects (within) regression               Number of obs     =     14,496
Group variable: i                               Number of groups  =      7,248

R-squared:                                      Obs per group:
     Within  = 0.0082                                         min =          2
     Between = 0.0007                                         avg =        2.0
     Overall = 0.0013                                         max =          2

                                                F(2,7247)         =      32.41
corr(u_i, Xb) = -0.0121                         Prob > F          =     0.0000

                                  (Std. err. adjusted for 7,248 clusters in i)
------------------------------------------------------------------------------
             |               Robust
           y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+---------------------------------------------------------
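
To make the equivalence concrete, here is a minimal Python sketch (simulated data, hypothetical names and parameter values) that applies the within transformation by hand for $T=2$ and checks that the coefficient on $w$ equals the difference in mean changes. It only mirrors the point estimate from `xtreg, fe`; it does not reproduce the clustered standard errors.

```python
import random

random.seed(1)
N = 2000
rows = []  # one (D, y1, y2) triple per unit
for _ in range(N):
    d = 1 if random.random() < 0.3 else 0
    c = random.gauss(0, 1) + d                        # unit fixed effect
    y1 = c + random.gauss(0, 1)
    y2 = c + 0.2 + 0.4 * d + random.gauss(0, 1)
    rows.append((d, y1, y2))

# DD estimator: difference in mean changes
n1 = sum(d for d, _, _ in rows)
dd = (sum(y2 - y1 for d, y1, y2 in rows if d == 1) / n1
      - sum(y2 - y1 for d, y1, y2 in rows if d == 0) / (N - n1))

# Within transformation: subtract each unit's time average.
# With w = D * post, the demeaned values at t=2 are
#   y: (y2 - y1)/2,  w: D/2,  post: 1/2
# and the same values with opposite sign at t=1.
s_ww = s_wp = s_pp = s_wy = s_py = 0.0
for d, y1, y2 in rows:
    for sign in (-1.0, 1.0):
        y_dm = sign * (y2 - y1) / 2
        w_dm = sign * d / 2
        p_dm = sign * 0.5
        s_ww += w_dm * w_dm
        s_wp += w_dm * p_dm
        s_pp += p_dm * p_dm
        s_wy += w_dm * y_dm
        s_py += p_dm * y_dm

# Solve the 2x2 normal equations for the coefficient on w
det = s_ww * s_pp - s_wp * s_wp
b_w = (s_wy * s_pp - s_wp * s_py) / det

print(f"within estimate: {b_w:.6f}, DD estimate: {dd:.6f}")
```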

#### **Random Effects estimator:**
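(Note that we have to control for $D$, since random effects only partially removes time-invariant differences between treated and control units.)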

In [31]:
%%stata
xtreg y w i.t D, re vce(cluster i)
di "RE estimator of ATT is: " "`=strofreal(_b[w])'"


. xtreg y w i.t D, re vce(cluster i)

Random-effects GLS regression                   Number of obs     =     14,496
Group variable: i                               Number of groups  =      7,248

R-squared:                                      Obs per group:
     Within  = 0.0082                                         min =          2
     Between = 0.0007                                         avg =        2.0
     Overall = 0.0025                                         max =          2

                                                Wald chi2(3)      =      83.59
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                                  (Std. err. adjusted for 7,248 clusters in i)
------------------------------------------------------------------------------
             |               Robust
           y | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+-------------------------------------------------------

#### **Stata `xtdidregress` command:**

In [32]:
%%stata
xtdidreg (y) (w), group(i) time(t) vce(cluster i)
matrix b = e(b)
di "DD estimator of ATT is: " "`=strofreal(b[1, 1])'"


. xtdidreg (y) (w), group(i) time(t) vce(cluster i)



Number of groups and treatment time

Time variable: t
Control:       w = 0
Treatment:     w = 1
-----------------------------------
             |   Control  Treatment
-------------+---------------------
Group        |
           i |      6912        336
-------------+---------------------
Time         |
     Minimum |         1          2
     Maximum |         1          2
-----------------------------------

Difference-in-differences regression                    Number of obs = 14,496
Data type: Longitudinal

                                  (Std. err. adjusted for 7,248 clusters in i)
------------------------------------------------------------------------------
             |               Robust
           y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
           w |
   (1 vs 0)  |   .1864459   .0686422     2.72   0.007     .0518873    .3210046
-------------------

## **Regression Adjustment (RA) estimator with T=2 and time-invariant covariates**

Under the no anticipation (NA), conditional common trends (CCT), and overlap (OL) assumptions, we can apply standard methods for treatment effect analysis to the cross-section

$$ \{(\Delta Y_i , D_i, X_i): i=1, ..., N\}$$
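
A sketch of why this cross-section is enough, reading NA as no anticipation, CCT as conditional common trends, and OL as overlap (my reading of the acronyms). Add and subtract $Y_1(0)$ in the estimand:

$$ \tau_{2, att} = E[Y_2(1) - Y_1(0) | D = 1] - E[Y_2(0) - Y_1(0) | D = 1] $$

Under NA, $Y_1 = Y_1(0)$ for every unit, so the first term is $E[\Delta Y | D = 1]$. Under CCT, the untreated trend does not depend on $D$ once we condition on $X$:

$$ E[Y_2(0) - Y_1(0) | D = 1, X] = E[Y_2(0) - Y_1(0) | D = 0, X] = E[\Delta Y | D = 0, X] $$

and OL ensures the right-hand side is defined at the covariate values of the treated. Iterated expectations then give

$$ \tau_{2, att} = E[\Delta Y | D = 1] - E\big[\, E[\Delta Y | D = 0, X] \,\big|\, D = 1 \big] $$

Replacing the inner conditional mean with a linear regression fitted on the untreated units gives a regression adjustment estimator.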

### **Regression Adjustment estimator (imputation approach):**

$$ \hat\tau_{2, RA} = \Delta\bar Y_{treated} - (\hat\alpha_0 + \bar X_1\hat\beta_0)$$

where
- $\hat\alpha_0$ and $\hat\beta_0$ are estimated using untreated observations
- $\bar X_1$ is the average of the time-invariant covariate for the treated group
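
The imputation steps can be sketched in plain Python on simulated changes. All names and parameter values here are hypothetical, and the simulation deliberately lets both treatment and the untreated trend depend on the covariate, so that adjusting for it matters. In the patent example the simulated `x1` is unrelated to the change in `y`, which is why RA and DD nearly coincide there.

```python
import random
from statistics import mean

random.seed(2)
N = 4000
TRUE_ATT = 0.5

data = []  # one (D, x, delta_y) triple per unit
for _ in range(N):
    x = random.gauss(10, 2)
    p_treat = 0.4 if x > 10 else 0.2                  # treatment depends on x
    d = 1 if random.random() < p_treat else 0
    dy = 0.1 * x + TRUE_ATT * d + random.gauss(0, 1)  # untreated trend depends on x
    data.append((d, x, dy))

controls = [(x, dy) for d, x, dy in data if d == 0]
treated = [(x, dy) for d, x, dy in data if d == 1]

# Step 1: OLS of delta_y on x using untreated units only
xbar0 = mean(x for x, _ in controls)
ybar0 = mean(dy for _, dy in controls)
beta0 = (sum((x - xbar0) * (dy - ybar0) for x, dy in controls)
         / sum((x - xbar0) ** 2 for x, _ in controls))
alpha0 = ybar0 - beta0 * xbar0

# Step 2: impute the untreated change for each treated unit and average the gaps
tau_ra = mean(dy - (alpha0 + beta0 * x) for x, dy in treated)

# Same number from the closed-form expression
xbar1 = mean(x for x, _ in treated)
tau_formula = mean(dy for _, dy in treated) - (alpha0 + beta0 * xbar1)

print(f"RA estimate: {tau_ra:.3f}, closed form: {tau_formula:.3f}")
```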

In [33]:
%%stata
// Estimate parameters using non-treated obs only
reg D_y x1 if D==0
	
// Predict counter-factual
cap drop D_y_0
predict D_y_0 if D==1
	
// Estimate ATT by averaging differences
cap drop te_manual
gen te_manual = D_y - D_y_0 if D==1
sum te_manual
di "Estimated ATT is: " "`=strofreal(r(mean))'" 


. // Estimate parameters using non-treated obs only
. reg D_y x1 if D==0

      Source |       SS           df       MS      Number of obs   =     6,912
-------------+----------------------------------   F(1, 6910)      =      0.01
       Model |  .023394628         1  .023394628   Prob > F        =    0.9145
    Residual |  14028.9127     6,910  2.03023339   R-squared       =    0.0000
-------------+----------------------------------   Adj R-squared   =   -0.0001
       Total |  14028.9361     6,911  2.02994301   Root MSE        =    1.4249

------------------------------------------------------------------------------
         D_y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
          x1 |   -.000908   .0084583    -0.11   0.915    -.0174888    .0156729
       _cons |   .1232328   .0863374     1.43   0.154     -.046015    .2924806
-------------------------------------------------------

Now let's go over other ways to obtain $\hat\tau_{2, RA}$ in Stata. We focus on the approaches that yield valid standard errors.

### **`teffects` command on changes:**

In [34]:
%%stata
teffects ra (D_y x1) (D), atet
di "Estimated ATT is: " "`=strofreal(_b[ATET:r1vs0.D])'" 


. teffects ra (D_y x1) (D), atet



Iteration 0:   EE criterion =  5.325e-33  
Iteration 1:   EE criterion =  8.267e-36  

Treatment-effects estimation                    Number of obs     =      7,248
Estimator      : regression adjustment
Outcome model  : linear
Treatment model: none
------------------------------------------------------------------------------
             |               Robust
         D_y | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
           D |
   (1 vs 0)  |   .1864572   .0686257     2.72   0.007     .0519532    .3209612
-------------+----------------------------------------------------------------
POmean       |
           D |
          0  |    .114138   .0171348     6.66   0.000     .0805545    .1477216
------------------------------------------------------------------------------

. di "Estimated ATT is: " "`=strofreal(_b[ATET:r1vs0.D])'" 
Estimated ATT is: .1864572

. 


### **OLS using cross-section of changes:**


In [35]:
%%stata
qui: reg D_y i.D##c.x1 , vce(robust)
margins , dydx(i.D) subpop(D) vce(uncond)
matrix b = r(b)
di "Estimated ATT is: " "`=strofreal(b[1, 2])'"


. qui: reg D_y i.D##c.x1 , vce(robust)

. margins , dydx(i.D) subpop(D) vce(uncond)

Average marginal effects                               Number of obs   = 7,248
                                                       Subpop. no. obs =   336

Expression: Linear prediction, predict()
dy/dx wrt:  1.D

------------------------------------------------------------------------------
             |            Unconditional
             |      dy/dx   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           D |
          0  |          0  (empty)
          1  |   .1864572   .0686447     2.72   0.007     .0518936    .3210208
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

. matrix b = r(b)

. di "Estimated ATT is: " "`=strofreal(b[1, 2])'"
Estimated ATT is: .1864572

. 
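
A rough Python illustration of what the fully interacted regression plus `margins, dydx() subpop()` computes, as I understand it: the interacted model has the same fitted values as separate regressions by group, and the ATT is the average over treated units of the predicted outcome with $D=1$ minus the predicted outcome with $D=0$. The data and names are simulated and hypothetical, and the sketch covers the point estimate only, not the unconditional standard errors.

```python
import random
from statistics import mean


def ols(pairs):
    """Return (intercept, slope) from a simple regression of y on x."""
    xbar = mean(x for x, _ in pairs)
    ybar = mean(y for _, y in pairs)
    slope = (sum((x - xbar) * (y - ybar) for x, y in pairs)
             / sum((x - xbar) ** 2 for x, _ in pairs))
    return ybar - slope * xbar, slope


random.seed(3)
data = []  # one (D, x, delta_y) triple per unit
for _ in range(3000):
    x = random.gauss(10, 2)
    d = 1 if random.random() < 0.3 else 0
    dy = 0.1 * x + d * (0.5 + 0.05 * (x - 10)) + random.gauss(0, 1)
    data.append((d, x, dy))

treated = [(x, dy) for d, x, dy in data if d == 1]
controls = [(x, dy) for d, x, dy in data if d == 0]

# Separate regressions by group reproduce the fully interacted fit
a1, b1 = ols(treated)
a0, b0 = ols(controls)

# Average discrete change in the prediction over treated units
att_margins = mean((a1 + b1 * x) - (a0 + b0 * x) for x, _ in treated)

# Regression adjustment by imputation
att_ra = mean(dy - (a0 + b0 * x) for x, dy in treated)

print(f"interacted regression: {att_margins:.6f}, imputation: {att_ra:.6f}")
```

The two numbers agree because OLS residuals in the treated-group regression average to zero, so the mean prediction for the treated equals their mean observed change.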


### **OLS using panel:**

In [37]:
%%stata
qui: reg y i.w##c.x1 D c.D#c.x1 2.t 2.t#c.x1 , vce(cluster i)
margins , dydx(i.w) subpop(D) vce(uncond)
matrix b = r(b)
di "Estimated ATT is: " "`=strofreal(b[1, 2])'"



. qui: reg y i.w##c.x1 D c.D#c.x1 2.t 2.t#c.x1 , vce(cluster i)



. margins , dydx(i.w) subpop(D) vce(uncond)

Average marginal effects                              Number of obs   = 14,496
                                                      Subpop. no. obs =    672

Expression: Linear prediction, predict()
dy/dx wrt:  1.w

                                  (Std. err. adjusted for 7,248 clusters in i)
------------------------------------------------------------------------------
             |            Unconditional
             |      dy/dx   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         1.w |   .1864572    .068647     2.72   0.007      .051889    .3210254
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

. matrix b = r(b)

. di "Estimated ATT is: " "`=strofreal(b[1, 2])'"
Estimated ATT is: .1864572

. 
