# Credible Answers to Hard Questions: Differences-in-Differences for Natural Experiments (coding exercises - STATA)


## Table of Contents

- [Chapter 1](#c-1)

- [Chapter 2](#c-2)

- [Chapter 3](#c-3)

- [Chapter 4](#c-4)

- [Chapter 5](#c-5)

- [Chapter 6](#c-6)

- [Chapter 7](#c-7)

- [Chapter 8](#c-8)



## Chapter 1 — Downloading and Preparing Data

This chapter covers the basic setup required to work with **all the datasets in the DiD textbook**.

By the end of this chapter, the student will be able to:
- define their working folder,
- download the book's datasets,
- open any database included in the repository.

### 1.1 Working Folder

Before starting, Stata needs to know which folder to work in.

**Instruction**
Replace the path in the following command with an existing folder on your computer.

In [None]:
version 17.0
clear all
set more off

* Change this path to a folder on your computer: 

cd "/Users/karlavega/Documents/GitHub/did_book"

### 1.2 Downloading the Data

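The replication data and files are downloaded from within Stata: `ssc describe` locates the package on SSC, and `net get` copies its ancillary files into the working folder.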

In [None]:
ssc describe cc_xd_didtextbook
net get cc_xd_didtextbook
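
Later chapters call several community-contributed commands that do not ship with Stata. As a minimal sketch, and assuming each package is hosted on SSC under the name used here, the loop below installs any that are missing. Commands that this document installs explicitly later (`pretrends`, `fect`, `sdid_event`) are left out, and others used in later chapters (for example `honestdid` or `stute_test`) may need to be installed from their own sources.

```stata
* Assumption: these packages are hosted on SSC under these names.
foreach pkg in reghdfe ftools twowayfeweights did_multiplegt_dyn ///
    eventstudyinteract avar csdid drdid did_imputation {
    capture which `pkg'              // non-zero return code if not installed
    if _rc ssc install `pkg', replace
}
```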

### 1.3 Loading Databases

From now on, each of the book's datasets is loaded with the `use` command.

Each chapter will indicate which dataset it corresponds to.
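
As a minimal sketch of that loading step, the block below checks that a dataset file exists before opening it. It reuses the Moser and Voena (2012) path from Chapter 3 and assumes the download created the `cc_xd_didtextbook_2025_9_30` folder inside the working folder; the local macro name `f` is arbitrary.

```stata
* Path taken from Chapter 3; adjust it if your download folder is named differently
local f "cc_xd_didtextbook_2025_9_30/Data sets/Moser and Voena 2012/moser_voena_didtextbook.dta"
confirm file "`f'"     // stops with an error if the file is not found
use "`f'", clear
describe, short        // number of observations and variables
```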

## Chapter 3 — Basic Difference in Differences

This chapter introduces the fundamental Difference-in-Differences (DiD) estimators:

- static TWFE regression,
- equivalence between TWFE and canonical DiD,
- event study and pre-trend test,
- extensions for evaluating assumptions and heterogeneity.

This chapter uses data from **Moser and Voena (2012)** to illustrate the basic Difference-in-Differences estimators.

Before starting the estimations, we load the corresponding dataset.

In [None]:
use "cc_xd_didtextbook_2025_9_30/Data sets/Moser and Voena 2012/moser_voena_didtextbook.dta", clear

### 3.1 Static TWFE Regression

We estimate a model with unit (`subclass`) and time (`year`) fixed effects.
Standard errors are clustered at the `subclass` level.

In [None]:
* Static TWFE
xtreg patents twea i.year, fe i(subclass) cluster(subclass)

* Equivalent implementation with reghdfe
reghdfe patents twea, absorb(subclass year) cluster(subclass)
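
Both commands fit the same two-way fixed effects specification. Written out, with $\alpha_s$ and $\gamma_t$ as notation introduced here for the subclass and year fixed effects:

$$
\text{patents}_{s,t} = \alpha_s + \gamma_t + \beta\,\text{twea}_{s,t} + \varepsilon_{s,t}
$$

The coefficient of interest is $\beta$. In `xtreg` the year effects enter as `i.year` and the subclass effects through `fe`; in `reghdfe` both are absorbed.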


### 3.2 Equivalence between TWFE and canonical DiD

In this design, where every treated subclass starts treatment at the same date, the coefficient on `twea` in the TWFE regression coincides with the canonical DiD estimator.

In [None]:
* Canonical DiD
reg patents treatmentgroup post twea, cluster(subclass)
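
To see the equivalence concretely, the canonical DiD can also be computed by hand from four cell means. This is a sketch that assumes `treatmentgroup` and `post` are 0/1 indicators and that `twea` equals their product; the scalar names `m00` to `m11` and `did` are introduced here for illustration.

```stata
* Assumes treatmentgroup and post are 0/1 and twea = treatmentgroup*post
forvalues g = 0/1 {
    forvalues p = 0/1 {
        * Mean outcome in the (group g, period p) cell
        quietly summarize patents if treatmentgroup == `g' & post == `p'
        scalar m`g'`p' = r(mean)
    }
}
* (treated post - treated pre) - (control post - control pre)
scalar did = (m11 - m10) - (m01 - m00)
display "DiD from cell means: " did
```

Under those assumptions the displayed value should match the coefficient on `twea` from the regression above, because that regression is saturated in the two indicators.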


### 3.3 Treatment Randomization Test

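This regression tests whether treated and control subclasses had different average patent counts before treatment, which should not be the case if treatment were as good as randomly assigned.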

In [None]:
* Pre-treatment period only
reg patents treatmentgroup if year<=1918, cluster(subclass)

### 3.4 Event-study TWFE

Dynamic effects are estimated using leads and lags. The key test is that the pre-treatment coefficients are jointly zero.

In [None]:
* Event-study TWFE
reg patents i.year treatmentgroup reltimeminus* reltimeplus*, cluster(subclass)

* Joint pre-trend test
test reltimeminus1 reltimeminus2 reltimeminus3 reltimeminus4 reltimeminus5 /// 
reltimeminus6 reltimeminus7 reltimeminus8 reltimeminus9 reltimeminus10 /// 
reltimeminus11 reltimeminus12 reltimeminus13 reltimeminus14 /// 
reltimeminus15 reltimeminus16 reltimeminus17 reltimeminus18
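
The same joint test can be written more compactly with `testparm`, which accepts wildcards. This sketch assumes that `reltimeminus*` matches exactly the 18 lead indicators listed above and that the event-study regression is still the active estimation result.

```stata
* Joint test that all lead coefficients are zero (wildcard form)
testparm reltimeminus*
```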

The event-study graph is constructed from the estimated coefficients and their confidence intervals.

In [None]:
reg patents reltimeminus* reltimeplus* i.year treatmentgroup, cluster(subclass)

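* After transposing r(table), each row is a coefficient and columns 1, 5 and 6
* hold the point estimate and the lower and upper confidence bounds.
* This assumes rows 1-18 are reltimeminus1-18 and rows 19-39 are reltimeplus1-21.
matrix temp = r(table)'
* res has one row per relative period (-18 to 21); row 19 is the reference period 0
matrix res = J(40,4,0)
matrix res[19,1]=0

* Leads: relative time -x is stored in row 19-x
forvalues x = 1/18 {
    matrix res[19-`x',1]=-`x'
    matrix res[19-`x',2]=temp[`x',1]
    matrix res[19-`x',3]=temp[`x',5]
    matrix res[19-`x',4]=temp[`x',6]
}
* Lags: relative time x is stored in row 19+x
forvalues x = 1/21 {
    matrix res[`x'+19,1]=`x'
    matrix res[`x'+19,2]=temp[`x'+18,1]
    matrix res[`x'+19,3]=temp[`x'+18,5]
    matrix res[`x'+19,4]=temp[`x'+18,6]
}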

matrix res_post = res["r19".."r40","c1".."c4"]

preserve
    drop _all
    svmat res
    twoway (scatter res2 res1) ///
           (line res2 res1) ///
           (rcap res4 res3 res1), ///
           title("TWFE Event-study estimates") ///
           xtitle("Relative time") ytitle("Effect")
restore


### 3.5 Event-study without pre-treatment periods

The model is re-estimated excluding the leads, allowing for comparison with the dynamic DiD in the post-treatment period.

In [None]:
reg patents i.yearpost treatmentgroup reltimeplus*, cluster(subclass)

### 3.6 Undetected Linear Pretrends

This section assesses whether the results could be explained by differential linear trends not detected by standard tests.

In [None]:
local github https://raw.githubusercontent.com
net install pretrends, from(`github'/mcaceresb/stata-pretrends/main) replace

reghdfe patents reltimeminus* reltimeplus*, absorb(treatmentgroup year) cluster(subclass)
pretrends power 0.5, numpre(6)

### 3.7 Variance of the long-term effect

The variance of the outcome is compared between the treated and control groups 14 years after treatment (1932).

In [None]:
* Year = 1932
sdtest diffpatentswrt1918 if year==1932, by(treatmentgroup)

scalar sd_effects = r(sd_2) - r(sd_1)

reg diffpatentswrt1918 treatmentgroup if year==1932
di _b[treatmentgroup] - 1.96*sd_effects , /// 
_b[treatmentgroup] + 1.96*sd_effects

### 3.8 Placebo tests

Placebo tests are performed to assess spurious heterogeneity before treatment.

In [None]:
* Placebo in a single year (1904)
sdtest diffpatentswrt1918 if year==1904, by(treatmentgroup)

* Placebos in all years
forvalues i = 1900/1939 { 
sdtest diffpatentswrt1918 if year==`i', by(treatmentgroup)
}
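
Running `sdtest` for every year prints a full table each time. As a more compact sketch, the loop below prints one line per pre-treatment year using the `r(F)` and `r(p)` results that `sdtest` stores. The year range 1900 to 1917 is an assumption: it treats 1918 as the reference year, in which `diffpatentswrt1918` is presumably zero for every subclass.

```stata
* One summary line per pre-treatment year (range 1900-1917 is an assumption)
forvalues i = 1900/1917 {
    quietly sdtest diffpatentswrt1918 if year == `i', by(treatmentgroup)
    display `i' "   F = " %7.3f r(F) "   two-sided p = " %6.4f r(p)
}
```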

## Chapter 4 — Extensions to the Basic DiD Model

This chapter extends the basic DiD estimators to address:

- controlling for baseline covariates,
- interactive fixed effects,
- synthetic control methods,
- sensitivity analysis to pretrend violations.

### 4.1 Estimators with Controls

The initial level of patents (`patents1900`) is controlled for, allowing its effect to vary over time.

In [None]:
* TWFE controlling for patents in 1900
reghdfe patents reltimeminus* reltimeplus*, /// 
absorb(year#patents1900 treatmentgroup) cluster(subclass)

* Joint pre-trend test
test reltimeminus1 reltimeminus2 reltimeminus3 reltimeminus4 reltimeminus5 /// 
reltimeminus6 reltimeminus7 reltimeminus8 reltimeminus9 reltimeminus10 /// 
reltimeminus11 reltimeminus12 reltimeminus13 reltimeminus14 /// 
reltimeminus15 reltimeminus16 reltimeminus17 reltimeminus18

The event-study graph for the model with controls is then constructed.

In [None]:
reghdfe patents reltimeminus* reltimeplus*, ///
    absorb(year#patents1900 treatmentgroup) cluster(subclass)

matrix temp = r(table)'
matrix res = J(40,4,0)
matrix res[19,1]=0

forvalues x = 1/18 {
    matrix res[19-`x',1]=-`x'
    matrix res[19-`x',2]=temp[`x',1]
    matrix res[19-`x',3]=temp[`x',5]
    matrix res[19-`x',4]=temp[`x',6]
}
forvalues x = 1/21 {
    matrix res[`x'+19,1]=`x'
    matrix res[`x'+19,2]=temp[`x'+18,1]
    matrix res[`x'+19,3]=temp[`x'+18,5]
    matrix res[`x'+19,4]=temp[`x'+18,6]
}

preserve
    drop _all
    svmat res
    twoway (scatter res2 res1) ///
           (line res2 res1) ///
           (rcap res4 res3 res1), ///
           title("TWFE event-study with controls")
restore


We check whether the baseline covariate is correlated with treatment group status.

In [None]:
reg patents1900 treatmentgroup if year==1900

A dynamic DiD is estimated, controlling non-parametrically for the baseline covariate.

In [None]:
did_multiplegt_dyn patents subclass year twea, ///
    effects(21) placebo(18) trends_nonparam(patents1900)

### 4.2 Interactive Fixed Effects

Unobserved factors are allowed to vary over time, with unit-specific loadings.

In [None]:
net install fect, from(https://raw.githubusercontent.com/xuyiqing/fect_stata/master/) replace
ssc install _gwtmean, replace

* Selection of the number of factors
fect patents, treat(twea) unit(subclass) time(year) method("ife") r(4) cv

* Final estimate
set seed 1
fect patents, treat(twea) unit(subclass) time(year) method("ife") r(2) se

The dynamic effects estimated by IFE are compared with those obtained under TWFE.

In [None]:
matrix res_ife = J(21,4,0)
forvalues x = 1/21 {
    matrix res_ife[`x',1]=`x'
    matrix res_ife[`x',2]=e(ATTs)[`x'+19,3]
    matrix res_ife[`x',3]=e(ATTs)[`x'+19,6]
    matrix res_ife[`x',4]=e(ATTs)[`x'+19,7]
}

* res_post was built in Section 3.4 (relative times 0 to 21); keep only periods 1 to 21
matrix res_post = res_post[2..22,1..4]

preserve
    drop _all
    svmat res_ife
    svmat res_post
    twoway (scatter res_ife2 res_ife1) ///
           (line res_ife2 res_ife1) ///
           (line res_post2 res_post1), ///
           title("IFE vs TWFE")
restore


### 4.3 Synthetic Control

Dynamic effects are estimated using synthetic control, with bootstrap-based inference.

In [None]:
ssc install sdid_event, replace

set seed 1
sdid_event patents subclass year twea, method("sc") brep(200)


The exercise is repeated after subtracting each unit's pre-treatment mean from the outcome.

In [None]:
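* Pre-treatment mean per unit, filled in for every year of that unit
* (an `if year<=1918` qualifier would leave post-treatment rows missing)
bys subclass: egen pre_mean = mean(cond(year<=1918, patents, .))
gen patents_demeaned = patents - pre_mean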

set seed 1
sdid_event patents_demeaned subclass year twea, method("sc") brep(200)


### 4.4 Synthetic DiD

DiD and synthetic control are combined into a single estimator.

In [None]:
set seed 1
sdid_event patents subclass year twea, brep(200)


### 4.5 Sensitivity Analysis (Rambachan and Roth)

The robustness of the estimated effects to potential violations of parallel trends is evaluated.

In [None]:
reghdfe patents reltimeminus* reltimeplus*, ///
    absorb(year treatmentgroup) cluster(subclass)

matrix l_vec = J(21,1,0)
matrix l_vec[14,1]=1

honestdid, pre(1/18) post(19/39) mvec(0.5(0.5)2) l_vec(l_vec)


The sensitivity analysis is repeated using a single lead and a single lag, 14 years before and after treatment (1904 and 1932).

In [None]:
preserve
    keep if year==1918 | year==1904 | year==1932
    reghdfe patents reltimeminus14 reltimeplus14, ///
        absorb(year treatmentgroup) cluster(subclass)
    honestdid, pre(1) post(2) mvec(2(2)10)
restore


## Chapter 5 — TWFE and Weight Decomposition

This chapter examines how TWFE estimators combine comparisons between units and periods, using data from **Gentzkow et al. (2011)**.

Before we begin, we load the dataset for this chapter.

In [None]:
use "cc_xd_didtextbook_2025_9_30/Data sets/gentzkow et al 2011/gentzkowetal_didtextbook.dta", clear


### 5.1 TWFE Regression

A standard TWFE model is estimated and the estimator is decomposed into its implicit weights.

In [None]:
* TWFE estimation
areg prestout i.year numdailies, absorb(cnty90) cluster(cnty90)

* Weight decomposition
twowayfeweights prestout cnty90 year numdailies, type(feTR)
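
As a schematic reminder of what the decomposition reports, the TWFE coefficient can be written as a weighted sum of cell-level treatment effects. The symbols below are notation introduced here, not taken from the exercises: $\Delta_{g,t}$ is the treatment effect in county $g$ and year $t$, and $W_{g,t}$ is its weight, with the sum running over treated cells.

$$
\beta_{fe} = E\Big[\sum_{(g,t)} W_{g,t}\,\Delta_{g,t}\Big], \qquad \sum_{(g,t)} W_{g,t} = 1
$$

The weights sum to one, but some of them can be negative, which is why `twowayfeweights` reports how many weights are positive and how many are negative.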

### 5.2 TWFE with State-Specific Trends

Each state is allowed to have its own time trend, which controls for unobserved state-level heterogeneity that varies over time.

In [None]:
* Create trend dummies by state
qui tab styr, gen(styr)

* Estimation with specific trends
qui areg prestout i.year i.styr numdailies, absorb(cnty90) cluster(cnty90)
di _b[numdailies], _se[numdailies]

* Decomposition with controls
twowayfeweights prestout cnty90 year numdailies, ///
type(feTR) controls(styr1-styr683)

### 5.3 First Difference Regression

The model is estimated using first differences with specific trends, and the estimator weights are re-analyzed.

In [None]:
* First Difference Estimation
areg changeprestout changedailies, absorb(styr) cluster(cnty90)

* Weight Decomposition
twowayfeweights changeprestout cnty90 year changedailies numdailies, ///
    type(fdTR) controls(styr1-styr683)

* Correlation Test between Weights and Time
twowayfeweights changeprestout cnty90 year changedailies numdailies, ///
    type(fdTR) controls(styr1-styr683) test_random_weights(year)

## Chapter 6 — Multiple Cohorts and Alternative Estimators

This chapter analyzes data with **staggered treatments** and compares different Difference-in-Differences estimators proposed in the recent literature.

Before starting, the dataset used in this chapter is loaded.

In [None]:
use "cc_xd_didtextbook_2025_9_30/Data sets/Wolfers 2006/wolfers2006_didtextbook.dta", clear

### 6.1 Static TWFE Regression

A standard TWFE model with state and year fixed effects is estimated,
weighted by state population.

In [None]:
* Static TWFE
reg div_rate udl i.state i.year [w=stpop], vce(cluster state)
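
As in Section 3.1, the same specification can be fit by absorbing the fixed effects with `reghdfe`. This is a sketch that assumes `[w=stpop]` is interpreted as analytic weights, which is the default for `regress`. The point estimate on `udl` should agree, while the standard errors may differ slightly because the two commands apply different degrees-of-freedom adjustments.

```stata
* Equivalent implementation with reghdfe (assumes analytic weights)
reghdfe div_rate udl [aweight=stpop], absorb(state year) vce(cluster state)
```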

### 6.2 Decomposition of the TWFE estimator

The implicit weights of the TWFE estimator are analyzed and it is assessed whether they are correlated with the duration of exposure.

In [None]:
twowayfeweights div_rate state year udl, ///
    type(feTR) test_random_weights(exposurelength) weight(stpop)


### 6.3 Test of randomization in treatment timing

This assesses whether the timing of treatment initiation can be considered exogenous.

In [None]:
reg div_rate i.early_late_never if cohort!=1956 & year<=1968 ///
    [w=stpop], vce(cluster state)

### 6.4 Event-study TWFE

Dynamic effects are estimated using an event-study TWFE and pre-trends are tested.

In [None]:
reg div_rate rel_time* i.state i.year [w=stpop], vce(cluster state)

* Joint pre-trend test
test rel_timeminus1 rel_timeminus2 rel_timeminus3 rel_timeminus4 /// 
rel_timeminus5 rel_timeminus6 rel_timeminus7 rel_timeminus8 rel_timeminus9

### 6.5 Decomposition of the First Dynamic Effect

The first coefficient of the event study is decomposed
to analyze which comparisons identify it.

In [None]:
twowayfeweights div_rate state year rel_time1, ///
    type(feTR) test_random_weights(year) weight(stpop) ///
    other_treatments(rel_time2-rel_time16) ///
    controls(rel_timeminus1-rel_timeminus9)


### 6.6 Sun and Abraham (2021) Estimator

An event study that is robust to heterogeneous effects is estimated for staggered treatments, using adoption cohorts.

In [None]:
replace cohort = . if cohort==0

eventstudyinteract div_rate rel_time* [aweight=stpop], ///
    absorb(i.state i.year) ///
    cohort(cohort) control_cohort(controlgroup) ///
    vce(cluster state)


### 6.7 Callaway and Sant’Anna (2021) Estimator

Dynamic average effects are estimated by cohort, using never-treated and not-yet-treated groups as controls (the `notyet` option).

In [None]:
replace cohort = 0 if cohort==.

csdid div_rate [weight=stpop], ///
    ivar(state) time(year) gvar(cohort) ///
    notyet agg(event)


### 6.8 de Chaisemartin and D’Haultfoeuille Estimator

Dynamic effects are estimated allowing for heterogeneity across cohorts and over time.

In [None]:
did_multiplegt_dyn div_rate state year udl, ///
    effects(16) placebo(9) weight(stpop)


### 6.9 Borusyak et al. (2021) Estimator

The dynamic effect is estimated by imputation, under parallel-trends assumptions.


In [None]:
replace cohort = . if cohort==0

did_imputation div_rate state year cohort ///
    [aweight=stpop], horizons(0/15) autosample minn(0) pre(9)


## Chapter 7 — Continuous Treatments and Robustness Tests

This chapter analyzes DiD estimators with a continuous treatment (`ntrgap`) and studies various pre-trend and robustness tests, using data from **Pierce and Schott (2016)**.


Before starting, the dataset for this chapter is loaded.

In [None]:
use "cc_xd_didtextbook_2025_9_30/Data sets/Pierce and Schott 2016/pierce_schott_didtextbook.dta", clear

### 7.1 TWFE Regressions

Simple TWFE regressions are estimated for different years, treating the NTR gap as a continuous treatment.

In [None]:
reg delta2001 ntrgap, vce(hc2, dfadjust)
reg delta2002 ntrgap, vce(hc2, dfadjust)
reg delta2004 ntrgap, vce(hc2, dfadjust)
reg delta2005 ntrgap, vce(hc2, dfadjust)


### 7.2 Weight Analysis

The implicit weights of the estimator are analyzed when the treatment is continuous.

In [None]:
twowayfeweights delta2001 indusid cons ntrgap ntrgap, type(fdTR)


### 7.3 Treatment Randomization Test

This test assesses whether the NTR gap is correlated with pre-treatment characteristics.

In [None]:
reg ntrgap lemp1997 lemp1998 lemp1999 lemp2000, vce(hc2, dfadjust)


### 7.4 Stute Test

The non-parametric Stute test is implemented to evaluate the validity of the DiD design.

In [None]:
stute_test delta2001 ntrgap, seed(1)
stute_test delta2002 ntrgap, seed(1)
stute_test delta2004 ntrgap, seed(1)
stute_test delta2005 ntrgap, seed(1)


A joint post-treatment effect test is performed.

In [None]:
preserve
    reshape long delta deltalintrend, i(indusid) j(year)
    stute_test delta ntrgap indusid year if year>=2001, seed(1)
restore


We assess the presence of *quasi-stayers* in the treatment intensity.


In [None]:
sort ntrgap
scalar stat_test_qs = ntrgap[1] / (ntrgap[2] - ntrgap[1])
di stat_test_qs
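
Read directly off the code, the statistic compares the smallest treatment intensity with the gap between the two smallest intensities. Writing $D_{(1)} \le D_{(2)}$ for the two smallest values of `ntrgap` after sorting (notation introduced here):

$$
T = \frac{D_{(1)}}{D_{(2)} - D_{(1)}}
$$

The statistic is undefined when the two smallest values coincide, in which case Stata returns a missing value.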


### 7.5 Pre-trend (linear) tests

Differences in trends are evaluated before treatment.


In [None]:
reg delta1999 ntrgap, vce(hc2, dfadjust)
reg delta1998 ntrgap, vce(hc2, dfadjust)
reg delta1997 ntrgap, vce(hc2, dfadjust)


### 7.6 Pre-trends with industry-specific trends

Industry-specific linear trends are permitted, both in parametric and non-parametric form.

In [None]:
* Parametric
reg deltalintrend1998 ntrgap, vce(hc2, dfadjust)
reg deltalintrend1997 ntrgap, vce(hc2, dfadjust)

* Non-parametric
stute_test deltalintrend1998 ntrgap, order(0) seed(1)
stute_test deltalintrend1997 ntrgap, order(0) seed(1)

Joint test of pre-trends with specific trends.

In [None]:
preserve
    reshape long delta deltalintrend, i(indusid) j(year)
    stute_test deltalintrend ntrgap indusid year if year<=1998, order(0) seed(1)
restore

### 7.7 Stute Test with Linear Trends (Post)

The robustness of the post-treatment effects is evaluated, allowing for linear trends.

In [None]:
stute_test deltalintrend2001 ntrgap, seed(1)
stute_test deltalintrend2002 ntrgap, seed(1)
stute_test deltalintrend2004 ntrgap, seed(1)
stute_test deltalintrend2005 ntrgap, seed(1)


Joint test of post-treatment effects with linear trends.

In [None]:
preserve
    reshape long delta deltalintrend, i(indusid) j(year)
    stute_test deltalintrend ntrgap indusid year if year>=2001, seed(1)
restore


### 7.8 Estimators with Linear Trends

Effects are estimated while explicitly allowing
for industry-specific linear trends.

In [None]:
reg deltalintrend2001 ntrgap, vce(hc2, dfadjust)
reg deltalintrend2002 ntrgap, vce(hc2, dfadjust)
reg deltalintrend2004 ntrgap, vce(hc2, dfadjust)
reg deltalintrend2005 ntrgap, vce(hc2, dfadjust)


## Chapter 8 — Dynamics with Lagged Treatments

This chapter examines DiD models with **dynamic treatments** and **lagged effects**, again using data from **Gentzkow et al. (2011)**.

The dataset from Chapter 5 is reloaded to ensure a clean environment before estimation.

In [None]:
use "cc_xd_didtextbook_2025_9_30/Data sets/gentzkow et al 2011/gentzkowetal_didtextbook.dta", clear

### 8.1 Is the change in newspapers as-good-as-random?

We evaluate whether the change in the number of newspapers can be considered exogenous, conditional on lagged variables.

In [None]:
reg changedailies lag_numdailies, cluster(cnty90)
reg changedailies lag_ishare_urb, cluster(cnty90)


### 8.2 TWFE with lagged treatments

A TWFE model is estimated that includes both the contemporaneous treatment and its lag.

In [None]:
* Estimation
areg prestout i.year numdailies lag_numdailies, /// 
absorb(cnty90) cluster(cnty90)

* Weight decomposition
twowayfeweights prestout cnty90 year numdailies, /// 
other_treatments(lag_numdailies) type(feTR)

twowayfeweights prestout cnty90 year lag_numdailies, /// 
other_treatments(numdailies) type(feTR)

### 8.3 Unnormalized Event Study

Unnormalized dynamic effects are estimated using the de Chaisemartin and D’Haultfoeuille estimator.

In [None]:
did_multiplegt_dyn prestout cnty90 year numdailies, ///
    effects(4) placebo(4) effects_equal(all)


### 8.4 Trajectory Decomposition

The individual trajectories that make up the unnormalized average effects are analyzed.

In [None]:
did_multiplegt_dyn prestout cnty90 year numdailies, ///
    effects(1) design(0.8,console) graph_off

did_multiplegt_dyn prestout cnty90 year numdailies, ///
    effects(2) design(0.8,console) graph_off

did_multiplegt_dyn prestout cnty90 year numdailies, ///
    effects(4) design(0.8,console) graph_off


### 8.5 Normalized Event Study

Dynamic effects are normalized to facilitate interpretation and comparison between periods.

In [None]:
did_multiplegt_dyn prestout cnty90 year numdailies, ///
    effects(4) placebo(4) normalized normalized_weights ///
    effects_equal(all)


### 8.6 Lagged Effects of Treatment Test

This test assesses whether treatment lags affect the outcome.

In [None]:
did_multiplegt_dyn prestout cnty90 year numdailies ///
    if year<=first_change | same_treat_after_first_change==1, ///
    effects(2) effects_equal(all) same_switchers graph_off


### 8.7 Estimators without lagged treatment effects

Models are estimated that assume the absence of effects of past treatments on the current outcome.

In [None]:
egen election_number = group(year)

did_multiplegt_stat prestout cnty90 election_number numdailies, ///
    placebo(1) exact_match

tab lag_numdailies if year==first_change
tab lag_numdailies if changedailies!=0 & changedailies!=. & year!=1868
