<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Import-packages-and-data" data-toc-modified-id="Import-packages-and-data-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Import packages and data</a></span></li><li><span><a href="#Synthetic-Controls-and-Diff-in-diff" data-toc-modified-id="Synthetic-Controls-and-Diff-in-diff-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Synthetic Controls and Diff-in-diff</a></span></li><li><span><a href="#Stacked-Diff-in-diff" data-toc-modified-id="Stacked-Diff-in-diff-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Stacked Diff-in-diff</a></span></li></ul></div>

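This notebook contains the R analysis for Goodman & Orchard (2021): synthetic control and synthetic diff-in-diff estimates, followed by group-time treatment effects.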

## Import packages and data

In [None]:
library(synthdid)
library(dplyr)
library(did)

## Synthetic Controls and Diff-in-diff

In [None]:
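# Load data and format
df <- read.csv('../data/temp_data/synth_prepped.csv')
# Keep months up to 22 and select columns by position, in the
# (unit, time, outcome, treatment) order that panel.matrices() reads by default
df <- df[df$months_since_treat <= 22, c(1, 5, 6, 4)]
# Shift the time index by 19 months
df$months_since_treat <- df$months_since_treat - 19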

df %>% head()
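
As a rough illustration of the layout this step prepares: `panel.matrices()` reads a long, balanced panel whose first four columns are unit, time, outcome, and treatment indicator, and returns a wide outcome matrix together with the number of control units and pre-treatment periods. The base-R sketch below mimics that reshaping on a toy data frame. The column names and values are made up for illustration and are not taken from `synth_prepped.csv`, and this is not the package's implementation.

```r
# Toy long panel; column names and values are hypothetical.
toy <- expand.grid(unit = c("A", "B", "C"), month = -2:1,
                   stringsAsFactors = FALSE)
toy$outcome <- seq_len(nrow(toy))
toy$treated <- as.integer(toy$unit == "C" & toy$month >= 0)

# Balanced panel: every unit appears exactly once in every period.
is_balanced <- all(table(toy$unit, toy$month) == 1)

# Wide outcome matrix: units in rows, periods in columns, control units first.
Y <- tapply(toy$outcome, list(toy$unit, toy$month), sum)
ever_treated <- tapply(toy$treated, toy$unit, max) == 1
Y <- Y[order(ever_treated), , drop = FALSE]

N0 <- sum(!ever_treated)                              # number of control units
T0 <- sum(tapply(toy$treated, toy$month, max) == 0)   # number of pre-treatment periods
```

If `is_balanced` is `FALSE` for the real data, the reshaping step is the first place to look.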

In [None]:
# Synthetic Control in Cook County

setup = panel.matrices(df)
tau.hat = sc_estimate(setup$Y, setup$N0, setup$T0)
se = sqrt(vcov(tau.hat, method='placebo'))
sprintf('point estimate: %1.2f', tau.hat)
sprintf('95%% CI (%1.2f, %1.2f)', tau.hat - 1.96 * se, tau.hat + 1.96 * se)
plot(tau.hat)

In [None]:
synthdid_units_plot(tau.hat)

In [None]:
# Synthetic DiD in Cook County

setup2 = panel.matrices(df)
tau.hat = synthdid_estimate(setup2$Y, setup2$N0, setup2$T0)
se = sqrt(vcov(tau.hat, method='placebo'))
sprintf('point estimate: %1.2f', tau.hat)
sprintf('95%% CI (%1.2f, %1.2f)', tau.hat - 1.96 * se, tau.hat + 1.96 * se)
plot(tau.hat)

In [None]:
synthdid_units_plot(tau.hat)
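
To make the link between the two estimators above concrete, both can be read as a weighted double difference of the outcome matrix: control units get weights `omega`, pre-treatment periods get weights `lambda`, and treated units and post-treatment periods are averaged uniformly. The sketch below evaluates that expression in base R on a toy matrix. The matrix and the hand-picked weights are assumptions for illustration; the real estimators choose the weights by optimization, which is not reproduced here.

```r
# Toy outcome matrix: rows are units (controls first), columns are periods.
Y <- rbind(c(1, 2, 3, 4),
           c(2, 3, 4, 5),
           c(3, 4, 8, 9))   # last row is the treated unit
N0 <- 2   # control units
T0 <- 2   # pre-treatment periods

# Weighted double difference of Y for given control-unit and pre-period weights.
weighted_did <- function(Y, N0, T0, omega, lambda) {
  N1 <- nrow(Y) - N0
  T1 <- ncol(Y) - T0
  unit_w <- c(-omega, rep(1 / N1, N1))
  time_w <- c(-lambda, rep(1 / T1, T1))
  as.numeric(t(unit_w) %*% Y %*% time_w)
}

# Uniform weights reproduce plain diff-in-diff.
tau_did <- weighted_did(Y, N0, T0,
                        omega = rep(1 / N0, N0), lambda = rep(1 / T0, T0))  # 3

# Zero time weights with non-uniform unit weights give a
# synthetic-control-style comparison of post-period levels.
tau_sc_like <- weighted_did(Y, N0, T0, omega = c(0, 1), lambda = rep(0, T0))  # 4
```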

## Stacked Diff-in-diff

We'll use the group-time average treatment effect estimator of Callaway and Sant'Anna here, as implemented in the `did` package.
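
As a minimal sketch of the idea, without covariates and with a never-treated comparison group, a group-time effect for group `g` at period `t` compares the change in the outcome since period `g - 1` among units first treated in `g` with the same change among never-treated units. The toy data and the helper `att_gt_by_hand` below are hypothetical and only illustrate that comparison; they do not replicate the package's estimation or inference.

```r
# Toy panel: 4 households, 3 periods; first_treat is 0 for never-treated units.
toy <- data.frame(
  id          = rep(1:4, each = 3),
  period      = rep(1:3, times = 4),
  first_treat = rep(c(2, 2, 0, 0), each = 3),
  y           = c(10, 14, 15,   # treated from period 2
                  12, 15, 18,   # treated from period 2
                   9, 10, 11,   # never treated
                  11, 12, 13)   # never treated
)

# Hypothetical helper: unconditional 2x2 comparison for one (g, t) pair.
att_gt_by_hand <- function(data, g, t) {
  base <- g - 1  # last pre-treatment period for group g
  change <- function(ids) {
    sapply(ids, function(i) {
      rows <- data[data$id == i, ]
      rows$y[rows$period == t] - rows$y[rows$period == base]
    })
  }
  treated_ids <- unique(data$id[data$first_treat == g])
  control_ids <- unique(data$id[data$first_treat == 0])
  mean(change(treated_ids)) - mean(change(control_ids))
}

att_2_2 <- att_gt_by_hand(toy, g = 2, t = 2)  # 2.5
att_2_3 <- att_gt_by_hand(toy, g = 2, t = 3)  # 3.5
```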

In [None]:
df2 <- read.csv('../data/gen_data/panelist_nutrition_month.csv')
df2 %>% head()

In [None]:
# encode locality variable

df2['locality'] %>% table()
df2['locality_num'] = as.numeric(as.factor(df2[['locality']]))
df2['locality_num'] %>% table()

In [None]:
df2['yearmonth_treat'] %>% table(useNA = "ifany")

In [None]:
# mark control obs in `yearmonth_treat`: `did` expects 0 in `gname` for never-treated units

df2[is.na(df2$yearmonth_treat), 'yearmonth_treat'] = 0

In [None]:
out <- att_gt(yname = "sugargrams",
              tname = "yearmonth",
              idname = "household_code",
              gname = "yearmonth_treat",
              xformla = ~1,
              data = df2,
              panel=TRUE,
              allow_unbalanced_panel=FALSE,
              control_group = "nevertreated",
              anticipation = 0,
              weightsname = NULL,
              alp = 0.05,
              bstrap = TRUE,
              cband = TRUE,
              biters = 1000,
              clustervars = "household_code",
              est_method = "reg",
              print_details = TRUE)

In [None]:
agg.simple <- aggte(out, type = 'simple')
summary(agg.simple)
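
For intuition about the `'simple'` aggregation, the overall effect can be thought of as an average of the post-treatment group-time effects, weighted by group size. The sketch below computes that average on made-up numbers. The effects and group sizes are hypothetical, and the standard errors reported by `summary()` are not reproduced.

```r
# Hypothetical post-treatment group-time effects (period >= group).
att <- data.frame(group  = c(2, 2, 3),
                  period = c(2, 3, 3),
                  att    = c(2.5, 3.5, 1.0))

# Hypothetical number of units in each treatment group.
group_size <- c("2" = 20, "3" = 10)

# Weight each effect by the size of its group, then normalize.
w <- group_size[as.character(att$group)]
simple_att <- sum(w * att$att) / sum(w)  # 2.6
```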