# SDID Example
hsujulia
30oct2022

This example implements Synthetic Difference-in-Differences (SDiD, hereafter) following [Arkhangelsky, Athey, Hirshberg, Imbens, and Wager (2021)](https://arxiv.org/pdf/1812.09970.pdf). Credit to the R package written by Hirshberg, [synthdid](https://synth-inference.github.io/synthdid/).

At a high level, we want to estimate the tuple $(\hat{\tau}^{sdid}, \hat{\mu}, \hat{\alpha}, \hat{\beta})$ that minimizes:
$$\sum^N_{i=1} \sum^T_{t=1} \big( Y_{it} - \mu - \alpha_i - \beta_t - W_{it} \tau \big)^2 \hat{\omega}^{sdid}_i \hat{\lambda}^{sdid}_t $$

where $\hat{\omega}^{sdid}_i$ are unit weights (one per unit $i$) and $\hat{\lambda}^{sdid}_t$ are time weights (one per period $t$).

For reference, SDiD, Difference-in-Differences (DiD), and synthetic control (SC) can all be written as (weighted) two-way fixed effects problems:
* SDiD: $(\hat{\tau}^{SDiD}, \hat{\mu} , \hat{\alpha}, \hat{\beta}) = \text{arg min} \Big \{ \sum^N_{i=1} \sum^T_{t=1} \big( Y_{it} - \mu - \alpha_i - \beta_t - W_{it} \tau \big)^2 \hat{\omega}^{sdid}_i \hat{\lambda}^{sdid}_t \Big \}$ 
* DiD: $(\hat{\tau}^{DiD}, \hat{\mu} , \hat{\alpha}, \hat{\beta}) = \text{arg min} \Big \{ \sum^N_{i=1} \sum^T_{t=1} \big( Y_{it} - \mu - \alpha_i - \beta_t - W_{it} \tau \big)^2 \Big \}$ 
* SC: $(\hat{\tau}^{SC}, \hat{\mu} , \hat{\alpha}, \hat{\beta}) = \text{arg min} \Big \{ \sum^N_{i=1} \sum^T_{t=1} \big( Y_{it} - \mu - \alpha_i - \beta_t - W_{it} \tau \big)^2 \hat{\omega}^{sc}_i  \Big \}$.
    - Note that if we use SC as defined by [Abadie, Diamond, and Hainmueller (2010)](https://web.stanford.edu/~jhain/Paper/JASA2010.pdf), the unit fixed effects are omitted, i.e. $\alpha_i = 0$ for all $i$.
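
To make the shared structure of these objectives concrete, here is a minimal sketch of the weighted two-way fixed effects regression, assuming the weights are already given. The function name `weighted_twfe_tau` and the `(N, T)` array layout are illustrative assumptions, not taken from the paper or from synthdid. With uniform weights it reduces to the DiD objective.

```python
import numpy as np

def weighted_twfe_tau(Y, W, omega=None, lam=None):
    """Hypothetical helper: estimate tau by weighted least squares.

    Y, W  : (N, T) arrays of outcomes and treatment indicators.
    omega : (N,) unit weights; lam : (T,) time weights.
    Leaving both as None gives uniform weights, i.e. plain DiD.
    """
    Y = np.asarray(Y, dtype=float)
    W = np.asarray(W, dtype=float)
    N, T = Y.shape
    omega = np.ones(N) if omega is None else np.asarray(omega, dtype=float)
    lam = np.ones(T) if lam is None else np.asarray(lam, dtype=float)

    # Row-major flattening: observation (i, t) sits at position i * T + t.
    unit = np.repeat(np.arange(N), T)
    time = np.tile(np.arange(T), N)

    # Design matrix: intercept (mu), unit dummies (alpha), time dummies (beta),
    # and the treatment indicator (tau). The first unit and period are dropped.
    X = np.column_stack([
        np.ones(N * T),
        (unit[:, None] == np.arange(1, N)).astype(float),
        (time[:, None] == np.arange(1, T)).astype(float),
        W.ravel(),
    ])

    # Weighted least squares: scale each row by sqrt(omega_i * lambda_t).
    sw = np.sqrt(np.outer(omega, lam).ravel())
    coef, *_ = np.linalg.lstsq(X * sw[:, None], Y.ravel() * sw, rcond=None)
    return coef[-1]
```

Passing estimated unit and time weights instead of the defaults gives the weighted objective; this sketch always includes unit fixed effects, so it does not reproduce the Abadie et al. variant of SC.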





## Implementation following SDiD Paper (Arkhangelsky et al.)
**Algorithm (Algorithm 1 in the SDiD paper)**

Notation:
1. Outcome $Y_{it}$ for unit $i$ in time period $t$.
2. For $t = 1,..., T_{pre}$, no units are treated. In the remaining $T_{post}$ periods, $t = T_{pre}+1,..., T$, some units are treated.
3. $W_{it}$ indicates the unit-time observations that are treated.
4. There are $N_{co}$ control units and $N_{tr}$ treated units.
5. Units are ordered so that $i=1,..., N_{co}$ are control units and $i = N_{co}+1,...,N$ are treated units.
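
As an illustration of this notation, the sketch below lays out a small simulated panel as an `(N, T)` array with control units in the first rows and pre-treatment periods in the first columns. The sizes and the random outcomes are made up purely for illustration.

```python
import numpy as np

# Illustrative sizes (assumptions, not from the paper).
N_co, N_tr = 4, 2
T_pre, T_post = 5, 3
N, T = N_co + N_tr, T_pre + T_post

# Placeholder outcomes Y_it: rows are units, columns are time periods.
rng = np.random.default_rng(0)
Y = rng.normal(size=(N, T))

# W_it = 1 only for treated units (last N_tr rows)
# in post-treatment periods (last T_post columns).
W = np.zeros((N, T), dtype=int)
W[N_co:, T_pre:] = 1

# Pre-treatment blocks used when estimating the weights.
Y_co_pre = Y[:N_co, :T_pre]   # control units, pre-treatment periods
Y_tr_pre = Y[N_co:, :T_pre]   # treated units, pre-treatment periods
```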

### 1. Compute regularization parameter $\zeta$ 
The regularization parameter $\zeta$ is used in the second step. It is computed directly from the data, without any optimization. Following the SDiD paper, it scales with the standard deviation $\hat{\sigma}$ of period-to-period changes in pre-treatment outcomes among the control (donor) units:
$$\zeta = (N_{tr} T_{post} )^{\frac{1}{4}} \hat{\sigma}, \qquad \hat{\sigma}^2 = \frac{1}{N_{co}(T_{pre}-1)}\sum^{N_{co}}_{i=1} \sum^{T_{pre}-1}_{t=1} (\Delta_{it} - \bar{\Delta})^2 $$
where $\Delta_{it} = Y_{i,t+1} - Y_{it}$, and
$\bar{\Delta} = \frac{1}{N_{co}(T_{pre}-1)}\sum^{N_{co}}_{i=1} \sum^{T_{pre}-1}_{t=1} \Delta_{it} $

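
A minimal sketch of this step is shown below. It follows the paper's definition, in which $\zeta$ is $(N_{tr} T_{post})^{1/4}$ times the standard deviation (the square root of the variance) of first differences among control units in the pre-treatment periods. The function name `compute_zeta` and the `(N, T)` layout with control units in the first rows are assumptions made for illustration.

```python
import numpy as np

def compute_zeta(Y, N_co, T_pre):
    """Hypothetical helper: regularization parameter zeta.

    Y is an (N, T) array with the N_co control units in the first rows
    and the T_pre pre-treatment periods in the first columns.
    """
    Y = np.asarray(Y, dtype=float)
    N, T = Y.shape
    N_tr, T_post = N - N_co, T - T_pre

    # Delta_it = Y_{i,t+1} - Y_it for control units, pre-treatment periods;
    # the result has shape (N_co, T_pre - 1).
    delta = np.diff(Y[:N_co, :T_pre], axis=1)

    # sigma_hat: square root of the mean squared deviation from Delta_bar.
    sigma_hat = np.sqrt(np.mean((delta - delta.mean()) ** 2))

    return (N_tr * T_post) ** 0.25 * sigma_hat
```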
### 2. Compute unit weights $\omega^{sdid}$ 
Let $\omega^{sdid} = (\omega_0, \omega^{sdid}_i)$ be an intercept and a set of weights on the control units. We estimate them so that the weighted average of control outcomes tracks the average treated outcome as closely as possible over the pre-treatment periods. An L2 penalty on the weights (the squared norm, not the absolute value), scaled by $\zeta$ from step 1, ensures a unique solution.

$$ (\hat{\omega}_0, \hat{\omega}^{sdid}_i) = \text{arg min} \Big\{ 
\sum^{T_{pre}}_{t=1} \Big(
    \omega_0 + \sum^{N_{co}}_{i=1} \omega_i Y_{it} - \frac{1}{ N_{tr}} \sum^N_{i = N_{co}+1} Y_{it}
\Big)^2 + \zeta^2 T_{pre} ||\omega||^2_2
\Big\} \quad \text{subject to} \quad \sum^{N_{co}}_{i=1} \omega_i = 1, \; \omega_i \geq 0$$ 
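
One way to solve this problem numerically is sketched below. It assumes, as in the paper, that the control weights are non-negative as well as summing to one, and that the penalty applies to the control weights only, not to the intercept. The solver is plain projected gradient descent onto the simplex, chosen here for brevity; it is a swapped-in technique and not necessarily the solver used by synthdid. The names `project_simplex` and `sdid_unit_weights` are hypothetical.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def sdid_unit_weights(Y, N_co, T_pre, zeta, n_iter=5000):
    """Hypothetical helper: intercept omega_0 and control-unit weights.

    Y is an (N, T) array with control units in the first N_co rows
    and pre-treatment periods in the first T_pre columns.
    """
    Y = np.asarray(Y, dtype=float)
    A = Y[:N_co, :T_pre].T              # (T_pre, N_co) control outcomes
    b = Y[N_co:, :T_pre].mean(axis=0)   # (T_pre,) average treated outcome

    # The optimal intercept for fixed weights is mean(b - A @ omega),
    # so it can be profiled out by centering over time.
    A_c = A - A.mean(axis=0)
    b_c = b - b.mean()

    ridge = zeta ** 2 * T_pre
    # Lipschitz constant of the gradient; 1 / L is a safe step size.
    L = 2.0 * (np.linalg.norm(A_c, 2) ** 2 + ridge)

    omega = np.full(N_co, 1.0 / N_co)   # start from uniform weights
    if L > 0:
        for _ in range(n_iter):
            grad = 2.0 * (A_c.T @ (A_c @ omega - b_c) + ridge * omega)
            omega = project_simplex(omega - grad / L)

    omega0 = float(np.mean(b - A @ omega))
    return omega0, omega
```

A fixed iteration count keeps the sketch short; a real implementation would likely add a convergence check. As $\zeta$ grows the penalty dominates and the weights move toward uniform, while $\zeta = 0$ gives the unpenalized fit.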

### 3. Compute time weights $\lambda^{sdid}_t$ 

### 4. Compute SDiD Estimator 