# Topic Overview 
Differences-in-differences (DID) is a widely used method for causal inference that helps isolate treatment effects when both time and group differences may confound the outcome. By comparing the before-and-after change in a treated group to the change in a similar untreated group, DID estimates the treatment effect as the additional change experienced by the treated group beyond any general time trend or baseline group difference. The method relies on the crucial assumption that, in the absence of treatment, both groups would have followed parallel trends. While this counterfactual can’t be observed directly, analysts can examine pre-treatment patterns and conduct diagnostic tests to assess its plausibility. DID is especially useful in real-world contexts where randomized experiments are not feasible, making it a cornerstone technique in policy evaluation and observational studies.

## Learning Objectives 
- Apply differences-in-differences to data. 
- Understand and test the parallel trends assumption. 
- Review some regression topics from Week 7. 

## 1.1 Lesson: Core Concepts and Assumptions of Differences-in-Differences
In differences-in-differences, we have two back doors (confounding paths): time and group. If we naively compare the treated group after treatment with some other set of observations, we do not know whether the difference is partly due to time (other factors changed between the earlier and later times and influenced the outcome) or partly due to group (the treatment and control groups differ in ways unrelated to the treatment).

However, we can close both back doors at once. Suppose the time back door operates the same way on both groups and the group back door operates the same way at both times. We then have four sets of samples (2 groups × 2 times) and four quantities to estimate: a base level, a time difference, a group difference, and a treatment effect. Four cell averages are enough to pin down four quantities. Essentially, the change in the treated group across the two times equals the change in the untreated group across the two times, plus the desired effect.
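The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a full estimator: the four cell means are invented numbers, and `did_from_means` is a hypothetical helper name rather than part of any library.

```python
# Hypothetical average outcomes for each (group, time) cell.
# Every number here is invented purely for illustration.
means = {
    ("untreated", "before"): 10.0,  # base level
    ("untreated", "after"): 12.0,   # base + time difference
    ("treated", "before"): 13.0,    # base + group difference
    ("treated", "after"): 18.0,     # base + time + group + treatment effect
}


def did_from_means(m):
    """Recover the four quantities from the four cell means."""
    base = m[("untreated", "before")]
    time_diff = m[("untreated", "after")] - base
    group_diff = m[("treated", "before")] - base
    treated_change = m[("treated", "after")] - m[("treated", "before")]
    # The treated group's change minus the untreated group's change.
    effect = treated_change - time_diff
    return {"base": base, "time": time_diff, "group": group_diff, "effect": effect}


result = did_from_means(means)
print(result)
```

With these made-up numbers, the treated group changes by 5 and the untreated group by 2, so the estimated effect is 3. Adding the four recovered quantities back together reproduces the treated-after mean of 18.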

### Untreated Groups and Parallel Trends
The parallel trends assumption says that if no treatment had occurred, the change in outcome over time in the treatment group would have been the same as the change in outcome over time in the untreated group. In other words, the difference between treated and untreated groups would have been the same at the earlier and later times. 
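
One way to write this down, using potential-outcomes notation that this lesson does not otherwise introduce (so treat the symbols as an assumption): let $Y_{g,t}(0)$ be the outcome that group $g$ would have at time $t$ if no treatment occurred. Parallel trends can then be sketched as

$$E[Y_{\text{treated},\text{after}}(0)] - E[Y_{\text{treated},\text{before}}(0)] = E[Y_{\text{untreated},\text{after}}(0)] - E[Y_{\text{untreated},\text{before}}(0)]$$

Only the first term on the left-hand side is never observed, because the treated group actually received the treatment at the later time.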

Parallel trends cannot be verified from the data because we can’t observe what “would have happened.” It is a statement about a counterfactual, so you have to judge whether it is plausible.

Some checks that inform this judgment:
1. We should have no reason to expect the untreated group to change around treatment time. 
2. The treated and untreated groups should be generally similar, apart from the treatment itself. 
3. The treated and untreated groups should have generally similar trajectories prior to the treatment.

Item 3 involves looking at trends over time, if you have data from several pre-treatment periods. You could graph the prior trends and compare them by eye. You could also perform a placebo test: restrict the data to pre-treatment periods, pretend the treatment happened at some earlier time, and estimate differences-in-differences there. If the placebo test detects an effect where no treatment occurred, the two groups were already diverging before the real treatment, which suggests that parallel trends is violated.
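
A placebo test along these lines might look like the sketch below. The data are invented, the helper names `cell_mean` and `did` are hypothetical, and the sketch ignores sampling noise: a real placebo test would also report a standard error or confidence interval rather than only a point estimate.

```python
from statistics import mean

# Hypothetical observations as (group, period, outcome); all values are invented.
# Periods 0-2 are before the real treatment, which takes effect in period 3.
observations = [
    ("untreated", 0, 10.0), ("untreated", 1, 11.0),
    ("untreated", 2, 12.0), ("untreated", 3, 13.0),
    ("treated", 0, 14.0), ("treated", 1, 15.0),
    ("treated", 2, 16.0), ("treated", 3, 20.0),
]


def cell_mean(data, group, period):
    """Average outcome for one group in one period."""
    return mean(y for g, t, y in data if g == group and t == period)


def did(data, before, after):
    """Change in the treated group minus change in the untreated group."""
    treated_change = cell_mean(data, "treated", after) - cell_mean(data, "treated", before)
    untreated_change = cell_mean(data, "untreated", after) - cell_mean(data, "untreated", before)
    return treated_change - untreated_change


# Placebo: keep only pre-treatment periods and pretend treatment began in period 2.
pre_only = [row for row in observations if row[1] < 3]
placebo_effect = did(pre_only, before=1, after=2)

# Real comparison: last pre-treatment period versus the treated period.
real_effect = did(observations, before=2, after=3)

print("placebo:", placebo_effect, "real:", real_effect)
```

In this toy data both groups rise by one unit per period before treatment, so the placebo estimate is zero while the real estimate is not. A placebo estimate clearly different from zero would be the warning sign described above.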

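You can also do DID on a log scale, that is, with the log of the outcome as the dependent variable. This assumes a different kind of parallelism: equal proportional changes rather than equal absolute changes. For example, a linear trend like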

$$Y_{it} = \beta_i + \beta_1 \cdot t + \varepsilon_{it}$$ 

is different from a linear trend like

$$\log (Y_{it}) = \beta_i + \beta_1 \cdot t + \varepsilon_{it}$$
 
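If one group increases from 1 to 10 and the other from 10 to 100, both have grown tenfold, so on a log scale the trends are parallel. In levels they are not: one group gained 9 and the other gained 90. The two versions of the assumption cannot generally both hold, so it’s important to choose the scale that matches how the outcome actually evolves before doing DID.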
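
The 1-to-10 versus 10-to-100 example can be checked numerically. The sketch below assumes no treatment occurred at all, labels the two groups "untreated" and "treated" only for convenience, and uses a hypothetical helper `did` with an optional transform; base-10 logs are used so the numbers come out round.

```python
import math

# Outcomes from the example above; neither group is actually treated here.
before = {"untreated": 1.0, "treated": 10.0}
after = {"untreated": 10.0, "treated": 100.0}


def did(before, after, transform=lambda y: y):
    """DID on two cell values per group, after applying an optional transform."""
    treated_change = transform(after["treated"]) - transform(before["treated"])
    untreated_change = transform(after["untreated"]) - transform(before["untreated"])
    return treated_change - untreated_change


level_did = did(before, after)             # (100 - 10) - (10 - 1)
log_did = did(before, after, math.log10)   # (2 - 1) - (1 - 0)

print("levels:", level_did, "logs:", log_did)
```

On the log scale the estimate is zero, matching the fact that nothing was treated. In levels the same data produce a large spurious "effect", which is what choosing the wrong scale can do.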

### Two-Way Fixed Effects
The simplest way to estimate DID is via a regression:

$$Y = \alpha_{\text{group}} + \alpha_{\text{time}} + \beta_1 \cdot \text{Treated} + \varepsilon