# DiD Hypothesis Testing: Opioid Policy Interventions in Florida and Washington

This notebook uses Difference-in-Differences (DiD) regressions to test four hypotheses about opioid policy interventions in Florida (2010) and Washington (2012). For each hypothesis, we fit an OLS model and print the statsmodels regression summary.

**Hypotheses:**
1. Florida: Policy effect on opioid volume per capita
2. Florida: Policy effect on overdose mortality rate
3. Washington: Policy effect on opioid volume per capita
4. Washington: Policy effect on overdose mortality rate

---

## 1. Import Required Libraries
Import pandas, numpy, and statsmodels for data handling and regression modeling, plus matplotlib and seaborn for plotting.

In [1]:
# Imports and display settings
import pandas as pd
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt
import seaborn as sns

pd.set_option("mode.copy_on_write", True)
sns.set_style("whitegrid")

## 2. Load Cleaned and Merged Data
Load the cleaned and merged dataset used for the DiD analysis (`final_merged_150k.csv`) with pandas.

In [2]:
# Load the final merged dataset (150K threshold, end-use buyers)
df = pd.read_csv("../01_data/clean/final_merged_150k.csv")
print(f"Loaded {len(df)} observations")

Loaded 566 observations


## 3. Prepare Data for DiD Analysis

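For each hypothesis, we create `treated`, `post`, and interaction (`treated_post`) indicator variables. We subset the data to each treated state and its control states for 2006 through 2015, and code the years after the policy year (2010 for Florida, 2012 for Washington) as the post-policy period, so the policy year itself counts as pre-policy.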
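
The indicator construction described here can also be wrapped in a small helper. The sketch below is illustrative only: `add_did_indicators` and the toy frame are hypothetical and not part of the notebook, and the helper assumes the same convention as the notebook's code, where only years strictly after the policy year count as post-policy.

```python
import pandas as pd


def add_did_indicators(frame, treated_state, policy_year,
                       state_col="STNAME", year_col="Year"):
    """Return a copy of `frame` with treated, post, and treated_post columns."""
    out = frame.copy()
    # 1 for rows belonging to the treated state, 0 for control states.
    out["treated"] = (out[state_col] == treated_state).astype(int)
    # Assumption: the policy year itself is treated as pre-policy.
    out["post"] = (out[year_col] > policy_year).astype(int)
    # Interaction term whose coefficient is the DiD estimate.
    out["treated_post"] = out["treated"] * out["post"]
    return out


# Toy data (hypothetical), one pre and one post year per state.
toy = pd.DataFrame(
    {
        "STNAME": ["Florida", "Florida", "Georgia", "Georgia"],
        "Year": [2009, 2011, 2009, 2011],
    }
)
toy_did = add_did_indicators(toy, treated_state="Florida", policy_year=2010)
print(toy_did)
```

Only the Florida row from 2011 ends up with `treated_post` equal to 1, which is the group-period cell the DiD coefficient isolates.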

In [3]:
# Prepare DiD variables and outcome variables for Florida and Washington
fl_control_states = [
    "North Carolina",
    "South Carolina",
    "Georgia",
    "Tennessee",
    "Mississippi",
]
wa_control_states = ["Montana", "Oregon", "Idaho", "Colorado", "California"]

# Florida DiD variables
fl_did = df[
    (df["STNAME"].isin(["Florida"] + fl_control_states))
    & (df["Year"] >= 2006)
    & (df["Year"] <= 2015)
].copy()
fl_did["treated"] = (fl_did["STNAME"] == "Florida").astype(int)
fl_did["post"] = (fl_did["Year"] > 2010).astype(int)
fl_did["treated_post"] = fl_did["treated"] * fl_did["post"]

# Washington DiD variables
wa_did = df[
    (df["STNAME"].isin(["Washington"] + wa_control_states))
    & (df["Year"] >= 2006)
    & (df["Year"] <= 2015)
].copy()
wa_did["treated"] = (wa_did["STNAME"] == "Washington").astype(int)
wa_did["post"] = (wa_did["Year"] > 2012).astype(int)
wa_did["treated_post"] = wa_did["treated"] * wa_did["post"]

# Create outcome variables
fl_did["MME_per_capita"] = fl_did["TOTAL_MME"] / fl_did["population"]
fl_did["mortality_per_100k"] = fl_did["Deaths"] / fl_did["population"] * 100000
wa_did["MME_per_capita"] = wa_did["TOTAL_MME"] / wa_did["population"]
wa_did["mortality_per_100k"] = wa_did["Deaths"] / wa_did["population"] * 100000

## 4. Florida: Opioid Volume DiD Model

Fit an OLS regression model for opioid volume per capita in Florida and its control states using the DiD specification. Display the regression summary output.
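
In this specification the coefficient on `treated_post` is the DiD estimate: the pre-to-post change in the treated group's mean minus the pre-to-post change in the control group's mean. The sketch below checks that identity on synthetic data. The data-generating process and all numbers are hypothetical, and `np.linalg.lstsq` stands in for `smf.ols` only to keep the example dependency-light; the point estimates are the same.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 400

# Synthetic indicators (hypothetical data, not the notebook's dataset).
toy = pd.DataFrame(
    {"treated": rng.integers(0, 2, n), "post": rng.integers(0, 2, n)}
)
toy["treated_post"] = toy["treated"] * toy["post"]
# Assumed true interaction effect of -3, plus noise.
toy["y"] = (
    10 + 2 * toy["treated"] + 1 * toy["post"]
    - 3 * toy["treated_post"] + rng.normal(0, 1, n)
)

# OLS on [1, treated, post, treated_post].
X = np.column_stack(
    [np.ones(n), toy["treated"], toy["post"], toy["treated_post"]]
)
beta, *_ = np.linalg.lstsq(X, toy["y"].to_numpy(), rcond=None)

# Same quantity from the four group-period means.
means = toy.groupby(["treated", "post"])["y"].mean()
did_from_means = (means.loc[(1, 1)] - means.loc[(1, 0)]) - (
    means.loc[(0, 1)] - means.loc[(0, 0)]
)

print("OLS interaction coefficient:", beta[3])
print("Difference of mean changes: ", did_from_means)
```

The two printed values agree up to floating-point error because the model with both indicators and their interaction fits each of the four cell means exactly.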

In [4]:
# Fit and display OLS regression for Florida opioid volume (DiD)
model_fl_opioid = smf.ols(
    formula="MME_per_capita ~ treated + post + treated_post", data=fl_did
).fit()
print(model_fl_opioid.summary())

                            OLS Regression Results                            
Dep. Variable:         MME_per_capita   R-squared:                       0.043
Model:                            OLS   Adj. R-squared:                  0.034
Method:                 Least Squares   F-statistic:                     4.890
Date:                Thu, 11 Dec 2025   Prob (F-statistic):            0.00244
Time:                        22:36:22   Log-Likelihood:                -2746.8
No. Observations:                 331   AIC:                             5502.
Df Residuals:                     327   BIC:                             5517.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                   coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------
Intercept     1026.7592     99.840     10.284   

## 5. Florida: Overdose Mortality DiD Model

Fit an OLS regression model for overdose mortality rate in Florida and its control states using the DiD specification. Display the regression summary output.
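
The models in this notebook use the default nonrobust covariance. When the same unit is observed in several years, its errors are usually correlated across years, and cluster-robust standard errors are a common alternative. The sketch below shows how that could look in statsmodels on synthetic panel data. The `unit` identifier and all numbers are hypothetical, and the appropriate cluster variable for the real dataset is an assumption to be decided by the analyst.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Synthetic panel: 30 hypothetical units observed 2006-2015, 10 of them treated.
rows = []
for unit in range(30):
    unit_effect = rng.normal(0, 2)  # shared across a unit's years
    treated = int(unit < 10)
    for year in range(2006, 2016):
        post = int(year > 2010)
        y = 12 + unit_effect - 2 * treated * post + rng.normal(0, 1)
        rows.append({"unit": unit, "treated": treated, "post": post, "y": y})
toy = pd.DataFrame(rows)
toy["treated_post"] = toy["treated"] * toy["post"]

formula = "y ~ treated + post + treated_post"
plain = smf.ols(formula, data=toy).fit()
# Same point estimates, but standard errors allow correlation within each unit.
clustered = smf.ols(formula, data=toy).fit(
    cov_type="cluster", cov_kwds={"groups": toy["unit"]}
)

print(pd.DataFrame({"nonrobust_se": plain.bse, "clustered_se": clustered.bse}))
```

The coefficients are identical in both fits; only the standard errors, and therefore the t-statistics and p-values, change.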

In [5]:
# Fit and display OLS regression for Florida overdose mortality (DiD)
model_fl_mortality = smf.ols(
    formula="mortality_per_100k ~ treated + post + treated_post", data=fl_did
).fit()
print(model_fl_mortality.summary())

                            OLS Regression Results                            
Dep. Variable:     mortality_per_100k   R-squared:                       0.058
Model:                            OLS   Adj. R-squared:                  0.049
Method:                 Least Squares   F-statistic:                     6.130
Date:                Thu, 11 Dec 2025   Prob (F-statistic):           0.000468
Time:                        22:36:22   Log-Likelihood:                -996.57
No. Observations:                 300   AIC:                             2001.
Df Residuals:                     296   BIC:                             2016.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                   coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------
Intercept       12.9416      0.755     17.146   

## 6. Washington: Opioid Volume DiD Model

Fit an OLS regression model for opioid volume per capita in Washington and its control states using the DiD specification. Display the regression summary output.
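
The DiD specification relies on treated and control groups following parallel trends before the policy. One simple diagnostic, sketched below on synthetic data, is to fit a linear trend on pre-policy rows only and inspect the `treated:year_c` interaction. This check is not part of the notebook; the variable names `unit`, `year_c`, and `y` and all numbers are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Synthetic panel with a common linear trend and no differential pre-trend.
rows = []
for unit in range(40):
    treated = int(unit < 10)
    for year in range(2006, 2016):
        y = 1000 + 20 * (year - 2006) + 50 * treated + rng.normal(0, 30)
        rows.append({"unit": unit, "Year": year, "treated": treated, "y": y})
toy = pd.DataFrame(rows)

policy_year = 2012
# Keep rows up to and including the policy year (coded as pre-policy here).
pre = toy[toy["Year"] <= policy_year].copy()
pre["year_c"] = pre["Year"] - policy_year  # centered year for readability

# A treated:year_c coefficient far from zero would suggest diverging pre-trends.
pretrend = smf.ols("y ~ treated + year_c + treated:year_c", data=pre).fit()
print(pretrend.params["treated:year_c"], pretrend.pvalues["treated:year_c"])
```

A small, statistically insignificant interaction is consistent with parallel pre-trends but does not prove them, so a plot of group means by year is a useful companion check.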

In [6]:
# Fit and display OLS regression for Washington opioid volume (DiD)
model_wa_opioid = smf.ols(
    formula="MME_per_capita ~ treated + post + treated_post", data=wa_did
).fit()
print(model_wa_opioid.summary())

                            OLS Regression Results                            
Dep. Variable:         MME_per_capita   R-squared:                       0.033
Model:                            OLS   Adj. R-squared:                  0.020
Method:                 Least Squares   F-statistic:                     2.618
Date:                Thu, 11 Dec 2025   Prob (F-statistic):             0.0517
Time:                        22:36:22   Log-Likelihood:                -1812.9
No. Observations:                 235   AIC:                             3634.
Df Residuals:                     231   BIC:                             3648.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                   coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------
Intercept     1169.6629     49.312     23.720   

## 7. Washington: Overdose Mortality DiD Model

Fit an OLS regression model for overdose mortality rate in Washington and its control states using the DiD specification. Display the regression summary output.
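
The statsmodels formula interface drops rows with missing values in the model variables by default, which is why the mortality and volume models can report different observation counts on the same data frame. A quick way to see how many usable rows each group-period cell contributes is sketched below; the toy frame and its missing values are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical) with some missing mortality values.
toy = pd.DataFrame(
    {
        "treated": [1, 1, 1, 1, 0, 0, 0, 0],
        "post": [0, 0, 1, 1, 0, 0, 1, 1],
        "mortality_per_100k": [11.0, np.nan, 9.5, np.nan,
                               10.2, 10.8, np.nan, 12.1],
    }
)

# Count total rows and rows with a non-missing outcome in each DiD cell.
usable = (
    toy.assign(has_outcome=toy["mortality_per_100k"].notna())
    .groupby(["treated", "post"])["has_outcome"]
    .agg(rows="size", usable="sum")
)
print(usable)
```

Cells with very few usable rows make the corresponding DiD estimate imprecise, so this count is worth checking before interpreting a coefficient.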

In [7]:
# Fit and display OLS regression for Washington overdose mortality (DiD)
model_wa_mortality = smf.ols(
    formula="mortality_per_100k ~ treated + post + treated_post", data=wa_did
).fit()
print(model_wa_mortality.summary())

                            OLS Regression Results                            
Dep. Variable:     mortality_per_100k   R-squared:                       0.029
Model:                            OLS   Adj. R-squared:                 -0.004
Method:                 Least Squares   F-statistic:                    0.8897
Date:                Thu, 11 Dec 2025   Prob (F-statistic):              0.450
Time:                        22:36:22   Log-Likelihood:                -248.79
No. Observations:                  93   AIC:                             505.6
Df Residuals:                      89   BIC:                             515.7
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                   coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------
Intercept       10.2771      0.718     14.314   

## 8. Summary Table: Full Regression Results


| State      | Outcome                    | Level Change | p-value  |
|------------|----------------------------|--------------|----------|
| Florida    | Shipments (MME per capita) | -320.02      | 0.148    |
| Florida    | Deaths (per 100k)          | -3.86        | 0.0166*  |
| Washington | Shipments (MME per capita) | 186.35       | 0.311    |
| Washington | Deaths (per 100k)          | 1.98         | 0.205    |

\* p<0.05, \*\* p<0.01, \*\*\* p<0.001. Standard errors are not clustered in this summary.
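
A table like this can be assembled from fitted results rather than copied by hand, which avoids transcription errors. The sketch below is a minimal illustration under stated assumptions: `did_row` is a hypothetical helper, the model is fit on synthetic data, and it assumes the reported level change is the `treated_post` coefficient.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf


def did_row(state, outcome, result, term="treated_post"):
    """Collect the coefficient and p-value for `term` from a fitted model."""
    return {
        "State": state,
        "Outcome": outcome,
        "Level Change": result.params[term],
        "p-value": result.pvalues[term],
    }


# Synthetic data (hypothetical) standing in for a real DiD frame.
rng = np.random.default_rng(3)
toy = pd.DataFrame(
    {"treated": rng.integers(0, 2, 200), "post": rng.integers(0, 2, 200)}
)
toy["treated_post"] = toy["treated"] * toy["post"]
toy["y"] = 5 - 1.5 * toy["treated_post"] + rng.normal(0, 1, 200)

toy_model = smf.ols("y ~ treated + post + treated_post", data=toy).fit()

# One row per fitted model; add more did_row(...) calls for other models.
summary = pd.DataFrame([did_row("Toy state", "y", toy_model)])
print(summary.round(4))
```

With the notebook's own models, the same helper could be called once per fitted result, for example with `model_fl_opioid` and `model_fl_mortality`.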