# Geo-Experimentation with Comparative Interrupted Time Series

This notebook demonstrates Comparative Interrupted Time Series (CITS) analysis using a geo-experimentation example. The analysis is based on the approach described in [Juan Orduz's blog post on time-based regression for geo-experiments](https://juanitorduz.github.io/time_based_regression_pymc/), which is partially based on the paper "Estimating Ad Effectiveness using Geo Experiments in a Time-Based Regression Framework" by Kerman, Wang & Vaver (2017).

## What is Comparative Interrupted Time Series?

**Comparative Interrupted Time Series (CITS)** extends the standard Interrupted Time Series (ITS) design by incorporating control units as predictors in the model. Because the controls absorb trends and shocks that affect treated and control units alike, the counterfactual is less likely to mistake a shared shock for a treatment effect.

Key characteristics of CITS:
- Uses the `InterruptedTimeSeries` class with control units as predictors
- Formula includes control unit observations: `treated ~ 1 + control_1 + control_2 + ...`
- Can include an intercept (unlike Synthetic Control)
- No sum-to-1 constraint on control weights (unlike Synthetic Control)
- Provides a middle ground between ITS and Synthetic Control methods

### Comparison: ITS vs CITS vs Synthetic Control

| Aspect | ITS | CITS | Synthetic Control |
|--------|-----|------|-------------------|
| **Control units** | None (or not in predictors) | Yes, as predictors | Yes, as predictors |
| **Intercept** | Typically yes | Yes | Typically no |
| **Weight constraint** | N/A | None | Sum to 1 |
| **Interpretation** | Counterfactual from time trends | Counterfactual from controls + trends | Weighted combination of controls |
| **Use case** | Single unit, strong trends | Multiple units, common trends | Multiple units, parallel trajectories |

:::{note}
When predictors in the model are **not** control units (e.g., just time, seasonality, or other covariates), you have standard **ITS**. When you include one or more control units as predictors, you have **CITS**.
:::
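
The difference between the two designs can be illustrated with a minimal sketch on synthetic data. It uses ordinary least squares from NumPy rather than CausalPy's Bayesian model, and every number in it (series levels, shock size, noise) is a made-up assumption. The treated series has no true treatment effect, but both series are hit by a shared shock in the post-period.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: 90 pre-period days, 30 post-period days.
n_pre, n_post = 90, 30
t = np.arange(n_pre + n_post)

# A shock shared by both units that starts in the post-period (e.g. a holiday).
common_shock = np.where(t >= n_pre, 2.0, 0.0)
control = 10 + 0.02 * t + common_shock + rng.normal(0, 0.2, t.size)

# The treated unit has NO true treatment effect here, so a good
# counterfactual should track it closely in the post-period.
treated = 1 + 1.1 * control + rng.normal(0, 0.2, t.size)

pre, post = t < n_pre, t >= n_pre


def fit_predict(X, y, train_mask):
    """Least-squares fit on the training rows, prediction for all rows."""
    coef, *_ = np.linalg.lstsq(X[train_mask], y[train_mask], rcond=None)
    return X @ coef


ones = np.ones(t.size)
X_its = np.column_stack([ones, t])         # treated ~ 1 + t
X_cits = np.column_stack([ones, control])  # treated ~ 1 + control

# Mean post-period gap between observed and predicted values.
err_its = np.mean(treated[post] - fit_predict(X_its, treated, pre)[post])
err_cits = np.mean(treated[post] - fit_predict(X_cits, treated, pre)[post])

print(f"Spurious 'effect' with time trend only (ITS): {err_its:.2f}")
print(f"Spurious 'effect' with control predictor (CITS): {err_cits:.2f}")
```

In this setup the time-trend model should attribute most of the shared shock to the treatment, while the model with the control as a predictor should report a gap close to zero.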

In [None]:
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import causalpy as cp

In [None]:
%load_ext autoreload
%autoreload 2
%config InlineBackend.figure_format = 'retina'
seed = 42
az.style.use("arviz-doc")

## The Geo-Experimentation Scenario

In this example, we analyze a marketing campaign experiment where:
- A company runs an advertising campaign in a subset of geographical regions (treated zipcodes)
- Other regions serve as controls (control zipcodes)
- The outcome of interest is the order rate (orders relative to population)
- The campaign runs for a specific time period
- We want to estimate the causal impact of the campaign on order rates

This is a common scenario in digital marketing, retail, and platform businesses where randomization at the user level is not feasible, but geographical or regional-level experiments are possible.

## Load and Prepare Data

The dataset contains synthetic data for 100 zipcodes (33 treatment, 67 control) over 122 days (April 1 - July 31, 2022). A marketing campaign was run in the treatment zipcodes from May 2 to May 31, 2022.

In [None]:
# Load the zipcode data
df_raw = cp.load_data("zipcodes")

# Convert date column to datetime
df_raw["date"] = pd.to_datetime(df_raw["date"])

# Display basic information
print(f"Date range: {df_raw['date'].min()} to {df_raw['date'].max()}")
print(f"Total zipcodes: {df_raw['zipcode'].nunique()}")
print(f"Treatment zipcodes: {df_raw[df_raw['variant']=='treatment']['zipcode'].nunique()}")
print(f"Control zipcodes: {df_raw[df_raw['variant']=='control']['zipcode'].nunique()}")

df_raw.head()

### Aggregate Data by Treatment Group

For CITS analysis, we aggregate the data across zipcodes within each group (treatment and control). This gives us:
- One time series for the treated unit (aggregated across all treatment zipcodes)
- One time series for the control unit (aggregated across all control zipcodes)

We'll scale the data by dividing by population to get comparable rates.
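
Summing orders and population before dividing gives a population-weighted rate, which is not the same as averaging per-zipcode rates. The toy numbers below are hypothetical and only illustrate the difference:

```python
import pandas as pd

# Hypothetical toy data: two zipcodes in one group on a single day.
toy = pd.DataFrame(
    {
        "zipcode": ["A", "B"],
        "orders": [10, 300],
        "population": [1_000, 100_000],
    }
)

# Pooled rate: total orders over total population.
pooled_rate = toy["orders"].sum() / toy["population"].sum() * 1000

# Unweighted mean of per-zipcode rates: each zipcode counts equally.
mean_of_rates = (toy["orders"] / toy["population"] * 1000).mean()

print(f"Pooled rate per 1000: {pooled_rate:.2f}")
print(f"Mean of zipcode rates per 1000: {mean_of_rates:.2f}")
```

The pooled rate is dominated by the large zipcode, whereas the unweighted mean lets the small zipcode pull the figure up. The aggregation in this notebook uses the pooled form.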

In [None]:
# Aggregate by date and variant
df_agg = (
    df_raw.groupby(["date", "variant"])
    .agg({"orders": "sum", "population": "sum"})
    .reset_index()
)

# Calculate scaled order rate (orders per 1000 population)
df_agg["order_rate_scaled"] = (df_agg["orders"] / df_agg["population"]) * 1000

# Pivot to get treatment and control in separate columns
df_pivot = df_agg.pivot(index="date", columns="variant", values="order_rate_scaled")
df_pivot.columns.name = None
df_pivot = df_pivot.reset_index()

# Rename columns for clarity
df_pivot.columns = ["date", "control", "treatment"]

# Set date as index
df_pivot = df_pivot.set_index("date")

print("Aggregated data shape:", df_pivot.shape)
df_pivot.head()

### Visualize Pre-Treatment Trends

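Before running the analysis, check that the treatment and control groups follow similar trends before the campaign starts. CITS assumes that the pre-period relationship between the two series would have continued to hold without the campaign, so the control is only a useful predictor if it tracks the treated series closely in the pre-period.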
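
Alongside a plot, two simple numbers can summarize the pre-period relationship: the correlation between the series, and the slope of the gap between them over time. The sketch below runs on a hypothetical stand-in frame (`toy`) with the same column names and date range as this notebook; it is an informal check under those assumptions, not a formal test.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical stand-in for the pivoted frame: daily rates for two groups.
dates = pd.date_range("2022-04-01", "2022-07-31", freq="D")
base = 5 + np.sin(np.arange(dates.size) * 2 * np.pi / 7)  # shared weekly pattern
toy = pd.DataFrame(
    {
        "control": base + rng.normal(0, 0.1, dates.size),
        "treatment": 0.5 + base + rng.normal(0, 0.1, dates.size),
    },
    index=dates,
)

# Restrict to the pre-campaign window before checking anything.
pre = toy.loc[toy.index < pd.Timestamp("2022-05-02")]

# 1. Do the two series move together?
pre_corr = pre["treatment"].corr(pre["control"])

# 2. Is the gap between them stable? Fit a line to the gap over time;
#    a slope near zero is consistent with parallel pre-treatment trends.
gap = (pre["treatment"] - pre["control"]).to_numpy()
days = np.arange(gap.size)
gap_slope, gap_intercept = np.polyfit(days, gap, deg=1)

print(f"Pre-period correlation: {pre_corr:.3f}")
print(f"Gap slope per day: {gap_slope:.4f}")
```

A high correlation with a drifting gap would still be a warning sign, which is why both quantities are worth inspecting.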

In [None]:
# Define the campaign period
campaign_start = pd.Timestamp("2022-05-02")
campaign_end = pd.Timestamp("2022-05-31")

# Plot the time series
fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(df_pivot.index, df_pivot["treatment"], label="Treatment", linewidth=2)
ax.plot(df_pivot.index, df_pivot["control"], label="Control", linewidth=2, alpha=0.7)
ax.axvline(campaign_start, color="red", linestyle="--", linewidth=2, label="Campaign start")
ax.axvline(campaign_end, color="orange", linestyle="--", linewidth=2, label="Campaign end")
ax.set_xlabel("Date")
ax.set_ylabel("Order rate (per 1000 population)")
ax.set_title("Treatment vs Control: Order Rates Over Time")
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## CITS Analysis with the InterruptedTimeSeries Class

Now we use CausalPy's `InterruptedTimeSeries` class to perform CITS analysis. The key difference from standard ITS is that we include the control unit as a predictor in our model formula:

```python
formula = "treatment ~ 1 + control"
```

This formula tells the model:
- Predict `treatment` (the treated unit's order rate)
- Using `control` (the control unit's order rate) as a predictor
- Plus an intercept (the `1`)

The model will learn the relationship between treatment and control in the pre-intervention period, then use this to forecast a counterfactual for the post-intervention period.
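
In equation form, and assuming the Gaussian linear model that the name `LinearRegression` suggests (priors are omitted, and the symbols are introduced here for illustration only), the pre-period fit and the resulting counterfactual can be written as:

$$
\begin{aligned}
y_t &= \beta_0 + \beta_1 x_t + \varepsilon_t, \quad \varepsilon_t \sim \mathcal{N}(0, \sigma^2), && t < T_0 \\
\hat{y}_t &= \beta_0 + \beta_1 x_t, && t \ge T_0 \\
\tau_t &= y_t - \hat{y}_t, && t \ge T_0
\end{aligned}
$$

Here $y_t$ is the treated order rate, $x_t$ is the control order rate, $T_0$ is the campaign start, $\hat{y}_t$ is the counterfactual, and $\tau_t$ is the estimated impact on day $t$. The cumulative impact up to day $t$ is the running sum $\sum_{s=T_0}^{t} \tau_s$.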

In [None]:
# Run CITS analysis
result = cp.InterruptedTimeSeries(
    df_pivot,
    treatment_time=campaign_start,
    formula="treatment ~ 1 + control",
    model=cp.pymc_models.LinearRegression(
        sample_kwargs={
            "target_accept": 0.95,
            "random_seed": seed,
            "progressbar": False,
            "draws": 2000,
        }
    ),
)

# Display model summary
result.summary()

### Interpret the Model Coefficients

The model coefficients tell us about the pre-intervention relationship between treatment and control:

- **Intercept**: The expected treated order rate when the control order rate is 0, that is, the constant offset between the two series after scaling by the control coefficient
- **Control coefficient**: How much the treated unit's order rate changes for each unit change in the control unit's order rate

A control coefficient close to 1 indicates that the two series moved one-for-one during the pre-period, with the intercept capturing any constant offset. A coefficient greater than 1 means the treated unit is more sensitive than the control to factors affecting both; a coefficient less than 1 means it is less sensitive.

## Visualize Results

The plot shows:
1. **Top panel**: Observed data (black dots) vs model predictions
   - Pre-intervention fit (blue)
   - Counterfactual prediction (orange) - what would have happened without the campaign
   - The gap between observed and counterfactual is the causal impact
2. **Middle panel**: The causal impact at each time point
3. **Bottom panel**: The cumulative causal impact over time

In [None]:
fig, ax = result.plot()
plt.tight_layout()
plt.show()

## Effect Summary

Let's quantify the causal effect of the campaign using the `effect_summary()` method. This provides:
- Average impact per time period
- Cumulative impact over the entire post-intervention period
- Highest Density Intervals (HDIs) for uncertainty quantification
- Tail probabilities for hypothesis testing

In [None]:
summary = result.effect_summary(direction="increase", alpha=0.05)
print(summary.text)
print("\nDetailed statistics:")
summary.table

### Interpretation

The effect summary tells us:
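- **Average impact**: The mean daily difference between observed and counterfactual order rates
- **Cumulative impact**: The sum of those daily differences (in orders per 1000 population) over the post-intervention period. Because only the campaign start is passed as `treatment_time`, this period runs from the campaign start to the end of the data, not just to the campaign end
- **Credible intervals (HDIs)**: The range of plausible values for the effect, given the model and the data
- **Tail probability**: The posterior probability that the effect lies in the hypothesized direction (here, an increase)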

These statistics provide evidence for decision-making about the campaign's effectiveness.
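
To make these quantities concrete, here is a minimal NumPy sketch of how such summaries can be derived from posterior draws of the daily impact. The draws are simulated, the `hdi` helper is a hypothetical stand-in written for this example, and none of this is CausalPy's internal implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical posterior draws of the daily impact: 4000 draws x 30 days.
# In a real analysis these would be observed minus counterfactual draws.
n_draws, n_days = 4000, 30
impact_draws = rng.normal(loc=0.3, scale=0.5, size=(n_draws, n_days))


def hdi(samples, prob=0.95):
    """Narrowest interval containing `prob` of the samples (unimodal case)."""
    s = np.sort(samples)
    n_in = int(np.ceil(prob * s.size))
    # Width of every candidate interval that spans n_in consecutive samples.
    widths = s[n_in - 1:] - s[: s.size - n_in + 1]
    start = int(np.argmin(widths))
    return s[start], s[start + n_in - 1]


# Summarize each draw first, then summarize across draws.
avg_per_draw = impact_draws.mean(axis=1)  # average daily impact
cum_per_draw = impact_draws.sum(axis=1)   # cumulative impact over the window

avg_mean, avg_hdi = avg_per_draw.mean(), hdi(avg_per_draw)
cum_mean, cum_hdi = cum_per_draw.mean(), hdi(cum_per_draw)

# Posterior probability that the average effect is an increase.
p_increase = (avg_per_draw > 0).mean()

print(f"Average impact: {avg_mean:.2f}, 95% HDI {avg_hdi}")
print(f"Cumulative impact: {cum_mean:.2f}, 95% HDI {cum_hdi}")
print(f"P(effect > 0): {p_increase:.3f}")
```

Summarizing within each draw before aggregating keeps the uncertainty of the average and cumulative effects coherent. The simulated draws here are independent across days, which real posterior draws generally are not, because all days share the same model parameters.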

## Model Diagnostics

It's important to check that our Bayesian model has converged properly and that the MCMC sampler has explored the posterior distribution effectively.
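
As background on what R-hat measures, the sketch below implements the classic Gelman-Rubin statistic on simulated chains. This is a simplified illustration: ArviZ reports a more robust rank-normalized split variant, and `basic_rhat` is a hypothetical helper written for this example.

```python
import numpy as np

rng = np.random.default_rng(3)


def basic_rhat(chains):
    """Classic Gelman-Rubin statistic for an array of shape (n_chains, n_draws)."""
    n_draws = chains.shape[1]
    # W: mean of the within-chain variances.
    within = chains.var(axis=1, ddof=1).mean()
    # B: variance of the chain means, scaled by the number of draws.
    between = n_draws * chains.mean(axis=1).var(ddof=1)
    pooled = (n_draws - 1) / n_draws * within + between / n_draws
    return float(np.sqrt(pooled / within))


# Four hypothetical chains that all sample the same distribution.
mixed = rng.normal(0, 1, size=(4, 1000))
# Four chains stuck around different values (poor mixing).
stuck = mixed + np.array([[0.0], [1.0], [2.0], [3.0]])

rhat_mixed = basic_rhat(mixed)
rhat_stuck = basic_rhat(stuck)

print(f"R-hat, well-mixed chains: {rhat_mixed:.3f}")
print(f"R-hat, stuck chains: {rhat_stuck:.3f}")
```

When chains agree, the between-chain variance adds almost nothing and the statistic stays near 1. When chains disagree, it rises well above 1, signalling that the sampler has not converged to a common distribution.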

In [None]:
# Check model diagnostics
# r_hat should be close to 1.0; ess_bulk and ess_tail should be large
az.summary(result.model.idata, var_names=["beta"])

## Key Takeaways

1. **CITS strengthens causal inference relative to plain ITS** by using control units to account for common trends and external shocks
2. **The `InterruptedTimeSeries` class is flexible** - it can handle standard ITS (no control predictors), CITS (control predictors), and various model specifications
3. **Pre-treatment fit is crucial** - a good model fit in the pre-period gives us confidence in the counterfactual predictions
4. **Uncertainty quantification matters** - Bayesian credible intervals tell us the range of plausible effects, not just a point estimate
5. **CITS sits between ITS and Synthetic Control** - it borrows strength from controls without the strict constraints of Synthetic Control

## When to Use CITS?

CITS is particularly useful when:
- You have good control units that track the treated unit in the pre-period
- You want to allow for an intercept shift between treatment and control
- You don't want to impose sum-to-1 constraints on control weights
- You have time-varying confounders that affect all units similarly
- You're analyzing geo-experiments, marketing tests, or policy evaluations with clear control groups

## References

:::{bibliography}
:filter: docname in docnames
:::