# Panel Fixed Effects (PyMC)

{term}`Panel data` lets us follow the same units over time. Fixed effects regression
controls for time-invariant unit heterogeneity either by estimating unit-specific
intercepts or by applying the within transformation.
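
To make the within transformation concrete, here is a sketch for a linear model with a treatment indicator and one covariate on a balanced panel. The notation ($\alpha_i$ for unit effects, $\lambda_t$ for time effects, $D_{it}$ for treatment, $\tau$ and $\beta$ for the coefficients) is introduced here for illustration and is not taken from the CausalPy documentation. Bars denote averages over the periods observed for unit $i$.

$$
\begin{aligned}
y_{it} &= \alpha_i + \lambda_t + \tau D_{it} + \beta x_{it} + \varepsilon_{it} \\
\bar{y}_{i} &= \alpha_i + \bar{\lambda} + \tau \bar{D}_{i} + \beta \bar{x}_{i} + \bar{\varepsilon}_{i} \\
y_{it} - \bar{y}_{i} &= (\lambda_t - \bar{\lambda}) + \tau (D_{it} - \bar{D}_{i}) + \beta (x_{it} - \bar{x}_{i}) + (\varepsilon_{it} - \bar{\varepsilon}_{i})
\end{aligned}
$$

Subtracting the unit average removes $\alpha_i$ without estimating it, while $\tau$ and $\beta$ are unchanged. The remaining time term can be handled with time indicators or by demeaning over time as well.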

We introduce `PanelRegression`, show the dummy-variable and within estimators, and
connect the approach to Difference-in-Differences ({doc}`notebooks/did_pymc`).

:::{note}
This notebook uses small sampling settings (2 chains, 200 draws, 200 tuning steps) so
that it executes quickly. Increase them for real analyses.
:::


In [None]:
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import causalpy as cp

## Simulated panel data

We simulate a balanced panel with unit effects, time effects, and a treatment effect.
The goal is to recover the treatment effect while accounting for unit heterogeneity.


In [None]:
rng = np.random.default_rng(42)

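n_units = 30
n_periods = 8
# Long format: one row per (unit, time) pair
units = np.repeat(np.arange(n_units), n_periods)
times = np.tile(np.arange(n_periods), n_units)

# Time-invariant unit effects and unit-invariant time effects
unit_effects = rng.normal(0, 1.0, n_units)
time_effects = rng.normal(0, 0.5, n_periods)

x = rng.normal(0, 1.0, n_units * n_periods)
# Treatment is drawn at random for every unit-period, independently of the unit effects
treatment = rng.binomial(1, 0.4, n_units * n_periods)
true_effect = 1.5

noise = rng.normal(0, 0.5, n_units * n_periods)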

y = (
    unit_effects[units]
    + time_effects[times]
    + true_effect * treatment
    + 0.5 * x
    + noise
)

panel_df = pd.DataFrame(
    {
        "unit": units,
        "time": times,
        "x": x,
        "treatment": treatment,
        "y": y,
    }
)

panel_df.head()
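
In the simulation above, treatment is drawn independently of the unit effects, so a pooled regression that ignores units would also land near the true effect. The sketch below illustrates when fixed effects matter. It uses a hypothetical assignment rule, not part of this notebook, in which units with larger unit effects are treated more often, and compares a pooled OLS slope with a unit-demeaned (within) slope using plain NumPy. Time effects and the covariate are omitted to keep it short.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 200, 8
true_effect = 1.5

unit_effects = rng.normal(0.0, 1.0, n_units)

# Hypothetical assignment rule (an assumption for this sketch): units with
# larger unit effects are more likely to be treated in any given period.
p_treat = 1.0 / (1.0 + np.exp(-2.0 * unit_effects))
treatment = rng.binomial(
    1, p_treat[:, None], size=(n_units, n_periods)
).astype(float)

noise = rng.normal(0.0, 0.5, size=(n_units, n_periods))
y = unit_effects[:, None] + true_effect * treatment + noise


def slope(d, y):
    """OLS slope of y on d; both inputs must already be centered."""
    return float((d * y).sum() / (d * d).sum())


# Pooled OLS: center on the grand mean only, ignoring the unit structure.
pooled_estimate = slope(treatment - treatment.mean(), y - y.mean())

# Within estimator: center each unit's rows on that unit's own mean.
within_estimate = slope(
    treatment - treatment.mean(axis=1, keepdims=True),
    y - y.mean(axis=1, keepdims=True),
)

print(f"pooled: {pooled_estimate:.2f}")
print(f"within: {within_estimate:.2f}")
print(f"truth:  {true_effect}")
```

Under this assignment rule the pooled slope absorbs part of the unit effect and overshoots, while the within slope stays close to the true value.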

## Dummy-variable fixed effects

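Include unit and time indicators directly in the formula with `C(unit)` and `C(time)`.
This is straightforward and yields a coefficient for each unit that we can inspect, but
the design matrix gains one column per unit and per period.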
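
As a rough illustration of what the dummy-variable approach estimates, the sketch below builds the indicator columns by hand with `pd.get_dummies` and fits ordinary least squares with `np.linalg.lstsq`. It is a frequentist stand-in on a tiny toy panel with known unit intercepts, not a reproduction of how `PanelRegression` builds its design matrix or fits the model.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_units, n_periods = 5, 6

df = pd.DataFrame(
    {
        "unit": np.repeat(np.arange(n_units), n_periods),
        "time": np.tile(np.arange(n_periods), n_units),
    }
)
unit_intercepts = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # known, for checking
df["treatment"] = rng.binomial(1, 0.5, len(df)).astype(float)
df["y"] = (
    unit_intercepts[df["unit"].to_numpy()]
    + 1.5 * df["treatment"]
    + rng.normal(0.0, 0.1, len(df))
)

# One indicator column per unit and per period. The first level of each is
# dropped as the baseline so the columns are not collinear with the intercept.
design = pd.concat(
    [
        pd.Series(1.0, index=df.index, name="Intercept"),
        pd.get_dummies(df["unit"], prefix="unit", drop_first=True, dtype=float),
        pd.get_dummies(df["time"], prefix="time", drop_first=True, dtype=float),
        df[["treatment"]],
    ],
    axis=1,
)

coef, *_ = np.linalg.lstsq(design.to_numpy(), df["y"].to_numpy(), rcond=None)
coefs = pd.Series(coef, index=design.columns)

print(design.shape)  # (30, 11): 1 intercept + 4 unit + 5 time + 1 treatment
print(coefs.filter(like="unit_").round(2))
print(round(coefs["treatment"], 2))
```

Each `unit_k` coefficient is that unit's intercept relative to the baseline unit 0, so the values should sit near 1, 2, 3 and 4 here. The column count grows with the number of units and periods.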


In [None]:
dummies_model = cp.pymc_models.LinearRegression(
    sample_kwargs={
        "chains": 2,
        "draws": 200,
        "tune": 200,
        "target_accept": 0.9,
        "progressbar": False,
        "random_seed": 42,
    }
)

result_dummies = cp.PanelRegression(
    data=panel_df,
    formula="y ~ C(unit) + C(time) + treatment + x",
    unit_fe_variable="unit",
    time_fe_variable="time",
    fe_method="dummies",
    model=dummies_model,
)

result_dummies.summary()

## Within transformation

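The within estimator demeans the data by unit (and by time, if `time_fe_variable` is
provided), which avoids building a large dummy matrix. This makes it the better choice
for panels with many units, at the cost of not estimating the unit effects directly.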
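
The sketch below checks numerically that, on a balanced panel, two-way demeaning gives the same slope as a regression with unit and time dummies. The helper `two_way_demean` is hypothetical and written for this illustration. It is not CausalPy's implementation, and the one-pass formula it uses is exact only when every unit is observed in every period.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n_units, n_periods = 20, 5

df = pd.DataFrame(
    {
        "unit": np.repeat(np.arange(n_units), n_periods),
        "time": np.tile(np.arange(n_periods), n_units),
    }
)
df["x"] = rng.normal(size=len(df))
df["y"] = (
    rng.normal(size=n_units)[df["unit"].to_numpy()]      # unit effects
    + rng.normal(size=n_periods)[df["time"].to_numpy()]  # time effects
    + 0.5 * df["x"]
    + rng.normal(0.0, 0.5, len(df))
)


def two_way_demean(s, data):
    """Subtract unit and time means, then add back the grand mean."""
    unit_mean = s.groupby(data["unit"]).transform("mean")
    time_mean = s.groupby(data["time"]).transform("mean")
    return s - unit_mean - time_mean + s.mean()


# Within estimator: OLS slope on the demeaned variables (no intercept needed).
y_w = two_way_demean(df["y"], df)
x_w = two_way_demean(df["x"], df)
beta_within = float((x_w * y_w).sum() / (x_w**2).sum())

# Dummy-variable estimator: all unit dummies, time dummies minus a baseline.
design = pd.concat(
    [
        pd.get_dummies(df["unit"], prefix="u", dtype=float),
        pd.get_dummies(df["time"], prefix="t", drop_first=True, dtype=float),
        df[["x"]],
    ],
    axis=1,
).to_numpy()
beta_lsdv = float(np.linalg.lstsq(design, df["y"].to_numpy(), rcond=None)[0][-1])

print(beta_within, beta_lsdv)
```

The two slopes agree up to floating-point error, but the within version never forms the 24 dummy columns.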


In [None]:
within_model = cp.pymc_models.LinearRegression(
    sample_kwargs={
        "chains": 2,
        "draws": 200,
        "tune": 200,
        "target_accept": 0.9,
        "progressbar": False,
        "random_seed": 42,
    }
)

result_within = cp.PanelRegression(
    data=panel_df,
    formula="y ~ treatment + x",
    unit_fe_variable="unit",
    time_fe_variable="time",
    fe_method="within",
    model=within_model,
)

result_within.summary()

## Diagnostics and visualization

The plotting helpers focus on covariate coefficients, unit effects (for dummy models),
trajectory subsets, and residual diagnostics.

:::{note}
Within-transformed models plot demeaned outcomes and predictions.
:::


In [None]:
result_dummies.plot_coefficients()

result_dummies.plot_unit_effects(label_extreme=2)

result_dummies.plot_trajectories(n_sample=6, select="random")

result_dummies.plot_residuals(kind="scatter")

## Two-way fixed effects

Setting `time_fe_variable` adds time fixed effects alongside the unit effects, as both
models above do. This gives a two-way fixed effects (TWFE) model, which is closely
related to the specifications used in Difference-in-Differences settings
({doc}`notebooks/did_pymc`).
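
The link to Difference-in-Differences is exact in the simplest case. With two groups, two periods, and a balanced panel, the two-way fixed effects coefficient on the treatment indicator equals the difference-in-differences of group means. The sketch below verifies this on simulated data with NumPy. The data-generating values are assumptions chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_units = 40
treated = np.arange(n_units) < 20  # first half is treated in period 1

# Outcomes with unit effects, a common time shift of 0.3, and an effect of 1.5.
unit_effects = rng.normal(size=n_units)
y = np.empty((n_units, 2))
y[:, 0] = unit_effects + rng.normal(0.0, 0.5, n_units)
y[:, 1] = unit_effects + 0.3 + 1.5 * treated + rng.normal(0.0, 0.5, n_units)

# Treatment indicator: switched on only for treated units in period 1.
d = np.zeros((n_units, 2))
d[treated, 1] = 1.0

# Difference-in-differences of group means.
did = (y[treated, 1].mean() - y[treated, 0].mean()) - (
    y[~treated, 1].mean() - y[~treated, 0].mean()
)


def two_way_demean(a):
    """Remove row (unit) and column (time) means from a units-by-periods array."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()


# Two-way fixed effects slope via the within transformation.
d_w, y_w = two_way_demean(d), two_way_demean(y)
twfe = float((d_w * y_w).sum() / (d_w**2).sum())

print(did, twfe)
```

With more than two periods or staggered treatment timing, the two quantities are no longer guaranteed to coincide.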

For panel data foundations, see {footcite:p}`cunningham2021causal`.


## References

:::{bibliography}
:filter: docname in docnames
:::
