# Marginal Effects and IRR in Count Data Models

Count data models (Poisson, Negative Binomial) share the same conditional mean function
$E[Y|X] = \exp(X'\beta)$, which implies that marginal effects are **not constant**: they
scale with the expected count $\exp(X'\beta)$ at each observation.

Two interpretation frameworks are available:

- **Marginal effects (AME/MEM)**: change in expected count per unit change in $x$
- **Incidence Rate Ratios (IRR)**: multiplicative factor on expected count, $\text{IRR}_k = \exp(\beta_k)$

Both are correct; the choice depends on the research context.

---

**Table of Contents**

- [Section 1: The Marginal Effect Formula for Count Models](#section-1)
- [Section 2: Poisson AME and MEM](#section-2)
- [Section 3: Incidence Rate Ratios (IRR)](#section-3)
- [Section 4: Overdispersion — Poisson vs Negative Binomial](#section-4)
- [Section 5: Fixed Effects Poisson](#section-5)
- [Section 6: Zero-Inflated Models — Doctor Visits](#section-6)

**Prerequisites**: Notebook 01 (ME Fundamentals), knowledge of Poisson/NegBin basics.  
**Datasets**: `patents` (firm-level patent counts), `doctor_visits` (German health data).  
**Level**: Intermediate-Advanced | **Duration**: 60-75 minutes

In [None]:
# Cell 2 — Setup and Imports
import sys
import os
import warnings
warnings.filterwarnings('ignore')

sys.path.insert(0, '/home/guhaase/projetos/panelbox')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import matplotlib.patches as mpatches

# Count models (array-based API)
from panelbox.models.count.poisson import PooledPoisson, PoissonFixedEffects
from panelbox.models.count.negbin import NegativeBinomial

# Marginal effects for count models
from panelbox.marginal_effects.count_me import (
    compute_poisson_ame, compute_poisson_mem,
    compute_negbin_ame, compute_negbin_mem
)

# Tutorial utilities
utils_path = '/home/guhaase/projetos/panelbox/examples/marginal_effects/utils'
sys.path.insert(0, utils_path)
from data_loaders import load_dataset
from me_helpers import format_me_table

# Output directories
OUT_PLOTS  = '/home/guhaase/projetos/panelbox/examples/marginal_effects/outputs/plots'
OUT_TABLES = '/home/guhaase/projetos/panelbox/examples/marginal_effects/outputs/tables'
os.makedirs(OUT_PLOTS,  exist_ok=True)
os.makedirs(OUT_TABLES, exist_ok=True)

plt.style.use('seaborn-v0_8-whitegrid')
pd.set_option('display.float_format', '{:.4f}'.format)
print('Setup complete.')
print(f'Output plots  : {OUT_PLOTS}')
print(f'Output tables : {OUT_TABLES}')

## Section 1: The Marginal Effect Formula for Count Models <a name="section-1"></a>

Both Poisson and Negative Binomial share the same conditional mean:

$$E[Y|X] = \exp(X'\beta) = \lambda(X)$$

Taking the partial derivative:

$$\frac{\partial E[Y|X]}{\partial x_k} = \beta_k \cdot \exp(X'\beta) = \beta_k \cdot \lambda(X)$$

**Key implications:**

- The ME is **not constant**: it grows with the predicted count $\lambda$
- A firm with a predicted count of 100 patents has an ME 50 times that of a firm with a predicted count of 2
- **Average Marginal Effect (AME)**: $\text{AME}_k = \beta_k \times \frac{1}{N}\sum_i \exp(X_i'\beta) = \beta_k \times \bar{\hat{y}}$, where $\bar{\hat{y}}$ is the mean fitted count (equal to $\bar{y}$ for a Poisson MLE with an intercept)
- **Marginal Effect at Means (MEM)**: $\text{MEM}_k = \beta_k \times \exp(\bar{X}'\beta)$
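
As a minimal numeric sketch of the two summaries above, the following uses five made-up linear indices $X_i'\beta$ (illustrative values, not taken from the patents data) and an assumed $\beta_k = 0.30$. The helper name `poisson_me` is hypothetical and not part of any library.

```python
import math

def poisson_me(beta_k, xb):
    """Marginal effect at one observation: beta_k * exp(x'beta)."""
    return beta_k * math.exp(xb)

beta_k = 0.30  # assumed coefficient, for illustration only
# Made-up linear indices chosen so the predicted counts are 2, 4, 10, 40, 100
xb = [math.log(c) for c in (2, 4, 10, 40, 100)]

me_i = [poisson_me(beta_k, v) for v in xb]     # individual MEs
ame = sum(me_i) / len(me_i)                    # average of the individual MEs
mem = poisson_me(beta_k, sum(xb) / len(xb))    # ME evaluated at the mean index

print(f"individual MEs: {[round(m, 2) for m in me_i]}")
print(f"AME = {ame:.3f}, MEM = {mem:.3f}")
```

Because $\exp$ is convex, the average of $\exp(X_i'\beta)$ is never below $\exp$ of the average index, so the AME is at least as large in magnitude as the MEM.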

**Alternative — Incidence Rate Ratio (IRR):**

$$\text{IRR}_k = \exp(\beta_k)$$

Interpretation: a 1-unit increase in $x_k$ multiplies the expected count by $\text{IRR}_k$.
For example, $\text{IRR} = 1.5$ means the expected count increases by 50%.

| Approach | What it says | Example ($\beta=0.30$, $\text{IRR}=1.35$) |
|----------|--------------|-------------------------------------------|
| ME (AME) | Change in count | +1.2 patents on average |
| IRR | Multiplicative factor | 35% more patents |
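
To make the contrast in the table concrete, here is a small sketch under the same assumed $\beta = 0.30$. The baseline counts are made up. It also shows that the derivative-based ME, $\beta\lambda$, only approximates the exact change from a full one-unit increase, $\lambda(\text{IRR} - 1)$.

```python
import math

beta = 0.30                      # assumed coefficient, as in the table
irr = math.exp(beta)             # multiplicative factor, about 1.35

for lam in (2.0, 4.0, 100.0):    # made-up baseline expected counts
    me = beta * lam                       # derivative-based marginal effect
    exact_change = lam * (irr - 1.0)      # exact change for a one-unit increase
    print(f"lambda={lam:6.1f}  ME={me:6.2f}  exact change={exact_change:6.2f}  "
          f"relative change={irr - 1.0:.1%}")
```

The relative change is the same on every row, while the ME scales with the baseline. The "+1.2 patents" entry in the table is consistent with a mean predicted count of 4, since $0.30 \times 4 = 1.2$.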

## Section 2: Poisson AME and MEM <a name="section-2"></a>

In [None]:
# Cell 4 — Load patents dataset and plot distribution
#
# Columns: patents, log_rnd, log_sales, log_capital, industry, year

df = load_dataset('patents')
print(f'Shape: {df.shape}')
print(f'Columns: {list(df.columns)}')
print('\nDescriptive statistics:')
print(df.describe().T[['count', 'mean', 'std', 'min', 'max']].round(3))

# Distribution of patent counts
fig, ax = plt.subplots(figsize=(9, 4))
counts = df['patents'].value_counts().sort_index()
display_bins = counts.index[:30].astype(int)
ax.bar(display_bins, counts.values[:30], color='steelblue', alpha=0.75, width=0.7)
ax.set_xlabel('Patent Count')
ax.set_ylabel('Number of Firms')
ax.set_title('Distribution of Patent Counts (first 30 bins)')

mean_pat = df['patents'].mean()
var_pat  = df['patents'].var()
pct_zero = (df['patents'] == 0).mean()

info_text = (
    f'Mean:    {mean_pat:.2f}\n'
    f'Var:     {var_pat:.2f}\n'
    f'Var/Mean:{var_pat/mean_pat:.2f}\n'
    f'% zeros: {pct_zero:.1%}'
)
ax.text(0.68, 0.75, info_text, transform=ax.transAxes,
        bbox=dict(boxstyle='round', fc='white', alpha=0.8), fontsize=9)

plt.tight_layout()
plt.savefig(f'{OUT_PLOTS}/03_patent_distribution.png', dpi=150, bbox_inches='tight')
plt.show()
print('\nFigure saved: 03_patent_distribution.png')
print('Right-skewed distribution with many zeros → OLS can predict negative counts.')

In [None]:
# Cell 5 — Estimate Pooled Poisson and compute AME
#
# PooledPoisson uses array-based API: (endog, exog, entity_id, time_id)
# The design matrix X gets its intercept column added manually via np.hstack

# Prepare data
df = df.reset_index(drop=True)
FORMULA_VARS = ['log_rnd', 'log_sales', 'log_capital']

endog = df['patents'].values
X_raw = df[FORMULA_VARS].values
intercept = np.ones((len(X_raw), 1))
exog  = np.hstack([intercept, X_raw])   # shape: (N, 4) — const + 3 regressors
exog_names = ['const'] + FORMULA_VARS
entity_id = df.index.values             # each row is a unique firm (cross-section)

print('Fitting Pooled Poisson...')
poisson_model  = PooledPoisson(endog, exog, entity_id=entity_id)

# Attach variable names (used by count_me module for AME/MEM computation)
poisson_model.exog_names = exog_names

poisson_result = poisson_model.fit(se_type='cluster')

# Display results
params_series = pd.Series(poisson_result.params, index=exog_names)
se_series     = pd.Series(poisson_result.se,     index=exog_names)

print('\n=== Pooled Poisson Coefficients ===')
coef_table = pd.DataFrame({
    'Coef': params_series,
    'Std.Err.': se_series,
    't': params_series / se_series
})
print(coef_table.round(4))

# Compute Poisson AME
print('\nComputing AME...')
try:
    ame_poisson = compute_poisson_ame(poisson_result, varlist=FORMULA_VARS)
    print('\n=== Poisson AME ===')
    print('Units: change in expected patent count per unit change in each variable')
    print(ame_poisson.summary().round(5))
    AME_OK = True
except Exception as e:
    print(f'AME computation failed: {e}')
    AME_OK = False
    ame_poisson = None

In [None]:
# Cell 6 — Compute MEM and compare AME vs MEM

if AME_OK:
    try:
        mem_poisson = compute_poisson_mem(poisson_result, varlist=FORMULA_VARS)
        print('=== Poisson MEM ===')
        print(mem_poisson.summary().round(5))

        # Comparison table
        comp = pd.DataFrame({
            'AME':         ame_poisson.marginal_effects,
            'MEM':         mem_poisson.marginal_effects,
            'Ratio AME/MEM': ame_poisson.marginal_effects / mem_poisson.marginal_effects
        })
        print('\n=== AME vs MEM Comparison ===')
        print(comp.round(5))

        # Fitted values from model
        fitted_vals = poisson_model.predict(type='response')
        print(f'\nMean predicted count : {fitted_vals.mean():.4f}')
        print(f'Mean observed count  : {endog.mean():.4f}')
        print('\nNote: |AME| >= |MEM| always, because exp() is convex (Jensen inequality).')
        print('They coincide only when X_i β is identical for every observation.')
    except Exception as e:
        print(f'MEM computation failed: {e}')
        mem_poisson = None
else:
    print('Skipping MEM (AME failed).')
    mem_poisson = None

## Section 3: Incidence Rate Ratios (IRR) <a name="section-3"></a>

The **Incidence Rate Ratio** gives a multiplicative interpretation:

$$\text{IRR}_k = \exp(\beta_k)$$

Standard errors via the delta method: $\text{SE}[\exp(\beta_k)] = \exp(\beta_k) \times \text{SE}[\beta_k]$.

| Approach | What it says | Example ($\beta=0.30$, $\text{IRR}=1.35$) |
|----------|--------------|-------------------------------------------|
| ME (AME) | Change in count | +1.2 patents on average |
| IRR | Multiplicative factor | 35% more patents |

**When to use IRR vs ME**:

- Use **IRR** when: outcome scale varies greatly; audience prefers percentage interpretation;
  comparing across groups with different baseline counts
- Use **ME** when: you need absolute change; policy questions about counts;
  audience is more comfortable with additive interpretations
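
The delta-method standard error and the confidence interval for an IRR can be sketched as follows. The inputs ($\beta = 0.30$, standard error $0.05$) are hypothetical, and `irr_summary` is an illustrative helper, not a library function.

```python
import math

def irr_summary(beta, se, z=1.96):
    """IRR with a delta-method SE and a CI built on the log scale."""
    irr = math.exp(beta)
    return {
        "irr": irr,
        "se_irr": irr * se,                    # delta method: exp(b) * SE[b]
        "ci_lower": math.exp(beta - z * se),   # exponentiate the CI for beta
        "ci_upper": math.exp(beta + z * se),
        "pct_effect": (irr - 1.0) * 100.0,
    }

# Hypothetical inputs: beta = 0.30 with standard error 0.05
res = irr_summary(0.30, 0.05)
for key, value in res.items():
    print(f"{key:>10}: {value:.4f}")
```

Building the interval on the coefficient scale and exponentiating keeps both bounds positive and makes the interval asymmetric around the IRR, which a symmetric $\text{IRR} \pm 1.96 \cdot \text{SE}$ interval does not guarantee.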

In [None]:
# Cell 8 — Compute and Display IRR table

# IRR = exp(β) for each variable (exclude intercept)
beta_vars = pd.Series(poisson_result.params, index=exog_names)[FORMULA_VARS]
se_vars   = pd.Series(poisson_result.se,     index=exog_names)[FORMULA_VARS]

irr_vals  = np.exp(beta_vars)
irr_se    = irr_vals * se_vars                     # delta method
irr_lower = np.exp(beta_vars - 1.96 * se_vars)
irr_upper = np.exp(beta_vars + 1.96 * se_vars)

irr_table = pd.DataFrame({
    'Coef (β)':       beta_vars.values,
    'IRR':            irr_vals.values,
    'SE (IRR)':       irr_se.values,
    '95% CI Lower':   irr_lower.values,
    '95% CI Upper':   irr_upper.values,
    'Effect (%)':     (irr_vals.values - 1) * 100
}, index=FORMULA_VARS)

print('=== Incidence Rate Ratios ===')
print(irr_table.round(4))
print('\nIRR > 1 → positive effect; IRR < 1 → negative effect')
print('"Effect (%)" column: % change in expected count per 1-unit increase in variable')

# Save for later use
irr_table.to_csv(f'{OUT_TABLES}/03_irr_table.csv')
print('\nSaved: 03_irr_table.csv')

In [None]:
# Cell 9 — Forest plot: AME and IRR side by side

if AME_OK and ame_poisson is not None:
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))

    me_vals  = ame_poisson.marginal_effects.values
    se_vals  = ame_poisson.std_errors.values
    vars_    = list(ame_poisson.marginal_effects.index)

    # ── Left: AME forest plot ────────────────────────────────────────────
    colors_me = ['tomato' if v < 0 else 'steelblue' for v in me_vals]
    y_pos = np.arange(len(vars_))

    ax1.barh(y_pos, me_vals, xerr=1.96 * se_vals,
             color=colors_me, alpha=0.75, capsize=5, height=0.5,
             error_kw={'elinewidth': 1.5})
    ax1.axvline(0, color='black', lw=0.8, ls='--')
    ax1.set_yticks(y_pos)
    ax1.set_yticklabels(vars_, fontsize=11)
    ax1.set_xlabel('Average Marginal Effect')
    ax1.set_title('AME: Change in Expected Patent Count')

    # ── Right: IRR dot-plot with CI ───────────────────────────────────────
    irr_plot = irr_table.loc[vars_]
    irr_v    = irr_plot['IRR'].values
    irr_lo   = irr_plot['95% CI Lower'].values
    irr_hi   = irr_plot['95% CI Upper'].values
    colors_irr = ['tomato' if v < 1 else 'steelblue' for v in irr_v]

    ax2.scatter(irr_v, y_pos, color=colors_irr, s=90, zorder=3)
    for i, (lo, hi) in enumerate(zip(irr_lo, irr_hi)):
        ax2.plot([lo, hi], [i, i], color=colors_irr[i], lw=2.0, alpha=0.7)
    ax2.axvline(1, color='black', lw=0.8, ls='--')
    ax2.set_xscale('log')
    ax2.set_yticks(y_pos)
    ax2.set_yticklabels(vars_, fontsize=11)
    ax2.set_xlabel('IRR (log scale) — 95% CI')
    ax2.set_title('IRR: Multiplicative Effect on Expected Count')

    plt.suptitle('Poisson Regression — Patent Counts: AME vs IRR', fontsize=12)
    plt.tight_layout()
    plt.savefig(f'{OUT_PLOTS}/03_ame_vs_irr.png', dpi=150, bbox_inches='tight')
    plt.show()
    print('Figure saved: 03_ame_vs_irr.png')
else:
    print('Skipping plot (AME not available).')

## Section 4: Overdispersion — Poisson vs Negative Binomial <a name="section-4"></a>

Poisson assumes $\text{Var}[Y|X] = E[Y|X]$ (equidispersion).  
Real count data often has $\text{Var}[Y|X] > E[Y|X]$ (**overdispersion**).

Negative Binomial (NB2) introduces a dispersion parameter $\alpha$:

$$\text{Var}[Y|X] = E[Y|X] + \alpha \cdot E[Y|X]^2$$
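
A short sketch of what this variance function implies, using made-up numbers. The moment-based $\alpha$ at the end ignores covariates and is only a rough diagnostic, not the maximum likelihood estimate a NegBin fit would report.

```python
import statistics

def nb2_variance(mu, alpha):
    """NB2 conditional variance: mu + alpha * mu**2."""
    return mu + alpha * mu ** 2

# Under NB2 the variance-to-mean ratio grows with the mean (alpha = 0.5 is made up)
for mu in (1.0, 4.0, 20.0):
    var = nb2_variance(mu, 0.5)
    print(f"mu={mu:5.1f}  var={var:7.1f}  var/mean={var / mu:5.1f}")

# Crude moment-based alpha from a made-up sample, ignoring covariates
y = [0, 0, 1, 1, 2, 3, 5, 8, 12, 20]
mean_y = statistics.mean(y)
var_y = statistics.variance(y)            # sample variance (n - 1 denominator)
alpha_mom = (var_y - mean_y) / mean_y ** 2
print(f"mean={mean_y:.2f}  var={var_y:.2f}  alpha (moments)={alpha_mom:.3f}")
```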

**Key fact: both Poisson and NegBin have the same conditional mean** $E[Y|X] = \exp(X'\beta)$.  
Therefore:

- The marginal effect formula is **identical**
- Point estimates of AME/MEM are usually close, though not identical, because the two estimators produce different $\hat{\beta}$
- Standard errors differ: NegBin SEs are typically larger than model-based (non-robust) Poisson SEs

**How to detect overdispersion**:

- Variance-to-mean ratio $> 1$ in the sample (a rough check, since it ignores covariates)
- Significant $\hat{\alpha}$ in the NegBin model
- Likelihood ratio test: NegBin vs Poisson
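
The likelihood ratio test in the last bullet can be sketched with the standard library alone. The log-likelihood values below are hypothetical, and `lr_test_negbin_vs_poisson` is an illustrative helper. Since $\alpha = 0$ sits on the boundary of the parameter space, the sketch halves the usual $\chi^2(1)$ p-value, which is the common convention for this test.

```python
import math

def lr_test_negbin_vs_poisson(llf_poisson, llf_negbin):
    """LR test of alpha = 0; returns (statistic, boundary-adjusted p-value)."""
    lr = max(0.0, 2.0 * (llf_negbin - llf_poisson))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_chi2_1 = math.erfc(math.sqrt(lr / 2.0))
    # alpha = 0 lies on the boundary, so the usual p-value is halved
    # (50:50 mixture of chi2(0) and chi2(1))
    return lr, 0.5 * p_chi2_1

# Hypothetical log-likelihoods, for illustration only
lr_stat, p_value = lr_test_negbin_vs_poisson(llf_poisson=-1250.0, llf_negbin=-1100.0)
print(f"LR = {lr_stat:.1f}, p-value = {p_value:.3g}")
```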

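Using Poisson with its default model-based standard errors when overdispersion is present
**understates** standard errors and leads to spuriously significant results. Poisson with
robust (sandwich or cluster) standard errors remains valid as long as the conditional mean
is correctly specified.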

In [None]:
# Cell 11 — Estimate Negative Binomial and inspect dispersion

print('Fitting Negative Binomial (NB2)...')
try:
    negbin_model  = NegativeBinomial(endog, exog, entity_id=entity_id)
    negbin_model.exog_names = exog_names   # needed by count_me module
    negbin_result = negbin_model.fit()
    NB_OK = True

    # Dispersion parameter alpha
    try:
        alpha_val = negbin_result.alpha
    except AttributeError:
        try:
            alpha_val = negbin_model.alpha
        except AttributeError:
            alpha_val = np.exp(negbin_result.params[-1])  # last param is log(alpha)

    # Coefficient table (exclude last param = log(alpha))
    nb_params = negbin_result.params[:-1] if len(negbin_result.params) > len(exog_names) \
                else negbin_result.params
    nb_se_all = negbin_result.se
    nb_se     = nb_se_all[:len(nb_params)]

    params_nb = pd.Series(nb_params, index=exog_names[:len(nb_params)])
    se_nb     = pd.Series(nb_se,     index=exog_names[:len(nb_params)])

    print('\n=== Negative Binomial Coefficients ===')
    coef_nb = pd.DataFrame({'Coef': params_nb, 'Std.Err.': se_nb, 't': params_nb / se_nb})
    print(coef_nb.round(4))

    print(f'\nDispersion parameter α = {alpha_val:.4f}')
    print('(α → 0: reduces to Poisson; α > 0: overdispersion present)')

    # Overdispersion diagnostics
    var_y  = df['patents'].var()
    mean_y = df['patents'].mean()
    print(f'\nSample variance/mean ratio: {var_y/mean_y:.2f}')
    if var_y > mean_y * 1.1:
        print('→ Overdispersion detected in the data. NegBin is preferred over Poisson.')
    else:
        print('→ No strong overdispersion detected. Poisson may be adequate.')

except Exception as e:
    print(f'Negative Binomial estimation failed: {e}')
    NB_OK = False
    negbin_result = None

In [None]:
# Cell 12 — Compare AME and SE: Poisson vs NegBin

if AME_OK and NB_OK:
    try:
        ame_negbin = compute_negbin_ame(negbin_result, varlist=FORMULA_VARS)
        NB_AME_OK  = True

        comp_nb = pd.DataFrame({
            'Poisson AME':      ame_poisson.marginal_effects,
            'NegBin AME':       ame_negbin.marginal_effects,
            'Poisson SE':       ame_poisson.std_errors,
            'NegBin SE':        ame_negbin.std_errors,
            'SE Ratio (NB/P)':  ame_negbin.std_errors / ame_poisson.std_errors
        })

        print('=== Poisson vs NegBin: AME and Standard Errors ===')
        print(comp_nb.round(5))
        print('\nKey insight:')
        print('  AME values should be similar (same conditional mean function).')
        print('  NegBin SEs are typically larger than model-based (non-robust) Poisson SEs.')
        print('  Model-based Poisson SEs understate uncertainty under overdispersion; robust SEs correct this.')
    except Exception as e:
        print(f'NegBin AME computation failed: {e}')
        NB_AME_OK  = False
        ame_negbin = None
else:
    print('Skipping comparison (Poisson AME or NegBin not available).')
    NB_AME_OK  = False
    ame_negbin = None

In [None]:
# Cell 13 — Scatter plot: Individual ME vs Predicted Count
#
# Demonstrates that ME = β × exp(Xβ) grows with the predicted count.

if AME_OK and ame_poisson is not None:
    fitted_vals = poisson_model.predict(type='response')  # exp(Xβ)

    # Use the first formula variable for illustration
    focus_var = FORMULA_VARS[0]  # 'log_rnd'
    beta_focus = poisson_result.params[exog_names.index(focus_var)]
    me_individual = beta_focus * fitted_vals             # individual ME_i = β * exp(X_i β)
    ame_val = float(ame_poisson.marginal_effects[focus_var])

    fig, ax = plt.subplots(figsize=(8, 4))
    ax.scatter(fitted_vals, me_individual, alpha=0.35, s=18, color='steelblue', label='Individual ME')
    ax.axhline(ame_val, color='tomato', lw=2.0, ls='--',
               label=f'AME = {ame_val:.3f}')

    ax.set_xlabel('Predicted Patent Count $\\hat{\\lambda}_i = \\exp(X_i\'\\beta)$')
    ax.set_ylabel(f'ME of {focus_var} = $\\beta_k \\cdot \\exp(X_i\'\\beta)$')
    ax.set_title(
        f'Heterogeneous Marginal Effects: {focus_var}\n'
        f'ME is larger for firms with higher predicted counts'
    )
    ax.legend(fontsize=10)
    plt.tight_layout()
    plt.savefig(f'{OUT_PLOTS}/03_me_heterogeneity.png', dpi=150, bbox_inches='tight')
    plt.show()
    print('Figure saved: 03_me_heterogeneity.png')
    print(f'\nIndividual MEs range: [{me_individual.min():.3f}, {me_individual.max():.3f}]')
    print(f'AME (average): {ame_val:.3f}')
else:
    print('Skipping heterogeneity plot (AME not available).')

## Section 5: Fixed Effects Poisson <a name="section-5"></a>

For genuine panel data, the **Fixed Effects Poisson** estimator (Hausman, Hall, & Griliches 1984)
allows arbitrary fixed effects in the conditional mean. Unlike most nonlinear fixed-effects
estimators, it does not suffer from the incidental parameters problem: it is **consistent under
mild conditions**, requiring only correct specification of
$E[Y_{it}|X_i, \alpha_i] = \exp(X_{it}'\beta + \alpha_i)$ with strictly exogenous regressors.

**ME interpretation**: within-entity effect — how much the expected count changes when $x$ changes
for a given firm/individual over time, holding fixed its time-invariant characteristics.

**Key properties**:
- Consistent even when fixed effects $\alpha_i$ are correlated with regressors
- Entities with total count $\sum_t y_{it} = 0$ are dropped (provide no information)
- Identification comes from **within-entity** variation over time
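
The properties above follow from the conditional likelihood: given an entity's total count, its period counts are multinomial with shares $p_{it} = \exp(X_{it}'\beta) / \sum_s \exp(X_{is}'\beta)$, so $\alpha_i$ cancels. The sketch below illustrates this with made-up indices. The helper names are hypothetical and this is not how `PoissonFixedEffects` is implemented.

```python
import math

def conditional_shares(xb, alpha_i=0.0):
    """Shares p_t = exp(x_t'b + a) / sum_s exp(x_s'b + a) for one entity."""
    lam = [math.exp(v + alpha_i) for v in xb]
    total = sum(lam)
    return [lam_t / total for lam_t in lam]

def conditional_loglik(y, xb):
    """Entity contribution to the conditional log-likelihood, up to a constant
    (the multinomial coefficient, which does not depend on beta)."""
    shares = conditional_shares(xb)
    return sum(y_t * math.log(p_t) for y_t, p_t in zip(y, shares))

xb = [0.1, 0.4, 0.9]                       # made-up indices x_it'beta, 3 periods
p_no_fe = conditional_shares(xb, alpha_i=0.0)
p_big_fe = conditional_shares(xb, alpha_i=2.5)
print([round(p, 4) for p in p_no_fe])
print([round(p, 4) for p in p_big_fe])     # same shares: alpha_i cancels
print(conditional_loglik([0, 0, 0], xb))   # all-zero entity adds nothing
```

An entity whose counts are all zero contributes nothing that depends on $\beta$, which is why such entities drop out of estimation.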

In [None]:
# Cell 15 — Fixed Effects Poisson
#
# The patents dataset has a 'year' column (1975-1980), used here as the time dimension.
# The listed columns contain no firm identifier, so the row index stands in as the
# entity ID. Each entity then has a single observation and no within-entity variation,
# so the fit below may fail or be uninformative; the try/except reports this.

FE_OK = False
ame_fe_poisson = None

try:
    # Row index as entity, year as time (see the caveat above)
    df_fe = df.copy()
    df_fe['entity_id'] = df_fe.index.values
    df_fe['time_id']   = df_fe['year'].values

    FE_VARS = ['log_rnd', 'log_sales']  # fewer vars for FE (avoids near-multicollinearity)
    exog_fe_raw = df_fe[FE_VARS].values
    exog_fe     = np.hstack([np.ones((len(exog_fe_raw), 1)), exog_fe_raw])
    exog_fe_names = ['const'] + FE_VARS

    endog_fe   = df_fe['patents'].values
    entity_fe  = df_fe['entity_id'].values
    time_fe    = df_fe['time_id'].values

    print('Fitting Poisson Fixed Effects (conditional MLE)...')
    print('Note: this estimator conditions on total counts per entity.')

    fe_model = PoissonFixedEffects(endog_fe, exog_fe, entity_id=entity_fe, time_id=time_fe)
    fe_model.exog_names = exog_fe_names

    fe_result  = fe_model.fit()

    # Display coefficients
    params_fe = pd.Series(fe_result.params, index=exog_fe_names)
    se_fe     = pd.Series(fe_result.se,     index=exog_fe_names)
    print('\n=== FE Poisson Coefficients ===')
    coef_fe = pd.DataFrame({'Coef': params_fe, 'Std.Err.': se_fe, 't': params_fe / se_fe})
    print(coef_fe.round(4))

    # Compute AME
    ame_fe_poisson = compute_poisson_ame(fe_result, varlist=FE_VARS)
    print('\n=== FE Poisson AME (Within-Entity Effect) ===')
    print('Interpretation: within-entity change in expected count')
    print(ame_fe_poisson.summary().round(5))
    FE_OK = True

except Exception as e:
    print(f'FE Poisson estimation failed: {e}')
    print('This can happen with a pure cross-section (no genuine panel structure).')
    print('FE Poisson requires within-entity variation over multiple time periods.')
    FE_OK = False

## Section 6: Zero-Inflated Models — Doctor Visits <a name="section-6"></a>

Some count datasets have **many more zeros** than a Poisson (or NegBin) model predicts.
This is called **zero inflation**: a separate process determines whether the count is a
"structural zero" (the person never visits a doctor) or a "sampling zero" (could visit
but happened not to in this period).
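
A quick way to see excess zeros is to compare the observed share of zeros with $P(Y=0) = \exp(-\lambda)$ under a Poisson whose mean equals the sample mean. The counts below are made up for illustration.

```python
import math

def poisson_zero_prob(lam):
    """P(Y = 0) under Poisson(lam) is exp(-lam)."""
    return math.exp(-lam)

# Made-up counts with many zeros, for illustration only
y = [0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 6, 8]
mean_y = sum(y) / len(y)
observed_zero_share = sum(1 for v in y if v == 0) / len(y)
predicted_zero_share = poisson_zero_prob(mean_y)

print(f"mean = {mean_y:.2f}")
print(f"observed share of zeros  = {observed_zero_share:.1%}")
print(f"Poisson-implied P(Y = 0) = {predicted_zero_share:.1%}")
```

An excess of zeros relative to this intercept-only benchmark can also come from overdispersion or omitted covariates, so it is suggestive rather than conclusive.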

The marginal effect in a Zero-Inflated Poisson (ZIP) decomposes into:

$$\frac{\partial E[Y|X]}{\partial x_k} = \underbrace{\frac{\partial P(\text{non-zero}|X)}{\partial x_k} \cdot E[Y|Y>0,X]}_{\text{extensive margin}} + \underbrace{P(\text{non-zero}|X) \cdot \frac{\partial E[Y|Y>0,X]}{\partial x_k}}_{\text{intensive margin}}$$

**Two components**:
1. **Extensive margin**: effect on the probability of a non-zero count
2. **Intensive margin**: effect on the expected count conditional on it being non-zero
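
The same product-rule logic can be sketched for a one-regressor ZIP written in its participation form, $E[Y|x] = (1 - \pi(x))\,\lambda(x)$, where $\pi$ is the structural-zero probability and $\lambda$ is the Poisson mean among participants (who may still record sampling zeros). The logit form for $\pi$ and all parameter values are assumptions made for illustration, and `zip_parts` is a hypothetical helper.

```python
import math

def zip_parts(x, b0, b1, g0, g1):
    """Return (mean, extensive, intensive) for a one-regressor ZIP.

    Assumes a logit inflation equation: pi = P(structural zero | x).
    """
    lam = math.exp(b0 + b1 * x)                   # Poisson mean among participants
    pi = 1.0 / (1.0 + math.exp(-(g0 + g1 * x)))   # structural-zero probability
    mean = (1.0 - pi) * lam                       # E[Y | x]
    extensive = -g1 * pi * (1.0 - pi) * lam       # d(1 - pi)/dx * lam
    intensive = (1.0 - pi) * b1 * lam             # (1 - pi) * d(lam)/dx
    return mean, extensive, intensive

# Made-up parameters, for illustration only
args = dict(b0=0.5, b1=0.3, g0=0.2, g1=-0.4)
mean, ext, inten = zip_parts(1.0, **args)

# Finite-difference check of the total marginal effect
h = 1e-6
numeric = (zip_parts(1.0 + h, **args)[0] - zip_parts(1.0 - h, **args)[0]) / (2 * h)
print(f"extensive = {ext:.4f}, intensive = {inten:.4f}, total = {ext + inten:.4f}")
print(f"finite-difference total = {numeric:.4f}")
```

The two analytic pieces sum to the numerical derivative of the mean, which is a useful sanity check on any decomposition a library reports.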

In practice, when `ZeroInflatedPoisson` is not available or fails, we can estimate a
standard Poisson and note that it may be misspecified under zero inflation.

In [None]:
# Cell 17 — Load doctor_visits dataset and explore zero inflation
#
# Columns: docvis, age, female, educ, hhninc, public, addon

df_doc = load_dataset('doctor_visits')
print(f'Shape: {df_doc.shape}')
print(f'Columns: {list(df_doc.columns)}')

outcome_col = 'docvis'
print(f'\n--- Outcome: {outcome_col} ---')
print(f'Zero visits (structural + sampling): {(df_doc[outcome_col]==0).mean():.1%}')
print(f'Mean visits: {df_doc[outcome_col].mean():.2f}')
print(f'Var/Mean:    {df_doc[outcome_col].var()/df_doc[outcome_col].mean():.2f}')

# Compare Poisson prediction vs observed zeros
from scipy.stats import poisson as poisson_dist

lam_doc     = df_doc[outcome_col].mean()
pred_zero_p = poisson_dist.pmf(0, lam_doc)
obs_zero    = (df_doc[outcome_col] == 0).mean()

print(f'\nPredicted P(Y=0) under Poisson(λ={lam_doc:.2f}): {pred_zero_p:.1%}')
print(f'Observed  P(Y=0) in the data:                    {obs_zero:.1%}')

if obs_zero > pred_zero_p * 1.2:
    print('→ More zeros than an intercept-only Poisson predicts (possible zero inflation).')
    print('  Excess zeros can also reflect overdispersion; consider ZIP, Hurdle, or NegBin.')
else:
    print('→ No strong zero inflation. Standard Poisson or NegBin may be adequate.')

# Distribution plot
fig, ax = plt.subplots(figsize=(9, 4))
counts_doc = df_doc[outcome_col].value_counts().sort_index()
n_bins = min(30, len(counts_doc))
ax.bar(counts_doc.index[:n_bins].astype(int), counts_doc.values[:n_bins],
       color='steelblue', alpha=0.75, label='Observed', width=0.7)

# Poisson-predicted counts
x_grid = np.arange(n_bins)
poisson_pred = poisson_dist.pmf(x_grid, lam_doc) * len(df_doc)
ax.plot(x_grid, poisson_pred, 'ro-', ms=5, lw=1.5, label=f'Poisson(λ={lam_doc:.1f}) expected')

ax.set_xlabel('Number of Doctor Visits')
ax.set_ylabel('Frequency')
ax.set_title('Doctor Visits: Observed vs Poisson-predicted counts')
ax.legend()
plt.tight_layout()
plt.show()

In [None]:
# Cell 18 — Zero-Inflated Poisson (with try/except)
#
# ZeroInflatedPoisson uses arrays directly, NOT formula/PanelData.
# Wrap the entire section in try/except for robustness.

DOC_VARS = ['age', 'female', 'educ', 'hhninc', 'public', 'addon']

# ── Step 1: Pooled Poisson on doctor_visits ──────────────────────────────
df_doc_r = df_doc.reset_index(drop=True)
endog_doc = df_doc_r[outcome_col].values
X_doc_raw = df_doc_r[DOC_VARS].values
exog_doc  = np.hstack([np.ones((len(X_doc_raw), 1)), X_doc_raw])
exog_doc_names = ['const'] + DOC_VARS
entity_doc = df_doc_r.index.values

print('Fitting Pooled Poisson on doctor_visits...')
poisson_doc_model  = PooledPoisson(endog_doc, exog_doc, entity_id=entity_doc)
poisson_doc_model.exog_names = exog_doc_names
poisson_doc_result = poisson_doc_model.fit(se_type='cluster')

params_doc = pd.Series(poisson_doc_result.params, index=exog_doc_names)
se_doc     = pd.Series(poisson_doc_result.se,     index=exog_doc_names)

print('\n=== Pooled Poisson on doctor_visits ===')
print(pd.DataFrame({'Coef': params_doc, 'SE': se_doc}).round(4))

ame_doc = compute_poisson_ame(poisson_doc_result, varlist=DOC_VARS)
print('\n=== Pooled Poisson AME (doctor visits) ===')
print(ame_doc.summary().round(5))

# ── Step 2: Zero-Inflated Poisson (if available) ─────────────────────────
print('\n--- Attempting Zero-Inflated Poisson ---')
zip_success = False
ame_zip     = None

try:
    from panelbox.models.count.zero_inflated import ZeroInflatedPoisson

    # Build inflation-equation regressors (age + public, a subset of the count regressors)
    inflate_vars  = ['age', 'public']
    X_inflate_raw = df_doc_r[inflate_vars].values
    X_inflate     = np.hstack([np.ones((len(X_inflate_raw), 1)), X_inflate_raw])

    zip_model = ZeroInflatedPoisson(
        endog=endog_doc,
        exog=exog_doc,
        exog_inflate=X_inflate
    )
    zip_result = zip_model.fit()

    print('ZIP model fitted successfully.')

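    # Compute AME via the Poisson helper. Caveat: the plain Poisson formula β·exp(Xβ)
    # ignores the inflation equation, so check how the helper treats ZIP results.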
    try:
        ame_zip = compute_poisson_ame(zip_result, varlist=DOC_VARS)
        print('\n=== ZIP AME (Total Effect) ===')
        print(ame_zip.summary().round(5))

        # Check for decomposition attributes
        if hasattr(ame_zip, 'extensive_margin') and hasattr(ame_zip, 'intensive_margin'):
            print('\n=== Decomposition: Extensive vs Intensive Margin ===')
            decomp = pd.DataFrame({
                'Total ME':        ame_zip.marginal_effects,
                'Extensive Margin': ame_zip.extensive_margin,
                'Intensive Margin': ame_zip.intensive_margin
            })
            print(decomp.round(5))
        else:
            print('\nNote: ZIP decomposition (extensive/intensive) not available in this result.')
            print('Total AME reported above represents the combined effect.')
        zip_success = True

    except Exception as e_ame:
        print(f'ZIP AME computation failed: {e_ame}')
        zip_success = False

except ImportError:
    print('ZeroInflatedPoisson not found in panelbox.models.count.zero_inflated.')
    print('Falling back to standard Pooled Poisson analysis.')
except Exception as e:
    print(f'ZeroInflatedPoisson estimation failed: {e}')
    print('Using Pooled Poisson as an approximation.')

if not zip_success:
    print('\n=== Fallback: Pooled Poisson AME (serves as approximation) ===')
    print('Note: Under zero inflation, Poisson underestimates zeros.')
    print('      NegBin or ZIP are more appropriate in this case.')
    print(ame_doc.summary().round(5))

    # Decomposition note
    print('\n--- Conceptual decomposition (not computed — ZIP unavailable) ---')
    print('dE[Y]/dx = [d P(non-zero)/dx × E[Y|Y>0]] + [P(non-zero) × dE[Y|Y>0]/dx]')
    print('          (extensive margin)                 (intensive margin)')

## Key Takeaways

1. **Count ME formula**: $\text{ME}_k = \beta_k \cdot \exp(X'\beta)$. It **grows with the count level**
   and is not constant as in linear models.

2. **AME averages** over the predicted count distribution:  
   $\text{AME}_k = \beta_k \times \bar{\hat{\lambda}}$  
   $|\text{AME}| \ge |\text{MEM}|$ by Jensen's inequality, with a larger gap when $X'\beta$ is more dispersed.

3. **IRR = exp(β)**: multiplicative/percentage interpretation; use when counts vary widely
   across observations or when audiences prefer relative changes.

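4. **NegBin vs Poisson**: same ME formula and similar AME (same conditional mean); NegBin standard
   errors are typically larger than model-based Poisson standard errors when overdispersion is
   present. **Never ignore overdispersion**: use NegBin or Poisson with robust standard errors.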

5. **FE Poisson**: consistent even with arbitrary fixed effects — preferred for panel count
   data when individual heterogeneity may be correlated with regressors.

6. **ZIP**: decomposes total ME into extensive (participation) and intensive (count given
   participation) margins — useful when structural zeros are theoretically meaningful.

---

**Bridge to Notebook 04**: Tobit and Heckman models also decompose effects into extensive
and intensive margins — but the framework is different from ZIP. The extensive margin in
Tobit is the probability of being uncensored ($P(Y^* > 0)$), while in Heckman it is the
selection-into-observation probability. See `04_censored_me.ipynb`.

In [None]:
# Cell 20 — Export results to CSV

print('Exporting results...')

# 1. Poisson AME
if AME_OK and ame_poisson is not None:
    try:
        fmt_ame_poisson = format_me_table(ame_poisson)
        fmt_ame_poisson.to_csv(f'{OUT_TABLES}/03_ame_poisson.csv', index=False)
        print('Saved: 03_ame_poisson.csv')
    except Exception as e:
        # Fallback: save raw data
        ame_poisson.marginal_effects.to_frame('AME').join(
            ame_poisson.std_errors.to_frame('SE')
        ).to_csv(f'{OUT_TABLES}/03_ame_poisson.csv')
        print(f'Saved (fallback): 03_ame_poisson.csv [{e}]')

# 2. NegBin AME
if NB_AME_OK and ame_negbin is not None:
    try:
        fmt_ame_negbin = format_me_table(ame_negbin)
        fmt_ame_negbin.to_csv(f'{OUT_TABLES}/03_ame_negbin.csv', index=False)
        print('Saved: 03_ame_negbin.csv')
    except Exception as e:
        ame_negbin.marginal_effects.to_frame('AME').join(
            ame_negbin.std_errors.to_frame('SE')
        ).to_csv(f'{OUT_TABLES}/03_ame_negbin.csv')
        print(f'Saved (fallback): 03_ame_negbin.csv [{e}]')

# 3. IRR table
irr_table.to_csv(f'{OUT_TABLES}/03_irr_table.csv')
print('Saved: 03_irr_table.csv')

# 4. Doctor visits AME
try:
    fmt_ame_doc = format_me_table(ame_doc)
    fmt_ame_doc.to_csv(f'{OUT_TABLES}/03_ame_doctor_visits.csv', index=False)
    print('Saved: 03_ame_doctor_visits.csv')
except Exception as e:
    ame_doc.marginal_effects.to_frame('AME').join(
        ame_doc.std_errors.to_frame('SE')
    ).to_csv(f'{OUT_TABLES}/03_ame_doctor_visits.csv')
    print(f'Saved (fallback): 03_ame_doctor_visits.csv [{e}]')

print('\n' + '='*60)
print('Notebook 03 complete!')
print('='*60)
print('Outputs:')
print(f'  Plots : {OUT_PLOTS}/03_patent_distribution.png')
print(f'           {OUT_PLOTS}/03_ame_vs_irr.png')
print(f'           {OUT_PLOTS}/03_me_heterogeneity.png')
print(f'  Tables: {OUT_TABLES}/03_ame_poisson.csv')
print(f'           {OUT_TABLES}/03_ame_negbin.csv')
print(f'           {OUT_TABLES}/03_irr_table.csv')
print(f'           {OUT_TABLES}/03_ame_doctor_visits.csv')