# Interaction Effects in Nonlinear Models: The Ai & Norton (2003) Correction

**Series:** Marginal Effects Tutorial | **Notebook 05 of 06**  
**Level:** Advanced | **Estimated time:** 60-75 minutes  
**Prerequisites:** Notebooks 01-02 (ME Fundamentals and Discrete ME)

---

## The Problem

In OLS, including an interaction term $x_1 \times x_2$ makes interpretation simple:
$$\frac{\partial^2 E[Y|X]}{\partial x_1 \partial x_2} = \beta_{12}$$

This is constant and equals the interaction coefficient. Many researchers apply the same reasoning to nonlinear models such as Logit and Probit — **and this is wrong**.

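**Ai & Norton (2003)** reviewed 72 articles that used interaction terms in nonlinear models, published in 13 economics journals between 1980 and 1999, and found that none interpreted the interaction coefficient correctly. They showed that in Logit and Probit models: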
$$\frac{\partial^2 P}{\partial x_1 \partial x_2} \neq \beta_{12}$$

The correct interaction effect:
- Is NOT equal to $\beta_{12}$ (except in linear models)
- **Varies across observations** (heterogeneous)
- Can have a **different sign** than $\beta_{12}$
- Requires the **delta method** or bootstrap for standard errors

---

## Table of Contents

1. [The Ai & Norton Problem](#section1) — Why $\beta_{12}$ fails in nonlinear models
2. [Computing Correct Interaction Effects](#section2) — Cross-partial derivatives
3. [Continuous × Continuous Interaction](#section3) — Heatmap visualization
4. [Dummy × Continuous Interaction](#section4) — Group-specific curves
5. [Dummy × Dummy Interaction](#section5) — 2×2 grid
6. [Statistical Significance](#section6) — Z-statistics and LR tests
7. [Key Takeaways](#takeaways)

---

**Reference:** Ai, C., & Norton, E. C. (2003). Interaction terms in logit and probit models. *Economics Letters*, 80(1), 123-129.

In [None]:
# Cell 2: Setup and imports
import sys
import os
sys.path.insert(0, '/home/guhaase/projetos/panelbox')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
from scipy.stats import norm

# PanelBox imports
try:
    import panelbox as pb
    print(f"panelbox version: {pb.__version__ if hasattr(pb, '__version__') else 'loaded'}")
except ImportError as e:
    print(f"Warning: {e}")

from panelbox.models.discrete.binary import PooledLogit, PooledProbit
from panelbox.marginal_effects.interactions import (
    InteractionEffectsResult,
    compute_interaction_effects,
    test_interaction_significance
)

# Local utilities (absolute path for reliability)
me_utils_path = '/home/guhaase/projetos/panelbox/examples/marginal_effects/utils'
if me_utils_path not in sys.path:
    sys.path.insert(0, me_utils_path)

try:
    from data_loaders import load_dataset
    print("data_loaders imported successfully")
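except ImportError as e:
    print(f"Warning: could not import data_loaders: {e}")
    # Capture the message now: `e` is unbound once the except block exits
    _import_error_msg = str(e)
    def load_dataset(name, seed=42):
        raise RuntimeError(f"data_loaders not available: {_import_error_msg}")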

# Plotting style
plt.style.use('seaborn-v0_8-whitegrid')
pd.set_option('display.float_format', '{:.4f}'.format)

# Ensure output directories exist
import pathlib
outputs_base = pathlib.Path('/home/guhaase/projetos/panelbox/examples/marginal_effects/outputs')
(outputs_base / 'plots').mkdir(parents=True, exist_ok=True)
(outputs_base / 'tables').mkdir(parents=True, exist_ok=True)

# Helper: get parameter value trying multiple name alternatives
def get_param(params, *names):
    """Get parameter value trying multiple name alternatives. Returns 0.0 if not found."""
    for name in names:
        if name in params.index:
            return float(params[name])
    return 0.0

print("Setup complete.")

<a id='section1'></a>
## Section 1: The Common Error — Why $\beta_{12} \neq$ Interaction Effect

### In OLS (the safe case)

$$E[Y|X] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12}(x_1 \times x_2)$$
$$\frac{\partial^2 E[Y|X]}{\partial x_1 \partial x_2} = \beta_{12} \quad \leftarrow \text{simple and constant}$$

### In Logit (the problematic case)

$$P(Y=1|X) = \Lambda(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12}(x_1 \times x_2))$$
$$\frac{\partial^2 P}{\partial x_1 \partial x_2} = \beta_{12} \Lambda(1-\Lambda) + (\beta_1 + \beta_{12}x_2)(\beta_2 + \beta_{12}x_1) \Lambda(1-\Lambda)(1-2\Lambda)$$

The factors $(\beta_1 + \beta_{12}x_2)$ and $(\beta_2 + \beta_{12}x_1)$ are the derivatives of the linear index with respect to $x_1$ and $x_2$; their product reduces to $\beta_1\beta_2$ only when $\beta_{12} = 0$ or at $x_1 = x_2 = 0$. The whole expression **varies with** $X$.

### Three Problems with Using $\beta_{12}$ Directly

| Problem | Description |
|---------|-------------|
| **Direction** | The cross-partial can have a **different sign** than $\beta_{12}$ |
| **Magnitude** | The scale is wrong: it is off by a factor that depends on $X$ |
| **Heterogeneity** | The true effect **varies** across observations; $\beta_{12}$ is one number |

### Cross-Partial Formulas by Model Type

In the table, $d_1 = \beta_1 + \beta_{12}x_2$ and $d_2 = \beta_2 + \beta_{12}x_1$ denote the derivatives of the index $X\beta$ with respect to $x_1$ and $x_2$.

| Model | Formula |
|-------|--------|
| **Logit** | $\beta_{12}\Lambda(1-\Lambda) + d_1 d_2\,\Lambda(1-\Lambda)(1-2\Lambda)$ |
| **Probit** | $\phi(X\beta)\,[\beta_{12} - d_1 d_2\, X\beta]$ |
| **Poisson** | $\lambda\,(\beta_{12} + d_1 d_2)$ |

The correct approach: compute $\partial^2 E[Y]/\partial x_1 \partial x_2$ for **each observation**, then summarize (mean, distribution, proportion significant).
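
One way to sanity-check a closed-form cross-partial is to compare it with a finite-difference approximation. The sketch below does this for the logit case, writing the second term with the full index derivatives $(\beta_1 + \beta_{12}x_2)$ and $(\beta_2 + \beta_{12}x_1)$. The function names and coefficient values are illustrative assumptions and are not part of PanelBox.

```python
import numpy as np

def logistic(z):
    """Logistic CDF, Lambda(z)."""
    return 1.0 / (1.0 + np.exp(-z))

def prob(x1, x2, b):
    """P(Y=1 | x1, x2) for a logit with an x1*x2 interaction."""
    b0, b1, b2, b12 = b
    return logistic(b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2)

def cross_partial_analytic(x1, x2, b):
    """Closed-form d^2 P / (dx1 dx2), keeping the b12*x terms."""
    b0, b1, b2, b12 = b
    L = prob(x1, x2, b)
    dens = L * (1 - L)
    return b12 * dens + (b1 + b12 * x2) * (b2 + b12 * x1) * dens * (1 - 2 * L)

def cross_partial_numeric(x1, x2, b, h=1e-4):
    """Central finite-difference approximation of the same derivative."""
    return (prob(x1 + h, x2 + h, b) - prob(x1 + h, x2 - h, b)
            - prob(x1 - h, x2 + h, b) + prob(x1 - h, x2 - h, b)) / (4 * h * h)

beta = (-1.5, 0.8, 0.5, -0.6)  # illustrative values (assumed)
for x1, x2 in [(0.0, 0.0), (1.0, 1.0), (-1.0, 0.5), (2.0, -1.0)]:
    a = cross_partial_analytic(x1, x2, beta)
    n = cross_partial_numeric(x1, x2, beta)
    print(f"x1={x1:+.1f}, x2={x2:+.1f}: analytic={a:+.6f}, numeric={n:+.6f}")
```

The two columns should agree to several decimal places at every point, while none of the values equals the interaction coefficient itself.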

In [None]:
# Cell 4: Numerical example showing beta12 vs true cross-partial
np.random.seed(42)
N = 2000

x1 = np.random.normal(0, 1, N)
x2 = np.random.binomial(1, 0.5, N)  # dummy variable

# True parameters (known, for comparison)
b0, b1, b2, b12 = -1.5, 0.8, 0.5, -0.6

xb = b0 + b1*x1 + b2*x2 + b12*(x1*x2)
prob = 1 / (1 + np.exp(-xb))
y = np.random.binomial(1, prob, N)

df_ex = pd.DataFrame({
    'id': range(N),
    'time': 1,
    'y': y,
    'x1': x1,
    'x2': x2,
    'x1x2': x1 * x2
})

try:
    model_ex = PooledLogit(
        'y ~ x1 + x2 + x1x2',
        data=df_ex, entity_col='id', time_col='time'
    ).fit()
    print("=" * 50)
    print("Logit with Interaction Term (Synthetic Data)")
    print("=" * 50)

    params_ex = model_ex.params

    # Get b12 estimate (intercept is 'Intercept' in PanelBox)
    b12_hat = None
    for name in ['x1x2', 'x1:x2']:
        if name in params_ex.index:
            b12_hat = float(params_ex[name])
            break
    if b12_hat is None:
        b12_hat = float(params_ex.iloc[-1])
        print(f"(Using last param as interaction: {params_ex.index[-1]})")

    print(f"\nEstimated parameters:")
    print(params_ex.to_string())
    print(f"\nbeta_12 (interaction coefficient) = {b12_hat:.4f}")
    print(f"True beta_12                       = {b12:.4f}")
    print(f"\nIs beta_12 the interaction effect? NO — see below.")

except Exception as e:
    print(f"Model estimation failed: {e}")
    b12_hat = b12

# Demonstrate analytically that cross-partial != beta12
print("\n" + "=" * 50)
print("Analytical Cross-Partial vs beta_12")
print("=" * 50)

Lambda_fn  = lambda z: 1 / (1 + np.exp(-z))
Lambda_pdf = lambda z: Lambda_fn(z) * (1 - Lambda_fn(z))

# Evaluate at four representative points using TRUE parameters
scenarios = [
    ('x1=0, x2=0', 0, 0),
    ('x1=0, x2=1', 0, 1),
    ('x1=1, x2=1', 1, 1),
    ('x1=-1, x2=0', -1, 0),
]

print(f"\n{'Scenario':<20} {'xb':>8} {'P(Y=1)':>8} {'Cross-partial':>15} {'beta12':>10}")
print("-" * 65)
for label, x1v, x2v in scenarios:
    xb_val = b0 + b1*x1v + b2*x2v + b12*(x1v*x2v)
    L = Lambda_fn(xb_val)
    Lpdf = Lambda_pdf(xb_val)
    # Full cross-partial: the index derivatives carry the b12*x terms
    cp = b12 * Lpdf + (b1 + b12*x2v) * (b2 + b12*x1v) * Lpdf * (1 - 2*L)
    print(f"{label:<20} {xb_val:>8.3f} {L:>8.3f} {cp:>15.5f} {b12:>10.5f}")

print("\nConclusion: Cross-partials vary across observations AND differ from beta_12")

<a id='section2'></a>
## Section 2: Computing Correct Interaction Effects

### The `compute_interaction_effects()` Function

PanelBox implements the Ai & Norton (2003) correction via `compute_interaction_effects()`, which:
1. Extracts parameters and covariate matrix from the fitted model
2. Computes the cross-partial $\partial^2 P/\partial x_1 \partial x_2$ for **every observation**
3. Returns an `InteractionEffectsResult` object with summary statistics

```python
from panelbox.marginal_effects.interactions import compute_interaction_effects

ie = compute_interaction_effects(
    model_result,          # fitted PooledLogit / PooledProbit result
    var1='educ',           # first interacting variable
    var2='kidslt6',        # second interacting variable
    method='delta'         # 'delta', 'bootstrap', or None
)
```

### What `InteractionEffectsResult` Contains

| Attribute | Type | Description |
|-----------|------|-------------|
| `.cross_partial` | `np.ndarray` (N,) | Cross-partial per observation |
| `.mean_effect` | `float` | Mean over all observations |
| `.std_effect` | `float` | Standard deviation |
| `.z_statistics` | `np.ndarray` or `None` | z = cross-partial / SE (if SE computed) |
| `.prop_positive` | `float` | Fraction with positive effect |
| `.prop_negative` | `float` | Fraction with negative effect |
| `.significant_positive` | `float` | Fraction with z > 1.96 |
| `.significant_negative` | `float` | Fraction with z < -1.96 |

The key insight: **the mean cross-partial $\neq$ $\beta_{12}$**.
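
To make the per-observation logic concrete, here is a minimal NumPy sketch that computes one logit cross-partial per observation and then summarizes the distribution. It is not PanelBox's implementation: the data are simulated, the coefficients are assumed rather than estimated, and the dictionary keys merely mirror the attribute names in the table above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

# Assumed coefficient values (illustrative, not estimated from data)
b0, b1, b2, b12 = -0.5, 0.8, 0.5, -0.6

xb = b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2
p = 1.0 / (1.0 + np.exp(-xb))
dens = p * (1.0 - p)

# One cross-partial per observation (logit, continuous x1 and x2)
cross_partial = (b12 * dens
                 + (b1 + b12 * x2) * (b2 + b12 * x1) * dens * (1.0 - 2.0 * p))

summary = {
    "mean_effect": cross_partial.mean(),
    "std_effect": cross_partial.std(ddof=1),
    "prop_positive": (cross_partial > 0).mean(),
    "prop_negative": (cross_partial < 0).mean(),
}
for key, value in summary.items():
    print(f"{key:14s}: {value:+.4f}")
print(f"{'beta_12':14s}: {b12:+.4f}")
```

Comparing the printed `mean_effect` with `beta_12` shows the gap between the average interaction effect and the coefficient.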

In [None]:
# Cell 6: Load Mroz dataset and estimate Logit with educ x kidslt6 interaction
print("Loading Mroz dataset...")
df = load_dataset('mroz')
print(f"Shape: {df.shape}")
print(f"Columns: {list(df.columns)}")
print(f"\nOutcome 'inlf': mean = {df['inlf'].mean():.3f}, "
      f"{int(df['inlf'].sum())} women in labor force")

# Add panel identifiers
df = df.copy()
df['id']   = range(len(df))
df['time'] = 1

# Create interaction term: education x young children (kidslt6)
df['educ_kidslt6'] = df['educ'] * df['kidslt6']

print(f"\nInteraction variable 'educ_kidslt6':")
print(f"  Mean: {df['educ_kidslt6'].mean():.3f}")
print(f"  % zero (no young kids): {(df['educ_kidslt6'] == 0).mean():.1%}")

# Estimate Logit without interaction (baseline)
logit_base = None
try:
    logit_base = PooledLogit(
        'inlf ~ educ + age + kidslt6 + nwifeinc',
        data=df, entity_col='id', time_col='time'
    ).fit()
    print("\nBaseline Logit (no interaction): estimated successfully")
    print(f"  Log-likelihood: {logit_base.llf:.4f}")
except Exception as e:
    print(f"Baseline logit failed: {e}")

# Estimate Logit with interaction term: educ x kidslt6
logit_int = None
beta_int   = None
int_param_name = 'educ_kidslt6'

try:
    logit_int = PooledLogit(
        'inlf ~ educ + age + kidslt6 + nwifeinc + educ_kidslt6',
        data=df, entity_col='id', time_col='time'
    ).fit()
    print("\n" + "=" * 50)
    print("Logit with Education x Young Children Interaction")
    print("=" * 50)
    print(logit_int.summary())

    # Get the interaction coefficient
    for name in ['educ_kidslt6', 'educ:kidslt6']:
        if name in logit_int.params.index:
            beta_int = float(logit_int.params[name])
            int_param_name = name
            break
    if beta_int is None:
        print(f"Available params: {list(logit_int.params.index)}")
        int_param_name = logit_int.params.index[-1]
        beta_int = float(logit_int.params.iloc[-1])

    print(f"\nbeta_{int_param_name} = {beta_int:.4f}")
    print("(This is NOT the interaction effect — see next cell)")

except Exception as e:
    print(f"Logit with interaction failed: {e}")
    import traceback; traceback.print_exc()

In [None]:
# Cell 7: Compute interaction effects via compute_interaction_effects()
ie_result = None

if logit_int is not None:
    try:
        ie_result = compute_interaction_effects(
            model_result=logit_int,
            var1='educ',
            var2='kidslt6',
            method='delta'    # uses delta method for SEs
        )

        print("=" * 70)
        print("Interaction Effect: Education x Young Children (Ai & Norton Correction)")
        print("=" * 70)
        print(ie_result.summary())

        # Use cross_partial (real attribute name)
        ie_values = ie_result.cross_partial

        print("\n--- Distribution of Individual Interaction Effects ---")
        print(f"  N observations : {len(ie_values)}")
        print(f"  Mean           : {ie_values.mean():+.5f}")
        print(f"  Std. deviation : {ie_values.std():.5f}")
        print(f"  Minimum        : {ie_values.min():+.5f}")
        print(f"  Maximum        : {ie_values.max():+.5f}")
        print(f"  % positive     : {(ie_values > 0).mean():.1%}")
        print(f"  % negative     : {(ie_values < 0).mean():.1%}")

        if beta_int is not None:
            print(f"\n  beta_12 (coefficient)  = {beta_int:+.5f}")
            print(f"  Mean cross-partial     = {ie_values.mean():+.5f}")
            print(f"  These are different! (ratio = {ie_values.mean()/beta_int:.2f}x)")

        if ie_result.z_statistics is not None:
            z = ie_result.z_statistics
            print(f"\n  % |z| > 1.96 (significant): {(np.abs(z) > 1.96).mean():.1%}")
            print(f"  % significant positive    : {ie_result.significant_positive:.1%}")
            print(f"  % significant negative    : {ie_result.significant_negative:.1%}")

    except Exception as e:
        print(f"compute_interaction_effects failed: {e}")
        import traceback
        traceback.print_exc()
        ie_result = None
        ie_values = np.array([])
else:
    print("Skipping: logit_int model not available.")
    ie_values = np.array([])

In [None]:
# Cell 8: Visualize distribution of interaction effects
if ie_result is not None and len(ie_values) > 0:
    fig, axes = plt.subplots(1, 2, figsize=(12, 4))

    # --- Left: Histogram of cross-partials ---
    axes[0].hist(ie_values, bins=40, color='steelblue', alpha=0.75, edgecolor='white')
    axes[0].axvline(ie_values.mean(), color='tomato', lw=2, ls='--',
                    label=f'Mean = {ie_values.mean():.4f}')
    if beta_int is not None:
        axes[0].axvline(beta_int, color='black', lw=2,
                        label=f'$\\beta_{{12}}$ = {beta_int:.4f}')
    axes[0].axvline(0, color='gray', lw=0.8, ls=':')
    axes[0].set_xlabel(
        r'Cross-Partial $\partial^2 P / \partial \mathrm{educ}\, \partial \mathrm{kidslt6}$')
    axes[0].set_ylabel('Count')
    axes[0].set_title(
        'Distribution of Interaction Effects\n(one value per observation — heterogeneous)')
    axes[0].legend(fontsize=9)

    # --- Right: Cross-partial vs predicted probability ---
    # predicted_prob is a numpy array stored in ie_result
    prob_hat = np.array(ie_result.predicted_prob).flatten()

    axes[1].scatter(prob_hat, ie_values, alpha=0.3, s=15, color='steelblue')
    axes[1].axhline(0, color='black', lw=0.8, ls='--')
    axes[1].axhline(ie_values.mean(), color='tomato', lw=1.5, ls='--',
                    label=f'Mean IE = {ie_values.mean():.4f}')
    axes[1].set_xlabel('Predicted P(LFP = 1)')
    axes[1].set_ylabel(
        r'Cross-Partial $\partial^2 P/\partial \mathrm{educ}\, \partial \mathrm{kidslt6}$')
    axes[1].set_title('Interaction Effect vs Predicted Probability\n'
                      '(effect varies with X — not constant)')
    axes[1].legend(fontsize=9)

    plt.suptitle('Ai & Norton (2003) Correction: True Interaction Effects\n'
                 'Logit — LFP: Education x Young Children',
                 fontsize=11, fontweight='bold')
    plt.tight_layout()

    out_path = str(outputs_base / 'plots' / '05_interaction_distribution.png')
    plt.savefig(out_path, dpi=150, bbox_inches='tight')
    print(f"Plot saved: {out_path}")
    plt.show()
else:
    print("Skipping histogram: interaction effects not computed.")

<a id='section3'></a>
## Section 3: Continuous × Continuous Interaction

When both $x_1$ and $x_2$ are continuous, the cross-partial answers:
> "Does a one-unit increase in $x_1$ change the marginal effect of $x_2$?"

In Logit, the answer depends on the full linear predictor $X\beta$, which varies across individuals.

### Example: Education × Experience

Does the effect of labor market experience on LFP probability depend on the woman's education level?

**Visualization:** a contour heatmap showing the cross-partial over a grid of (education, experience) values, holding other covariates at their means. This reveals the **interaction surface** — where the effect is strongest (or changes sign).

In [None]:
# Cell 10: Continuous x Continuous — Education x Experience
logit_cc = None
ie_cc     = None

if 'exper' in df.columns:
    print("'exper' found in dataset. Estimating educ x exper interaction model...")

    df['educ_exper'] = df['educ'] * df['exper']

    try:
        logit_cc = PooledLogit(
            'inlf ~ educ + exper + age + kidslt6 + nwifeinc + educ_exper',
            data=df, entity_col='id', time_col='time'
        ).fit()
        print(f"Logit CC estimated. Log-likelihood: {logit_cc.llf:.4f}")

        # Show relevant coefficients
        print("\nKey parameters:")
        for var in ['educ', 'exper', 'educ_exper']:
            if var in logit_cc.params.index:
                print(f"  {var:15s}: {logit_cc.params[var]:+.4f} (SE={logit_cc.bse[var]:.4f})")

    except Exception as e:
        print(f"Logit CC estimation failed: {e}")
        logit_cc = None

    # Compute interaction effects
    if logit_cc is not None:
        try:
            ie_cc = compute_interaction_effects(
                model_result=logit_cc,
                var1='educ',
                var2='exper',
                method='delta'
            )
            print(f"\nInteraction Effects (educ x exper):")
            print(f"  Mean cross-partial : {ie_cc.mean_effect:+.5f}")
            print(f"  Std deviation      : {ie_cc.std_effect:.5f}")
            print(f"  % positive         : {ie_cc.prop_positive:.1%}")
        except Exception as e:
            print(f"compute_interaction_effects (CC) failed: {e}")
            ie_cc = None

    # Build heatmap over (educ, exper) grid
    if logit_cc is not None:
        try:
            params_cc = logit_cc.params

            educ_grid  = np.linspace(df['educ'].min(),  df['educ'].max(),  30)
            exper_grid = np.linspace(df['exper'].min(), df['exper'].max(), 30)
            E, X_g = np.meshgrid(educ_grid, exper_grid)

            # Hold other vars at means
            age_mean      = df['age'].mean()
            kidslt6_mean  = df['kidslt6'].mean()
            nwifeinc_mean = df['nwifeinc'].mean()

            cp_grid = np.zeros_like(E)

            for i in range(len(exper_grid)):
                for j in range(len(educ_grid)):
                    xb_val = (
                        get_param(params_cc, 'Intercept', 'const', 'intercept')
                        + get_param(params_cc, 'educ')       * E[i, j]
                        + get_param(params_cc, 'exper')      * X_g[i, j]
                        + get_param(params_cc, 'age')        * age_mean
                        + get_param(params_cc, 'kidslt6')    * kidslt6_mean
                        + get_param(params_cc, 'nwifeinc')   * nwifeinc_mean
                        + get_param(params_cc, 'educ_exper') * E[i, j] * X_g[i, j]
                    )
                    L = 1 / (1 + np.exp(-xb_val))
                    Lpdf = L * (1 - L)
                    b1_est  = get_param(params_cc, 'educ')
                    b2_est  = get_param(params_cc, 'exper')
                    b12_est = get_param(params_cc, 'educ_exper')
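                    # Index derivatives carry the b12*x terms (educ = x1, exper = x2)
                    cp_grid[i, j] = (b12_est * Lpdf
                                     + (b1_est + b12_est * X_g[i, j])
                                     * (b2_est + b12_est * E[i, j])
                                     * Lpdf * (1 - 2*L))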

            # Plot heatmap
            fig, axes_cc = plt.subplots(1, 2, figsize=(14, 5))

            # Contour plot
            vmax = max(abs(cp_grid.min()), abs(cp_grid.max()))
            im = axes_cc[0].contourf(E, X_g, cp_grid, levels=20, cmap='RdBu_r',
                                     vmin=-vmax, vmax=vmax)
            plt.colorbar(im, ax=axes_cc[0],
                         label=r'Cross-Partial $\partial^2 P/\partial educ\, \partial exper$')
            axes_cc[0].contour(E, X_g, cp_grid, levels=[0], colors='black', linewidths=1.5)
            axes_cc[0].set_xlabel('Education (years)')
            axes_cc[0].set_ylabel('Experience (years)')
            axes_cc[0].set_title(
                'Interaction Effect Surface\nLogit — LFP: Education x Experience')

            # Distribution of individual cross-partials
            if ie_cc is not None:
                axes_cc[1].hist(ie_cc.cross_partial, bins=40, color='steelblue',
                                alpha=0.75, edgecolor='white')
                axes_cc[1].axvline(ie_cc.mean_effect, color='tomato', lw=2, ls='--',
                                   label=f'Mean = {ie_cc.mean_effect:.4f}')
                axes_cc[1].axvline(0, color='gray', lw=0.8, ls=':')
                axes_cc[1].set_xlabel(
                    r'Cross-Partial $\partial^2 P/\partial educ \, \partial exper$')
                axes_cc[1].set_ylabel('Count')
                axes_cc[1].set_title(
                    'Distribution of Interaction Effects\n(Continuous x Continuous)')
                axes_cc[1].legend(fontsize=9)

            plt.suptitle('Continuous x Continuous Interaction: Education x Experience',
                         fontsize=11, fontweight='bold')
            plt.tight_layout()

            out_path = str(outputs_base / 'plots' / '05_interaction_heatmap_conteduc_exper.png')
            plt.savefig(out_path, dpi=150, bbox_inches='tight')
            print(f"Heatmap saved: {out_path}")
            plt.show()

        except Exception as e:
            print(f"Heatmap generation failed: {e}")
            import traceback; traceback.print_exc()

else:
    print("Note: 'exper' variable not found in dataset.")
    print("Skipping continuous x continuous heatmap.")

<a id='section4'></a>
## Section 4: Dummy × Continuous Interaction

When $x_1$ is a **dummy** (0/1) and $x_2$ is **continuous**, the interaction effect answers:
> "Does the marginal effect of $x_2$ differ between the two groups defined by $x_1$?"

### Example: Young Children × Education

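- $x_1 = \text{kidslt6}$ (compared at 0 vs. 1; if `kidslt6` is a count of children under 6, as in the original Mroz data, the curves contrast no young children with exactly one)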
- $x_2 = \text{educ}$ (years of education)

**Question:** Does the effect of an additional year of education on LFP probability depend on whether the woman has young children?

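**Visualization:** Two predicted probability curves (one per group). The vertical gap between them is the effect of having young children at a given education level; the interaction effect is how that gap changes with education, i.e. the difference between the two slopes. Parallel curves mean no interaction effect on the probability scale; non-parallel curves mean the effect of education differs between the groups.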
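
For a dummy $d$ and a continuous $x$, the interaction effect can be written as the difference between the two group-specific slopes, which is the same as the derivative of the gap with respect to $x$. The sketch below illustrates this with assumed coefficients (not estimates from the Mroz data); here the gap is defined as $P(d=1) - P(d=0)$.

```python
import numpy as np

def logistic(z):
    """Logistic CDF."""
    return 1.0 / (1.0 + np.exp(-z))

# Assumed illustrative coefficients: intercept, continuous x, dummy d, x*d
b0, bx, bd, bxd = -2.0, 0.15, -1.0, 0.05

x = np.linspace(5, 17, 7)            # e.g. years of education
u0 = b0 + bx * x                     # linear index when d = 0
u1 = b0 + bx * x + bd + bxd * x      # linear index when d = 1
p0, p1 = logistic(u0), logistic(u1)

gap = p1 - p0                        # effect of the dummy at each x
slope0 = bx * p0 * (1 - p0)          # dP/dx when d = 0
slope1 = (bx + bxd) * p1 * (1 - p1)  # dP/dx when d = 1
interaction = slope1 - slope0        # d(gap)/dx

for xv, g, ie in zip(x, gap, interaction):
    print(f"x={xv:5.1f}  gap={g:+.4f}  interaction={ie:+.5f}")
```

The interaction column changes with $x$ even though the coefficient on the product term is a single number.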

In [None]:
# Cell 12: Dummy x Continuous — Young Children x Education
if logit_int is not None:
    try:
        educ_range    = np.linspace(df['educ'].min(), df['educ'].max(), 100)
        age_mean      = df['age'].mean()
        nwifeinc_mean = df['nwifeinc'].mean()

        params_int = logit_int.params

        def predict_logit_group(educ_val, kids_val):
            """Predict P(LFP=1) for given education and kids status."""
            xb = (
                get_param(params_int, 'Intercept', 'const', 'intercept')
                + get_param(params_int, 'educ')    * educ_val
                + get_param(params_int, 'age')     * age_mean
                + get_param(params_int, 'kidslt6') * kids_val
                + get_param(params_int, 'nwifeinc') * nwifeinc_mean
                + get_param(params_int, 'educ_kidslt6', 'educ:kidslt6') * educ_val * kids_val
            )
            return 1 / (1 + np.exp(-xb))

        prob_no_kids  = [predict_logit_group(e, 0) for e in educ_range]
        prob_has_kids = [predict_logit_group(e, 1) for e in educ_range]
        gap = [p0 - p1 for p0, p1 in zip(prob_no_kids, prob_has_kids)]

        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

        # Predicted probability curves by group
        ax1.plot(educ_range, prob_no_kids,  color='steelblue', lw=2.5,
                 label='No young children (kidslt6 = 0)')
        ax1.plot(educ_range, prob_has_kids, color='tomato',    lw=2.5, ls='--',
                 label='Has young children (kidslt6 = 1)')
        ax1.set_xlabel('Years of Education')
        ax1.set_ylabel('Predicted P(LFP = 1)')
        ax1.set_title('Predicted Probabilities by Group\n(other vars at means)')
        ax1.legend(fontsize=9)
        ax1.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=1))

        # Gap plot: effect of young children at each education level
        # (the interaction effect is the slope of this gap)
        ax2.plot(educ_range, gap, color='purple', lw=2.5)
        ax2.fill_between(educ_range, gap, 0, alpha=0.2, color='purple')
        ax2.axhline(0, color='black', lw=0.8, ls='--')
        ax2.set_xlabel('Years of Education')
        ax2.set_ylabel('Gap: P(no kids) minus P(has kids)')
        ax2.set_title('Gap Between Groups Varies with Education\n'
                      '(slope of the gap = interaction effect)')
        ax2.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=1))

        plt.suptitle('Dummy x Continuous Interaction: Young Children x Education\n'
                     'Logit — LFP Model', fontsize=11, fontweight='bold')
        plt.tight_layout()

        out_path = str(outputs_base / 'plots' / '05_dummy_continuous_interaction.png')
        plt.savefig(out_path, dpi=150, bbox_inches='tight')
        print(f"Plot saved: {out_path}")
        plt.show()

        print("\n--- Summary: Effect of young children at various education levels ---")
        for e_val in [8, 10, 12, 14, 16]:
            gap_val = predict_logit_group(e_val, 0) - predict_logit_group(e_val, 1)
            print(f"  educ={e_val}: gap = {gap_val:+.3f} ({gap_val:.1%})")

    except Exception as e:
        print(f"Dummy x Continuous visualization failed: {e}")
        import traceback; traceback.print_exc()
else:
    print("Skipping: logit_int not available.")

<a id='section5'></a>
## Section 5: Dummy × Dummy Interaction

When both $x_1$ and $x_2$ are dummies, there are four possible combinations:

| | $x_2 = 0$ | $x_2 = 1$ |
|---|---|---|
| $x_1 = 0$ | $P_{00}$ | $P_{01}$ |
| $x_1 = 1$ | $P_{10}$ | $P_{11}$ |

The interaction effect is: $(P_{11} - P_{10}) - (P_{01} - P_{00})$

This is the **difference-in-differences** of predicted probabilities. In nonlinear models, this is NOT equal to $\beta_{12}$.
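
A small numerical sketch makes the point. With assumed coefficients (chosen for illustration only, with no other covariates), the difference-in-differences can be non-zero when $\beta_{12} = 0$ and can even take the opposite sign from $\beta_{12}$.

```python
import numpy as np

def logistic(z):
    """Logistic CDF."""
    return 1.0 / (1.0 + np.exp(-z))

def did(b0, b1, b2, b12):
    """Difference-in-differences of predicted probabilities for two dummies."""
    p00 = logistic(b0)
    p01 = logistic(b0 + b2)
    p10 = logistic(b0 + b1)
    p11 = logistic(b0 + b1 + b2 + b12)
    return (p11 - p10) - (p01 - p00)

# Illustrative coefficients (assumed, not estimated)
did_zero_b12 = did(-2.0, 1.0, 1.0, 0.0)   # no interaction coefficient
did_pos_b12 = did(1.0, 1.0, 1.0, 0.3)     # positive interaction coefficient

print(f"beta_12 = 0.0 -> DiD = {did_zero_b12:+.4f}")
print(f"beta_12 = 0.3 -> DiD = {did_pos_b12:+.4f}")
```

The first case yields a positive DiD despite a zero coefficient; the second yields a negative DiD despite a positive coefficient, because the baseline probabilities sit on the flattening upper part of the logistic curve.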

### Example: High Education × Young Children

- $x_1 = \text{high\_educ}$ (education at or above the sample median)
- $x_2 = \text{kidslt6}$ (has young children)

**Visualization:** Line plot with two lines (kidslt6=0 and kidslt6=1) across education groups. Non-parallel lines indicate an interaction effect.

In [None]:
# Cell 14: Dummy x Dummy — High Education x Young Children
logit_dd = None
ie_dd    = None

try:
    # Create binary education dummy
    df['high_educ'] = (df['educ'] >= df['educ'].median()).astype(float)
    df['high_educ_kidslt6'] = df['high_educ'] * df['kidslt6']

    print(f"Education median: {df['educ'].median():.0f} years")
    print(f"% high_educ = 1: {df['high_educ'].mean():.1%}")

    logit_dd = PooledLogit(
        'inlf ~ high_educ + kidslt6 + high_educ_kidslt6 + age + nwifeinc',
        data=df, entity_col='id', time_col='time'
    ).fit()

    print(f"\nLogit DD estimated. Log-likelihood: {logit_dd.llf:.4f}")
    print("\nParameters:")
    print(logit_dd.params.to_string())

except Exception as e:
    print(f"Logit DD estimation failed: {e}")
    import traceback; traceback.print_exc()
    logit_dd = None

if logit_dd is not None:
    try:
        params_dd     = logit_dd.params
        age_mean      = df['age'].mean()
        nwifeinc_mean = df['nwifeinc'].mean()

        def pred_dd(high_e, kids):
            """Predict P(LFP=1) for dummy x dummy scenario."""
            xb = (
                get_param(params_dd, 'Intercept', 'const', 'intercept')
                + get_param(params_dd, 'high_educ')                      * high_e
                + get_param(params_dd, 'kidslt6')                        * kids
                + get_param(params_dd, 'high_educ_kidslt6',
                            'high_educ:kidslt6')                         * high_e * kids
                + get_param(params_dd, 'age')                            * age_mean
                + get_param(params_dd, 'nwifeinc')                       * nwifeinc_mean
            )
            return 1 / (1 + np.exp(-xb))

        # 2x2 grid of predicted probabilities
        p00 = pred_dd(0, 0)  # Low educ, No kids
        p01 = pred_dd(0, 1)  # Low educ, Has kids
        p10 = pred_dd(1, 0)  # High educ, No kids
        p11 = pred_dd(1, 1)  # High educ, Has kids

        print("\n=== Predicted Probabilities: 2x2 Grid ===")
        print(f"  Low Educ,  No Young Kids  : {p00:.1%}")
        print(f"  Low Educ,  Has Young Kids : {p01:.1%}")
        print(f"  High Educ, No Young Kids  : {p10:.1%}")
        print(f"  High Educ, Has Young Kids : {p11:.1%}")

        # Difference-in-differences
        did = (p11 - p10) - (p01 - p00)
        print(f"\n  DiD = (P11 - P10) - (P01 - P00) = {did:+.3f}")
        print(f"  (This is the correct interaction effect for dummies)")

        # Get interaction coefficient for comparison
        b12_dd = get_param(params_dd, 'high_educ_kidslt6', 'high_educ:kidslt6')
        print(f"  beta_12 (coefficient) = {b12_dd:+.4f}")

    except Exception as e:
        print(f"Prediction grid failed: {e}")
        # Use NaN rather than made-up probabilities so the plot below shows nothing
        p00 = p01 = p10 = p11 = np.nan
        b12_dd = np.nan

    # Compute interaction effects (Ai & Norton)
    try:
        ie_dd = compute_interaction_effects(
            model_result=logit_dd,
            var1='high_educ',
            var2='kidslt6',
            method='delta'
        )
        print(f"\n=== Interaction Effect (Ai & Norton) ===")
        print(ie_dd.summary())
    except Exception as e:
        print(f"compute_interaction_effects (DD) failed: {e}")
        ie_dd = None

    # 2x2 visualization
    try:
        fig, ax = plt.subplots(figsize=(7, 5))

        groups_x = [0, 1]
        labels_x = ['Low Education', 'High Education']
        no_kids  = [p00, p10]
        has_kids = [p01, p11]

        ax.plot(groups_x, no_kids,  'o-',  color='steelblue', lw=2.5, ms=10,
                label='No Young Children (kidslt6 = 0)')
        ax.plot(groups_x, has_kids, 's--', color='tomato',    lw=2.5, ms=10,
                label='Has Young Children (kidslt6 = 1)')

        # Add value labels
        for x_val, y_val in zip([0, 1], no_kids):
            ax.annotate(f'{y_val:.1%}', xy=(x_val, y_val),
                        xytext=(0, 8), textcoords='offset points',
                        ha='center', color='steelblue', fontsize=10)
        for x_val, y_val in zip([0, 1], has_kids):
            ax.annotate(f'{y_val:.1%}', xy=(x_val, y_val),
                        xytext=(0, -16), textcoords='offset points',
                        ha='center', color='tomato', fontsize=10)

        ax.set_xticks([0, 1])
        ax.set_xticklabels(labels_x, fontsize=11)
        ax.set_ylabel('Predicted P(LFP = 1)', fontsize=11)
        ax.set_title('Dummy x Dummy Interaction: Education x Young Children\n'
                     'Non-parallel lines indicate interaction effect present', fontsize=11)
        ax.legend(fontsize=10)
        ax.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=1))
        ax.set_xlim(-0.3, 1.3)
        plt.tight_layout()

        out_path = str(outputs_base / 'plots' / '05_dummy_dummy_interaction.png')
        plt.savefig(out_path, dpi=150, bbox_inches='tight')
        print(f"Plot saved: {out_path}")
        plt.show()

    except Exception as e:
        print(f"Dummy x Dummy visualization failed: {e}")
else:
    print("Skipping: logit_dd not available.")

<a id='section6'></a>
## Section 6: Testing Statistical Significance of Interaction Effects

Since interaction effects are **heterogeneous** (vary across observations), reporting only the mean is insufficient. Key statistics to report:

| Statistic | Meaning |
|-----------|--------|
| **Mean** | Average interaction effect across all observations |
| **Std. deviation** | How much heterogeneity exists |
| **% significant** | Fraction of observations with $|z| > 1.96$ |
| **LR test** | Tests whether the interaction term improves model fit |
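
The z-statistic behind "% significant" needs a standard error for each cross-partial. The following is a minimal delta-method sketch at a single covariate point, using a numerical gradient with respect to the coefficients. The coefficient vector and covariance matrix are assumed values for illustration, and the helper names are hypothetical; this is not a description of how PanelBox computes its standard errors.

```python
import numpy as np

def cross_partial(beta, x1, x2):
    """Logit cross-partial d^2 P / (dx1 dx2) at one covariate point."""
    b0, b1, b2, b12 = beta
    p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2)))
    dens = p * (1 - p)
    return b12 * dens + (b1 + b12 * x2) * (b2 + b12 * x1) * dens * (1 - 2 * p)

def delta_se(beta, cov, x1, x2, h=1e-6):
    """Delta-method SE: sqrt(g' V g), with g a numerical gradient in beta."""
    beta = np.asarray(beta, dtype=float)
    grad = np.zeros_like(beta)
    for k in range(beta.size):
        step = np.zeros_like(beta)
        step[k] = h
        grad[k] = (cross_partial(beta + step, x1, x2)
                   - cross_partial(beta - step, x1, x2)) / (2 * h)
    return float(np.sqrt(grad @ cov @ grad))

# Assumed estimates and covariance matrix (illustrative only)
beta_hat = np.array([-1.5, 0.8, 0.5, -0.6])
cov_hat = np.diag([0.010, 0.004, 0.020, 0.009])

cp = cross_partial(beta_hat, 0.5, 1.0)
se = delta_se(beta_hat, cov_hat, 0.5, 1.0)
z = cp / se
print(f"cross-partial = {cp:+.5f}, SE = {se:.5f}, z = {z:+.3f}")
```

Repeating this for every observation gives one z-statistic per observation, from which the share with $|z| > 1.96$ follows.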

### The LR Test vs. the Cross-Partial Test

The **LR test** concerns $\beta_{12}$: it tests whether the interaction term belongs in the model, not whether the cross-partial is non-zero. The distinction matters because the cross-partial can be non-zero even when $\beta_{12} = 0$, through the $\beta_1 \beta_2$ term in the formula.

The `test_interaction_significance()` function compares models **with and without** the interaction term using:
- Likelihood Ratio test (LR statistic, p-value)
- AIC and BIC comparison
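
For intuition, these quantities can be reproduced by hand from the two fitted log-likelihoods. The sketch below assumes the interaction adds exactly one parameter and defines each delta as (model with interaction) minus (model without), so negative values favor the interaction model. The log-likelihoods and sample size are invented, and PanelBox may compute its returned values differently.

```python
import math
from scipy.stats import chi2

def lr_test(llf_restricted, llf_full, df_diff=1):
    """Likelihood ratio statistic and its chi-square p-value."""
    lr_stat = 2.0 * (llf_full - llf_restricted)
    p_value = chi2.sf(lr_stat, df_diff)   # upper-tail probability
    return lr_stat, p_value

def delta_information_criteria(llf_restricted, llf_full, n_obs, df_diff=1):
    """AIC and BIC differences, full model minus restricted model."""
    lr_stat = 2.0 * (llf_full - llf_restricted)
    delta_aic = 2.0 * df_diff - lr_stat
    delta_bic = df_diff * math.log(n_obs) - lr_stat
    return delta_aic, delta_bic

# Made-up log-likelihoods and sample size, for illustration only
llf_without, llf_with, n = -452.0, -449.5, 753
lr_stat, p_value = lr_test(llf_without, llf_with)
delta_aic, delta_bic = delta_information_criteria(llf_without, llf_with, n)
print(f"LR = {lr_stat:.3f}, p = {p_value:.4f}")
print(f"Delta AIC = {delta_aic:.3f}, Delta BIC = {delta_bic:.3f}")
```

With these invented numbers AIC favors the interaction model while BIC does not, a reminder that the two criteria penalize the extra parameter differently and can disagree.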

In [None]:
# Cell 16: Statistical significance testing
# test_interaction_significance requires BOTH: model_with and model_without interaction

if logit_int is not None and logit_base is not None:
    try:
        test_result = test_interaction_significance(
            model_with_interaction=logit_int,
            model_without_interaction=logit_base,
            var1='educ',
            var2='kidslt6'
        )

        print("=" * 60)
        print("Interaction Significance Test")
        print("Model: Logit, LFP ~ educ + age + kidslt6 + nwifeinc (+ educ_kidslt6 under test)")
        print("=" * 60)

        if isinstance(test_result, dict):
            def fmt(key, spec):
                """Format a numeric entry; fall back to 'N/A' if missing or non-numeric."""
                val = test_result.get(key)
                return format(val, spec) if isinstance(val, (int, float, np.number)) else 'N/A'

            print("\nLikelihood Ratio Test:")
            print(f"  LR statistic : {fmt('lr_statistic', '.4f')}")
            print(f"  p-value      : {fmt('lr_pvalue', '.4f')}")
            print("\nInformation Criteria:")
            print(f"  Delta AIC    : {fmt('delta_aic', '.4f')}")
            print(f"  Delta BIC    : {fmt('delta_bic', '.4f')}")
            print("  (negative = interaction model preferred)")
            print("\nInteraction Effect Statistics:")
            print(f"  Mean effect  : {fmt('avg_interaction_effect', '.5f')}")
            print(f"  Std effect   : {fmt('interaction_std', '.5f')}")
            print(f"\nPrefer interaction model? {test_result.get('prefer_interaction', 'N/A')}")
        else:
            # If it has a .summary() method
            print(test_result.summary() if hasattr(test_result, 'summary') else str(test_result))

    except Exception as e:
        print(f"test_interaction_significance failed: {e}")
        import traceback
        traceback.print_exc()
else:
    print("LR test skipped: requires both logit_int and logit_base models.")

# --- Z-statistics distribution (from delta method) ---
print("\n" + "=" * 60)
print("Distribution of Z-Statistics for Individual Interaction Effects")
print("=" * 60)

if ie_result is not None and ie_result.z_statistics is not None:
    z_vals = ie_result.z_statistics
    pct_sig = (np.abs(z_vals) > 1.96).mean()

    print(f"\nZ-statistics summary:")
    print(f"  Mean   : {z_vals.mean():+.3f}")
    print(f"  Std    : {z_vals.std():.3f}")
    print(f"  Min    : {z_vals.min():.3f}")
    print(f"  Max    : {z_vals.max():.3f}")
    print(f"  |z| > 1.96 : {pct_sig:.1%}")
    print(f"  % sig positive: {ie_result.significant_positive:.1%}")
    print(f"  % sig negative: {ie_result.significant_negative:.1%}")

    fig, ax = plt.subplots(figsize=(8, 4))
    ax.hist(z_vals, bins=40, color='steelblue', alpha=0.75, edgecolor='white')
    ax.axvline( 1.96, color='tomato', lw=1.5, ls='--', label='±1.96 (5% critical values)')
    ax.axvline(-1.96, color='tomato', lw=1.5, ls='--')
    ax.axvline(0, color='black', lw=0.8)
    ax.set_xlabel('Z-Statistic for Individual Interaction Effect')
    ax.set_ylabel('Count')
    ax.set_title(f'Distribution of Z-Statistics\n'
                 f'{pct_sig:.1%} of observations have |z| > 1.96')
    ax.legend()
    plt.tight_layout()

    out_path = str(outputs_base / 'plots' / '05_interaction_zstats.png')
    plt.savefig(out_path, dpi=150, bbox_inches='tight')
    print(f"Plot saved: {out_path}")
    plt.show()

elif ie_result is not None:
    print("\nNote: z_statistics = None (SE computation returned None).")
    print("This may occur if the delta method gradient computation was incomplete.")
    print("Try method='bootstrap' for an alternative SE approach.")
else:
    print("\nInteraction effects not available — z-statistic plot skipped.")

In [None]:
# Cell 17: Logit vs Probit interaction comparison
print("=" * 60)
print("Logit vs Probit: Comparing Interaction Effects")
print("=" * 60)

probit_int  = None
probit_base = None
ie_probit   = None

# Estimate Probit with the same interaction specification as Logit
try:
    probit_base = PooledProbit(
        'inlf ~ educ + age + kidslt6 + nwifeinc',
        data=df, entity_col='id', time_col='time'
    ).fit()
    print(f"Probit baseline estimated. Log-likelihood: {probit_base.llf:.4f}")
except Exception as e:
    print(f"Probit baseline failed: {e}")
    probit_base = None

try:
    probit_int = PooledProbit(
        'inlf ~ educ + age + kidslt6 + nwifeinc + educ_kidslt6',
        data=df, entity_col='id', time_col='time'
    ).fit()
    print(f"Probit with interaction estimated. Log-likelihood: {probit_int.llf:.4f}")
except Exception as e:
    print(f"Probit with interaction failed: {e}")
    probit_int = None

# Compute Probit interaction effects
if probit_int is not None:
    try:
        ie_probit = compute_interaction_effects(
            model_result=probit_int,
            var1='educ',
            var2='kidslt6',
            method='delta'
        )
        print(f"Probit interaction effects computed.")
    except Exception as e:
        print(f"Probit IE computation failed: {e}")
        ie_probit = None

# Build comparison table
print("\n--- Comparison Table ---")
rows = []

if logit_int is not None and ie_result is not None:
    b12_logit = get_param(logit_int.params, 'educ_kidslt6', 'educ:kidslt6')
    rows.append({
        'Model': 'Logit',
        'beta_12': b12_logit,
        'Mean cross-partial': ie_result.mean_effect,
        'Std cross-partial': ie_result.std_effect,
        '% positive': ie_result.prop_positive,
    })

if probit_int is not None and ie_probit is not None:
    b12_probit = get_param(probit_int.params, 'educ_kidslt6', 'educ:kidslt6')
    rows.append({
        'Model': 'Probit',
        'beta_12': b12_probit,
        'Mean cross-partial': ie_probit.mean_effect,
        'Std cross-partial': ie_probit.std_effect,
        '% positive': ie_probit.prop_positive,
    })

if rows:
    comp = pd.DataFrame(rows).set_index('Model')
    print(comp.round(5).to_string())
    print("\nNote: beta_12 differs between Logit and Probit due to different link function scales.")
    print("The cross-partial means should be similar in direction")
    print("but may differ in magnitude due to different distributional assumptions.")
else:
    print("No models available for comparison.")

# Visual comparison
if ie_result is not None and ie_probit is not None:
    try:
        fig, axes_cmp = plt.subplots(1, 2, figsize=(12, 4), sharey=False)

        for ax_i, (ie_obj, model_name, color) in enumerate([
            (ie_result, 'Logit', 'steelblue'),
            (ie_probit, 'Probit', 'darkorange')
        ]):
            axes_cmp[ax_i].hist(ie_obj.cross_partial, bins=35,
                                color=color, alpha=0.75, edgecolor='white')
            axes_cmp[ax_i].axvline(ie_obj.mean_effect, color='black', lw=2, ls='--',
                                   label=f'Mean = {ie_obj.mean_effect:.4f}')
            axes_cmp[ax_i].axvline(0, color='gray', lw=0.8, ls=':')
            axes_cmp[ax_i].set_xlabel(
                r'Cross-Partial $\partial^2 P/\partial educ \, \partial kidslt6$')
            axes_cmp[ax_i].set_ylabel('Count')
            axes_cmp[ax_i].set_title(f'{model_name} Interaction Effects\n'
                                     f'({ie_obj.prop_positive:.1%} positive)')
            axes_cmp[ax_i].legend(fontsize=9)

        plt.suptitle('Logit vs Probit: Distribution of Interaction Effects\n'
                     'LFP: Education x Young Children', fontsize=11, fontweight='bold')
        plt.tight_layout()

        out_path = str(outputs_base / 'plots' / '05_logit_probit_comparison.png')
        plt.savefig(out_path, dpi=150, bbox_inches='tight')
        print(f"Comparison plot saved: {out_path}")
        plt.show()

    except Exception as e:
        print(f"Comparison plot failed: {e}")
elif ie_result is not None:
    print("\nProbit model unavailable — skipping comparison plot.")
    print("(Logit results remain valid.)")

<a id='takeaways'></a>
## Key Takeaways

### The Core Insight

| In OLS | In Logit/Probit/Poisson |
|--------|------------------------|
| $\beta_{12}$ = interaction effect | $\beta_{12} \neq$ interaction effect |
| Constant across all observations | **Varies** across all observations |
| Same sign as $\partial^2 E[Y]/\partial x_1 \partial x_2$ | May have **different sign** |

### The Eight Rules

1. **$\beta_{12} \neq$ interaction effect** in nonlinear models (Logit, Probit, Poisson)
2. The correct measure is the **cross-partial** $\partial^2 E[Y]/\partial x_1 \partial x_2$
3. Cross-partials **vary across observations** — report mean AND distribution
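4. For Logit: $\partial^2 \Lambda/\partial x_1 \partial x_2 = \beta_{12}\Lambda(1-\Lambda) + (\beta_1 + \beta_{12}x_2)(\beta_2 + \beta_{12}x_1)\Lambda(1-\Lambda)(1-2\Lambda)$, which reduces to $\beta_1\beta_2\Lambda(1-\Lambda)(1-2\Lambda)$ when $\beta_{12} = 0$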
5. **Sign** can differ between $\beta_{12}$ and the cross-partial
6. Use `compute_interaction_effects()` in PanelBox for the correct computation
7. **Report:** mean, SD, % positive, % statistically significant
8. **Visualize:** histogram, scatter vs predicted prob, conditional curves
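
Rule 4 gives the Logit case. As a hedged sketch of the Probit analogue, the cross-partial is $\beta_{12}\phi(u) - (\beta_1 + \beta_{12}x_2)(\beta_2 + \beta_{12}x_1)\,u\,\phi(u)$, using $\Phi''(u) = -u\,\phi(u)$. The code below checks that expression against a central finite difference. All coefficient values and helper names are invented for illustration and are independent of PanelBox.

```python
import numpy as np
from scipy.stats import norm

# Made-up coefficients for the index u = b0 + b1*x1 + b2*x2 + b12*x1*x2
B0, B1, B2, B12 = 0.2, 0.6, -0.9, 0.15

def probit_prob(x1, x2):
    """P(y = 1 | x1, x2) under the assumed Probit index."""
    return norm.cdf(B0 + B1 * x1 + B2 * x2 + B12 * x1 * x2)

def probit_cross_partial(x1, x2):
    """Analytic cross-partial of the Probit probability."""
    u = B0 + B1 * x1 + B2 * x2 + B12 * x1 * x2
    phi = norm.pdf(u)
    return B12 * phi - (B1 + B12 * x2) * (B2 + B12 * x1) * u * phi

def numeric_cross_partial(f, x1, x2, h=1e-4):
    """Central finite-difference approximation of d^2 f / (dx1 dx2)."""
    return (f(x1 + h, x2 + h) - f(x1 + h, x2 - h)
            - f(x1 - h, x2 + h) + f(x1 - h, x2 - h)) / (4 * h * h)

x1 = np.array([-1.0, 0.5, 2.0])
x2 = np.array([0.0, 1.0, 3.0])
analytic = probit_cross_partial(x1, x2)
numeric = numeric_cross_partial(probit_prob, x1, x2)
print(np.column_stack([analytic, numeric]))
```

The same finite-difference check works for any smooth link function, which makes it a convenient sanity test before trusting a hand-coded cross-partial.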

### What to Report in Practice

```
"The interaction effect between education and young children (kidslt6)
in the Logit model has a mean cross-partial of [X], with [Y]% of
observations showing a negative interaction effect and [Z]% having
|z| > 1.96. The LR test for the interaction term yields chi2(1) = [W],
p = [p]."
```

### PanelBox Workflow

```python
import pandas as pd
from panelbox.models.discrete.binary import PooledLogit
from panelbox.marginal_effects.interactions import (
    compute_interaction_effects,
    test_interaction_significance
)

# df is assumed to be a pandas DataFrame with columns y, x1, x2 (one row per observation).
# Prepare panel identifiers (PooledLogit takes a DataFrame directly)
df['id']   = range(len(df))
df['time'] = 1
df['x1x2'] = df['x1'] * df['x2']   # create interaction term

# 1. Estimate model WITH interaction term
logit_with = PooledLogit(
    'y ~ x1 + x2 + x1x2',
    data=df, entity_col='id', time_col='time'
).fit()

# 2. Compute Ai & Norton cross-partials
ie = compute_interaction_effects(logit_with, var1='x1', var2='x2', method='delta')

# 3. Inspect results
print(ie.summary())
print(f"Mean: {ie.mean_effect:.4f}, % sig: {ie.significant_positive + ie.significant_negative:.1%}")

# 4. LR test (requires model WITHOUT interaction)
logit_without = PooledLogit(
    'y ~ x1 + x2',
    data=df, entity_col='id', time_col='time'
).fit()
test = test_interaction_significance(
    model_with_interaction=logit_with,
    model_without_interaction=logit_without,
    var1='x1',
    var2='x2'
)
print(test)  # dict with lr_statistic, lr_pvalue, delta_aic, delta_bic, etc.
```

---

**Next notebook:** Notebook 06 — Best Practices for Reporting Marginal Effects  
Synthesizes all results into communication strategies for different audiences.

In [None]:
# Cell 19: Export results
print("Saving results...")

# Build summary dataframe
records = []

if ie_result is not None:
    records.append({
        'model': 'Logit',
        'interaction': 'educ x kidslt6',
        'mean_effect': ie_result.mean_effect,
        'std_effect': ie_result.std_effect,
        'min_effect': ie_result.min_effect,
        'max_effect': ie_result.max_effect,
        'pct_positive': ie_result.prop_positive,
        'pct_negative': ie_result.prop_negative,
        'pct_sig_positive': ie_result.significant_positive if ie_result.significant_positive is not None else float('nan'),
        'pct_sig_negative': ie_result.significant_negative if ie_result.significant_negative is not None else float('nan'),
        'beta_12': get_param(logit_int.params, 'educ_kidslt6', 'educ:kidslt6') if logit_int is not None else float('nan'),
    })

if ie_probit is not None:
    records.append({
        'model': 'Probit',
        'interaction': 'educ x kidslt6',
        'mean_effect': ie_probit.mean_effect,
        'std_effect': ie_probit.std_effect,
        'min_effect': ie_probit.min_effect,
        'max_effect': ie_probit.max_effect,
        'pct_positive': ie_probit.prop_positive,
        'pct_negative': ie_probit.prop_negative,
        'pct_sig_positive': ie_probit.significant_positive if ie_probit.significant_positive is not None else float('nan'),
        'pct_sig_negative': ie_probit.significant_negative if ie_probit.significant_negative is not None else float('nan'),
        'beta_12': get_param(probit_int.params, 'educ_kidslt6', 'educ:kidslt6') if probit_int is not None else float('nan'),
    })

if ie_cc is not None:
    records.append({
        'model': 'Logit',
        'interaction': 'educ x exper',
        'mean_effect': ie_cc.mean_effect,
        'std_effect': ie_cc.std_effect,
        'min_effect': ie_cc.min_effect,
        'max_effect': ie_cc.max_effect,
        'pct_positive': ie_cc.prop_positive,
        'pct_negative': ie_cc.prop_negative,
        'pct_sig_positive': ie_cc.significant_positive if ie_cc.significant_positive is not None else float('nan'),
        'pct_sig_negative': ie_cc.significant_negative if ie_cc.significant_negative is not None else float('nan'),
        'beta_12': get_param(logit_cc.params, 'educ_exper', 'educ:exper') if logit_cc is not None else float('nan'),
    })

if ie_dd is not None:
    records.append({
        'model': 'Logit',
        'interaction': 'high_educ x kidslt6',
        'mean_effect': ie_dd.mean_effect,
        'std_effect': ie_dd.std_effect,
        'min_effect': ie_dd.min_effect,
        'max_effect': ie_dd.max_effect,
        'pct_positive': ie_dd.prop_positive,
        'pct_negative': ie_dd.prop_negative,
        'pct_sig_positive': ie_dd.significant_positive if ie_dd.significant_positive is not None else float('nan'),
        'pct_sig_negative': ie_dd.significant_negative if ie_dd.significant_negative is not None else float('nan'),
        'beta_12': get_param(logit_dd.params, 'high_educ_kidslt6', 'high_educ:kidslt6') if logit_dd is not None else float('nan'),
    })

if records:
    ie_summary = pd.DataFrame(records)
    csv_path = str(outputs_base / 'tables' / '05_interaction_summary.csv')
    ie_summary.to_csv(csv_path, index=False)
    print(f"Results saved to: {csv_path}")
    print(f"\nSummary table:")
    print(ie_summary.round(5).to_string())
else:
    print("No results to save (all models failed).")

# List all saved outputs
print("\n" + "=" * 50)
print("All outputs:")
for subdir in ['plots', 'tables']:
    p = outputs_base / subdir
    for f in sorted(p.glob('05_*')):
        size_kb = f.stat().st_size / 1024
        print(f"  {f.relative_to(outputs_base.parent.parent)} ({size_kb:.1f} KB)")

print("\nNotebook 05 complete.")