When a regression uses a proxy
Swap validated truths into the proxy design matrix, row by row, and watch the coefficient move.
- Swap path: traces $\hat{\beta}_1$ as a function of the fraction of validated rows swapped in. Flat means the proxy is fine. Steep means it isn't.
- SIM (Swap Importance Measure): a Shapley-value decomposition that identifies which rows drive the proxy-induced distortion.
- Portable risk score: predicts SIM from observables so the analyst can define a proxy-safe domain on the full sample, not just the validation subset.
- Audit gate: a held-out validation check that accepts or rejects the proposed domain. If it rejects, fall back to the validation-only estimate.
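
The swap path described above can be sketched in a few lines of numpy. This is a minimal illustration under stated assumptions, not the code in `replicate_final.py`: the function names (`ols_slope`, `swap_path`), the simulated data, the classical measurement-error proxy, and the choice to run the regression on validated rows only are all hypothetical.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a simple regression of y on an intercept and x."""
    xc = x - x.mean()
    return xc @ (y - y.mean()) / (xc @ xc)

def swap_path(x_proxy, x_true, y, order, fractions):
    """Slope estimate after swapping in a growing fraction of validated rows.

    `order` fixes the sequence in which rows are swapped (hypothetical choice:
    a random permutation of the validated rows).
    """
    path = []
    for frac in fractions:
        k = int(round(frac * len(order)))
        x = x_proxy.copy()
        x[order[:k]] = x_true[order[:k]]  # replace proxy with validated truth
        path.append(ols_slope(x, y))
    return np.array(path)

# Simulated validation subset (assumption: truth and proxy both observed here).
rng = np.random.default_rng(0)
n_val = 1000
x_true = rng.normal(size=n_val)
y = 1.0 + 2.0 * x_true + rng.normal(size=n_val)        # true slope is 2.0
x_proxy = x_true + rng.normal(scale=0.8, size=n_val)   # noisy proxy (assumed form)

fractions = np.linspace(0.0, 1.0, 11)
path = swap_path(x_proxy, x_true, y, rng.permutation(n_val), fractions)

for frac, b in zip(fractions, path):
    print(f"swapped {frac:4.0%}  beta_1 = {b:.3f}")
```

In this setup the path is steep: the proxy-only estimate is attenuated and the coefficient climbs toward the true slope as validated rows are swapped in. Shrinking the proxy noise scale flattens the path.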
The framework is diagnostic first. It tells you whether the proxy matters, where it matters, and whether borrowing proxy-only observations is worth the bias cost. It does not claim distribution-free inference or universal consistency.
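
To make the "where it matters" part concrete, here is a hedged sketch of a row-level Shapley decomposition in the spirit of SIM. It uses generic Monte Carlo permutation sampling, where the value of a set of rows is the slope obtained after swapping those rows to their validated values. The paper's exact SIM definition may differ; `shapley_swap_contributions`, the simulated data, and the five deliberately bad rows are hypothetical.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a simple regression of y on an intercept and x."""
    xc = x - x.mean()
    return xc @ (y - y.mean()) / (xc @ xc)

def shapley_swap_contributions(x_proxy, x_true, y, n_perms, rng):
    """Permutation-sampling estimate of each row's Shapley contribution
    to the total change in the slope when all rows are swapped."""
    n = len(y)
    contrib = np.zeros(n)
    for _ in range(n_perms):
        x = x_proxy.copy()
        prev = ols_slope(x, y)
        for i in rng.permutation(n):
            x[i] = x_true[i]              # swap row i to its validated value
            cur = ols_slope(x, y)
            contrib[i] += cur - prev      # marginal effect of swapping row i
            prev = cur
    return contrib / n_perms

# Small simulated validation set (assumption): most rows have a good proxy,
# the first five rows have a badly mismeasured one.
rng = np.random.default_rng(1)
n = 60
x_true = rng.normal(size=n)
y = 1.0 + 2.0 * x_true + rng.normal(scale=0.5, size=n)
noise_scale = np.full(n, 0.1)
noise_scale[:5] = 3.0
x_proxy = x_true + rng.normal(size=n) * noise_scale

sim = shapley_swap_contributions(x_proxy, x_true, y, n_perms=50, rng=rng)
total_shift = ols_slope(x_true, y) - ols_slope(x_proxy, y)

print("total shift in beta_1:", round(total_shift, 3))
print("rows with largest |contribution|:", np.argsort(-np.abs(sim))[:5])
```

Because each permutation's marginal changes telescope, the contributions sum to the full proxy-to-truth shift in the slope, so the per-row values can be read as shares of the total distortion.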
At typical validation sample sizes (
├── validation_swaps.tex # Paper (LaTeX source)
├── validation_swaps.pdf # Paper (compiled)
├── replicate_final.py # Replication script (all tables and figures)
├── fig1_swap_paths.png # Figure 1: swap paths across scenarios
├── fig2_sim_anatomy.png # Figure 2: SIM vs leverage × proxy error
└── README.md
Requires Python 3.8+ with numpy, scipy, and matplotlib. No other dependencies.
# Quick check (~90 seconds)
python replicate_final.py --fast
# Full replication (~4 minutes, reproduces all paper numbers)
python replicate_final.py
# Custom settings
python replicate_final.py --n_sims 500 --outdir my_results

Outputs are written to `results/` (or the directory specified by `--outdir`):
| File | Contents |
|---|---|
| `fig1_swap_paths.png` | Swap paths across three scenarios |
| `fig2_sim_anatomy.png` | SIM vs leverage × proxy error |
| `fig3_bias_bars.png` | Bias by method and scenario |
| `table_7_2_main.csv` | Table 1: main results |
| `table_7_4_precision.csv` | Precision comparison (confirms the arithmetic in Section 7.2) |
| `table_7_5a_audit_threshold.csv` | Table 2: audit threshold sensitivity |
| `table_7_5b_val_size.csv` | Table 3: validation size sensitivity |
The simulations use three DGPs designed to illustrate different failure modes:
Benign. Small homogeneous measurement error. The proxy works everywhere. The pipeline confirms this and lets the analyst proceed.
Heterogeneous. Proxy quality varies with an observable covariate (
Ugly. The proxy captures only part of
@unpublished{partialcredit2025,
title={Partial Credit: Diagnosing Proxy Covariates with Validation Swaps},
year={2025}
}