ImputationDiD: thread vcov_type as narrow {hc1} contract per BJS Theorem 3 (Phase 1b interstitial #3) by igerber · Pull Request #492 · igerber/diff-diff

igerber · 2026-05-25T15:12:36Z

Summary

Thread vcov_type through ImputationDiD as a permanently narrow {"hc1"} contract per Borusyak-Jaravel-Spiess (2024) Theorem 3 (Phase 1b interstitial Add robust parallel trends testing with Wasserstein distance #3, mirrors CS callaway-santanna: fix cluster= silent no-op + narrow vcov_type contract #487 + TD Phase 1b interstitial #2: TripleDifference vcov_type narrow contract #488 template for IF-based estimators).
Default cluster=None summary() now renders unit-cluster CR1 label (was misleadingly generic HC1); bootstrap fits suppress the analytical variance-family label and render "Inference method: bootstrap" instead.
Defensive n_clusters<2 / n_psu<2 NaN guard on the bootstrap path closes the BLAS-roundoff zero-SE failure mode predicted to recur on IF-based estimators; cluster=<col> + replicate-weight SurveyDesign raises NotImplementedError mirroring the CS+TD fail-closed guard.

Methodology references

Method name(s): ImputationDiD (Phase 1b interstitial Add robust parallel trends testing with Wasserstein distance #3 — vcov_type contract narrowing + summary label correction + bootstrap NaN guard).
Paper / source link(s): Borusyak, K., Jaravel, X., & Spiess, J. (2024). Revisiting Event-Study Designs: Robust and Efficient Estimation. Review of Economic Studies, 91(6), 3253-3285. Theorem 3 (equation 7) conservative variance.
Any intentional deviations from the source (and why): The narrow vcov_type = {"hc1"} contract is library-architectural, not paper-prescribed. Analytical-sandwich families {classical, hc2, hc2_bm} and conley spatial-HAC are rejected at __init__ because the per-unit IF aggregation has no equivalent single design matrix on which hat-matrix leverage or Bell-McCaffrey Satterthwaite DOF can be defined. Documented in docs/methodology/REGISTRY.md IF-vs-sandwich taxonomy + two new ImputationDiD **Note:** bullets.

Validation

Tests added/updated: tests/test_imputation.py::TestImputationDiDVcovType (+41 new tests across 7 surfaces — default / cluster / TSL-survey / replicate-survey + bootstrap × cluster + bootstrap × survey bit-equality ALL parametrized over aggregate ∈ {None, "event_study", "group"} with per-horizon and per-group SE override branches pinned; fit()-time revalidation after set_params bypass; bootstrap n_psu<2 / n_clusters<2 NaN propagation including coef_var NaN; pretrends=True × vcov_type='hc1' × cluster bit-equality; introspection on default attr / get_params / Results carries / to_dict / summary label default+cluster+bootstrap-suppressed / cluster_name suppression under survey / fit-clone idempotence / convenience function; input rejection on classical/hc2/hc2_bm/conley/unknown with distinct methodology-keyword pins; cluster + replicate_weights rejection). Full pytest tests/test_imputation.py suite: 128 passed.
Backtest / simulation / notebook evidence (if applicable): N/A — bit-equal preservation of the existing variance machinery (no methodology change); R parity unchanged.

Security / privacy

Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

github-actions · 2026-05-25T15:20:21Z

Overall Assessment
✅ Looks good

Executive Summary

No unmitigated P0/P1 issues found in the changed ImputationDiD variance path.
The new permanent vcov_type={"hc1"} contract is documented in the Methodology Registry and enforced at both construction and fit time; I did not find an undocumented deviation from Borusyak-Jaravel-Spiess Theorem 3.
The new bootstrap n_clusters<2 / n_psu<2 guard correctly returns NaN inference instead of misleading near-zero SEs, and I did not find a partial-NaN propagation bug in the changed path.
cluster + replicate_weights now fails closed instead of silently no-oping, which is the right behavior for replicate-weight survey designs.
Test review is static only: this environment does not have numpy or pytest, so I could not rerun the added suite locally.

Methodology

Severity: P3. Impact: none; the permanent vcov_type={"hc1"} narrowing, unit-cluster CR1 labeling, replicate-weight fail-closed behavior, and the degenerate-bootstrap NaN handling are all documented and match the implementation, so I do not see a methodology defect here. Concrete fix: none. References: docs/methodology/REGISTRY.md:L348-L363, docs/methodology/REGISTRY.md:L1313-L1315, diff_diff/imputation.py:L148-L177, diff_diff/imputation.py:L250-L253, diff_diff/imputation.py:L302-L323, diff_diff/imputation_bootstrap.py:L275-L395, diff_diff/imputation_results.py:L231-L262.

Code Quality

Severity: none. Impact: no new partial-NaN or incomplete-parameter-propagation issue found in the changed path. Concrete fix: none. vcov_type is threaded through constructor state, get_params(), fit-time validation, result metadata, summary labeling, and the convenience wrapper in diff_diff/imputation.py:L148-L177, diff_diff/imputation.py:L889-L934, diff_diff/imputation.py:L2436-L2503, and diff_diff/imputation.py:L2522-L2588.

Performance

Severity: none. Impact: the added work is limited to lightweight validation, metadata threading, and early-exit guards; no meaningful regression risk is visible in the diff. Concrete fix: none.

Maintainability

Severity: none. Impact: no maintainability blocker stood out in the touched code. Concrete fix: none.

Tech Debt

Severity: P3. Impact: none; the remaining vcov_type="conley" work is explicitly tracked, so that rejection path is documented deferred work rather than an untracked gap. Concrete fix: none. Reference: TODO.md:L114-L114.

Security

Severity: none. Impact: no security findings in the changed files. Concrete fix: none.

Documentation/Tests

Severity: none. Impact: the added tests cover the main changed surfaces: default/cluster/survey/replicate parity, fit-time revalidation, fail-closed cluster + replicate_weights, and the n_clusters<2 / n_psu<2 NaN path. Concrete fix: none. Reference: tests/test_imputation.py:L2354-L2875.
Severity: P3. Impact: I could not execute the tests in this review environment because python here lacks numpy and pytest, so runtime validation is static rather than a local rerun. Concrete fix: rerun the targeted Imputation suite in the normal project test environment or CI.

igerber · 2026-05-25T16:51:09Z

/ai-review

github-actions · 2026-05-25T17:19:59Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 9529bb2d61418eab9e5ed59cc6fed5079d514cd4

Overall Assessment

✅ Looks good

Executive Summary

No unmitigated P0/P1 issues found in the changed ImputationDiD variance path, and I did not find a regression relative to the prior clean re-review baseline.
The new permanent vcov_type={"hc1"} contract is documented in the Methodology Registry and implemented consistently at construction, fit-time revalidation, results metadata, and summary rendering, so I do not see an undocumented deviation from Borusyak-Jaravel-Spiess Theorem 3.
The new bootstrap n_clusters<2 / n_psu<2 fail-closed path correctly propagates NaN inference instead of surfacing misleading degenerate bootstrap output.
cluster + replicate_weights now fails closed, and the deferred conley work is properly tracked in TODO.md, so neither is a blocker.
One P3 documentation gap remains: the LLM guide’s shared staggered-results section was not updated for ImputationDiD’s new to_dict() surface and variance-metadata fields.
Runtime validation here is static only because this environment does not have numpy, pandas, scipy, or pytest.

Methodology

Severity: P3. Impact: none. The PR’s methodology-affecting changes are documented rather than silent: the narrow vcov_type={"hc1"} contract, unit-cluster labeling for cluster=None, replicate-weight fail-closed behavior, and degenerate-bootstrap NaN handling are all explicitly captured in the registry, and the implementation matches that documented contract. Concrete fix: none. References: docs/methodology/REGISTRY.md:L348-L363, docs/methodology/REGISTRY.md:L1309-L1315, diff_diff/imputation.py:L148-L183, diff_diff/imputation.py:L250-L315, diff_diff/imputation.py:L889-L929, diff_diff/imputation_bootstrap.py:L275-L395, diff_diff/imputation_results.py:L231-L262.

Code Quality

Severity: none. Impact: none. I did not find a new incomplete-parameter-propagation or NaN-propagation defect in the changed code; vcov_type is carried through constructor state, fit-time validation, get_params(), result metadata, and the convenience wrapper. Concrete fix: none. References: diff_diff/imputation.py:L148-L183, diff_diff/imputation.py:L250-L315, diff_diff/imputation.py:L2436-L2589.

Performance

Severity: none. Impact: none. The added work is limited to lightweight validation, metadata plumbing, and early-exit degenerate-bootstrap guards; I do not see a meaningful regression risk in the touched path. Concrete fix: none. References: diff_diff/imputation_bootstrap.py:L325-L395.

Maintainability

Severity: none. Impact: none. The new behavior follows the existing Callaway-Sant’Anna / TripleDifference narrow-contract pattern and stays localized to estimator validation, bootstrap handling, and results formatting. Concrete fix: none.

Tech Debt

Severity: P3. Impact: none. The remaining vcov_type="conley" gap is explicitly tracked in TODO.md, so this is documented deferred work rather than an untracked omission introduced by the PR. Concrete fix: none for this PR; follow the existing TODO row. References: TODO.md:L107-L114.

Security

Severity: none. Impact: none. I did not find a security issue in the changed files. Concrete fix: none.

Documentation/Tests

Severity: P3. Impact: low. ImputationDiDResults now exposes new public variance metadata and a new to_dict() method, but the shared staggered-results guide still advertises only summary(), print_summary(), and to_dataframe(), so guide consumers will miss the newly added surface. Concrete fix: update the shared results block or add an Imputation-specific results subsection covering to_dict(), vcov_type, cluster_name, and n_clusters. References: diff_diff/guides/llms-full.txt:L1265-L1284, diff_diff/imputation_results.py:L145-L152, diff_diff/imputation_results.py:L476-L521.
Severity: none. Impact: none on PR correctness. The added tests are broad and well-targeted for the changed surfaces: constructor/fit-time validation, default/cluster/survey/replicate parity, bootstrap parity, degenerate-bootstrap NaN propagation, summary labels, and convenience-wrapper threading. Concrete fix: none. References: tests/test_imputation.py:L2354-L2877.
Severity: P3. Impact: review is static-only. I could not execute the targeted Imputation test module locally because this environment lacks numpy, pandas, scipy, and pytest. Concrete fix: rerun tests/test_imputation.py in the normal project environment or CI.

github-actions · 2026-05-25T18:13:25Z

🔁 AI review rerun (requested by @igerber)

Head SHA: cfbfeaeb438251ff68ed49b1821acaa4aa49b4bf

Overall Assessment

✅ Looks good

Highest unmitigated severity is P2.

Executive Summary

Re-review focus: the prior P3 documentation gap is addressed; diff_diff/guides/llms-full.txt now documents ImputationDiD.vcov_type, the new variance metadata, and to_dict().
I did not find an unmitigated P0/P1 in the changed estimator/math/inference paths.
The new bootstrap degenerate-design guard is correctly fail-closed: the new helper returns NaN SE/CI/p-value payloads and the fit path propagates those through overall, event-study, and group inference.
One P2 remains: replicate-weight survey fits still leak cluster_name/n_clusters metadata and can print Number of clusters, even though the PR’s new public contract says those fields are suppressed under survey designs.
Validation here is static only; this environment does not have numpy, pandas, scipy, or pytest.

Methodology

No findings. The narrow vcov_type={"hc1"} contract is consistent with the existing ImputationDiD variance implementation: _compute_conservative_variance() still uses the Theorem 3 clustered sum-of-squares path, and REGISTRY.md explicitly documents the broader cluster=<col> / survey routing as a library synthesis rather than a paper change. References: docs/methodology/REGISTRY.md:L348-L363, docs/methodology/REGISTRY.md:L1309-L1315, diff_diff/imputation.py:L1454-L1503, diff_diff/imputation.py:L2462-L2503. (academic.oup.com)

Code Quality

Severity: P2. Impact: replicate-weight survey fits violate the new results-metadata contract introduced by this PR. ImputationDiD.fit() suppresses cluster_name / n_clusters only when resolved_survey.psu is not None, so replicate-weight survey designs keep cluster_name="unit" (or the bare cluster= value) and n_clusters; summary() then still prints Number of clusters even though the guide now says both fields are None under survey designs. This does not change estimates, but it misreports the new public surface. Concrete fix: change the suppression gate to resolved_survey is not None (or survey_metadata is not None) instead of resolved_survey.psu is not None, and add a replicate-weight regression asserting cluster_name is None, n_clusters is None, and summary omits the cluster count. References: diff_diff/imputation.py:L898-L907, diff_diff/imputation_results.py:L245-L262, diff_diff/guides/llms-full.txt:L1284-L1292, tests/test_imputation.py:L2441-L2470, tests/test_imputation.py:L2812-L2832.

Performance

No findings. The changed code is validation/metadata plumbing plus early degenerate-bootstrap exits.

Maintainability

No findings. The new vcov_type validation and bootstrap NaN helper are localized and follow the existing narrow-contract pattern used on the other IF-based estimators.

Tech Debt

No findings. The vcov_type="conley" exclusion is explicitly tracked in TODO.md, so I did not count it against this PR. References: TODO.md:L107-L114.

Security

No findings.

Documentation/Tests

No findings on the prior re-review item: the guide update appears to resolve the earlier missing-results-surface documentation. References: diff_diff/guides/llms-full.txt:L377-L404, diff_diff/guides/llms-full.txt:L1284-L1292.
Severity: P3. Impact: review is static-only. I could not run the targeted Imputation tests because this environment lacks numpy, pandas, scipy, and pytest. Concrete fix: rerun tests/test_imputation.py in CI or a normal project environment.

github-actions · 2026-05-25T18:22:42Z

🔁 AI review rerun (requested by @igerber)

Head SHA: b295273f9f792b94e36f0b2342b788c1b231296d

Overall Assessment

✅ Looks good

Highest unmitigated severity is P3.

Executive Summary

Re-review focus: the prior replicate-survey metadata issue is fixed. ImputationDiD.fit() now suppresses cluster_name / n_clusters for any survey-backed fit, and summary() no longer leaks misleading cluster metadata on replicate-weight designs. References: diff_diff/imputation.py:L889-L909, diff_diff/imputation_results.py:L231-L263, tests/test_imputation.py:L2834-L2861.
I did not find an unmitigated P0/P1 in the changed estimator, variance, bootstrap, or results-surface paths.
Methodology alignment looks sound: this PR narrows the public vcov_type surface to the existing Theorem 3 influence-function variance route and documents that library choice in REGISTRY.md, rather than changing the underlying estimator math. References: docs/methodology/REGISTRY.md:L348-L363, docs/methodology/REGISTRY.md:L1313-L1315, diff_diff/imputation.py:L2463-L2505. citeturn2view0
The new degenerate-bootstrap guard is correctly fail-closed: n_clusters < 2 / n_psu < 2 now returns NaN SE / p-value / CI payloads and the fit path propagates those NaNs through overall, event-study, and group outputs. References: diff_diff/imputation_bootstrap.py:L275-L395, diff_diff/imputation.py:L846-L887, tests/test_imputation.py:L2582-L2630.
Validation here was static only. This environment does not have numpy, pandas, scipy, or pytest, so I could not rerun the added Imputation tests locally.

Methodology

Severity: P3 (informational). Impact: None for correctness. The narrowed vcov_type={"hc1"} contract is a documented library-surface restriction around ImputationDiD’s existing Theorem 3 influence-function variance, not an undocumented deviation from Borusyak-Jaravel-Spiess. Theorem 3 defines the conservative variance from unit/cluster IF contributions and extends the same construction to vector-valued estimands, which is the relevant source-material anchor for this PR. Concrete fix: None. References: docs/methodology/REGISTRY.md:L348-L363, docs/methodology/REGISTRY.md:L1313-L1315, diff_diff/imputation.py:L2463-L2505. citeturn2view0

Code Quality

No findings. The previous metadata leak is resolved in the changed code and covered by regression tests. References: diff_diff/imputation.py:L889-L909, diff_diff/imputation_results.py:L231-L263, tests/test_imputation.py:L2812-L2861.

Performance

No findings. The new bootstrap checks are early exits on degenerate designs and do not add meaningful steady-state cost. References: diff_diff/imputation_bootstrap.py:L359-L395.

Maintainability

No findings. The vcov_type contract is centralized in one validator and reused at both construction and fit-time; the results-surface additions are similarly localized. References: diff_diff/imputation.py:L177-L183, diff_diff/imputation.py:L250-L253, diff_diff/imputation.py:L2438-L2505, diff_diff/imputation_results.py:L476-L521.

Tech Debt

Severity: P3 (informational). Impact: None for this PR. vcov_type="conley" remains unsupported, but that deferral is explicitly tracked, so this PR fail-closes instead of silently advertising unsupported methodology. Concrete fix: None in this PR; follow the tracked TODO. References: TODO.md:L107-L114, docs/methodology/REGISTRY.md:L1313-L1315.

Security

No findings.

Documentation/Tests

Severity: P3 (informational). Impact: Reviewer confidence only. The docs and test updates are appropriately broad for the changed surface, including the guide/registry updates and targeted regressions for cluster, survey, bootstrap, and NaN-propagation behavior, but I could not execute them locally because the scientific Python stack is missing in this environment. Concrete fix: rely on CI / a normal project environment for runtime verification. References: diff_diff/guides/llms-full.txt:L377-L408, diff_diff/guides/llms-full.txt:L1284-L1292, tests/test_imputation.py:L2416-L2861.

github-actions · 2026-05-25T20:16:43Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 1249ded6434656c027680684bdb18aafdceef1a6

Overall Assessment

✅ Looks good

Highest unmitigated severity is P3. I did not find an unmitigated P0 or P1 in the changed ImputationDiD estimator, variance, bootstrap, or results-surface paths.

Executive Summary

Re-review focus: the prior survey-metadata issue remains resolved. Survey-backed ImputationDiD fits now suppress cluster_name / n_clusters, and the regression tests cover both analytical and replicate-weight survey paths. References: diff_diff/imputation.py:L889-L935, diff_diff/imputation_results.py:L226-L263, tests/test_imputation.py:L2812-L2858.
Affected method: ImputationDiD. The PR narrows the public vcov_type surface to the existing Theorem 3 influence-function variance path rather than changing the estimator math. Borusyak-Jaravel-Spiess’ Theorem 3 is a conservative clustered-variance result, and the paper notes the same strategy extends to vector-valued estimands. citeturn0view0
The new bootstrap degenerate-design guard is fail-closed: n_clusters < 2 / n_psu < 2 now return NaN SE / p-value / CI payloads and propagate those NaNs through overall, event-study, and group outputs instead of silently reporting zero-ish SEs. References: diff_diff/imputation_bootstrap.py:L275-L395, diff_diff/imputation.py:L846-L887, tests/test_imputation.py:L2584-L2625.
The cluster + replicate_weights rejection is methodologically appropriate and documented, so it is not a blocker under the review rules. References: diff_diff/imputation.py:L302-L323, docs/methodology/REGISTRY.md:L1314-L1315.
Runtime validation was limited: this environment does not have pytest, numpy, pandas, or scipy, so I could not execute the added tests.

Methodology

Severity: P3 (informational). Impact: None for correctness. The PR’s vcov_type={"hc1"} narrowing aligns with the library’s documented IF-vs-sandwich taxonomy and does not alter the underlying BJS estimator or Theorem 3 variance computation. The paper’s theorem is the right source anchor for conservative clustered inference here, and the library-side rejection of {classical, hc2, hc2_bm} plus deferred conley support is explicitly documented in the registry, so this is not an undocumented methodology deviation. Concrete fix: None. References: diff_diff/imputation.py:L148-L183, diff_diff/imputation.py:L250-L323, diff_diff/imputation.py:L2463-L2505, docs/methodology/REGISTRY.md:L346-L363, docs/methodology/REGISTRY.md:L1313-L1315. citeturn0view0

Code Quality

No findings. The prior survey-backed metadata leak remains fixed in the changed code. References: diff_diff/imputation.py:L889-L935, diff_diff/imputation_results.py:L226-L263, tests/test_imputation.py:L2812-L2858.

Performance

No findings. The added vcov_type validation and degenerate-bootstrap checks are front-door/early-exit guards with negligible steady-state cost. References: diff_diff/imputation.py:L250-L253, diff_diff/imputation_bootstrap.py:L366-L395.

Maintainability

No findings. The vcov_type contract is centralized in one validator and reused at both construction and fit-time, and the new metadata is threaded through a single results object plus to_dict(). References: diff_diff/imputation.py:L2438-L2505, diff_diff/imputation_results.py:L476-L521.

Tech Debt

Severity: P3 (informational). Impact: None for this PR. vcov_type="conley" remains unsupported for ImputationDiD, but that limitation is explicitly tracked in TODO.md and documented in REGISTRY.md, so it is mitigated under the project rules. Concrete fix: None in this PR; follow the tracked TODO.md row. References: TODO.md:L107-L114, docs/methodology/REGISTRY.md:L1313-L1315.

Security

No findings.

Documentation/Tests

Severity: P3 (informational). Impact: Reviewer confidence only. The documentation and tests are appropriately broad for the changed surface: constructor/guide text, registry notes, and targeted regressions for default/cluster/survey/replicate/bootstrap behavior, fit-time revalidation, replicate-weight rejection, and NaN propagation. I could not execute them locally because the scientific Python stack is missing in this environment. Concrete fix: rely on CI or a project environment with pytest, numpy, pandas, and scipy. References: diff_diff/guides/llms-full.txt:L377-L408, diff_diff/guides/llms-full.txt:L1284-L1292, tests/test_imputation.py:L2355-L2906.

@staticmethod

… 3 (Phase 1b interstitial #3) Phase 1b interstitial #3 for ImputationDiD. Mirrors the CallawaySantAnna (PR #487) + TripleDifference (PR #488) template for IF-based estimators: vcov_type is permanently narrow to {"hc1"} because the per-unit influence function aggregation (Borusyak-Jaravel-Spiess 2024 Theorem 3) has no single design matrix on which hat-matrix leverage or Bell-McCaffrey Satterthwaite DOF can be defined. Source surface (diff_diff/): - imputation.py: vcov_type param + @staticmethod _validate_vcov_type + fit()-time revalidation + cluster+replicate-weights NotImplementedError guard + Results metadata resolution (cluster_name=unit by default for the Theorem 3 unit-clustered IF variance; suppressed under ANY survey design — analytical OR replicate — because replicate variance ignores cluster/PSU entirely) - imputation_results.py: vcov_type/cluster_name/n_clusters fields, new to_dict() method, summary() variance line via shared _format_vcov_label (default cluster=None renders "CR1 cluster-robust at <unit>, G=<n>"; bootstrap fits suppress the analytical label and render "Inference method: bootstrap" instead, mirroring DiDResults.summary() gate at results.py:213-226) - imputation_bootstrap.py: dual-site n_clusters<2 / n_psu<2 NaN guards via new _build_nan_bootstrap_results helper (closes the BLAS-roundoff zero-SE class predicted to recur on IF-based estimators) Tests: 42 new tests in TestImputationDiDVcovType covering default / cluster / TSL-survey / replicate-survey + bootstrap × cluster + bootstrap × survey bit-equality (ALL parametrized over aggregate ∈ {None, "event_study", "group"} with per-horizon and per-group SE override branches pinned); fit()-time revalidation after set_params bypass; bootstrap n_psu<2 / n_clusters<2 NaN propagation including coef_var NaN; pretrends=True × vcov_type='hc1' × cluster bit-equality; introspection (default attr, get_params, Results carries, to_dict, summary label default+cluster+bootstrap-suppressed, cluster_name suppression under both analytical AND replicate survey, fit-clone idempotence, convenience function); input rejection on classical/hc2/hc2_bm/conley/ unknown with distinct methodology-keyword pins; cluster+replicate rejection. Full pytest tests/test_imputation.py: 128 passed. Docs: - REGISTRY.md: IF-based taxonomy adds ImputationDiD to "Enforced today" tier; ImputationDiD section gains 4 new Notes (vcov_type contract, cluster+replicate fail-closed, bootstrap n<2 NaN, default unit-cluster CR1 rendering) - CHANGELOG.md: [Unreleased] entry - TODO.md: Phase 1b row narrowed to TwoStageDiD + EfficientDiD; ImputationDiD Conley follow-up row added - guides/llms-full.txt: vcov_type + pretrends signature drift fix + shared staggered-results section advertises new variance metadata fields and to_dict() Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-25T22:17:54Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 098dc7c08a21ed7e3e824cfe13190afc664c5401

Overall Assessment

✅ Looks good

Highest unmitigated severity is P3. I did not find an unmitigated P0 or P1 in the changed ImputationDiD variance, bootstrap, or result-metadata paths.

Executive Summary

Re-review focus: the prior survey-metadata area still looks resolved. Survey-backed fits suppress cluster_name / n_clusters, and summary() suppresses the analytical variance-family label under survey designs. diff_diff/imputation.py:L889-L935, diff_diff/imputation_results.py:L226-L264, tests/test_imputation.py:L2812-L2861
The PR narrows the public vcov_type contract to the already-implemented Theorem 3 variance path rather than changing estimator math. Validation is wired at construction time and re-checked at fit time after set_params() mutation. diff_diff/imputation.py:L148-L189, diff_diff/imputation.py:L250-L253, diff_diff/imputation.py:L2438-L2505
The new degenerate-bootstrap guard is fail-closed: <2 clusters or PSUs now yield NaN SE / p-value / CI payloads, including per-horizon and per-group bootstrap dicts, avoiding misleading near-zero SEs. diff_diff/imputation_bootstrap.py:L275-L323, diff_diff/imputation_bootstrap.py:L366-L395, tests/test_imputation.py:L2584-L2630
The cluster + replicate_weights rejection is methodologically appropriate and documented, so it is not a blocker. diff_diff/imputation.py:L302-L323, diff_diff/survey.py:L104-L110, docs/methodology/REGISTRY.md:L1314-L1314
I could not execute the added tests here because pytest, numpy, pandas, and scipy are not installed in this environment.

Methodology

Severity: P3 (informational). Impact: None for correctness. The new vcov_type={"hc1"} boundary is a documented library-side API contract, not an undocumented departure from Borusyak-Jaravel-Spiess. Theorem 3’s conservative variance remains the cluster-summed square-sum implemented in _compute_conservative_variance(), and the paper notes the same asymptotic argument extends to vector-valued estimands, so reusing that path for overall, event-study, and group inference is methodologically consistent. Concrete fix: None. diff_diff/imputation.py:L16-L23, diff_diff/imputation.py:L1435-L1505, diff_diff/imputation.py:L2463-L2505, docs/methodology/REGISTRY.md:L349-L363, docs/methodology/REGISTRY.md:L1313-L1315 citeturn4view2turn4view3
Severity: P3 (informational). Impact: None for correctness. Rejecting cluster= together with replicate-weight survey designs is a proper fail-closed guard, and the limitation is explicitly documented in the registry. Concrete fix: None. diff_diff/imputation.py:L302-L323, docs/methodology/REGISTRY.md:L1314-L1314, tests/test_imputation.py:L2657-L2675

Code Quality

No findings. The new public parameter is fully propagated through constructor validation, get_params(), results metadata, to_dict(), and the convenience wrapper. diff_diff/imputation.py:L148-L189, diff_diff/imputation.py:L2438-L2595, diff_diff/imputation_results.py:L145-L152, diff_diff/imputation_results.py:L476-L521

Performance

No findings. The added work is front-door validation and early-exit degeneracy checks with negligible steady-state cost. diff_diff/imputation.py:L250-L253, diff_diff/imputation_bootstrap.py:L366-L395

Maintainability

No findings. The contract is centralized in _validate_vcov_type(), and the result-surface additions are localized to one result class plus one summary-label helper path. diff_diff/imputation.py:L2463-L2505, diff_diff/imputation_results.py:L226-L264, diff_diff/imputation_results.py:L476-L521

Tech Debt

Severity: P3 (informational). Impact: None in this PR. vcov_type="conley" remains unsupported for ImputationDiD, but that follow-up is explicitly tracked in TODO.md and documented in REGISTRY.md, so it is mitigated under the review rules. Concrete fix: None in this PR; keep the tracked follow-up. TODO.md:L107-L114, docs/methodology/REGISTRY.md:L1313-L1313

Security

No findings.

Documentation/Tests

Severity: P3 (informational). Impact: Reviewer confidence only. The changed surface is documented in the guide and registry, and the new test class covers constructor rejection, fit-time revalidation, cluster/survey/replicate/bootstrap paths, NaN propagation, summary/to_dict metadata, and the replicate-weight fail-closed guard. I could not run those tests locally because the scientific Python stack is absent in this environment. Concrete fix: rely on CI or rerun in a project environment with pytest, numpy, pandas, and scipy. tests/test_imputation.py:L2184-L2906, diff_diff/guides/llms-full.txt:L377-L408, diff_diff/guides/llms-full.txt:L1284-L1292, docs/methodology/REGISTRY.md:L1313-L1315

igerber added ready-for-ci Triggers CI test workflows and removed ready-for-ci Triggers CI test workflows labels May 25, 2026

igerber force-pushed the feature/imputation-did-vcov-type-phase1b branch from b295273 to 1249ded Compare May 25, 2026 20:09

igerber added the ready-for-ci Triggers CI test workflows label May 25, 2026

igerber force-pushed the feature/imputation-did-vcov-type-phase1b branch from 1249ded to 098dc7c Compare May 25, 2026 22:11

igerber merged commit f034928 into main May 25, 2026
26 checks passed

igerber deleted the feature/imputation-did-vcov-type-phase1b branch May 25, 2026 23:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ImputationDiD: thread vcov_type as narrow {hc1} contract per BJS Theorem 3 (Phase 1b interstitial #3)#492

ImputationDiD: thread vcov_type as narrow {hc1} contract per BJS Theorem 3 (Phase 1b interstitial #3)#492
igerber merged 1 commit into
mainfrom
feature/imputation-did-vcov-type-phase1b

igerber commented May 25, 2026

Uh oh!

github-actions Bot commented May 25, 2026

Uh oh!

igerber commented May 25, 2026

Uh oh!

github-actions Bot commented May 25, 2026

Uh oh!

github-actions Bot commented May 25, 2026

Uh oh!

github-actions Bot commented May 25, 2026

Uh oh!

github-actions Bot commented May 25, 2026

Uh oh!

github-actions Bot commented May 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

igerber commented May 25, 2026

Summary

Methodology references

Validation

Security / privacy

Uh oh!

github-actions Bot commented May 25, 2026

Uh oh!

igerber commented May 25, 2026

Uh oh!

github-actions Bot commented May 25, 2026

Uh oh!

github-actions Bot commented May 25, 2026

Uh oh!

github-actions Bot commented May 25, 2026

Uh oh!

github-actions Bot commented May 25, 2026

Overall Assessment

Executive Summary

Methodology

Code Quality

Performance

Maintainability

Tech Debt

Security

Documentation/Tests

Uh oh!

github-actions Bot commented May 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant