Skip to content

ImputationDiD: thread vcov_type as narrow {hc1} contract per BJS Theorem 3 (Phase 1b interstitial #3)#492

Merged
igerber merged 1 commit into
mainfrom
feature/imputation-did-vcov-type-phase1b
May 25, 2026
Merged

ImputationDiD: thread vcov_type as narrow {hc1} contract per BJS Theorem 3 (Phase 1b interstitial #3)#492
igerber merged 1 commit into
mainfrom
feature/imputation-did-vcov-type-phase1b

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented May 25, 2026

Summary

Methodology references

  • Method name(s): ImputationDiD (Phase 1b interstitial Add robust parallel trends testing with Wasserstein distance #3vcov_type contract narrowing + summary label correction + bootstrap NaN guard).
  • Paper / source link(s): Borusyak, K., Jaravel, X., & Spiess, J. (2024). Revisiting Event-Study Designs: Robust and Efficient Estimation. Review of Economic Studies, 91(6), 3253-3285. Theorem 3 (equation 7) conservative variance.
  • Any intentional deviations from the source (and why): The narrow vcov_type = {"hc1"} contract is library-architectural, not paper-prescribed. Analytical-sandwich families {classical, hc2, hc2_bm} and conley spatial-HAC are rejected at __init__ because the per-unit IF aggregation has no equivalent single design matrix on which hat-matrix leverage or Bell-McCaffrey Satterthwaite DOF can be defined. Documented in docs/methodology/REGISTRY.md IF-vs-sandwich taxonomy + two new ImputationDiD **Note:** bullets.

Validation

  • Tests added/updated: tests/test_imputation.py::TestImputationDiDVcovType (+41 new tests across 7 surfaces — default / cluster / TSL-survey / replicate-survey + bootstrap × cluster + bootstrap × survey bit-equality ALL parametrized over aggregate ∈ {None, "event_study", "group"} with per-horizon and per-group SE override branches pinned; fit()-time revalidation after set_params bypass; bootstrap n_psu<2 / n_clusters<2 NaN propagation including coef_var NaN; pretrends=True × vcov_type='hc1' × cluster bit-equality; introspection on default attr / get_params / Results carries / to_dict / summary label default+cluster+bootstrap-suppressed / cluster_name suppression under survey / fit-clone idempotence / convenience function; input rejection on classical/hc2/hc2_bm/conley/unknown with distinct methodology-keyword pins; cluster + replicate_weights rejection). Full pytest tests/test_imputation.py suite: 128 passed.
  • Backtest / simulation / notebook evidence (if applicable): N/A — bit-equal preservation of the existing variance machinery (no methodology change); R parity unchanged.

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

@github-actions
Copy link
Copy Markdown

Overall Assessment
✅ Looks good

Executive Summary

  • No unmitigated P0/P1 issues found in the changed ImputationDiD variance path.
  • The new permanent vcov_type={"hc1"} contract is documented in the Methodology Registry and enforced at both construction and fit time; I did not find an undocumented deviation from Borusyak-Jaravel-Spiess Theorem 3.
  • The new bootstrap n_clusters<2 / n_psu<2 guard correctly returns NaN inference instead of misleading near-zero SEs, and I did not find a partial-NaN propagation bug in the changed path.
  • cluster + replicate_weights now fails closed instead of silently no-oping, which is the right behavior for replicate-weight survey designs.
  • Test review is static only: this environment does not have numpy or pytest, so I could not rerun the added suite locally.

Methodology

  • Severity: P3. Impact: none; the permanent vcov_type={"hc1"} narrowing, unit-cluster CR1 labeling, replicate-weight fail-closed behavior, and the degenerate-bootstrap NaN handling are all documented and match the implementation, so I do not see a methodology defect here. Concrete fix: none. References: docs/methodology/REGISTRY.md:L348-L363, docs/methodology/REGISTRY.md:L1313-L1315, diff_diff/imputation.py:L148-L177, diff_diff/imputation.py:L250-L253, diff_diff/imputation.py:L302-L323, diff_diff/imputation_bootstrap.py:L275-L395, diff_diff/imputation_results.py:L231-L262.

Code Quality

  • Severity: none. Impact: no new partial-NaN or incomplete-parameter-propagation issue found in the changed path. Concrete fix: none. vcov_type is threaded through constructor state, get_params(), fit-time validation, result metadata, summary labeling, and the convenience wrapper in diff_diff/imputation.py:L148-L177, diff_diff/imputation.py:L889-L934, diff_diff/imputation.py:L2436-L2503, and diff_diff/imputation.py:L2522-L2588.

Performance

  • Severity: none. Impact: the added work is limited to lightweight validation, metadata threading, and early-exit guards; no meaningful regression risk is visible in the diff. Concrete fix: none.

Maintainability

  • Severity: none. Impact: no maintainability blocker stood out in the touched code. Concrete fix: none.

Tech Debt

  • Severity: P3. Impact: none; the remaining vcov_type="conley" work is explicitly tracked, so that rejection path is documented deferred work rather than an untracked gap. Concrete fix: none. Reference: TODO.md:L114-L114.

Security

  • Severity: none. Impact: no security findings in the changed files. Concrete fix: none.

Documentation/Tests

  • Severity: none. Impact: the added tests cover the main changed surfaces: default/cluster/survey/replicate parity, fit-time revalidation, fail-closed cluster + replicate_weights, and the n_clusters<2 / n_psu<2 NaN path. Concrete fix: none. Reference: tests/test_imputation.py:L2354-L2875.
  • Severity: P3. Impact: I could not execute the tests in this review environment because python here lacks numpy and pytest, so runtime validation is static rather than a local rerun. Concrete fix: rerun the targeted Imputation suite in the normal project test environment or CI.

@igerber
Copy link
Copy Markdown
Owner Author

igerber commented May 25, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 9529bb2d61418eab9e5ed59cc6fed5079d514cd4


Overall Assessment

✅ Looks good

Executive Summary

  • No unmitigated P0/P1 issues found in the changed ImputationDiD variance path, and I did not find a regression relative to the prior clean re-review baseline.
  • The new permanent vcov_type={"hc1"} contract is documented in the Methodology Registry and implemented consistently at construction, fit-time revalidation, results metadata, and summary rendering, so I do not see an undocumented deviation from Borusyak-Jaravel-Spiess Theorem 3.
  • The new bootstrap n_clusters<2 / n_psu<2 fail-closed path correctly propagates NaN inference instead of surfacing misleading degenerate bootstrap output.
  • cluster + replicate_weights now fails closed, and the deferred conley work is properly tracked in TODO.md, so neither is a blocker.
  • One P3 documentation gap remains: the LLM guide’s shared staggered-results section was not updated for ImputationDiD’s new to_dict() surface and variance-metadata fields.
  • Runtime validation here is static only because this environment does not have numpy, pandas, scipy, or pytest.

Methodology

  • Severity: P3. Impact: none. The PR’s methodology-affecting changes are documented rather than silent: the narrow vcov_type={"hc1"} contract, unit-cluster labeling for cluster=None, replicate-weight fail-closed behavior, and degenerate-bootstrap NaN handling are all explicitly captured in the registry, and the implementation matches that documented contract. Concrete fix: none. References: docs/methodology/REGISTRY.md:L348-L363, docs/methodology/REGISTRY.md:L1309-L1315, diff_diff/imputation.py:L148-L183, diff_diff/imputation.py:L250-L315, diff_diff/imputation.py:L889-L929, diff_diff/imputation_bootstrap.py:L275-L395, diff_diff/imputation_results.py:L231-L262.

Code Quality

  • Severity: none. Impact: none. I did not find a new incomplete-parameter-propagation or NaN-propagation defect in the changed code; vcov_type is carried through constructor state, fit-time validation, get_params(), result metadata, and the convenience wrapper. Concrete fix: none. References: diff_diff/imputation.py:L148-L183, diff_diff/imputation.py:L250-L315, diff_diff/imputation.py:L2436-L2589.

Performance

  • Severity: none. Impact: none. The added work is limited to lightweight validation, metadata plumbing, and early-exit degenerate-bootstrap guards; I do not see a meaningful regression risk in the touched path. Concrete fix: none. References: diff_diff/imputation_bootstrap.py:L325-L395.

Maintainability

  • Severity: none. Impact: none. The new behavior follows the existing Callaway-Sant’Anna / TripleDifference narrow-contract pattern and stays localized to estimator validation, bootstrap handling, and results formatting. Concrete fix: none.

Tech Debt

  • Severity: P3. Impact: none. The remaining vcov_type="conley" gap is explicitly tracked in TODO.md, so this is documented deferred work rather than an untracked omission introduced by the PR. Concrete fix: none for this PR; follow the existing TODO row. References: TODO.md:L107-L114.

Security

  • Severity: none. Impact: none. I did not find a security issue in the changed files. Concrete fix: none.

Documentation/Tests

  • Severity: P3. Impact: low. ImputationDiDResults now exposes new public variance metadata and a new to_dict() method, but the shared staggered-results guide still advertises only summary(), print_summary(), and to_dataframe(), so guide consumers will miss the newly added surface. Concrete fix: update the shared results block or add an Imputation-specific results subsection covering to_dict(), vcov_type, cluster_name, and n_clusters. References: diff_diff/guides/llms-full.txt:L1265-L1284, diff_diff/imputation_results.py:L145-L152, diff_diff/imputation_results.py:L476-L521.
  • Severity: none. Impact: none on PR correctness. The added tests are broad and well-targeted for the changed surfaces: constructor/fit-time validation, default/cluster/survey/replicate parity, bootstrap parity, degenerate-bootstrap NaN propagation, summary labels, and convenience-wrapper threading. Concrete fix: none. References: tests/test_imputation.py:L2354-L2877.
  • Severity: P3. Impact: review is static-only. I could not execute the targeted Imputation test module locally because this environment lacks numpy, pandas, scipy, and pytest. Concrete fix: rerun tests/test_imputation.py in the normal project environment or CI.

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: cfbfeaeb438251ff68ed49b1821acaa4aa49b4bf


Overall Assessment

✅ Looks good

Highest unmitigated severity is P2.

Executive Summary

  • Re-review focus: the prior P3 documentation gap is addressed; diff_diff/guides/llms-full.txt now documents ImputationDiD.vcov_type, the new variance metadata, and to_dict().
  • I did not find an unmitigated P0/P1 in the changed estimator/math/inference paths.
  • The new bootstrap degenerate-design guard is correctly fail-closed: the new helper returns NaN SE/CI/p-value payloads and the fit path propagates those through overall, event-study, and group inference.
  • One P2 remains: replicate-weight survey fits still leak cluster_name/n_clusters metadata and can print Number of clusters, even though the PR’s new public contract says those fields are suppressed under survey designs.
  • Validation here is static only; this environment does not have numpy, pandas, scipy, or pytest.

Methodology

  • No findings. The narrow vcov_type={"hc1"} contract is consistent with the existing ImputationDiD variance implementation: _compute_conservative_variance() still uses the Theorem 3 clustered sum-of-squares path, and REGISTRY.md explicitly documents the broader cluster=<col> / survey routing as a library synthesis rather than a paper change. References: docs/methodology/REGISTRY.md:L348-L363, docs/methodology/REGISTRY.md:L1309-L1315, diff_diff/imputation.py:L1454-L1503, diff_diff/imputation.py:L2462-L2503. (academic.oup.com)

Code Quality

  • Severity: P2. Impact: replicate-weight survey fits violate the new results-metadata contract introduced by this PR. ImputationDiD.fit() suppresses cluster_name / n_clusters only when resolved_survey.psu is not None, so replicate-weight survey designs keep cluster_name="unit" (or the bare cluster= value) and n_clusters; summary() then still prints Number of clusters even though the guide now says both fields are None under survey designs. This does not change estimates, but it misreports the new public surface. Concrete fix: change the suppression gate to resolved_survey is not None (or survey_metadata is not None) instead of resolved_survey.psu is not None, and add a replicate-weight regression asserting cluster_name is None, n_clusters is None, and summary omits the cluster count. References: diff_diff/imputation.py:L898-L907, diff_diff/imputation_results.py:L245-L262, diff_diff/guides/llms-full.txt:L1284-L1292, tests/test_imputation.py:L2441-L2470, tests/test_imputation.py:L2812-L2832.

Performance

  • No findings. The changed code is validation/metadata plumbing plus early degenerate-bootstrap exits.

Maintainability

  • No findings. The new vcov_type validation and bootstrap NaN helper are localized and follow the existing narrow-contract pattern used on the other IF-based estimators.

Tech Debt

  • No findings. The vcov_type="conley" exclusion is explicitly tracked in TODO.md, so I did not count it against this PR. References: TODO.md:L107-L114.

Security

  • No findings.

Documentation/Tests

  • No findings on the prior re-review item: the guide update appears to resolve the earlier missing-results-surface documentation. References: diff_diff/guides/llms-full.txt:L377-L404, diff_diff/guides/llms-full.txt:L1284-L1292.
  • Severity: P3. Impact: review is static-only. I could not run the targeted Imputation tests because this environment lacks numpy, pandas, scipy, and pytest. Concrete fix: rerun tests/test_imputation.py in CI or a normal project environment.

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: b295273f9f792b94e36f0b2342b788c1b231296d


Overall Assessment

✅ Looks good

Highest unmitigated severity is P3.

Executive Summary

  • Re-review focus: the prior replicate-survey metadata issue is fixed. ImputationDiD.fit() now suppresses cluster_name / n_clusters for any survey-backed fit, and summary() no longer leaks misleading cluster metadata on replicate-weight designs. References: diff_diff/imputation.py:L889-L909, diff_diff/imputation_results.py:L231-L263, tests/test_imputation.py:L2834-L2861.
  • I did not find an unmitigated P0/P1 in the changed estimator, variance, bootstrap, or results-surface paths.
  • Methodology alignment looks sound: this PR narrows the public vcov_type surface to the existing Theorem 3 influence-function variance route and documents that library choice in REGISTRY.md, rather than changing the underlying estimator math. References: docs/methodology/REGISTRY.md:L348-L363, docs/methodology/REGISTRY.md:L1313-L1315, diff_diff/imputation.py:L2463-L2505. citeturn2view0
  • The new degenerate-bootstrap guard is correctly fail-closed: n_clusters < 2 / n_psu < 2 now returns NaN SE / p-value / CI payloads and the fit path propagates those NaNs through overall, event-study, and group outputs. References: diff_diff/imputation_bootstrap.py:L275-L395, diff_diff/imputation.py:L846-L887, tests/test_imputation.py:L2582-L2630.
  • Validation here was static only. This environment does not have numpy, pandas, scipy, or pytest, so I could not rerun the added Imputation tests locally.

Methodology

  • Severity: P3 (informational). Impact: None for correctness. The narrowed vcov_type={"hc1"} contract is a documented library-surface restriction around ImputationDiD’s existing Theorem 3 influence-function variance, not an undocumented deviation from Borusyak-Jaravel-Spiess. Theorem 3 defines the conservative variance from unit/cluster IF contributions and extends the same construction to vector-valued estimands, which is the relevant source-material anchor for this PR. Concrete fix: None. References: docs/methodology/REGISTRY.md:L348-L363, docs/methodology/REGISTRY.md:L1313-L1315, diff_diff/imputation.py:L2463-L2505. citeturn2view0

Code Quality

  • No findings. The previous metadata leak is resolved in the changed code and covered by regression tests. References: diff_diff/imputation.py:L889-L909, diff_diff/imputation_results.py:L231-L263, tests/test_imputation.py:L2812-L2861.

Performance

  • No findings. The new bootstrap checks are early exits on degenerate designs and do not add meaningful steady-state cost. References: diff_diff/imputation_bootstrap.py:L359-L395.

Maintainability

  • No findings. The vcov_type contract is centralized in one validator and reused at both construction and fit-time; the results-surface additions are similarly localized. References: diff_diff/imputation.py:L177-L183, diff_diff/imputation.py:L250-L253, diff_diff/imputation.py:L2438-L2505, diff_diff/imputation_results.py:L476-L521.

Tech Debt

  • Severity: P3 (informational). Impact: None for this PR. vcov_type="conley" remains unsupported, but that deferral is explicitly tracked, so this PR fail-closes instead of silently advertising unsupported methodology. Concrete fix: None in this PR; follow the tracked TODO. References: TODO.md:L107-L114, docs/methodology/REGISTRY.md:L1313-L1315.

Security

  • No findings.

Documentation/Tests

  • Severity: P3 (informational). Impact: Reviewer confidence only. The docs and test updates are appropriately broad for the changed surface, including the guide/registry updates and targeted regressions for cluster, survey, bootstrap, and NaN-propagation behavior, but I could not execute them locally because the scientific Python stack is missing in this environment. Concrete fix: rely on CI / a normal project environment for runtime verification. References: diff_diff/guides/llms-full.txt:L377-L408, diff_diff/guides/llms-full.txt:L1284-L1292, tests/test_imputation.py:L2416-L2861.

@igerber igerber added ready-for-ci Triggers CI test workflows and removed ready-for-ci Triggers CI test workflows labels May 25, 2026
@igerber igerber force-pushed the feature/imputation-did-vcov-type-phase1b branch from b295273 to 1249ded Compare May 25, 2026 20:09
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 1249ded6434656c027680684bdb18aafdceef1a6


Overall Assessment

✅ Looks good

Highest unmitigated severity is P3. I did not find an unmitigated P0 or P1 in the changed ImputationDiD estimator, variance, bootstrap, or results-surface paths.

Executive Summary

  • Re-review focus: the prior survey-metadata issue remains resolved. Survey-backed ImputationDiD fits now suppress cluster_name / n_clusters, and the regression tests cover both analytical and replicate-weight survey paths. References: diff_diff/imputation.py:L889-L935, diff_diff/imputation_results.py:L226-L263, tests/test_imputation.py:L2812-L2858.
  • Affected method: ImputationDiD. The PR narrows the public vcov_type surface to the existing Theorem 3 influence-function variance path rather than changing the estimator math. Borusyak-Jaravel-Spiess’ Theorem 3 is a conservative clustered-variance result, and the paper notes the same strategy extends to vector-valued estimands. citeturn0view0
  • The new bootstrap degenerate-design guard is fail-closed: n_clusters < 2 / n_psu < 2 now return NaN SE / p-value / CI payloads and propagate those NaNs through overall, event-study, and group outputs instead of silently reporting zero-ish SEs. References: diff_diff/imputation_bootstrap.py:L275-L395, diff_diff/imputation.py:L846-L887, tests/test_imputation.py:L2584-L2625.
  • The cluster + replicate_weights rejection is methodologically appropriate and documented, so it is not a blocker under the review rules. References: diff_diff/imputation.py:L302-L323, docs/methodology/REGISTRY.md:L1314-L1315.
  • Runtime validation was limited: this environment does not have pytest, numpy, pandas, or scipy, so I could not execute the added tests.

Methodology

  • Severity: P3 (informational). Impact: None for correctness. The PR’s vcov_type={"hc1"} narrowing aligns with the library’s documented IF-vs-sandwich taxonomy and does not alter the underlying BJS estimator or Theorem 3 variance computation. The paper’s theorem is the right source anchor for conservative clustered inference here, and the library-side rejection of {classical, hc2, hc2_bm} plus deferred conley support is explicitly documented in the registry, so this is not an undocumented methodology deviation. Concrete fix: None. References: diff_diff/imputation.py:L148-L183, diff_diff/imputation.py:L250-L323, diff_diff/imputation.py:L2463-L2505, docs/methodology/REGISTRY.md:L346-L363, docs/methodology/REGISTRY.md:L1313-L1315. citeturn0view0

Code Quality

  • No findings. The prior survey-backed metadata leak remains fixed in the changed code. References: diff_diff/imputation.py:L889-L935, diff_diff/imputation_results.py:L226-L263, tests/test_imputation.py:L2812-L2858.

Performance

  • No findings. The added vcov_type validation and degenerate-bootstrap checks are front-door/early-exit guards with negligible steady-state cost. References: diff_diff/imputation.py:L250-L253, diff_diff/imputation_bootstrap.py:L366-L395.

Maintainability

  • No findings. The vcov_type contract is centralized in one validator and reused at both construction and fit-time, and the new metadata is threaded through a single results object plus to_dict(). References: diff_diff/imputation.py:L2438-L2505, diff_diff/imputation_results.py:L476-L521.

Tech Debt

  • Severity: P3 (informational). Impact: None for this PR. vcov_type="conley" remains unsupported for ImputationDiD, but that limitation is explicitly tracked in TODO.md and documented in REGISTRY.md, so it is mitigated under the project rules. Concrete fix: None in this PR; follow the tracked TODO.md row. References: TODO.md:L107-L114, docs/methodology/REGISTRY.md:L1313-L1315.

Security

  • No findings.

Documentation/Tests

  • Severity: P3 (informational). Impact: Reviewer confidence only. The documentation and tests are appropriately broad for the changed surface: constructor/guide text, registry notes, and targeted regressions for default/cluster/survey/replicate/bootstrap behavior, fit-time revalidation, replicate-weight rejection, and NaN propagation. I could not execute them locally because the scientific Python stack is missing in this environment. Concrete fix: rely on CI or a project environment with pytest, numpy, pandas, and scipy. References: diff_diff/guides/llms-full.txt:L377-L408, diff_diff/guides/llms-full.txt:L1284-L1292, tests/test_imputation.py:L2355-L2906.

@igerber igerber added the ready-for-ci Triggers CI test workflows label May 25, 2026
… 3 (Phase 1b interstitial #3)

Phase 1b interstitial #3 for ImputationDiD. Mirrors the CallawaySantAnna
(PR #487) + TripleDifference (PR #488) template for IF-based estimators:
vcov_type is permanently narrow to {"hc1"} because the per-unit influence
function aggregation (Borusyak-Jaravel-Spiess 2024 Theorem 3) has no
single design matrix on which hat-matrix leverage or Bell-McCaffrey
Satterthwaite DOF can be defined.

Source surface (diff_diff/):
- imputation.py: vcov_type param + @staticmethod _validate_vcov_type +
  fit()-time revalidation + cluster+replicate-weights NotImplementedError
  guard + Results metadata resolution (cluster_name=unit by default for
  the Theorem 3 unit-clustered IF variance; suppressed under ANY survey
  design — analytical OR replicate — because replicate variance ignores
  cluster/PSU entirely)
- imputation_results.py: vcov_type/cluster_name/n_clusters fields, new
  to_dict() method, summary() variance line via shared _format_vcov_label
  (default cluster=None renders "CR1 cluster-robust at <unit>, G=<n>";
  bootstrap fits suppress the analytical label and render
  "Inference method: bootstrap" instead, mirroring DiDResults.summary()
  gate at results.py:213-226)
- imputation_bootstrap.py: dual-site n_clusters<2 / n_psu<2 NaN guards
  via new _build_nan_bootstrap_results helper (closes the BLAS-roundoff
  zero-SE class predicted to recur on IF-based estimators)

Tests: 42 new tests in TestImputationDiDVcovType covering default /
cluster / TSL-survey / replicate-survey + bootstrap × cluster + bootstrap
× survey bit-equality (ALL parametrized over aggregate ∈ {None,
"event_study", "group"} with per-horizon and per-group SE override
branches pinned); fit()-time revalidation after set_params bypass;
bootstrap n_psu<2 / n_clusters<2 NaN propagation including coef_var NaN;
pretrends=True × vcov_type='hc1' × cluster bit-equality; introspection
(default attr, get_params, Results carries, to_dict, summary label
default+cluster+bootstrap-suppressed, cluster_name suppression under
both analytical AND replicate survey, fit-clone idempotence,
convenience function); input rejection on classical/hc2/hc2_bm/conley/
unknown with distinct methodology-keyword pins; cluster+replicate
rejection. Full pytest tests/test_imputation.py: 128 passed.

Docs:
- REGISTRY.md: IF-based taxonomy adds ImputationDiD to "Enforced today"
  tier; ImputationDiD section gains 4 new Notes (vcov_type contract,
  cluster+replicate fail-closed, bootstrap n<2 NaN, default unit-cluster
  CR1 rendering)
- CHANGELOG.md: [Unreleased] entry
- TODO.md: Phase 1b row narrowed to TwoStageDiD + EfficientDiD;
  ImputationDiD Conley follow-up row added
- guides/llms-full.txt: vcov_type + pretrends signature drift fix +
  shared staggered-results section advertises new variance metadata
  fields and to_dict()

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber igerber force-pushed the feature/imputation-did-vcov-type-phase1b branch from 1249ded to 098dc7c Compare May 25, 2026 22:11
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 098dc7c08a21ed7e3e824cfe13190afc664c5401


Overall Assessment

✅ Looks good

Highest unmitigated severity is P3. I did not find an unmitigated P0 or P1 in the changed ImputationDiD variance, bootstrap, or result-metadata paths.

Executive Summary

  • Re-review focus: the prior survey-metadata area still looks resolved. Survey-backed fits suppress cluster_name / n_clusters, and summary() suppresses the analytical variance-family label under survey designs. diff_diff/imputation.py:L889-L935, diff_diff/imputation_results.py:L226-L264, tests/test_imputation.py:L2812-L2861
  • The PR narrows the public vcov_type contract to the already-implemented Theorem 3 variance path rather than changing estimator math. Validation is wired at construction time and re-checked at fit time after set_params() mutation. diff_diff/imputation.py:L148-L189, diff_diff/imputation.py:L250-L253, diff_diff/imputation.py:L2438-L2505
  • The new degenerate-bootstrap guard is fail-closed: <2 clusters or PSUs now yield NaN SE / p-value / CI payloads, including per-horizon and per-group bootstrap dicts, avoiding misleading near-zero SEs. diff_diff/imputation_bootstrap.py:L275-L323, diff_diff/imputation_bootstrap.py:L366-L395, tests/test_imputation.py:L2584-L2630
  • The cluster + replicate_weights rejection is methodologically appropriate and documented, so it is not a blocker. diff_diff/imputation.py:L302-L323, diff_diff/survey.py:L104-L110, docs/methodology/REGISTRY.md:L1314-L1314
  • I could not execute the added tests here because pytest, numpy, pandas, and scipy are not installed in this environment.

Methodology

  • Severity: P3 (informational). Impact: None for correctness. The new vcov_type={"hc1"} boundary is a documented library-side API contract, not an undocumented departure from Borusyak-Jaravel-Spiess. Theorem 3’s conservative variance remains the cluster-summed square-sum implemented in _compute_conservative_variance(), and the paper notes the same asymptotic argument extends to vector-valued estimands, so reusing that path for overall, event-study, and group inference is methodologically consistent. Concrete fix: None. diff_diff/imputation.py:L16-L23, diff_diff/imputation.py:L1435-L1505, diff_diff/imputation.py:L2463-L2505, docs/methodology/REGISTRY.md:L349-L363, docs/methodology/REGISTRY.md:L1313-L1315 citeturn4view2turn4view3
  • Severity: P3 (informational). Impact: None for correctness. Rejecting cluster= together with replicate-weight survey designs is a proper fail-closed guard, and the limitation is explicitly documented in the registry. Concrete fix: None. diff_diff/imputation.py:L302-L323, docs/methodology/REGISTRY.md:L1314-L1314, tests/test_imputation.py:L2657-L2675

Code Quality

  • No findings. The new public parameter is fully propagated through constructor validation, get_params(), results metadata, to_dict(), and the convenience wrapper. diff_diff/imputation.py:L148-L189, diff_diff/imputation.py:L2438-L2595, diff_diff/imputation_results.py:L145-L152, diff_diff/imputation_results.py:L476-L521

Performance

  • No findings. The added work is front-door validation and early-exit degeneracy checks with negligible steady-state cost. diff_diff/imputation.py:L250-L253, diff_diff/imputation_bootstrap.py:L366-L395

Maintainability

  • No findings. The contract is centralized in _validate_vcov_type(), and the result-surface additions are localized to one result class plus one summary-label helper path. diff_diff/imputation.py:L2463-L2505, diff_diff/imputation_results.py:L226-L264, diff_diff/imputation_results.py:L476-L521

Tech Debt

  • Severity: P3 (informational). Impact: None in this PR. vcov_type="conley" remains unsupported for ImputationDiD, but that follow-up is explicitly tracked in TODO.md and documented in REGISTRY.md, so it is mitigated under the review rules. Concrete fix: None in this PR; keep the tracked follow-up. TODO.md:L107-L114, docs/methodology/REGISTRY.md:L1313-L1313

Security

  • No findings.

Documentation/Tests

  • Severity: P3 (informational). Impact: Reviewer confidence only. The changed surface is documented in the guide and registry, and the new test class covers constructor rejection, fit-time revalidation, cluster/survey/replicate/bootstrap paths, NaN propagation, summary/to_dict metadata, and the replicate-weight fail-closed guard. I could not run those tests locally because the scientific Python stack is absent in this environment. Concrete fix: rely on CI or rerun in a project environment with pytest, numpy, pandas, and scipy. tests/test_imputation.py:L2184-L2906, diff_diff/guides/llms-full.txt:L377-L408, diff_diff/guides/llms-full.txt:L1284-L1292, docs/methodology/REGISTRY.md:L1313-L1315

@igerber igerber merged commit f034928 into main May 25, 2026
26 checks passed
@igerber igerber deleted the feature/imputation-did-vcov-type-phase1b branch May 25, 2026 23:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-ci Triggers CI test workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant