
Close BR/DR gap #6: target-parameter block in schemas #347

Merged
igerber merged 17 commits into main from br-dr-target-parameter
Apr 22, 2026

Conversation

@igerber
Owner

@igerber igerber commented Apr 20, 2026

Summary

  • Adds a new top-level target_parameter block to both BusinessReport and DiagnosticReport schemas, naming what the headline scalar represents as an estimand for each of the 17 covered result classes (16 estimator classes from _APPLICABILITY + BaconDecompositionResults).
  • Fields: name, definition, aggregation (machine-readable dispatch tag), headline_attribute (raw result attribute), reference (REGISTRY.md citation pointer). Additive schema change (experimental per REPORTING.md stability policy).
  • Per-estimator dispatch lives in the new diff_diff/_reporting_helpers.py::describe_target_parameter (own module to avoid circular-import risk — both BR and DR import from it). Three branches read fit-time config: EfficientDiD's pt_assumption, StackedDiD's clean_control, and dCDH's L_max / covariate_residuals / linear_trends_effects. The rest emit a fixed tag.
  • Prose: BR summary and DR overall-interpretation paragraph each emit "Target parameter: <name>." after the headline; both full reports carry a "## Target Parameter" section with the full definition.
  • Closes BR/DR foundation gap #6 (target-parameter clarity).

Methodology references (required if estimator / math changes)

  • Method name(s): no new methodology. This is a reporting-layer metadata addition that names canonical estimands sourced from REGISTRY.md per estimator.
  • Paper / source link(s): per-estimator references embedded in _reporting_helpers.py: Callaway & Sant'Anna (2021); Sun & Abraham (2021); Borusyak-Jaravel-Spiess (2024); Gardner (2022); Wing-Freedman-Hollingsworth (2024); Wooldridge (2023); Chen-Sant'Anna-Xie (2025); Callaway-Goodman-Bacon-Sant'Anna (2024); Ortiz-Villavicencio & Sant'Anna (2025); de Chaisemartin & D'Haultfoeuille (2020, 2024); Arkhangelsky et al. (2021); Goodman-Bacon (2021). All already cited in REGISTRY.md.
  • Any intentional deviations from the source (and why): None. Every per-estimator branch's wording is sourced from the corresponding REGISTRY.md section (see feedback_verify_claims.md). Two plan-review catches documented in the code:
    • ImputationDiDResults / TwoStageDiDResults do not persist aggregate on the result object; overall_att is always the sample-mean aggregation regardless of fit-time config. Emitted as the fixed "simple" tag; the definition names this explicitly so the user is not misled.
    • ContinuousDiDResults has no PT-vs-SPT regime attribute; the definition is disjunctive (ATT^loc under PT, ATT^glob under SPT) so the user can pick the interpretation that matches their assumption.

Validation

  • Tests added/updated: tests/test_target_parameter.py (new, 37 tests across per-estimator dispatch, fit-config branching, cross-surface parity, exhaustiveness, and prose rendering). tests/test_business_report.py + tests/test_diagnostic_report.py top-level-key contract tests updated to include target_parameter. Total 319 BR/DR tests pass.
  • Exhaustiveness guard: TestTargetParameterCoversEveryResultClass iterates _APPLICABILITY and asserts every result-class name gets a non-default, non-empty target-parameter block. New result classes (e.g., HAD when it lands) will fail this test until an explicit branch is added.
  • Cross-surface parity: TestTargetParameterCrossSurfaceParity asserts BR and DR emit byte-identical target-parameter blocks on the same fit (both consume the same helper).
  • Backtest / simulation / notebook evidence: N/A.

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

Closes BR/DR foundation gap #6 from project_br_dr_foundation.md:
BusinessReport and DiagnosticReport now name what the headline
scalar actually represents as an estimand, for each of the 17
result classes. Baker et al. (2025) Step 2 ("define the target
parameter") was previously in BR's next_steps list but not done
by BR itself — this PR closes that gap.

New top-level ``target_parameter`` block (additive schema
change; experimental per REPORTING.md stability policy):

  {
    "name": str,               # stakeholder-facing name
    "definition": str,         # plain-English description
    "aggregation": str,        # machine-readable dispatch tag
    "headline_attribute": str, # which raw result attribute
    "reference": str,          # REGISTRY.md citation pointer
  }

Schema placement: top-level block (user preference, selected via
AskUserQuestion in planning). Aggregation tags include "simple",
"event_study", "group", "2x2", "twfe", "iw", "stacked", "ddd",
"staggered_ddd", "synthetic", "factor_model", "M", "l", "l_x",
"l_fd", "l_x_fd", "dose_overall", "pt_all_combined",
"pt_post_single_baseline", "unknown".

Per-estimator dispatch lives in the new
``diff_diff/_reporting_helpers.py::describe_target_parameter``
(own module rather than business_report / diagnostic_report to
avoid circular-import risk — plan-review LOW #7). All 17 result
classes covered (16 from _APPLICABILITY + BaconDecompositionResults);
exhaustiveness locked in by
TestTargetParameterCoversEveryResultClass.
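
The dispatch-plus-guard pattern described above can be sketched as follows. The helper name ``describe_target_parameter``, the ``_APPLICABILITY`` set, and the five block fields come from the PR text; the dict layout, entry wording, and stub construction are illustrative assumptions, not the library's actual code.

```python
# Abridged stand-in for the string-name dispatch in
# diff_diff/_reporting_helpers.py; entries and wording are illustrative.
_DISPATCH = {
    "CallawaySantAnnaResults": {
        "name": "Overall ATT",
        "definition": "Simple average of the group-time ATT(g, t) effects.",
        "aggregation": "simple",
        "headline_attribute": "overall_att",
        "reference": "REGISTRY.md: Callaway & Sant'Anna (2021)",
    },
    "BaconDecompositionResults": {
        "name": "Goodman-Bacon decomposition",
        "definition": "Weighted average of all 2x2 DiD comparisons.",
        "aggregation": "2x2",
        "headline_attribute": "overall_att",
        "reference": "REGISTRY.md: Goodman-Bacon (2021)",
    },
}

# Fallback that a covered class must never receive.
_DEFAULT = {"name": "", "definition": "", "aggregation": "unknown",
            "headline_attribute": None, "reference": ""}


def describe_target_parameter(results):
    """Dispatch on the result-class name, as the PR describes."""
    return dict(_DISPATCH.get(type(results).__name__, _DEFAULT))


# Exhaustiveness guard in the spirit of
# TestTargetParameterCoversEveryResultClass: every covered class name must
# map to a non-default, non-empty block.
for cls_name in _DISPATCH:
    stub = type(cls_name, (), {})()  # synthetic class-name stub, as in the tests
    block = describe_target_parameter(stub)
    assert block["aggregation"] != "unknown", f"{cls_name} lacks a branch"
    assert block["name"] and block["definition"]
```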

Fit-time config reads:

- ``EfficientDiDResults.pt_assumption`` branches the aggregation
  tag between pt_all_combined and pt_post_single_baseline.
- ``StackedDiDResults.clean_control`` varies the definition clause
  (never_treated / strict / not_yet_treated).
- ``ChaisemartinDHaultfoeuilleResults.L_max`` +
  ``covariate_residuals`` + ``linear_trends_effects`` branches
  the dCDH estimand between DID_M / DID_l / DID^X_l /
  DID^{fd}_l / DID^{X,fd}_l.
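
The three fit-config reads above can be condensed into one sketch. The attribute names (``pt_assumption``, ``clean_control``, ``L_max``) and the tag vocabulary come from this commit's description; the value encodings and return shape are assumptions for illustration.

```python
def fit_config_aggregation_tag(results):
    """Illustrative fit-time-config dispatch; attribute names follow the PR
    text, but the value encodings are assumed, not the library's actual ones."""
    name = type(results).__name__
    if name == "EfficientDiDResults":
        # pt_assumption branches between the two EfficientDiD tags.
        if results.pt_assumption == "all_combined":  # assumed encoding
            return "pt_all_combined"
        return "pt_post_single_baseline"
    if name == "StackedDiDResults":
        # clean_control only varies the definition clause, not the tag.
        return "stacked"
    if name == "ChaisemartinDHaultfoeuilleResults":
        # L_max=None is the static DID_M regime; dynamic regimes refine it.
        return "M" if results.L_max is None else "l"
    return "simple"  # fixed tag for the remaining classes


class EfficientDiDResults:  # stub standing in for a real fit object
    pt_assumption = "all_combined"


print(fit_config_aggregation_tag(EfficientDiDResults()))  # pt_all_combined
```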

Fixed-tag branches (per plan-review CRITICAL #1 and #2):

- ``CallawaySantAnna`` / ``ImputationDiD`` / ``TwoStageDiD`` /
  ``WooldridgeDiD``: the fit-time ``aggregate`` kwarg does not
  change the ``overall_att`` scalar — it only populates
  additional horizon / group tables on the result object.
  Disambiguating those tables in prose is tracked under gap #9.
- ``ContinuousDiDResults``: the PT-vs-SPT regime is a user-level
  assumption, not a library setting. Emits a single
  "dose_overall" tag with disjunctive definition naming both
  regime readings (ATT^loc under PT, ATT^glob under SPT).

Prose rendering:

- BR ``_render_summary``: emits "Target parameter: <name>."
  after the headline sentence (short name only; full definition
  lives in the full_report and schema).
- BR ``_render_full_report``: "## Target Parameter" section
  between "## Headline" and "## Identifying Assumption".
- DR ``_render_overall_interpretation``: mirror sentence.
- DR ``_render_dr_full_report``: "## Target Parameter" section
  with name, definition, aggregation tag, headline attribute,
  and reference.
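
The summary-sentence rendering above is a one-liner; ``render_target_parameter_sentence`` is a hypothetical name, but the sentence format is the one quoted in the first bullet.

```python
def render_target_parameter_sentence(block):
    # Short-name-only sentence appended after the BR headline sentence;
    # the full definition stays in full_report and the schema block.
    return f"Target parameter: {block['name']}."


print(render_target_parameter_sentence({"name": "Overall ATT"}))
# Target parameter: Overall ATT.
```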

Cross-surface parity: both BR and DR consume the same helper
(the single source of truth), so their ``target_parameter``
blocks are byte-identical (verified by
TestTargetParameterCrossSurfaceParity).
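
The parity property follows mechanically from sharing the helper; a minimal sketch (helper body and class name hypothetical):

```python
import json


def describe_target_parameter(results):
    # Stand-in for the shared helper both report surfaces import.
    return {"name": "Overall ATT", "aggregation": "simple",
            "headline_attribute": "overall_att"}


class FakeResults:
    pass


fit = FakeResults()
br_block = describe_target_parameter(fit)  # BusinessReport path
dr_block = describe_target_parameter(fit)  # DiagnosticReport path
# Byte-identical serialization: the property the parity test asserts.
assert json.dumps(br_block, sort_keys=True) == json.dumps(dr_block, sort_keys=True)
```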

Tests: 37 new (TestTargetParameterPerEstimator +
TestTargetParameterFitConfigReads +
TestTargetParameterCoversEveryResultClass +
TestTargetParameterCrossSurfaceParity +
TestTargetParameterProseRendering). Existing BR/DR top-level-key
contract tests updated to include ``target_parameter``. Total
319 tests pass (282 prior + 37 new).

Docs: REPORTING.md gains a "Target parameter" section
documenting the per-estimator dispatch and schema shape.
business_report.rst and diagnostic_report.rst note the new
field with a pointer to REPORTING.md. CHANGELOG entry under
Unreleased.

Out of scope: REGISTRY.md per-estimator "Target parameter"
sub-sections (plan-review additional-note); the reporting-layer
doc in REPORTING.md is the current source of truth. A follow-up
docs PR can land those sub-sections if maintainers want the
registry to own the canonical wording directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

Overall Assessment

⚠️ Needs changes

Executive Summary

  • P1: StackedDiD’s new target_parameter block describes the headline as a treated-share-weighted aggregate across sub-experiments, but the actual overall_att in this library is the simple average of post-treatment event-study coefficients. No REGISTRY note mitigates that mismatch.
  • P1: the TWFE-specific branch is effectively dead. describe_target_parameter() dispatches on type(results).__name__, but TwoWayFixedEffects.fit() returns DiDResults, so real TWFE reports still get the generic 2x2 ATT block.
  • P1: both dCDH branches set headline_attribute="att" even though the headline scalar lives in overall_att, which breaks the new machine-readable schema contract.
  • P2: the added tests are mostly stub-based and miss the live TWFE path and the dCDH raw-attribute mismatch.
  • No performance or security issues stood out in the changed files.

Methodology

  • Severity P1. Impact: StackedDiD’s new target-parameter text names the wrong estimand for the headline scalar. The helper says overall_att is a treated-unit-share-weighted aggregate across sub-experiments, but the implementation and result contract define it as the average of post-treatment delta_h coefficients. That is an undocumented mismatch against the Methodology Registry and will mislabel the reported estimand. Concrete fix: rewrite the StackedDiDResults branch so it describes overall_att as the average of post-treatment event-study coefficients; if helpful, separately note that each delta_h estimates theta_kappa^e. Refs: diff_diff/_reporting_helpers.py#L200, diff_diff/stacked_did.py#L541, diff_diff/stacked_did_results.py#L28, docs/methodology/REGISTRY.md#L1219, docs/methodology/REGISTRY.md#L1291
  • Severity P1. Impact: the PR claims TWFE-specific target-parameter coverage, but the runtime path does not support it. describe_target_parameter() branches on type(results).__name__, yet TwoWayFixedEffects.fit() returns DiDResults, so real TWFE fits will still emit the generic DiDResults 2x2 target-parameter block, not the TWFE one added in this PR. Concrete fix: either return a dedicated TwoWayFixedEffectsResults type from TWFE, or persist estimator provenance on DiDResults and branch on that provenance in BR/DR; then add an integration test on an actual TWFE fit. Refs: diff_diff/_reporting_helpers.py#L70, diff_diff/_reporting_helpers.py#L100, diff_diff/twfe.py#L472, tests/test_target_parameter.py#L57

Code Quality

Performance

  • No findings.

Maintainability

  • Severity P2. Impact: the new feature extends the existing string-name dispatch pattern without anchoring it to real result types, which is how the dead TWFE branch slipped in. That leaves future report coverage brittle and easy to drift from actual estimator outputs. Concrete fix: centralize report metadata in one registry shared by estimator outputs and reporting helpers, and drive exhaustiveness tests from that registry rather than from manually invented class-name stubs. Refs: diff_diff/_reporting_helpers.py#L70, docs/methodology/REPORTING.md#L104, tests/test_target_parameter.py#L239

Tech Debt

  • No deferrable findings. I found no matching TODO.md entry or REGISTRY deviation note that mitigates the P1 issues above, so they remain blockers under the stated review policy.

Security

  • No findings.

Documentation/Tests

  • Severity P2. Impact: the new tests validate helper output on synthetic class-name stubs and the _APPLICABILITY set, so they miss both the live TWFE runtime path and the incorrect dCDH headline_attribute. That means the current suite would pass while the shipped schema is still wrong for real consumers. Concrete fix: add integration tests that assert target_parameter on an actual TwoWayFixedEffects().fit(...) result, assert dCDH uses headline_attribute="overall_att", and assert StackedDiD wording matches the estimator’s own overall_att contract. Refs: tests/test_target_parameter.py#L57, tests/test_target_parameter.py#L239, diff_diff/diagnostic_report.py#L93

Path to Approval

  1. Fix the StackedDiDResults branch in describe_target_parameter() so it describes the actual overall_att headline scalar: the average of post-treatment delta_h coefficients.
  2. Make the TWFE-specific path reachable on real estimator outputs, either by returning a dedicated TWFE result class or by persisting estimator provenance on DiDResults and teaching BR/DR to branch on it.
  3. Change both dCDH branches’ headline_attribute from "att" to "overall_att".
  4. Add integration tests for real TWFE, dCDH, and StackedDiD runtime objects so these contracts are enforced end-to-end.

Local execution was not possible in this review environment because the Python environment is missing numpy/pandas; assessment is based on static diff inspection.

…x dCDH headline_attribute

R1 surfaced three P1s, all legitimate:

1. StackedDiD wording mismatch. Claimed ``overall_att`` is a
   treated-share-weighted aggregate across sub-experiments; actual
   implementation (``stacked_did.py`` ~line 541) computes
   ``overall_att`` as the simple average of post-treatment event-
   study coefficients ``delta_h`` with delta-method SE. Per-horizon
   ``delta_h`` is the paper's ``theta_kappa^e`` cross-event
   aggregate, but the headline is an equally-weighted average over
   those per-horizon coefficients, not a separate cross-event
   weighting at the ATT level. Definition rewritten to describe the
   actual estimand.

2. Dead ``TwoWayFixedEffectsResults`` branch. ``TwoWayFixedEffects``
   is a subclass of ``DifferenceInDifferences`` and its ``fit()``
   returns ``DiDResults`` — there is no separate TWFE result class,
   so the ``type(results).__name__ == "TwoWayFixedEffectsResults"``
   dispatch branch was unreachable on any real fit. Removed the
   dead branch and rewrote the ``DiDResults`` branch to cover both
   2x2 DiD and TWFE interpretations explicitly (both estimators
   route here). Follow-up for future PR: persist estimator
   provenance on ``DiDResults`` (or return a dedicated TWFE result
   class) so the branch can split again; documented inline.

3. dCDH ``headline_attribute="att"``. Both dCDH branches (``DID_M``
   for ``L_max=None``, ``DID_l``/derivatives for ``L_max >= 1``)
   named ``"att"`` as the headline attribute, but
   ``ChaisemartinDHaultfoeuilleResults`` stores the headline in
   ``overall_att`` (``chaisemartin_dhaultfoeuille_results.py:357``).
   Fixed both branches to ``"overall_att"``; downstream consumers
   using the machine-readable contract now point at the correct
   attribute.

Tests: new ``TestTargetParameterRealFitIntegration`` covers the
gap R1 P2 flagged — prior coverage was stub-based and would not
have caught any of the three P1s. Four new real-fit tests:

- ``TwoWayFixedEffects().fit(...)`` returns ``DiDResults``; target-
  parameter block uses the shared DiD/TWFE branch.
- ``StackedDiD(...).fit(...)`` on a staggered panel; the
  ``headline_attribute`` matches the actual real attribute and the
  definition names the event-study-coefficient estimand.
- ``ChaisemartinDHaultfoeuille().fit(...)`` on a reversible-
  treatment panel (both ``DID_M`` and ``DID_l`` regimes);
  ``headline_attribute == "overall_att"`` and the named attribute
  actually exists on the real fit object.

Existing stub-based dispatch tests updated: the ``test_twfe_results``
test is now ``test_did_results_mentions_twfe`` (asserts the DiD
branch describes both estimators). The dCDH stub tests now also
assert ``headline_attribute == "overall_att"``.

All 323 BR/DR tests pass (319 prior + 4 new real-fit integration).

Out of scope (plan-review MEDIUM #2 — centralizing report metadata
in a single registry shared by estimator outputs and reporting
helpers): queued as a separate PR. Current approach (string dispatch
on ``type(results).__name__`` + REGISTRY.md references) is working
but brittle; a centralized registry is the principled fix for the
TWFE-dispatch-dead-code class of bug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 20, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 7e8e26479f4e10b4f9100b180d59a53a56880277


Overall Assessment

⚠️ Needs changes

Executive Summary

  • P1: ChaisemartinDHaultfoeuille dynamic fits are still misclassified by the new target_parameter helper. It treats all L_max>=1 cases as DID_l, but the library’s actual overall_att contract is DID_1 at L_max=1, delta at L_max>=2, and no scalar aggregate under trends_linear + L_max>=2.
  • The same dCDH flattening is now documented in REPORTING.md, so the reporting docs and schema drift together from REGISTRY.md and the result-class contract.
  • P2: the new tests codify the wrong dCDH expectation (L_max=2 -> aggregation "l") and miss the L_max=1 and trend-adjusted no-aggregate cases.
  • The previous blocking findings from the earlier review look addressed: StackedDiD’s wording now matches overall_att, real TWFE fits route through the live DiD/TWFE branch, and dCDH headline_attribute now points to overall_att.
  • No performance or security issues stood out in the changed files.

Methodology

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings beyond the methodology drift above.

Tech Debt

No deferrable findings. I found no TODO.md entry that mitigates the dCDH target-parameter mismatch, and this class of methodology bug would remain blocking even if it were tracked.

Security

No findings.

Documentation/Tests

Path to Approval

  1. Update describe_target_parameter() so dCDH dynamic results follow the actual overall_att contract: DID_M for L_max=None, DID_1 for L_max=1, delta for L_max>=2, and no scalar aggregate for trends_linear + L_max>=2.
  2. Update docs/methodology/REPORTING.md so the documented dCDH dispatch matches REGISTRY.md and the result-class behavior.
  3. Replace the current dCDH target-parameter tests with cases that cover L_max=1, L_max>=2, and trends_linear + L_max>=2 on both stubbed and real results.

Static review only: this environment lacks pytest and numpy, so I could not execute the new test file.

R2 surfaced one P1 methodology finding: the dCDH dynamic branch
flattened every ``L_max >= 1`` into a generic ``DID_l`` estimand,
but the library's actual ``overall_att`` contract is:

- ``L_max = None`` -> ``DID_M`` (Phase 1 per-period aggregate).
- ``L_max = 1`` -> ``DID_1`` (single-horizon per-group estimand,
  Equation 3 of the dynamic companion paper — NOT the generic
  ``DID_l``).
- ``L_max >= 2`` (no ``trends_linear``) -> ``delta`` (cost-benefit
  cross-horizon aggregate, Lemma 4;
  ``chaisemartin_dhaultfoeuille.py:2602-2634``).
- ``trends_linear = True`` AND ``L_max >= 2`` -> ``overall_att`` is
  intentionally NaN by design
  (``chaisemartin_dhaultfoeuille.py:2828-2834``). No scalar
  aggregate; per-horizon level effects live on
  ``linear_trends_effects[l]``.

Fix: ``describe_target_parameter()`` now mirrors the result class's
own ``_estimand_label()`` at
``chaisemartin_dhaultfoeuille_results.py:454-490``. New aggregation
tags: ``DID_1`` / ``DID_1_x`` / ``DID_1_fd`` / ``DID_1_x_fd`` for
single-horizon, ``delta`` / ``delta_x`` for cost-benefit, and
``no_scalar_headline`` for the trends+L_max>=2 suppression case.
On the no-scalar case, ``headline_attribute`` is ``None`` so
downstream consumers do not point at a field whose value is NaN
by design.
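
The corrected contract can be condensed into a small decision function. The tags and the ``headline_attribute=None`` case are from this commit's description; the function itself is illustrative and covers only the base tags (the real dispatch also emits the ``_x`` / ``_fd`` covariate and first-difference variants).

```python
def dcdh_headline_contract(l_max, trends_linear=False):
    """Sketch of the corrected dCDH overall_att contract (base tags only)."""
    if l_max is None:
        return {"aggregation": "DID_M", "headline_attribute": "overall_att"}
    if trends_linear and l_max >= 2:
        # overall_att is NaN by design: there is no scalar headline to
        # point at, so downstream consumers get None, not a stale field.
        return {"aggregation": "no_scalar_headline", "headline_attribute": None}
    if l_max == 1:
        return {"aggregation": "DID_1", "headline_attribute": "overall_att"}
    return {"aggregation": "delta", "headline_attribute": "overall_att"}


assert dcdh_headline_contract(2, trends_linear=True)["headline_attribute"] is None
```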

Tests: added stub-based branches for every new case (``DID_1``,
``DID_1^X``, ``delta``, ``delta^X``, trends + L_max>=2 no-scalar,
trends + L_max=1 still-has-scalar) and split the real-fit
integration test into ``L_max=1`` and ``L_max=2`` real-panel
cases so the contract is enforced end-to-end per R2 P2. The
parameterized ``test_dcdh_config_branches_tag`` now covers 10 cases
and also asserts ``headline_attribute`` flips to ``None`` only on
the no-scalar case.

Docs: ``REPORTING.md`` dCDH section rewritten to match the
corrected dispatch, including the ``no_scalar_headline`` case and
the L_max=None/1/>=2 contract.

332 BR/DR tests pass.

Out of scope (still open from R1): centralizing report metadata
in a single registry shared by estimator outputs and reporting
helpers (plan-review MEDIUM #2 / R1 P2 maintainability). The
current string dispatch on ``type(results).__name__`` + explicit
REGISTRY.md citations is source-faithful but requires manual
mirroring of result-class contracts; a centralized registry is
the principled fix. Tracked for a follow-up PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 20, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 9f6b4d1899b63b7c0dde602cbd1589c6ec55ed53


Overall Assessment

✅ Looks good

Executive Summary

  • The prior blocking dCDH methodology issue appears resolved: the new helper now mirrors the documented overall_att contract for L_max=None, L_max=1, L_max>=2, and the trends_linear=True, L_max>=2 no-scalar branch.
  • No new P0/P1 findings stood out in the changed code.
  • Severity P2: the public docs for the new machine-readable aggregation contract are still stale and describe old dCDH tags (DID_l, l_x, etc.) instead of the implemented DID_1* / delta* / no_scalar_headline behavior.
  • Severity P3: the new regression file still lacks an end-to-end real-fit check for the trends_linear=True, L_max>=2 no-scalar dCDH branch; that path is only stub-tested in this PR.
  • Static review only: I could not execute the added tests because this environment lacks pytest and numpy.

Methodology

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings. I found no TODO.md entry that changes the severity of the remaining items; they are non-blocking documentation/test gaps.

Security

  • No findings.

Documentation/Tests

  • Severity P2. Impact: REPORTING.md still advertises obsolete machine-readable aggregation values ("l", "l_x", "l_fd", "l_x_fd") and documents headline_attribute as if it were always a string, while the implementation now returns DID_1*, delta*, and no_scalar_headline, with headline_attribute=None for the no-scalar dCDH branch. business_report.rst still summarizes the dCDH side as DID_l. Consumers following the docs can wire dispatch logic to tags the implementation no longer emits. Concrete fix: sync the public schema docs to the actual helper outputs, explicitly documenting DID_1*, delta*, no_scalar_headline, and the headline_attribute=None case. Refs: docs/methodology/REPORTING.md#L90, docs/methodology/REPORTING.md#L125, docs/api/business_report.rst#L52, diff_diff/_reporting_helpers.py#L352, diff_diff/_reporting_helpers.py#L450.
  • Severity P3. Impact: the new regression file adds live dCDH checks for DID_1 and delta, but the trends_linear=True, L_max>=2 reporting branch is still only covered with stubs in this PR. That leaves the reporting-layer integration unpinned for the one branch where headline_attribute intentionally becomes None. Concrete fix: add a real-fit ChaisemartinDHaultfoeuille(..., L_max=2, trends_linear=True) case asserting non-finite overall_att, populated linear_trends_effects, aggregation="no_scalar_headline", and headline_attribute is None. Refs: tests/test_target_parameter.py#L205, tests/test_target_parameter.py#L543.

…fit no-scalar test

R3 approved (✅) with two non-blocking follow-ups; this commit
addresses both.

P2 (docs): ``REPORTING.md`` and ``business_report.rst`` still
listed the obsolete dCDH aggregation tags (``"DID_l"``, ``"l"``,
``"l_x"``, ``"l_fd"``, ``"l_x_fd"``) and documented
``headline_attribute`` as always a string, even though R2 replaced
those with ``"DID_1"`` / ``"DID_1_x"`` / ``"DID_1_fd"`` /
``"DID_1_x_fd"`` / ``"delta"`` / ``"delta_x"`` /
``"no_scalar_headline"`` and introduced the
``headline_attribute=None`` no-scalar case. Consumers wiring
dispatch logic off the docs would have pointed at tags the helper
no longer emits. Rewrote the ``aggregation`` enum in REPORTING.md
as a full per-estimator dispatch list, and updated the
``headline_attribute`` description to name the ``None`` case
explicitly. ``business_report.rst`` summary replaced ``DID_l``
with ``DID_1`` / cost-benefit delta and added a pointer to the
no-scalar case.

P3 (tests): added ``test_dcdh_trends_linear_with_l_max_geq_2_fit_real``
— a real-fit regression that exercises the
``ChaisemartinDHaultfoeuille(..., L_max=2, trends_linear=True)``
path end-to-end. Asserts (a) ``fit.overall_att`` is NaN by design
(matching ``chaisemartin_dhaultfoeuille.py:2828-2834``), (b)
``linear_trends_effects`` is populated, (c) the target-parameter
block emits ``aggregation="no_scalar_headline"`` and
``headline_attribute is None``, (d) the definition references
``linear_trends_effects``. Previously this branch was only
stub-tested; now the reporting-layer integration is pinned by a
live dCDH fit.

333 BR/DR tests pass. Black and ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 20, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 450c5925bae4e4a828d287012c745c4b94d7c019


Overall Assessment

⚠️ Needs changes

The previous re-review’s non-blocking doc/test items look addressed, but I found one new unmitigated P1 in the new target_parameter integration path.

Executive Summary

  • The earlier non-blocking gaps appear resolved: the public reporting docs now describe the implemented dCDH tags, and there is now a real-fit regression test for the trends_linear=True, L_max>=2 dCDH helper branch.
  • P1: the new documented dCDH no_scalar_headline branch is not propagated through BR/DR headline extraction and prose rendering, so valid fits still surface a NaN headline and “estimation failed” messaging.
  • P2: the Wooldridge target-parameter branch is not method-aware; OLS ETWFE fits are labeled as ASF-based even though the OLS path reports an observation-count-weighted average of ATT(g,t) coefficients.
  • I did not find new inference anti-patterns or security issues in the changed files.
  • Static review only: I could not run pytest here because pytest is not installed in this environment.

Methodology

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No additional findings. Neither issue above is tracked in TODO.md, so neither is mitigated for assessment purposes.

Security

No findings.

Documentation/Tests

  • Severity P3. Impact: the previous re-review’s doc drift and live-fit helper coverage look fixed, but the new prose/rendering tests still only exercise a Callaway stub, and the real dCDH trends_linear=True, L_max>=2 test stops at describe_target_parameter(). That leaves the broken no-scalar BR/DR surfaces and the Wooldridge method split unpinned. Refs: tests/test_target_parameter.py:374, tests/test_target_parameter.py:592. Concrete fix: add end-to-end assertions for to_dict(), summary(), full_report(), and DR interpretation/full report on a real dCDH no-scalar fit, plus Wooldridge method="ols" vs nonlinear target-parameter cases.

Path to Approval

  1. Handle headline_attribute=None / aggregation="no_scalar_headline" as a first-class BR/DR branch: do not extract or narrate a scalar headline as a failed fit, and instead emit explicit no-scalar-by-design messaging that points to linear_trends_effects.
  2. Make Wooldridge target-parameter dispatch method-aware so OLS ETWFE and nonlinear ASF paths get different wording/references.
  3. Add end-to-end regression tests for both branches so the report surfaces, not just the helper, are covered.

…ridge method-aware

R4 surfaced one P1 + one P2, both addressed.

P1 (methodology): the dCDH no-scalar branch was documented in the
schema but not plumbed through BR/DR rendering. When
``aggregation="no_scalar_headline"`` and ``headline_attribute=None``
(``trends_linear=True`` + ``L_max>=2``), BR/DR still extracted
``overall_att`` (NaN by design) and narrated it via the estimation-
failure path, producing internally inconsistent output — the
``target_parameter`` block said "no scalar aggregate; consult
linear_trends_effects" while the headline prose told users to
inspect rank deficiency.

Fix (both surfaces):

- BR ``_build_schema``: compute ``target_parameter`` BEFORE
  ``_extract_headline``; if the aggregation tag is
  ``no_scalar_headline``, route through a dedicated headline block
  with ``status="no_scalar_by_design"`` / ``effect=None`` /
  ``sign="none"`` and an explicit ``reason`` field naming the
  ``linear_trends_effects`` alternative.
- BR ``_render_headline_sentence``: detect
  ``status == "no_scalar_by_design"`` and emit explicit "does not
  produce a scalar aggregate effect ... by design" prose instead
  of the non-finite / estimation-failure sentence.
- BR ``_build_caveats``: the existing ``sign == "undefined"``
  estimation-failure caveat does not fire because we emit
  ``sign == "none"`` (not ``"undefined"``) on the no-scalar case.
- DR ``_execute``: analogous headline-metric short-circuit with
  ``status="no_scalar_by_design"`` on detection of the
  no_scalar_headline tag.
- DR ``_render_overall_interpretation``: explicit no-scalar
  sentence takes precedence over the non-finite estimation-failure
  branch.
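
The branch-ordering point above — no-scalar-by-design takes precedence over the non-finite estimation-failure check — can be sketched as follows. The ``status`` / ``effect`` field names follow the text; the sentences themselves are illustrative.

```python
import math


def render_headline_sentence(headline):
    """Sketch of the BR headline branch order described above."""
    if headline.get("status") == "no_scalar_by_design":
        # Must be checked before the non-finite guard, or a by-design NaN
        # would be narrated as an estimation failure.
        return ("This configuration does not produce a scalar aggregate "
                "effect by design; consult linear_trends_effects.")
    effect = headline.get("effect")
    if effect is None or not math.isfinite(effect):
        return "Estimation failed; inspect the fit (e.g. rank deficiency)."
    return f"Estimated effect: {effect:+.3f}."
```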

P2 (Wooldridge method awareness): the Wooldridge branch previously
labeled every fit as ASF-based, but REGISTRY.md Sec. WooldridgeDiD
splits OLS ETWFE (observation-count-weighted average of ATT(g,t)
from a saturated regression) from the nonlinear (logit / Poisson)
ASF path. Branch on ``results.method`` ("ols" -> coefficient-
aggregation wording; other -> ASF wording).
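
The method split can be sketched as below (function name hypothetical; the wording follows the OLS-vs-nonlinear distinction described above).

```python
def wooldridge_headline_name(method):
    """Illustrative method-aware naming: OLS ETWFE vs. nonlinear ASF."""
    if method == "ols":
        return ("Observation-count-weighted average of ATT(g, t) "
                "coefficients from the saturated ETWFE regression")
    # logit / poisson route through the average-structural-function path
    return "ASF-based average treatment effect on the treated"
```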

Tests: added 4 end-to-end regressions.

- ``test_dcdh_trends_linear_no_scalar_propagates_through_br``:
  real dCDH fit with ``trends_linear=True`` + ``L_max=2``; asserts
  BR schema emits ``status="no_scalar_by_design"``, summary prose
  contains "no scalar" / "does not produce a scalar", does NOT
  contain "rank deficiency" / "estimation failed", and caveats do
  NOT include ``estimation_failure``.
- ``test_dcdh_trends_linear_no_scalar_propagates_through_dr``:
  mirror on the DR side (``headline_metric`` status and
  ``overall_interpretation`` prose).
- ``test_wooldridge_ols``: asserts the OLS branch names
  ATT(g,t) aggregation and does NOT include "ASF" in the name.
- ``test_wooldridge_nonlinear``: asserts logit/poisson routes
  through the ASF branch.

336 BR/DR tests pass. Black and ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Apr 20, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: a8e1719058c42895ab52dbf8a8d129b255474870


Overall Assessment

✅ Looks good

Executive Summary

  • The prior P1 dCDH no-scalar-headline issue appears resolved: BR/DR now branch on aggregation=="no_scalar_headline" and no longer narrate that configuration as an estimation failure.
  • The prior Wooldridge mislabel also appears resolved: target-parameter dispatch is now method-aware and no longer applies ASF wording to OLS ETWFE fits.
  • Severity P2: DiagnosticReport.full_report() still renders a malformed top headline on the no-scalar dCDH branch because the markdown headline line does not special-case status="no_scalar_by_design".
  • Severity P3: the new helper’s own Wooldridge docstring is stale and still describes overall_att generically as ASF-based, which now contradicts both the registry and the method-aware implementation.
  • Static review only: I could not run the added tests in this environment because numpy and pytest are not installed.

Methodology

Code Quality

  • Severity P2. Impact: the new dCDH no_scalar_by_design handling is still incomplete in the DR markdown renderer. DiagnosticReport.full_report() will currently interpolate None into the top headline line (**Headline**: ... = None (SE None, p = None)) for a valid trends_linear=True, L_max>=2 fit, even though the rest of the report correctly explains that no scalar headline exists by design. This is a user-facing rendering bug on the new code path. diff_diff/diagnostic_report.py:L3271-L3277, diff_diff/diagnostic_report.py:L930-L959, diff_diff/diagnostic_report.py:L3010-L3022. Concrete fix: branch on headline.get("status") == "no_scalar_by_design" in _render_dr_full_report() and render explicit no-scalar prose there instead of formatting value/se/p_value.
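
The concrete fix can be sketched as follows (function and field names mirror the finding's wording; this is not the library's actual renderer):

```python
def render_headline_line(headline: dict) -> str:
    # Suggested shape of the fix: special-case the no-scalar status
    # before interpolating value/se/p_value into the markdown line.
    if headline.get("status") == "no_scalar_by_design":
        reason = headline.get("reason") or ""
        return ("**Headline**: no scalar aggregate by design. " + reason).strip()
    return "**Headline**: {} = {} (SE {}, p = {})".format(
        headline.get("name", "ATT"),
        headline.get("value"),
        headline.get("se"),
        headline.get("p_value"),
    )

# The dCDH no-scalar branch no longer leaks raw None values.
line = render_headline_line(
    {"status": "no_scalar_by_design", "reason": "trends_linear=True with L_max >= 2."}
)
assert "None" not in line
```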

Performance

  • No findings.

Maintainability

  • No findings beyond the stale helper docstring noted above.

Tech Debt

  • No findings. The residual items above are not tracked in TODO.md, but neither rises to P1+.

Security

  • No findings.

Documentation/Tests

  • Severity P3. Impact: the added no-scalar real-fit tests stop at run_all() / interpretation and do not assert DiagnosticReport.full_report(), which is how the malformed = None (SE None, p = None) headline can slip through unpinned. tests/test_target_parameter.py:L444-L450, tests/test_target_parameter.py:L649-L676. Concrete fix: add an end-to-end DiagnosticReport(fit).full_report() assertion for the trends_linear=True, L_max>=2 dCDH case that rejects the raw None headline and checks for the explicit no-scalar wording.
  • Residual risk: runtime validation was not possible here because the local environment is missing numpy and pytest.

…e docstring + full_report test

R5 approved (✅) with two small follow-throughs from R4.

P2: DR ``_render_dr_full_report`` still formatted ``value`` /
``se`` / ``p_value`` straight into the top ``**Headline**:`` line
on the no-scalar-by-design dCDH branch. With those fields all
``None`` (by design), the markdown rendered as
``**Headline**: ... = None (SE None, p = None)`` even though the
"## Target Parameter" section below correctly explained the
suppression. Added a ``status == "no_scalar_by_design"`` branch
that emits ``**Headline**: no scalar aggregate by design.`` plus
the headline's ``reason`` field.

P3 (stale docstring): the ``_reporting_helpers.py`` top-level
docstring still described Wooldridge ``overall_att`` as always
ASF-based. R4 split that into OLS vs nonlinear dispatch;
updated the docstring bullet to match.

P3 (test gap): the dCDH no-scalar real-fit regression asserted
``run_all()`` + ``interpretation`` but not ``full_report()``. That
gap is exactly how the R5 P2 malformed headline could slip
through unpinned. Extended the test to assert the ``**Headline**:
no scalar aggregate by design`` line appears in the markdown and
the raw ``= None (SE None, p = None)`` pattern does NOT appear.

336 BR/DR tests pass. Black clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber

igerber commented Apr 21, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: e995954a13c74bd579d4441890d4d11df4c8e2c1


Overall Assessment

✅ Looks good

Executive Summary

Methodology

Code Quality

Performance

  • No findings.

Maintainability

  • Severity P3. Impact: the StackedDiD target_parameter.definition includes stacked_did.py line-number prose in a user-facing metadata field. That reference is not methodology source material and will go stale under routine edits, even if the estimand itself does not change. Concrete fix: remove the internal line reference and keep the definition in paper/registry terms only at diff_diff/_reporting_helpers.py:211.

Tech Debt

  • No new findings.

Security

  • No findings.

Documentation/Tests

  • Severity P3. Impact: the AI guide still lists the old BR/DR schema top-level keys, so downstream agent-facing documentation is now out of sync with the runtime schema and will hide the new target_parameter block from guide-driven consumers. Concrete fix: add target_parameter to both schema key lists in diff_diff/guides/llms-full.txt:1775 and diff_diff/guides/llms-full.txt:1832, and optionally add a doc-smoke assertion to keep those lists synchronized.
  • Residual risk: runtime validation was not possible here because pytest, numpy, and pandas are unavailable.
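
The optional doc-smoke assertion suggested above could take roughly this shape (the key must appear once per schema-key list, so twice overall; the guide snippets here are illustrative, not the real llms-full.txt content):

```python
def guide_lists_key(guide_text: str, key: str = "target_parameter") -> bool:
    # Doc-smoke sketch: the key must appear in both the BR and the DR
    # schema-key lists, so require at least two occurrences.
    return guide_text.count(key) >= 2

stale_guide = "BR keys: headline, caveats\nDR keys: headline_metric, checks"
fresh_guide = stale_guide.replace(
    "headline,", "headline, target_parameter,"
).replace("headline_metric,", "headline_metric, target_parameter,")

assert not guide_lists_key(stale_guide)   # out-of-sync guide fails
assert guide_lists_key(fresh_guide)       # updated guide passes
```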

igerber and others added 2 commits April 20, 2026 20:48
…full.txt schema

Two P3 cleanups from R6.

P3 #1: the StackedDiD ``target_parameter.definition`` embedded an
internal implementation line reference (``stacked_did.py`` around
line 541). That pointer is not methodology source material and
will go stale under routine estimator edits even when the estimand
itself is unchanged. Removed the reference; definition now stands
on paper/registry terms alone.

P3 #2: ``diff_diff/guides/llms-full.txt`` listed the pre-PR BR/DR
schema top-level keys and omitted ``target_parameter``, so agent-
facing documentation disagreed with the runtime schema. Added
``target_parameter`` to both schema-key lists (BR around line 1779
and DR around line 1844). Documented the field shape
(``name`` / ``definition`` / ``aggregation`` /
``headline_attribute`` / ``reference``), the dispatch tag set, and
the ``headline_attribute=None`` / ``aggregation="no_scalar_headline"``
edge case for the dCDH ``trends_linear=True, L_max>=2`` fit. Also
noted the ``headline.status="no_scalar_by_design"`` value so
guide-driven agents can dispatch correctly. UTF-8 fingerprint
preserved per ``feedback_llms_guide_utf8_fingerprint.md``
(``tests/test_guides.py`` passes).
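
An illustrative instance of the documented field shape (the ``aggregation`` tag and ``reference`` values are examples, not the full dispatch-tag set; the dCDH edge case instead carries ``aggregation="no_scalar_headline"`` and ``headline_attribute=None``):

```python
# Example target_parameter block; field names match the documented
# schema, values are illustrative only.
target_parameter = {
    "name": "Overall ATT",
    "definition": "Weighted average of group-time ATT(g, t) over post-treatment cells.",
    "aggregation": "att_gt_weighted",     # machine-readable dispatch tag
    "headline_attribute": "overall_att",  # raw result attribute (None when no scalar)
    "reference": "Callaway & Sant'Anna (2021)",
}

assert set(target_parameter) == {
    "name", "definition", "aggregation", "headline_attribute", "reference"
}
```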

354 BR/DR + guide tests pass (337 BR/DR + 17 guide). Black clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Formatting-only follow-up to the R6 edit — the previous commit
landed the StackedDiD-line-reference cleanup before black could
reflow the affected block.
@igerber

igerber commented Apr 21, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 5c3d0ba004343126fd619b6872566d93d40a5903


Overall Assessment

⚠️ Needs changes

Executive Summary

  • The previous re-review items look resolved: the dCDH no-scalar case is now special-cased in BR/DR rendering, the Wooldridge OLS vs nonlinear split is reflected in the new helper, and llms-full.txt now documents the added target_parameter block.
  • Severity P1 [Newly identified]: the new EfficientDiD target_parameter branch still does not identify the actual headline estimand. The implementation and registry define overall_att as a cohort-size-weighted average over post-treatment (g,t) cells, distinct from the paper’s ES_avg, but the new PT-All/PT-Post prose omits that distinction.
  • Severity P1 [Newly identified]: the PR introduces a new no_scalar_by_design schema status without bumping BUSINESS_REPORT_SCHEMA_VERSION or DIAGNOSTIC_REPORT_SCHEMA_VERSION, even though REPORTING.md says new status-enum values are breaking changes.
  • I did not find a new P0 numerical/inference regression in the changed code.
  • Static review only: pytest is not installed and numpy is unavailable here, so I could not execute the added tests; I only verified AST parsing of the changed Python files.

Methodology

  • Severity P1 [Newly identified]. Impact: the EfficientDiD target-parameter metadata still leaves the headline ambiguous between the library’s overall_att and the paper’s ES_avg. The implementation aggregates overall_att with cohort-size weights across post-treatment (g,t) cells, and the registry explicitly documents that this differs from the paper’s uniform event-time average; the new PT-All/PT-Post text only describes the identification regime, not the reported estimand. That undercuts the PR’s target-parameter goal and is an undocumented methodology mismatch in the new reporting surface. Concrete fix: update both EfficientDiD branches in the helper to say that BR/DR headline overall_att is the library’s cohort-size-weighted post-treatment ATT aggregate, not ES_avg, mirror that note in the reporting docs, and add a regression test for that wording. References: diff_diff/_reporting_helpers.py:L277-L304, diff_diff/efficient_did.py:L1274-L1282, docs/methodology/REGISTRY.md:L875-L876, tests/test_target_parameter.py:L127-L135
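
The distinction at issue can be written out explicitly. Notation here is assumed from the finding's description (cohort sizes \(n_g\), post-treatment cells \(t \ge g\)), not quoted from the registry:

```latex
% Library headline: cohort-size-weighted average over post-treatment (g,t) cells
\widehat{\theta}_{\text{overall\_att}}
  = \sum_{(g,t)\,:\,t \ge g}
      \frac{n_g}{\sum_{(g',t')\,:\,t' \ge g'} n_{g'}}\,
      \widehat{\mathrm{ATT}}(g,t)

% Paper's ES_avg: uniform average over event times e = 0, ..., \bar{e}
\widehat{\theta}_{ES_{\text{avg}}}
  = \frac{1}{\bar{e}+1} \sum_{e=0}^{\bar{e}} \widehat{\mathrm{ATT}}_e
```

The two coincide only under balanced cohorts and horizons, which is why the finding asks the prose to name which one the headline reports.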

Code Quality

No findings.

Performance

No findings.

Maintainability

Tech Debt

No findings.

Security

No findings.

Documentation/Tests

No additional PR-specific findings. Residual risk: runtime validation was not possible here because pytest is unavailable and numpy is not installed; I only verified that the changed Python files parse successfully.

Path to Approval

  1. Update the EfficientDiD target-parameter prose so it explicitly states that the reported BR/DR headline is the library’s cohort-size-weighted overall_att over post-treatment (g,t) cells, not the paper’s ES_avg, and add a regression test that locks that wording in.
  2. Bump BUSINESS_REPORT_SCHEMA_VERSION and DIAGNOSTIC_REPORT_SCHEMA_VERSION for the new no_scalar_by_design status (or remove that new enum and reuse existing schema states), then refresh the corresponding docs and schema-version tests.

…y vs ES_avg note

Two P1 findings from R7, both addressed.

P1 #1 (schema version bump): the new ``headline.status`` /
``headline_metric.status`` value ``"no_scalar_by_design"`` added
in R4 for the dCDH ``trends_linear=True, L_max>=2`` configuration
is a breaking change per REPORTING.md stability policy (new
status-enum values are breaking — agents doing exhaustive match
will break on unknown enums). Bumped
``BUSINESS_REPORT_SCHEMA_VERSION`` and
``DIAGNOSTIC_REPORT_SCHEMA_VERSION`` from ``"1.0"`` to ``"2.0"``,
updated the in-tree schema-version tests (one explicit
``== "1.0"`` assertion and six ``"schema_version": "1.0"`` stub
dicts in BR / DR test files), added a REPORTING.md "Schema
version 2.0" note, and documented the bump in the CHANGELOG
Unreleased entry. The schemas remain marked experimental so the
formal deprecation policy does not yet apply.

P1 #2 (EfficientDiD library vs paper estimand): both
EfficientDiD branches now explicitly state that BR/DR's headline
``overall_att`` is the library's cohort-size-weighted average
over post-treatment ``(g, t)`` cells, NOT the paper's ``ES_avg``
uniform event-time average. The regime (PT-All / PT-Post)
describes identification; the aggregation choice is a separate
library-level policy that REGISTRY.md Sec. EfficientDiD
documents. Added ``cohort-size-weighted`` + ``ES_avg`` /
``post-treatment`` assertions to ``test_efficient_did_pt_all``
and ``test_efficient_did_pt_post`` so the wording is pinned.

354 BR/DR + guide + target-parameter tests pass. Black and ruff
clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber

igerber commented Apr 21, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: fdaf94d81627ac6c353fbe35a2b848d7ea1475dc


Overall Assessment

✅ Looks good

Executive Summary

  • The previous re-review blockers appear resolved: the EfficientDiD branch now distinguishes the library’s cohort-size-weighted overall_att from the paper’s ES_avg, and both report schema versions are bumped to 2.0 for the new no_scalar_by_design headline state.
  • I did not find a new P0/P1 methodology, inference, or edge-case regression in the changed reporting layer.
  • Severity P3: the new BR API docs still describe the TROP target parameter as a “factor-model residual,” which does not match the helper/registry definition.
  • Severity P3: the new exhaustiveness tests cover _APPLICABILITY only, so the BaconDecompositionResults target-parameter branch remains untested.
  • Static review only: pytest is not installed in this environment, so I could not execute the added tests.

Methodology
No findings.

Code Quality
No findings.

Performance
No findings.

Maintainability
No findings.

Tech Debt
No findings.

Security
No findings.

Documentation/Tests

  • Severity P3. Impact: the new BR API docs describe the TROP target_parameter as a “factor-model residual,” but the implementation and registry define it as an ATT-style weighted average over treated cells. That can mislead downstream readers of the schema docs even though the code path is correct. Concrete fix: change the wording in docs/api/business_report.rst:L52 to “factor-model-adjusted ATT” or equivalent wording that matches diff_diff/_reporting_helpers.py:L550 and docs/methodology/REGISTRY.md:L1899.
  • Severity P3. Impact: the new “covers every result class” guard only iterates _APPLICABILITY, so it never exercises the newly added BaconDecompositionResults branch even though the helper now supports it. That leaves the diagnostic-only target_parameter path able to drift untested. Concrete fix: add a direct Bacon unit test, and ideally one DR integration test, for the branch at diff_diff/_reporting_helpers.py:L536; the current exhaustiveness guard in tests/test_target_parameter.py:L348 is limited to estimator classes listed in diff_diff/diagnostic_report.py:L93.
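
The suggested widening of the exhaustiveness guard might look like this (names are hypothetical stand-ins; the real guard iterates the library's ``_APPLICABILITY`` mapping, which has 16 entries):

```python
# Stub registry: the real mapping lives in diagnostic_report.py and
# lists estimator result classes only.
_APPLICABILITY = {"DiDResults": object, "CallawaySantAnnaResults": object}
DR_ONLY_RESULT_CLASSES = ("BaconDecompositionResults",)

def classes_under_guard() -> set[str]:
    # Suggested fix: union estimator classes with diagnostic-only
    # result classes so the Bacon branch cannot drift untested.
    return set(_APPLICABILITY) | set(DR_ONLY_RESULT_CLASSES)

assert "BaconDecompositionResults" in classes_under_guard()
```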

…n tests

Both P3 cleanups from R8.

P3 #1 (TROP wording in rst): ``business_report.rst`` summary listed
TROP's target parameter as "factor-model residual" — which does
not match the helper / REGISTRY definition. Both say the TROP
target parameter is a factor-model-adjusted weighted average
over treated cells (not a residual). Fixed the rst wording to
"factor-model-adjusted ATT".

P3 #2 (Bacon branch untested): the exhaustiveness guard iterates
``_APPLICABILITY``, but ``BaconDecompositionResults`` is a
diagnostic read-out on the DR side and is NOT listed in
``_APPLICABILITY`` (BR rejects it with a TypeError). The helper
branch for Bacon therefore slipped through the 16-class guard.
Added two regressions:

- ``test_bacon_decomposition`` (unit-level, direct helper call):
  asserts aggregation / headline_attribute / definition wording
  / Goodman-Bacon reference.
- ``test_dr_with_bacon_result_emits_target_parameter``
  (integration): passes a real ``BaconDecompositionResults``
  from ``bacon_decompose`` on a staggered panel through DR,
  asserts the ``target_parameter`` block propagates into DR's
  schema, and confirms the named ``headline_attribute``
  (``twfe_estimate``) exists on the real fit object.

356 BR/DR + guide + target-parameter tests pass. Black and ruff
clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber

igerber commented Apr 21, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: f2fc763457f8afd5730cb12fbc88f0cf0cb9d728


Overall Assessment

⚠️ Needs changes

Executive Summary

  • The prior re-review P3s appear addressed: the BR API docs now use ATT-style TROP wording, and BaconDecompositionResults now has direct target-parameter coverage in the new tests.
  • Severity P1: the new dCDH target-parameter dispatch infers trends_linear=True from linear_trends_effects is not None, but the estimator can legitimately set linear_trends_effects=None when the cumulated-horizon surface is empty while still forcing overall_att to NaN by design. In that case BR/DR fall out of the new no_scalar_by_design path and misreport an intentional no-scalar configuration as an estimation failure.
  • Severity P3: the added tests cover populated trends_linear outputs, but they do not cover the empty-surface case above, so the regression is currently unguarded.
  • Static review only: pytest is not installed here, and the Python environment is also missing required deps like numpy, so I could not execute the added tests.

Methodology

  • Severity P1. Impact: the new dCDH reporting path is not source-faithful on an empty trends-linear surface. [diff_diff/_reporting_helpers.py:L372-L410](/home/runner/work/diff-diff/diff-diff/diff_diff/_reporting_helpers.py#L372) uses linear_trends_effects is not None as the proxy for trends_linear=True, but the estimator itself sets linear_trends_effects = None whenever the cumulated dict is empty while still unconditionally NaN-ing overall_att for trends_linear=True, L_max>=2 at [diff_diff/chaisemartin_dhaultfoeuille.py:L2772-L2834](/home/runner/work/diff-diff/diff-diff/diff_diff/chaisemartin_dhaultfoeuille.py#L2772). Because BR and DR gate their new no-scalar branch off that helper at [diff_diff/business_report.py:L436-L472](/home/runner/work/diff-diff/diff-diff/diff_diff/business_report.py#L436) and [diff_diff/diagnostic_report.py:L930-L959](/home/runner/work/diff-diff/diff-diff/diff_diff/diagnostic_report.py#L930), an empty-horizon trends-linear fit will fall through to the non-trends delta path and be narrated as a failed estimate instead of “no scalar by design.” This is both an undocumented methodology mismatch and a missed empty-result-set edge case. Concrete fix: persist an explicit fit-time marker for trends_linear (or a direct “overall aggregate kind” flag) on ChaisemartinDHaultfoeuilleResults, and drive describe_target_parameter() plus the BR/DR no-scalar routing from that persisted config rather than from the presence of linear_trends_effects.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No separate findings. TODO.md does not currently track the P1 above, so it remains unmitigated.

Security

  • No findings.

Documentation/Tests

  • Severity P3. Impact: the new regression coverage assumes a populated trends-linear surface and misses the empty-surface case that breaks the new dispatch. The current dCDH tests exercise trends_linear=True only with non-empty linear_trends_effects stubs or real fits at [tests/test_target_parameter.py:L234-L254](/home/runner/work/diff-diff/diff-diff/tests/test_target_parameter.py#L234) and [tests/test_target_parameter.py:L665-L786](/home/runner/work/diff-diff/diff-diff/tests/test_target_parameter.py#L665). Concrete fix: add a helper-level stub test and BR/DR end-to-end tests for trends_linear=True with no estimable cumulated horizons, asserting aggregation="no_scalar_headline" and headline.status / headline_metric.status == "no_scalar_by_design".
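
The missing helper-level stub test could be sketched like this (the dispatch function is a stand-in for the suggested fix, routing on fit-time config rather than on surface presence; attribute names follow the finding):

```python
from types import SimpleNamespace

def dcdh_aggregation(results) -> str:
    # Stand-in dispatch: trends_linear=True with L_max >= 2 means
    # no scalar headline by design, whether or not any cumulated
    # horizons survived estimation.
    if getattr(results, "trends_linear", False) and getattr(results, "L_max", 0) >= 2:
        return "no_scalar_headline"
    return "delta"

# The unguarded edge case: trends_linear=True but an empty surface.
empty_fit = SimpleNamespace(trends_linear=True, L_max=2, linear_trends_effects=None)
assert dcdh_aggregation(empty_fit) == "no_scalar_headline"
```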

Path to Approval

  1. Persist trends_linear (or an equivalent explicit aggregate-kind/status field) on ChaisemartinDHaultfoeuilleResults.
  2. Update [diff_diff/_reporting_helpers.py:L372-L410](/home/runner/work/diff-diff/diff-diff/diff_diff/_reporting_helpers.py#L372), [diff_diff/business_report.py:L436-L472](/home/runner/work/diff-diff/diff-diff/diff_diff/business_report.py#L436), and [diff_diff/diagnostic_report.py:L930-L959](/home/runner/work/diff-diff/diff-diff/diff_diff/diagnostic_report.py#L930) to use that persisted flag so every trends_linear=True, L_max>=2 fit routes to the documented no-scalar branch, even when linear_trends_effects is empty.
  3. Add regression coverage for the empty trends-linear surface in [tests/test_target_parameter.py](/home/runner/work/diff-diff/diff-diff/tests/test_target_parameter.py), including both helper-level dispatch and BR/DR rendering/schema assertions.

…rface no-scalar dispatch

R9 surfaced a real P1 edge case: the helper inferred
``trends_linear=True`` from
``linear_trends_effects is not None``, but the estimator can set
``linear_trends_effects = None`` when the cumulated-horizon dict is
empty (no estimable horizons) while still unconditionally NaN-ing
``overall_att`` under ``trends_linear=True`` + ``L_max >= 2``
(``chaisemartin_dhaultfoeuille.py:2828-2834``). The inference
missed that case — an empty-horizon fit would fall through to the
``delta`` branch, BR/DR would extract ``overall_att`` (NaN), and
the headline would be narrated as an estimation failure instead
of "no scalar by design."

Fix:

- Persisted the fit-time ``trends_linear`` flag explicitly on
  ``ChaisemartinDHaultfoeuilleResults`` as a new
  ``Optional[bool]`` field (with a docstring note).
- The dCDH estimator now threads ``_is_trends_linear`` into the
  result constructor at
  ``chaisemartin_dhaultfoeuille.py:3139``.
- ``describe_target_parameter()`` reads the persisted flag first
  and only falls back to the ``linear_trends_effects is not None``
  inference when the flag is absent (older cached fits predating
  the persisted field).

BR/DR no-scalar routing is unchanged — both surfaces dispatch on
``target_parameter["aggregation"] == "no_scalar_headline"``, which
now fires correctly for empty-surface fits too.

Tests:

- ``test_dcdh_trends_linear_with_l_max_geq_2_emits_no_scalar_headline``
  updated to pass ``trends_linear=True`` explicitly (primary
  contract).
- ``test_dcdh_trends_linear_empty_surface_still_no_scalar``: new
  R9 regression. ``linear_trends_effects=None`` + ``trends_linear=True``
  (the empty-surface case) routes to the no-scalar branch.
- ``test_dcdh_legacy_fit_without_persisted_flag_still_routes_correctly``:
  preserves backwards compatibility for cached fits that predate
  the persisted flag — the legacy
  ``linear_trends_effects is not None`` fallback still fires.

368 tests pass (BR/DR + guide + target-parameter + dCDH methodology).
Black and ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber

igerber commented Apr 21, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: c0d012781503e462f986a410ed2d0046cb3cb457


Overall Assessment

⚠️ Needs changes

Executive Summary

  • The prior dCDH BR/DR blocker is addressed: the PR now persists trends_linear on dCDH results and uses it in the new target-parameter/no-scalar routing, which fixes the earlier empty-surface headline bug in BR/DR.
  • Severity P1 [Newly identified]: that same fix is only partially propagated. Remaining public dCDH reporting paths still infer linear-trends mode from linear_trends_effects is not None, so empty-surface trends_linear=True, L_max>=2 fits are still mislabeled outside the new BR/DR headline path.
  • I did not find new variance/SE or NaN-inference regressions in the modified estimator/reporting code.
  • No performance or security blockers stood out in the changed files.
  • Static review only: I could not run the added tests here because pytest is not installed, and direct Python imports also fail because numpy is unavailable.

Methodology

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No separate findings beyond the P1 above.

Tech Debt

  • No separate findings. TODO.md does not track the P1 above, so it remains unmitigated.

Security

  • No findings.

Documentation/Tests

  • Severity P3. Impact: the new regression suite covers the fixed BR/DR headline path, but not the remaining empty-surface consumers. The added no-scalar tests in tests/test_target_parameter.py:711 and tests/test_target_parameter.py:749 stop at BR/DR headline rendering, while the existing BR assumption tests only cover populated linear_trends_effects stubs at tests/test_business_report.py:1805. That leaves the stale assumption/label paths unguarded. Concrete fix: add regression coverage for trends_linear=True, L_max>=2, linear_trends_effects=None asserting that BR’s assumption text still mentions the DID^{fd}/linear-trends identification contract, and that native dCDH result labels/tables do not report delta on that path.

Path to Approval

  1. Update the remaining dCDH reporting/label paths to read results.trends_linear when present, with legacy fallback only when the field is absent. At minimum this needs to cover diff_diff/chaisemartin_dhaultfoeuille_results.py:455, diff_diff/chaisemartin_dhaultfoeuille_results.py:464, diff_diff/chaisemartin_dhaultfoeuille_results.py:605, diff_diff/chaisemartin_dhaultfoeuille_results.py:1058, and diff_diff/business_report.py:1205.
  2. Add a regression test for the empty-surface case (trends_linear=True, L_max>=2, linear_trends_effects=None) that asserts BusinessReport(...).to_dict()["assumption"]["description"] still includes the linear-trends / DID^{fd} identification wording.
  3. Add a native dCDH results regression test for the same case asserting _estimand_label() and to_dataframe("overall") do not label the overall estimand as delta.

…DH reporting consumers

R10 found my R9 fix was partial. The persisted ``trends_linear``
flag was only read by ``describe_target_parameter``. Three other
dCDH reporting paths still inferred trends-linear from
``linear_trends_effects is not None`` and silently mis-labeled
empty-surface fits as ``delta`` or omitted the linear-trends
identification clause:

1. ``ChaisemartinDHaultfoeuilleResults._horizon_label`` and
   ``_estimand_label`` (also reached via
   ``to_dataframe("overall")``) — per-horizon labels and overall
   estimand label.
2. ``ChaisemartinDHaultfoeuilleResults.summary`` — the
   covariate/trend-adjusted tag in the overall-results summary.
3. ``BusinessReport._describe_assumption`` dCDH branch — the
   identifying-assumption prose that names ``DID^{fd}_l`` vs
   ``DID_l``.

Fix: added a ``_has_trends_linear()`` helper on
``ChaisemartinDHaultfoeuilleResults`` that reads the persisted
flag first and falls back to the legacy inference, and rewired
all three result-class callsites to use it. BR's
``_describe_assumption`` branch gained a matching persisted-first-
then-inference lookup via ``getattr(results, "trends_linear", None)``.
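
The persisted-first, inference-fallback lookup described above can be sketched with a stub result class (field names mirror the commit text; this is not the library's actual ``ChaisemartinDHaultfoeuilleResults``):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FakeDcdhResults:
    # Stand-ins for the two fields the lookup reads.
    trends_linear: Optional[bool] = None          # persisted at fit time (new field)
    linear_trends_effects: Optional[dict] = None  # cumulated-horizon surface

def has_trends_linear(results) -> bool:
    # Persisted-first: trust the fit-time flag when present; fall back
    # to surface inference only for legacy fits predating the field.
    flag = getattr(results, "trends_linear", None)
    if flag is not None:
        return bool(flag)
    return getattr(results, "linear_trends_effects", None) is not None

# Empty-surface fit: the persisted flag alone carries the truth.
assert has_trends_linear(FakeDcdhResults(trends_linear=True))
# Legacy cached fit without the flag: the inference fallback fires.
assert has_trends_linear(FakeDcdhResults(linear_trends_effects={1: 0.2}))
assert not has_trends_linear(FakeDcdhResults())
```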

Tests: new ``test_dcdh_empty_surface_propagates_to_assumption_and_native_label``
stubs a ``ChaisemartinDHaultfoeuilleResults`` with
``trends_linear=True``, ``L_max=2``, and
``linear_trends_effects=None`` (the exact R9/R10 edge case), then
asserts:

- ``stub._estimand_label()`` returns ``DID^{fd}_l`` /
  ``DID^{X,fd}_l``, NOT ``delta``.
- ``stub.to_dataframe("overall")`` does not label the overall row
  as ``delta``.
- BR's ``_describe_assumption`` description includes the linear-
  trends / first-differenced identification clause.

527 tests pass across BR/DR + guide + target-parameter + dCDH
methodology + dCDH unit suites. Black and ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber

igerber commented Apr 22, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 5a82ab79bc9af67747e430d6d122ac3c65e6176d


Overall Assessment

✅ Looks good

Executive Summary

Methodology

  • Severity: P3. Impact: The new OLS Wooldridge target-parameter branch uses Wooldridge (2023) in both the user-visible definition and the reference field, but the registry treats OLS ETWFE as the Wooldridge (2025) path and reserves 2023 for the nonlinear ASF extension. This does not affect estimates, but it is a source-provenance mismatch in the new reporting surface. Concrete fix: update the OLS branch at diff_diff/_reporting_helpers.py:242-252 to cite Wooldridge (2025) (or explicitly split 2025 for OLS and 2023 for nonlinear), matching docs/methodology/REGISTRY.md:1301-1303 and docs/methodology/REGISTRY.md:1317-1333.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings. The prior dCDH propagation issue is fixed in code rather than deferred to TODO.md.

Security

  • No findings.

Documentation/Tests

R11 was ✅ with one P3: REGISTRY anchors OLS ETWFE in Wooldridge
(2025) and reserves Wooldridge (2023) for the nonlinear ASF
extension. The target-parameter helper's OLS branch was citing
2023 in both the definition prose and the ``reference`` field.
Updated both to ``Wooldridge (2025)``. The nonlinear branch's
joint ``Wooldridge (2023, 2025)`` reference is unchanged.

334 BR/DR + target-parameter tests pass. Black clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber

igerber commented Apr 22, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 696d2cd141dde5f05dbb94e2b8bb4ae0f42c96d3


Overall Assessment

⚠️ Needs changes

Executive Summary

  • The prior Wooldridge citation issue looks resolved: the OLS ETWFE branch now points to Wooldridge (2025), and the nonlinear ASF branch points to Wooldridge (2023/2025), matching the registry split in docs/methodology/REGISTRY.md:L1299-L1345.
  • The main dCDH fix is mostly in place: the persisted trends_linear flag now flows through the target-parameter helper, BR assumption prose, and the dCDH native estimand label in diff_diff/_reporting_helpers.py:L372-L428, diff_diff/business_report.py:L1205-L1238, and diff_diff/chaisemartin_dhaultfoeuille_results.py:L452-L490.
  • One blocker remains on the exact empty-surface edge case this PR is trying to address: BR/DR now route trends_linear=True, L_max>=2 fits into the no-scalar branch, but they still tell users to inspect linear_trends_effects even when that surface is None, and the public accessor still throws the wrong remediation message.
  • One non-blocking methodology issue remains in the new machine-readable schema: target_parameter.aggregation uses "2x2" for both true 2x2 DiD and TWFE fits because both return DiDResults, so downstream dispatch cannot actually distinguish the two estimands.
  • Static review only: I could not run the new tests here because pytest is not installed and runtime deps like numpy are unavailable.

Methodology

  • Severity: P1 [Newly identified]. Location: diff_diff/_reporting_helpers.py:L407-L428, diff_diff/business_report.py:L443-L469, diff_diff/diagnostic_report.py:L939-L956, with the underlying estimator contract at diff_diff/chaisemartin_dhaultfoeuille.py:L2845-L2858 and surrounding public accessors at diff_diff/chaisemartin_dhaultfoeuille_results.py:L917-L921 and diff_diff/chaisemartin_dhaultfoeuille_results.py:L1268-L1272. Impact: for ChaisemartinDHaultfoeuilleResults fits with trends_linear=True and L_max>=2, the PR correctly recognizes the no-scalar-by-design case, but on the empty-horizon subcase (linear_trends_effects=None) it still tells users to “see linear_trends_effects” even though there is nothing there; to_dataframe("linear_trends") then raises “Pass trends_linear=True to fit()”, which is wrong on precisely this configuration. That is incomplete empty-result handling on a newly changed code path and can produce production-facing errors/misleading guidance. Concrete fix: add an explicit empty-surface branch for trends_linear=True && L_max>=2 && linear_trends_effects is None in describe_target_parameter(), BR/DR headline reasons/prose, and the dCDH public accessors so they say “no estimable cumulated level effects survived” instead of pointing to a nonexistent dict.
  • Severity: P2. Location: diff_diff/_reporting_helpers.py:L77-L106, docs/methodology/REPORTING.md:L90-L93. Impact: the new machine-readable target_parameter.aggregation tag is "2x2" for both DifferenceInDifferences and TwoWayFixedEffects because both emit DiDResults, even though the same definition text acknowledges that TWFE can be a weighted average with forbidden comparisons. Since the schema explicitly says agents should branch on aggregation, this tag is not actually estimator-faithful for TWFE fits. Concrete fix: either persist estimator provenance on DiDResults and split the branch, or replace "2x2" with a neutral non-misleading tag such as did_or_twfe until provenance exists.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings. The blocking dCDH empty-surface issue is not tracked in TODO.md, so it should not be treated as deferred work.

Security

  • No findings.

Documentation/Tests

  • No separate findings beyond the blocker above. Validation note: this was a static review; I could not execute tests/test_target_parameter.py, tests/test_business_report.py, or tests/test_diagnostic_report.py because pytest is unavailable here and imports fail without numpy.

Path to Approval

  1. Update the dCDH no-scalar branch so trends_linear=True, L_max>=2, linear_trends_effects=None emits explicit empty-surface messaging instead of “see results.linear_trends_effects[l].”
  2. Fix the public dCDH inspection surface for that same case, especially to_dataframe("linear_trends"), so it returns a targeted empty-result response or error message rather than telling the user to re-fit with trends_linear=True.
  3. Add regression coverage for that exact configuration: the target_parameter block, BR/DR no-scalar prose, and the to_dataframe("linear_trends") path on a ChaisemartinDHaultfoeuilleResults object with trends_linear=True, L_max=2, and linear_trends_effects=None.
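The guard described in steps 1–3 can be sketched as follows. This is a hypothetical illustration, not the library's code: the stub dataclass stands in for ChaisemartinDHaultfoeuilleResults, and the message strings mirror the review's suggested wording.

```python
# Hypothetical sketch of the empty-surface guard from steps 1-3; DCDHStub is
# a stand-in for ChaisemartinDHaultfoeuilleResults, not the real class.
from dataclasses import dataclass
from typing import Optional


@dataclass
class DCDHStub:
    trends_linear: bool
    L_max: int
    linear_trends_effects: Optional[dict]


def no_scalar_reason(results: DCDHStub) -> str:
    """Branch the no-scalar guidance on whether any horizons survived."""
    if results.trends_linear and results.L_max >= 2:
        if results.linear_trends_effects is None:
            # Empty-surface subcase: never point at a nonexistent dict.
            return "no estimable cumulated level effects survived"
        return "per-horizon effects available; see results.linear_trends_effects[l]"
    return "scalar headline available"
```

Under this sketch, `no_scalar_reason(DCDHStub(True, 2, None))` yields the empty-surface message, while a populated effects dict still routes users to the per-horizon surface.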

…gation tag

R12 identified two issues, both addressed.

P1 (empty-surface dead-end guidance): on
``trends_linear=True, L_max>=2, linear_trends_effects=None`` (no
horizons survived), the PR's no-scalar prose still told users to
"see ``linear_trends_effects``" even though the dict is empty,
and ``to_dataframe("linear_trends")`` raised the wrong
remediation ("Pass ``trends_linear=True`` to fit()") — which the
user already did. Fixed by distinguishing the populated-surface
case from the empty-surface subcase in three places:

- ``describe_target_parameter`` (dCDH no-scalar branch): the
  ``definition`` on empty surfaces now names the empty state
  explicitly ("no cumulated level effects survived estimation")
  and points at re-fit remediation, rather than pointing at the
  nonexistent horizon dict.
- ``BusinessReport._build_schema`` (no-scalar headline): the
  ``reason`` field branches on
  ``getattr(self._results, "linear_trends_effects", None) is None``
  and emits the empty-state message accordingly.
- ``DiagnosticReport._execute`` (no-scalar headline): mirror
  branching for the DR ``headline_metric`` reason + name.
- ``ChaisemartinDHaultfoeuilleResults.to_dataframe("linear_trends")``
  now returns an empty DataFrame with the expected columns when
  ``trends_linear=True`` is already active but no horizons
  survived. The "Pass ``trends_linear=True`` to fit()" error
  fires only when the user actually did not request it.
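The contract from the last bullet can be illustrated with a minimal stand-in. The real method builds a pandas DataFrame; here a plain list stands in for it, and the error message is the one quoted in the review.

```python
# Stand-in for the to_dataframe("linear_trends") contract described above;
# the real method returns a pandas DataFrame, this returns plain rows.
from typing import Optional


def linear_trends_rows(trends_linear: bool, effects: Optional[dict]) -> list:
    if not trends_linear:
        # Only a genuine misconfiguration earns the re-fit remediation.
        raise ValueError("Pass trends_linear=True to fit()")
    if not effects:
        # Empty surface: typed-empty result with the expected shape, no error.
        return []
    return [(horizon, effect) for horizon, effect in sorted(effects.items())]
```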

P2 (ambiguous aggregation tag): both ``DifferenceInDifferences``
and ``TwoWayFixedEffects`` return ``DiDResults``, so the old
``"2x2"`` aggregation tag was not faithful for TWFE fits that can
be weighted averages with forbidden later-vs-earlier weights.
Renamed to ``"did_or_twfe"`` — a neutral tag that signals the
ambiguity until estimator provenance is persisted. Downstream
agents dispatching on the tag now know not to treat TWFE fits as
clean 2x2. REPORTING.md updated to enumerate the new tag.
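A downstream agent branching on the tag might look like this sketch. The tag values come from the PR; the handler strings and function name are illustrative assumptions.

```python
# Illustrative dispatch on the machine-readable aggregation tag; tag values
# are from the PR, handler behavior is assumed.
def interpret_aggregation(tag: str) -> str:
    if tag == "did_or_twfe":
        # Could be a clean 2x2 DiD or a TWFE weighted average with
        # forbidden later-vs-earlier comparisons: do not assume clean 2x2.
        return "verify estimator provenance before interpreting"
    if tag == "no_scalar_headline":
        return "no scalar headline; consult the reason field"
    return f"scalar estimand ({tag})"
```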

Tests: 3 new regressions pin the empty-surface contract
(target-parameter definition names the empty state, BR/DR
headline reasons avoid "see linear_trends_effects", and
``to_dataframe("linear_trends")`` returns an empty frame rather
than raising). Existing ``"2x2"`` assertions updated to
``"did_or_twfe"``.

502 BR/DR + target-parameter + dCDH unit tests pass. Black clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber

igerber commented Apr 22, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 8343eeb2582ae19b4912774f2e0fdfb40b0f80c4


Overall Assessment

Needs changes

Highest unmitigated severity: P1.

Executive Summary

  • The prior aggregation="2x2" ambiguity is fixed: the PR now uses the neutral did_or_twfe tag and documents that deviation in the reporting docs, so I am not counting that as a defect.
  • The prior dCDH empty-surface blocker is only partially resolved. The schema reason and to_dataframe("linear_trends") path were fixed, but several user-facing surfaces still tell users to inspect linear_trends_effects even when that surface is empty.
  • Because this is the exact empty-result edge case on a changed dCDH no-scalar path, it remains a P1 under the review rubric.
  • I did not find a new variance/SE mismatch in the touched estimator code.
  • Static review only. I could not execute the new tests locally because the environment is missing runtime deps such as numpy, so this review is based on source inspection and AST parsing.

Methodology

Code Quality

  • No additional findings.

Performance

  • No findings.

Maintainability

  • No separate finding beyond the P1 above, but the root cause is duplicated no-scalar messaging across the helper, BR, DR, and dCDH result labeling. Centralizing that contract would prevent this class of partial fix.

Tech Debt

  • No mitigating entry exists for this issue under TODO.md#L51, so I did not treat it as deferred work.

Security

  • No findings.

Documentation/Tests

Path to Approval

  1. Make the empty-surface dCDH no-scalar message single-sourced and reuse it in BusinessReport headline rendering, DiagnosticReport overall interpretation, and dCDH overall labels, so those surfaces never say “see linear_trends_effects” when linear_trends_effects is None.
  2. Ensure that same shared message/label respects the controlled branch too, i.e. emits DID^{X,fd}_l when covariates are active instead of the hard-coded DID^{fd}_l.
  3. Add regression tests for the empty-surface stub covering BR rendered prose, DR rendered prose, and dCDH to_dataframe("overall") / overall label surfaces.
  4. Update the reporting docs so headline_attribute=None on no_scalar_headline distinguishes the populated-surface and empty-surface subcases instead of always directing agents to linear_trends_effects.
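Step 4 amounts to documenting two distinguishable schema shapes. The blocks below are illustrative: the field names follow the PR's schema, but the reason wording is assumed.

```python
# Illustrative no-scalar target_parameter blocks; field names follow the PR
# schema, reason wording is assumed. Agents can branch on the reason text.
populated_surface = {
    "aggregation": "no_scalar_headline",
    "headline_attribute": None,
    "reason": "per-horizon effects available; see linear_trends_effects",
}
empty_surface = {
    "aggregation": "no_scalar_headline",
    "headline_attribute": None,
    "reason": "no cumulated level effects survived estimation",
}
```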

…n; branch dCDH native label on empty surface

R13 identified three remaining surfaces that still hardcoded
"see ``linear_trends_effects``" on the empty-surface subcase
(``trends_linear=True, L_max>=2, linear_trends_effects=None``):

1. BR ``_render_headline_sentence`` (headline prose used by
   ``headline()``, ``summary()``, and ``full_report()``).
2. DR ``_render_overall_interpretation`` (top-level paragraph).
3. dCDH ``ChaisemartinDHaultfoeuilleResults._estimand_label``
   (also surfaced via ``to_dataframe("overall")``).

Fix: BR and DR renderers now read the headline-level ``reason``
field (already branched on populated-vs-empty surface in
``_build_schema`` / ``_execute``), so the rendered prose never
drifts from the schema message. ``_estimand_label`` on dCDH
results gains an empty-surface branch that returns
``DID^{fd}_l (no cumulated level effects survived estimation)``
(or ``DID^{X,fd}_l (...)`` when covariates are active) instead
of pointing at an empty dict.

Docs: REPORTING.md and business_report.rst now document the
``headline.reason``-based populated-vs-empty branching for
agents dispatching on ``aggregation="no_scalar_headline"``.

Tests: 4 new regressions pin the rendered-prose contract on
the empty-surface stub: (a) BR ``headline()`` /
``summary()`` / ``full_report()`` emit the empty-surface
remediation wording, NOT "see linear_trends_effects"; (b) DR
``interpretation`` does the same; (c) dCDH ``_estimand_label``
returns the empty-state label rather than pointing at the
empty dict; (d) ``to_dataframe("overall")`` surfaces the
empty-state label.

505 BR/DR + target-parameter + dCDH tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber

igerber commented Apr 22, 2026

/ai-review


@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 5046a51e98c1ad4a2a59d6f1cafec5c782cfc046


Overall Assessment

⚠️ Needs changes

Highest unmitigated severity: P1.

Executive Summary

  • The prior empty-surface dCDH blocker is mostly fixed: the changed BR/DR/native-label/docs paths no longer point users to a nonexistent linear_trends_effects table on the uncontrolled empty-surface branch.
  • One methodology-facing P1 remains. On the covariate-adjusted empty-surface dCDH branch, the new reporting layer still hardcodes the unadjusted label DID^{fd}_l in several user-facing strings instead of DID^{X,fd}_l.
  • This does not change estimates or SEs, but it does misstate the estimand on exactly the reporting surface this PR is adding.
  • The new empty-surface tests only exercise covariate_residuals=None, so the controlled branch regression slipped through.
  • I did not find a new variance/SE defect in the touched estimator code.
  • Static review only: I could not run the test suite because pytest is not installed in this environment.

Methodology

Code Quality

  • No additional findings beyond the P1 above.

Performance

  • No findings.

Maintainability

  • No separate finding, but the remaining P1 exists because the empty-surface wording is still duplicated across the helper and both report builders instead of being generated from one shared contract.

Tech Debt

  • No separate findings. TODO.md does not mitigate the P1 above.

Security

  • No findings.

Documentation/Tests

Path to Approval

  1. Make the dCDH empty-surface message control-aware everywhere it is emitted: describe_target_parameter(), BR headline.reason, and DR headline_metric.reason should use DID^{X,fd}_l when covariate_residuals is active.
  2. Add regression coverage for the covariate-adjusted empty-surface case (trends_linear=True, L_max>=2, covariate_residuals populated, linear_trends_effects=None) that asserts the helper definition and BR/DR reason/prose surfaces all use the DID^{X,fd}_l label.

When trends_linear=True + L_max>=2 + linear_trends_effects=None
(empty-surface subcase) AND covariate_residuals is populated, the
reporting-layer prose now emits the covariate-adjusted label
``DID^{X,fd}_l`` rather than the bare ``DID^{fd}_l``. Propagates to:

- ``_reporting_helpers.describe_target_parameter``: empty-surface
  definition_text now interpolates the already-control-aware
  estimand_label (``DID^{X,fd}_l`` when has_controls, else
  ``DID^{fd}_l``).
- ``BusinessReport._build_schema``: reads ``covariate_residuals`` to
  select the empty-surface label used in ``headline.reason``.
- ``DiagnosticReport._execute``: mirrors BR's control-aware label
  selection for ``headline_metric.reason``.
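The control-aware selection propagated above reduces to a small branch. In this sketch, covariate_residuals is the fit-time attribute named in the PR; the function name and message wording are illustrative.

```python
# Minimal sketch of the control-aware empty-surface label selection;
# covariate_residuals is the attribute named in the PR, the rest is assumed.
def empty_surface_label(covariate_residuals) -> str:
    has_controls = covariate_residuals is not None
    base = "DID^{X,fd}_l" if has_controls else "DID^{fd}_l"
    return f"{base} (no cumulated level effects survived estimation)"
```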

Regression test added: ``test_dcdh_empty_surface_with_controls_*``
(three tests covering target_parameter definition, BR/DR reason
fields, and rendered prose surfaces). Asserts every consumer emits
``DID^{X,fd}_l`` on the covariate-adjusted empty-surface subcase
and does NOT emit bare ``DID^{fd}_l`` as a stale fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber

igerber commented Apr 22, 2026

/ai-review R14 addressed: control-aware empty-surface label. The covariate-adjusted empty-surface subcase (trends_linear=True + L_max>=2 + covariate_residuals populated + linear_trends_effects=None) now emits DID^{X,fd}_l consistently across the helper definition, BR headline.reason, DR headline_metric.reason, and rendered prose. Regression tests added (3) asserting every surface uses the covariate-adjusted label and does not emit bare DID^{fd}_l as a stale fallback.

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: d6ab7ec9116b3f08fa82bcb38b3ee7cfbeea7a81


Overall Assessment

✅ Looks good

Executive Summary

Methodology

  • No findings. Affected methods reviewed were the BR/DR target-parameter dispatch across supported result classes, with special attention to dCDH trends_linear=True, L_max>=2. The prior P1 methodology-facing label mismatch appears fixed, and the new no-scalar branch is documented rather than silent.

Code Quality

  • No findings. Moving per-estimator target-parameter dispatch into the shared helper reduces BR/DR drift and fixes the earlier duplicated empty-surface wording problem by centralizing the core contract in diff_diff/_reporting_helpers.py#L25.

Performance

  • No findings. The PR is reporting-layer metadata/prose work plus one persisted result flag; I did not see a new hot-path estimator cost in the changed code.

Maintainability

Tech Debt

  • No findings. I did not identify a new deferrable limitation that should be added to TODO.md.

Security

  • No findings.

Documentation/Tests

@igerber igerber added the ready-for-ci Triggers CI test workflows label Apr 22, 2026
@igerber igerber merged commit 8999baf into main Apr 22, 2026
23 of 24 checks passed
@igerber igerber deleted the br-dr-target-parameter branch April 22, 2026 10:21