
Add v3.0 announcement carousel#283

Merged
igerber merged 3 commits into main from v3-announcement-carousel
Apr 7, 2026

Conversation

@igerber (Owner) commented Apr 7, 2026

Summary

  • LinkedIn PDF carousel (8 slides) announcing diff-diff v3.0's survey design-based inference
  • Warm ivory (#FDFBF7) / burnt sienna (#B45309) paper aesthetic — visual break from previous carousels
  • Generation script at carousel/generate_v3_carousel.py using FPDF2 + matplotlib (same toolchain as existing carousels)

Slide content

  1. Hook — "The first DiD library with design-based survey inference"
  2. Competitive gap table (Strata, FPC, Replicate Wts: diff-diff vs R did vs Stata csdid)
  3. DEFF visualization — paired CI bars showing naive vs design-based conclusions
  4. Binder (1983) variance formula applied to DiD influence functions
  5. Feature cards — TSL, replicate weights, survey-aware bootstrap
  6. Validation — API, NHANES, RECS cross-validated against R (< 1e-10)
  7. Code example — CallawaySantAnna + SurveyDesign (validated against actual API signatures)
  8. CTA — pip install, GitHub link
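Slide 3's paired-CI comparison can be sketched with matplotlib. The ATT point estimate, naive SE, and DEFF below are made-up illustrative numbers, chosen so the two intervals lead to different conclusions; they are not values from the carousel or the library.

```python
# Sketch of the slide-3 idea: the same ATT with a naive SE vs a design-based
# SE inflated by the design effect (DEFF). All numbers are illustrative.
import math
import matplotlib
matplotlib.use("Agg")  # headless rendering
import matplotlib.pyplot as plt

att, se_naive, deff = -0.042, 0.018, 2.4
se_design = se_naive * math.sqrt(deff)  # SE scales with sqrt(DEFF)

fig, ax = plt.subplots(figsize=(6, 3))
ax.errorbar([0], [att], yerr=[1.96 * se_naive], fmt="o", capsize=6, label="naive (iid)")
ax.errorbar([1], [att], yerr=[1.96 * se_design], fmt="o", capsize=6, label="design-based")
ax.axhline(0, linestyle="--", linewidth=1)  # naive CI excludes 0; design-based does not
ax.set_xticks([0, 1], ["naive", "design-based"])
ax.set_ylabel("ATT with 95% CI")
ax.legend()
fig.savefig("deff_ci.png", dpi=150)
```

With these numbers the naive 95% CI is roughly [-0.077, -0.007] while the design-based CI is roughly [-0.097, 0.013], which is exactly the "different conclusions" contrast the slide visualizes.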

Claim verification

  • Competitive gap sourced from docs/methodology/survey-theory.md Section 1.3 and actual R/Stata package docs
  • Precision badges sourced from tests/test_survey_real_data.py:40 ("observed gaps < 1e-10")
  • 7 estimators validated against R golden values (DiD, TWFE, CS, Imputation, Stacked, SA, DDD)
  • Bacon asterisked as diagnostic throughout
  • "All packages support cluster-robust inference" footnote added to gap table
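The "< 1e-10" precision badge corresponds to a max-absolute-gap parity check against R reference output. A minimal sketch of that shape follows; the `max_abs_gap` helper and the numbers are hypothetical stand-ins, not the actual tests/test_survey_real_data.py code or the project's golden values.

```python
# Illustrative shape of a cross-validation parity check: compare Python
# estimates against R reference values and require agreement below 1e-10.

def max_abs_gap(py_values, r_values):
    """Largest elementwise absolute difference between two result vectors."""
    assert len(py_values) == len(r_values)
    return max(abs(p - r) for p, r in zip(py_values, r_values))

# Placeholder vectors standing in for (ATT, SE) pairs from both stacks.
py_att = [0.1234567890123456, -0.0456789012345678]
r_att  = [0.1234567890123456, -0.0456789012345678]

assert max_abs_gap(py_att, r_att) < 1e-10, "observed gaps < 1e-10"
```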

Methodology references

  • N/A — no estimator or math changes; carousel generation script only

Validation

  • Tests added/updated: No test changes
  • PDF generated and visually reviewed (8 pages, 90KB)

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

LinkedIn PDF carousel (8 slides) announcing diff-diff v3.0's headline
feature: design-based variance estimation for complex surveys across
all estimators. Warm ivory/burnt sienna paper aesthetic.

Slides: claim hook, competitive gap table, DEFF CI visualization,
Binder theorem equation, feature cards, R validation, code example, CTA.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions bot commented Apr 7, 2026

Overall Assessment
⚠️ Needs changes. No P0s, but there are two unmitigated P1 methodology-claim mismatches in the carousel copy.

Executive Summary

  • Slides 1 and 5 overstate survey-feature coverage by pairing TSL, replicate weights, and survey bootstrap with “All 16 estimators,” while the published support matrix documents estimator-specific limits.
  • Slide 4 presents the Binder/influence-function argument as if it applies to all modern DiD estimators, but the source methodology explicitly excludes SyntheticDiD and TROP from that smooth-functional / IF chain.
  • The CallawaySantAnna code example and the listed replicate-weight methods appear aligned with the current API and survey docs.
  • No security or performance issues stood out in the new generator; the blockers are claim accuracy in the generated content.

Methodology

  • Severity P1. Affected methods: SyntheticDiD, TROP. Location: slide-4 reasoning text at carousel/generate_v3_carousel.py:509 and the surrounding slide block at carousel/generate_v3_carousel.py:459; compare docs/methodology/survey-theory.md:356 through line 368, plus docs/methodology/REGISTRY.md:1298 and docs/methodology/REGISTRY.md:1752.
    Impact: The slide says “Modern DiD estimators are smooth functionals of F” and that their IFs are well-defined/design-independent without qualification. The methodology docs explicitly carve out SyntheticDiD and TROP as non-smooth/bootstrap-only cases, so the public mathematical justification is broader than the documented theory.
    Concrete fix: Reword slide 4 to scope the Binder argument to IF-amenable / analytical-survey estimators, or add an explicit exception sentence that SyntheticDiD and TROP use Rao-Wu survey bootstrap rather than Binder/TSL linearization; then regenerate carousel/diff-diff-v3-carousel.pdf.

  • Severity P1. Affected methods: SyntheticDiD, TROP, WooldridgeDiD, BaconDecomposition, and the library-wide survey-support claim. Location: slide-1 teaser block at carousel/generate_v3_carousel.py:347, slide-5 feature cards at carousel/generate_v3_carousel.py:564, and slide-5 callout at carousel/generate_v3_carousel.py:599; compare the support matrix at docs/choosing_estimator.rst:602 through line 703 and the shipped-count note at docs/survey-roadmap.md:50 through line 54.
    Impact: The docs are clear that all estimators accept survey_design, but not all support the same variance paths. The carousel collapses those distinctions by putting TSL, replicate weights, survey bootstrap, and “All 16 estimators” into one message, even though the current docs say replicate weights are only on 12/16, SyntheticDiD/TROP use bootstrap-only full-design survey support, WooldridgeDiD has analytical TSL only, and BaconDecomposition is diagnostic-only.
    Concrete fix: Replace the blanket “All 16 estimators” wording with either the true per-feature counts or a qualified statement such as “survey support across 16 estimators, with estimator-specific variance paths,” and expand the asterisk text to cover the documented exceptions before regenerating the PDF.
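For context on the scoping fix, the Binder (1983) linearization the slide leans on can be sketched as follows, in standard survey-inference notation rather than the docs' own. For a smooth functional with a well-defined influence function, the design-based variance comes from applying the survey variance estimator to the weighted influence-function total:

```latex
% Binder (1983) linearization, sketched. For a smooth functional
% \theta = T(F) with influence function \psi(\cdot):
\hat\theta - \theta \;\approx\; \frac{1}{\hat N}\sum_{i \in s} w_i\, \psi(y_i),
\qquad \hat N = \sum_{i \in s} w_i,
% so the design-based variance of \hat\theta is obtained by applying the
% survey variance estimator (stratified, clustered, with FPC) to the
% weighted influence-function total:
\widehat{V}(\hat\theta) \;=\;
\widehat{V}_{\mathrm{design}}\!\Big(\tfrac{1}{\hat N}\textstyle\sum_{i \in s} w_i\, \psi(y_i)\Big).
```

When $T$ is not a smooth functional of $F$, no such $\psi$ exists, which is why the carve-out routes non-smooth estimators to a replication method such as the Rao-Wu bootstrap instead of the Binder/TSL path.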

Code Quality

  • No findings in scope.

Performance

  • No findings in scope.

Maintainability

  • No findings in scope.

Tech Debt

  • No findings. TODO.md does not currently track the two methodology-copy mismatches above, so they are unmitigated under the project’s deferred-work policy.

Security

  • No findings in scope.

Documentation/Tests

Path to Approval

  1. Update slide 4 in carousel/generate_v3_carousel.py:459 so the Binder/IF argument is explicitly scoped to IF-based analytical-survey estimators, or explicitly carve out SyntheticDiD and TROP, then regenerate carousel/diff-diff-v3-carousel.pdf.
  2. Update the slide 1 / slide 5 messaging in carousel/generate_v3_carousel.py:347, carousel/generate_v3_carousel.py:564, and carousel/generate_v3_carousel.py:599 so it matches the support matrix in docs/choosing_estimator.rst:602, then regenerate the PDF.

- Slide 4: "Most modern DiD estimators" (not all); add 4th bullet
  carving out SyntheticDiD/TROP as Rao-Wu bootstrap-only
  (survey-theory.md Section 4.2a)
- Slide 1: "Survey support across all 16 estimators*" with asterisk
  noting variance paths vary by estimator
- Slide 5: Expanded asterisk covering TSL/Rao-Wu split and Bacon
- Slide 6: "validated against R reference implementations" (not
  "golden values") to avoid implying all 7 used survey::svyglm()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@igerber (Owner, Author) commented Apr 7, 2026

/ai-review

@github-actions bot commented Apr 7, 2026

🔁 AI review rerun (requested by @igerber)

Head SHA: 7b7278ec933e613749e3b848edc7157fdf20f257


Overall Assessment

⚠️ Needs changes — the re-review resolves the earlier slide-4 theory-scope blocker, but one unmitigated P1 methodology/support-matrix mismatch remains in slide 5.

Executive Summary

  • Re-review result: the prior slide-4 Binder/IF overreach appears resolved. The revised copy now explicitly carves out SyntheticDiD and TROP from the IF/Binder path.
  • The earlier slide-6 validation-copy concern also appears resolved: “R reference implementations” is materially narrower and better aligned than implying seven direct survey::svyglm() parity checks.
  • One P1 remains on slide 5: the asterisk says TSL for 14 estimators, but the documented support matrix lists only 13 analytical/TSL-capable estimators.
  • Because the same slide still pairs “Replicate Weights” with “Survey support across all 16 estimators,” the safest fix is to use exact per-feature counts or a purely qualitative support-matrix qualifier.
  • No estimator code, inference code, NaN handling, or parameter-propagation paths changed in this PR, so the edge-case checklist did not surface new production-code risks.

Methodology

Code Quality

  • No findings in scope. This PR does not touch estimator or inference implementation, and I did not see the known statistical anti-patterns in the added generator.

Performance

  • No findings in scope. The script is offline asset generation and introduces no production-path performance risk.

Maintainability

  • No findings in scope.

Tech Debt

  • No mitigating tracked item applies here. I did not find a matching entry under TODO.md:51, so the methodology mismatch above remains unmitigated.

Security

  • No findings in scope.

Documentation/Tests

  • No additional findings. The revised validation wording looks materially better aligned with the current test surface. I did not re-render the PDF in this read-only review; this assessment is based on the checked-in generator and the committed artifact.

Path to Approval

  1. Update the slide-5 callout/asterisk in carousel/generate_v3_carousel.py:601 and carousel/generate_v3_carousel.py:606 so it matches the documented matrix exactly. If you keep counts, they need to reflect the shipped docs (TSL on 13 estimators; replicate weights only where documented; SyntheticDiD/TROP bootstrap-only; BaconDecomposition diagnostic-only).
  2. Regenerate carousel/diff-diff-v3-carousel.pdf from the corrected script so the binary matches the fixed copy.

AI review caught that "TSL for 14" should be 13 (SyntheticDiD,
TROP, and Bacon excluded). Replace with qualitative statement
to avoid count mismatches with the support matrix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@igerber (Owner, Author) commented Apr 7, 2026

/ai-review

@github-actions bot commented Apr 7, 2026

🔁 AI review rerun (requested by @igerber)

Head SHA: c18f7e4b695ecac27fa9d7fc026765081ba20f7a


Overall Assessment

Looks good — no unmitigated P0 or P1 findings. The prior methodology concerns called out in the earlier AI review appear resolved in the current carousel copy.

Executive Summary

  • The previous slide-5 blocker is resolved: the script no longer hard-codes an unsupported TSL estimator count and now points readers to the estimator-specific support matrix.
  • The previous slide-4 Binder/IF scope concern is also resolved: SyntheticDiD and TROP are now explicitly carved out to Rao-Wu bootstrap rather than being implied to follow the IF/Binder path.
  • The slide-1 “first DiD library” hook is directly supported by the survey methodology note.
  • The validation slide is materially aligned with the current test surface: real-data parity against R survey plus broader R reference-implementation coverage.
  • No estimator, weighting, variance/SE, identification, or default-behavior code changed in this PR, so the edge-case checklist did not surface new production correctness risks.

Methodology

Code Quality

  • No findings in scope. This PR adds a standalone asset generator and PDF; it does not modify estimator or inference code.

Performance

  • No findings in scope. The new code runs only for offline carousel generation and does not affect library runtime paths.

Maintainability

  • No findings in scope. For this kind of one-off generator, the slide-specific helpers and claim/source docstrings are sufficiently auditable.

Tech Debt

  • No findings in scope. I did not find a relevant unmitigated entry under TODO.md:51, and I did not identify a new issue that needs to be tracked there.

Security

  • No findings in scope. I did not see secrets, credentials, or unsafe data-handling patterns in the added script or committed artifact.

Documentation/Tests

@igerber added the ready-for-ci label (Triggers CI test workflows) on Apr 7, 2026
@igerber igerber merged commit 7021ce5 into main Apr 7, 2026
@igerber igerber deleted the v3-announcement-carousel branch April 7, 2026 16:28