Comprehensive documentation review and update #206
Fix incorrect parameter names across 12 documentation pages (treated=/post= should be treatment=/time=), including complete rewrites of diagnostics examples to match actual function signatures.
Add 3 new API pages: TwoStageDiD (Gardner 2022), BaconDecomposition (Goodman-Bacon 2021), and built-in Datasets (Card & Krueger, Castle Doctrine, etc.).
Restructure API reference to single entry point via api/index, eliminating confusing duplicate navigation.
Add all missing estimators and functions to autosummary index.
Expand Choosing an Estimator with 6 new estimators in flowchart, quick reference table, detailed guidance sections, and SE methods table.
Add 9 new troubleshooting sections covering Rust backend, TROP tuning failures, ContinuousDiD discrete dose warnings, Imputation/TwoStage data issues, Bacon panel requirements, and deprecation warnings.
Update front page features (5 → 7 bullets, all 13+ estimators), comparison pages (fix inaccurate feature flags, add 7-9 new feature rows), and data preparation docs (add 6 missing generation functions).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Overall Assessment
Executive Summary
Methodology
No other methodology defects stood out in the changed estimator overview pages; the documented TwoStageDiD deviations already recorded in the registry are not blockers.
Code Quality
No findings in the underlying implementation from this docs-only diff.
Performance
No findings from the changed files.
Maintainability
No standalone finding beyond the documentation drift already called out above.
Tech Debt
No new tracked-tech-debt issue was added in TODO.md.
Security
No findings.
Documentation/Tests
I did not run the docs build in this environment because project dependencies are missing here (
Path to Approval
|
Fix 27 documentation bugs where code examples used wrong kwargs, attributes, or calling conventions vs. the actual API:
- CallawaySantAnna: SE description (influence function, not simple difference), .overall_att not .att, aggregate via fit() not method
- BaconDecomposition/ImputationDiD: first_treat= not treatment=
- MultiPeriodDiD: reference_period is fit() param, not __init__()
- SyntheticDiD: remove nonexistent treatment_start param
- DifferenceInDifferences: cluster= in __init__(), not cluster_col= in fit()
- TROP: lambda_time_grid/lambda_unit_grid, min 2 periods not 4
- balance_panel: unit_column=/time_column= not unit=/time=
- make_treatment_indicator: treated_values= (plural), returns DataFrame
- create_event_time: time_column=/treatment_time_column=
- aggregate_to_cohorts: unit_column=/time_column=/treatment_column=
- generate_did_data: treatment_period= not treatment_start=
- PlaceboTestResults: .placebo_effect not .effect
- HonestDiD: method=/M= not delta=DeltaRM()
- DeltaRM: Mbar= not M_bar=
- datasets.rst: first_treat= not cohort=

Add tests/test_doc_snippets.py that extracts and executes all .. code-block:: python snippets from 11 RST files, catching TypeError/AttributeError to prevent kwarg regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
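The snippet test described in this commit could look roughly like the sketch below. It is not the contents of tests/test_doc_snippets.py: the helper names extract_code_blocks and find_kwarg_errors, the sample RST text, and the toy fit() function are all hypothetical and exist only to illustrate the extract-then-execute idea using the standard library.

```python
import textwrap


def extract_code_blocks(rst_text, directive=".. code-block:: python"):
    """Return the dedented body of every matching directive in rst_text."""
    blocks, lines, i = [], rst_text.splitlines(), 0
    while i < len(lines):
        line = lines[i]
        if line.strip() != directive:
            i += 1
            continue
        indent = len(line) - len(line.lstrip())
        i += 1
        # Skip directive options such as ":linenos:" placed right after the directive.
        while i < len(lines) and lines[i].strip().startswith(":"):
            i += 1
        body = []
        # The body is every following line that is blank or indented deeper
        # than the directive itself.
        while i < len(lines) and (
            not lines[i].strip()
            or len(lines[i]) - len(lines[i].lstrip()) > indent
        ):
            body.append(lines[i])
            i += 1
        blocks.append(textwrap.dedent("\n".join(body)).strip("\n"))
    return blocks


def find_kwarg_errors(snippet, namespace=None):
    """Execute a snippet; return the TypeError/AttributeError it raises, else None."""
    try:
        exec(compile(snippet, "<doc-snippet>", "exec"), dict(namespace or {}))
    except (TypeError, AttributeError) as exc:
        return exc
    return None


# Hypothetical documentation page: the first snippet uses stale kwarg names.
SAMPLE_RST = """\
Usage
-----

.. code-block:: python

    def fit(data, treatment, time):
        return len(data)

    fit([1, 2], treated="d", post="t")

Some prose.

.. code-block:: python

    total = sum([1, 2, 3])
"""

blocks = extract_code_blocks(SAMPLE_RST)
errors = [find_kwarg_errors(block) for block in blocks]
```

In this sketch the first snippet yields a TypeError for the unexpected treated= keyword and the second yields None, which is the kind of regression the commit says the test is meant to catch.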
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
Highest unmitigated finding: a P1 methodology mismatch in the new Bacon guidance.
Executive Summary
Methodology
Code Quality
No findings.
Performance
No findings.
Maintainability
No separate findings beyond the documentation/test issues below.
Tech Debt
No separate finding, but the documentation-test blind spots below are not tracked in
Security
No findings.
Documentation/Tests
Source-only note: I did not run the new snippet test in this environment because shell execution is failing here.
Path to Approval
|
… test
- Fix Bacon wording: "negative weights" → accurate Goodman-Bacon (2021) language
- Fix first_treat column name in Bacon examples (2 files)
- Fix 15 doc snippet bugs: wrong reference_period, column names, return types, and estimator-data mismatches (SyntheticDiD, wild bootstrap, diagnostics)
- Harden test_doc_snippets: fail on all exceptions except NameError
- Add mock dataset loaders so dataset page snippets execute without network
- Add dose/exposure columns to test namespace for troubleshooting snippets

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
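The hardened policy in this commit ("fail on all exceptions except NameError") can be illustrated with a small sketch. The function name run_snippet and the decision to report a NameError as "skipped" are assumptions made for illustration; the commit does not say how the real test treats the tolerated case.

```python
def run_snippet(snippet, namespace):
    """Execute a doc snippet against a copy of the shared test namespace.

    A NameError is tolerated (assumed here to mean the snippet depends on
    context defined elsewhere on the page); every other exception propagates
    so the calling test fails.
    """
    try:
        exec(compile(snippet, "<doc-snippet>", "exec"), dict(namespace))
    except NameError:
        return "skipped"
    return "passed"


# Hypothetical shared namespace; the commit mentions adding dose/exposure
# columns so troubleshooting snippets can run.
test_namespace = {"dose": [0.0, 1.5, 3.0]}

status_ok = run_snippet("doubled = [d * 2 for d in dose]", test_namespace)
status_missing = run_snippet("summary = results.summary()", test_namespace)

try:
    run_snippet("int('not a number')", test_namespace)
    raised = None
except ValueError as exc:
    raised = exc
```

The trade-off is that a genuinely misspelled variable is also tolerated, which is one reason pre-populating the namespace with commonly used names narrows the blind spot.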
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
Highest unmitigated finding: a P1 inference-documentation mismatch in
Executive Summary
Execution note: shell access is failing in this environment, so this is a source-only re-review.
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
…coverage
- Fix CallawaySantAnna SE table: replace invalid 'bayes' bootstrap weight with list of valid types (rademacher, mammen, webb)
- Fix HonestDiD examples: correct constructor (method/M instead of delta=DeltaRM), fix result attributes (ci_lb/ci_ub not robust_ci), fix plot return types (ax not fig)
- Fix PreTrends examples: correct MultiPeriodDiD calling convention (post_periods/reference_period on fit()), fix compute_mdv and plot_pretrends_power kwargs
- Fix visualization example: plot_event_study returns Axes not Figure
- Add honest_did.rst and pretrends.rst to doc snippet test coverage
- Add matplotlib>=3.5 to dev deps; remove _KNOWN_FAILURES xfail set

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
Highest unmitigated finding: a P1 methodology-sensitive inference mismatch in docs/api/two_stage.rst:L162-L165, where the new comparison table labels
Execution note: this was a source-only re-review. I could not run the doc tests locally because
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
…overage
- Fix CallawaySantAnna bootstrap label: "Wild bootstrap" → "Multiplier bootstrap (IF/WIF)"
- Fix cluster_col → cluster on constructor in troubleshooting and choosing_estimator docs
- Fix plot_honest_event_study to single-arg call, plot_bacon return type (ax not fig)
- Fix dataset docstrings: cohort= → first_treat=
- Fix HonestDiD/SyntheticDiD API calls in comparison pages
- Add :: shorthand block extractor to test_doc_snippets.py
- Add python_comparison.rst and r_comparison.rst to tested pages

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
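For readers unfamiliar with the reStructuredText :: shorthand mentioned above: a paragraph ending in a double colon opens a literal block made of the indented lines that follow. A minimal extractor might look like the sketch below; extract_literal_blocks is a hypothetical name, and the actual extractor in test_doc_snippets.py may handle more cases (for example a standalone :: paragraph or quoted literal blocks).

```python
import textwrap


def extract_literal_blocks(rst_text):
    """Return the bodies of literal blocks introduced by a trailing '::'."""
    blocks, lines, i = [], rst_text.splitlines(), 0
    while i < len(lines):
        line = lines[i]
        stripped = line.strip()
        # Explicit directives (".. note::") also end in "::" but are not shorthand.
        if not stripped.endswith("::") or stripped.startswith(".."):
            i += 1
            continue
        indent = len(line) - len(line.lstrip())
        i += 1
        body = []
        # Collect blank lines and lines indented deeper than the paragraph.
        while i < len(lines) and (
            not lines[i].strip()
            or len(lines[i]) - len(lines[i].lstrip()) > indent
        ):
            body.append(lines[i])
            i += 1
        text = textwrap.dedent("\n".join(body)).strip("\n")
        if text:
            blocks.append(text)
    return blocks


# Hypothetical page mixing the shorthand with an explicit directive.
SAMPLE_RST = """\
Compute the estimate::

    att = 2 + 3

.. code-block:: python

    ignored = True
"""

literal_blocks = extract_literal_blocks(SAMPLE_RST)
```

Only the shorthand block is returned here; the explicit code-block directive is left to a separate extractor.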
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
✅ Looks good
Highest unmitigated finding: P2 documentation/test validation gap in tests/test_doc_snippets.py:150-154 and tests/test_doc_snippets.py:333-344; the new smoke test still lets several edited broken snippets through.
Execution note: this was a source-only re-review. I could not run
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
- Add autouse fixture to restore diff_diff.datasets after each doc snippet test, preventing mock module from leaking into other tests
- Add missing DifferenceInDifferences import in choosing_estimator.rst clustering example
- Track context-dependent snippet validation gap in TODO.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
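The leak this commit guards against comes from replacing an entry in sys.modules with a mock and never putting the original back. The sketch below shows the save-and-restore pattern as a plain context manager so it runs without pytest; preserve_module and the module name example_pkg.datasets are hypothetical stand-ins, not the project's fixture.

```python
import sys
import types
from contextlib import contextmanager


@contextmanager
def preserve_module(name):
    """Restore sys.modules[name] to its prior state when the block exits."""
    missing = object()  # sentinel: the module was not imported beforehand
    original = sys.modules.get(name, missing)
    try:
        yield
    finally:
        if original is missing:
            sys.modules.pop(name, None)
        else:
            sys.modules[name] = original


MODULE_NAME = "example_pkg.datasets"  # hypothetical stand-in for the mocked module

with preserve_module(MODULE_NAME):
    # A snippet test installs a mock loader module here.
    sys.modules[MODULE_NAME] = types.ModuleType(MODULE_NAME)
    present_inside = MODULE_NAME in sys.modules

present_after = MODULE_NAME in sys.modules
```

Under pytest, the same save/yield/restore body can be written as a generator function decorated with @pytest.fixture(autouse=True) so it wraps every test in the module. A mock that is also set as an attribute on the parent package would need that attribute restored as well, which this sketch does not cover.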
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
Highest unmitigated finding:
Execution note: this was a source-only re-review. I could not run
Executive Summary
Methodology
Code Quality
No findings.
Performance
No findings.
Maintainability
No findings. The prior
Tech Debt
Security
No findings.
Documentation/Tests
Path to Approval
|
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
Highest unmitigated finding:
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
Highest unmitigated finding:
Executive Summary
Methodology
Code Quality
No findings.
Performance
No findings.
Maintainability
No findings.
Tech Debt
Security
No findings.
Documentation/Tests
Path to Approval
|
…ation
- Fix always-treated snippet that silently dropped never-treated controls by matching actual estimator logic (first_treat > 0 && <= min_period)
- Replace stale results.bootstrap()/aggregate()/att with correct CS API (n_bootstrap= at constructor, aggregate= at fit time, overall_att)
- Fix SyntheticDiD R comparison to derive ever-treated indicator and post_periods from data instead of passing time-varying treatment

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
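The always-treated condition quoted in this commit (first_treat > 0 && <= min_period) matters when never-treated units carry a sentinel value in first_treat. The sketch below assumes that sentinel is 0, which is an assumption for illustration, and the "naive" filter is a guess at the kind of bug described rather than the original snippet. Function names and data are hypothetical.

```python
def always_treated_units(first_treat_by_unit, min_period):
    """Units already treated in the first observed period.

    Mirrors the condition quoted in the commit: first_treat > 0 and
    first_treat <= min_period. Never-treated units (assumed to be coded
    as first_treat == 0) are excluded by the lower bound.
    """
    return sorted(
        unit
        for unit, first_treat in first_treat_by_unit.items()
        if 0 < first_treat <= min_period
    )


def naive_always_treated_units(first_treat_by_unit, min_period):
    """Illustrative buggy filter: the missing lower bound also catches 0."""
    return sorted(
        unit
        for unit, first_treat in first_treat_by_unit.items()
        if first_treat <= min_period
    )


# Hypothetical panel starting in 2000: unit "a" is never treated.
first_treat = {"a": 0, "b": 2000, "c": 2004, "d": 1999}

correct = always_treated_units(first_treat, min_period=2000)
naive = naive_always_treated_units(first_treat, min_period=2000)
```

With this data the correct filter flags "b" and "d", while the naive one also flags the never-treated unit "a", so dropping the flagged units would silently remove a control.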
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
No unmitigated
Executive Summary
Methodology
No unmitigated
Code Quality
No findings.
Performance
No findings.
Maintainability
No findings.
Tech Debt
Security
No findings.
Documentation/Tests
Review limitation: I could not execute |
…ypes
Restrict test harness ImportError handling to known third-party modules (pyfixest, linearmodels, differences) so broken diff_diff imports fail CI.
Fix REGISTRY.md Bacon never-treated assumption to match implementation.
Fix fig= to ax= across visualization docs (plot_* returns Axes, not Figure).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
✅ Looks good
Executive Summary
Methodology
No findings. I cross-checked the changed methodology-facing docs against the Methodology Registry and in-code references; the PR does not change estimator math, weighting, SEs, assumptions, or defaults in code.
Code Quality
No findings.
Performance
No findings.
Maintainability
No findings.
Tech Debt
Security
No findings.
Documentation/Tests
Review limitation: I could not execute |
The visualization module raises ImportError with a message (not exc.name) when matplotlib is missing. Also match against the error message string so optional-dependency guards are correctly suppressed in Pure Python CI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
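The distinction in this commit is that an ImportError raised by the import system carries the missing module in exc.name, while one raised by hand with only a message leaves exc.name as None. A check covering both cases might look like the sketch below. The function name, the two tuples, and the choice to message-match only on matplotlib are assumptions for illustration, not the harness's actual code.

```python
import re

# Third-party modules named in the earlier commit; matched via exc.name.
OPTIONAL_BY_NAME = ("pyfixest", "linearmodels", "differences")
# Modules whose guards raise ImportError with only a message (assumed list).
OPTIONAL_BY_MESSAGE = ("matplotlib",)


def is_optional_import_error(exc):
    """Return True if exc is an ImportError for a known optional dependency."""
    if not isinstance(exc, ImportError):
        return False
    # Case 1: the import system set exc.name (e.g. "pyfixest" or "pyfixest.sub").
    known = OPTIONAL_BY_NAME + OPTIONAL_BY_MESSAGE
    if exc.name and exc.name.split(".")[0] in known:
        return True
    # Case 2: a hand-raised ImportError("matplotlib is required ...") has
    # name=None, so fall back to a whole-word match on the message text.
    message = str(exc).lower()
    return any(
        re.search(rf"\b{re.escape(module)}\b", message)
        for module in OPTIONAL_BY_MESSAGE
    )


by_name = is_optional_import_error(
    ImportError("No module named 'pyfixest'", name="pyfixest")
)
by_message = is_optional_import_error(
    ImportError("matplotlib is required for plotting")
)
# A broken import inside the package under test must not be suppressed.
internal = is_optional_import_error(
    ImportError("cannot import name 'foo' from 'diff_diff'", name="diff_diff")
)
```

Keeping the message-matched list short limits the risk of suppressing an unrelated ImportError whose text merely mentions one of the module names.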
Summary
treated=/post= → treatment=/time=) across 12 documentation pages, including complete rewrites of diagnostics and utils examples to match actual function signatures
api/index, eliminating confusing duplicate sidebar navigation
Methodology references (required if estimator / math changes)
Validation
treated=/post= parameter name errors eliminated via grep
api/index.rst autosummary entries against __all__ in __init__.py
Security / privacy
Generated with Claude Code