diff --git a/README.md b/README.md index 82fa025a..5159a855 100644 --- a/README.md +++ b/README.md @@ -82,6 +82,7 @@ Measuring campaign lift? Evaluating a product launch? diff-diff handles the caus - **[Which method fits my problem?](docs/practitioner_decision_tree.rst)** - Start from your business scenario (campaign in some markets, staggered rollout, survey data) and find the right estimator - **[Getting started for practitioners](docs/practitioner_getting_started.rst)** - End-to-end walkthrough: marketing campaign -> causal estimate -> stakeholder-ready result - **[Brand awareness survey tutorial](docs/tutorials/17_brand_awareness_survey.ipynb)** - Full example with complex survey design, brand funnel analysis, and staggered rollouts +- **Have BRFSS/ACS/CPS individual records?** Use [`aggregate_survey()`](docs/api/prep.rst) to roll respondent-level microdata into a geographic-period panel with inverse-variance precision weights. The returned second-stage design uses analytic weights (`aweight`), so it works directly with `DifferenceInDifferences`, `TwoWayFixedEffects`, `MultiPeriodDiD`, `SunAbraham`, `ContinuousDiD`, and `EfficientDiD` (estimators marked **Full** in the [survey support matrix](docs/choosing_estimator.rst)) Already know DiD? The [academic quickstart](docs/quickstart.rst) and [estimator guide](docs/choosing_estimator.rst) cover the full technical details. @@ -106,6 +107,7 @@ Already know DiD? 
The [academic quickstart](docs/quickstart.rst) and [estimator - **Pre-trends power analysis**: Roth (2022) minimum detectable violation (MDV) and power curves for pre-trends tests - **Power analysis**: MDE, sample size, and power calculations for study design; simulation-based power for any estimator - **Data prep utilities**: Helper functions for common data preparation tasks +- **Survey microdata aggregation**: `aggregate_survey()` rolls individual-level survey data (BRFSS, ACS, CPS, NHANES) into geographic-period panels with design-based precision weights for second-stage DiD - **Validated against R**: Benchmarked against `did`, `synthdid`, and `fixest` packages (see [benchmarks](docs/benchmarks.rst)) ## Estimator Aliases diff --git a/ROADMAP.md b/ROADMAP.md index d4f5e5a6..4a0b7ab4 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -102,7 +102,7 @@ Parallel track targeting data science practitioners — marketing, product, oper | **B3a.** `BusinessReport` class — plain-English summaries, markdown export; rich export via optional `[reporting]` extra | HIGH | Not started | | **B3b.** `DiagnosticReport` — unified diagnostic runner with plain-English interpretation. Includes making `practitioner_next_steps()` context-aware (substitute actual column names from fitted results into code snippets instead of generic placeholders). 
| HIGH | Not started | | **B3c.** Practitioner data generator wrappers (thin wrappers around existing generators with business-friendly names) | MEDIUM | Not started | -| **B3d.** `survey_aggregate()` helper (see [Survey Aggregation Helper](#future-survey-aggregation-helper)) | MEDIUM | Not started | +| **B3d.** `aggregate_survey()` helper (microdata-to-panel bridge for BRFSS/ACS/CPS) | MEDIUM | Shipped (v3.0.1) | ### Phase B4: Platform (Longer-term) @@ -116,14 +116,6 @@ Parallel track targeting data science practitioners — marketing, product, oper --- -## Future: Survey Aggregation Helper - -**`survey_aggregate()` helper function** for the microdata-to-panel workflow. Bridges individual-level survey data (BRFSS, ACS, CPS) collected as repeated cross-sections to geographic-level (state, city) panel DiD. Computes design-based cell means and precision weights that estimators can consume directly. - -Also cross-referenced as **B3d** — directly enables the practitioner survey tutorial workflow beyond the original academic framing. - ---- - ## Future Estimators ### de Chaisemartin-D'Haultfouille Estimator diff --git a/TODO.md b/TODO.md index f1c649a6..f3d45ebc 100644 --- a/TODO.md +++ b/TODO.md @@ -60,6 +60,7 @@ Deferred items from PR reviews that were not addressed before merge. | ImputationDiD dense `(A0'A0).toarray()` scales O((U+T+K)^2), OOM risk on large panels | `imputation.py` | #141 | Medium (deferred — only triggers when sparse solver fails) | | Multi-absorb weighted demeaning needs iterative alternating projections for N > 1 absorbed FE with survey weights; unweighted multi-absorb also uses single-pass (pre-existing, exact only for balanced panels) | `estimators.py` | #218 | Medium | | EfficientDiD `control_group="last_cohort"` trims at `last_g - anticipation` but REGISTRY says `t >= last_g`. With `anticipation=0` (default) these are identical. With `anticipation>0`, code is arguably more conservative (excludes anticipation-contaminated periods). 
Either align REGISTRY with code or change code to `t < last_g` — needs design decision. | `efficient_did.py` | #230 | Low | +| `aggregate_survey()` returns `SurveyDesign(weight_type="aweight")`, but most modern staggered estimators (`CallawaySantAnna`, `ImputationDiD`, `TwoStageDiD`, `StackedDiD`, `TripleDifference`, `StaggeredTripleDifference`, `SyntheticDiD`, `TROP`, `WooldridgeDiD`) reject `aweight`. The microdata-to-staggered-DiD workflow currently requires a manual second-stage `SurveyDesign` reconstruction for these estimators. Investigate whether `aggregate_survey()` should offer an opt-in `output_weight_type="pweight"` mode (statistically dubious but practically useful) or whether the manual workaround should be documented prominently. | `diff_diff/prep.py` | #288 | Medium | | TripleDifference power: `generate_ddd_data` is a fixed 2×2×2 cross-sectional DGP — no multi-period or unbalanced-group support. Add a `generate_ddd_panel_data` for panel DDD power analysis. | `prep_dgp.py`, `power.py` | #208 | Low | | Survey design resolution/collapse patterns are inconsistent across panel estimators — ContinuousDiD rebuilds unit-level design in SE code, EfficientDiD builds once in fit(), StackedDiD re-resolves on stacked data; extract shared helpers for panel-to-unit collapse, post-filter re-resolution, and metadata recomputation | `continuous_did.py`, `efficient_did.py`, `stacked_did.py` | #226 | Low | | Survey-weighted Silverman bandwidth in EfficientDiD conditional Omega* — `_silverman_bandwidth()` uses unweighted mean/std for bandwidth selection; survey-weighted statistics would better reflect the population distribution but is a second-order refinement | `efficient_did_covariates.py` | — | Low | @@ -88,9 +89,12 @@ Deferred items from PR reviews that were not addressed before merge. 
|-------|----------|----|----------| | R comparison tests spawn separate `Rscript` per test (slow CI) | `tests/test_methodology_twfe.py:294` | #139 | Low | | CS R helpers hard-code `xformla = ~ 1`; no covariate-adjusted R benchmark for IRLS path | `tests/test_methodology_callaway.py` | #202 | Low | -| ~376 `duplicate object description` Sphinx warnings — restructure `docs/api/*.rst` to avoid duplicate `:members:` + `autosummary` | `docs/api/*.rst` | — | Low | +| ~1583 `duplicate object description` Sphinx warnings — restructure `docs/api/*.rst` to avoid duplicate `:members:` + `autosummary` (count grew from ~376 as API surface expanded) | `docs/api/*.rst` | — | Low | | Doc-snippet smoke tests only cover `.rst` files; `.txt` AI guides outside CI validation | `tests/test_doc_snippets.py` | #239 | Low | | Add CI validation for `docs/doc-deps.yaml` integrity (stale paths, unmapped source files) | `docs/doc-deps.yaml` | #269 | Low | +| Sphinx autodoc fails to import 3 result members: `DiDResults.ci`, `MultiPeriodDiDResults.att`, `CallawaySantAnnaResults.aggregate` — investigate whether these are renamed/removed or just unresolvable from autosummary template | `docs/api/results.rst`, `docs/api/staggered.rst` | — | Medium | +| `EDiDBootstrapResults` cross-reference is ambiguous — class is exported from both `diff_diff` and `diff_diff.efficient_did_bootstrap`, producing 3 "more than one target found" warnings. Add `:noindex:` to one source or use full-path refs | `diff_diff/efficient_did_results.py`, `docs/api/efficient_did.rst` | — | Low | +| Tracked Sphinx autosummary stubs in `docs/api/_autosummary/*.rst` are stale — every sphinx build regenerates them with new attributes (e.g., `coef_var`, `survey_metadata`) that have been added to result classes. Either commit a refresh or move the directory to `.gitignore` and treat as build output. Also 6 untracked stubs exist for newer estimators (`WooldridgeDiD`, `SimulationMDEResults`, etc.) that have never been committed. 
| `docs/api/_autosummary/` | — | Low | --- diff --git a/docs/business-strategy.md b/docs/business-strategy.md index cd548f73..92a5d787 100644 --- a/docs/business-strategy.md +++ b/docs/business-strategy.md @@ -329,7 +329,7 @@ The project has an existing ROADMAP.md covering Phase 10 (survey academic credib **Directly subsumed items:** - **10g. "Practitioner guidance: when does survey design matter?"** -- this becomes part of the business tutorials and Getting Started guide. No longer a standalone item. -- **survey_aggregate() helper** -- the microdata-to-panel workflow helper is directly relevant for Persona A (survey data from BRFSS/ACS -> geographic panel). Should be prioritized alongside business tutorials. +- **`aggregate_survey()` helper** -- shipped in v3.0.1. The microdata-to-panel workflow helper is in place for Persona A (survey data from BRFSS/ACS -> geographic panel). Practitioner-facing tutorials should reference it. **Reprioritized by business use cases:** - **de Chaisemartin-D'Haultfouille (reversible treatments)** -- marketing interventions frequently switch on/off (seasonal campaigns, promotions). This estimator becomes higher priority for business DS than for academics. Should move up in the roadmap. @@ -373,7 +373,7 @@ Tutorials in priority order (ship incrementally, not all at once): 12. `BusinessReport` class (3a) -- core uses only numpy/pandas/scipy; rich export via optional `[reporting]` extra 13. `DiagnosticReport` descriptive assessment (3b) 14. Business data generator wrappers (3c) -15. `survey_aggregate()` helper from existing roadmap -- directly enables the survey tutorial workflow +15. 
~~`survey_aggregate()` helper from existing roadmap~~ -- shipped in v3.0.1 as `aggregate_survey()`; directly enables the survey tutorial workflow ### Phase 4: Platform (Longer-term) *Goal: Integrate into business DS workflows* diff --git a/docs/choosing_estimator.rst b/docs/choosing_estimator.rst index 8af57097..932dc4a3 100644 --- a/docs/choosing_estimator.rst +++ b/docs/choosing_estimator.rst @@ -603,6 +603,22 @@ All estimators accept an optional ``survey_design`` parameter in ``fit()``. Pass a :class:`~diff_diff.SurveyDesign` object to get design-based variance estimation. The depth of support varies by estimator: +.. note:: + + If your data starts as **individual-level survey microdata** (e.g., BRFSS, + ACS, CPS, NHANES respondent records), use :func:`~diff_diff.aggregate_survey` + as a preprocessing step. It pools microdata into geographic-period cells with + inverse-variance precision weights and returns a pre-configured + :class:`~diff_diff.SurveyDesign` with ``weight_type="aweight"``. This + second-stage design is directly compatible with estimators marked **Full** in + the matrix below: :class:`~diff_diff.DifferenceInDifferences`, + :class:`~diff_diff.TwoWayFixedEffects`, :class:`~diff_diff.MultiPeriodDiD`, + :class:`~diff_diff.SunAbraham`, :class:`~diff_diff.ContinuousDiD`, and + :class:`~diff_diff.EfficientDiD`. Estimators marked **pweight only** (CS, + ImputationDiD, TwoStageDiD, StackedDiD, TripleDifference, etc.) explicitly + reject ``aweight`` and require a manually constructed second-stage + ``SurveyDesign`` instead. See :doc:`api/prep` for the API reference. + .. 
list-table:: :header-rows: 1 :widths: 25 12 18 18 18 diff --git a/docs/doc-deps.yaml b/docs/doc-deps.yaml index 473ee1e5..b4e045d0 100644 --- a/docs/doc-deps.yaml +++ b/docs/doc-deps.yaml @@ -550,6 +550,14 @@ sources: docs: - path: docs/api/prep.rst type: api_reference + - path: docs/practitioner_getting_started.rst + type: user_guide + - path: docs/practitioner_decision_tree.rst + type: user_guide + - path: docs/choosing_estimator.rst + type: user_guide + - path: docs/survey-roadmap.md + type: roadmap diff_diff/prep_dgp.py: drift_risk: low diff --git a/docs/index.rst b/docs/index.rst index de65253e..bddf9a38 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -48,6 +48,7 @@ Quick Links - :doc:`practitioner_decision_tree` - Which method fits your business problem? - :doc:`quickstart` - Installation and your first DiD analysis - :doc:`choosing_estimator` - Which estimator should I use? +- :func:`~diff_diff.aggregate_survey` - Have BRFSS/ACS/CPS microdata? Bridge it to a geographic panel for DiD - :doc:`tutorials/01_basic_did` - Hands-on basic tutorial - :doc:`troubleshooting` - Common issues and solutions - :doc:`r_comparison` - Coming from R? @@ -99,6 +100,8 @@ Quick Links tutorials/13_stacked_did tutorials/14_continuous_did tutorials/15_efficient_did + tutorials/16_survey_did + tutorials/16_wooldridge_etwfe .. toctree:: :maxdepth: 1 diff --git a/docs/practitioner_decision_tree.rst b/docs/practitioner_decision_tree.rst index 2c5ac34c..5ff0d2f3 100644 --- a/docs/practitioner_decision_tree.rst +++ b/docs/practitioner_decision_tree.rst @@ -233,6 +233,15 @@ Ignoring survey weights and clustering makes your confidence intervals too narro you will be overconfident about the result. Passing a ``SurveyDesign`` to ``fit()`` corrects for this automatically. 
+**If your data is individual-level microdata** (e.g., BRFSS, ACS, CPS, or NHANES +respondent records), use :func:`~diff_diff.aggregate_survey` first to roll it up +to a geographic-period panel with inverse-variance precision weights. The +returned second-stage design uses ``weight_type="aweight"``, so it works with +estimators marked **Full** in the :ref:`survey-design-support` matrix (DiD, +TWFE, MultiPeriodDiD, SunAbraham, ContinuousDiD, EfficientDiD) but not with +``pweight``-only estimators like ``CallawaySantAnna`` or ``ImputationDiD``. +See :doc:`practitioner_getting_started` for an end-to-end example. + .. code-block:: python from diff_diff import DifferenceInDifferences, SurveyDesign diff --git a/docs/practitioner_getting_started.rst b/docs/practitioner_getting_started.rst index 0dd7d5e0..7027b93c 100644 --- a/docs/practitioner_getting_started.rst +++ b/docs/practitioner_getting_started.rst @@ -293,6 +293,42 @@ Ignoring these makes your confidence intervals too narrow. diff-diff handles this via :class:`~diff_diff.SurveyDesign` - pass it to any estimator's ``fit()`` method. +If your data is **individual-level microdata** - one row per respondent, with +sampling weights and strata/PSU columns (BRFSS, ACS, CPS, NHANES) - use +:func:`~diff_diff.aggregate_survey` first to roll it up to a geographic-period +panel. The helper computes design-based cell means with inverse-variance +precision weights and returns a pre-configured ``SurveyDesign`` (with +``weight_type="aweight"``) for the second-stage fit. This second-stage design +works directly with estimators marked **Full** in the +:ref:`survey-design-support` matrix - notably +:class:`~diff_diff.DifferenceInDifferences`, :class:`~diff_diff.SunAbraham`, +:class:`~diff_diff.MultiPeriodDiD`, and :class:`~diff_diff.EfficientDiD`. +``pweight``-only estimators (``CallawaySantAnna``, ``ImputationDiD``, etc.) +require a manually constructed ``SurveyDesign`` instead. + +.. 
code-block:: python + + from diff_diff import aggregate_survey, SurveyDesign, SunAbraham + + # 1. Describe the microdata's sampling design + design = SurveyDesign(weights="finalwt", strata="strat", psu="psu") + + # 2. Roll up respondent records into a state-year panel + panel, stage2 = aggregate_survey( + microdata, by=["state", "year"], + outcomes="brand_awareness", survey_design=design, + ) + + # 3. Add the campaign launch year per state, then fit a modern staggered + # estimator with the pre-configured second-stage SurveyDesign: + # panel["first_treat"] = panel["state"].map(campaign_launch_year) # NaN = control + # results = SunAbraham().fit( + # panel, outcome="brand_awareness_mean", + # unit="state", time="year", first_treat="first_treat", + # survey_design=stage2, + # ) + # results.print_summary() + For a complete walkthrough with brand funnel metrics and survey design corrections, see `Tutorial 17: Brand Awareness Survey `_. diff --git a/docs/survey-roadmap.md b/docs/survey-roadmap.md index 537a84b6..0018e40c 100644 --- a/docs/survey-roadmap.md +++ b/docs/survey-roadmap.md @@ -112,6 +112,18 @@ Files: `benchmarks/R/benchmark_realdata_*.R`, `tests/test_survey_real_data.py`, - **10d.** Tutorial rewrite — flat-weight vs design-based comparison with known ground truth - **10f.** WooldridgeDiD survey support — OLS, logit, Poisson paths with `pweight` + strata/PSU/FPC + TSL variance +### v3.0.1: Survey Aggregation Helper + +`aggregate_survey()` (in `diff_diff.prep`) bridges individual-level survey +microdata (BRFSS, ACS, CPS, NHANES) to geographic-period panels for +second-stage DiD estimation. Computes design-based cell means and precision +weights using domain estimation (Lumley 2004 §3.4), with SRS fallback for +small cells. Returns a panel DataFrame plus a pre-configured `SurveyDesign` +for the second-stage fit. Supports both TSL and replicate-weight variance. 
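The cell-mean and precision-weight computation described above can be sketched from first principles. This is an illustrative stand-in, not the shipped implementation: it uses a plain with-replacement SRS-style linearization variance for every cell (the real helper uses Lumley-style domain estimation and falls back to SRS only for small cells), and the column names (`state`, `year`, `y`, `wt`) are hypothetical.

```python
import numpy as np
import pandas as pd


def srs_cell_aggregate(micro: pd.DataFrame, by: list, outcome: str, weight: str) -> pd.DataFrame:
    """Toy roll-up of respondent rows into geographic-period cells.

    Computes the weighted (Hajek) cell mean, an SRS-style variance of that
    mean, and a precision weight equal to the inverse of that variance.
    Illustrative only -- not the domain-estimation path of the shipped helper.
    """
    rows = []
    for keys, g in micro.groupby(by):
        w = g[weight].to_numpy(dtype=float)
        y = g[outcome].to_numpy(dtype=float)
        mean = np.sum(w * y) / np.sum(w)
        # With-replacement SRS linearization variance of the weighted mean
        var = np.sum((w * (y - mean)) ** 2) / np.sum(w) ** 2
        rows.append((*keys, mean, 1.0 / var if var > 0 else np.nan, len(g)))
    cols = by + [f"{outcome}_mean", "precision_weight", "n_respondents"]
    return pd.DataFrame(rows, columns=cols)
```

In the second stage these precision weights enter as analytic weights, which is why the pre-configured `SurveyDesign` returned by `aggregate_survey()` carries `weight_type="aweight"`.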
+ +See `docs/api/prep.rst` for the API reference and `docs/methodology/REGISTRY.md` +for the methodology entry. + --- ## Phase 10: Remaining Items
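As background for the `aweight` second-stage choice discussed throughout this changeset: combining independent cell means with weights proportional to 1/variance yields the minimum-variance pooled estimate, with pooled variance 1/&Sigma;(1/&sigma;&sup2;). A minimal generic numpy sketch (nothing here is diff-diff API):

```python
import numpy as np


def pooled(means, variances):
    """Inverse-variance (precision-weighted) combination of estimates.

    Returns the pooled estimate and its variance, 1 / sum(1/var_i).
    """
    w = 1.0 / np.asarray(variances, dtype=float)
    return np.sum(w * np.asarray(means, dtype=float)) / np.sum(w), 1.0 / np.sum(w)


# Three noisy cell means of the same quantity, with different precisions.
cell_vars = np.array([0.01, 0.04, 0.25])

# Monte Carlo check: precision weighting is tighter than a flat average.
# Theoretical variances: 1/129 (~0.0078) vs 0.30/9 (~0.033).
rng = np.random.default_rng(0)
draws = rng.normal(0.4, np.sqrt(cell_vars), size=(20_000, 3))
iv_est = np.array([pooled(d, cell_vars)[0] for d in draws])
flat_est = draws.mean(axis=1)
```

This is the statistical rationale for feeding the first-stage precision weights to the second-stage estimator as analytic weights rather than sampling weights: they encode the relative reliability of each cell mean, not inclusion probabilities.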