Merged
2 changes: 2 additions & 0 deletions README.md
@@ -82,6 +82,7 @@ Measuring campaign lift? Evaluating a product launch? diff-diff handles the caus
- **[Which method fits my problem?](docs/practitioner_decision_tree.rst)** - Start from your business scenario (campaign in some markets, staggered rollout, survey data) and find the right estimator
- **[Getting started for practitioners](docs/practitioner_getting_started.rst)** - End-to-end walkthrough: marketing campaign -> causal estimate -> stakeholder-ready result
- **[Brand awareness survey tutorial](docs/tutorials/17_brand_awareness_survey.ipynb)** - Full example with complex survey design, brand funnel analysis, and staggered rollouts
- **Have BRFSS/ACS/CPS individual records?** Use [`aggregate_survey()`](docs/api/prep.rst) to roll respondent-level microdata into a geographic-period panel with inverse-variance precision weights. The returned second-stage design uses analytic weights (`aweight`), so it works directly with `DifferenceInDifferences`, `TwoWayFixedEffects`, `MultiPeriodDiD`, `SunAbraham`, `ContinuousDiD`, and `EfficientDiD` (estimators marked **Full** in the [survey support matrix](docs/choosing_estimator.rst))

Already know DiD? The [academic quickstart](docs/quickstart.rst) and [estimator guide](docs/choosing_estimator.rst) cover the full technical details.

@@ -106,6 +107,7 @@ Already know DiD? The [academic quickstart](docs/quickstart.rst) and [estimator
- **Pre-trends power analysis**: Roth (2022) minimum detectable violation (MDV) and power curves for pre-trends tests
- **Power analysis**: MDE, sample size, and power calculations for study design; simulation-based power for any estimator
- **Data prep utilities**: Helper functions for common data preparation tasks
- **Survey microdata aggregation**: `aggregate_survey()` rolls individual-level survey data (BRFSS, ACS, CPS, NHANES) into geographic-period panels with design-based precision weights for second-stage DiD
- **Validated against R**: Benchmarked against `did`, `synthdid`, and `fixest` packages (see [benchmarks](docs/benchmarks.rst))

## Estimator Aliases
10 changes: 1 addition & 9 deletions ROADMAP.md
@@ -102,7 +102,7 @@ Parallel track targeting data science practitioners — marketing, product, oper
| **B3a.** `BusinessReport` class — plain-English summaries, markdown export; rich export via optional `[reporting]` extra | HIGH | Not started |
| **B3b.** `DiagnosticReport` — unified diagnostic runner with plain-English interpretation. Includes making `practitioner_next_steps()` context-aware (substitute actual column names from fitted results into code snippets instead of generic placeholders). | HIGH | Not started |
| **B3c.** Practitioner data generator wrappers (thin wrappers around existing generators with business-friendly names) | MEDIUM | Not started |
| **B3d.** `survey_aggregate()` helper (see [Survey Aggregation Helper](#future-survey-aggregation-helper)) | MEDIUM | Not started |
| **B3d.** `aggregate_survey()` helper (microdata-to-panel bridge for BRFSS/ACS/CPS) | MEDIUM | Shipped (v3.0.1) |

### Phase B4: Platform (Longer-term)

@@ -116,14 +116,6 @@ Parallel track targeting data science practitioners — marketing, product, oper

---

## Future: Survey Aggregation Helper

**`survey_aggregate()` helper function** for the microdata-to-panel workflow. Bridges individual-level survey data (BRFSS, ACS, CPS) collected as repeated cross-sections to geographic-level (state, city) panel DiD. Computes design-based cell means and precision weights that estimators can consume directly.

Also cross-referenced as **B3d** — directly enables the practitioner survey tutorial workflow beyond the original academic framing.

---

## Future Estimators

### de Chaisemartin-D'Haultfouille Estimator
6 changes: 5 additions & 1 deletion TODO.md
@@ -60,6 +60,7 @@ Deferred items from PR reviews that were not addressed before merge.
| ImputationDiD dense `(A0'A0).toarray()` scales O((U+T+K)^2), OOM risk on large panels | `imputation.py` | #141 | Medium (deferred — only triggers when sparse solver fails) |
| Multi-absorb weighted demeaning needs iterative alternating projections for N > 1 absorbed FE with survey weights; unweighted multi-absorb also uses single-pass (pre-existing, exact only for balanced panels) | `estimators.py` | #218 | Medium |
| EfficientDiD `control_group="last_cohort"` trims at `last_g - anticipation` but REGISTRY says `t >= last_g`. With `anticipation=0` (default) these are identical. With `anticipation>0`, code is arguably more conservative (excludes anticipation-contaminated periods). Either align REGISTRY with code or change code to `t < last_g` — needs design decision. | `efficient_did.py` | #230 | Low |
| `aggregate_survey()` returns `SurveyDesign(weight_type="aweight")`, but most modern staggered estimators (`CallawaySantAnna`, `ImputationDiD`, `TwoStageDiD`, `StackedDiD`, `TripleDifference`, `StaggeredTripleDifference`, `SyntheticDiD`, `TROP`, `WooldridgeDiD`) reject `aweight`. The microdata-to-staggered-DiD workflow currently requires a manual second-stage `SurveyDesign` reconstruction for these estimators. Investigate whether `aggregate_survey()` should offer an opt-in `output_weight_type="pweight"` mode (statistically dubious but practically useful) or whether the manual workaround should be documented prominently. | `diff_diff/prep.py` | #288 | Medium |
| TripleDifference power: `generate_ddd_data` is a fixed 2×2×2 cross-sectional DGP — no multi-period or unbalanced-group support. Add a `generate_ddd_panel_data` for panel DDD power analysis. | `prep_dgp.py`, `power.py` | #208 | Low |
| Survey design resolution/collapse patterns are inconsistent across panel estimators — ContinuousDiD rebuilds unit-level design in SE code, EfficientDiD builds once in fit(), StackedDiD re-resolves on stacked data; extract shared helpers for panel-to-unit collapse, post-filter re-resolution, and metadata recomputation | `continuous_did.py`, `efficient_did.py`, `stacked_did.py` | #226 | Low |
| Survey-weighted Silverman bandwidth in EfficientDiD conditional Omega* — `_silverman_bandwidth()` uses unweighted mean/std for bandwidth selection; survey-weighted statistics would better reflect the population distribution but is a second-order refinement | `efficient_did_covariates.py` | — | Low |
@@ -88,9 +89,12 @@ Deferred items from PR reviews that were not addressed before merge.
|-------|----------|----|----------|
| R comparison tests spawn separate `Rscript` per test (slow CI) | `tests/test_methodology_twfe.py:294` | #139 | Low |
| CS R helpers hard-code `xformla = ~ 1`; no covariate-adjusted R benchmark for IRLS path | `tests/test_methodology_callaway.py` | #202 | Low |
| ~376 `duplicate object description` Sphinx warnings — restructure `docs/api/*.rst` to avoid duplicate `:members:` + `autosummary` | `docs/api/*.rst` | — | Low |
| ~1583 `duplicate object description` Sphinx warnings — restructure `docs/api/*.rst` to avoid duplicate `:members:` + `autosummary` (count grew from ~376 as API surface expanded) | `docs/api/*.rst` | — | Low |
| Doc-snippet smoke tests only cover `.rst` files; `.txt` AI guides outside CI validation | `tests/test_doc_snippets.py` | #239 | Low |
| Add CI validation for `docs/doc-deps.yaml` integrity (stale paths, unmapped source files) | `docs/doc-deps.yaml` | #269 | Low |
| Sphinx autodoc fails to import 3 result members: `DiDResults.ci`, `MultiPeriodDiDResults.att`, `CallawaySantAnnaResults.aggregate` — investigate whether these are renamed/removed or just unresolvable from autosummary template | `docs/api/results.rst`, `docs/api/staggered.rst` | — | Medium |
| `EDiDBootstrapResults` cross-reference is ambiguous — class is exported from both `diff_diff` and `diff_diff.efficient_did_bootstrap`, producing 3 "more than one target found" warnings. Add `:noindex:` to one source or use full-path refs | `diff_diff/efficient_did_results.py`, `docs/api/efficient_did.rst` | — | Low |
| Tracked Sphinx autosummary stubs in `docs/api/_autosummary/*.rst` are stale — every sphinx build regenerates them with new attributes (e.g., `coef_var`, `survey_metadata`) that have been added to result classes. Either commit a refresh or move the directory to `.gitignore` and treat as build output. Also 6 untracked stubs exist for newer estimators (`WooldridgeDiD`, `SimulationMDEResults`, etc.) that have never been committed. | `docs/api/_autosummary/` | — | Low |

---

4 changes: 2 additions & 2 deletions docs/business-strategy.md
@@ -329,7 +329,7 @@ The project has an existing ROADMAP.md covering Phase 10 (survey academic credib

**Directly subsumed items:**
- **10g. "Practitioner guidance: when does survey design matter?"** -- this becomes part of the business tutorials and Getting Started guide. No longer a standalone item.
- **survey_aggregate() helper** -- the microdata-to-panel workflow helper is directly relevant for Persona A (survey data from BRFSS/ACS -> geographic panel). Should be prioritized alongside business tutorials.
- **`aggregate_survey()` helper** -- shipped in v3.0.1. The microdata-to-panel workflow helper is in place for Persona A (survey data from BRFSS/ACS -> geographic panel). Practitioner-facing tutorials should reference it.

**Reprioritized by business use cases:**
- **de Chaisemartin-D'Haultfouille (reversible treatments)** -- marketing interventions frequently switch on/off (seasonal campaigns, promotions). This estimator becomes higher priority for business DS than for academics. Should move up in the roadmap.
@@ -373,7 +373,7 @@ Tutorials in priority order (ship incrementally, not all at once):
12. `BusinessReport` class (3a) -- core uses only numpy/pandas/scipy; rich export via optional `[reporting]` extra
13. `DiagnosticReport` descriptive assessment (3b)
14. Business data generator wrappers (3c)
15. `survey_aggregate()` helper from existing roadmap -- directly enables the survey tutorial workflow
15. ~~`survey_aggregate()` helper from existing roadmap~~ -- shipped in v3.0.1 as `aggregate_survey()`; directly enables the survey tutorial workflow

### Phase 4: Platform (Longer-term)
*Goal: Integrate into business DS workflows*
16 changes: 16 additions & 0 deletions docs/choosing_estimator.rst
@@ -603,6 +603,22 @@ All estimators accept an optional ``survey_design`` parameter in ``fit()``.
Pass a :class:`~diff_diff.SurveyDesign` object to get design-based variance
estimation. The depth of support varies by estimator:

.. note::

If your data starts as **individual-level survey microdata** (e.g., BRFSS,
ACS, CPS, NHANES respondent records), use :func:`~diff_diff.aggregate_survey`
as a preprocessing step. It pools microdata into geographic-period cells with
inverse-variance precision weights and returns a pre-configured
:class:`~diff_diff.SurveyDesign` with ``weight_type="aweight"``. This
second-stage design is directly compatible with estimators marked **Full** in
the matrix below: :class:`~diff_diff.DifferenceInDifferences`,
:class:`~diff_diff.TwoWayFixedEffects`, :class:`~diff_diff.MultiPeriodDiD`,
:class:`~diff_diff.SunAbraham`, :class:`~diff_diff.ContinuousDiD`, and
:class:`~diff_diff.EfficientDiD`. Estimators marked **pweight only** (CS,
ImputationDiD, TwoStageDiD, StackedDiD, TripleDifference, etc.) explicitly
reject ``aweight`` and require a manually constructed second-stage
``SurveyDesign`` instead. See :doc:`api/prep` for the API reference.
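
The manual workaround for **pweight only** estimators can be sketched as
follows. This is a non-runnable sketch, not the documented API: the
precision-weight column name is a placeholder, and re-wrapping aggregated
precision weights as ``pweight`` is exactly the "statistically dubious but
practically useful" pattern flagged in the project TODO:

.. code-block:: python

   from diff_diff import CallawaySantAnna, SurveyDesign, aggregate_survey

   panel, stage2 = aggregate_survey(
       microdata, by=["state", "year"],
       outcomes="outcome", survey_design=design,
   )
   # stage2 carries weight_type="aweight", which CallawaySantAnna rejects.
   # Hypothetical workaround: wrap the aggregated precision weights in a
   # fresh pweight-typed design ("<precision weight column>" is a
   # placeholder -- inspect the panel returned by aggregate_survey):
   manual = SurveyDesign(weights="<precision weight column>",
                         weight_type="pweight")
   results = CallawaySantAnna().fit(panel, ..., survey_design=manual)
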

.. list-table::
:header-rows: 1
:widths: 25 12 18 18 18
8 changes: 8 additions & 0 deletions docs/doc-deps.yaml
@@ -550,6 +550,14 @@ sources:
docs:
- path: docs/api/prep.rst
type: api_reference
- path: docs/practitioner_getting_started.rst
type: user_guide
- path: docs/practitioner_decision_tree.rst
type: user_guide
- path: docs/choosing_estimator.rst
type: user_guide
- path: docs/survey-roadmap.md
type: roadmap

diff_diff/prep_dgp.py:
drift_risk: low
3 changes: 3 additions & 0 deletions docs/index.rst
@@ -48,6 +48,7 @@ Quick Links
- :doc:`practitioner_decision_tree` - Which method fits your business problem?
- :doc:`quickstart` - Installation and your first DiD analysis
- :doc:`choosing_estimator` - Which estimator should I use?
- :func:`~diff_diff.aggregate_survey` - Have BRFSS/ACS/CPS microdata? Bridge it to a geographic panel for DiD
- :doc:`tutorials/01_basic_did` - Hands-on basic tutorial
- :doc:`troubleshooting` - Common issues and solutions
- :doc:`r_comparison` - Coming from R?
@@ -99,6 +100,8 @@ Quick Links
tutorials/13_stacked_did
tutorials/14_continuous_did
tutorials/15_efficient_did
tutorials/16_survey_did
tutorials/16_wooldridge_etwfe

.. toctree::
:maxdepth: 1
9 changes: 9 additions & 0 deletions docs/practitioner_decision_tree.rst
@@ -233,6 +233,15 @@ Ignoring survey weights and clustering makes your confidence intervals too narro
you will be overconfident about the result. Passing a ``SurveyDesign`` to ``fit()``
corrects for this automatically.

**If your data is individual-level microdata** (e.g., BRFSS, ACS, CPS, or NHANES
respondent records), use :func:`~diff_diff.aggregate_survey` first to roll it up
to a geographic-period panel with inverse-variance precision weights. The
returned second-stage design uses ``weight_type="aweight"``, so it works with
estimators marked **Full** in the :ref:`survey-design-support` matrix (DiD,
TWFE, MultiPeriodDiD, SunAbraham, ContinuousDiD, EfficientDiD) but not with
``pweight``-only estimators like ``CallawaySantAnna`` or ``ImputationDiD``.
See :doc:`practitioner_getting_started` for an end-to-end example.

.. code-block:: python

from diff_diff import DifferenceInDifferences, SurveyDesign
36 changes: 36 additions & 0 deletions docs/practitioner_getting_started.rst
@@ -293,6 +293,42 @@ Ignoring these makes your confidence intervals too narrow.
diff-diff handles this via :class:`~diff_diff.SurveyDesign` - pass it to any estimator's
``fit()`` method.

If your data is **individual-level microdata** - one row per respondent, with
sampling weights and strata/PSU columns (BRFSS, ACS, CPS, NHANES) - use
:func:`~diff_diff.aggregate_survey` first to roll it up to a geographic-period
panel. The helper computes design-based cell means with inverse-variance
precision weights and returns a pre-configured ``SurveyDesign`` (with
``weight_type="aweight"``) for the second-stage fit. This second-stage design
works directly with estimators marked **Full** in the
:ref:`survey-design-support` matrix - notably
:class:`~diff_diff.DifferenceInDifferences`, :class:`~diff_diff.SunAbraham`,
:class:`~diff_diff.MultiPeriodDiD`, and :class:`~diff_diff.EfficientDiD`.
``pweight``-only estimators (``CallawaySantAnna``, ``ImputationDiD``, etc.)
require a manually constructed ``SurveyDesign`` instead.

.. code-block:: python

from diff_diff import aggregate_survey, SurveyDesign, SunAbraham

# 1. Describe the microdata's sampling design
design = SurveyDesign(weights="finalwt", strata="strat", psu="psu")

# 2. Roll up respondent records into a state-year panel
panel, stage2 = aggregate_survey(
microdata, by=["state", "year"],
outcomes="brand_awareness", survey_design=design,
)

# 3. Add the campaign launch year per state, then fit a modern staggered
# estimator with the pre-configured second-stage SurveyDesign:
# panel["first_treat"] = panel["state"].map(campaign_launch_year) # NaN = control
# results = SunAbraham().fit(
# panel, outcome="brand_awareness_mean",
# unit="state", time="year", first_treat="first_treat",
# survey_design=stage2,
# )
# results.print_summary()

For a complete walkthrough with brand funnel metrics and survey design corrections,
see `Tutorial 17: Brand Awareness Survey
<tutorials/17_brand_awareness_survey.ipynb>`_.
12 changes: 12 additions & 0 deletions docs/survey-roadmap.md
@@ -112,6 +112,18 @@ Files: `benchmarks/R/benchmark_realdata_*.R`, `tests/test_survey_real_data.py`,
- **10d.** Tutorial rewrite — flat-weight vs design-based comparison with known ground truth
- **10f.** WooldridgeDiD survey support — OLS, logit, Poisson paths with `pweight` + strata/PSU/FPC + TSL variance

### v3.0.1: Survey Aggregation Helper

`aggregate_survey()` (in `diff_diff.prep`) bridges individual-level survey
microdata (BRFSS, ACS, CPS, NHANES) to geographic-period panels for
second-stage DiD estimation. Computes design-based cell means and precision
weights using domain estimation (Lumley 2004 §3.4), with SRS fallback for
small cells. Returns a panel DataFrame plus a pre-configured `SurveyDesign`
for the second-stage fit. Supports both TSL and replicate-weight variance.

See `docs/api/prep.rst` for the API reference and `docs/methodology/REGISTRY.md`
for the methodology entry.
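
The aggregation step can be illustrated with a self-contained sketch in plain
pandas/numpy. This is not the library's implementation — it shows only the
SRS-fallback variance path (using a Kish effective sample size), not full
domain estimation, and all column names are illustrative:

```python
import numpy as np
import pandas as pd

def aggregate_cells(micro, by, y_col, w_col):
    """Illustrative microdata -> panel roll-up: weighted cell means plus
    inverse-variance precision weights via an SRS-style variance."""
    rows = []
    for keys, g in micro.groupby(by):
        w = g[w_col].to_numpy()
        y = g[y_col].to_numpy()
        mean = np.average(y, weights=w)                 # weighted cell mean
        var_y = np.average((y - mean) ** 2, weights=w)  # weighted variance of y
        n_eff = w.sum() ** 2 / (w ** 2).sum()           # Kish effective sample size
        rows.append((*keys, mean, var_y / n_eff))       # Var(mean) under SRS fallback
    panel = pd.DataFrame(rows, columns=[*by, "y_mean", "var_mean"])
    panel["precision_weight"] = 1.0 / panel["var_mean"]  # aweight-style weight
    return panel

# Toy respondent-level data: one row per respondent with a sampling weight.
rng = np.random.default_rng(0)
micro = pd.DataFrame({
    "state": np.repeat(["CA", "TX"], 50),
    "year": np.tile(np.repeat([2020, 2021], 25), 2),
    "y": rng.normal(10.0, 2.0, 100),
    "w": rng.uniform(0.5, 2.0, 100),
})
panel = aggregate_cells(micro, by=["state", "year"], y_col="y", w_col="w")
```

The resulting `panel` has one row per state-year cell; noisier cells (higher
`var_mean`) receive proportionally less weight in the second-stage fit.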

---

## Phase 10: Remaining Items
Expand Down