Consolidate CPS-passthrough/split out of us.py into a declarative spec (clean up #226/#227/#228) #229

@MaxGhenis

Context

The CPS-passthrough / income-split fixes (#226, #227, #228) closed the dominant MP-vs-eCPS gap: MP was re-imputing CPS-measured income (pension, SS, interest, …) from PUF donors instead of carrying the CPS-reported total, which under-covered pension/SS (~half) and over-imputed interest. The fixes anchor PUF tax-detail leaves to CPS totals and derive the taxable/exempt split, and they are correct and tested.

They landed under time pressure directly in src/microplex_us/pipelines/us.py (the large pipeline file). Per AGENTS.md ("Prefer spec-driven behavior over ad hoc logic in large pipeline files"; "push shared abstractions into core"), the code needs a consolidation pass before it is production-clean.

Do this after the approach is validated, i.e. once a passthrough+interest candidate beats a cleanly scored production eCPS. It is polish, not a blocker, and the code shouldn't churn mid-iteration.

Clean-up items

  1. Extract out of us.py into a dedicated, unit-testable module/spec. The passthrough/split currently lives as methods on the giant pipeline class (_preserve_cps_measured_puf_clone_totals, plus the interest-split logic in _augment_policyengine_person_inputs). Move it to a focused module (or a spec the donor framework consumes) so that it is testable in isolation and does not add more ad hoc logic to the pipeline file.

  2. Unify the repetitive pandas. _preserve_cps_measured_puf_clone_totals has three near-duplicate to_numeric/fillna/clip blocks: direct passthrough (PUF_SUPPORT_CLONE_CPS_DIRECT_PASSTHROUGH_ALIASES), 2-component split (PUF_SUPPORT_CLONE_CPS_SPLIT_TOTALS), and a special-cased dividend block outside the spec loop. Collapse them into one declarative "CPS total → component split" splitter that also covers dividends (qualified/non-qualified) instead of leaving them in a bespoke branch.

  3. Source / parameterize the magic constant. PUF_SUPPORT_CLONE_CPS_TAXABLE_INTEREST_FALLBACK_SHARE = 0.680 is a bare unsourced literal (taxable share of interest used as the fallback when a per-record split is unavailable). Add a citation/derivation and make it a parameter (consistent with how the pension TAXABLE_PENSION_FRACTION is named), not an inline number.

  4. Minor: the pension_income → taxable_private_pension_income / tax_exempt_private_pension_income aliasing is special-cased onto the split loop; fold it into the unified spec.
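One possible shape for the unified splitter, as a minimal sketch rather than the actual design: a direct passthrough is a one-component split with share 1.0, and a 2-component split uses the donor's per-record shares when they exist and a named fallback share otherwise. `SplitSpec`, `apply_split`, `TAXABLE_INTEREST_FALLBACK_SHARE`, and every column name below are hypothetical, and clipping at zero is an assumption about what the existing blocks do. The 0.680 value is the one quoted above and still needs a source.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass(frozen=True)
class SplitSpec:
    """Hypothetical spec: one CPS total distributed across PUF leaf columns."""

    total: str                          # CPS-measured total column
    components: tuple[str, ...]         # leaf columns that must sum to the total
    fallback_shares: tuple[float, ...]  # used when a record has no donor split


def _clean(series: pd.Series) -> pd.Series:
    # The shared to_numeric/fillna/clip step, written once.
    # Clipping at zero is an assumption about the current behavior.
    return pd.to_numeric(series, errors="coerce").fillna(0.0).clip(lower=0.0)


def apply_split(frame: pd.DataFrame, spec: SplitSpec) -> pd.DataFrame:
    """Rescale the component columns so they sum to the CPS total."""
    out = frame.copy()
    total = _clean(out[spec.total])
    donor = pd.DataFrame(
        {
            name: _clean(out[name])
            if name in out.columns
            else pd.Series(0.0, index=out.index)
            for name in spec.components
        }
    )
    donor_sum = donor.sum(axis=1)
    has_split = donor_sum > 0
    safe_sum = donor_sum.where(has_split, 1.0)  # avoid dividing by zero
    for name, fallback in zip(spec.components, spec.fallback_shares):
        share = (donor[name] / safe_sum).where(has_split, fallback)
        out[name] = total * share
    return out


# Value quoted in the issue; the source/derivation is still to be added.
TAXABLE_INTEREST_FALLBACK_SHARE = 0.680

# Column names are illustrative, not the pipeline's actual names.
INTEREST_SPEC = SplitSpec(
    total="interest_income",
    components=("taxable_interest_income", "tax_exempt_interest_income"),
    fallback_shares=(
        TAXABLE_INTEREST_FALLBACK_SHARE,
        1.0 - TAXABLE_INTEREST_FALLBACK_SHARE,
    ),
)
PASSTHROUGH_SPEC = SplitSpec(
    total="cps_social_security",
    components=("social_security",),
    fallback_shares=(1.0,),
)
```

Under this shape, the dividend and pension cases would be two more `SplitSpec` entries with their own sourced fallback shares, not separate branches.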

Done when

  • The CPS-passthrough/split logic is a declarative spec + thin module outside us.py, with unit tests.
  • One splitter path handles direct passthrough, 2-component, and dividend splits.
  • The interest fallback share is sourced and parameterized.
  • Behavior is unchanged vs the validated candidate (regression-tested against the same export/identity gates that #228 added).
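The "behavior unchanged" criterion could be checked with two small helpers, sketched here under assumptions: `components_match_total` and `assert_unchanged` are hypothetical names, the tolerances are placeholders, and these are not the export/identity gates from #228. The first checks the split invariant (components sum to the CPS total on every record). The second compares the refactored output against a baseline from the validated candidate.

```python
import pandas as pd


def components_match_total(frame, total, components, atol=1e-6):
    """Return True when the component columns sum to the total on every record."""
    gap = (frame[list(components)].sum(axis=1) - frame[total]).abs()
    return bool((gap <= atol).all())


def assert_unchanged(baseline, refactored, columns, atol=1e-9):
    """Raise AssertionError if the refactored output drifts from the baseline."""
    pd.testing.assert_frame_equal(
        baseline[list(columns)],
        refactored[list(columns)],
        check_exact=False,
        rtol=0.0,
        atol=atol,
    )
```

Running both against the pre-refactor and post-refactor outputs on the same input would give a quick signal before the heavier gates run.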

Refs: #226, #227, #228.
