Context
The CPS-passthrough / income-split fixes (#226, #227, #228) closed the dominant MP-vs-eCPS gap: MP was re-imputing CPS-measured income (pension, SS, interest, …) from PUF donors instead of carrying the CPS-reported total, which under-covered pension/SS (~half) and over-imputed interest. The fixes anchor PUF tax-detail leaves to CPS totals and derive the taxable/exempt split, and they are correct and tested.
They were landed under time pressure directly in src/microplex_us/pipelines/us.py (the large pipeline file). Per AGENTS.md ("Prefer spec-driven behavior over ad hoc logic in large pipeline files"; "push shared abstractions into core"), this warrants a consolidation pass to reach production-clean.
Do this after the approach is validated (a passthrough+interest candidate beats a cleanly-scored production eCPS) — it is polish, not a blocker, and shouldn't churn mid-iteration.
Clean-up items
- Extract out of us.py into a dedicated, unit-testable module/spec. The passthrough/split lives as methods on the giant pipeline class (_preserve_cps_measured_puf_clone_totals, plus the interest-split logic in _augment_policyengine_person_inputs). Move it to a focused module (or a spec the donor framework consumes), so it's testable in isolation and not more ad-hoc logic in the pipeline file.
- Unify the repetitive pandas. _preserve_cps_measured_puf_clone_totals has three near-duplicate to_numeric/fillna/clip blocks: direct passthrough (PUF_SUPPORT_CLONE_CPS_DIRECT_PASSTHROUGH_ALIASES), 2-component split (PUF_SUPPORT_CLONE_CPS_SPLIT_TOTALS), and a special-cased dividend block outside the spec loop. Collapse them into one declarative "CPS total → component split" splitter that covers dividends too (qualified/non-qualified) rather than a bespoke branch.
- Source / parameterize the magic constant. PUF_SUPPORT_CLONE_CPS_TAXABLE_INTEREST_FALLBACK_SHARE = 0.680 is a bare unsourced literal (taxable share of interest used as the fallback when a per-record split is unavailable). Add a citation/derivation and make it a parameter (consistent with how the pension TAXABLE_PENSION_FRACTION is named), not an inline number.
- Minor: the pension_income → taxable_private_pension_income / tax_exempt_private_pension_income aliasing is special-cased onto the split loop; fold it into the unified spec.
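A rough sketch of what the unified splitter could look like, to make the target shape concrete. Everything named here is hypothetical (SplitSpec, split_cps_total, the interest column names); only the 0.680 value comes from the current constant. It assumes the intended rule is "scale the PUF-imputed per-record split to the CPS total, and use the fallback shares when a record has no usable split", which should be checked against what us.py actually does before anything is ported.

```python
from __future__ import annotations

import math
from dataclasses import dataclass

import pandas as pd

# Value copied from the existing constant; it still needs a citation/derivation.
TAXABLE_INTEREST_FALLBACK_SHARE = 0.680


@dataclass(frozen=True)
class SplitSpec:
    """Hypothetical spec: one CPS total mapped onto one or more output leaves."""

    cps_total: str                      # CPS-reported total column
    components: tuple[str, ...]         # output leaf columns
    fallback_shares: tuple[float, ...]  # used when a record has no usable split

    def __post_init__(self) -> None:
        if len(self.components) != len(self.fallback_shares):
            raise ValueError("one fallback share per component is required")
        if not math.isclose(sum(self.fallback_shares), 1.0):
            raise ValueError("fallback shares must sum to 1")


def _clean(series: pd.Series) -> pd.Series:
    """The to_numeric/fillna/clip step, written once instead of three times."""
    return pd.to_numeric(series, errors="coerce").fillna(0.0).clip(lower=0.0)


def split_cps_total(df: pd.DataFrame, spec: SplitSpec) -> pd.DataFrame:
    """Return a copy of df whose component columns sum to the cleaned CPS total."""
    total = _clean(df[spec.cps_total])
    imputed = pd.DataFrame(
        {
            name: _clean(df[name]) if name in df.columns else 0.0
            for name in spec.components
        },
        index=df.index,
    )
    imputed_sum = imputed.sum(axis=1)
    has_split = imputed_sum > 0
    # Per-record shares where the imputed leaves are usable, fallback otherwise.
    shares = imputed.div(imputed_sum.where(has_split, 1.0), axis=0)
    out = df.copy()
    for name, fallback in zip(spec.components, spec.fallback_shares):
        out[name] = total * shares[name].where(has_split, fallback)
    return out


# Hypothetical column names, for illustration only.
INTEREST_SPEC = SplitSpec(
    cps_total="interest_income",
    components=("taxable_interest_income", "tax_exempt_interest_income"),
    fallback_shares=(TAXABLE_INTEREST_FALLBACK_SHARE, 1.0 - TAXABLE_INTEREST_FALLBACK_SHARE),
)

records = pd.DataFrame(
    {
        "interest_income": [100.0, 200.0, None],
        "taxable_interest_income": [30.0, 0.0, 5.0],
        "tax_exempt_interest_income": [10.0, 0.0, 5.0],
    }
)
result = split_cps_total(records, INTEREST_SPEC)
```

Under this shape, direct passthrough is just a one-component spec with a fallback share of 1.0, and the dividend and pension cases become extra SplitSpec entries rather than separate branches.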
Done when
- The CPS-passthrough/split logic is a declarative spec + thin module outside us.py, with unit tests.
- One splitter path handles direct passthrough, 2-component, and dividend splits.
- The interest fallback share is sourced and parameterized.
- Behavior is unchanged vs the validated candidate (regression-tested against the same export/identity gates that #228 added).
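For the last criterion, a minimal sketch of the kind of check that could back the refactor. The actual gates from #228 are not reproduced here; identity_violations and assert_matches_baseline are hypothetical helpers, and the column names are placeholders. The assumption is that the identity of interest is "component leaves sum to the CPS total per record", and that "unchanged" means the refactored output matches a frame saved from the validated candidate.

```python
import numpy as np
import pandas as pd


def identity_violations(df, total, components, atol=1e-6):
    """Return the index labels where the components do not sum to the total."""
    total_values = pd.to_numeric(df[total], errors="coerce").fillna(0.0)
    component_sum = (
        df[list(components)]
        .apply(pd.to_numeric, errors="coerce")
        .fillna(0.0)
        .sum(axis=1)
    )
    ok = np.isclose(component_sum, total_values, rtol=0.0, atol=atol)
    return df.index[~ok]


def assert_matches_baseline(candidate, baseline, columns):
    """Raise AssertionError if the refactor changed any of the given columns."""
    cols = list(columns)
    pd.testing.assert_frame_equal(candidate[cols], baseline[cols])


# Placeholder data: row 1 breaks the identity (40 + 50 != 100).
frame = pd.DataFrame(
    {
        "pension_income": [100.0, 100.0, 0.0],
        "taxable_private_pension_income": [80.0, 40.0, 0.0],
        "tax_exempt_private_pension_income": [20.0, 50.0, 0.0],
    }
)
bad_rows = identity_violations(
    frame,
    "pension_income",
    ["taxable_private_pension_income", "tax_exempt_private_pension_income"],
)
```

Running both checks on the pre-refactor and post-refactor outputs would show whether the move out of us.py changed any values.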
Refs: #226, #227, #228.