Centralize generalizable pipeline logic from microplex-us into microplex core #231

## Context

Per `AGENTS.md` ("keep the US pack thin; push shared abstractions upstream into core; if a seam is useful for both UK and US, move it to `microplex`"), several heavy US-local modules either reimplement core surfaces or could move into them. Core already owns sources, targets, fusion, calibration (`Calibrator` + microcalibrate adapter), reweighting (`reweighting.py`, `targets/reweighting.py`, `targets/bundles.py`), and an eval harness (`eval/harness.py`, `eval/reweighting_benchmark.py`).

Candidates are ranked by leverage. **Before relocating anything, verify that each US version genuinely duplicates core rather than being an intentional country-specific extension.**

## 1. eCPS-replacement comparison harness → core `eval` (biggest win)

`src/microplex_us/pipelines/ecps_replacement_comparison.py` (1,607 lines) is ~90% country-agnostic. About 28 of its ~33 functions fall into these generic groups:

- matched-household sampling: `_write_matched_dataset`, `_household_weights`, `_entity_*`
- symmetric refit: `_fit_dense_refit`, `_objective`
- holdout: `_build_holdout_target_mask`, `_validate_common_targets`, `_filter_loss_inputs_by_scope`
- scoring/diagnostics: `_target_loss_diagnostics`, `_refit_matrix_score_*`, `_target_family_breakdown`, `_target_bucket_breakdown`, `_diagnostic_unweighted_msre`, `_protected_family_losses`
- utils: `_sha256`, `_dataset_descriptor`, …

The **only** US-specific pieces are `_extract_pe_native_loss_inputs`, which shells out to the PE-US scorer and the US target DB to build the loss matrix, and the US bad-target list.

→ Move the harness to `microplex.eval`, likely merging it with the existing and apparently overlapping `eval/reweighting_benchmark.py`, and parameterize it by a loss-input extractor protocol, a target provider, and a baseline resolver. The US pack then shrinks to a ~200-line provider implementing the PE-US extractor. This also unblocks #117 (CI eval).
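To make the proposed seam concrete, here is a minimal sketch of what the extractor and target-provider protocols could look like. Every name in it (`LossInputs`, `LossInputExtractor`, `TargetProvider`, `mean_squared_relative_error`) is hypothetical and does not exist in either repo today; the baseline resolver is left out for brevity, and the scoring function is a stand-in for the real diagnostics, not a port of them.

```python
from dataclasses import dataclass
from typing import FrozenSet, Protocol, Sequence


@dataclass(frozen=True)
class LossInputs:
    """Hypothetical container: one matrix row per target, one column per household."""

    matrix: Sequence[Sequence[float]]
    target_values: Sequence[float]
    target_names: Sequence[str]


class LossInputExtractor(Protocol):
    """Country pack implements this (the US version would wrap the PE-US scorer)."""

    def extract(self, dataset_path: str) -> LossInputs: ...


class TargetProvider(Protocol):
    """Country pack supplies target metadata, e.g. its bad-target list."""

    def excluded_targets(self) -> FrozenSet[str]: ...


def mean_squared_relative_error(
    extractor: LossInputExtractor,
    targets: TargetProvider,
    dataset_path: str,
    weights: Sequence[float],
) -> float:
    """Country-agnostic scoring: depends only on the two protocols above."""
    inputs = extractor.extract(dataset_path)
    excluded = targets.excluded_targets()
    errors = []
    for row, value, name in zip(
        inputs.matrix, inputs.target_values, inputs.target_names
    ):
        if name in excluded:
            continue  # skip targets the country pack flags as unusable
        estimate = sum(a * w for a, w in zip(row, weights))
        errors.append(((estimate - value) / value) ** 2)
    return sum(errors) / len(errors)
```

With this shape, the core harness never imports anything country-specific; a UK pack would only need its own extractor and target provider.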

## 2. PE-native refit solver → core reweighting

`src/microplex_us/pipelines/pe_native_optimization.py` contains `optimize_pe_native_loss_weights` (the monotone accelerated projected-gradient refit) and `rewrite_policyengine_us_dataset_weights`. `AGENTS.md` says reweighting/solver belongs in core and local code "should remain a thin adapter over core bundle/reweighting surfaces." The projected-gradient + simplex-projection solver is pure numerics (loss matrix → weights), so it should move to core. Keep only the PE-entity weight I/O (the household → tax_unit/spm_unit/family/marital rewrite) local, parameterized by the entity list; the #221 empty-derived-weight-group guard generalizes as well.
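To illustrate why the solver is country-agnostic, here is a simplified sketch of a monotone projected-gradient refit with a simplex projection. It is not the code in `pe_native_optimization.py`: the acceleration step is omitted, the loss (sum of squared relative errors) is an assumption, and all function names are hypothetical. The projection is the standard sort-based Euclidean projection onto a scaled simplex.

```python
def project_to_simplex(values, total):
    """Euclidean projection onto {w : w >= 0, sum(w) == total} (sort-based)."""
    ordered = sorted(values, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(ordered, start=1):
        cumulative += u_k
        shift = (cumulative - total) / k
        if u_k - shift > 0:
            theta = shift  # the last k that passes gives the final threshold
    return [max(v - theta, 0.0) for v in values]


def loss_and_gradient(matrix, targets, weights):
    """Sum of squared relative errors (assumed loss) and its gradient."""
    loss, grad = 0.0, [0.0] * len(weights)
    for row, target in zip(matrix, targets):
        estimate = sum(a * w for a, w in zip(row, weights))
        rel = (estimate - target) / target
        loss += rel * rel
        for i, a in enumerate(row):
            grad[i] += 2.0 * rel * a / target
    return loss, grad


def refit_weights(matrix, targets, initial, steps=200):
    """Projected gradient that keeps total weight fixed and never increases loss."""
    total = sum(initial)
    # Conservative step: 1 / (trace bound on the gradient's Lipschitz constant).
    lipschitz = 2.0 * sum(
        sum(a * a for a in row) / (t * t) for row, t in zip(matrix, targets)
    )
    step = 1.0 / lipschitz
    weights = project_to_simplex(initial, total)
    loss, grad = loss_and_gradient(matrix, targets, weights)
    for _ in range(steps):
        moved = [w - step * g for w, g in zip(weights, grad)]
        candidate = project_to_simplex(moved, total)
        new_loss, new_grad = loss_and_gradient(matrix, targets, candidate)
        if new_loss < loss:
            weights, loss, grad = candidate, new_loss, new_grad
        else:
            step *= 0.5  # monotone guard: reject the step and shrink it
    return weights, loss
```

Nothing here knows about households, tax units, or PolicyEngine; the inputs are a loss matrix, a target vector, and starting weights, which is the argument for hosting the solver in core and keeping only the entity weight rewrite local.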

## 3. CPS-passthrough / income-split mechanism → core fusion/donor

(#226–#228) The splitter does three things: it preserves survey-measured totals when collapsing donor clones onto a survey scaffold, derives component splits from the survey total, and imputes only donor-specific detail and clone records. The same mechanism applies unchanged to UK FRS + admin clones. US keeps only the variable specs and split fractions. This is the core-targeted version of #229.
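A minimal sketch of the passthrough idea, assuming records are plain dicts and that the country pack supplies the split fractions. The function names and the variable names in the usage example are hypothetical; the real splitter in #226–#228 may be structured differently.

```python
def split_total(total, fractions):
    """Derive component amounts from a survey-measured total."""
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("split fractions must sum to 1")
    return {name: total * share for name, share in fractions.items()}


def collapse_onto_scaffold(
    survey_record, donor_record, total_var, fractions, donor_only_vars
):
    """Merge one donor clone onto its survey scaffold record."""
    merged = dict(survey_record)  # survey-measured values pass through unchanged
    # Components come from the survey total, never from the donor's own total.
    merged.update(split_total(survey_record[total_var], fractions))
    for var in donor_only_vars:
        merged[var] = donor_record[var]  # detail only the donor can supply
    return merged
```

For example, with a survey record `{"total_income": 100.0}`, a donor record `{"total_income": 250.0, "capital_gains": 7.0}`, and fractions `{"wages": 0.8, "self_employment": 0.2}`, the merged record keeps the survey total and takes only `capital_gains` from the donor. Under this design the country pack contributes data (variable specs and fractions) while the logic stays in core.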

## 4. De-dup drifted modules

`pe_targets.py`, `target_registry.py`, `unified_calibration.py`, `supabase_targets.py` exist in **both** `microplex` core and `microplex_us`. Confirm whether the US copies are drifted duplicates and, if they are, collapse them to a single source of truth in core.
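A quick first pass at the drift check could be a byte-level comparison of the paired files. The sketch below is a hypothetical helper, not an existing script; the directories are passed in because the exact module locations in each repo are not stated here.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def classify_copies(core_dir, us_dir, filenames):
    """Label each shared module as 'identical', 'differs', or 'missing'."""
    report = {}
    for name in filenames:
        core_file = Path(core_dir) / name
        us_file = Path(us_dir) / name
        if not core_file.exists() or not us_file.exists():
            report[name] = "missing"
        elif file_digest(core_file) == file_digest(us_file):
            report[name] = "identical"
        else:
            report[name] = "differs"
    return report
```

A `differs` result only says the bytes diverge; telling accidental drift apart from an intentional US extension still needs a manual diff review.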

Refs: #229 (passthrough extraction — target should be core), #117 (CI eval).

