Restore eCPS export and entity ID parity#112

Merged
MaxGhenis merged 1 commit into main from codex/ecps-entity-export-parity-20260530
May 30, 2026
Conversation

@MaxGhenis
Contributor

What changed

  • Derive and export is_household_head from relationship_to_head when PE entity tables do not already carry the explicit input.
  • Preserve complete existing family_id, spm_unit_id, and marital_unit_id columns during PE entity-table construction, normalizing IDs only when raw IDs repeat across households.
  • Add regression coverage for the derived export and complete existing group-ID preservation.

Why

The no-ACS eCPS-shaped comparison showed Microplex still missed one non-formula eCPS-exported column, is_household_head. The remaining eCPS-only columns are PE formula-owned and should stay excluded under the no-formula-export contract. The same comparison also exposed entity structure drift, especially SPM/family/marital group construction; preserving complete source IDs is the safest first step toward source-stage parity.
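The "preserve complete IDs, normalize only on cross-household repeats" rule might look roughly like the sketch below. The function name `resolve_group_ids`, the list-of-dicts table, the `household_id` key, and the choice to return `None` for an incomplete column are all illustrative assumptions, not the pipeline's actual behavior.

```python
def resolve_group_ids(persons, column):
    """Return one group ID per person row for `column` (e.g. "spm_unit_id").

    Returns None when the column is incomplete, signalling (hypothetically)
    that the caller should build the groups some other way.
    """
    raw = [row.get(column) for row in persons]
    if any(value is None for value in raw):
        return None

    # Record which households each raw ID appears in.
    households_by_id = {}
    for row, value in zip(persons, raw):
        households_by_id.setdefault(value, set()).add(row["household_id"])

    if all(len(households) == 1 for households in households_by_id.values()):
        return raw  # every raw ID belongs to one household: keep it verbatim

    # Raw IDs repeat across households: renumber each (household, raw ID) pair.
    new_ids = {}
    result = []
    for row, value in zip(persons, raw):
        key = (row["household_id"], value)
        new_ids.setdefault(key, len(new_ids) + 1)
        result.append(new_ids[key])
    return result


unique = resolve_group_ids(
    [
        {"household_id": 10, "spm_unit_id": 101},
        {"household_id": 10, "spm_unit_id": 101},
        {"household_id": 20, "spm_unit_id": 202},
    ],
    "spm_unit_id",
)
repeated = resolve_group_ids(
    [
        {"household_id": 10, "spm_unit_id": 1},
        {"household_id": 10, "spm_unit_id": 1},
        {"household_id": 20, "spm_unit_id": 1},
        {"household_id": 20, "spm_unit_id": 2},
    ],
    "spm_unit_id",
)
```

Keeping the source IDs whenever they are already unambiguous means group membership is inherited from the source rather than re-derived, which is why it is the lower-risk step toward parity.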

Validation

  • uv run --python 3.13 --extra dev --extra policyengine pytest tests/policyengine/test_us.py::TestPolicyEngineUSProjection tests/pipelines/test_us.py::TestUSMicroplexPipeline -k 'policyengine_entity_tables or export_variable_maps or time_period_arrays or group_tables or tax_unit'
  • uv run --python 3.13 --extra dev --extra policyengine ruff check src/microplex_us/policyengine/us.py src/microplex_us/pipelines/us.py tests/policyengine/test_us.py tests/pipelines/test_us.py
  • uv run --python 3.13 --extra dev --extra policyengine ruff format --check src/microplex_us/policyengine/us.py src/microplex_us/pipelines/us.py tests/policyengine/test_us.py tests/pipelines/test_us.py
  • Re-export smoke on the completed no-ACS artifact confirmed is_household_head is now present; the only remaining eCPS-only variables are the nine PE formula-owned columns intentionally excluded from Microplex export.

MaxGhenis marked this pull request as ready for review May 30, 2026 09:41
MaxGhenis merged commit 76fb8e6 into main May 30, 2026
4 checks passed
MaxGhenis deleted the codex/ecps-entity-export-parity-20260530 branch May 30, 2026 09:41
MaxGhenis added a commit that referenced this pull request May 30, 2026
Add microunit as a dependency and route the reconstruction-from-scratch
tax-unit path through microunit.construct_tax_units when the person frame
carries microunit's raw CPS input columns (PH_SEQ, A_LINENO, A_MARITL,
A_SPOUSE, PEPAR1, PEPAR2, A_EXPRRP). When those columns are absent -- the
current production case, since microplex's reconstruction frame collapses
relationship_to_head and drops the spouse/parent pointers -- the new
USPipeline._build_policyengine_tax_units_via_microunit returns None and the
legacy role-flag reconstruction runs unchanged. The authoritative-ID path
(#112) is never routed here.

Net effect is behavior-preserving on today's data: the delegation stays
inert until an upstream change threads CPS columns through to entity
construction. microunit IS eCPS's tax-unit engine, so activating the
delegation converges microplex's tax units toward eCPS's; any resulting
loss movement is an entity-convergence effect and must be interpreted as
such, not as a quality win (see #113).

Adds tests/pipelines/test_us_microunit_delegation.py (4 passing); ruff clean.
Implementation produced by the parallel wire-microunit agent; verified
(ruff + delegation tests) and committed here.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
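
The inert-until-activated delegation described in the commit above can be sketched as a guard plus fallback. This is an illustration only: `tax_units_via_delegate` and `build_tax_units` are hypothetical names, and `delegate` and `legacy` are stand-in callables, so nothing here assumes the real signature of `microunit.construct_tax_units` or of the pipeline method. Only the required column names come from the commit message.

```python
REQUIRED_CPS_COLUMNS = (
    "PH_SEQ", "A_LINENO", "A_MARITL", "A_SPOUSE", "PEPAR1", "PEPAR2", "A_EXPRRP",
)


def tax_units_via_delegate(columns, rows, delegate):
    """Return delegated tax units, or None when any raw CPS column is absent."""
    if not set(REQUIRED_CPS_COLUMNS) <= set(columns):
        return None
    return delegate(rows)


def build_tax_units(columns, rows, delegate, legacy):
    """Try the delegated path first; fall back to the legacy reconstruction."""
    delegated = tax_units_via_delegate(columns, rows, delegate)
    return legacy(rows) if delegated is None else delegated


# Stand-in engines that only report which path ran.
def delegate(rows):
    return "delegated"


def legacy(rows):
    return "legacy"


production_like = build_tax_units(["relationship_to_head"], [], delegate, legacy)
with_cps_columns = build_tax_units(list(REQUIRED_CPS_COLUMNS), [], delegate, legacy)
```

Because the guard returns `None` instead of raising, a frame without the raw CPS columns takes exactly the path it took before the change.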
MaxGhenis added a commit that referenced this pull request May 31, 2026
…ing; part of #113) (#114)

* Delegate PE tax-unit reconstruction to microunit (part of #113)

* Depend on microunit from PyPI (>=0.1.0) instead of the git pin

microunit 0.1.0 is now published to PyPI, so drop the pre-PyPI
git+https commit pin in favor of a standard version constraint.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Harden microunit delegation: defensive filing-status normalize + order test

- Route microunit's filing_status through _normalize_policyengine_filing_status
  so the delegated path cannot diverge from the legacy paths if microunit ever
  changes its spelling/casing (today the vocabularies already match).
- Add a regression test feeding rows out of PH_SEQ/A_LINENO order, asserting
  correct unit/role/filing assignment — locks in microunit's input-row-order
  contract that the positional TAX_ID mapping relies on.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
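
The two hardening points in the last commit can be pictured with a small sketch. Everything in it is assumed for illustration: `normalize_filing_status` is a stand-in for `_normalize_policyengine_filing_status` (whose real rules are not shown here), upper snake case is an assumed canonical spelling, and `fake_construct` is a toy engine that merely returns one output row per input row in input order, which is the contract the positional mapping depends on.

```python
def normalize_filing_status(value):
    """Stand-in normalizer: trim, upper-case, and join words with underscores."""
    return "_".join(str(value).strip().upper().split())


def fake_construct(rows):
    """Toy engine: one unit per PH_SEQ, output aligned with the input row order."""
    return [{"TAX_ID": row["PH_SEQ"], "filing_status": " single "} for row in rows]


def attach_tax_units(rows, construct):
    """Map engine output back onto persons by position in the input."""
    units = construct(rows)
    return {
        (row["PH_SEQ"], row["A_LINENO"]): (
            unit["TAX_ID"],
            normalize_filing_status(unit["filing_status"]),
        )
        for row, unit in zip(rows, units)
    }


ordered = [
    {"PH_SEQ": 1, "A_LINENO": 1},
    {"PH_SEQ": 1, "A_LINENO": 2},
    {"PH_SEQ": 2, "A_LINENO": 1},
]
shuffled = [ordered[2], ordered[0], ordered[1]]

in_order = attach_tax_units(ordered, fake_construct)
out_of_order = attach_tax_units(shuffled, fake_construct)
```

A regression test in this spirit compares the per-person assignment for sorted and shuffled input; if the engine ever stopped returning rows in input order, the positional `zip` would silently mislabel people and the comparison would fail.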