Depend on microunit for tax-unit construction instead of in-repo copy#1157

Draft: MaxGhenis wants to merge 2 commits into main from wire-microunit


MaxGhenis commented May 30, 2026

Fixes #1156

Summary

Replaces this repository's in-repo tax-unit construction engine and rule helpers with a dependency on the standalone microunit package, and re-points every call site. microunit is the canonical home for these rules going forward; the engine logic was extracted from this repo and is byte-identical.

This is a behavior-preserving refactor: tax-unit output is unchanged (verified — see below).

What changed

Deleted (logic now lives in microunit):

  • policyengine_us_data/datasets/cps/tax_unit_construction.py — the engine.
  • policyengine_us_data/datasets/cps/tax_unit_rule_helpers.py — the federal rule helpers.

Re-pointed to import from microunit:

  • policyengine_us_data/datasets/cps/census_cps.py (construct_tax_units)
  • policyengine_us_data/datasets/acs/tax_unit_construction.py — the ACS wrapper (POLICYENGINE_MODE, construct_tax_units)
  • validation/cps_tax_unit_validation.py (POLICYENGINE_MODE, SUPPORTED_TAX_UNIT_CONSTRUCTION_MODES, CPSRelationshipCode, construct_tax_units, qualifying_child_age_test)
  • validation/cps_tax_unit_outcome_validation.py (construct_tax_units)
  • policyengine_us_data/datasets/acs/acs_to_cps_columns.py — docstring reference only.

Kept in this repo (deliberately excluded from microunit):

  • policyengine_us_data/datasets/acs/acs_to_cps_columns.py — ACS-specific column mapping.
  • policyengine_us_data/datasets/acs/tax_unit_construction.py — the thin ACS wrapper that maps ACS columns to CPS-like columns and adds the ACS imputed-link flags before delegating to microunit.construct_tax_units.

Dependency: added microunit @ git+https://github.com/PolicyEngine/microunit@d3eccbbd33aa51f1c310bd6c2f37c9c3735beeb1 to pyproject.toml (pinned to a SHA until the first PyPI release) and refreshed uv.lock.

Test coverage: moved vs kept

The CPS tax-unit construction engine unit tests are pure-engine tests and now live in microunit (its tests/test_tax_unit_construction.py is the same 22-test suite, function for function). To avoid maintaining those rule tests in two repos while keeping this repository's own integration paths covered, tests/unit/datasets/test_cps_tax_unit_construction.py is rewritten from a copy of the engine suite into thin integration tests that exercise this repo's use of microunit: the CensusCPS._create_tax_unit_table wiring (construction-mode and dataset-year pass-through, constructed-TAX_ID write-back, and CENSUS_TAX_ID preservation).
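For reviewers unfamiliar with the "thin integration test" shape, here is a minimal, self-contained sketch of the idea. It is not the actual test file. ToyCPS, its create_tax_unit_table method, and the mode/year keyword names are hypothetical stand-ins, and the assumption that the original TAX_ID is what gets preserved as CENSUS_TAX_ID is illustrative only. The engine is replaced by a mock so that only the wiring is asserted.

```python
from unittest import mock


class ToyCPS:
    """Hypothetical stand-in for a dataset class that delegates to an engine."""

    def __init__(self, engine, mode: str, year: int):
        self.engine = engine
        self.mode = mode
        self.year = year

    def create_tax_unit_table(self, person_rows: list[dict]) -> list[dict]:
        # Delegate construction, passing mode and year through unchanged.
        assignments = self.engine(person_rows, mode=self.mode, year=self.year)
        out = []
        for row, tax_id in zip(person_rows, assignments):
            # Assumed behavior: keep the original ID alongside the constructed one.
            out.append({**row, "CENSUS_TAX_ID": row["TAX_ID"], "TAX_ID": tax_id})
        return out


# The engine is mocked: rule logic is tested elsewhere, only wiring is checked here.
engine = mock.Mock(return_value=[101, 101])
cps = ToyCPS(engine, mode="policyengine", year=2024)
rows = [{"TAX_ID": 1}, {"TAX_ID": 2}]
table = cps.create_tax_unit_table(rows)

# Pass-through of construction mode and dataset year.
engine.assert_called_once_with(rows, mode="policyengine", year=2024)
# Constructed IDs are written back; the original IDs are preserved.
assert [r["TAX_ID"] for r in table] == [101, 101]
assert [r["CENSUS_TAX_ID"] for r in table] == [1, 2]
```

Mocking the engine keeps these tests insensitive to rule changes inside microunit, which is the point of moving the rule tests there.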

The ACS integration tests (tests/unit/datasets/test_acs_tax_unit_construction.py) are unchanged: they still cover ACS column mapping and ACS.add_id_variables wiring, which remain in this repo.

Behavior preservation

The engine body (everything from the HEAD = "HEAD" marker onward) is byte-identical between the deleted module and microunit; only the helper-import lines differ.
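One way such a check could be reproduced is sketched below. This is an illustrative helper, not the script used for this PR: it slices each source text from the first line that starts with the marker and compares the encoded bytes. The two toy strings and the old_helpers/new_helpers import names are hypothetical. In practice the inputs would be the contents of the deleted module and of the corresponding microunit module.

```python
def body_from_marker(source: str, marker: str = 'HEAD = "HEAD"') -> str:
    """Return the source text from the first line starting with `marker` onward."""
    lines = source.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.startswith(marker):
            return "".join(lines[index:])
    raise ValueError(f"marker {marker!r} not found")


def engine_bodies_match(old_source: str, new_source: str) -> bool:
    """Compare the bytes of both engine bodies, ignoring the import preamble."""
    old_body = body_from_marker(old_source).encode("utf-8")
    new_body = body_from_marker(new_source).encode("utf-8")
    return old_body == new_body


# Toy inputs: the import lines differ, the body after the marker does not.
old = 'from old_helpers import x\n\nHEAD = "HEAD"\n\ndef f():\n    return 1\n'
new = 'from new_helpers import x\n\nHEAD = "HEAD"\n\ndef f():\n    return 1\n'

assert engine_bodies_match(old, new)
```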

Verified empirically against the pre-refactor engine extracted from main, under the pinned policyengine-us==1.715.3:

  • dependent_gross_income_limit returns identical values for every year 2019–2026 (4,200 / 4,300 / 4,400 / 4,700 / 5,050 / 5,200 / 5,300).
  • 3,000 comparisons across randomized households × {2022, 2024, 2026} × {policyengine, census_documented} produced 0 differences in either the person-assignment frame or the tax-unit frame.
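The shape of that comparison can be sketched as a small differential-testing harness. This is a simplified illustration under stated assumptions, not the verification script itself: the household generator, the engine call signature (household, year, mode), and toy_engine are all hypothetical, and the real check compared data frames rather than lists. The figure of 500 households is only one count that would yield 3,000 comparisons (500 × 3 years × 2 modes); the PR does not state the actual number.

```python
import itertools
import random

YEARS = (2022, 2024, 2026)
MODES = ("policyengine", "census_documented")


def random_household(rng: random.Random) -> list[dict]:
    """Build a toy household: person records with an age and an income."""
    size = rng.randint(1, 6)
    return [
        {"person_id": i, "age": rng.randint(0, 90), "income": rng.randint(0, 80_000)}
        for i in range(size)
    ]


def count_differences(old_engine, new_engine, n_households: int, seed: int = 0):
    """Run both engines on every household x year x mode and count mismatches."""
    rng = random.Random(seed)
    households = [random_household(rng) for _ in range(n_households)]
    comparisons = 0
    differences = 0
    for household, year, mode in itertools.product(households, YEARS, MODES):
        comparisons += 1
        if old_engine(household, year, mode) != new_engine(household, year, mode):
            differences += 1
    return comparisons, differences


def toy_engine(household, year, mode):
    """Stand-in engine: adults get their own unit, children join unit 0."""
    return [
        (p["person_id"], p["person_id"] if p["age"] >= 19 else 0) for p in household
    ]


comparisons, differences = count_differences(toy_engine, toy_engine, 500)
assert (comparisons, differences) == (3000, 0)
```

Seeding the generator makes any mismatch reproducible, so a failing household can be replayed against both engines.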

Note for reviewers: the data source for the qualifying-relative gross-income limit changes — microunit vendors its own dependent_gross_income_limit.yaml rather than reading policyengine-us parameters at runtime — but the resolved values are identical for the pinned policyengine-us, so output does not change. This repository never shipped that YAML itself, so nothing was removed from its package data.

Checks

  • make lint: clean.
  • uv run pytest tests/unit/datasets/test_cps_tax_unit_construction.py tests/unit/datasets/test_acs_tax_unit_construction.py: 18 passed.
  • uv run pytest tests/unit: 2044 passed, 4 skipped.

Replace the in-repo tax-unit construction engine and rule helpers with a
dependency on the standalone microunit package, re-pointing all call sites
(census_cps, the ACS wrapper, both validation scripts, and the tests). The
engine logic was extracted from this repo and is byte-identical; tax-unit
output is unchanged.

Delete policyengine_us_data/datasets/cps/tax_unit_construction.py and
tax_unit_rule_helpers.py. Keep the ACS column mapping and ACS wrapper, which
microunit deliberately excludes. Rewrite the CPS engine unit tests (which now
live in microunit) into thin integration tests for this repo's wiring; the
ACS integration tests are unchanged.

Pin microunit to a SHA until its first PyPI release and refresh uv.lock.

Fixes #1156

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

microunit 0.1.0 is now published to PyPI; drop the pre-PyPI git+https
commit pin in favor of a standard version constraint.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>