[v4] Extract MicrosimulationModelVersion base class #294

Closed
MaxGhenis wants to merge 4 commits into v4-unify-program-stats from v4-base-extraction

@MaxGhenis
Contributor

Summary

Extracts ~300 lines of duplicated logic from PolicyEngineUSLatest and PolicyEngineUKLatest into a shared MicrosimulationModelVersion base class in tax_benefit_models.common.

What moves to the base:

  • Release-manifest fetch + installed-version warning
  • Data-release certification
  • Variable/parameter population from the country system
  • save() / load() + output-dataset filepath convention
  • _build_entity_relationships using declared group_entities

What subclasses now declare (class-level):

  • country_code ("us" / "uk")
  • package_name ("policyengine-us" / "policyengine-uk")
  • group_entities
  • entity_variables

Plus four thin hooks:

  • _load_system() — country system object
  • _load_region_registry() — country RegionRegistry
  • _dataset_class property — country PolicyEngine{Country}Dataset
  • _get_runtime_data_build_metadata() — optional build-metadata dict
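
The split between class-level declarations and hooks could look roughly like the sketch below. It is not the PR's code: the attribute and hook names are taken from the lists above, but the method bodies, the plain `ABC` base (the real class may well be a pydantic model), and the `DemoModel` / `DemoDataset` stand-ins with their values are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, ClassVar, Optional


class MicrosimulationModelVersion(ABC):
    """Illustrative base: shared logic lives here, country specifics in hooks."""

    # Class-level declarations that each country subclass fills in.
    country_code: ClassVar[str]
    package_name: ClassVar[str]
    group_entities: ClassVar[list[str]]
    entity_variables: ClassVar[dict[str, list[str]]]

    def __init__(self) -> None:
        # Shared initialisation only talks to the hooks, so the base
        # never has to import a country package itself.
        self.system = self._load_system()
        self.region_registry = self._load_region_registry()
        self.build_metadata = self._get_runtime_data_build_metadata() or {}

    @abstractmethod
    def _load_system(self) -> Any:
        """Return the country system object."""

    @abstractmethod
    def _load_region_registry(self) -> Any:
        """Return the country region registry."""

    @property
    @abstractmethod
    def _dataset_class(self) -> type:
        """Return the country dataset class."""

    def _get_runtime_data_build_metadata(self) -> Optional[dict]:
        # Optional hook: subclasses override only if they have metadata.
        return None


class DemoDataset:
    """Hypothetical stand-in for a country dataset class."""


class DemoModel(MicrosimulationModelVersion):
    # Hypothetical values, chosen only to make the sketch runnable.
    country_code = "demo"
    package_name = "policyengine-demo"
    group_entities = ["household"]
    entity_variables = {"person": ["age"], "household": ["household_weight"]}

    def _load_system(self) -> Any:
        return {"variables": ["age", "household_weight"]}

    def _load_region_registry(self) -> Any:
        return {"regions": ["demo"]}

    @property
    def _dataset_class(self) -> type:
        return DemoDataset


model = DemoModel()
```

Under this layout a new country would need four declarations and three required hooks, while the fourth hook stays optional through its default `None` return.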

run() intentionally stays per-country. The US applies reforms at Microsimulation construction and manually copies structural ID/weight columns; the UK wraps inputs as UKSingleYearDataset and applies reforms via a modifier after construction. Hiding that behind a shared skeleton would mask real divergence.

Safety

Behaviour preservation is guarded by tests/test_base_extraction_snapshot.py — byte-level JSON snapshots for four US + four UK household cases plus a model-surface snapshot. Snapshots were frozen pre-refactor; they re-run clean post-refactor (zero drift).
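
For readers unfamiliar with the technique, a byte-level JSON snapshot check can be sketched as follows. This is an assumed shape, not the contents of tests/test_base_extraction_snapshot.py: the helper names, the case file name, and the numeric values are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def serialize(result: dict) -> bytes:
    # Sorted keys and fixed indentation make the output deterministic,
    # which is what makes a byte-for-byte comparison meaningful.
    return json.dumps(result, sort_keys=True, indent=2).encode("utf-8")


def check_snapshot(path: Path, result: dict) -> bool:
    """Freeze the snapshot if it is missing; otherwise compare bytes exactly."""
    payload = serialize(result)
    if not path.exists():
        path.write_bytes(payload)
        return True
    return path.read_bytes() == payload


snapshot_dir = Path(tempfile.mkdtemp())
snapshot = snapshot_dir / "us_single_adult.json"  # hypothetical case name

# First call freezes the snapshot (the "pre-refactor" state).
frozen = check_snapshot(snapshot, {"income_tax": 1234.5, "net_income": 40000.0})
# Same values in a different key order still match, thanks to sort_keys.
unchanged = check_snapshot(snapshot, {"net_income": 40000.0, "income_tax": 1234.5})
# Any numeric change, however small, shows up as drift.
drifted = check_snapshot(snapshot, {"income_tax": 1234.6, "net_income": 40000.0})
```

Comparing bytes rather than parsed values is stricter than a tolerance-based check: it also catches changes in key sets, types, and float formatting.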

  • 391/391 tests pass
  • Reviewer (code-simplifier) cleared pydantic / hook-placement / patch-path concerns; no substantive issues

Stack

Stacked on #293 (unify ProgramStatistics / ProgrammeStatistics). Base branch will retarget to main once #288-#293 merge.

Test plan

  • pytest tests/ green locally (391 passed)
  • Snapshot test locks 8 household cases + 2 model surfaces
  • test_manifest_version_mismatch still exercises the real warning branch (patch paths updated to the shared base module)
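
The patch-path point follows from how `unittest.mock.patch` resolves string targets: it replaces the name in the module where the code under test looks it up, so moving logic into a shared module means the target string must move too. The sketch below illustrates this with a throwaway module; `demo_common`, `get_installed_version`, and `warn_on_mismatch` are hypothetical names, not the ones used in this PR.

```python
import sys
import types
import warnings
from unittest.mock import patch

# Build a stand-in "shared base module" in memory so the sketch is self-contained.
common = types.ModuleType("demo_common")
exec(
    '''
import warnings

def get_installed_version(package_name):
    return "1.0.0"

def warn_on_mismatch(package_name, manifest_version):
    installed = get_installed_version(package_name)
    if installed != manifest_version:
        warnings.warn(f"{package_name} {installed} != manifest {manifest_version}")
''',
    common.__dict__,
)
sys.modules["demo_common"] = common


def collect_warnings(installed=None):
    """Run the check, optionally faking the installed version, and return messages."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if installed is None:
            common.warn_on_mismatch("policyengine-us", "1.0.0")
        else:
            # The target names the module that owns the lookup, not the caller.
            with patch("demo_common.get_installed_version", return_value=installed):
                common.warn_on_mismatch("policyengine-us", "1.0.0")
    return [str(w.message) for w in caught]


matching = collect_warnings()
mismatched = collect_warnings(installed="0.9.0")
```

Only the version lookup is faked here; the comparison and the `warnings.warn` call run unmodified, which is the sense in which such a test still exercises the real warning branch.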

🤖 Generated with Claude Code

Pulls ~300 lines of shared init/save/load logic out of
PolicyEngineUSLatest and PolicyEngineUKLatest into a
MicrosimulationModelVersion base in tax_benefit_models.common.

The base handles:
- Release-manifest fetch + installed-version warning
- Data-release certification
- Variable/parameter population from the country system
- save() / load() + output-dataset filepath convention
- _build_entity_relationships via declared group_entities

Subclasses declare country_code, package_name, group_entities,
entity_variables, and implement four thin hooks (_load_system,
_load_region_registry, _dataset_class, _get_runtime_data_build_metadata).
run() intentionally stays per-country: the US applies reforms at
Microsimulation construction and manually copies structural columns,
while the UK wraps inputs as UKSingleYearDataset and applies reforms
after construction. Hiding those behind a shared skeleton would mask
real divergence.

Behaviour preservation is guarded by a byte-level snapshot test
(tests/test_base_extraction_snapshot.py) covering four US and four
UK household cases plus a model-surface snapshot. All 391 tests pass
with zero snapshot drift.
@MaxGhenis
Contributor Author

Superseded by #298 (consolidated v4 launch PR). All commits cherry-picked cleanly onto v4.

@MaxGhenis MaxGhenis closed this Apr 19, 2026