Preserve set_input values across apply_reform by MaxGhenis · Pull Request #475 · PolicyEngine/policyengine-core

MaxGhenis · 2026-04-17T22:14:09Z

Summary

Closes #473 (remaining piece), fixes PolicyEngine/policyengine.py#1628, unblocks PolicyEngine/policyengine-us#8058.

Two related bugs break country packages under core 3.24.x when they build a situation/dataset and then apply a structural reform during Simulation.__init__ (the policyengine-uk and policyengine-us pattern). This PR also fixes a latent bug in GroupPopulation.clone that surfaced while wiring up the fix.

The problem

Wipe bug. The H3 fix (#463) added _invalidate_all_caches to wipe holder._memory_storage._arrays for every variable after apply_reform, so formula output caches couldn't survive a reform that invalidated them. But user-provided set_input values share the same storage, so they got wiped too. Surfaced as policyengine-uk household-impact tests returning 0 because age, employment_income, would_claim_* were wiped before any calculation ran (#1628).

Nested-branch fallback bug. The C1 fix (#457) correctly stopped Holder.get_array from returning values stored under an arbitrary sibling branch — that had caused silent reform↔baseline cross-contamination. But it only fell back to the default branch, which broke nested-branch patterns. policyengine-us uses a two-level pattern: tax_liability_if_itemizing sets tax_unit_itemizes=True on an itemizing branch, and ctc_limiting_tax_liability forks a no_salt sub-branch from it. Without parent-branch fallback, tax_unit_itemizes re-runs its formula on no_salt, which calls tax_liability_if_itemizing again, producing an infinite recursion (surfaced as RecursionError and a CycleError).
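The difference between default-only fallback and a parent-chain walk can be sketched with a toy model. This is only an illustration under stated assumptions: `ToySimulation`, the flat `storage` dict and both lookup functions are hypothetical stand-ins, not the policyengine-core `Holder.get_array` implementation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class ToySimulation:
    """Hypothetical stand-in for a simulation; only tracks branch lineage."""

    branch_name: str = "default"
    parent_branch: Optional["ToySimulation"] = None


def get_array_default_only(storage, sim, period):
    """Exact branch first, then `default`; never consults a parent branch."""
    for branch in (sim.branch_name, "default"):
        if (branch, period) in storage:
            return storage[(branch, period)]
    return None


def get_array_parent_walk(storage, sim, period):
    """Exact branch, then each ancestor in turn, then `default`."""
    node = sim
    while node is not None:
        if (node.branch_name, period) in storage:
            return storage[(node.branch_name, period)]
        node = node.parent_branch
    return storage.get(("default", period))


root = ToySimulation()
itemizing = ToySimulation("itemizing", parent_branch=root)
no_salt = ToySimulation("no_salt", parent_branch=itemizing)
sibling = ToySimulation("standard", parent_branch=root)

# The value was set on the itemizing branch only.
storage = {("itemizing", 2026): True}

# Default-only fallback misses the parent's value, so a formula would re-run.
assert get_array_default_only(storage, no_salt, 2026) is None
# Walking the parent chain finds the value set on the ancestor.
assert get_array_parent_walk(storage, no_salt, 2026) is True
# A sibling with no parent relationship to `itemizing` still sees nothing.
assert get_array_parent_walk(storage, sibling, 2026) is None
```

In this toy, the sibling lookup returning `None` corresponds to the C1 guarantee: only ancestors are consulted, never arbitrary branches.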

Latent GroupPopulation.clone bug. GroupPopulation.clone called holder.clone(self) — passing the source population rather than the cloned population. Holder.clone then set new_dict["simulation"] = population.simulation, pointing the cloned holder's .simulation back at the original simulation. Branch-aware lookups therefore started from the wrong simulation.
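A minimal sketch of why the argument passed to `holder.clone` matters. The `ToyHolder` and `ToyPopulation` classes and the `buggy` flag are hypothetical and exist only for this illustration; the real classes carry far more state.

```python
import copy


class ToyHolder:
    """Hypothetical holder that remembers which simulation it belongs to."""

    def __init__(self, population):
        self.population = population
        self.simulation = population.simulation

    def clone(self, population):
        new = copy.copy(self)
        new.population = population
        # The clone takes its simulation from whichever population it is given.
        new.simulation = population.simulation
        return new


class ToyPopulation:
    def __init__(self, simulation):
        self.simulation = simulation
        self._holders = {}

    def clone(self, simulation, buggy=False):
        result = copy.copy(self)
        result.simulation = simulation
        # buggy=True passes the source population (self), as described above;
        # the corrected form passes the cloned population (result).
        result._holders = {
            name: holder.clone(self if buggy else result)
            for name, holder in self._holders.items()
        }
        return result


original_sim, cloned_sim = object(), object()
population = ToyPopulation(original_sim)
population._holders["age"] = ToyHolder(population)

buggy_clone = population.clone(cloned_sim, buggy=True)
fixed_clone = population.clone(cloned_sim)

# With the source population passed in, the holder points back at the original.
assert buggy_clone._holders["age"].simulation is original_sim
# With the cloned population passed in, the holder points at the clone.
assert fixed_clone._holders["age"].simulation is cloned_sim
```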

Fixes

Simulation._invalidate_all_caches preserves user inputs. Track every (variable, branch, period) populated via Simulation.set_input OR Holder.set_input (the latter covers the Simulation(situation=...) path through SimulationBuilder.finalize_variables_init). On invalidation: snapshot those entries, wipe everything, replay the snapshotted inputs back. Formula-output caches are still invalidated; user inputs survive.
Holder.get_array walks simulation.parent_branch. Walk up the parent-branch chain before falling back to default. Sibling branches (no parent relationship) still don't leak into each other — the C1 guarantee holds.
GroupPopulation.clone passes the cloned population to holders. holder.clone(result) instead of holder.clone(self).

The _user_input_keys attribute is lazy-initialised inside both Simulation.set_input and Holder.set_input so country-package subclasses that bypass super().__init__ (same pattern _fast_cache was guarded for in #474) automatically pick up the preservation without downstream code changes.
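The snapshot, wipe and replay sequence, together with the lazy initialisation, can be sketched as follows. This is a toy model under stated assumptions: `ToySimulation`, its flat `storage` dict and `cache_formula_output` are hypothetical, and only the `_user_input_keys` name and the `(variable, branch, period)` key shape come from the description above.

```python
class ToySimulation:
    """Hypothetical model: one flat dict stands in for all holder storage."""

    def __init__(self):
        self.storage = {}  # (variable, branch, period) -> value

    def set_input(self, variable, period, value, branch="default"):
        # Lazy init: works even if a subclass never ran this class's __init__.
        if not hasattr(self, "_user_input_keys"):
            self._user_input_keys = set()
        key = (variable, branch, period)
        self._user_input_keys.add(key)
        self.storage[key] = value

    def cache_formula_output(self, variable, period, value, branch="default"):
        # Formula outputs share the same storage but are not tracked as inputs.
        self.storage[(variable, branch, period)] = value

    def invalidate_all_caches(self):
        keys = getattr(self, "_user_input_keys", set())
        # 1. Snapshot the tracked user inputs that are currently stored.
        snapshot = {k: self.storage[k] for k in keys if k in self.storage}
        # 2. Wipe everything, formula outputs included.
        self.storage.clear()
        # 3. Replay the user inputs.
        self.storage.update(snapshot)


sim = ToySimulation()
sim.set_input("employment_income", 2026, 50_000)
sim.cache_formula_output("income_tax", 2026, 7_486)
sim.invalidate_all_caches()

# The user input survives; the cached formula output does not.
assert sim.storage == {("employment_income", "default", 2026): 50_000}
```

Tracking keys rather than copying values at `set_input` time keeps the bookkeeping small: the values are read from storage only at the moment of invalidation.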

Tests

tests/core/test_apply_reform_preserves_user_inputs.py:

a set_input value survives a no-op reform
multiple set_input values across different variables all survive
a situation-dict input survives a reform (covers the holder.set_input path via SimulationBuilder.finalize_variables_init, not just Simulation.set_input)
the H3 fix (formula cache invalidation) still holds — neutralize_variable still drops the cached formula output

tests/core/test_holder_branch_fallback.py:

(pre-existing) sibling-branch cross-contamination stays blocked
(pre-existing) fallback to the default branch still works
new: get_array on a nested branch walks parent_branch and returns the ancestor's value
new: GroupPopulation.clone points each holder's .simulation at the clone, not the source

All three existing test_fast_cache_guards.py tests (bare-Simulation case) still pass because of the hasattr-gated lazy init.

End-to-end verification

Installing this branch into a policyengine.py checkout with policyengine-uk==2.88.0:

| Test | Before | After |
| --- | --- | --- |
| `test_single_adult_with_employment_income` (income_tax > 0) | 0.0 (FAIL) | £7,486 ✓ |
| `test_single_adult_with_employment_income` (NI > 0) | 0.0 (FAIL) | £2,994 ✓ |
| `test_family_with_children` (child_benefit > 0) | 0.0 (FAIL) | £2,328 ✓ |

Installing this branch into a policyengine-us==1.647.0 checkout (core 3.24.1 floor):

The tax_unit_itemizes.yaml integration test, which previously crashed with TypeError: int() argument ... not 'NoneType', now passes 7/7.

Test plan

tests/core/test_apply_reform_preserves_user_inputs.py — 4/4 pass
tests/core/test_holder_branch_fallback.py — 4/4 pass
tests/core/test_apply_reform_invalidates_cache.py (H3 regression) — 1/1 still passes
tests/core/test_fast_cache_guards.py — 3/3 still pass
tests/core/ full suite — 514/515 pass (1 pre-existing test_parameter_security failure unrelated to these changes)
uvx ruff format --check / uvx ruff check clean
End-to-end with policyengine-uk 2.88.0 + policyengine.py: failing household-impact tests now pass
End-to-end with policyengine-us 1.647.0: tax_unit_itemizes.yaml passes
CI passes

The H3 fix (PR #463) added `_invalidate_all_caches` to wipe `holder._memory_storage._arrays` for every variable after `apply_reform`, so formula output caches couldn't survive a reform that invalidated them. But user-provided `set_input` values share the same storage, so they got wiped too.

Country packages (e.g. `policyengine_uk.Simulation.__init__`) follow this pattern:

1. build populations, call `self.set_input(...)` from the dataset
2. apply a structural reform derived from parameters
3. calculate downstream variables

Step 2's `apply_reform` triggered `_invalidate_all_caches`, silently discarding the dataset loaded in step 1. Surfaced in PolicyEngine/policyengine.py#1628 (UK household-impact tests returning 0 because every single input, including age, employment_income and would_claim_*, was wiped before any calculation ran).

Fix: track every `(variable, branch, period)` populated via `Simulation.set_input` in a new `_user_input_keys` set. On `_invalidate_all_caches`, snapshot those entries from storage, perform the wipe, then replay them back. Formula-output caches are still invalidated; user inputs survive.

The attribute is lazy-initialised inside `set_input` so country-package subclasses that bypass `super().__init__` (the same pattern `_fast_cache` was guarded for in PR #474) automatically pick up the preservation without a downstream code change.

Three new tests under `tests/core/test_apply_reform_preserves_user_inputs.py`:

- a set_input value survives a no-op reform
- multiple set_input values survive
- the H3 fix (formula cache invalidation) still holds: a neutralize_variable reform still drops the cached formula output

End-to-end: installing this branch into a policyengine.py checkout with `policyengine-uk==2.88.0` turns the failing `TestUKHouseholdImpact.test_single_adult_with_employment_income` (0.0 > 0) into a passing £7,486 income tax on a £50k salary, and `test_family_with_children` into a passing £2,328 child benefit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…input

Two follow-up fixes on top of the ``Simulation.set_input`` preservation in this PR, plus a latent bug that surfaced while wiring them up. Together they unblock ``policyengine-us`` bumping its core floor past 3.24.0 (``PolicyEngine/policyengine-us#8066``).

## 1. ``Holder.set_input`` now also records ``_user_input_keys``

The previous patch only recorded keys inside ``Simulation.set_input``. But ``SimulationBuilder.finalize_variables_init`` (the path taken when ``Simulation(situation=...)`` is called) routes inputs through ``holder.set_input`` directly, bypassing the simulation-level hook. So every situation-dict input got wiped by the post-``apply_reform`` cache invalidation in country-package subclasses that apply a structural reform during construction (the ``policyengine-us`` pattern).

Recording the key inside ``Holder.set_input`` covers both paths: ``Simulation.set_input`` still adds its own entry (harmless duplicate in a set), and ``holder.set_input`` picks up the situation-dict and dataset loader paths.

## 2. ``Holder.get_array`` walks ``simulation.parent_branch``

The C1 fix (``fix-holder-get-array-branch-leak``) correctly stopped ``get_array`` from returning values stored under an arbitrary sibling branch, which had caused silent reform↔baseline cross-contamination. But it only fell back to the ``default`` branch, so a nested branch (``no_salt`` cloned from ``itemizing``) could no longer read inputs set on its parent.

``policyengine-us`` uses that two-level pattern: ``tax_liability_if_itemizing`` sets ``tax_unit_itemizes=True`` on an ``itemizing`` branch, and ``ctc_limiting_tax_liability`` forks a ``no_salt`` sub-branch from it. Without parent-branch fallback, ``tax_unit_itemizes`` re-runs its formula on ``no_salt``, which calls ``tax_liability_if_itemizing`` again, producing a ``CycleError`` → eventually surfaced as a recursion exception.

The fix walks ``simulation.parent_branch`` up to the root and returns the first ancestor that has a value. Sibling branches (no parent relationship) still don't leak into each other, so the C1 guarantee holds.

## 3. ``GroupPopulation.clone`` passes the cloned population to holders

Latent bug that surfaced while fixing #2: ``GroupPopulation.clone`` was calling ``holder.clone(self)``, passing the *source* population to each holder. ``Holder.clone`` then set ``new_dict["simulation"] = population.simulation``, pointing the cloned holder's ``.simulation`` reference back at the original sim rather than the clone. That meant a holder on the ``no_salt`` clone thought it belonged to the ``itemizing`` simulation, so the ``parent_branch`` walk started from the wrong simulation and missed the ancestor's inputs.

Pass ``result`` (the cloned population) so the holder's ``.simulation`` points at the clone.

## Tests

- ``test_apply_reform_preserves_situation_dict_inputs``: covers the ``Simulation(situation=...)`` path that bypasses ``Simulation.set_input`` (fails without #1).
- ``test_get_array_falls_back_through_parent_branch_chain``: covers nested-branch parent inheritance (fails without #2).
- ``test_group_population_clone_sets_holder_simulation_to_clone``: pins the cloned holder's ``.simulation`` to the clone (fails without #3).

All existing core tests still pass (514 pass, 1 pre-existing parameter security failure unrelated to these changes). The ``tax_unit_itemizes.yaml`` integration test (7/7) and the full ``gov/irs/income/taxable_income`` suite (253/253) in ``policyengine-us`` 1.647.0 pass under this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Core 3.24.4 ships the fixes from PolicyEngine/policyengine-core#475:

- `Simulation.apply_reform` preserves `set_input` values (the H3 cache invalidation used to wipe user-provided dataset inputs)
- `Holder.get_array` walks up `simulation.parent_branch` before falling back to `default`, so nested branches (e.g. `no_salt` under `itemizing`) still see ancestor inputs
- `GroupPopulation.clone` passes the cloned population to holder.clone so branch-aware lookups resolve correctly

Together these unblock the strict breakdown/children validator added in core 3.24.0. Before 3.24.4, bumping the floor triggered `TypeError: int() argument ... not 'NoneType'` in the `tax_unit_itemizes.yaml` integration test (state_fips was wiped by apply_reform) plus an infinite recursion in `tax_liability_if_itemizing`.

With 3.24.4 (or later), the existing `test_system_import.py` regression test now actually catches breakdown/children mismatches at CI time, the class of bug that caused issue PolicyEngine#8055 and the three partial-fix patches PolicyEngine#8045, PolicyEngine#8049, PolicyEngine#8051.

Verified locally against core 3.25.0: all 794 IRS tests pass.

Also remove the `[tool.uv]` environments restriction (`python_version >= '3.11'`) since core 3.24+ supports 3.9/3.10 directly on PyPI; the lockfile now resolves cleanly across the full 3.9-3.14 range.

Closes PolicyEngine#8066.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

PE-core 3.24.0-3.24.3 introduced a cache-invalidation cascade that wiped set_input values whenever a reform was applied during simulation construction. Test suite was green on 3.23.6 and broken on 3.24.x; 3.25.0 (PolicyEngine/policyengine-core#475) preserves user-provided inputs across apply_reform while still invalidating formula-output caches. Verified locally: test_latest_data_smoke, test_lha_freeze, test_parametric_reform_impacts, and test_uc_rebalancing all pass after the bump. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

PE-core 3.24.0-3.24.3 cache-invalidation wiped set_input values across apply_reform, causing UK household-impact calculations to return zero (#1628). 3.25.0 (PolicyEngine/policyengine-core#475) preserves user inputs while still invalidating formula-output caches. All 11 tests/test_household_impact.py cases pass on the new pin. Closes #1628 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The 3.24.4-era implementation walked every variable in the tax-benefit system via `get_holder(variable)`, which lazy-creates a `Holder` for each one. With thousands of variables in policyengine-us, this inflated `apply_reform` from milliseconds to seconds — the YAML full-suite went from ~17 min to ~51 min per job and started timing out at the 1-hour GitHub Actions limit. Untouched variables have no holder and therefore nothing to wipe. Iterate `population._holders.values()` on each population to hit only the variables that actually exist, preserving the set_input replay behaviour from PolicyEngine#475. Verified against tests/core/test_apply_reform_preserves_user_inputs and tests/core/test_apply_reform_invalidates_cache (all 5 pass). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
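The cost difference between the two iteration strategies can be sketched with a toy population. This is an illustration only: `ToyPopulation`, its `created` counter and both wipe functions are hypothetical, and the `set_input` replay is left out so that only the iteration strategy differs.

```python
class ToyPopulation:
    """Hypothetical population whose get_holder lazy-creates holders."""

    def __init__(self):
        self._holders = {}
        self.created = 0  # counts lazy creations, for illustration only

    def get_holder(self, name):
        if name not in self._holders:
            self._holders[name] = {}  # a plain dict stands in for a Holder
            self.created += 1
        return self._holders[name]


def wipe_via_get_holder(population, all_variable_names):
    # Walks every variable in the system, creating a holder for each one.
    for name in all_variable_names:
        population.get_holder(name).clear()


def wipe_existing_holders(population):
    # Touches only the holders that already exist.
    for holder in population._holders.values():
        holder.clear()


all_variables = [f"var_{i}" for i in range(5000)]

slow = ToyPopulation()
slow.get_holder("var_0")[2026] = 30
wipe_via_get_holder(slow, all_variables)

fast = ToyPopulation()
fast.get_holder("var_0")[2026] = 30
wipe_existing_holders(fast)

assert slow.created == 5000  # one holder per variable in the system
assert fast.created == 1     # only the variable that was actually used
assert fast._holders["var_0"] == {}  # and it was still wiped
```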

Core 3.24.4 ships the fixes from PolicyEngine/policyengine-core#475:

- Simulation.apply_reform preserves set_input values
- Holder.get_array walks up simulation.parent_branch before default
- GroupPopulation.clone passes the cloned population to holder.clone

Together these unblock the strict breakdown/children validator added in core 3.24.0. The existing test_system_import.py (added in PolicyEngine#8058) now actually validates the parameter tree at CI time.

Verified locally against core 3.25.0: all 794 IRS tests pass.

Remove the [tool.uv] environments restriction, since core 3.24+ supports 3.9/3.10 directly on PyPI.

Closes PolicyEngine#8066.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

MaxGhenis force-pushed the fix-invalidate-preserves-user-inputs branch from a000f89 to 02a9161 Compare April 17, 2026 22:19

MaxGhenis mentioned this pull request Apr 17, 2026

Core 3.24.x Enum cast regression blocks core floor bump PolicyEngine/policyengine-us#8066

Closed

4 tasks

MaxGhenis force-pushed the fix-invalidate-preserves-user-inputs branch from 1ce8b05 to 745ffdf Compare April 18, 2026 00:06

MaxGhenis merged commit 4397550 into master Apr 18, 2026
22 checks passed

MaxGhenis deleted the fix-invalidate-preserves-user-inputs branch April 18, 2026 00:10

MaxGhenis mentioned this pull request Apr 18, 2026

Bump policyengine-core floor to 3.24.4 (closes #8066) PolicyEngine/policyengine-us#8069

Closed

2 tasks

This was referenced Apr 18, 2026

Bump policyengine-core to >=3.25.0 PolicyEngine/policyengine-uk#1633

Merged

Bump policyengine-core to >=3.25.0 PolicyEngine/policyengine-us#8070

Closed

MaxGhenis mentioned this pull request Apr 18, 2026

Bump policyengine_core to >=3.25.0 PolicyEngine/policyengine.py#283

Merged

1 task

policyengine bot mentioned this pull request Apr 18, 2026

Update PolicyEngine UK to 2.88.5 PolicyEngine/policyengine-api#3479

Open

MaxGhenis mentioned this pull request Apr 18, 2026

Fix _invalidate_all_caches performance regression #478

Merged

5 tasks

MaxGhenis mentioned this pull request Apr 18, 2026

Bump policyengine-core to >=3.25.1 PolicyEngine/policyengine-us#8077

Merged

2 tasks

MaxGhenis mentioned this pull request Apr 18, 2026

Bump policyengine-core floor to 3.24.4 (closes #8066) PolicyEngine/policyengine-us#8078

Merged

2 tasks

policyengine bot mentioned this pull request Apr 19, 2026

Update PolicyEngine UK to 2.88.6 PolicyEngine/policyengine-api#3483

Open

MaxGhenis merged 2 commits into master from fix-invalidate-preserves-user-inputs
