Skip to content

Preserve set_input values across apply_reform#475

Merged
MaxGhenis merged 2 commits intomasterfrom
fix-invalidate-preserves-user-inputs
Apr 18, 2026
Merged

Preserve set_input values across apply_reform#475
MaxGhenis merged 2 commits intomasterfrom
fix-invalidate-preserves-user-inputs

Conversation

@MaxGhenis
Copy link
Copy Markdown
Contributor

@MaxGhenis MaxGhenis commented Apr 17, 2026

Summary

Closes #473 (remaining piece), fixes PolicyEngine/policyengine.py#1628, unblocks PolicyEngine/policyengine-us#8058.

Two related bugs prevent country packages that build a situation/dataset then apply a structural reform during Simulation.__init__ (the policyengine-uk and policyengine-us pattern) from working under core 3.24.x. Plus a latent bug in GroupPopulation.clone that surfaced while wiring up the fix.

The problem

Wipe bug. The H3 fix (#463) added _invalidate_all_caches to wipe holder._memory_storage._arrays for every variable after apply_reform, so formula output caches couldn't survive a reform that invalidated them. But user-provided set_input values share the same storage, so they got wiped too. Surfaced as policyengine-uk household-impact tests returning 0 because age, employment_income, would_claim_* were wiped before any calculation ran (#1628).

Nested-branch fallback bug. The C1 fix (#457) correctly stopped Holder.get_array from returning values stored under an arbitrary sibling branch — that had caused silent reform↔baseline cross-contamination. But it only fell back to the default branch, which broke nested-branch patterns. policyengine-us uses a two-level pattern: tax_liability_if_itemizing sets tax_unit_itemizes=True on an itemizing branch, and ctc_limiting_tax_liability forks a no_salt sub-branch from it. Without parent-branch fallback, tax_unit_itemizes re-runs its formula on no_salt, which calls tax_liability_if_itemizing again, producing an infinite recursion (surfaced as RecursionError and a CycleError).

Latent GroupPopulation.clone bug. GroupPopulation.clone called holder.clone(self) — passing the source population rather than the cloned population. Holder.clone then set new_dict["simulation"] = population.simulation, pointing the cloned holder's .simulation back at the original simulation. Branch-aware lookups therefore started from the wrong simulation.

Fixes

  1. Simulation._invalidate_all_caches preserves user inputs. Track every (variable, branch, period) populated via Simulation.set_input OR Holder.set_input (the latter covers the Simulation(situation=...) path through SimulationBuilder.finalize_variables_init). On invalidation: snapshot those entries, wipe everything, replay the snapshotted inputs back. Formula-output caches are still invalidated; user inputs survive.

  2. Holder.get_array walks simulation.parent_branch. Walk up the parent-branch chain before falling back to default. Sibling branches (no parent relationship) still don't leak into each other — the C1 guarantee holds.

  3. GroupPopulation.clone passes the cloned population to holders. holder.clone(result) instead of holder.clone(self).

The _user_input_keys attribute is lazy-initialised inside both Simulation.set_input and Holder.set_input so country-package subclasses that bypass super().__init__ (same pattern _fast_cache was guarded for in #474) automatically pick up the preservation without downstream code changes.

Tests

tests/core/test_apply_reform_preserves_user_inputs.py:

  • a set_input value survives a no-op reform
  • multiple set_input values across different variables all survive
  • a situation-dict input survives a reform (covers the holder.set_input path via SimulationBuilder.finalize_variables_init, not just Simulation.set_input)
  • the H3 fix (formula cache invalidation) still holds — neutralize_variable still drops the cached formula output

tests/core/test_holder_branch_fallback.py:

  • (pre-existing) sibling-branch cross-contamination stays blocked
  • (pre-existing) fallback to the default branch still works
  • new: get_array on a nested branch walks parent_branch and returns the ancestor's value
  • new: GroupPopulation.clone points each holder's .simulation at the clone, not the source

All three existing test_fast_cache_guards.py tests (bare-Simulation case) still pass because of the hasattr-gated lazy init.

End-to-end verification

Installing this branch into a policyengine.py checkout with policyengine-uk==2.88.0:

Test Before After
test_single_adult_with_employment_income (income_tax > 0) 0.0 (FAIL) £7,486 ✓
test_single_adult_with_employment_income (NI > 0) 0.0 (FAIL) £2,994 ✓
test_family_with_children (child_benefit > 0) 0.0 (FAIL) £2,328 ✓

Installing this branch into a policyengine-us==1.647.0 checkout (core 3.24.1 floor):

  • tax_unit_itemizes.yaml integration test that crashed with TypeError: int() argument ... not 'NoneType' → 7/7 pass.

Test plan

  • tests/core/test_apply_reform_preserves_user_inputs.py — 4/4 pass
  • tests/core/test_holder_branch_fallback.py — 4/4 pass
  • tests/core/test_apply_reform_invalidates_cache.py (H3 regression) — 1/1 still passes
  • tests/core/test_fast_cache_guards.py — 3/3 still pass
  • tests/core/ full suite — 514/515 pass (1 pre-existing test_parameter_security failure unrelated to these changes)
  • uvx ruff format --check / uvx ruff check clean
  • End-to-end with policyengine-uk 2.88.0 + policyengine.py: failing household-impact tests now pass
  • End-to-end with policyengine-us 1.647.0: tax_unit_itemizes.yaml passes
  • CI passes

The H3 fix (PR #463) added `_invalidate_all_caches` to wipe
`holder._memory_storage._arrays` for every variable after
`apply_reform`, so formula output caches couldn't survive a reform
that invalidated them. But user-provided `set_input` values share the
same storage, so they got wiped too.

Country packages (e.g. `policyengine_uk.Simulation.__init__`) follow
this pattern:

  1. build populations, call `self.set_input(...)` from the dataset
  2. apply a structural reform derived from parameters
  3. calculate downstream variables

Step 2's `apply_reform` triggered `_invalidate_all_caches`, silently
discarding the dataset loaded in step 1. Surfaced in
PolicyEngine/policyengine.py#1628 (UK household-impact tests
returning 0 because age, employment_income, would_claim_* — every
single input — were wiped before any calculation ran).

Fix: track every `(variable, branch, period)` populated via
`Simulation.set_input` in a new `_user_input_keys` set. On
`_invalidate_all_caches`, snapshot those entries from storage,
perform the wipe, then replay them back. Formula-output caches are
still invalidated; user inputs survive.

The attribute is lazy-initialised inside `set_input` so country-
package subclasses that bypass `super().__init__` (the same pattern
`_fast_cache` was guarded for in PR #474) automatically pick up the
preservation without a downstream code change.

Three new tests under `tests/core/test_apply_reform_preserves_user_inputs.py`:
- a set_input value survives a no-op reform
- multiple set_input values survive
- the H3 fix (formula cache invalidation) still holds — a
  neutralize_variable reform still drops the cached formula output

End-to-end: installing this branch into a policyengine.py checkout
with `policyengine-uk==2.88.0` turns the failing
`TestUKHouseholdImpact.test_single_adult_with_employment_income`
(0.0 > 0) into a passing £7,486 income tax on a £50k salary, and
`test_family_with_children` into a passing £2,328 child benefit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…input

Two follow-up fixes on top of the ``Simulation.set_input`` preservation
in this PR, plus a latent bug that surfaced while wiring them up.
Together they unblock ``policyengine-us`` bumping its core floor past
3.24.0 (``PolicyEngine/policyengine-us#8066``).

## 1. ``Holder.set_input`` now also records ``_user_input_keys``

The previous patch only recorded keys inside ``Simulation.set_input``.
But ``SimulationBuilder.finalize_variables_init`` (the path taken
when ``Simulation(situation=...)`` is called) routes inputs through
``holder.set_input`` directly, bypassing the simulation-level hook.
So every situation-dict input got wiped by the post-``apply_reform``
cache invalidation in country-package subclasses that apply a
structural reform during construction (the ``policyengine-us``
pattern). Recording the key inside ``Holder.set_input`` covers both
paths: ``Simulation.set_input`` still adds its own entry (harmless
duplicate in a set), and ``holder.set_input`` picks up the situation-
dict and dataset loader paths.

## 2. ``Holder.get_array`` walks ``simulation.parent_branch``

The C1 fix (``fix-holder-get-array-branch-leak``) correctly stopped
``get_array`` from returning values stored under an arbitrary sibling
branch — that had caused silent reform↔baseline cross-contamination.
But it only fell back to the ``default`` branch, so a nested branch
(``no_salt`` cloned from ``itemizing``) could no longer read inputs
set on its parent. ``policyengine-us`` uses that two-level pattern:
``tax_liability_if_itemizing`` sets ``tax_unit_itemizes=True`` on an
``itemizing`` branch, and ``ctc_limiting_tax_liability`` forks a
``no_salt`` sub-branch from it. Without parent-branch fallback,
``tax_unit_itemizes`` re-runs its formula on ``no_salt``, which calls
``tax_liability_if_itemizing`` again, producing a ``CycleError`` →
eventually surfaced as a recursion exception.

The fix walks ``simulation.parent_branch`` up to the root and returns
the first ancestor that has a value. Sibling branches (no parent
relationship) still don't leak into each other — the C1 guarantee
holds.

## 3. ``GroupPopulation.clone`` passes the cloned population to holders

Latent bug that surfaced while fixing #2: ``GroupPopulation.clone``
was calling ``holder.clone(self)`` — passing the *source* population
to each holder. ``Holder.clone`` then set
``new_dict["simulation"] = population.simulation``, pointing the
cloned holder's ``.simulation`` reference back at the original sim
rather than the clone. That meant a holder on the ``no_salt`` clone
thought it belonged to the ``itemizing`` simulation, so the
``parent_branch`` walk started from the wrong simulation and missed
the ancestor's inputs. Pass ``result`` (the cloned population) so
the holder's ``.simulation`` points at the clone.

## Tests

- ``test_apply_reform_preserves_situation_dict_inputs`` — covers the
  ``Simulation(situation=...)`` path that bypasses
  ``Simulation.set_input`` (fails without #1).
- ``test_get_array_falls_back_through_parent_branch_chain`` — covers
  nested-branch parent inheritance (fails without #2).
- ``test_group_population_clone_sets_holder_simulation_to_clone`` —
  pins the cloned holder's ``.simulation`` to the clone (fails
  without #3).

All existing core tests still pass (514 pass, 1 pre-existing parameter
security failure unrelated to these changes). The
``tax_unit_itemizes.yaml`` integration test (7/7) and the full
``gov/irs/income/taxable_income`` suite (253/253) in
``policyengine-us`` 1.647.0 pass under this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@MaxGhenis MaxGhenis force-pushed the fix-invalidate-preserves-user-inputs branch from 1ce8b05 to 745ffdf Compare April 18, 2026 00:06
@MaxGhenis MaxGhenis merged commit 4397550 into master Apr 18, 2026
22 checks passed
@MaxGhenis MaxGhenis deleted the fix-invalidate-preserves-user-inputs branch April 18, 2026 00:10
MaxGhenis added a commit to MaxGhenis/policyengine-us that referenced this pull request Apr 18, 2026
Core 3.24.4 ships the fixes from PolicyEngine/policyengine-core#475:
- `Simulation.apply_reform` preserves `set_input` values (the H3
  cache invalidation used to wipe user-provided dataset inputs)
- `Holder.get_array` walks up `simulation.parent_branch` before
  falling back to `default`, so nested branches (e.g. `no_salt`
  under `itemizing`) still see ancestor inputs
- `GroupPopulation.clone` passes the cloned population to
  holder.clone so branch-aware lookups resolve correctly

Together these unblock the strict breakdown/children validator
added in core 3.24.0. Before 3.24.4, bumping the floor triggered
`TypeError: int() argument ... not 'NoneType'` in the
`tax_unit_itemizes.yaml` integration test (state_fips was wiped
by apply_reform) plus an infinite recursion in
`tax_liability_if_itemizing`.

With 3.24.4 (or later), the existing `test_system_import.py`
regression test now actually catches breakdown/children mismatches
at CI time — the class of bug that caused issue PolicyEngine#8055 and the
three partial-fix patches PolicyEngine#8045, PolicyEngine#8049, PolicyEngine#8051. Verified locally
against core 3.25.0: all 794 IRS tests pass.

Also remove the `[tool.uv]` environments restriction
(`python_version >= '3.11'`) since core 3.24+ supports 3.9/3.10
directly on PyPI — the lockfile now resolves cleanly across the
full 3.9-3.14 range.

Closes PolicyEngine#8066.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MaxGhenis added a commit to PolicyEngine/policyengine-uk that referenced this pull request Apr 18, 2026
PE-core 3.24.0-3.24.3 introduced a cache-invalidation cascade that
wiped set_input values whenever a reform was applied during simulation
construction. Test suite was green on 3.23.6 and broken on 3.24.x;
3.25.0 (PolicyEngine/policyengine-core#475) preserves user-provided
inputs across apply_reform while still invalidating formula-output
caches.

Verified locally: test_latest_data_smoke, test_lha_freeze,
test_parametric_reform_impacts, and test_uc_rebalancing all pass after
the bump.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MaxGhenis added a commit to PolicyEngine/policyengine.py that referenced this pull request Apr 18, 2026
PE-core 3.24.0-3.24.3 cache-invalidation wiped set_input values across
apply_reform, causing UK household-impact calculations to return zero
(#1628). 3.25.0 (PolicyEngine/policyengine-core#475) preserves user
inputs while still invalidating formula-output caches.

All 11 tests/test_household_impact.py cases pass on the new pin.

Closes #1628

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MaxGhenis added a commit to PolicyEngine/policyengine-uk that referenced this pull request Apr 18, 2026
PE-core 3.24.0-3.24.3 introduced a cache-invalidation cascade that
wiped set_input values whenever a reform was applied during simulation
construction. Test suite was green on 3.23.6 and broken on 3.24.x;
3.25.0 (PolicyEngine/policyengine-core#475) preserves user-provided
inputs across apply_reform while still invalidating formula-output
caches.

Verified locally: test_latest_data_smoke, test_lha_freeze,
test_parametric_reform_impacts, and test_uc_rebalancing all pass after
the bump.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MaxGhenis added a commit to PolicyEngine/policyengine.py that referenced this pull request Apr 18, 2026
…283)

PE-core 3.24.0-3.24.3 cache-invalidation wiped set_input values across
apply_reform, causing UK household-impact calculations to return zero
(#1628). 3.25.0 (PolicyEngine/policyengine-core#475) preserves user
inputs while still invalidating formula-output caches.

All 11 tests/test_household_impact.py cases pass on the new pin.

Closes #1628

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MaxGhenis added a commit to MaxGhenis/policyengine-core that referenced this pull request Apr 18, 2026
The 3.24.4-era implementation walked every variable in the
tax-benefit system via `get_holder(variable)`, which lazy-creates a
`Holder` for each one. With thousands of variables in policyengine-us,
this inflated `apply_reform` from milliseconds to seconds — the
YAML full-suite went from ~17 min to ~51 min per job and started
timing out at the 1-hour GitHub Actions limit.

Untouched variables have no holder and therefore nothing to wipe.
Iterate `population._holders.values()` on each population to hit
only the variables that actually exist, preserving the set_input
replay behaviour from PolicyEngine#475.

Verified against tests/core/test_apply_reform_preserves_user_inputs
and tests/core/test_apply_reform_invalidates_cache (all 5 pass).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MaxGhenis added a commit to MaxGhenis/policyengine-core that referenced this pull request Apr 18, 2026
The 3.24.4-era implementation walked every variable in the
tax-benefit system via `get_holder(variable)`, which lazy-creates a
`Holder` for each one. With thousands of variables in policyengine-us,
this inflated `apply_reform` from milliseconds to seconds — the
YAML full-suite went from ~17 min to ~51 min per job and started
timing out at the 1-hour GitHub Actions limit.

Untouched variables have no holder and therefore nothing to wipe.
Iterate `population._holders.values()` on each population to hit
only the variables that actually exist, preserving the set_input
replay behaviour from PolicyEngine#475.

Verified against tests/core/test_apply_reform_preserves_user_inputs
and tests/core/test_apply_reform_invalidates_cache (all 5 pass).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MaxGhenis added a commit that referenced this pull request Apr 18, 2026
The 3.24.4-era implementation walked every variable in the
tax-benefit system via `get_holder(variable)`, which lazy-creates a
`Holder` for each one. With thousands of variables in policyengine-us,
this inflated `apply_reform` from milliseconds to seconds — the
YAML full-suite went from ~17 min to ~51 min per job and started
timing out at the 1-hour GitHub Actions limit.

Untouched variables have no holder and therefore nothing to wipe.
Iterate `population._holders.values()` on each population to hit
only the variables that actually exist, preserving the set_input
replay behaviour from #475.

Verified against tests/core/test_apply_reform_preserves_user_inputs
and tests/core/test_apply_reform_invalidates_cache (all 5 pass).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MaxGhenis added a commit to MaxGhenis/policyengine-us that referenced this pull request Apr 18, 2026
Core 3.24.4 ships the fixes from PolicyEngine/policyengine-core#475:
- Simulation.apply_reform preserves set_input values
- Holder.get_array walks up simulation.parent_branch before default
- GroupPopulation.clone passes the cloned population to holder.clone

Together these unblock the strict breakdown/children validator
added in core 3.24.0. The existing test_system_import.py (added
in PolicyEngine#8058) now actually validates the parameter tree at CI time.
Verified locally against core 3.25.0: all 794 IRS tests pass.

Remove the [tool.uv] environments restriction — core 3.24+
supports 3.9/3.10 directly on PyPI.

Closes PolicyEngine#8066.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MaxGhenis added a commit to PolicyEngine/policyengine-us that referenced this pull request Apr 18, 2026
Core 3.24.4 ships the fixes from PolicyEngine/policyengine-core#475:
- Simulation.apply_reform preserves set_input values
- Holder.get_array walks up simulation.parent_branch before default
- GroupPopulation.clone passes the cloned population to holder.clone

Together these unblock the strict breakdown/children validator
added in core 3.24.0. The existing test_system_import.py (added
in #8058) now actually validates the parameter tree at CI time.
Verified locally against core 3.25.0: all 794 IRS tests pass.

Remove the [tool.uv] environments restriction — core 3.24+
supports 3.9/3.10 directly on PyPI.

Closes #8066.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3.24.1: Microsimulation has no attribute _fast_cache when set_input is called

1 participant