Skip to content

Support heterogeneous state sets in initial conditions#315

Merged
hmgaudecker merged 136 commits into
mainfrom
fix/heterogeneous-initial-conditions
Apr 18, 2026
Merged

Support heterogeneous state sets in initial conditions#315
hmgaudecker merged 136 commits into
mainfrom
fix/heterogeneous-initial-conditions

Conversation

@hmgaudecker
Copy link
Copy Markdown
Member

@hmgaudecker hmgaudecker commented Apr 9, 2026

Summary

Support models where different regimes have different state sets in
initial_conditions_from_dataframe — e.g., a claimed_ss state that only
exists for retired subjects.

pandas_utils.py:

  • initial_conditions_from_dataframe: skip columns that aren't states of
    the current regime (fixes NA/string conversion errors for regime-specific
    states).
  • Pre-allocate result arrays with NaN so unused slots surface bugs.
  • Replace remaining NaN in discrete columns with MISSING_CAT_CODE
    (jnp.iinfo(jnp.int32).min, imported from simulation.initial_conditions
    — single source of truth) before the int32 cast, so JAX indexing
    produces obviously wrong values instead of silently selecting the last
    element.
  • _validate_state_columns: "Missing required state columns" error names
    which regime(s) require each missing column. The message for
    pseudo-states vs regime-required vs fallback cases is now formatted via
    _format_missing_state_detail; "Unknown columns" clarifies it's about
    columns not matching any state of an initial regime.

simulation/initial_conditions.py:

  • _validate_discrete_state_values: validate each discrete state only for
    subjects in regimes that actually have it. Previously the sentinel for
    missing states was falsely rejected.
  • _raise_feasibility_type_error (-> Never): adds diagnostic context
    when a feasibility check TypeError is likely caused by a discrete
    state with the wrong dtype (e.g. float where int is expected).

ages.py:

  • New module-level PSEUDO_STATE_NAMES = frozenset({"age"}) constant.
    Used by both the validator and the data-loader so the two paths stay
    in sync on what counts as a pseudo-state.

Test plan

  • test_initial_conditions_heterogeneous_state_sets — regimes with
    different state sets.
  • test_initial_conditions_shock_grid_heterogeneous_state_sets — one
    regime has a shock state, another does not.
  • Heterogeneous discrete grids in test_regime_state_mismatch.py.
  • Full test suite passes; ty clean; prek clean.

Generated with Claude Code.

hmgaudecker and others added 7 commits April 8, 2026 19:28
Shock grid states (Rouwenhorst, Uniform, etc.) are continuous states that
should be accepted as float columns in the DataFrame. Previously they were
rejected as "unknown columns" because _collect_all_state_names excluded
_ShockGrid instances.

Split state name collection into required (non-shock states + age) and
optional (shock grid states). Shock columns are accepted but not required,
since the model draws fresh shock values each period.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shock grid states (Rouwenhorst, Uniform, etc.) are continuous states that
should be accepted as float columns in the DataFrame. Previously they were
rejected as "unknown columns" because _collect_all_state_names excluded
_ShockGrid instances.

Split state name collection into required (non-shock states + age) and
optional (shock grid states). Shock columns are accepted but not required,
since the model draws fresh shock values each period.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The type-assignment loop set discount type for indices 0–7 (low
education) but not for indices 8–15 (high education). All high-education
agents were incorrectly assigned to the low discount factor type.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fixed_params passed pd.Series raw to functools.partial, causing JAX
TypeError during tracing. Runtime params already auto-convert via
_maybe_convert_series. Add the same conversion to the fixed_params
path using a callback pattern to avoid circular imports between
model.py and model_processing.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@hmgaudecker hmgaudecker changed the title Support heterogeneous state sets in initial conditions Improve initial conditions: heterogeneous state sets and validation Apr 9, 2026
@hmgaudecker hmgaudecker force-pushed the improve/jax-settings branch from fd2de49 to eae2ad9 Compare April 10, 2026 05:02
@hmgaudecker hmgaudecker force-pushed the fix/heterogeneous-initial-conditions branch from e243a84 to 983de99 Compare April 10, 2026 05:03
@read-the-docs-community
Copy link
Copy Markdown

read-the-docs-community Bot commented Apr 10, 2026

hmgaudecker and others added 2 commits April 10, 2026 07:08
Partition active target regimes into complete (have all required
stochastic transitions) and incomplete (missing transitions, thus
unreachable). Only build continuation value functions for complete
targets. Guard at runtime that incomplete targets have zero transition
probability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cache JIT-compiled functions in ~/.cache/jax so subsequent runs skip
compilation. Add JAX Settings section to installation docs with HPC
override instructions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@hmgaudecker hmgaudecker force-pushed the improve/jax-settings branch from eae2ad9 to 9bb0f92 Compare April 10, 2026 05:09
- Skip columns that aren't states of the current regime in
  initial_conditions_from_dataframe (fixes NA/string conversion errors)
- Pre-allocate result arrays with NaN so unused slots surface bugs
- Validate discrete state codes per-regime only
- Add diagnostic context to feasibility check TypeErrors
- Add test for heterogeneous discrete grids across regimes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@hmgaudecker hmgaudecker force-pushed the fix/heterogeneous-initial-conditions branch from 983de99 to c670e40 Compare April 10, 2026 05:09
hmgaudecker and others added 3 commits April 10, 2026 10:51
'*' glob does not match branch names with '/' (e.g. fix/foo).
Change to '**' so stacked PRs get CI runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 10, 2026

Benchmark comparison (main → HEAD)

Comparing 719b5334 (main) → fabcd872 (HEAD)

Benchmark Statistic before after Ratio Alert
Mahler-Yum execution time 3.71±0.01s 3.67±0.01s 0.99
peak GPU mem 262M 262M 1.00
compilation time 1.92m 1.92m 1.00
peak CPU mem 2.24G 2.23G 0.99
Mortality execution time 270±2ms 275±10ms 1.02
peak GPU mem 542M 542M 1.00
compilation time 10.7s 10.6s 0.99
peak CPU mem 1.25G 1.25G 1.00
Precautionary Savings - Solve execution time 47.5±2ms 43.2±3ms 0.91
peak GPU mem 8.44M 8.44M 1.00
compilation time 4.94s 5.05s 1.02
peak CPU mem 1.07G 1.07G 1.00
Precautionary Savings - Simulate execution time 143±2ms 139±0.1ms 0.97
peak GPU mem 138M 138M 1.00
compilation time 7.08s 7.12s 1.01
peak CPU mem 1.2G 1.2G 1.00
Precautionary Savings - Solve & Simulate execution time 167±1ms 172±6ms 1.03
peak GPU mem 565M 565M 1.00
compilation time 11.3s 11.1s 0.98
peak CPU mem 1.22G 1.2G 0.99
Precautionary Savings - Solve & Simulate (irreg) execution time 308±1ms 306±1ms 0.99
peak GPU mem 2.18G 2.18G 1.00
compilation time 12.2s 12.1s 0.99
peak CPU mem 1.27G 1.27G 1.00

hmgaudecker and others added 10 commits April 10, 2026 12:46
- Replace jax.debug.callback with jax.experimental.io_callback so the
  zero-probability guard reliably propagates exceptions under JIT
- Fix comment that claimed incomplete targets are "unreachable" — they
  are assumed to have zero probability, validated at runtime
- Add docstring and Mapping annotation to _check_zero_probs
- Add test for incomplete target partition and zero-prob validation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When per-target transitions exist, derive the set of reachable targets
from their keys and skip unreachable regimes. This prevents spurious
transition entries (e.g., tied targets from a retiree source) that have
shock transitions but missing non-shock stochastic transitions, causing
shape mismatches in Q_and_F.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When per-target transitions exist, derive the set of reachable targets
from their keys and skip unreachable regimes. This prevents spurious
transition entries (e.g., tied targets from a retiree source) that have
shock transitions but missing non-shock stochastic transitions, causing
shape mismatches in Q_and_F.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
For per-target stochastic transitions that cross grid sizes (e.g.
3-state health → 2-state health), the outcome axis must use the
target regime's grid, not the source's. Without this fix, the
converted transition probability array has the wrong last dimension,
causing a shape mismatch in jnp.average during Q_and_F evaluation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shock states were made optional in the initial commit, but this is
wrong: AR(1) shocks depend on the current value for transitions, and
observed persistent shocks (e.g., wage residuals) represent real data.
Making them optional would silently fill with NaN.

Simplify _collect_state_names to return a single set including all
states (shocks and non-shocks alike). This is consistent with
validate_initial_conditions in simulate(), which already requires them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hmgaudecker and others added 20 commits April 16, 2026 20:35
…nditions

# Conflicts:
#	tests/test_regime_state_mismatch.py
Review follow-ups on #316:
- Remove the `target == regime_name` skip in _validate_no_reachable_
  incomplete_targets; omitting the self-entry in a per-target dict is a
  common user error that must be caught.
- Report all missing state transitions (not just stochastic) when a
  target is entirely absent from `transitions`; soften wording to cover
  sources that use only simple transitions.
- Remove the dead `Q_and_F.incomplete_targets` attribute; it was never
  read and `jit` would strip it anyway.
- Replace stale "NaN-poisoning" / "HLO independent of the partition"
  comments with accurate notes on trace-time resolution and pre-solve
  validation.
- Add a NaN-free assertion to test_incomplete_target_zero_prob_succeeds
  (previously only checked that solve didn't raise).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`Path.home()` raises `RuntimeError` when HOME/USERPROFILE is unset, which
happens in some HPC container setups — exactly the environments where a
user would want to override the cache directory. Previously this crashed
at import time, making the docs instruction to "set the variable before
importing lcm" impossible to follow.

Guard with try/except and skip the default when unavailable. Also switch
from `setdefault` (which evaluates the default eagerly) to `if not
os.environ.get(...)`, so the cost is only paid when needed and an
explicitly-empty value doesn't count as "set".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Missing-state error now lists which regime(s) require each missing
  column — essential diagnostic when regimes have heterogeneous state
  sets. Previously "Missing required state columns: ['status']" gave
  no hint that only some regimes need it.
- Rename "Unknown columns" message to clarify it's about columns that
  don't match any state of an *initial* regime.
- Move `_INT32_SENTINEL` from a function-local to a module-level
  constant. It was reinitialised on every call and not reusable in
  tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Replace `isinstance(grids.get(shock), _ShockGrid)` guard (which ty cannot
  narrow back into `grids[shock]`) with a walrus-assigned narrowing
  `isinstance(grid := grids.get(shock), _ShockGrid)` so ty sees a single
  narrowed `_ShockGrid` value. Drops the `ty: ignore[invalid-assignment]`.
- Type regime-name strings as `RegimeName` in `processing.py`:
  - `states_per_regime: dict[RegimeName, set[str]]`
  - `_build_solve_functions` / `_build_simulate_functions` `regime_name`
    parameters
  - `_extract_transitions_from_regime` / `_get_reachable_targets`
    `states_per_regime` parameter
  - `_get_reachable_targets` return type `set[RegimeName]`
  - `target_shock_grids` key type `tuple[RegimeName, str]`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Use `regime_name` instead of `name` in the `all_active_next_period`
enumeration and the subsequent complete-targets filter, so the intent
of the loop variable is obvious without having to trace its source.
Also tighten `complete_targets` to `list[RegimeName]`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mention `validate_regime_transitions_all_periods` explicitly — the entry
point that invokes `_validate_no_reachable_incomplete_targets` — so a
reader tracing the flow does not have to grep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`_validate_regime_transition_single` and
`_validate_no_reachable_incomplete_targets` both received `internal_regime`
alongside `internal_regimes` and `regime_name`, and at every call site
`internal_regime == internal_regimes[regime_name]`. Derive it from
`internal_regimes` inside the functions.

Also tighten `active_regimes_next_period: tuple[str, ...]` →
`tuple[RegimeName, ...]`, related `regime_name: str` → `RegimeName`, and
rename the `for target in active_regimes_next_period` loop variable to
`target_regime_name` for clarity.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Base automatically changed from improve/jax-settings to main April 17, 2026 04:16
hmgaudecker and others added 2 commits April 17, 2026 06:17
The helpers (`_next_health_3to3`, `_next_wealth`, constants) and the two
new tests (`test_complete_per_target_stochastic_cross_grid`,
`test_incomplete_per_target_unreachable_target`) were duplicated in a
block inserted before `test_discrete_state_same_count_different_names`.

Move them to match main's layout:
- `_next_health_3to3`, `_next_wealth`, constants stay in their main-side
  location (after `test_both_ordered_contradictory_raises`); add only
  the two new helpers `_next_health_3to2` and `_next_health_2to2` next
  to them.
- Both new tests now live at the end of the file, alongside the
  existing `test_incomplete_per_target_reachable_target`.

Pure reshuffle — no behavioural change; tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member Author

@hmgaudecker hmgaudecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Autoreview.

@hmgaudecker
Copy link
Copy Markdown
Member Author

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

- Consolidate int32 sentinel: pandas_utils now imports MISSING_CAT_CODE
  from simulation.initial_conditions instead of redefining it.
- Add PSEUDO_STATE_NAMES = {"age"} in lcm.ages and use it uniformly in
  the data-loader and the validator so the two paths stay in sync.
- Drop the dead regime_name == "age" skip in _validate_state_columns.
- Rephrase the required-by fallback message; new test covers shock-grid
  + heterogeneous state sets.
- Add -> None to two new tests; document that
  _raise_feasibility_type_error always raises.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member Author

@hmgaudecker hmgaudecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Autoreview.

@hmgaudecker hmgaudecker merged commit 6af04b2 into main Apr 18, 2026
10 checks passed
@hmgaudecker hmgaudecker deleted the fix/heterogeneous-initial-conditions branch April 18, 2026 04:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant