Support heterogeneous state sets in initial conditions#315
Merged
Conversation
Shock grid states (Rouwenhorst, Uniform, etc.) are continuous states that should be accepted as float columns in the DataFrame. Previously they were rejected as "unknown columns" because _collect_all_state_names excluded _ShockGrid instances. Split state name collection into required (non-shock states + age) and optional (shock grid states). Shock columns are accepted but not required, since the model draws fresh shock values each period. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shock grid states (Rouwenhorst, Uniform, etc.) are continuous states that should be accepted as float columns in the DataFrame. Previously they were rejected as "unknown columns" because _collect_all_state_names excluded _ShockGrid instances. Split state name collection into required (non-shock states + age) and optional (shock grid states). Shock columns are accepted but not required, since the model draws fresh shock values each period. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The type-assignment loop set discount type for indices 0–7 (low education) but not for indices 8–15 (high education). All high-education agents were incorrectly assigned to the low discount factor type. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fixed_params passed pd.Series raw to functools.partial, causing JAX TypeError during tracing. Runtime params already auto-convert via _maybe_convert_series. Add the same conversion to the fixed_params path using a callback pattern to avoid circular imports between model.py and model_processing.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fd2de49 to
eae2ad9
Compare
e243a84 to
983de99
Compare
Partition active target regimes into complete (have all required stochastic transitions) and incomplete (missing transitions, thus unreachable). Only build continuation value functions for complete targets. Guard at runtime that incomplete targets have zero transition probability. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cache JIT-compiled functions in ~/.cache/jax so subsequent runs skip compilation. Add JAX Settings section to installation docs with HPC override instructions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
eae2ad9 to
9bb0f92
Compare
- Skip columns that aren't states of the current regime in initial_conditions_from_dataframe (fixes NA/string conversion errors) - Pre-allocate result arrays with NaN so unused slots surface bugs - Validate discrete state codes per-regime only - Add diagnostic context to feasibility check TypeErrors - Add test for heterogeneous discrete grids across regimes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
983de99 to
c670e40
Compare
'*' glob does not match branch names with '/' (e.g. fix/foo). Change to '**' so stacked PRs get CI runs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benchmark comparison (main → HEAD)Comparing
|
- Replace jax.debug.callback with jax.experimental.io_callback so the zero-probability guard reliably propagates exceptions under JIT - Fix comment that claimed incomplete targets are "unreachable" — they are assumed to have zero probability, validated at runtime - Add docstring and Mapping annotation to _check_zero_probs - Add test for incomplete target partition and zero-prob validation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When per-target transitions exist, derive the set of reachable targets from their keys and skip unreachable regimes. This prevents spurious transition entries (e.g., tied targets from a retiree source) that have shock transitions but missing non-shock stochastic transitions, causing shape mismatches in Q_and_F. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When per-target transitions exist, derive the set of reachable targets from their keys and skip unreachable regimes. This prevents spurious transition entries (e.g., tied targets from a retiree source) that have shock transitions but missing non-shock stochastic transitions, causing shape mismatches in Q_and_F. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
For per-target stochastic transitions that cross grid sizes (e.g. 3-state health → 2-state health), the outcome axis must use the target regime's grid, not the source's. Without this fix, the converted transition probability array has the wrong last dimension, causing a shape mismatch in jnp.average during Q_and_F evaluation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shock states were made optional in the initial commit, but this is wrong: AR(1) shocks depend on the current value for transitions, and observed persistent shocks (e.g., wage residuals) represent real data. Making them optional would silently fill with NaN. Simplify _collect_state_names to return a single set including all states (shocks and non-shocks alike). This is consistent with validate_initial_conditions in simulate(), which already requires them. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nditions # Conflicts: # tests/test_regime_state_mismatch.py
Review follow-ups on #316: - Remove the `target == regime_name` skip in _validate_no_reachable_ incomplete_targets; omitting the self-entry in a per-target dict is a common user error that must be caught. - Report all missing state transitions (not just stochastic) when a target is entirely absent from `transitions`; soften wording to cover sources that use only simple transitions. - Remove the dead `Q_and_F.incomplete_targets` attribute; it was never read and `jit` would strip it anyway. - Replace stale "NaN-poisoning" / "HLO independent of the partition" comments with accurate notes on trace-time resolution and pre-solve validation. - Add a NaN-free assertion to test_incomplete_target_zero_prob_succeeds (previously only checked that solve didn't raise). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`Path.home()` raises `RuntimeError` when HOME/USERPROFILE is unset, which happens in some HPC container setups — exactly the environments where a user would want to override the cache directory. Previously this crashed at import time, making the docs instruction to "set the variable before importing lcm" impossible to follow. Guard with try/except and skip the default when unavailable. Also switch from `setdefault` (which evaluates the default eagerly) to `if not os.environ.get(...)`, so the cost is only paid when needed and an explicitly-empty value doesn't count as "set". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Missing-state error now lists which regime(s) require each missing column — essential diagnostic when regimes have heterogeneous state sets. Previously "Missing required state columns: ['status']" gave no hint that only some regimes need it. - Rename "Unknown columns" message to clarify it's about columns that don't match any state of an *initial* regime. - Move `_INT32_SENTINEL` from a function-local to a module-level constant. It was reinitialised on every call and not reusable in tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Replace `isinstance(grids.get(shock), _ShockGrid)` guard (which ty cannot
narrow back into `grids[shock]`) with a walrus-assigned narrowing
`isinstance(grid := grids.get(shock), _ShockGrid)` so ty sees a single
narrowed `_ShockGrid` value. Drops the `ty: ignore[invalid-assignment]`.
- Type regime-name strings as `RegimeName` in `processing.py`:
- `states_per_regime: dict[RegimeName, set[str]]`
- `_build_solve_functions` / `_build_simulate_functions` `regime_name`
parameters
- `_extract_transitions_from_regime` / `_get_reachable_targets`
`states_per_regime` parameter
- `_get_reachable_targets` return type `set[RegimeName]`
- `target_shock_grids` key type `tuple[RegimeName, str]`
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Use `regime_name` instead of `name` in the `all_active_next_period` enumeration and the subsequent complete-targets filter, so the intent of the loop variable is obvious without having to trace its source. Also tighten `complete_targets` to `list[RegimeName]`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mention `validate_regime_transitions_all_periods` explicitly — the entry point that invokes `_validate_no_reachable_incomplete_targets` — so a reader tracing the flow does not have to grep. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`_validate_regime_transition_single` and `_validate_no_reachable_incomplete_targets` both received `internal_regime` alongside `internal_regimes` and `regime_name`, and at every call site `internal_regime == internal_regimes[regime_name]`. Derive it from `internal_regimes` inside the functions. Also tighten `active_regimes_next_period: tuple[str, ...]` → `tuple[RegimeName, ...]`, related `regime_name: str` → `RegimeName`, and rename the `for target in active_regimes_next_period` loop variable to `target_regime_name` for clarity. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The helpers (`_next_health_3to3`, `_next_wealth`, constants) and the two new tests (`test_complete_per_target_stochastic_cross_grid`, `test_incomplete_per_target_unreachable_target`) were duplicated in a block inserted before `test_discrete_state_same_count_different_names`. Move them to match main's layout: - `_next_health_3to3`, `_next_wealth`, constants stay in their main-side location (after `test_both_ordered_contradictory_raises`); add only the two new helpers `_next_health_3to2` and `_next_health_2to2` next to them. - Both new tests now live at the end of the file, alongside the existing `test_incomplete_per_target_reachable_target`. Pure reshuffle — no behavioural change; tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Member
Author
Code reviewNo issues found. Checked for bugs and CLAUDE.md compliance. 🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
- Consolidate int32 sentinel: pandas_utils now imports MISSING_CAT_CODE
from simulation.initial_conditions instead of redefining it.
- Add PSEUDO_STATE_NAMES = {"age"} in lcm.ages and use it uniformly in
the data-loader and the validator so the two paths stay in sync.
- Drop the dead regime_name == "age" skip in _validate_state_columns.
- Rephrase the required-by fallback message; new test covers shock-grid
+ heterogeneous state sets.
- Add -> None to two new tests; document that
_raise_feasibility_type_error always raises.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Support models where different regimes have different state sets in
initial_conditions_from_dataframe— e.g., aclaimed_ssstate that onlyexists for retired subjects.
pandas_utils.py:initial_conditions_from_dataframe: skip columns that aren't states ofthe current regime (fixes NA/string conversion errors for regime-specific
states).
MISSING_CAT_CODE(
jnp.iinfo(jnp.int32).min, imported fromsimulation.initial_conditions— single source of truth) before the int32 cast, so JAX indexing
produces obviously wrong values instead of silently selecting the last
element.
_validate_state_columns: "Missing required state columns" error nameswhich regime(s) require each missing column. The message for
pseudo-states vs regime-required vs fallback cases is now formatted via
_format_missing_state_detail; "Unknown columns" clarifies it's aboutcolumns not matching any state of an initial regime.
simulation/initial_conditions.py:_validate_discrete_state_values: validate each discrete state only forsubjects in regimes that actually have it. Previously the sentinel for
missing states was falsely rejected.
_raise_feasibility_type_error(-> Never): adds diagnostic contextwhen a feasibility check
TypeErroris likely caused by a discretestate with the wrong dtype (e.g. float where int is expected).
ages.py:PSEUDO_STATE_NAMES = frozenset({"age"})constant.Used by both the validator and the data-loader so the two paths stay
in sync on what counts as a pseudo-state.
Test plan
test_initial_conditions_heterogeneous_state_sets— regimes withdifferent state sets.
test_initial_conditions_shock_grid_heterogeneous_state_sets— oneregime has a shock state, another does not.
test_regime_state_mismatch.py.tyclean;prekclean.Generated with Claude Code.