Accept pd.Series in params and pd.DataFrame in initial conditions#289

Merged
hmgaudecker merged 146 commits into main from general-purpose-series-converter on Mar 26, 2026

hmgaudecker (Member) commented Mar 24, 2026

Summary

@timmens was right in his comments on #272
— requiring a separate preprocessor call before solve() / simulate()
was too clunky. Took me a while to come around, but here we are: pandas
objects are now accepted directly.

  • model.solve() and model.simulate() accept pd.Series values in
    params — labeled Series with a named MultiIndex are converted to
    correctly shaped JAX arrays automatically.
  • model.simulate() accepts a pd.DataFrame as initial_conditions;
    regime name mapping and discrete label encoding happen transparently.
  • Add derived_categoricals parameter to solve() / simulate() for
    DAG function outputs not in the model's state/action grids.
  • Extract broadcast_to_template as the single params tree traversal
    primitive — process_params and _resolve_fixed_params become
    one-line wrappers.
  • AST-based _get_func_indexing_params — inspects function source for
    subscript patterns to determine array dimensions, with required
    array_param_name to avoid false positives.
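
To illustrate the first bullet, here is a minimal sketch of how a labeled Series with a named MultiIndex could be turned into a correctly shaped array. It is not the code from this PR: `series_to_array` and the `grids` dict are hypothetical names, the grids are made up, and NumPy stands in for JAX (a final `jnp.asarray(arr)` would give the JAX array).

```python
import numpy as np
import pandas as pd

# Hypothetical grids: level name -> ordered labels (list position = integer code).
grids = {
    "age": [50, 60],
    "health": ["bad", "good"],
}


def series_to_array(sr: pd.Series, grids: dict[str, list]) -> np.ndarray:
    """Scatter a labeled Series into a dense array ordered like `grids`."""
    level_names = list(sr.index.names)
    shape = tuple(len(grids[name]) for name in level_names)
    out = np.full(shape, np.nan)
    for labels, value in sr.items():
        # Translate each label into its integer position on the matching grid.
        codes = tuple(
            grids[name].index(label) for name, label in zip(level_names, labels)
        )
        out[codes] = value
    return out


# The Series may list labels in any order; positions come from the grids.
index = pd.MultiIndex.from_product(
    [[50, 60], ["good", "bad"]], names=["age", "health"]
)
sr = pd.Series([0.9, 0.1, 0.7, 0.3], index=index)
arr = series_to_array(sr, grids)
```

In this sketch `arr[0, 1]` holds the value for age 50 and health "good", so a function that indexes `arr[age, health]` with integer codes never needs to see the labels.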

Test plan

  • All 813 tests pass (pixi run -e tests-cpu tests)
  • Pre-commit hooks pass (prek run --all-files)
  • Type checking passes (pixi run -e type-checking ty)
  • Review auto-conversion in model.py (_maybe_convert_series,
    _maybe_convert_dataframe)
  • Review convert_series_in_params and array_from_series in
    pandas_utils.py
  • Review broadcast_to_template in params_processing.py

🤖 Generated with Claude Code

hmgaudecker and others added 24 commits March 23, 2026 20:05
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…_states_for_subjects

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Move function-level imports to top-level, add noqa for print statements,
fix line length in docstring, use keyword arg for jax.config.update.
Exclude tests/data/ from name-tests-test hook.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move jax.config.update("jax_enable_x64") back to module level (before
lcm_examples imports) to ensure 64-bit precision during model construction.
Regenerate all benchmark pickle files with correct precision.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…al_states

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…e_state_space_info

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the fragile `expected_levels` tuple and inferred-mode fallbacks with
a `param_path` parameter that resolves the parameter's position in the model
template. The function now inspects the owning function's signature to determine
indexing dimensions (states, actions, period) automatically.

Key changes:

- `array_from_series`: new API with `sr`, `model` (required), `param_path`
  (required 1-3 tuple). No more `data`, `ages`, `expected_levels`, or inferred
  mode. Always strict validation.

- `_LevelMapping` dataclass + `_scatter_series`: replace `_OutcomeMapping`,
  `_compute_shape`, `_map_labels_to_codes` with a clean abstraction. Both
  `array_from_series` and `_build_probs_array` (transition probs) use it.

- `_resolve_param_indexing`: resolves 1/2/3-part param paths by scanning
  model regimes, verifying consistent indexing across matching functions.

- NaN-fill in `_build_probs_array`: `np.full(shape, np.nan)` instead of
  `np.zeros`, making data gaps visible.

- `_resolve_categoricals`: now raises on conflicts instead of silently
  overriding model grids.

Removed: `_OutcomeMapping`, `_multiindex_series_to_array`,
`_age_series_to_array`, `_categorical_series_to_array`,
`_build_level_mappings_from_grids`, `_compute_shape`, `_map_labels_to_codes`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
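
A small sketch of why the NaN-fill in the commit above makes data gaps visible. The shape and the observed values are invented for illustration and are not taken from `_build_probs_array`.

```python
import numpy as np

shape = (2, 2)  # hypothetical: (health today, health tomorrow)
# Observed transition probabilities; the row for state 1 is missing from the data.
observed = {(0, 0): 0.8, (0, 1): 0.2}

zeros_filled = np.zeros(shape)
nan_filled = np.full(shape, np.nan)
for idx, prob in observed.items():
    zeros_filled[idx] = prob
    nan_filled[idx] = prob

# With zeros, the missing row is indistinguishable from "probability 0".
# With NaN, the gap can be detected before it propagates into a solution.
missing_rows = np.isnan(nan_filled).any(axis=1)
```

Here `missing_rows` flags only the second row, whereas `zeros_filled` carries no trace that anything was missing.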
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…idspecs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…idspecs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ctions

Introduce two explicit, phase-specific containers that replace the
fragile override pattern. Each consumer now knows which phase it
operates in. Eliminates simulate_overrides, with_simulate_overrides(),
and internal use of PhaseVariant for regime_transition_probs.

PhaseVariant is retained in the user-facing API (Regime.functions)
for PR 6.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- _get_func_indexing_params: drop regime parameter, use AST inspection to
  determine indexing params from subscript patterns in the function source.
  Raises TypeError for lambdas, ValueError for computed indices (with recipe
  to extract into a DAG function).

- _validate_probs_array_indexing → _validate_array_param_indexing: accept
  array_param_name and indexing_params instead of hardcoding "probs_array".

- _collect_probs_array_subscripts → _collect_subscripts: parameterized.

- Remove _get_indexing_params (replaced by AST-based _get_func_indexing_params).

- Remove regime param from _build_probs_array (no longer needed).

- Make all private functions keyword-only throughout pandas_utils.py and
  error_handling.py.

- Rename test file: test_validate_probs_indexing → test_validate_array_indexing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
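
The AST inspection described in the commit above can be sketched with the standard library alone. This is an illustration under assumptions, not the PR's `_get_func_indexing_params`: the helper name `subscript_index_names` is hypothetical, it works on a source string rather than a function object, it only looks at the first matching subscript, and it assumes Python 3.9+ (where `Subscript.slice` is the index expression itself).

```python
import ast


def subscript_index_names(source: str, array_param_name: str) -> list[str]:
    """Return the names used to index `array_param_name` in `source`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Only subscripts of the named array count; other arrays are ignored.
        if (
            isinstance(node, ast.Subscript)
            and isinstance(node.value, ast.Name)
            and node.value.id == array_param_name
        ):
            idx = node.slice
            elements = idx.elts if isinstance(idx, ast.Tuple) else [idx]
            names = []
            for element in elements:
                if not isinstance(element, ast.Name):
                    # A computed index such as `age + 1` cannot be mapped to a grid.
                    raise ValueError(f"Computed index: {ast.unparse(element)}")
                names.append(element.id)
            return names
    return []


SOURCE = '''
def next_health(age, health, probs_array, other):
    scale = other[age]
    return scale * probs_array[age, health]
'''
names = subscript_index_names(SOURCE, "probs_array")
```

Passing the array name explicitly is what keeps `other[age]` from being picked up, which mirrors the false-positive concern the commit mentions.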
read-the-docs-community bot commented Mar 24, 2026

Documentation build overview

📚 pylcm | 🛠️ Build #31992545 | 📁 Comparing 9b570a5 against latest (14b3fef)


🔍 Preview build

Show files changed (32 files in total): 📝 32 modified | ➕ 0 added | ➖ 0 deleted
File Status
index.html 📝 modified
approximating-continuous-shocks/index.html 📝 modified
benchmarking/index.html 📝 modified
benchmarking-1/index.html 📝 modified
beta-delta/index.html 📝 modified
conventions/index.html 📝 modified
debugging/index.html 📝 modified
defining-models/index.html 📝 modified
dispatchers/index.html 📝 modified
function-representation/index.html 📝 modified
grids/index.html 📝 modified
index-1/index.html 📝 modified
index-2/index.html 📝 modified
index-3/index.html 📝 modified
index-4/index.html 📝 modified
installation/index.html 📝 modified
interpolation/index.html 📝 modified
mahler-yum-2024/index.html 📝 modified
mortality/index.html 📝 modified
pandas-interop/index.html 📝 modified
parameters/index.html 📝 modified
precautionary-savings/index.html 📝 modified
precautionary-savings-health/index.html 📝 modified
regimes/index.html 📝 modified
setup/index.html 📝 modified
shocks/index.html 📝 modified
solving-and-simulating/index.html 📝 modified
stochastic-transitions/index.html 📝 modified
tiny/index.html 📝 modified
tiny-example/index.html 📝 modified
transitions/index.html 📝 modified
write-economics/index.html 📝 modified

hmgaudecker and others added 5 commits March 24, 2026 14:13
…nctions

Enforce consistent ordering: flat_param_names, age, period, functions,
constraints, transitions, stochastic_transition_names, then remaining
args. Also fix notebook cell 9 source format (string → list of strings).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
timmens (Member) left a comment

Very nice! I prefer this UI 😄

Only one comment:

The pandas_interop docs explain the format and the "why labels" well, but I think there's a gap for users trying to build a mental model of what's happening. The function code uses JAX integer indexing (probs_array[age, health]), but the user provides string-labeled Series, and nothing explains how those two worlds connect.

A short paragraph somewhere before or after the "Series format" section could help. Something along the lines of: your model functions work with plain JAX arrays and integer indexing as usual; the Series is purely an input convenience. Before any model code runs, the conversion inspects your function to figure out which dimensions the array is indexed over, maps labels to integer positions using the model's grids, and hands the function a normal JAX array. Your function never sees pandas.

Not blocking, but I think it'd save users some head-scratching.

hmgaudecker force-pushed the refactor/remove-label-translator branch from 808c7aa to 5e18a9c on March 26, 2026 18:31
hmgaudecker force-pushed the general-purpose-series-converter branch from 8dd49b6 to ca5601c on March 26, 2026 18:31
hmgaudecker and others added 17 commits March 26, 2026 19:38
Addresses review feedback on PR #289: users needed a bridge between
the labeled Series they provide and the integer-indexed JAX arrays
their functions receive.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…dataframe

- Add solve+simulate tests for models with different discrete categories
  per regime (e.g. health with/without "disabled")
- Fix to_dataframe: remap per-regime codes to labels using each regime's
  own category ordering before building the merged Categorical
- Fix initial_conditions_from_dataframe: cast discrete states to int32
  (was float32, rejected by JAX as indexer)
- Fix validate_initial_conditions: union valid codes across regimes
  instead of overwriting

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
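
The int32 cast in the commit above addresses float arrays being rejected as indexers. The sketch below uses NumPy as a stand-in for JAX, since NumPy also refuses float index arrays; the array contents are made up.

```python
import numpy as np

probs = np.array([0.1, 0.6, 0.3])
# Discrete state codes that ended up as float32, as in the bug described above.
codes_float = np.array([2.0, 0.0], dtype=np.float32)

try:
    probs[codes_float]
    float_index_ok = True
except IndexError:
    # NumPy requires integer (or boolean) arrays as indices.
    float_index_ok = False

# Casting the codes to int32 makes them usable as an indexer.
codes_int = codes_float.astype(np.int32)
selected = probs[codes_int]
```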
…series-converter

# Conflicts:
#	src/lcm/pandas_utils.py
#	tests/test_pandas_utils.py
Base automatically changed from refactor/remove-label-translator to main March 26, 2026 20:52
hmgaudecker merged commit 923e62e into main on Mar 26, 2026
4 of 10 checks passed
hmgaudecker deleted the general-purpose-series-converter branch on March 26, 2026 20:58

2 participants