From b616b6939ef99b9b02aaa2a8cb1506e12a0250f9 Mon Sep 17 00:00:00 2001
From: lmoresi
Date: Wed, 20 May 2026 22:28:56 +1000
Subject: [PATCH] =?UTF-8?q?docs:=20snapshot=20toolkit=20=E2=80=94=20CHANGE?=
 =?UTF-8?q?S=20entry,=20current=20API=20names,=20toctree=20wiring?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Post-merge cleanup for the snapshot-toolkit work (PRs #195, #196, #198)
now landed on development.

- CHANGES.md: new 2026-05-20 entry covering the toolkit and the #184
  Lagrangian typo fix. Documents save_state/load_state surface,
  Model.tracker, on-disk v1.1 format, format-aware read_timestep, the
  state-as-dataclass contract, and how the existing write_timestep /
  write_checkpoint paths remain unchanged (different use cases).

- docs/developer/design/in_memory_checkpoint_design.md: API-shape code
  example updated to current names (save_state/load_state with file=
  kwarg) — the draft snapshot/restore verbs were renamed in phase 5;
  the design discussion itself is unchanged.

- docs/developer/guides/state-as-dataclass.md: example code updated to
  the current save_state/load_state names; pointer to the user guide
  added.

- docs/developer/index.md: added guides/state-as-dataclass to the
  Guides toctree (it was orphaned, generating a "not included in any
  toctree" warning since #195 landed).

docs-build succeeds with no snapshot-related warnings.

Underworld development team with AI support from Claude Code (https://claude.com/claude-code)
---
 CHANGES.md                                    | 70 +++++++++++++++++++
 .../design/in_memory_checkpoint_design.md     | 18 ++---
 docs/developer/guides/state-as-dataclass.md   |  4 +-
 docs/developer/index.md                       |  1 +
 4 files changed, 83 insertions(+), 10 deletions(-)

diff --git a/CHANGES.md b/CHANGES.md
index d1a1f6bc..ad9cbd36 100644
--- a/CHANGES.md
+++ b/CHANGES.md
@@ -1,5 +1,75 @@
 # CHANGES: Underworld3
 
+## 2026-05-20
+
+ - **Snapshot toolkit (PRs #195, #196, #198)**: unified
+   state-capture mechanism for Underworld3 models.
+
+   User-facing API on `Model`:
+
+   ```python
+   token = model.save_state()                 # in-memory "stash for timesteps"
+   model.load_state(token)                    # exact restore from token
+
+   model.save_state(file="step42.snap.h5")    # persistent on-disk snapshot
+   model.load_state("step42.snap.h5")         # restore from disk
+   ```
+
+   Same call, two storage modes. Captures the full model — every
+   registered mesh and mesh-variable, every swarm with per-particle
+   data, every solver-internal state-bearer (`ModelTracker`, `DDt`
+   instances, anything exposing the `Snapshottable` contract). User
+   guide: `docs/advanced/snapshot-restore.md`.
+
+   - **In-memory mode** (#195): bit-exact discard-of-a-step
+     guarantee, proven through real PETSc solves; parallel-correct
+     under MPI at any fixed rank count (recovers from genuine
+     cross-rank particle loss); rebuild-on-restore semantics for
+     swarms (the discarded step *is* what restore exists to undo).
+   - **`Model.tracker`** (#196): model-dwelling, snapshot-managed
+     record of where a run is. Holds `time` / `step` / `dt` plus
+     any quantity the user parks on it (`model.tracker.foo = ...`)
+     — anything on the tracker reverts with the model. Solvers do
+     not depend on it; using it is optional. A loose Python
+     variable is not reverted by `load_state`; the same value on
+     the tracker is.
+   - **On-disk mode (v1.1, #198)**: HDF5 wrapper file + companion
+     `.bulk/` directory. Wrapper is `h5ls`-inspectable without UW3
+     in the loop (carries run name, schema version, sim time, step,
+     MPI rank count, mesh/swarm/variable inventories). Bulk data
+     uses PR #146's PETSc DMPlex primitives for mesh + meshvars;
+     swarms get per-rank h5py sidecars for parallel correctness.
+     Same-rank-count restart contract; clean errors on rank-count
+     mismatch.
+
+   Related API changes:
+
+   - `MeshVariable.read_timestep` is now format-aware: detects
+     whether the file is a legacy `write_timestep` per-variable
+     file or a v1.1 snapshot wrapper and dispatches internally.
+     Existing scripts that call `var.read_timestep(...)` work
+     transparently against new files via a KDTree bridge over
+     `MeshVariable.read_checkpoint` (#146).
+   - `mesh.write_timestep()` / `mesh.write_checkpoint()` (PR #146)
+     remain unchanged — they serve different use cases
+     (visualisation + flexible/cross-resolution restart;
+     memory-efficient same-rank PETSc reload for postprocessing).
+     See "Choosing between paths" in the user guide.
+
+   State-as-dataclass contract for solver helpers:
+   `docs/developer/guides/state-as-dataclass.md` — declare
+   mutable evolution state as a `SnapshottableState` dataclass
+   and the snapshot mechanism captures/restores it with no extra
+   plumbing. Retrofitted for all five DDt flavors in this work
+   (`Symbolic`, `Eulerian`, `SemiLagrangian`, `Lagrangian`,
+   `Lagrangian_Swarm`).
+
+ - **Fix(ddt): `Lagrangian.__init__` typo (`uw.swarm.UWSwarm` →
+   `uw.swarm.Swarm`)** (PR #184). Lagrangian DDt had been
+   unconstructible since commit `0778b7d` (2025-07-07) — typo
+   introduced during the unrelated `evalf` cleanup. Surfaced
+   during the snapshot toolkit's retrofit work.
+
 ## 2026-03-14
 
  - **Release v3.0.0**: Merged development (398 commits) to main, tagged v3.0.0
diff --git a/docs/developer/design/in_memory_checkpoint_design.md b/docs/developer/design/in_memory_checkpoint_design.md
index 10fec6ad..3c4c5fd5 100644
--- a/docs/developer/design/in_memory_checkpoint_design.md
+++ b/docs/developer/design/in_memory_checkpoint_design.md
@@ -263,19 +263,21 @@ different need: faithful state restore for algorithmic uses where
 
 ## API shape
 
-Full-state always; backend chosen at snapshot time by passing (or
-omitting) a path.
+Full-state always; backend chosen at save time by passing (or
+omitting) a ``file=``. (Final names landed after a phase-5 rename
+pass that replaced the draft ``snapshot``/``restore`` verbs — see
+the user guide at ``docs/advanced/snapshot-restore.md``.)
 
 ```python
-# Backend selection — same capture, different storage layer
-token = model.snapshot()                  # in-memory (default)
-model.snapshot(path='step42.snap.h5')     # on-disk full-state
+# Same call, different storage — dispatch on whether a file is given.
+token = model.save_state()                      # in-memory (default)
+model.save_state(file='step42.snap.h5')         # on-disk full-state
 
-model.restore(token)                      # in-memory restore
-model.restore('step42.snap.h5')           # on-disk restore
+model.load_state(token)                         # in-memory restore
+model.load_state('step42.snap.h5')              # on-disk restore
 
 # Existing per-variable selective on-disk path is unchanged:
-mesh.write_timestep('step42.h5', ...)     # visualisation; not full-state
+mesh.write_timestep('step42.h5', ...)           # visualisation; not full-state
 ```
 
 Backends share a single `Snapshot` structure — only the serialisation
diff --git a/docs/developer/guides/state-as-dataclass.md b/docs/developer/guides/state-as-dataclass.md
index 37f34718..5128a28d 100644
--- a/docs/developer/guides/state-as-dataclass.md
+++ b/docs/developer/guides/state-as-dataclass.md
@@ -191,12 +191,12 @@ def test_my_helper_roundtrip():
     h.step(0.2)
 
     state_pre = h.state
-    snap = model.snapshot()
+    snap = model.save_state()
 
     # Mutate.
     h.step(0.5)
 
-    model.restore(snap)
+    model.load_state(snap)
 
     # Verify primary state recovered.
     assert h.state == state_pre
diff --git a/docs/developer/index.md b/docs/developer/index.md
index c53dcfc8..1238e6e1 100644
--- a/docs/developer/index.md
+++ b/docs/developer/index.md
@@ -113,6 +113,7 @@ guides/CODE-REVIEW-PROCESS
 guides/SPELLING_CONVENTION
 guides/version-management
 guides/branching-strategy
+guides/state-as-dataclass
 guides/BINDER_CONTAINER_SETUP
 guides/hpc-cluster-setup
 ```
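
Reviewer note: the CHANGES entry says a loose Python variable is not reverted by `load_state`, while the same value parked on `Model.tracker` is. The sketch below illustrates that distinction with a toy stand-in. `ToyModel` and `ToyTracker` are hypothetical names invented for this note, and the deep-copy token is an assumption made for brevity; none of this is the Underworld3 implementation, which also captures meshes, swarms and solver state.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyTracker:
    """Hypothetical stand-in for a snapshot-managed run record."""
    time: float = 0.0
    step: int = 0
    extras: dict = field(default_factory=dict)


class ToyModel:
    """Hypothetical toy model; NOT the Underworld3 `Model` class."""

    def __init__(self):
        self.tracker = ToyTracker()

    def save_state(self):
        # In-memory token: a deep copy of the state the model owns.
        return copy.deepcopy(self.tracker)

    def load_state(self, token):
        # Restore from a copy so the same token can be reused later.
        self.tracker = copy.deepcopy(token)


model = ToyModel()
loose_counter = 0                       # plain variable, outside the model
model.tracker.extras["counter"] = 0     # same value, parked on the tracker

token = model.save_state()

# Take a trial step that is later discarded.
model.tracker.time += 0.5
model.tracker.step += 1
model.tracker.extras["counter"] += 1
loose_counter += 1

model.load_state(token)

print(model.tracker.step, model.tracker.extras["counter"], loose_counter)
```

After the restore, everything on the tracker is back at its saved value, while `loose_counter` still carries the discarded step's increment. In this toy, that is the practical reason to keep run bookkeeping on the tracker when steps may be rolled back.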