
layout: generate_layout_with_config is nondeterministic per seed (HashMap/HashSet iteration order in force accumulation) #633

@bpowers


Summary

generate_layout_with_config(project, model_name, config, db_state) in src/simlin-engine/src/layout/mod.rs is nondeterministic per seed: the same (model, annealing_random_seed) pair produces different layouts on repeated calls -- both serially within a single process and run-to-run across processes.

LayoutConfig carries an annealing_random_seed field (src/simlin-engine/src/layout/config.rs:48, default 42) whose entire purpose is to make layout reproducible per seed. The observed nondeterminism defeats that intent, so this is a real bug rather than a design choice.

Reproduction (verified)

Two identical invocations of the layout_eval example with the same models and seed produced different results:

LAYOUT_EVAL_MODELS=teacup,sir LAYOUT_EVAL_SEEDS=8
  • SIR median weighted_cost: 0.7348 vs 0.7071 across the two runs
  • Corpus geomean_of_medians: 0.6496 vs 0.6372
  • Teacup (tiny model) happened to be stable

The drift appears on the SIR (small but non-trivial) model and is masked on teacup.

Likely root cause (to be confirmed)

The likely cause is the randomized iteration order of std HashMap/HashSet in the layout pipeline: floating-point force accumulation and tie-breaking iterate an unordered container, so results for the same seed drift. Rust's std HashMap uses a randomly seeded hasher by default (RandomState), so iteration order varies between HashMap instances and between process runs.

Candidate sites:

  • src/simlin-engine/src/layout/sfdp.rs:68 -- build_node_index<N>(nodes: &[N]) -> HashMap<N, usize> and the HashMap-based force maps it feeds.
  • src/simlin-engine/src/layout/annealing.rs:346 -- HashSet<N> move-tracking in perturb.

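Floating-point addition is not associative, so summing the same forces in a different order can round differently. Order-sensitive iteration over these containers during force accumulation would therefore let the same seed produce different results across instances and runs, and the annealing loop can amplify a small rounding difference into a visibly different layout.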
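To illustrate the suspected mechanism in isolation, here is a minimal sketch that does not use any simlin-engine code. `sum_in_order` is a hypothetical helper, and the force values are made up. It shows that the same values summed in a different order can yield a different f64, and counts how many distinct sums appear when the values are read back from freshly built HashMaps.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical helper: sums values strictly left to right.
fn sum_in_order(values: &[f64]) -> f64 {
    values.iter().fold(0.0_f64, |acc, v| acc + v)
}

fn main() {
    // Same three values, opposite order: the rounded results differ.
    let forward = sum_in_order(&[0.1, 0.2, 0.3]);
    let backward = sum_in_order(&[0.3, 0.2, 0.1]);
    println!("forward = {forward:?}, backward = {backward:?}");

    // Each HashMap gets its own randomly seeded hasher, so iteration
    // order (and therefore the accumulated sum) may differ per instance.
    let mut distinct_sums: HashSet<u64> = HashSet::new();
    for _ in 0..64 {
        let forces: HashMap<u32, f64> = (0..32u32)
            .map(|i| (i, 0.1 * (i as f64 + 1.0)))
            .collect();
        let in_map_order: Vec<f64> = forces.values().copied().collect();
        distinct_sums.insert(sum_in_order(&in_map_order).to_bits());
    }
    println!("distinct sums over 64 maps: {}", distinct_sums.len());
}
```

If the second count is greater than one, identical inputs have already diverged before any annealing step runs.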

Why it matters

  • Production impact: generate_best_layout (best-of-k over LAYOUT_SEEDS) is affected, so production auto-layout output is not reproducible run-to-run.
  • Blocks layout-quality-eval Phase 5, which requires a determinism check plus a deterministic weighted_cost-vs-threshold regression guard. A nondeterministic pipeline cannot have a stable threshold gate.
  • Makes Phase 4's reference-pair ordering test fragile (weighted_cost(human) < weighted_cost(ai)): if the cost depends on iteration order, the inequality can flip run-to-run.

Components affected

src/simlin-engine/src/layout/ -- mod.rs (generate_layout_with_config, generate_best_layout), sfdp.rs, annealing.rs, and any other force-accumulation path that iterates an unordered container.

Test gap

src/simlin-engine/tests/layout.rs asserts only structural validity (valid elements, connectors present, link UIDs reference existing elements, round-trips, metadata). It has no test asserting that the same (model, seed) yields an identical view across repeated calls, so this reproducibility gap is currently untested.
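A determinism test could take roughly the following shape. This is only a sketch: `assert_deterministic` and `toy_layout` are hypothetical names, and `toy_layout` is a stand-in for a real call to generate_layout_with_config, whose argument and return types are not shown in this issue. The real test would compare the generated views instead.

```rust
use std::fmt::Debug;

/// Hypothetical helper: calls `generate` `runs` times and checks that
/// every result equals the first one.
fn assert_deterministic<T, F>(runs: usize, mut generate: F)
where
    T: PartialEq + Debug,
    F: FnMut() -> T,
{
    let first = generate();
    for run in 1..runs {
        let next = generate();
        assert_eq!(first, next, "result differed on run {run}");
    }
}

/// Stand-in for a seeded layout call (NOT the real API): derives a few
/// pseudo-random positions from the seed with a simple LCG.
fn toy_layout(seed: u64) -> Vec<(u64, u64)> {
    let mut state = seed;
    let mut next = || {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (state >> 33) % 1000
    };
    (0..4).map(|_| (next(), next())).collect()
}

fn main() {
    // In the real test, the closure would build the layout for a fixed
    // (model, seed) pair and return the resulting view.
    assert_deterministic(8, || toy_layout(42));
    println!("toy layout is stable across 8 calls");
}
```

Because each HashMap instance is seeded separately, repeated calls within one test process should already expose the drift described above. A cross-process check would need to compare serialized output from separate runs.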

Suggested fix direction (not prescriptive)

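Replace order-sensitive HashMap/HashSet iteration in the layout pipeline with a deterministic alternative: an ordered container (BTreeMap/BTreeSet), an insertion-ordered one (IndexMap), or a HashMap with a deterministic hasher (such as FxHashMap); and/or sort before any iteration whose order affects floating-point accumulation. Note that a deterministic hasher only makes iteration order reproducible when the sequence of insertions is itself identical, so an ordered container or an explicit sort is the more robust option. Verify with a new determinism test asserting that the same (model, seed) produces an identical view across repeated calls (and ideally across separate process runs).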
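As a sketch of the two simplest options, assuming per-node forces keyed by a node index (`net_force_sorted` and `net_force_btree` are hypothetical names, not functions from sfdp.rs or annealing.rs):

```rust
use std::collections::{BTreeMap, HashMap};

/// Option 1: keep the HashMap but sort the keys before accumulating,
/// so the summation order no longer depends on the hasher.
fn net_force_sorted(forces: &HashMap<u32, f64>) -> f64 {
    let mut keys: Vec<u32> = forces.keys().copied().collect();
    keys.sort_unstable();
    keys.iter().fold(0.0_f64, |acc, k| acc + forces[k])
}

/// Option 2: store the forces in a BTreeMap, which always iterates in
/// ascending key order.
fn net_force_btree(forces: &BTreeMap<u32, f64>) -> f64 {
    forces.values().fold(0.0_f64, |acc, f| acc + f)
}

fn main() {
    let entries = [(1u32, 0.1), (2, 0.2), (3, 0.3)];

    // Same entries, opposite insertion order.
    let a: HashMap<u32, f64> = entries.iter().copied().collect();
    let b: HashMap<u32, f64> = entries.iter().rev().copied().collect();
    let ordered: BTreeMap<u32, f64> = entries.iter().copied().collect();

    // All three accumulate in ascending key order, so the results are
    // bit-for-bit identical.
    assert_eq!(net_force_sorted(&a).to_bits(), net_force_sorted(&b).to_bits());
    assert_eq!(net_force_sorted(&a).to_bits(), net_force_btree(&ordered).to_bits());
    println!("net force = {:?}", net_force_btree(&ordered));
}
```

Sorting at the point of use is the smaller change, but it has to be repeated at every order-sensitive loop. Switching the container type fixes the order once, at the cost of O(log n) lookups.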

Discovery context

Identified during the layout-quality-eval work (branch layout-quality-eval), reproduced directly via the layout_eval example as described above.
