Skip to content

engine: model.rs init/module dep split uses ASCII '.' but canonical module separator is U+00B7 (dead/incorrect branch) #567

@bpowers

Description

@bpowers

Problem

src/simlin-engine/src/model.rs:166-169 and src/simlin-engine/src/model.rs:185-188 both split a ModuleInput.src on an ASCII . to extract the "direct dependency" (the leading module-path component) for stock/module initialization dependency ordering:

// model.rs:166-169 (initial phase)
let direct_dep = match src.as_str().find('.') {
    Some(pos) => &src.as_str()[..pos],
    None => src.as_str(),
};

// model.rs:185-188 (dt phase)
let direct_dep = match src.as_str().find('.') {
    Some(pos) => &src.as_str()[..pos],
    None => src.as_str(),
};

src here is an Ident<Canonical> (canonical form). In the canonical form the module-hierarchy separator is the U+00B7 MIDDLE DOT ·, not an ASCII . (see canonicalize() in src/simlin-engine/src/common.rs, which rewrites unquoted ASCII . -> · U+00B7 at parse time). So find('.') never matches a real module-path separator:

  • For a normal module-qualified input (e.g. canonical parent·child·var), find('.') returns None and the code falls back to using the whole src as the "direct dep" instead of the intended leading path component.
  • The only canonical idents that ever contained a raw ASCII . were literal-period quoted identifiers (e.g. user identifier "a.b" -> canonical a.b). For those, this code would wrongly split a.b at the ., treating a as a module prefix — a mis-classification of a single variable name as a module-qualified reference.

The split appears to intend module-path splitting (take the first hierarchy component) but uses the wrong separator, making the Some(pos) branch effectively dead for valid module references and actively wrong for literal-period quoted identifiers.

Why it matters

  • Correctness (latent): dependency analysis for module / stock-init ordering relies on a separator split that can never fire correctly for valid module-qualified inputs. The intended "use the leading module component as the direct dep" behavior is silently not happening; the whole qualified name is used instead. No known incorrect simulation results today because the dead branch never matches for valid module refs, and the historical mis-split case (literal-period quoted idents) is being removed independently (see Context).
  • Maintainability: this is a separator-layer confusion bug of exactly the class tracked by engine: three visually-indistinguishable separator chars (U+205A / U+00B7 / ASCII .) are load-bearing across stdlib + macro subsystems with no type-level or lint guard #565 — a hand-written ASCII . where the canonical layer requires U+00B7. It is a concrete, file-pinned instance, not the general guard concern. It reads as intentional module-path splitting but is dead/incorrect, so a future reader cannot tell whether the fallback-to-whole-string behavior is correct-by-design or an accident.

Components affected

  • src/simlin-engine/src/model.rsmodule_input_deps / init+dt dependency resolution (lines 166-169 and 185-188), feeding stock-and-module initialization topological ordering.

Possible approaches

  • If module-path splitting is intended: split on the canonical separator instead — find('·') (U+00B7), or a CanonicalStr::split_at_dot()-style helper if one exists/should exist — and add a test with a genuinely module-qualified canonical input asserting the leading component is selected.
  • If the split is actually dead code (the whole-src fallback is the desired behavior for all valid inputs): remove the find('.') match entirely and use src.as_str() directly, with a comment explaining why no module-path stripping is needed here.
  • Either way, add a regression test covering a module-qualified input so the intended behavior is pinned.

This is closely related to #565 (the general "three visually-indistinguishable separators have no type-level/lint guard" concern, which lists model.rs as one of the ~20 affected files) but is a distinct, concrete correctness/maintainability defect at specific lines that #565's proposed fix (named constants + a literal-presence guard test) would not by itself correct — the wrong separator would still be the wrong separator even after constants are introduced.

Context

Identified while implementing issue #559 blocker B4 (canonicalize idempotency for quoted-period idents) in src/simlin-engine; out of scope for B4, so filed separately rather than fixed there. Note: after the B4 fix, literal periods in quoted identifiers canonicalize to the U+2024 ONE DOT LEADER sentinel instead of a raw ASCII ., so find('.') now always returns None at these sites — incidentally eliminating the only historical mis-split case, but the underlying code smell (a separator split that uses the wrong separator and is effectively dead/incorrect for its apparent purpose) remains and is worth tracking on its own.

Severity: low (no known incorrect simulation behavior today; latent).

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingengineIssues with the rust-based simulation engine

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions