Skip to content

Add simlin-serve binary and refactor simlin-mcp into core library#476

Open
bpowers wants to merge 145 commits intomainfrom
server-rewrite
Open

Add simlin-serve binary and refactor simlin-mcp into core library#476
bpowers wants to merge 145 commits intomainfrom
server-rewrite

Conversation

@bpowers
Copy link
Copy Markdown
Owner

@bpowers bpowers commented Apr 26, 2026

Summary

  • Adds a new simlin-serve Rust binary plus React SPA (@simlin/serve) that opens a local-first model viewer/editor against any directory of .stmx/.xmile/.mdl/.sd.json files. Includes Loro-based merge, file watcher, in-process MCP server (5 tools), and WebSocket live updates with push notifications (simlin/projectChanged, projectFocused, selectionChanged, diagnosticsChanged).
  • Extracts a transport-agnostic simlin-mcp-core library; refactors simlin-mcp to a thin rmcp-stdio binary on top of the new library. Wire format preserved (PascalCase tool names, structuredContent, -32002 resource-not-found, protocol version 2025-11-25).
  • Documents the V1 threat model (docs/threat-model.md) and adds Host/Origin allowlist middleware as DNS-rebinding defense (CVE-2025-66414).

Implementation plan

8-phase plan executed end-to-end with per-phase code review:

Every acceptance criterion in the design plan is covered by at least one automated test (verified by the test analyst).

Test plan

  • CI smoke matrix passes on ubuntu-latest, macos-latest, windows-latest
  • cargo test --workspace passes locally
  • pnpm --filter @simlin/serve-web run test passes
  • Manual: cargo run -p simlin-serve <dir-of-models> opens browser to the project list
  • Manual: edit a model in the browser, verify on-disk write
  • Manual: edit a .mdl, verify .sd.json sidecar created and original .mdl unchanged
  • Manual: configure Claude Code CLI per src/simlin-serve/README.md; verify all 5 MCP tools work
  • Manual: open two browser tabs, edit different stocks concurrently, verify both edits land
  • Manual: external vim edit while browser is open, verify in-flight edits preserved via merge
  • Manual: git commit a tracked file, verify sidebar git-status flips within ~1s
  • Manual: port-conflict check (--mcp-port 7878 twice) shows the friendly diagnostic

Tracked deferrals

bpowers added 30 commits April 25, 2026 15:42
Adds the detailed phase-by-phase implementation plan and AC-to-test
mapping under docs/implementation-plans/2026-04-05-server-rewrite/.
Refines docs/design-plans/2026-04-05-server-rewrite.md to align with
the structured implementation plan, and links both from docs/README.md.
First step of the @simlin/serve npm package: register a new Cargo
workspace member at src/simlin-serve with a placeholder main.rs and the
two dependencies the next task needs (tokio + axum 0.8). Subsequent
tasks promote this to a lib+bin crate and add the rest of the deps as
they are needed, keeping each commit small and reviewable.
Promote simlin-serve to a lib+bin crate so an integration test can
exercise the router via tower::ServiceExt::oneshot without binding a
TCP port. The library exposes build_router() returning an Axum 0.8
Router with a /healthz endpoint and a TraceLayer; the binary parses
--port via clap (default 0 lets the OS pick), binds 127.0.0.1, prints
the resolved URL, and serves the router.

Defaults RUST_LOG to simlin_serve=info when no env filter is set so
pre-commit and CI runs stay quiet by default while still letting users
opt into request traces. The integration test asserts the /healthz
contract (200 OK, body 'ok') so later refactors can't silently break
the operational healthcheck used by the npm bootstrap.
Move CLI parsing into its own module so future tasks can plumb the
resolved arguments through ProjectRegistry without expanding main.
The Args struct exposes the four flags called out in the design plan:
ROOT (positional, defaults to current_dir at use site), --port (default
0 for ephemeral), --mcp-port (default 7878, parsed today but unused
until the MCP server arrives in a later phase), and --no-open.

root_or_cwd() returns std::io::Result so a missing or unreadable cwd
surfaces as a clean error instead of an unwrap during startup. The CLI
flag is added now -- ahead of the MCP server -- so users do not have to
relearn the surface once the MCP integration lands. Inline tests cover
the defaults, explicit overrides, and per-flag parsing.
Introduces the in-memory registry that subsequent tasks populate with
discovered models and git state. Path keys are absolute and canonicalized
so HTTP handlers can look up by fs::canonicalize results, while the
ProjectMeta.path field is relativized to the registry root because that
is what the SPA renders.

Phase 1 always reports version=0; Phase 2 introduces the optimistic-lock
counter that increments on save.
Walks the configured root via the ignore crate so the user's .gitignore
rules are honored automatically (covers project-specific build artifacts
without us hardcoding them). Adds a small universal denylist for
directories that have no business being scanned regardless of git state:
node_modules, .git, target, playwright-report, test-results.

follow_links is left at false so symlink cycles cannot loop the walker;
the integration test creates a self-referential symlink and asserts the
real file is discovered exactly once.
Shells out to the system git binary rather than linking libgit2/gix so
the user's hooks, custom config, and submodule layout are honored
verbatim. Each GitProbe lazily caches porcelain + ls-files output per
(repo_root, mtime_of_index); the next call after a stage/commit
recomputes transparently because git updates the index mtime.

Per the design, files appearing in porcelain output (including ?? lines
emitted by --untracked-files=all) are reported Tracked { dirty: true }
because they are inside the working tree but not yet committed. Files
in ls-files but absent from porcelain are Tracked { dirty: false }.
Anything else, including .gitignore matches, falls through to Untracked.

When git is missing from PATH or fails to spawn, every call returns
Unavailable so the SPA can show the AC2.5 hint without crashing the
server.
scan_into_registry walks the configured root, stats each match, probes
git status for it, and upserts the resulting ProjectMeta. Per-file
errors (missing metadata, unreadable file) are logged via tracing::warn
and skipped so one bad file doesn't poison the whole listing; scan-
level errors (unreachable root, walker init failure) propagate via
ScanError.

Paths are canonicalized once at the root and again for each discovered
file so registry keys are stable and lookup-by-canonical-path works
regardless of how the caller phrased the original path. Phase 4's file
watcher will replace eager re-scans with incremental updates; the
function signature stays the same.
Adds AppState (registry, git, root) wrapping Arc-shared per-process
context and a list_projects handler that re-runs scan_into_registry on
every call so listings always reflect the live filesystem. Phase 4 will
swap the eager rescan for a watcher.

Wire paths use forward slashes regardless of host OS so the SPA can use
them as URL segments without further normalization. The git_available
flag mirrors the GitProbe so the SPA can show the AC2.5 hint.
Adds the per-file read endpoint via an Axum 0.8 wildcard route. The
handler sanitizes user-supplied paths before any filesystem access (any
.. segment, absolute path, or null byte yields 400) and re-checks the
canonical resolved path against the canonical scan root after lookup
(403 if it escapes via a symlink).

Format dispatch lives in a small parse.rs module that mirrors
simlin-mcp's open_project for now; Phase 5 will consolidate them. .mdl
files are parsed via the native open_vensim path rather than the xmutil
C++ path so the cross-build doesn't pick up a C++ toolchain dependency.

The .mdl sidecar preference (read <basename>.sd.json instead when it
exists alongside the .mdl) is implemented on the read side here; Phase 2
adds the corresponding write side.
build_router now mounts both TraceLayer and a RequestBodyLimitLayer of
4 MiB. The limit is applied outside the trace layer so an oversized
request is rejected before any tracing event records its size; bumping
the cap is a Phase 2 follow-up once the save path lands and large
projects need to round-trip.

main constructs AppState (registry, git probe, canonical root) and runs
scan_into_registry once at startup so the first request is fast.
The new web/ subdirectory will host the read-only viewer SPA that
simlin-serve embeds via rust-embed. Adds the package to pnpm
workspaces and dep-policy under @simlin/serve-web; ignores the dist/
output. App.tsx is intentionally a class component placeholder until
Task 12 wires in the project list and Editor host. Excludes
simlin-serve/** from src/app's TypeScript program because the app
glob-includes ../**/*.ts and would otherwise trip on the new
vite.config.ts type imports.
Replaces the App.tsx placeholder with a class-component shell that
fetches /api/projects on mount, renders a left ProjectList sidebar
with per-file git chips, and a right EditorHost that fetches the
project's canonical JSON and hands it to @simlin/diagram's Editor in
embedded + readOnlyMode mode (per the implementation plan: embedded
disables canvas interaction, readOnlyMode adds the user-facing
warning toast). The .mdl-as-source path renders an informational
banner about the Phase 2 sidecar; the AC2.5 git-unavailable hint is
session-dismissable.

The vite config gains the wasm + top-level-await plugins so that
@simlin/engine's WASM import in the Worker bundle survives Rollup's
strict ESM-wasm rules. Tests use Jest + jsdom + @testing-library
with a small @simlin/diagram mock that captures the props handed to
Editor without spinning up the WASM engine in the test runner.
The static_assets module derives Assets via rust-embed against
web/dist/ and serves it through an axum fallback handler. The
fallback rule is SPA-friendly: paths that look like assets (contain
a dot) 404 on misses, while extensionless paths fall through to
index.html so client-side routing works without server help.

The /api/* and /healthz routes stay on Router::route(...) so they
keep precedence over the fallback. build.rs gates the actual pnpm
build behind SIMLIN_SERVE_BUILD_WEB=1 (set by CI) but always ensures
web/dist/ exists on disk so rust-embed's compile-time folder check
passes during a plain 'cargo build'.
Exercises every Phase 1 route from a tempdir-backed AppState in a
single test: /healthz, /api/projects, /api/projects/teacup.stmx, and
/. The / portion is gated on web/dist/ being built (with a
documented eprintln when skipped) so 'cargo test --workspace' keeps
passing on configurations where the frontend hasn't been compiled.

Per-route tests in api_projects.rs, api_get_project.rs, healthz.rs,
and static_assets.rs already cover the individual contracts; this
suite catches regressions in their composition and routing
precedence.
main.tsx reads ?token=... from window.location, stashes it under
sessionStorage['simlin-serve-token'], then rewrites the URL via
history.replaceState so the token is no longer visible in the
address bar or browser history. api.ts threads the token onto every
/api/* request as 'Authorization: Bearer <token>'.

Phase 1 only wires the storage and header; the simlin-serve binary
doesn't enforce the bearer yet. Phase 3 will reject requests
missing the header alongside the WebSocket upgrade work, at which
point the existing wiring becomes load-bearing.
Without this the root 'pnpm test' and 'pnpm tsc' commands skip the
new SPA package, so a failing test or type error in
src/simlin-serve/web/ wouldn't surface in the standard developer
flow. Mirrors the existing pattern of explicit --filter lists in
package.json.
Add token::generate_launch_token() returning a 43-character URL-safe
base64-encoded representation of 32 random bytes (256 bits of entropy)
sourced from the OS via rand 0.10's thread-local RNG. The token is
defense-in-depth on top of the loopback bind: it guards against another
local process racing us to the launch URL by guessing the ephemeral
port. WebSocket bearer enforcement (the actual consumer) lands with the
WebSocket itself in a later phase; for now we only need to issue and
embed the token in the launch URL.
Wrap open::that(url) in launcher::open_browser, returning bool and
emitting a single user-facing line to stderr on failure rather than
crashing — the server itself stays healthy and the user can copy the
URL printed to stdout. main.rs now always prints the launch URL
(http://127.0.0.1:<port>/?token=<token>) so users running headless or
with --no-open still see the address; the browser-open call is gated
behind !args.no_open. The pure URL-builder (build_launch_url) is split
out so we can unit-test the format without binding a port.

The launcher's failure-path test predicates on $DISPLAY being unset
because we cannot reliably predict open::that's outcome on a developer
desktop; CI runs with no display and exercises the fallback.
Mirror the @simlin/mcp distribution model: a small npm wrapper at
src/simlin-serve/package.json with four optionalDependencies that resolve
the per-platform native binary at runtime. The bin/simlin-serve.js shim
maps (process.platform, process.arch) to the matching @simlin/serve-*
package, falls back to vendor/<triple>/simlin-serve for local dev (with
an MSVC fallback on Windows since cargo defaults to MSVC there), and
forwards stdio + signals + exit code so callers see the binary's behavior
unchanged. The launcher prints actionable remediation when the binary is
missing rather than failing silently.

Phase 1 ships four platforms (darwin-arm64, linux-{arm64,x64}, win32-x64)
matching @simlin/mcp parity; darwin-x64 is deferred to Phase 8.
Mirror src/simlin-mcp/build-npm-packages.sh: read the version from
Cargo.toml, then emit a package.json under
npm/@simlin/serve-<platform>/ for each of the four supported triples.
The platform packages have empty bin/ directories at this point;
the cross-build step (Task 20) and CI workflow (Task 21) drop the
native binary in before npm publish.

The integration test mirrors simlin-mcp/tests/build_npm_packages.rs:
it copies the script and Cargo.toml into a tempdir to run hermetically
without polluting the source tree, then validates the generated
package.json files have the right name, version, os, cpu,
publishConfig, and repository fields. Two static-string tests verify
the wrapper package.json and JS launcher remain consistent with this
generator.
Mirror src/simlin-mcp/{Dockerfile.cross,scripts/cross-build.sh} for
simlin-serve, with two divergences. First, the Dockerfile bundles
Node and pnpm because build.rs shells to pnpm to produce web/dist/
which rust-embed ingests at compile time; mcp has no embedded
frontend so its image stays leaner. Second, the script streams the
repo into the container via tar instead of bind-mounting it
read-only, since pnpm needs to write node_modules/ and vite needs
to write web/dist/ during the in-container build. Bind-mounting
read-write would pollute the host workspace; piping a tar archive
keeps the host clean while staying within Docker's standard
file-transfer primitives.

The script accepts a single positional argument naming a platform
(linux-x64 / linux-arm64 / win32-x64 / all) so individual targets
can be rebuilt without doing the full sweep. SIMLIN_SERVE_BUILD_WEB=1
is set inside the container so build.rs runs the frontend build.
Output lands in src/simlin-serve/dist/<triple>/simlin-serve[.exe],
matching the path that build-npm-packages.sh + the release workflow
copy from.
Mirror .github/workflows/mcp-release.yml as serve-release.yml, the
serve-v* tag pushes triggering a four-target matrix build (linux-x64,
linux-arm64, win32-x64 via cargo-zigbuild on ubuntu-latest;
darwin-arm64 native on macos-latest). Each runner installs Node 24
plus pnpm, builds the embedded SPA into web/dist/ via vite, then
compiles the Rust binary with --locked and uploads it as an
artifact. The publish-platform job copies the four artifacts into
the per-platform npm packages, chmods +x the three Unix binaries,
and runs npm publish --provenance with an npm-view existence guard
so partial-failure reruns skip already-published packages. The
publish-wrapper job updates the wrapper version + optionalDependencies
to match validate.outputs.version and publishes @simlin/serve last.
NPM OIDC trusted publishing handles auth (NODE_AUTH_TOKEN: '' in
each step clears any token actions/setup-node tries to inject).

scripts/release-serve.sh transcribes scripts/release-mcp.sh to serve:
validate semver, fail loudly on dirty tree, run cargo test, bump
Cargo.toml + Cargo.lock + wrapper package.json + optionalDependencies,
regenerate platform packages, sanity-check that all five package.json
files agree on version, commit, tag serve-v<version>. Does NOT push
the tag.

Static workflow validation in tests/serve_release_workflow.rs mirrors
mcp_release_workflow.rs: parses the YAML and asserts trigger pattern,
matrix shape, OIDC permission scoping, --provenance on every publish,
chmod placement, Dockerfile-vs-toolchain sync, --locked release
builds, rerunnability guards, and serve-v tag-guard on publish jobs.
The symlink-escape defense (canonicalize + starts_with check) existed but
had no automated coverage. Add a #[cfg(unix)] test that creates a symlink
inside the scan root pointing to a file in a separate tempdir, then asserts
GET /api/projects/innocent.stmx returns 403 Forbidden.

Add two sidecar-error tests: one where the .sd.json contains invalid JSON
(e.g. "not actually json") and one where it is valid JSON but the wrong shape
(e.g. {"foo":"bar"}). Both must yield 400 Bad Request, not 500, because the
ParseError::Json variant maps to ApiError::BadRequest via the From impl.
The mtime is read before build_repo_cache runs git status and git ls-files.
If .git/index is rewritten between those two reads the stored mtime is stale
by one generation. Add a comment to that effect so the race is not silently
rediscovered during future maintenance.

Also pass -c core.quotePath=false to both git status and git ls-files so that
paths containing non-ASCII characters (e.g. réservoir.stmx) are emitted as
raw UTF-8 rather than C-escaped octal sequences. Without this flag, git's
default core.quotePath=true would cause such filenames to be misclassified as
Untracked.
Add an integration test that commits a file named réservoir.stmx and asserts
status_for returns Tracked{dirty:false}, which would fail against the old code
before -c core.quotePath=false was added to the git invocations.
Some browsers (notably in private/incognito mode) throw on sessionStorage
access rather than returning null. The existing captureLaunchToken and
readLaunchToken functions in launch-token.ts already wrap such reads. Factor
the initial-state read into a readDismissedFlag helper that applies the same
defensive pattern, colocated with the storage key constant.
The previous comment said "the next call will miss the cache and recompute",
which is the opposite of what the code does. build_repo_cache reads index_mtime
at the end, after running the git commands. If the index is rewritten in that
window, the cache stores fresh-looking mtime alongside slightly-stale data.
The next request then hits the cache and returns the stale result; only a
subsequent index rewrite triggers recomputation.

Reword the comment to match the actual cache-hit/miss behavior.
Defines SaveRequest/SaveResponse/SaveError/ValidationError on
simlin-serve's handlers module and registers POST /api/projects/{*path}
alongside the existing GET on the same path. The handler is a stub that
returns Internal('not yet implemented'); subsequent tasks add the
optimistic version check, validation gate, and disk write.

ValidationError mirrors simlin-mcp::tools::types::ErrorOutput
field-for-field but is duplicated here intentionally to avoid pulling
the simlin-mcp crate into simlin-serve. Both crates are sibling
consumers of simlin-engine error types, not dependents of each other.
Adds RegistryError (NotFound, VersionMismatch{expected, actual}) and
ProjectRegistry::check_and_increment which holds the write lock across
the read-compare-increment sequence. Two concurrent calls cannot both
observe the same expected_version and succeed; the loser sees
VersionMismatch with the post-increment 'actual' value.

The lock is released BEFORE any subsequent file I/O. The version is
'claimed' optimistically: if the disk write later fails, the registry's
counter is one ahead of disk content. This is intentional for Phase 2
because the version monotonically increases and the next successful
save resyncs the file with the post-increment counter. Phase 3's
Loro-doc cache replaces this scheme with a different concurrency model.

Includes a barrier-synchronized two-thread test to verify exactly one
caller wins under contention, plus single-threaded coverage of the
match/mismatch/not-found paths.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 95cf851012

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/simlin-serve/src/mcp/access.rs Outdated
}

let bytes = serialize_for_create(project, project_format)?;
simlin_engine::io::atomic_write(&resolved, &bytes).map_err(AccessError::WriteError)?;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Make create path collision-safe under concurrent requests

create() performs a non-atomic existence check and then calls atomic_write, so two concurrent CreateModel calls targeting the same path can both pass resolved.exists() and the later rename-over silently overwrites the first file instead of returning AlreadyExists. This violates the create contract and can lose model content when multiple MCP sessions/agents race on the same filename; use an exclusive-create write path (create_new) to make the check+write atomic.

Useful? React with 👍 / 👎.

Comment thread src/simlin-serve/web/src/App.tsx Outdated
Comment on lines +98 to +103
const previous = prev.liveVersions[msg.path] ?? 0;
// Versions are monotonically increasing per path; if a stale
// event arrives (e.g. due to broadcast ordering races), keep the
// higher value so the EditorHost refetch gate doesn't oscillate.
if (msg.version <= previous) {
return null;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Treat unseen projectChanged paths as newer than version 0

The stale-event gate initializes previous to 0, then drops any event where msg.version <= previous; that causes first-time projectChanged events at version 0 (emitted on new-project creation) to be ignored for paths this tab has never seen. In practice, models created in another tab/session won't be reflected in live state from this event path, so the sidebar stays stale until a manual reload.

Useful? React with 👍 / 👎.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

Code Review

I reviewed the new simlin-serve binary, the simlin-mcp-core extraction, the host/origin DNS-rebinding mitigations, the WebSocket auth path, the LoroDoc-backed merge primitive, the file-watcher/echo-suppression flow, and the supporting npm wrapper.

The implementation is solid: the security model is consistent with the documented threat model, the CRDT merge primitive funnels every write surface through check_increment_and_merge / merge_disk_change, race windows around prime_echo_hash / commit_write / redirect_to_sidecar are explicitly reasoned about, and the test surface (unit + integration) is thorough.

A couple of small observations that aren't blockers:

  • MCP vs HTTP read asymmetry for .mdl with a sidecar -- RegistryAccess::open (src/simlin-serve/src/mcp/access.rs:256-304) does not apply the sidecar-preference rule that handlers::get_project does (src/simlin-serve/src/handlers.rs:181-190). Once a save has run redirect_to_sidecar, the .mdl registry entry is gone and MCP open of the .mdl path returns NotFound; a subsequent ListProjects-triggered scan re-creates a fresh .mdl-keyed entry whose hydrated doc reflects the on-disk .mdl bytes, while HTTP reads of the same .mdl path transparently follow the sidecar redirect to the saved content. Two different MCP ReadModel calls on the same .mdl path can therefore produce different results depending on whether a scan has run between them. Worth verifying this matches the intended UX.

  • prime_echo_hash keyed on the .mdl path before sidecar redirect (src/simlin-serve/src/handlers.rs:874-880, mirrored in src/simlin-serve/src/mcp/access.rs:441-446) -- within the small window between commit_write and redirect_to_sidecar, a watcher event for the freshly-written .sd.json looks up by sidecar canonical path and finds no registry entry (or a stale scanner-inserted one without the primed hash). This produces an idempotent re-merge rather than a correctness violation, but the race is wider than the inline comment suggests.

Neither rises to block the PR; both are points to validate against the design plan.

Verdict

Correct. No blocking bugs found.

bpowers added 2 commits April 26, 2026 08:37
The .mdl save flow primes the echo-suppression hash on the .mdl
registry key but then writes bytes to the .sd.json sidecar. The
watcher's lookup-by-canonical-path keys on the sidecar path; in the
window between commit_write firing the OS event and
redirect_to_sidecar re-keying the entry, that lookup returns no
entry (or a stale scanner-inserted one with last_disk_hash = 0).
The watcher falls into the merge path and broadcasts a spurious
ProjectChanged{Disk} for content the server itself just wrote.
The cached doc state is unchanged, but observers see a phantom
disk-source notification and the version on the sidecar entry can
end up bumped a second time.

Add ProjectRegistry::prime_sidecar_echo_hash, which establishes a
sidecar placeholder mirroring the .mdl entry's data with the primed
hash and a shared doc Arc. Both save handlers (HTTP and MCP) call
it for SidecarJson targets before commit_write, so the watcher's
echo-suppression check finds the placeholder and short-circuits
without re-merging or broadcasting. The shared doc Arc keeps reads
via either key consistent through the brief window before
redirect_to_sidecar collapses the entries.

The integration test arms the placeholder, atomic-writes the
sidecar bytes, and asserts no Disk-source broadcast fires; verified
by temporarily disabling the prime call (test failed loudly with a
ProjectChanged{Disk} for the server-initiated write) before
restoring it.
HTTP get_project swaps a `.mdl` request to its `.sd.json` sidecar
whenever the sidecar exists on disk (handlers.rs sidecar_for_mdl
check). RegistryAccess::open went directly through get_or_init_doc
on the canonicalized request path, so MCP ReadModel(.mdl) returned
either NotFound (when redirect_to_sidecar had removed the .mdl
entry from a prior save) or the parsed .mdl bytes (when a scan
re-inserted the .mdl entry between save and read). Two MCP reads
of the same .mdl path could disagree depending on whether a scan
ran between them, and either disagreed with HTTP.

Apply the same sidecar.is_file() preference rule before the registry
lookup. The disk check is the single source of truth, matching the
HTTP read path; even a stale .mdl-keyed registry entry is bypassed
when the sidecar exists. Both the doc lookup and the meta lookup
key on the resolved path so source_format and version come from
the entry that actually backs the returned project.
@bpowers
Copy link
Copy Markdown
Owner Author

bpowers commented Apr 26, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5f32299f25

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +105 to +107
return {
liveVersions: { ...prev.liveVersions, [msg.path]: msg.version },
liveSources: { ...prev.liveSources, [msg.path]: msg.source },
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Refresh sidebar when projectChanged introduces a new path

projectChanged currently only updates liveVersions/liveSources, so if the event is for a path not already in projects (for example, a model created by another browser tab, an MCP CreateModel, or a new file picked up by the watcher), the sidebar never gains that project until a full reload. The server emits ProjectChanged for these create/add flows, so this branch should also reconcile projects (or trigger loadProjects) when msg.path is unknown.

Useful? React with 👍 / 👎.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

Code Review Findings

A focused pass over the new simlin-serve and simlin-mcp-core code surfaced the issues below, ordered roughly by severity. Skipping nits and covered-by-existing-tests behavior; flagging only items the author would likely want to fix.


[P1] Watcher rename keeps stale format across cross-extension renames

File: src/simlin-serve/src/watcher.rs:745 (and src/simlin-serve/src/registry.rs:610-622)

handle_model_rename receives the destination's format: ProjectFormat (already classified for the new extension) and calls state.registry.rename_entry(&from_key, &to_key), but rename_entry is documented to preserve the source entry's format verbatim and the watcher never reapplies the new one. A rename across formats (e.g. `foo.xmile` → `foo.stmx`, or `.stmx` → `.mdl`) leaves the registry holding the old format on the new key. The next `get_or_init_doc`/`merge_disk_change` calls `parse_to_datamodel` with the wrong format, which either fails (`HydrationFailed`) or produces nonsense. Trigger: external rename across one of the supported extensions while the server is running. Fix: pass the new `format` into `rename_entry` (or `upsert` over the renamed entry afterwards).


[P1] Watcher's first-touch baseline is computed from the new disk content, defeating validation

File: `src/simlin-serve/src/watcher.rs:544-580`

The comment states "If the doc hasn't been hydrated yet … the baseline is empty, which means any new error will block the merge." But `state.registry.get_or_init_doc(&canonical)` returns `Ok` for any registry-tracked entry by hydrating from disk — and at this point the disk content is the just-arrived new content. So `baseline := compute_baseline(new_project)`, `validate_save_project(new_project, &baseline)` returns no new errors, and the merge proceeds regardless of how broken the file is. The empty-baseline branch only fires when the entry is missing from the registry entirely. Trigger: any external editor save to a model file the SPA hasn't yet opened (very common — initial scan registers every model but only opened files are hydrated). Fix: compute the baseline from a snapshot taken before this watcher event would have written, or skip the disk-rehydration shortcut on first touch.


[P2] `Simulate` MCP tool uses literal `"main"` without the `resolve_model_name` fallback

File: `src/simlin-serve/src/mcp/simulate.rs:129-132,167`

`ReadModel` (`read_model.rs:72-73`) and `EditModel` (`edit_model.rs:243-244`) both call `simlin_mcp_core::open::resolve_model_name`, which falls back to `project.models.first()` when the project has no model literally named `"main"`. `Simulate` instead does `input.model_name.unwrap_or_else(|| "main".to_string())` and passes that straight to `compile_project_incremental`. For a single-model XMILE/Vensim project whose model is named e.g. `"Population"`, `ReadModel` succeeds but `Simulate` fails with a confusing engine-level error when `model_name` is omitted. Apply `resolve_model_name(&project, &model_name)` before the `spawn_blocking` call.


[P2] `initialize` unconditionally spawns a notification forwarder per call

File: `src/simlin-serve/src/mcp/server.rs:239-261`

The `peer_info().is_none()` guard correctly protects `set_peer_info`, but the `tokio::spawn(forward_events_to_peer(peer, bus_rx))` immediately below has no idempotency check. rmcp does not reject a re-`initialize` on the same session at the handler level, so a client that re-sends `initialize` (buggy or otherwise) gets two forwarders subscribed to the same `EventBus`, doubling every `simlin/projectChanged`/`simlin/diagnosticsChanged` notification. Move the spawn under the same `peer_info().is_none()` gate.


[P2] WebSocket `consecutiveFailures` is only reset on a received message, never on a successful open

File: `src/simlin-serve/web/src/ws.ts:174-209,230-246`

`consecutiveFailures = 0` only fires inside `handleMessage` (line 204-207). Most projects go long stretches with no broadcast frames, so a connection that opens cleanly but receives nothing, then closes (laptop sleep/wake, network blip, server reload), bumps the counter every cycle. After `MAX_CONSECUTIVE_FAILURES = 10` cycles — readily reachable over a workday — `scheduleReconnect` flips `closed = true` and emits `'dead'`, permanently killing live updates with no recovery path. Reset `consecutiveFailures` (and emit `'connected'`) inside `handleOpen` so a stable-but-quiet connection is treated as healthy.


Overall correctness verdict

Not correct. The PR is largely well-tested and the security posture (loopback bind, host allowlist, token auth, constant-time compare) looks solid, but the two P1 issues above are real correctness bugs in the watcher path that compromise the validation gate and survive the existing test suite. The P2 items are smaller but each describes user-reachable wrong behavior. None are release-blocking on their own; the watcher format and baseline issues are worth addressing before this lands.

bpowers added 3 commits April 26, 2026 08:58
After a .mdl save, redirect_to_sidecar removes the .mdl registry
entry and the canonical state lives at the .sd.json sidecar key.
The save handler's resolve_save_path returned the .mdl canonical
path unconditionally, then ensure_or_get inserted a fresh .mdl
entry at version 0 when the request POSTed back to the .mdl URL.
The optimistic-lock check ran against that fresh entry instead of
the sidecar's actual version, so a stale tab POSTing version=0
silently overwrote newer sidecar content with stale data.

Apply the same sidecar.is_file() preference rule resolve_save_path
already misses but get_project, RegistryAccess::open, and
resolve_save_target all enforce: when the request is .mdl and a
sibling .sd.json exists on disk, route the save to the sidecar
key so the version check sees the entry that backs the canonical
state. The response path still points at the sidecar (existing
SidecarJson arm + new SdJson arm both relativize against the
written file), so the SPA's onPathRedirect continues to work.

Mirror the same change in RegistryAccess::save so MCP EditModel
calls behave consistently with HTTP saves and with the open()
preference rule landed earlier in this PR.

Test stages a two-tab race: Tab A's save creates the sidecar at
v1 and removes the .mdl entry. Tab B (stale) POSTs to .mdl with
version=0; the response is now 409 instead of 200, and the
sidecar still holds Tab A's content rather than Tab B's stale
overwrite.
The MCP CreateModel path checked existence with `resolved.exists()`
and then wrote via `simlin_engine::io::atomic_write`, which is
write-tempfile + rename. The exists check and the rename are
independent: any racer that passed exists() before another's rename
landed produced an atomic_write into the same `{path}.new` temp,
and the second writer's rename either silently overwrote the first
file's content or failed mid-rename with a confusing NotFound
error from the temp/rename interaction.

Replace the pattern with `OpenOptions::create_new(true)` (mirroring
HTTP `POST /api/projects/new` in handlers.rs::create_new_project).
The kernel guarantees at most one racer passes the create-new
check, so concurrent CreateModel calls produce exactly one winner
and (N-1) AlreadyExists failures. Same crash-safety trade-off as
the HTTP create path: a mid-write crash leaves a partial file, but
for new empty projects the content is small and the user simply
retries. Best-effort cleanup removes the partial file on write or
sync failure so subsequent retries are not blocked.

Test runs eight concurrent creates synchronised on a barrier so
they enter the unsafe window simultaneously; before the fix, losers
returned a mix of NotFound errors from the temp/rename race; after
the fix all losers cleanly surface AlreadyExists.
Three small App.tsx fixes from the latest codex review pass:

1. The handleLiveMessage gate read `prev.liveVersions[msg.path] ?? 0`
   and dropped any event whose version was <= that default. First-time
   projectChanged events at version 0 (a fresh CreateModel, a watcher
   discovery, or a brand-new registry entry) therefore never updated
   live state for paths the receiving tab had never seen. Compare
   against `undefined` explicitly so unseen paths always land; for
   already-observed paths the strict-greater check still suppresses
   stale broadcasts.

2. The same handler updated only liveVersions/liveSources and never
   reconciled the projects list. A projectChanged for a path the
   sidebar did not yet know about (typical for cross-tab create
   flows) left the sidebar stale until a manual reload. After the
   setState, when the path is missing from `projects`, kick off a
   loadProjects() so the new entry appears.

3. handleDismissGitHint wrote to sessionStorage without a try/catch.
   readDismissedFlag already wraps reads because some browsers
   (notably Safari in private mode) throw on any storage access; the
   write path could throw synchronously and bubble an uncaught UI
   error, leaving the hint stuck on. Wrap setItem in the same
   try/catch so the in-memory dismiss always lands.

Tests cover the cross-tab create flow (verify fetchProjects fires
and the new entry appears) and the throwing-storage dismiss
(stub the prototype's setItem to throw, click dismiss, assert
no exception bubbles and the banner disappears).
@bpowers
Copy link
Copy Markdown
Owner Author

bpowers commented Apr 26, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a0340e229d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +533 to +536
match OpenOptions::new()
.write(true)
.create_new(true)
.open(&outcome.path)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Validate parent path before creating the new file

create_new_project performs the root-boundary check only after OpenOptions::create_new(true) has already created and written the file, so a request like parent_dir="escape" where escape is a symlink out of the workspace can create a file outside state.root before returning 403. The best-effort cleanup may also fail, leaving an unintended external file behind. This should be rejected before any write by canonicalizing and validating the target parent path first.

Useful? React with 👍 / 👎.

Comment thread src/diagram/Editor.tsx Outdated
// guarantees the call happens after the setState commit.
const onSelectionChanged = this.props.onSelectionChanged;
if (onSelectionChanged) {
setTimeout(() => onSelectionChanged(this.getSelectionIdents()), 0);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid delayed selection callback after component switch

The selection callback is dispatched with setTimeout(..., 0) and never canceled, so it can fire after the editor has been unmounted or after the host has switched to a different project. In that case, stale idents from the old editor instance can be forwarded under the new active path (via EditorHost.handleSelectionChanged), producing incorrect selectionChanged notifications for MCP consumers.

Useful? React with 👍 / 👎.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

Code review (claude-opus-4-7)

Findings below from review of simlin-serve (security middleware, MCP transport, save/watcher paths) and the simlin-mcp-core extraction. Tests, plumbing, and the RegistryAccess race fix in mcp/access.rs look solid; flagging two divergences from the documented threat model.

[P1] MCP /mcp endpoint requires no authentication, contradicting docs/threat-model.md §2

Files: src/simlin-serve/src/mcp/transport.rs (lines 39–60), src/simlin-serve/src/main.rs (lines 53–58)

build_mcp_router only layers host_validator_middleware and TraceLayer in front of StreamableHttpService; there is no bearer-token check on /mcp. Combined with the well-known default --mcp-port 7878, any other OS user on the same machine can connect to 127.0.0.1:7878 and call every tool (ReadModel, EditModel, CreateModel, ListProjects, Simulate) against the user's models — EditModel/CreateModel mutate files via the running simlin-serve process, which holds the launching user's privileges, so this is a write primitive across the user-account boundary. The host-allowlist defense documented in §3 doesn't help here because a same-machine attacker can send the correct Host: header. The threat model explicitly claims (§ "What the server does protect" and §2) that other users cannot send the bearer token; that protection only exists for /api/updates, not for the MCP transport. Either gate /mcp on the same constant-time ?token= check updates_ws_handler already uses, or update the threat model to acknowledge that any local-user process can use the MCP server without authentication.

[P2] ?token= query parameter is logged via TraceLayer, contradicting docs/threat-model.md §8

Files: src/simlin-serve/src/lib.rs (line 90), src/simlin-serve/src/mcp/transport.rs (line 59)

Both routers use TraceLayer::new_for_http() with default callbacks, whose make_span_with_default records the full request URI (including query string) on each request. The default EnvFilter set in main.rs (simlin_serve=info) keeps debug-level spans out of stdout, but any operator who runs with RUST_LOG=tower_http=debug, RUST_LOG=debug, or attaches a tracing subscriber that captures span attributes will see every WebSocket upgrade URL — including the launch token — written to logs. The threat model promises the token "is never written to the server's tracing log even at trace level", and the comment near §8 says "Code that emits a request log event must scrub the token query parameter from the URI". A make_span_with callback that strips token=... from the recorded URI on the /api/updates route (or just on the UI router as a whole) would close the gap.


Overall correctness verdict: correct as code, but the two findings above are protocol/policy mismatches with the published threat model. Neither blocks landing if the threat-model promises are softened to match what the code actually enforces; if those promises are load-bearing for users (e.g. shared dev hosts), the [P1] gap should be closed before release.

bpowers added 6 commits April 26, 2026 11:44
V1 of simlin-serve is intended for single-user workstations: a
developer running `npx @simlin/serve` from a terminal on their
laptop. The trust boundary is the OS user account; any process
running as the same user can already read and write the model
files directly, so a server-side bearer token does not raise the
privilege bar — it just adds friction (clients have to refresh
their config every launch, the token risks leaking via verbose
trace logs).

Strip the token surface:
- Remove src/token.rs, the rand/base64/subtle deps, the
  launch_token field on AppState, the WsParams extractor, the
  tokens_match helper, and the /api/updates token check.
- launcher::build_launch_url returns http://127.0.0.1:<port>/
  with no query string.
- main.rs no longer generates a token before binding.
- Frontend: delete launch-token.ts (capture + read), drop the
  Authorization: Bearer header from api.ts, remove the ?token=
  query string from the WS URL in ws.ts, and stop gating
  openLiveUpdates on token presence in App.tsx.
- Tests: every AppState constructor drops the field; ws_updates
  drops wrong_token / missing_token tests; api.test / App.test
  drop their bearer-header assertions; ws.test asserts the URL
  is /api/updates with no query string.

Update docs/threat-model.md to spell out the V1 stance: loopback
bind plus host- and origin-allowlist for cross-origin defense,
with multi-user shared hosts explicitly out of scope. The
removed token / token-leakage sections are deleted; the
verification table drops the bearer-token-gate row. simlin-serve
CLAUDE.md picks up the same framing in the Architecture and
Contracts sections.
The watcher's classify already determines the destination format
from the new path's extension and passes it down to
handle_model_rename, but the previous rename_entry signature
silently preserved the source format. After mv foo.xmile foo.stmx
(or any cross-extension rename) the registry held Xmile under the
new .stmx key; the next merge_disk_change called
parse_to_datamodel(canonical, Xmile, ...) against .stmx content
and either failed with HydrationFailed or produced nonsense.

Make new_format a required parameter on rename_entry so the
re-key and the format update are atomic under the same write
lock. Watcher's only caller already had the right value in scope;
existing tests updated to pass the new format through.

Test asserts an .xmile -> .stmx rename ends up with Stmx as the
recorded format on the new key.
create_new_project canonicalized + boundary-checked the file path
AFTER OpenOptions::create_new(true) had already created it. A
parent_dir whose existing directory is a symlink out of the root
would let create_new follow the symlink and land bytes outside
state.root before the post-write boundary check fired its 403; the
"best-effort cleanup" remove_file then ran, but a permission or I/O
failure during cleanup would have left the file behind.

Extract the resolve-create-target algorithm out of mcp/access.rs
into a shared `path_resolution` module and call it from both the
HTTP and MCP create paths. The helper walks up to the deepest
existing ancestor, canonicalizes that (resolving any symlinks in
the existing prefix), and rejects when the canonical resolution
escapes the root. The MCP wrapper preserves the existing
"NotFound on out-of-root" mapping so MCP clients still cannot
distinguish "exists but forbidden" from "missing".

Test (Unix-only because symlinks): create `inner/escape ->
outer/escape_target`, POST /api/projects/new with parent_dir=escape,
assert 403 and that no file ever lands at the symlink target.
The post-write canonicalize-then-strip check is kept as
defense-in-depth against TOCTOU between resolve and open.
ReadModel and EditModel both call simlin_mcp_core's
resolve_model_name, which falls back to project.models.first()
when the requested name is not literally present. Simulate took
input.model_name.unwrap_or("main") and passed that straight to
compile_project_incremental, so a single-model project whose model
is named e.g. "Population" succeeded under ReadModel/EditModel but
failed Simulate with "no model named 'main' to simulate".

Apply the same fallback before spawn_blocking. Test seeds an
sd.json with a single model named "Population", calls run() with
model_name=None, and asserts the simulation runs.
initialize unconditionally spawned forward_events_to_peer even
though peer_info().is_none() gated set_peer_info. A buggy client
that re-sent initialize on the same session got two forwarders
subscribed to the same EventBus, doubling every projectChanged /
diagnosticsChanged notification.

Track the spawn via an Arc<AtomicBool> on SimlinServeMcpServer.
peer_info().is_none() is unreliable as a "first call" probe (rmcp
may have already cached the request before our impl runs, which
broke the existing notification tests when I tried that approach
first); a server-owned compare-and-swap flag is the right idiom
and is observable across every clone of the server (rmcp's
session machinery requires Self: Clone, so the Arc shares the
flag across each clone).
…on unmount

Two small wins from the latest review pass.

WebSocket consecutiveFailures: handleMessage was the only path that
reset the counter, so a long-running connection that opens cleanly
but receives no broadcast frames (the user is just reading) walked
the failure counter up across each open/close cycle. After
MAX_CONSECUTIVE_FAILURES (10) such cycles — readily reachable across
a workday of laptop sleep / network blip / server reload events —
the socket flipped to permanent 'dead' with no recovery path.
handleOpen now resets the counter and emits 'connected' so a
quiet-but-stable connection is treated as healthy.

Editor.handleSelection deferred its onSelectionChanged callback via
setTimeout(0) so the React 19 setState commit completes before
getSelectionIdents reads the new state. The pending timer was never
canceled — when EditorHost key-swaps the Editor on a path change
(`${path}#${loadGeneration}`), the prior instance's deferred
callback fires after unmount and lands stale idents on the new
host's path. Track the timer in an instance field, cancel it in
componentWillUnmount, and supersede a pending deferral when the
selection changes again.

Add `clearTimeout` to the shared eslint globals so the new ws.ts
and Editor.tsx references lint clean (the list already had
setTimeout but missed the clearTimeout sibling).
bpowers added 2 commits April 26, 2026 12:11
…tion

Move three helpers that were duplicated across handlers.rs, mcp/access.rs,
and watcher.rs into the shared path_resolution module:

- sidecar_for_mdl: /dir/foo.mdl -> /dir/foo.sd.json
- is_mdl_extension: case-insensitive check for the .mdl suffix
- to_forward_slash: render Path as a /-separated string for the wire

The duplication has been the root cause of multiple P1 review bugs where
one consumer (the read path) applied the sidecar-preference rule and a
second (the save path, the MCP path, the watcher) silently diverged.
Centralizing the primitives is the prerequisite for eliminating the
remaining inline canonicalize-within-root and sidecar-preference logic
in subsequent commits.

The unit tests pin case-insensitive .mdl matching, multi-segment path
preservation, and dotted-stem handling so regressions surface here
rather than in the consumers.
Add resolve_existing_within_root + ResolutionError to path_resolution
and replace the four inline implementations:

- handlers::get_project   (was inline canonicalize + starts_with)
- handlers::resolve_save_path (same)
- handlers::create_new_project post-write check (same)
- mcp::access::canonicalize_within_root (was wrapper, now thin)

The variant boundary -- NotFound / OutOfRoot / IoError -- exposes
exactly the distinction every consumer makes: HTTP renders the three
as 404 / 403 / 500, MCP collapses all three to AccessError::NotFound
(intentional, so MCP clients cannot probe for files outside the root),
and the create handler's post-write check renders OutOfRoot as 403 and
the rest as 500. The shared helper means a future contributor adding a
new consumer literally cannot forget the descendant check.

The behavioural difference between the previous handlers code and the
new path is purely in the human-readable error message text -- the
HTTP status mappings and the MCP error mappings are byte-identical to
what they were before.
@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

Code Review

Reviewed the new simlin-serve crate (registry, handlers, watcher, MCP integration), the extracted simlin-mcp-core library (ProjectAccess trait, tool implementations, server handler), the refactored simlin-mcp binary, the diagram Editor.tsx changes, and the security middleware (host/origin validation).

The code is exceptionally well-engineered: extensive doc comments explaining why, the functional-core / imperative-shell pattern is followed consistently, optimistic locking is correctly handled under a single registry write lock, echo-suppression hashing closes the watcher race window, and the test coverage is dense (parity tests, race-condition tests, sidecar redirect tests, end-to-end propagation tests).

Findings

[P3] simlin-mcp FileSystemAccess::create retains the TOCTOU race that simlin-serve explicitly fixed

File: src/simlin-mcp/src/access.rs (lines 99–121)

The new simlin-serve RegistryAccess::create was rewritten to use OpenOptions::new().create_new(true) precisely because (per its own test create_is_collision_safe_under_concurrent_calls and the comment the existing exists()-then-atomic_write pattern is non-atomic) two concurrent racers passing the exists() check before either's rename completes can silently overwrite the first writer's content. The same try_existsatomic_write pattern is preserved here in the stateless filesystem impl. Because simlin-mcp is a single-client stdio binary, this race is unlikely to trigger from the MCP side itself, but a concurrent external creation (another tool or the user) between try_exists and atomic_write will silently overwrite. Severity is low — single-client semantics make it rare — but applying the same create_new(true) fix would close the gap and bring the two access impls into parity.

Overall correctness

The patch should be considered correct. The change is large but disciplined, and tests cover the high-risk surfaces (concurrency, sidecar redirection, watcher echo-suppression, MCP↔HTTP parity). The single finding above is low-severity and primarily a consistency note rather than a blocker.

bpowers added 4 commits April 26, 2026 12:21
Add apply_sidecar_preference + ResolvedKey to path_resolution and
replace the four divergent inline implementations:

- handlers::get_project (was matches!(initial_format, Mdl) +
  sidecar.is_file(); did NOT canonicalize the sidecar)
- handlers::resolve_save_path (was the same, but DID canonicalize the
  sidecar without checking the result against the root)
- mcp::access::open (canonicalized the sidecar, no root check)
- mcp::access::save (same)

The previous divergence was the bug shape the refactor exists to
prevent: a malicious or misconfigured sidecar that is a symlink whose
target lives outside the registry root would have been silently
followed by the save path (and by MCP open/save), letting writes land
outside the watched tree. The shared helper canonicalises the sidecar
AND verifies the result lives inside root_canonical; on any failure
mode it falls back to the .mdl path so the user still gets a usable
state.

The watcher's sidecar check is intentionally NOT migrated: it asks a
different question ("should I skip this .mdl event entirely?") whose
shape is binary, not "what registry key should I use?". It already
consumes the centralized sidecar_for_mdl helper from the previous
commit.

Tests: 272 lib + integration tests pass; new unit tests cover
non-mdl input pass-through, missing-sidecar pass-through, sidecar
hit, case-insensitive .mdl matching, and the symlink-out-of-root
fall-back.
Replace the leftover crate::path_resolution::CreatePathError fully-
qualified references with the path_resolution:: shorthand already in
scope from the use statement above. Pure code cleanup; no behavior
change.
The module's existing doc listed three concerns but didn't say which
function implemented each one. Now that apply_sidecar_preference and
resolve_existing_within_root have landed, name them explicitly so a
future contributor reading the module knows which primitive to call
without grepping. Also call out the trivial helpers (sidecar_for_mdl,
is_mdl_extension, to_forward_slash) and why they live here.
…LAUDE.md

Five iterations of bot review on PR #476 surfaced the same shape of
bug repeatedly: an implicit rule (sidecar-preference, canonicalize-
within-root, validation gate) applied in one consumer of a
primitive but forgotten in another. Each round found the next site
that diverged. Three doc changes capture what we learned so future
contributors don't relitigate the same class of bug.

address-feedback (both .claude/skills and .agents/skills versions):
- Add a "Pre-PR self-review" section that runs BEFORE the review
  loop. Walks the diff with reviewer eyes against four questions:
  contract edges, consumer audits when contracts change, threat-
  model alignment when auth/transport/path-validation code shifts,
  and the N>=3 duplication smell. The review loop should be
  catching things you genuinely missed, not things visible to
  anyone reading the diff.
- Inside Step 2 (Evaluate feedback), make the consumer audit an
  explicit step: when the fix changes a primitive's contract, grep
  for every other call site and apply the same change everywhere.
  This is the single biggest source of "still finding P1s after N
  iterations" — fixing one consumer and shipping leaves the
  reviewers to find the rest.
- Re-emphasise that tests should probe the CONTRACT edge (symlink
  escape, format mismatch, race window), not just the call site.

src/simlin-serve/CLAUDE.md:
- Document the path_resolution module and the rule that new
  consumers MUST call its helpers rather than inlining the rule.
  This is the structural fix for the pattern: implementing the
  rule once forecloses the "consumer X forgot it" failure mode.
- Add the same as a top-level Contract entry so it's loud.
@bpowers
Copy link
Copy Markdown
Owner Author

bpowers commented Apr 27, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6c99404414

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +231 to +235
if ext == "mdl" {
return Err(AccessError::ParseError(anyhow::anyhow!(
"Vensim .mdl files are read-only. Use ReadModel to inspect a .mdl file, \
then CreateModel to start a new .sd.json file you can edit."
)));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Remove hard-coded .mdl rejection from generic EditModel

edit_model rejects any path ending in .mdl before it even calls ProjectAccess, which makes the behavior wrong for backends that intentionally support MDL sidecar writes. In this same commit, simlin-serve's RegistryAccess::save implements .mdl -> .sd.json sidecar persistence, so MCP EditModel requests against .mdl paths will fail unconditionally despite server capabilities and documented format support. This check should be backend-specific (e.g., in the filesystem access impl) or based on opened/source format instead of raw extension.

Useful? React with 👍 / 👎.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 27, 2026

Code review

A few findings from reviewing the diff. The PR is large but the patches I read look careful overall; the items below are bugs I have high confidence the author would want to fix.

[P2] SPA drops every diagnosticsChanged, projectFocused, and selectionChanged frame as 'unknown shape' -- src/simlin-serve/web/src/ws.ts:259-280

isWsMessage only accepts projectChanged / projectRemoved / projectRenamed, but the server's WsMessage enum has six variants. events.publish is called with WsMessage::DiagnosticsChanged from diagnostics::maybe_emit_diagnostics_changed after every successful HTTP save, MCP edit, and watcher merge (handlers.rs:1032, mcp/access.rs, watcher.rs), and handle_socket at handlers.rs:1361-1369 republishes inbound ProjectFocused / SelectionChanged onto the same broadcast bus. Each connected browser receives all six variants over the WS, but handleMessage rejects three of them and emits a console.warn('UpdatesSocket: dropped frame with unknown shape', ...). Net effect: (a) the entire diagnostic-change pipeline the backend implements end-to-end is unreachable from the SPA, and (b) every focus/selection a user makes echoes back to their own tab as a logged warning.

[P2] RegistryAccess::save snapshots registry_meta.version outside the merge lock; an unsolicited VersionMismatch can fail MCP edits when expected_version is None -- src/simlin-serve/src/mcp/access.rs:276-355

registry_meta is read at lines 276-283 under a brief read lock that is then released. When the caller passes expected_version: None (the documented 'AI clients have no read-then-write ergonomics' path), line 350 falls back to registry_meta.version from that earlier snapshot. Between then and the check_increment_and_merge call at line 355, any concurrent HTTP save, watcher disk merge, or other MCP session can advance the version. The merge primitive sees entry.version != expected_version, returns VersionMismatch, and the MCP tool response surfaces a confusing version conflict even though the caller explicitly opted out of the check. The fix is either a merge_unconditional registry primitive for the None case or refetching the version inside the merge lock.

[P3] Watcher rename derives from_key without canonicalization; symlinked or case-aliased paths fall through to 'fresh Created' and lose preserved state -- src/simlin-serve/src/watcher.rs:700-722

from_key is built by state.root.join(from.strip_prefix(state.root)...) while to_key (line 710) and the registry keys (set in handle_model_change via path.canonicalize() at line 446) are canonical. On case-insensitive filesystems where the watcher reports a path with different casing than canonicalize returned, or in trees containing a symlinked subdirectory, from_key will not match the canonical registry key. rename_entry returns NotFound and the rename falls through to handle_model_change(to, ..., ChangeKind::Created) at line 732, which rebuilds the entry from scratch — losing the preserved LoroDoc, last_disk_hash, and last_diagnostic_keys the rename branch was documented to preserve. Fix is to canonicalize from (or its parent + leaf) the same way to_key does.

Overall correctness: Not blocking, but the SPA-side WS gap is the most material — the backend wires DiagnosticsChanged through after every save/merge but the browser cannot consume it. The other two findings need narrower conditions to manifest (concurrent edits at the MCP layer; symlinked/case-aliased rename targets).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant