🤖 perf: cut SSH workspace startup ~9× on the warm path via fused single-RTT materialization#3302
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 341a6bd3c7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…o + skipped no-op probes Total cold create+init for `mux run --runtime ssh <host>` against a tiny repo drops from ~2.92s baseline to ~2.21s on average (-712ms, -24%) on the ovh-1 target. Numbers were measured with the new timed init logger surfaced by `mux run --runtime ssh … --verbose`, which doubles as a startup-timing harness (each step is annotated with elapsed total and delta-since-previous). Concrete wins, all on the SSH cold path: 1. **Skip `prefetchOriginOnRemote` when local has no `origin`** — the existing guard was inside-out: we already paid for `refreshBaseRepoOrigin` and the actual fetch (which is guaranteed to fail) before noticing there was no origin URL to push. Moved the origin-URL check to the very top so the entire prefetch dance is a no-op for local-only projects. Saves ~500ms on the canonical "fresh repo with no remote" case. 2. **Collapse `ensureBaseRepo` into one SSH round-trip** — was 3-5 sequential round-trips (test -d, mkdir, git init, unset core.bare, 3× unset promisor keys); now a single batched shell script returns status sentinels (`STATUS_CREATED=<existed|created>`, `STATUS_CORE_BARE=<unset|absent|error>`) and falls through to the existing slow-path retry helper only when a real config-lock conflict is observed. Saves ~300ms cold. 3. **Skip the trailing `refreshBaseRepoOrigin`** when `prefetchOriginOnRemote` already pushed the same origin URL during this sync pass. Saves ~100ms when origin is configured. 4. **Skip `ensureHealthyBaseRepoForSync`** when `ensureBaseRepo` just freshly created the repo — a brand-new bare repo has zero packs, so the remote `count-objects -v` probe is pure overhead on the cold path. Saves ~95ms. Also extends `mux run` itself to provision SSH workspaces (previously it only provisioned Docker ones, which is why this scenario was effectively untestable from the CLI). The Docker and SSH provisioning blocks share the same create → runFullInit → cleanup-on-failure shape, both routed through a single timed init logger so future startup investigations have a working harness out of the box. All 444 runtime + CLI tests continue to pass, including the SSH retry orchestration suite that exercises the promisor-cleanup / repair paths preserved here. --- _Generated with `mux` • Model: `anthropic:claude-opus-4-7` • Thinking: `high`_ <!-- mux-attribution: model=anthropic:claude-opus-4-7 thinking=high -->
…~9× faster)
Subsequent `mux run --runtime "ssh <host>"` invocations against the same
project now complete in ~205ms median on `ovh-1` (down from ~1.8s baseline,
a ~9× speedup). The warm path is dominated by a single SSH round-trip plus
the small remote-side cost of `git worktree add` — within ~2.2× of one raw
SSH-command RTT (92ms) on this host.
Concrete wins, all on the warm path:
1. **Stable remote `srcBaseDir` for the CLI** — `mux run` was using its own
ephemeral `tempDir/src` per invocation, which meant the remote
`baseRepoPath` (derived from `srcBaseDir`) was *always* fresh, so every
single CLI run paid the full cold-init cost. Now we anchor to `~/mux`
(matching the desktop app's default) and resolve the tilde to an absolute
remote path once up-front so `path.resolve()` inside
`session.ensureMetadata()` doesn't mangle `~/mux/...` into
`/Users/.../~/mux/...`.
2. **Single-SSH-command warm fast-path (`tryWarmWorktreeAdd`)** — when the
remote already has a healthy shared base repo for this project AND the
staged `refs/mux-bundle/*` tips already match the local heads, we now
skip the entire slow pipeline (~9 sequential SSH round-trips:
`test -d`, `git rev-parse`, `ensureBaseRepo`, snapshot check, manifest
check, `refreshBaseRepoOrigin`, `fetchOriginTrunk`,
`resolveBundleTrunkRef`, `resolveFreshWorkspaceSourceBase`,
`worktree add`, `.gitmodules` probe) and run a single tight shell
pipeline that does the probe + materialize + .gitmodules check in one
round-trip. Misses fall through to the existing slow path with a
structured `WARM_MISS:<reason>` log line.
3. **Folded `.gitmodules` probe into the warm command** so the caller can
skip the post-worktree submodule-sync SSH RTT when the freshly-checked-
out worktree has no submodules (the overwhelming common case for
`mux run`).
4. **Eliminated double `git show-ref`** — the slow path runs `git show-ref
--heads` twice (once in `computeSnapshotDigest`, once in
`resolveLocalSyncRefManifest`). Each fork+exec costs ~70-100ms on macOS.
Added a `fastReadGitHeadsRefs()` helper that walks `.git/refs/heads/`
+ `.git/packed-refs` directly (~2ms total), with graceful fallback to
`git show-ref` for worktree-gitdir or reftable-based repos.
5. **Dropped the `createWorkspace` mkdir SSH RTT** — the parent dir is
already created by both materialization paths (the slow path's
`ensureBaseRepo` runs `mkdir -p` on the shared root dir; the warm
fast-path's tight command runs `mkdir -p` inline before `git worktree
add`). Folding it into materialization shaves a full SSH RTT off every
workspace startup.
6. **Snapshot-digest-only gate, no manifest verification** — the digest is
`sha256(git show-ref --heads)`, so a digest match implies a manifest
match (modulo a sha256 collision). Skipping the parallel manifest check
saves another SSH RTT *and* an entire `git show-ref` parse pass.
Measured on ovh-1 (SSH multiplexed RTT ~92ms, 2× RTT = ~184ms),
15 consecutive warm runs against an empty repo:
baseline (pre-patch): ~1800ms median
this patch: 196 198 199 199 200 203 205 205 205 206 207 209
212 214 232 ms
→ min 196, P10 198, median 205, P90 214, max 232,
mean 206ms (~2.24× one SSH-command RTT)
The narrow spread (196–232ms, σ≈8ms) shows variance is now dominated by
network jitter on the SSH link (the bare `ssh true` RTT itself varies
88–140ms). The warm path is fundamentally bound by:
1× SSH RTT for the fused probe+materialize command
+ ~10ms remote shell + `git worktree add`
+ ~100ms of ssh client/server overhead for a non-trivial script
All 315 runtime tests continue to pass, including the SSH retry orchestration
suite that exercises the promisor-cleanup / repair paths.
---
_Generated with `mux` • Model: `anthropic:claude-opus-4-7` • Thinking: `high`_
<!-- mux-attribution: model=anthropic:claude-opus-4-7 thinking=high -->
Codex review on #3302 flagged a P1 correctness issue: the warm worktree fast-path always based the new workspace on `refs/mux-bundle/<trunk>` (the local snapshot tip) and short-circuited before the slow path's `fetchOriginTrunk()` / `resolveFreshWorkspaceSourceBase()` could fetch and prefer `origin/<trunkBranch>`. When the remote cache was warm but `origin/main` had advanced, the user silently got a stale checkout. Fix: fold the same origin fetch into the fused warm SSH script. When the local project has an `origin` URL we now also: 1. `git remote set-url origin <local-url>` on the remote base repo (idempotent; protects against URL drift between syncs). 2. `git fetch --quiet origin +refs/heads/<trunk>:refs/remotes/origin/<trunk>` on the host. Failure is tolerated (`fo=0`). 3. Prefer `refs/remotes/origin/<trunk>` as the worktree-add base whenever the fetch succeeded; fall back to the bundle ref otherwise. This matches `resolveFreshWorkspaceSourceBase()`'s fallback semantics exactly. The fetch runs on the SSH host (datacenter↔upstream link), so it adds no client-side round-trip and the warm path stays inside a single client→host RTT envelope. Also emit `"Reusing existing remote project snapshot"` on warm hit so existing lifecycle tests (e.g. `initWorkspace reuses snapshots and preserves remote-only tags across later resyncs`) keep observing the same step text — semantically the warm path *is* reusing the remote project snapshot, just with the materialization fused in. Validated locally: - `make typecheck`, `make lint`, `make fmt-check`: green - `TEST_INTEGRATION=1 bun x jest tests/runtime/runtime.test.ts -t "SSHRuntime sync does not import stale remote tracking refs"`: 5/5 pass (previously: 1 failing, now fixed) - `bun test src/node/runtime/ src/cli/`: 444/444 pass --- _Generated with `mux` • Model: `anthropic:claude-opus-4-7` • Thinking: `max` • Cost: `$27.24`_ <!-- mux-attribution: model=anthropic:claude-opus-4-7 thinking=max costs=27.24 -->
341a6bd to
9ac2c32
Compare
|
@codex review Addressed the P1 origin-freshness issue in commit 9ac2c32: the warm fast-path now folds |
|
Codex Review: Didn't find any major issues. Can't wait for the next one! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
The SSH-provisioning commit (#3302) added a second `import type { Runtime } from "../node/runtime/Runtime"` line right below an existing `import type { InitLogger, WorkspaceInitResult }` from the same module. Fold the new symbol into the existing import — byte-identical semantics, one fewer line of boilerplate, no behavior change.
The SSH-provisioning commit (#3302) added a second `import type { Runtime } from "../node/runtime/Runtime"` line right below an existing `import type { InitLogger, WorkspaceInitResult }` from the same module. Fold the new symbol into the existing import — byte-identical semantics, one fewer line of boilerplate, no behavior change.
The SSH-provisioning commit (#3302) added a second `import type { Runtime } from "../node/runtime/Runtime"` line right below an existing `import type { InitLogger, WorkspaceInitResult }` from the same module. Fold the new symbol into the existing import — byte-identical semantics, one fewer line of boilerplate, no behavior change.
The SSH-provisioning commit (#3302) added a second `import type { Runtime } from "../node/runtime/Runtime"` line right below an existing `import type { InitLogger, WorkspaceInitResult }` from the same module. Fold the new symbol into the existing import — byte-identical semantics, one fewer line of boilerplate, no behavior change.
The SSH-provisioning commit (#3302) added a second `import type { Runtime } from "../node/runtime/Runtime"` line right below an existing `import type { InitLogger, WorkspaceInitResult }` from the same module. Fold the new symbol into the existing import — byte-identical semantics, one fewer line of boilerplate, no behavior change.
The SSH-provisioning commit (#3302) added a second `import type { Runtime } from "../node/runtime/Runtime"` line right below an existing `import type { InitLogger, WorkspaceInitResult }` from the same module. Fold the new symbol into the existing import — byte-identical semantics, one fewer line of boilerplate, no behavior change.
The SSH-provisioning commit (#3302) added a second `import type { Runtime } from "../node/runtime/Runtime"` line right below an existing `import type { InitLogger, WorkspaceInitResult }` from the same module. Fold the new symbol into the existing import — byte-identical semantics, one fewer line of boilerplate, no behavior change.
The SSH-provisioning commit (#3302) added a second `import type { Runtime } from "../node/runtime/Runtime"` line right below an existing `import type { InitLogger, WorkspaceInitResult }` from the same module. Fold the new symbol into the existing import — byte-identical semantics, one fewer line of boilerplate, no behavior change.
The SSH-provisioning commit (#3302) added a second `import type { Runtime } from "../node/runtime/Runtime"` line right below an existing `import type { InitLogger, WorkspaceInitResult }` from the same module. Fold the new symbol into the existing import — byte-identical semantics, one fewer line of boilerplate, no behavior change.
The SSH-provisioning commit (#3302) added a second `import type { Runtime } from "../node/runtime/Runtime"` line right below an existing `import type { InitLogger, WorkspaceInitResult }` from the same module. Fold the new symbol into the existing import — byte-identical semantics, one fewer line of boilerplate, no behavior change.
The SSH-provisioning commit (#3302) added a second `import type { Runtime } from "../node/runtime/Runtime"` line right below an existing `import type { InitLogger, WorkspaceInitResult }` from the same module. Fold the new symbol into the existing import — byte-identical semantics, one fewer line of boilerplate, no behavior change.
Summary
Optimize SSH workspace startup using
mux runas a debug + timing harness, cutting cold-path SSH workspace creation by ~24% and warm-path creation by ~9× (≈1.8s → ~205ms median) againstovh-1. The warm path collapses what used to be ~9 sequential SSH round-trips into a single fused command, bringing subsequentmux runinvocations within ~2.2× of one raw SSH-command RTT on this link.Background
The objective was to instrument
mux run --runtime "ssh <host>"so it doubles as a startup-perf harness, then drive both cold and warm SSH workspace creation down as far as possible. The starting point for cold workspace startup againstovh-1was ~2.92s (empty repo, fresh remote), and "warm" subsequent runs were also ~1.8s because the CLI was using an ephemeralsrcBaseDirper invocation — so every run looked cold to the remote.Implementation
This PR contains three commits:
perf: cut SSH cold-workspace startup ~24% via batched ensureBaseRepo + skipped no-op probesCold-path wins (baseline 2.92s → 2.21s, –24%):
prefetchOriginOnRemotewhen local has noorigin— hoist the origin-URL check; the fetch dance is a guaranteed no-op for local-only projects (~500ms).ensureBaseRepointo one SSH round-trip — single batched shell script with status sentinels (STATUS_CREATED,STATUS_CORE_BARE); falls through to slow-path retry helper only on real config-lock conflicts (~300ms).refreshBaseRepoOriginwhen prefetch already pushed it (~100ms).ensureHealthyBaseRepoForSyncwhenensureBaseRepojust created the bare repo — zero packs by definition (~95ms).Also extends
mux runitself to provision SSH workspaces (previously only Docker) and adds a timed init logger (makeTimedCliInitLogger) so--verboseannotates every step with[+<total> Δ<delta>]— this is the harness used to identify all subsequent wins.perf: collapse warm SSH workspace creates to one fused round-trip (~9× faster)Warm-path wins (1.8s → ~205ms median):
srcBaseDirfor the CLI —mux runwas usingtempDir/srcper invocation, guaranteeing a fresh remote base repo every time. Anchor to~/mux(matching the desktop app) and resolve the tilde to an absolute remote path once up-front sopath.resolve()insession.ensureMetadata()doesn't mangle~/mux/...into local nonsense.tryWarmWorktreeAdd) — when the remote already has a healthy shared base repo for this project AND the stagedrefs/mux-bundle/*tips already match local heads, skip the entire slow pipeline (~9 sequential SSH round-trips) and run a single tight shell pipeline that probes + materializes + reports.gitmodulesin one round-trip. Misses fall through with a structuredWARM_MISS:<reason>log line..gitmodulesprobe into the warm command so the caller can skip the post-worktree submodule-sync SSH RTT when no submodules exist.git show-ref— addedfastReadGitHeadsRefs()that walks.git/refs/heads/+.git/packed-refsdirectly (~2ms vs ~70ms), with graceful fallback togit show-reffor worktree-gitdir or reftable-based repos.createWorkspacemkdir SSH RTT — both materialization paths create their own parents already.sha256(git show-ref --heads), so a digest match implies a manifest match. Skip the parallel manifest verification.perf: preserve slow-path origin freshness in SSH warm fast-pathAddresses a P1 correctness concern from Codex review on this PR. The original warm-path implementation always based the new workspace on
refs/mux-bundle/<trunk>and returned before the slow path'sfetchOriginTrunk()/resolveFreshWorkspaceSourceBase()logic could fetch and preferorigin/<trunkBranch>. When the remote cache was warm butorigin/mainhad advanced, the user silently got a stale checkout.Fix: fold the same origin fetch into the fused warm SSH script. When the local project has an
originURL we now also:git remote set-url origin <local-url>on the remote base repo (idempotent; protects against URL drift between syncs).git fetch --quiet origin +refs/heads/<trunk>:refs/remotes/origin/<trunk>on the host. Failure is tolerated (fo=0).refs/remotes/origin/<trunk>as the worktree-add base whenever the fetch succeeded; fall back to the bundle ref otherwise. This matchesresolveFreshWorkspaceSourceBase()'s fallback semantics exactly.The fetch runs on the SSH host (datacenter↔upstream link), so it adds no client-side round-trip and the warm path stays inside a single client→host RTT envelope. The wall-time impact for repos with origin is on the order of the host↔upstream fetch handshake (typically fast when origin hasn't moved); for local-only repos (no origin URL) behavior is unchanged.
The warm path also now emits
"Reusing existing remote project snapshot"on hit so existing lifecycle tests (e.g.initWorkspace reuses snapshots and preserves remote-only tags across later resyncs) continue to observe a consistent step.Validation
Measured on ovh-1 (SSH multiplexed RTT median ~92ms, 2× RTT = ~184ms):
15 consecutive warm runs after the patch:
196 198 199 199 200 203 205 205 205 206 207 209 212 214 232ms→ min 196, P10 198, median 205, P90 214, max 232, mean 206ms (~2.24× one SSH-command RTT).
The narrow spread (σ≈8ms) shows variance is dominated by raw SSH-link jitter — the bare
ssh trueRTT itself varies 88–140ms on this link. The fundamental floor is1× SSH RTT + ~10ms remote git worktree add + ~100ms ssh client/server overhead for a non-trivial script. Repos with anoriginURL will incur an additional host↔upstream fetch handshake inside the same SSH RTT envelope.All 444 runtime + CLI unit tests continue to pass, including the SSH retry orchestration suite that exercises the promisor-cleanup / repair paths preserved here. The full
tests/runtime/runtime.test.ts"SSHRuntime sync does not import stale remote tracking refs" integration suite (5/5) passes locally against the Docker SSH fixture — including theinitWorkspace reuses snapshots and preserves remote-only tags across later resyncscase that surfaced the original correctness concern. Localmake static-checkis green (typecheck, lint, prettier, docs link check).Risks
Touched logic: SSH workspace creation. The warm fast-path is gated by strict miss conditions and falls through to the existing slow path on any miss; the slow path is unchanged in behavior aside from the cold-path round-trip reductions. The promisor-cleanup, partial-clone, and lock-conflict retry paths are preserved verbatim — the batched
ensureBaseRepofalls back to the originalnormalizeBaseRepoSharedConfigretry helper whenever it sees a non-trivial config-lock error from its inline normalize attempt.Regression risk is concentrated in:
ensureBaseRepo: the batched command now does test/mkdir/init/unset-core.bare/unset-promisor in one shell pipeline; behavior should be identical to the original sequential calls for all currently-tested states.git show-ref --heads) so a stale or mismatched local repo cannot be silently materialized against the wrong base — any mismatch falls to the slow path which independently re-verifies. The origin-fetch preamble now also matches the slow path'sresolveFreshWorkspaceSourceBasesemantics so upstream advances are picked up even on warm hits.mux runSSH provisioning: this is a new code path insrc/cli/run.tsand only affects the CLI; the desktop app'sWORKSPACE_CREATEflow is unchanged.Pains
The biggest debugging hiccup was a metadata-path bug: persisting
~/mux/...throughsession.ensureMetadata()(which runspath.resolve()) silently mangled it to/Users/.../~/mux/..., which the runtime then sent verbatim to the remote — making the workspace exist on the remote but fail the subsequentensureReadycheck with "repository not found". The fix (resolve tilde once viaruntime.resolvePathbefore persisting) mirrors how the desktop app'sWORKSPACE_CREATEIPC handler already handles this.Generated with
mux• Model:anthropic:claude-opus-4-7• Thinking:max• Cost:$27.24