
refactor(vm/daemon): port to Bun.serve + web-standard Request/Response#3175

Open
tlgimenes wants to merge 3 commits into main from tlgimenes/vm-start-hang

Conversation


tlgimenes commented Apr 24, 2026

What is this contribution about?

Ports the in-VM daemon from Node's http.createServer to Bun.serve with web-standard Request/Response handlers, fetch()-based reverse proxy and liveness probe, ReadableStream SSE, and AbortSignal.timeout for client timeouts (ESM imports throughout; only OS-level concerns still come from node: modules). VmBun is now attached to every VmSpec unconditionally and /opt/run-daemon.sh invokes bun directly with install-bun.service in the daemon's requires — resolving the VM start hang caused by the old Node launcher dropping PATH for the daemon's child processes. Along the way: setup re-entry guard + resume-on-restart, spawnSync-based git helper that surfaces stderr on failure, git safe.directory /app prepended, ordered SSE replay before live broadcasts, rg output limiting with SIGTERM escalation, SIGKILL escalation for stuck processes, and X-Accel-Buffering: no on the SSE stream. A new daemon-script.e2e.test.ts boots the generated script under Bun on a random port and covers auth, SSE replay + live, exec/setup 409, bash timeout, HTML bootstrap injection, chunked POST forwarding, and CORS on every branch.
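
For readers unfamiliar with the web-standard handler shape, here is a minimal sketch of what a `Request` in, `Response` out handler with bearer-token auth, CORS on every branch, and a JSON 404 could look like. It is not the daemon's actual code: the token value, the CORS header set, the `/_decopilot_vm/health` route, and the names `handle`, `tokenMatches`, and `json` are all assumptions made for illustration.

```typescript
import { timingSafeEqual } from "node:crypto";

// Hypothetical values; the real token source and header set are not shown in this PR.
const TOKEN = "example-token";
const CORS_HEADERS: Record<string, string> = {
  "Access-Control-Allow-Origin": "*",
  "Access-Control-Allow-Headers": "authorization, content-type",
};

// Constant-time comparison so the check does not reveal how much of the token matched.
function tokenMatches(header: string | null): boolean {
  const expected = new TextEncoder().encode(`Bearer ${TOKEN}`);
  const actual = new TextEncoder().encode(header ?? "");
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}

// Every JSON response carries the CORS headers, whatever the status.
function json(body: unknown, status: number): Response {
  return new Response(JSON.stringify(body), {
    status,
    headers: { ...CORS_HEADERS, "Content-Type": "application/json" },
  });
}

function handle(req: Request): Response {
  const { pathname } = new URL(req.url);
  if (req.method === "OPTIONS") {
    return new Response(null, { status: 204, headers: CORS_HEADERS });
  }
  if (!tokenMatches(req.headers.get("authorization"))) {
    return json({ error: "unauthorized" }, 401);
  }
  if (pathname === "/_decopilot_vm/health") {
    // Hypothetical route, used only to show a 200 branch.
    return json({ ok: true }, 200);
  }
  return json({ error: "not found" }, 404);
}
```

Because the handler only touches `Request` and `Response`, it can be called directly in tests without opening a socket. Under Bun it would be passed as the `fetch` option of `Bun.serve`.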

How to Test

  1. Run bun test packages/mesh-plugin-user-sandbox/server/runner/freestyle/ — 31 pass.
  2. Start the mesh dev server, open a GitHub-connected Virtual MCP, and hit VM_START on a fresh branch; the VM should boot past setup (install runs, dev server starts, preview iframe loads) instead of hanging.
  3. Restart the VM mid-setup; on resume the daemon should detect /app/.git exists, skip the clone, and continue from install — no hang, no duplicate setup.
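
Step 3 relies on two behaviors: skipping the clone when `/app/.git` already exists, and refusing a second setup while one is in flight. A rough sketch of that logic follows. `runSetup` and `setupRunning` are names taken from the commit messages, but the signatures, the step names, the `planSetup` helper, and the injected `run` callback are assumptions, not the daemon's real code.

```typescript
type Step = "clone" | "install" | "dev";

// Decide which steps remain. `hasGitDir` stands in for a check such as
// existsSync("/app/.git"); the step names are illustrative.
function planSetup(hasGitDir: boolean): Step[] {
  return hasGitDir ? ["install", "dev"] : ["clone", "install", "dev"];
}

let setupRunning = false;

// Returns 409 when a setup is already in flight; otherwise runs the
// remaining steps in order and returns 200.
async function runSetup(
  hasGitDir: boolean,
  run: (step: Step) => Promise<void>,
): Promise<number> {
  if (setupRunning) return 409; // re-entry guard
  setupRunning = true;
  try {
    for (const step of planSetup(hasGitDir)) await run(step);
    return 200;
  } finally {
    setupRunning = false; // always release the guard, even on failure
  }
}
```

The guard is set synchronously before the first `await`, so two calls issued back to back cannot both pass the check.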

Migration Notes

None. Behavior is end-to-end compatible: same /_decopilot_vm/* routes, same bearer-token auth, same base64 body wire format.

Review Checklist

  • PR title is clear and descriptive
  • Changes are tested and working
  • Documentation is updated (if needed)
  • No breaking changes

🤖 Generated with Claude Code


Summary by cubic

Ports the in-VM daemon to Bun.serve with web-standard Request/Response and a fetch()-based reverse proxy; the daemon now runs under Bun in all VMs. Fixes the VM start hang by always installing VmBun and invoking the daemon via Bun, with no route or wire-format changes.

  • Refactors

    • Replace Node server with Bun.serve; fetch() reverse proxy + HEAD probe; SSE via ReadableStream with keep-alives and X-Accel-Buffering: no.
    • Preserve base64 JSON body and consistent CORS; ordered SSE replay before live; 404 JSON for unknown daemon routes.
    • Hardening: cap rg output with SIGTERM, SIGKILL escalation for stuck kills, spawn error handling; re-entry guard + resume-on-restart; gitSync via spawnSync; git safe.directory /app.
    • Runner: attach VmBun to every VmSpec, source nvm in the launcher, run /opt/bun/bin/bun; add install-bun.service.
    • Tests/CI/deps: add Bun e2e suite (auth, SSE replay+live, exec/setup 409, bash timeout, proxy HTML injection + chunked POST, CORS), deflake by disabling auto-boot in the test harness; install ripgrep in CI; sync bun.lock (apps/mesh 2.274.0) and add optional deps @freestyle-sh/with-bun, @freestyle-sh/with-deno, @freestyle-sh/with-nodejs, freestyle-sandboxes.
  • Bug Fixes

    • Resolve VM start hang by invoking the daemon with Bun and attaching VmBun to every VmSpec (includes install-bun.service).
    • Reduce stalls with stronger spawn error handling and a more resilient upstream probe.
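
To illustrate the "ordered SSE replay before live" item, here is a small sketch of a `ReadableStream`-backed SSE response that enqueues buffered events first and only then registers for live broadcasts. The `history` buffer, the `broadcast` and `sseResponse` names, and the `Sink` type are assumptions; the keep-alive timer mentioned in the summary is left out to keep the sketch short.

```typescript
type Sink = { enqueue(chunk: Uint8Array): void };

const encoder = new TextEncoder();
const history: string[] = []; // events seen so far, replayed to new clients
const live = new Set<Sink>(); // controllers of currently connected clients

// Encode one event in the SSE wire format.
function frame(data: string): Uint8Array {
  return encoder.encode(`data: ${data}\n\n`);
}

// Record the event, then push it to every connected client.
function broadcast(data: string): void {
  history.push(data);
  for (const sink of live) sink.enqueue(frame(data));
}

function sseResponse(): Response {
  let self: Sink | undefined;
  const stream = new ReadableStream<Uint8Array>({
    start(controller) {
      self = controller;
      // Replay first...
      for (const past of history) controller.enqueue(frame(past));
      // ...and only then join the live set, so a client never sees a live
      // event ahead of the history that preceded it.
      live.add(controller);
    },
    cancel() {
      if (self) live.delete(self);
    },
  });
  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      "X-Accel-Buffering": "no", // ask buffering proxies to flush each chunk
    },
  });
}
```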

Written for commit 6d0c649. Summary will update on new commits.

Ports the in-VM daemon from Node's http.createServer to Bun.serve with
web-standard Request/Response handlers, fetch()-based reverse proxy and
liveness probe, ReadableStream SSE, and AbortSignal.timeout for client
timeouts. Top-level imports are ESM; only OS-level concerns (fs, path,
child_process, crypto.timingSafeEqual) still come from node: modules.
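
As a rough illustration of the `fetch()`-based liveness probe with `AbortSignal.timeout`, a probe could look like the sketch below. The function name `probeUpstream`, the default timeout, and the choice to treat any HTTP status as "alive" are assumptions rather than the daemon's confirmed behavior.

```typescript
// Resolves true when the upstream answers at all (any status), and false on
// connection errors or when the timeout fires.
async function probeUpstream(url: string, timeoutMs = 1000): Promise<boolean> {
  try {
    await fetch(url, { method: "HEAD", signal: AbortSignal.timeout(timeoutMs) });
    return true;
  } catch {
    // Network failure, refused connection, or abort from the timeout signal.
    return false;
  }
}
```

`AbortSignal.timeout` removes the need for a manual `AbortController` plus `setTimeout` pair, and the timer does not have to be cleared by hand.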

Runner-side: VmBun is now attached to every VmSpec unconditionally (not
only when runtime === "bun"), /opt/run-daemon.sh invokes bun directly
(with nvm sourced first so child processes inherit corepack/node on
PATH), and install-bun.service joins the daemon service's requires/after
list. Resolves the VM start hang caused by the old Node launcher
dropping PATH for the daemon's child processes.

Setup hardening baked in along the way: re-entry guard + resume-on-
restart in runSetup, SSE replay ordered before live-broadcast
registration, spawnSync-based gitSync helper that surfaces stderr on
failure, git safe.directory /app prepended so git 2.35+ can't trip on
dubious-ownership, rg output limiting with SIGTERM escalation so large
result sets can't wedge the pipe, SIGKILL escalation for stuck
processes, and X-Accel-Buffering: no on the SSE stream so edge proxies
flush chunks immediately.
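
The output-limiting and signal-escalation items can be sketched roughly as follows. This is not the daemon's code: `makeCappedCollector`, `terminate`, the `Killable` interface, the injectable `schedule` parameter, and the use of character counts instead of bytes are all assumptions chosen to keep the example small and testable.

```typescript
interface Killable {
  kill(signal: "SIGTERM" | "SIGKILL"): boolean;
}

type Schedule = (fn: () => void, ms: number) => void;

// Ask politely first, then force-kill if the process is still alive after
// the grace period. `schedule` is injectable so tests can avoid real timers.
function terminate(
  child: Killable,
  isAlive: () => boolean,
  graceMs: number,
  schedule: Schedule = (fn, ms) => { setTimeout(fn, ms); },
): void {
  child.kill("SIGTERM");
  schedule(() => {
    if (isAlive()) child.kill("SIGKILL");
  }, graceMs);
}

// Collect output up to maxChars; on overflow, truncate and fire onLimit once.
function makeCappedCollector(maxChars: number, onLimit: () => void) {
  let out = "";
  let tripped = false;
  return {
    push(chunk: string): void {
      if (tripped) return; // drop everything after the cap was hit
      const room = maxChars - out.length;
      out += chunk.slice(0, room);
      if (chunk.length > room) {
        tripped = true;
        onLimit();
      }
    },
    text: (): string => out,
  };
}
```

With a real child from `node:child_process`, the collector's `push` would be fed from the `stdout` `data` event, and `isAlive` could be `() => child.exitCode === null`.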

New daemon-script.e2e.test.ts boots the generated script under Bun on a
random port and exercises auth, SSE replay + live broadcast, exec/setup
409 re-entry, bash timeout, reverse-proxy HTML bootstrap injection,
chunked POST forwarding, and CORS headers on every response branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

🧪 Benchmark

Should we run the Virtual MCP strategy benchmark for this PR?

React with 👍 to run the benchmark.

| Reaction | Action |
| --- | --- |
| 👍 | Run quick benchmark (10 & 128 tools) |

Benchmark will run on the next push after you react.

@github-actions

github-actions Bot commented Apr 24, 2026

Release Options

Suggested: Patch (2.274.1) — based on refactor: prefix

React with an emoji to override the release type:

| Reaction | Type | Next Version |
| --- | --- | --- |
| 👍 | Prerelease | 2.274.1-alpha.1 |
| 🎉 | Patch | 2.274.1 |
| ❤️ | Minor | 2.275.0 |
| 🚀 | Major | 3.0.0 |

Current version: 2.274.0

Note: If multiple reactions exist, the smallest bump wins. If no reactions, the suggested bump is used (default: patch).

tlgimenes and others added 2 commits April 24, 2026 13:34
…encies

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two exec/setup tests raced against the daemon's auto-boot runSetup() —
the clone to invalid.example.com keeps setupRunning=true for tens of
milliseconds, long enough that on CI both the "returns 200" and
"returns [200, 409]" tests saw 409 on every call. Strip the boot
runSetup() out of the generated script in the test fixture so tests
drive setup explicitly and the Bun.serve handler ordering makes the
concurrent race deterministic.

The grep/glob test spawns rg, which isn't on bare Ubuntu runners —
install ripgrep in the test workflow and guard the test with
it.skipIf(!hasRipgrep) so dev machines without it don't fail.
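
One plausible way to compute the `hasRipgrep` flag used in that guard is to try running `rg --version` once at module load. This is a sketch under that assumption; the PR does not show how the flag is actually derived.

```typescript
import { spawnSync } from "node:child_process";

// spawnSync reports a missing binary through `error` (with a null `status`),
// so both fields are checked before treating ripgrep as available.
const rgProbe = spawnSync("rg", ["--version"], { stdio: "ignore" });
const hasRipgrep: boolean = !rgProbe.error && rgProbe.status === 0;
```

The flag can then gate the test at definition time, so machines without ripgrep report the grep/glob test as skipped rather than failed.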

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>