
feat(agents-runtime): Sandbox primitive + Docker/E2B providers + sandbox profile picker#4369

Merged: msfstef merged 12 commits into main from msfstef/agent-sandboxing-1 on Jun 2, 2026

Conversation

@msfstef
Contributor

@msfstef msfstef commented May 20, 2026

Summary

Adds the Sandbox primitive to the agents runtime — a pluggable abstraction that isolates the filesystem, process, and network operations performed by LLM-driven tool calls — and wires it end-to-end through the runtime, agents-server, desktop, and new-session UI.

The primitive

@electric-ax/agents-runtime/sandbox exposes a deliberately small Sandbox interface: exec, FS methods (readFile/writeFile/mkdir/readdir/exists/remove/stat), fetch (egresses through the sandbox's own network), and dispose. SandboxError carries a policy | runtime | unavailable kind. Containment is documented per concern rather than promised uniformly — writes contained on every provider; reads contained on unrestricted/docker; in-workspace symlink escapes rejected on unrestricted. name is a free-form provider id for logs, not a capability discriminator.
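As a rough illustration of the surface described above, here is a minimal TypeScript sketch. The method names and the three error kinds come from the description; every signature, the `ExecResult` shape, and the `describeFailure` helper are assumptions made for illustration and may not match the real package.

```typescript
// Hypothetical shapes: the PR names these members but not their signatures.
type SandboxErrorKind = 'policy' | 'runtime' | 'unavailable'

class SandboxError extends Error {
  readonly kind: SandboxErrorKind
  constructor(kind: SandboxErrorKind, message: string) {
    super(message)
    this.name = 'SandboxError'
    this.kind = kind
  }
}

interface ExecResult {
  exitCode: number
  stdout: string
  stderr: string
}

interface Sandbox {
  readonly name: string // free-form provider id for logs, not a capability flag
  exec(command: string): Promise<ExecResult>
  readFile(path: string): Promise<string>
  writeFile(path: string, content: string): Promise<void>
  mkdir(path: string): Promise<void>
  readdir(path: string): Promise<string[]>
  exists(path: string): Promise<boolean>
  remove(path: string): Promise<void>
  stat(path: string): Promise<{ size: number; isDirectory: boolean }>
  fetch(url: string): Promise<{ status: number; body: string }>
  dispose(): Promise<void>
}

// Duck-typed guard so the check does not depend on the prototype chain.
function isSandboxError(err: unknown): err is SandboxError {
  return err instanceof Error && err.name === 'SandboxError' && 'kind' in err
}

// Callers branch on the error kind, never on sandbox.name.
function describeFailure(err: unknown): string {
  if (isSandboxError(err)) {
    switch (err.kind) {
      case 'policy':
        return `blocked by sandbox policy: ${err.message}`
      case 'unavailable':
        return `sandbox unavailable: ${err.message}`
      case 'runtime':
        return `sandbox runtime error: ${err.message}`
    }
  }
  return `unexpected error: ${String(err)}`
}
```

Branching on `kind` keeps tool code independent of the provider, which is consistent with `name` being a label for logs only.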

Providers

  • unrestrictedSandbox — in-process pass-through over node:fs / child_process; the built-in default. A single-tenant, trusted-code default: the tool layer contains reads/writes to the workspace and rejects symlink escapes, but it shares the host FS/PID namespace and is not a containment boundary.
  • dockerSandbox — hardened container isolation via dockerode (optional peer dep): CapDrop ALL, no-new-privileges, no docker socket, pids/mem/cpu limits. deny-all ⇒ NetworkMode=none (the hard network boundary); any other policy gets a bridge, where the allowlist + SSRF guard are fetch-tool-only surface protection, not an exec/bash egress boundary. Exported under /sandbox/docker so callers needing only unrestricted don't pull dockerode.
  • remoteSandbox({provider: 'e2b'}) — first-class adapter for E2B's npm SDK (optional peer dep): reattaches to a shared workspace by key and defers lifecycle to the platform (see below). The RemoteSandboxClient interface makes adding Vercel/Daytona/etc. mechanical.
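To make the Docker hardening concrete, the sketch below builds a `HostConfig`-style object using Docker Engine API field names (`CapDrop`, `SecurityOpt`, `NetworkMode`, `PidsLimit`, `Memory`, `NanoCpus`, `Binds`). The function name, the policy type, and the specific limit values are assumptions for illustration; the PR does not state the limits it uses, and the real provider may assemble its config differently.

```typescript
// Hypothetical policy names, mirroring the three policies the PR mentions.
type NetworkPolicy = 'deny-all' | 'allow-all' | 'allowlist'

interface HardenedHostConfig {
  CapDrop: string[]
  SecurityOpt: string[]
  NetworkMode: string
  PidsLimit: number
  Memory: number
  NanoCpus: number
  Binds: string[]
}

function hardenedHostConfig(policy: NetworkPolicy, workspaceBind: string): HardenedHostConfig {
  // Never hand the container a way to drive the host daemon.
  if (workspaceBind.indexOf('docker.sock') !== -1) {
    throw new Error('refusing to mount the docker socket')
  }
  return {
    CapDrop: ['ALL'], // drop every Linux capability
    SecurityOpt: ['no-new-privileges'], // block setuid-style privilege escalation
    // Only deny-all is a hard network boundary; other policies get a bridge.
    NetworkMode: policy === 'deny-all' ? 'none' : 'bridge',
    PidsLimit: 256, // illustrative limit (assumption)
    Memory: 512 * 1024 * 1024, // illustrative limit in bytes (assumption)
    NanoCpus: 1e9, // one CPU, in units of 1e-9 CPUs (assumption)
    Binds: [workspaceBind],
  }
}
```

Note that a plain substring check on the bind string is weaker than what the PR describes: a later commit resolves mount host paths with realpath before the socket check, so a symlink cannot smuggle the socket in.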

An earlier iteration shipped a nativeSandbox provider (Seatbelt/bubblewrap via @anthropic-ai/sandbox-runtime). It was removed in favour of Docker as the hardened-isolation path; the dependency is gone.

Lifecycle & identity

resolveSandboxIdentity derives three orthogonal facts from an entity's sandbox config + the live wake:

  • Identity — an explicit cross-entity key, or a scope shorthand ('entity' default ⇒ entityUrl; 'wake' ⇒ entityUrl#wakeId for full per-wake isolation). "Full isolation" is just a unique key, never a separate code path.
  • Durability (persistent) — selects the owner's idle teardown: preserve (stop/suspend, reattachable) vs wipe (remove/kill).
  • Ownership (owner) — an owner creates and governs teardown; a non-owner (an inherit spawn) only attaches to an already-live sandbox and never conjures a fresh one.

Teardown is per-key locked, refcounted, and debounced, with deterministic container/workspace naming so that a cold-started host reattaches by key. processWake constructs the sandbox once per wake-session and disposes it in the outer finally block, so handlers must not call dispose() themselves.
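The identity rules can be sketched as a small pure function. The real `resolveSandboxIdentity` signature is not shown in this PR, so the config shape, the function name `resolveIdentitySketch`, the `persistent` default of `false`, and the treatment of `inherit` as simply "not an owner" are all assumptions. In particular, the sketch does not model how an inheriting spawn takes over its parent's key.

```typescript
// Hypothetical config and result shapes for illustration only.
interface SandboxConfigSketch {
  key?: string
  scope?: 'entity' | 'wake'
  persistent?: boolean
  inherit?: boolean
}

interface SandboxIdentitySketch {
  key: string // identity: which sandbox this wake converges on
  persistent: boolean // durability: preserve vs wipe on idle teardown
  owner: boolean // ownership: may create, and governs teardown
}

function resolveIdentitySketch(
  config: SandboxConfigSketch,
  entityUrl: string,
  wakeId: string
): SandboxIdentitySketch {
  const scope = config.scope ?? 'entity'
  // An explicit key always wins; otherwise the scope shorthand derives one.
  const key = config.key ?? (scope === 'wake' ? `${entityUrl}#${wakeId}` : entityUrl)
  return {
    key,
    persistent: config.persistent ?? false, // default is an assumption
    owner: !config.inherit, // an inherit spawn only attaches
  }
}
```

Because per-wake isolation falls out of the key alone, the same attach-or-create path can serve every scope.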

Sandbox profiles (advertise / validate / pick)

  • Runtimes register named SandboxProfiles (name, label, description?, remote?, local factory). Built-ins: local (always), docker (only when the Docker daemon is reachable), and e2b (only when E2B_API_KEY is set and the optional e2b dep is installed) — so the UI never offers a non-functional choice.
  • The runtime advertises the descriptive fields (not the factory closures) to the agents-server via runner registration. New migration 0010_sandbox_profiles adds runners.sandbox_profiles and entities.sandbox.
  • Spawn requests carry the sandbox selection; the server validates the chosen profile against the target runner's advertised set (or, for unpinned dispatch, a tenant-wide check) and rejects unserviceable choices up front.
  • The new-session UI reads the selected runner's advertised profiles and renders a picker; the sidebar surfaces per-session runner/sandbox badges and can group sessions by runner.
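The server-side check described above can be pictured as follows. `SandboxProfileAdvertisement` and its fields are named in this PR, but `validateSandboxSpawn`, its parameters, and the error strings are hypothetical; the real logic is spread across helpers in routing/sandbox.ts and also handles the tenant-wide check for unpinned dispatch, which this sketch omits.

```typescript
interface SandboxProfileAdvertisement {
  name: string
  label: string
  description?: string
  remote?: boolean
}

type SpawnCheck = { ok: true } | { ok: false; reason: string }

function validateSandboxSpawn(
  chosenProfile: string | undefined,
  advertised: SandboxProfileAdvertisement[],
  shared: boolean,
  pinnedToSingleRunner: boolean
): SpawnCheck {
  // No explicit choice: the runtime falls back to its default sandbox.
  if (chosenProfile === undefined) return { ok: true }

  const profile = advertised.filter((p) => p.name === chosenProfile)[0]
  if (profile === undefined) {
    return { ok: false, reason: `runner does not advertise sandbox profile "${chosenProfile}"` }
  }
  // A shared local sandbox lives on one host, so collaborators must be pinned.
  // A shared remote sandbox is reachable from any runner, so the guard is skipped.
  if (shared && !profile.remote && !pinnedToSingleRunner) {
    return { ok: false, reason: 'shared local sandbox requires a single pinned runner' }
  }
  return { ok: true }
}
```

Rejecting at spawn time means an unserviceable choice fails before any wake is dispatched.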

E2B remote provider — shareable, persistent, desktop-ready

remoteSandbox({provider: 'e2b'}) is a first-class shareable provider, mirroring how the Docker provider handles shared/persistent containers:

  • Reattach by key. A shared sandbox is tagged with sandboxKey (e2b metadata); a wake looks it up via list + connect (which auto-resumes a paused VM) and only creates one when none is alive — so collaborators and later wakes, even on a freshly cold-started host, converge on the same workspace. A cross-host create race resolves deterministically (oldest wins). Private (per-entity) sandboxes are created fresh per wake.
  • Platform-deferred lifecycle. Shared sandboxes are created with lifecycle: { onTimeout: 'pause' } and kept alive by a setTimeout heartbeat while a wake holds them (e2b's timeout is absolute, not idle, so activity doesn't refresh it). dispose() just stops the heartbeat — the platform auto-suspends the VM on idle (filesystem + memory preserved, reattachable for e2b's paused-retention window) with no explicit teardown and no cross-host refcount. Private sandboxes still kill() on dispose.
  • Co-location guard relaxed for remote. A per-profile remote flag flows runtime → runner advertisement → server. A shared local sandbox still requires its collaborators to be pinned to a single runner (the container lives on one host); a shared remote sandbox is reachable from any runner, so the single-runner guard is skipped for it.
  • Desktop. The Electron app offers the e2b profile gated on an E2B_API_KEY credential (stored alongside the Anthropic/OpenAI/Brave keys and mirrored into the runtime env on save), and externalizes the optional e2b dep from the Electron main bundle. The new-session picker keeps an explicit profile choice across runner re-advertisement and hides the working-directory control for remote profiles. Horton reports and reads AGENTS.md from the sandbox's own working directory (/work in the VM) rather than a host path, so the model never sees paths its tools can't reach.
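The "oldest wins" rule for a cross-host create race can be sketched as a deterministic selection over the listed sandboxes. `ListedSandbox` and `pickSharedSandbox` are hypothetical names, not the E2B SDK's types, and the tie-break on id is an added assumption so that two hosts with identical timestamps still agree.

```typescript
// Hypothetical projection of a listed sandbox; not the E2B SDK's own type.
interface ListedSandbox {
  sandboxId: string
  startedAt: Date
  metadata: { [key: string]: string }
}

// Returns the sandbox every host should converge on, or undefined if the
// caller should create a fresh one.
function pickSharedSandbox(listed: ListedSandbox[], sandboxKey: string): ListedSandbox | undefined {
  const candidates = listed.filter((s) => s.metadata['sandboxKey'] === sandboxKey)
  candidates.sort((a, b) => {
    const byAge = a.startedAt.getTime() - b.startedAt.getTime()
    if (byAge !== 0) return byAge // oldest first
    // Tie-break on id (assumption) so the order is total.
    return a.sandboxId < b.sandboxId ? -1 : a.sandboxId > b.sandboxId ? 1 : 0
  })
  return candidates[0]
}
```

Since every host applies the same ordering to the same listing, two racing creators pick the same survivor without any cross-host coordination.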

Tool refactor + hardening (folded in)

  • All tool factories (createFetchUrlTool, read/write/edit, bash) now require a Sandbox parameter and route through it.
  • bash no longer forwards process.env to children — removes the trivial env-dump leak of secrets like $ANTHROPIC_API_KEY. (The host-sharing unrestricted provider still can't fully contain secrets, e.g. via /proc/<ppid>/environ; use docker/remote for untrusted or multi-tenant entities.)
  • read/write/edit reject symlink escapes from the workspace — enforced in the sandbox (unrestricted.resolveWithin, the CVE-2025-53109/53110-shape defense) and surfaced through the tools.
  • docker exec polls inspect() until the exec is reaped, so a cleanly-exited command never returns a transient null exit code.
  • the boot orphan sweep (sweepOrphanedDockerSandboxes) reclaims only exited ephemeral leftovers — it never force-removes a running container (possible live sibling on a shared daemon) or a persistent one (meant to be reattached by key).
  • the docker fetch SSRF guard canonicalizes encoded IP literals (decimal/hex integer, ::ffff:-mapped, bracketed IPv6) so they can't bypass the private/link-local/metadata check.
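To illustrate why encoded IP literals need canonicalizing before a private-range check, here is a minimal sketch covering IPv4 only. It parses the classic `inet_aton` forms (decimal or hex integer, octal parts, shorthand such as `127.1`) into a 32-bit value and then tests a few well-known private ranges. The function names and the range list are assumptions for illustration; this is not the code in net-policy.ts, and it ignores IPv6 and DNS resolution.

```typescript
// Parse one inet_aton component: 0x-prefixed hex, leading-zero octal, or decimal.
function parsePart(part: string): number | null {
  if (/^0x[0-9a-f]+$/i.test(part)) return parseInt(part.slice(2), 16)
  if (/^0[0-7]*$/.test(part)) return parseInt(part, 8)
  if (/^[1-9][0-9]*$/.test(part)) return parseInt(part, 10)
  return null
}

// Canonicalize a host string to a 32-bit IPv4 value, or null if it is not
// an IPv4 literal in any inet_aton form.
function parseInetAton(host: string): number | null {
  const parts = host.split('.')
  if (parts.length > 4) return null
  let value = 0
  let index = 0
  for (const part of parts) {
    const n = parsePart(part)
    if (n === null) return null
    if (index === parts.length - 1) {
      // The last part fills all remaining bytes (e.g. 127.1 => 127.0.0.1).
      if (n > Math.pow(256, 4 - index) - 1) return null
      value += n
    } else {
      if (n > 255) return null
      value += n * Math.pow(256, 3 - index)
    }
    index++
  }
  return value
}

// Illustrative subset of ranges a guard would block (assumption).
function isPrivateIPv4(value: number): boolean {
  const a = Math.floor(value / 16777216)
  const b = Math.floor(value / 65536) % 256
  return (
    a === 0 || a === 10 || a === 127 ||
    (a === 169 && b === 254) || // link-local, includes cloud metadata
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  )
}

function isBlockedHost(host: string): boolean {
  const value = parseInetAton(host)
  return value !== null && isPrivateIPv4(value)
}
```

A guard that only pattern-matches dotted quads would let `2130706433` or `0177.0.0.1` through, even though both resolve to 127.0.0.1.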

Built-in entities (Horton, Worker) default to unrestrictedSandbox via chooseDefaultSandbox(workingDirectory). Stronger isolation is opt-in by selecting the docker/e2b profile or constructing dockerSandbox / remoteSandbox directly.

What this primitive is and is not

Targets host isolation for LLM-driven tool calls (cwd escape, env-var exfil, arbitrary network egress, symlink traversal). It does not address prompt-injection-driven misuse of otherwise-legitimate tools.

Provider-specific limitations (documented in the interface, not promised uniformly):

  • unrestrictedSandbox is not a containment boundary — it shares the host FS/PID namespace. Tool-layer policy (workspace + symlink containment, host-env scrubbing) shrinks the blast radius but does not stop host-level reads (e.g. /proc) or SSRF from fetch_url.
  • docker only hard-enforces deny-all (NetworkMode=none) and allow-all; allowlist is enforced host-side at the fetch tool only, and code run via exec/bash has direct bridge egress.
  • sandbox.fetch() on remoteSandbox runs an HTTP client inside the VM via exec, so egress is governed by the VM's network controls.
  • e2b shared sandboxes assume sequential collaboration (parent → subagent): the provider offers no cross-host coordination, and pausing disconnects any other live client.

Test plan

  • Cross-provider conformance suite pins the Sandbox contract across unrestricted / remote (in-memory fake of the RemoteSandboxClient SDK contract) / docker (gated on daemon availability).

  • Per-provider suites: docker lifecycle + keyed reattach + scoped orphan sweep (running/persistent preserved, exited-ephemeral reclaimed); unrestricted containment + tool-layer symlink safety; net-policy SSRF incl. encoded-literal canonicalization; profiles; tool-refactor.

  • E2B (mock-based, no live account in CI): reattach-by-key, keep-alive heartbeat, suspend-vs-kill dispose — sandbox-remote.test.ts.

  • Server-side spawn validation (electric-agents-sandbox-spawn.test.ts) incl. a shared remote profile bypassing the single-runner guard while a shared local one still requires pinning; runners-router.test.ts round-trips the remote flag.

  • Verified locally: agents-runtime (724 tests) + agents-server + agents suites green; all packages typecheck clean; docker integration suite (incl. new exec-reap + scoped-sweep tests) green against a live daemon.

  • CI matrix exercises the Docker path on Linux

  • Manual smoke test of remoteSandbox({provider: 'e2b'}) against a real E2B account (verified on desktop: shared sandbox resolves to remote:e2b at /work, reattaches across wakes)

🤖 Generated with Claude Code

@msfstef msfstef self-assigned this May 20, 2026
@msfstef msfstef force-pushed the msfstef/agent-sandboxing-1 branch from c6a9ffc to 91303cc Compare May 20, 2026 09:02
@codecov

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 73.09021% with 701 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.77%. Comparing base (b2ddd59) to head (b0b85e8).
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| packages/agents-runtime/src/sandbox/remote/e2b.ts | 40.42% | 168 Missing ⚠️ |
| packages/agents-runtime/src/sandbox/docker.ts | 88.30% | 84 Missing ⚠️ |
| ...-server-ui/src/components/views/NewSessionView.tsx | 0.00% | 83 Missing ⚠️ |
| ...s-server-ui/src/components/EntityRuntimeBadges.tsx | 0.00% | 42 Missing ⚠️ |
| packages/agents-runtime/src/process-wake.ts | 40.57% | 41 Missing ⚠️ |
| ...gents-server-ui/src/components/SidebarViewMenu.tsx | 0.00% | 35 Missing ⚠️ |
| ...ackages/agents-runtime/src/sandbox/unrestricted.ts | 87.54% | 33 Missing ⚠️ |
| packages/agents-runtime/src/sandbox/docker/fs.ts | 88.84% | 31 Missing ⚠️ |
| packages/agents-runtime/src/sandbox/remote.ts | 87.02% | 24 Missing ⚠️ |
| packages/agents/src/bootstrap.ts | 45.45% | 24 Missing ⚠️ |

... and 18 more
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #4369       +/-   ##
===========================================
- Coverage   85.41%   60.77%   -24.65%     
===========================================
  Files           2      330      +328     
  Lines          48    34902    +34854     
  Branches       11     9624     +9613     
===========================================
+ Hits           41    21212    +21171     
- Misses          7    13672    +13665     
- Partials        0       18       +18     
| Flag | Coverage Δ |
|---|---|
| packages/agents | 69.69% <49.12%> (?) |
| packages/agents-mcp | 77.54% <ø> (?) |
| packages/agents-mobile | 85.41% <ø> (ø) |
| packages/agents-runtime | 81.99% <81.25%> (?) |
| packages/agents-server | 75.51% <78.66%> (?) |
| packages/agents-server-ui | 6.18% <18.56%> (?) |
| packages/electric-ax | 46.33% <ø> (?) |
| packages/experimental | 87.73% <ø> (?) |
| packages/react-hooks | 86.48% <ø> (?) |
| packages/start | 82.83% <ø> (?) |
| packages/typescript-client | 94.39% <ø> (?) |
| packages/y-electric | 56.05% <ø> (?) |
| typescript | 60.77% <73.09%> (-24.65%) ⬇️ |
| unit-tests | 60.77% <73.09%> (-24.65%) ⬇️ |


@github-actions
Contributor

github-actions Bot commented May 20, 2026

Electric Agents Desktop Builds

Build artifacts for commit 6f1b638.

| Platform | Status | Artifact |
|---|---|---|
| macOS Apple Silicon | Passed | DMG |
| macOS Intel | Passed | DMG |
| Windows x64 | Passed | Installer |
| Linux x64 | Passed | AppImage / deb |

Workflow run

@netlify

netlify Bot commented May 20, 2026

Deploy Preview for electric-next ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | b0b85e8 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/electric-next/deploys/6a197dfbc7d4b00008614377 |
| 😎 Deploy Preview | https://deploy-preview-4369--electric-next.netlify.app |

@msfstef msfstef force-pushed the msfstef/agent-sandboxing-1 branch 2 times, most recently from b4082a4 to 91b0613 Compare May 25, 2026 08:58
@msfstef msfstef changed the title from "feat(agents-runtime): Sandbox primitive + native (Seatbelt/bwrap) and E2B remote providers" to "feat(agents-runtime): Sandbox primitive + Docker/E2B providers + sandbox profile picker" May 25, 2026
@msfstef msfstef force-pushed the msfstef/agent-sandboxing-1 branch 3 times, most recently from 3b91f17 to 53c9999 Compare May 27, 2026 08:45
@msfstef msfstef added the claude label May 27, 2026
@msfstef msfstef marked this pull request as ready for review May 27, 2026 08:52
@claude

claude Bot commented May 27, 2026

Claude Code Review

Summary

Iteration 10 picks up the two new commits (25b339b0e, 6f1b6388a) that landed after iteration 9 (2026-06-02) — they close out the one Important finding and all four carried-forward Suggestions from that review. No new issues introduced. The PR is in good shape.

What is Working Well

  • IPv6 ULA SSRF false-positive fixed (packages/agents-runtime/src/sandbox/docker/net-policy.ts:135-141). The check now requires the colon (regex anchored as ^f[cd][0-9a-f] plus a zero-to-two hex-digit run plus :), so fc2.com, fda.gov, fdrive.com, fcdomain.com, fc-bayern.com (all popular public hostnames) correctly pass while fc00::1, fd00::1, fdab:1234::1 are still flagged as ULA. The zero-to-two repetition cleanly covers 1- to 4-character first hextets across the entire fc00::/7 range (fc::, fc00::, fcff::, fdab::). Regression tests in sandbox-docker-net-policy.test.ts cover both directions.
  • assertAbsolutePosixWorkingDirectory turns the implicit invariant explicit (packages/agents-runtime/src/sandbox/path-containment.ts:23-33), and the remote provider now calls it at construction (packages/agents-runtime/src/sandbox/remote.ts:92). Docker keeps its existing inline startsWith("/") guard; both reject relative workingDirectory consistently, just via slightly different idioms.
  • applyInheritedSandbox precedence is now documented (packages/agents-server/src/routing/sandbox.ts:78-83) — clearly states that inherit: true takes the parent identity AND durability wholesale, sibling persistent is intentionally ignored, and the schema permits the combination because precedence is resolved here.
  • Doc-comment drift fixed: NewSessionView.tsx:207-210 now mentions the explicit-user-choice branch, and the sandbox-profiles comment at :241-243 accurately says order is preserved (no longer sorted).
  • EntityTimeline.tsx:1526-1528 collapses the three sibling <Text> nodes into a single template literal — fewer nodes, same visual.
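The ULA fix in the first bullet can be reproduced with a short sketch. The regex below is one reading of the pattern the review describes in words (`f`, then `c` or `d`, then zero to two hex digits, then a colon); the function name `looksLikeUlaLiteral` and the bracket stripping are assumptions, and the actual code in net-policy.ts may differ.

```typescript
// One reading of the reviewed pattern: the first hextet starts with fc or fd
// and must be terminated by a colon, so DNS names can never match.
const ULA_LITERAL = /^f[cd][0-9a-f]{0,2}:/i

function looksLikeUlaLiteral(host: string): boolean {
  // Accept bracketed literals such as [fc00::1] (bracket handling is an assumption).
  const bare = host.charAt(0) === '[' && host.charAt(host.length - 1) === ']'
    ? host.slice(1, -1)
    : host
  return ULA_LITERAL.test(bare)
}

// Public hostnames that the old startsWith('fc') / startsWith('fd') check denied.
const publicHosts = ['fc2.com', 'fda.gov', 'fdrive.com', 'fcdomain.com', 'fc-bayern.com']
// IPv6 literals that must still be treated as private.
const privateLiterals = ['fc00::1', 'fd00::1', 'fdab:1234::1']

const wronglyBlocked = publicHosts.filter(looksLikeUlaLiteral)
const wronglyAllowed = privateLiterals.filter((h) => !looksLikeUlaLiteral(h))
```

With this pattern both `wronglyBlocked` and `wronglyAllowed` come out empty for the hostnames and literals listed in the review.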

Issues Found

Critical (Must Fix)

None.

Important (Should Fix)

None.

Suggestions (Nice to Have)

None at this iteration. The DNS-to-private-IP SSRF gap documented in net-policy.ts:114-118 remains a known limitation (closing it would need host-side resolution + per-hop IP pinning); still worth filing as a tracked follow-up so the gap does not get lost as more sandbox providers land.

Issue Conformance

No linked issue. PR description, conversation, and code stay in sync — samwillis 2026-05-29 post-rebase notes match what is actually on disk, and the two latest commits each map to a previous-review finding with no scope creep.

Previous Review Status

All five iteration-9 items addressed:

  • Important — IPv6 ULA false-positive → fixed in 25b339b0e with regression tests.
  • Suggestion — applyInheritedSandbox silent overrides → resolved by documentation note in 6f1b6388a.
  • Suggestion — implicit POSIX-absolute invariant on workingDirectory → resolved by assertAbsolutePosixWorkingDirectory + remote provider call site in 6f1b6388a (docker already guarded inline).
  • Suggestion — stale NewSessionView doc-comment → refreshed in 6f1b6388a.
  • Suggestion — three-sibling <Text> in EntityTimeline → collapsed to a single <Text> in 6f1b6388a.

Review iteration: 10 | 2026-06-02

@msfstef msfstef force-pushed the msfstef/agent-sandboxing-1 branch from 016126f to a271896 Compare May 27, 2026 10:11
@github-actions
Contributor

github-actions Bot commented May 27, 2026

Electric Agents Mobile Build

Android preview build for commit 6f1b638.

| Platform | Profile | Status | Build |
|---|---|---|---|
| Android | preview | Passed | EAS build |

Workflow run

@msfstef msfstef requested review from icehaunter and kevin-dp May 27, 2026 10:36
@msfstef msfstef force-pushed the msfstef/agent-sandboxing-1 branch 2 times, most recently from 68b3e6d to 1c9c117 Compare May 27, 2026 13:40
Contributor

@balegas balegas left a comment


Keen to start using this.

@samwillis samwillis force-pushed the msfstef/agent-sandboxing-1 branch from 0def597 to b0b85e8 Compare May 29, 2026 11:52
@samwillis
Contributor

Rebased this branch onto latest origin/main and resolved the conflicts.

What changed from the rebase forward:

  • Kept the new desktop main-process refactor from main rather than reintroducing the older monolithic main.ts/preload.ts code.
  • Ported the sandbox branch's E2B API key support into the refactored desktop credential modules (shared/types + credentials/api-keys) so E2B_API_KEY still feeds the bundled runtime.
  • Merged the sandbox/runner controls into the new two-row Horton composer layout from main, with runner/sandbox/working-directory controls on the second row.
  • Restored the Horton sandbox picker for single-profile runners so the active sandbox is still visible/selectable instead of collapsing to a passive label.
  • Tightened runtime metadata UI spacing in the entity header and timeline so the new runner/sandbox badges align with existing controls and status markers.
  • Fixed the new-session navigation race where sandbox profiles did not appear until refresh by allowing the desktop runner default to win when its id arrives after the runner list, while preserving explicit user runner choices.
  • Fixed the follow-up Select update loop by making runner selection resolve to one stable preferred id and avoiding empty controlled Select values for sandbox profile selects.

Validation run locally after the conflict fixes / follow-ups:

  • pnpm --filter @electric-ax/agents-desktop typecheck
  • pnpm --filter @electric-ax/agents-server-ui typecheck

Note: these commands need Node 24 here (nvm use 24.15.0); Node 18 trips the repo engine guard before typechecking.

@msfstef msfstef force-pushed the msfstef/agent-sandboxing-1 branch 2 times, most recently from f384a31 to e350ac0 Compare June 2, 2026 09:03
msfstef and others added 3 commits June 2, 2026 12:37
…viders

Adds the `Sandbox` primitive (`@electric-ax/agents-runtime/sandbox`) that
isolates the filesystem, process, and network operations performed by
LLM-driven tools, and routes the bash/read/write/edit/fetch_url tools through
it.

Providers:
- `unrestrictedSandbox` — in-process host pass-through, the default for built-in
  entities via `chooseDefaultSandbox`. Single-tenant trusted-code default: the
  tool layer contains reads/writes to the workspace and rejects symlink escapes,
  but it is NOT a containment boundary (host FS/PID namespace shared).
- `dockerSandbox` — container isolation via `dockerode` (optional peer dep).
  Hardened HostConfig (CapDrop ALL, no-new-privileges, pids/mem/cpu limits, no
  docker socket). deny-all ⇒ NetworkMode=none (hard boundary); any other policy
  gets a bridge, where the allowlist is fetch-tool-only surface protection, not
  an exec/bash egress boundary.
- `remoteSandbox({provider:'e2b'})` — off-host VM via E2B (optional peer dep),
  with reattach / persistence / desktop support.

Lifecycle (resolveSandboxIdentity): identity from an explicit cross-entity
`key` or a `scope` shorthand ('entity' default ⇒ entityUrl, 'wake' ⇒
entityUrl#wakeId); `persistent` selects idle teardown (stop/preserve vs
remove/wipe); `owner` gates create-vs-attach so an `inherit` subagent only
attaches to an owner's live sandbox and never conjures a fresh one. Per-key
locked, refcounted, debounced teardown with deterministic naming so a
cold-started host reattaches by key.

Profiles: runtimes advertise named profiles (e.g. `local`, `docker`); the
agents-server validates a spawn's chosen profile against the target runner's
advertised set and enforces co-location for shared local sandboxes; the
new-session UI surfaces a picker.

Hardening / behavior:
- bash drops host `process.env` (removes the trivial secret-dump leak).
- read/write/edit reject symlink escapes from the workspace.
- docker exec polls `inspect()` until reaped (no transient null exit codes).
- boot orphan sweep reclaims only exited *ephemeral* leftovers — never a running
  (possible live peer) or persistent (reattachable) container.
- fetch SSRF guard canonicalizes encoded IP literals (decimal/hex integer,
  ::ffff-mapped, bracketed IPv6).

`createFetchUrlTool` and the other tool factories now require a `Sandbox`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…by-runner

- Reorder new-session composer pickers to Model → Effort → Runner →
  Sandbox → Working Directory (working dir last; still hidden for remote
  profiles, docker keeps local-like behavior).
- Add clickable runner + sandbox badges to the entity header (detail
  popovers) and enrich the sidebar hover info with runner/sandbox rows.
- Surface the *effective* sandbox: when an entity has no explicit profile
  the runtime falls back to the host `local` sandbox (process-wake.ts) and
  never persists it, so the UI now resolves that default — every entity
  shows Local / Docker / E2B, not just ones spawned with an explicit pick.
- Expose the entity's pinned runner from `dispatch_policy` (already
  allow-listed server-side) on the UI entity schema/collection + optimistic
  spawn insert; resolve runner/sandbox labels from the runners collection.
- Add "Group by → Runner" and a "Show → Runner" filter to the sidebar.
- Tests for groupByRunner and the runner/effective-sandbox resolvers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The entities Electric shape proxy allowlist omitted `sandbox`, so the
profile an entity was spawned with (Local / Docker / E2B) never reached
the UI — `entity.sandbox` was always undefined client-side even though the
column is populated at spawn (entity-registry.createEntity). This made the
header/sidebar sandbox badge (and the timeline's sandbox pill) always fall
back to the "Local" default regardless of the real profile.

Add `sandbox` to the entities column allowlist and a regression test
asserting both `sandbox` and `dispatch_policy` are exposed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
msfstef and others added 7 commits June 2, 2026 12:37
- SSRF guard: parse all inet_aton IPv4 forms (shorthand/octal/hex) so
  127.1, 0177.0.0.1, etc. can't bypass the private-IP denylist
- UnrestrictedSandbox: enforce post-dispose use via assertLive(), keeping
  the cross-provider conformance invariant honest
- docker makeBinds: realpath-resolve mount hostPaths before the
  docker-socket check so a symlink can't smuggle the socket in
- process-wake: clarify the SandboxError('unavailable') message — a dropped
  profile fails only that wake (caught per-wake) and is redriven by the
  server; the runner stays up
- e2b: thread an optional logger so the keep-alive heartbeat failure leaves
  a debug trail instead of silently swallowing every error
- unrestricted resolveWithin: TODO documenting the multi-tenant TOCTOU window
- docker profile: fix the misleading "network constrained" description
  (default is allow-all) and note network policy can become a per-spawn arg
- fetch_url: surface SandboxError('policy') as a distinct "blocked by
  network policy" message, mirroring the FS tools
- bash tool: note the host env is not forwarded

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reduce novel-pattern surface in the sandbox feature by mirroring the
established dispatch_policy structure, and share two duplicated provider
helpers. No behavior change.

- Move sandbox spawn resolution off EntityManager into routing/sandbox.ts
  (sibling of routing/dispatch-policy.ts), split into the orchestrator plus
  applyInheritedSandbox / resolveChosenProfileRemote / assertSharedSandboxColocated.
- Extract the sandbox choice wire schema to sandbox-choice-schema.ts and a
  single SandboxChoice type, collapsing three hand-written copies (router
  schema, TypedSpawnRequest.sandbox, resolver param).
- Share path containment (absoluteSandboxPath / isPathWithinSandbox) between the
  docker and remote providers; unrestricted keeps its stricter realpath walk.
- Share the dispose wipe-vs-preserve core (sandboxWipesOnDispose); each provider
  keeps its own owner-gating, which genuinely differs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Follow-up cleanups from a cross-package alignment review of the sandbox
work, keeping it consistent with the dispatch_policy precedent:

- runtime: rename docker provider `reuseKey` -> `sandboxKey` so all three
  sandbox providers share one contract field
- server: drop dead `listSandboxProfileNames`; collapse `SandboxProfileInput`
  into `SandboxProfileAdvertisement`
- runtime: remove unused `undici` dep; tighten `dockerode` peer floor to >=5
- ui: use `shortenId` instead of inlined truncation; drop an `as never` cast;
  resolve the EntityTimeline sandbox badge to its advertised label
- runtime: document the display-only `{ profile }` membership-row narrowing
- changeset: reword the tool-factory note to match the patch bump

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Commit f3e44a5 dropped the unused `undici` dependency from
agents-runtime/package.json without regenerating pnpm-lock.yaml,
breaking every CI job at `pnpm install --frozen-lockfile`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Keep the sandbox selector visible in the second-row composer controls so single-profile runners still expose the active sandbox choice after the rebase.

Co-authored-by: Cursor <cursoragent@cursor.com>
Keep entity header and timeline runtime metadata spacing consistent with the surrounding controls and status markers.

Co-authored-by: Cursor <cursoragent@cursor.com>
Prefer the desktop runner when it arrives after navigation without overriding explicit user choices or looping Select state updates.

Co-authored-by: Cursor <cursoragent@cursor.com>
@msfstef msfstef force-pushed the msfstef/agent-sandboxing-1 branch from e350ac0 to d02559a Compare June 2, 2026 09:38
msfstef and others added 2 commits June 2, 2026 13:05
The fc00::/7 (ULA) check used `startsWith('fc')` / `startsWith('fd')` with no
colon, so any DNS hostname beginning with those bytes (e.g. fc2.com, fda.gov,
fdrive.com) was wrongly denied as private by the docker fetch tool, regardless
of the allow-all / allowlist policy. Require the colon — matching the fe80:
check — so only real IPv6 literals match. Adds regression tests for fc/fd
public hostnames staying public and fc00::1 / fd00::1 / fdab:… staying private.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- applyInheritedSandbox: document that `inherit: true` takes the parent's
  durability wholesale and intentionally ignores sibling fields (e.g. a
  caller-supplied `persistent`), which the schema permits.
- path-containment: add `assertAbsolutePosixWorkingDirectory` and call it in the
  remote provider constructor so a relative/non-POSIX working directory fails
  loudly instead of silently joining against the host cwd (docker already
  guards this).
- NewSessionView: refresh the stale runner-selection doc-comment (now mentions
  the explicit-user-choice branch) and the sandbox-profiles comment (order is
  preserved as advertised, no longer sorted).
- EntityTimeline: collapse the three-sibling sandbox label into a single Text.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@msfstef msfstef merged commit 17b374f into main Jun 2, 2026
20 of 21 checks passed
@msfstef msfstef deleted the msfstef/agent-sandboxing-1 branch June 2, 2026 10:45
msfstef added a commit that referenced this pull request Jun 3, 2026
…ime source

agents-server-ui typechecks agents-runtime's source via its tsconfig path
mapping. Since #4369 that source includes node-using sandbox code
(node:child_process), but agents-server-ui restricted `types` to vite/client
and had no @types/node, so on CI's isolated install ChildProcess lost its
EventEmitter methods. Add @types/node and "node" to the types array.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>