Run panel: auto-approve operator-invoked tools#1183
Conversation
Invoking a tool through the Run/Test panel paused on approval-gated tools and dead-ended on a "This tool requires approval" message, even though the operator clicking Run is itself the approval. Thread an optional autoApprove through the execute path: the execution engine runs the inline accept-all handler instead of intercepting the first elicitation as a pause, so an approval-gated tool runs to completion. The HTTP /executions endpoint takes autoApprove and the Run panel sends it. block policies still fail before any elicitation, so this never bypasses a hard block; the MCP host path is unchanged and still pauses for the model.
Deploying with
|
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs |
executor-marketing | b0bb4a8 | Commit Preview URL Branch Preview URL |
Jun 28 2026, 06:28 PM |
Deploying with
|
| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ✅ Deployment successful! View logs |
executor-cloud | b0bb4a8 | Jun 28 2026, 06:30 PM |
Cloudflare previewTorn down — the PR is closed. |
These two files were already unformatted on main (identical to origin/main); oxfmt --check flags them repo-wide. Formatting-only, no behavior change.
@executor-js/cli
@executor-js/config
@executor-js/execution
@executor-js/sdk
@executor-js/plugin-file-secrets
@executor-js/plugin-graphql
@executor-js/plugin-keychain
@executor-js/plugin-mcp
@executor-js/plugin-onepassword
@executor-js/plugin-openapi
@executor-js/codemode-core
@executor-js/runtime-quickjs
executor
commit: |
Greptile SummaryThis PR threads an
Confidence Score: 4/5Safe to merge. The core execution change is a clean short-circuit: The engine and API changes are correct and well-tested, with both a unit test and an e2e scenario proving the before/after contract. The only loose end is in packages/react/src/components/tool-run-panel.tsx — the Important Files Changed
Sequence Diagram%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant Panel as ToolRunPanel (React)
participant API as POST /executions
participant Engine as ExecutionEngine
participant Invoker as ToolInvoker
participant Tool as Approval-gated Tool
Panel->>API: "{ code, autoApprove: true }"
API->>Engine: "executeWithPause(code, { autoApprove: true })"
Note over Engine: autoApprove branch: skip pause queue
Engine->>Engine: runInlineExecution(code, acceptAllHandler)
Engine->>Invoker: "execute(code, { onElicitation: acceptAllHandler })"
Invoker->>Tool: invoke tool
Tool-->>Invoker: ElicitationRequest (requiresApproval)
Invoker->>Engine: acceptAllHandler(ctx)
Engine-->>Invoker: "{ action: "accept" }"
Tool-->>Invoker: tool result
Invoker-->>Engine: ExecuteResult
Engine-->>API: "{ status: "completed", result }"
API-->>Panel: "{ status: "completed", text, structured, isError }"
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
participant Panel as ToolRunPanel (React)
participant API as POST /executions
participant Engine as ExecutionEngine
participant Invoker as ToolInvoker
participant Tool as Approval-gated Tool
Panel->>API: "{ code, autoApprove: true }"
API->>Engine: "executeWithPause(code, { autoApprove: true })"
Note over Engine: autoApprove branch: skip pause queue
Engine->>Engine: runInlineExecution(code, acceptAllHandler)
Engine->>Invoker: "execute(code, { onElicitation: acceptAllHandler })"
Invoker->>Tool: invoke tool
Tool-->>Invoker: ElicitationRequest (requiresApproval)
Invoker->>Engine: acceptAllHandler(ctx)
Engine-->>Invoker: "{ action: "accept" }"
Tool-->>Invoker: tool result
Invoker-->>Engine: ExecuteResult
Engine-->>API: "{ status: "completed", result }"
API-->>Panel: "{ status: "completed", text, structured, isError }"
|
The cloud e2e project never gated CI either, so ten scenarios rotted. Refresh the four whose product behavior moved intentionally: - connect-card-ssr-origin: install URLs are org-slug-scoped since the org-slug console URLs change (#974); accept the slug form. - connection-owner-isolation: /api/auth/switch-organization was deleted with cookie-based org switching (#1000); switch orgs the way the web client does, via the x-executor-organization selector header. - oauth-connections: the popup-state fix (#1235) envelopes the callback state as base64url JSON; decode it and assert the inner state + orgSlug. - unauthenticated-skeleton: the 404 page shipped as a standalone page in the same commit as the shell-framed assertion (#986); assert the page it actually renders. Quarantine the six that need product/harness work, each with a reason: mcp-browser-approval-org-scope + the two browser-approval scenarios (cloud-only: the mcporter browser-approval completion never lands), cli-device-login (device-flow terminal never reaches the emulator), and run-panel-auto-approve (autoApprove leaves the run paused; never green since the feature landed in #1183).
* e2e: fix stale docs, harden dev-CLI status, add cloud+selfhost CI jobs - e2e/AGENTS.md: the anatomy example predated the service-yielding scenario() signature (no more needs/ctx); capability notes said browser was cloud-only and mcp-oauth selfhost-only, both wrong per targets/*.ts; file placement now lists cloudflare/, local/, cli/; document summary, motel, test:* scripts, the viewer/ SPA, pr-media, and the Windows desktop/cli VM targets. - e2e dev CLI status: probe the app URL before reporting ready (a zombie runner with a dead server used to read as healthy), and only parse real state files in .dev/ (cloud.journey.json rendered as a garbage DEAD line). - CI: run the cloud and selfhost e2e projects on every PR/push with failure artifacts (trace.zip, session.mp4, step screenshots) uploaded per target. * Fix the MCP regressions and policy gaps the e2e suite caught Cloud (hibernatable MCP DO rework fallout): - server.ts no longer gates MCP dispatch behind the Axiom tracer install: with AXIOM_TOKEN unset (any dev boot without motel) every /mcp request fell through to the SPA router and 404ed. - agent-handler mounts a second serve() on /mcp/toolkits/:slug — the agents SDK builds an exact-match URLPattern, so the single /mcp handler never saw toolkit paths. - Restore the old envelope's transport contract: JSON-RPC 405 for verbs outside GET/POST/DELETE/OPTIONS (was a bare 404), 200 for session DELETE (agents SDK answers 204), and a reconnect-worded 404 for requests that race a condemned DO's abort. Selfhost (org-scoped MCP OAuth discovery): - The org-segment strip middleware now carries the original pathname in an internal header, and the protected-resource metadata echoes it, so a client that dialed /<org>/mcp/... passes the MCP SDK's RFC 9728 resource check. Bare paths are untouched; the header is stripped from unrewritten requests. Microsoft Graph URL policy: - microsoftHttpPlugin gains the hosts' local-network dev posture: selfhost, cloud, and the cloudflare host thread allowLocalNetwork into allowUnsafeUrlOverrides, and the override now also admits plain-http loopback URLs (local emulators). Production behavior is unchanged: the flag is unset there, and non-loopback http stays rejected even with it. Stale e2e assertion refreshed for an intentional product change: - tool-descriptions: the execute inventory is names-only since the skills tool slimming; drop the per-connection description assertions. * test(e2e): repair self-host scenarios and gate the suite in CI The self-host e2e project never ran in CI, so it drifted red while the app moved on. Repair the failing scenarios (stale connect-modal selectors, a racy action-bar position read, a shared-admin connection-count assertion, a multi-tenant-only org-slug 404 step, and a cloud-shaped toolkit MCP URL), add a documented skip affordance to the scenario helper, and quarantine the two Microsoft emulator scenarios that need a canonical block-YAML Graph spec (tracked separately). Cherry-picked from origin/fix-selfhost-e2e-and-ci (PR #1239); its CI job is superseded by the cloud+selfhost matrix job already on this branch. * test(e2e): quarantine the two agents-SDK transport gaps Both are real gaps in the hibernatable Agent bridge (standalone SSE supersede never resolves; response routing scopes JSON-RPC ids per session instead of per stream), not regressions on this branch. Skip with reasons so the suite gates CI while the gaps stay visible; fixing the bridge is tracked separately. * test(e2e): repair or quarantine the cloud scenarios that drifted on main The cloud e2e project never gated CI either, so ten scenarios rotted. Refresh the four whose product behavior moved intentionally: - connect-card-ssr-origin: install URLs are org-slug-scoped since the org-slug console URLs change (#974); accept the slug form. - connection-owner-isolation: /api/auth/switch-organization was deleted with cookie-based org switching (#1000); switch orgs the way the web client does, via the x-executor-organization selector header. - oauth-connections: the popup-state fix (#1235) envelopes the callback state as base64url JSON; decode it and assert the inner state + orgSlug. - unauthenticated-skeleton: the 404 page shipped as a standalone page in the same commit as the shell-framed assertion (#986); assert the page it actually renders. Quarantine the six that need product/harness work, each with a reason: mcp-browser-approval-org-scope + the two browser-approval scenarios (cloud-only: the mcporter browser-approval completion never lands), cli-device-login (device-flow terminal never reaches the emulator), and run-panel-auto-approve (autoApprove leaves the run paused; never green since the feature landed in #1183). * lint: suppress the adapter-boundary error checks in the MCP agent handler The condemned-DO abort surfaces as a plain runtime Error thrown out of the agents SDK's serve.fetch; its message string is the only signal. Narrow suppressions with boundary reasons, per the typed-errors skill. * test(e2e): quarantine the seat-limit scenario on the emulate 0.9.0 Autumn gap emulate 0.9.0's Autumn customer balances omit the expanded feature object autumn-js asserts, so useCustomer crashes the org page into the error boundary. Fixed upstream in UsefulSoftwareCo/emulate#8 (0.9.1); unskip once the publish lands and the e2e dependency is bumped. * ci: retrigger * ci: shard the cloud e2e job so each shard gets a fresh dev stack A full-suite run against one long-lived cloud dev server degrades partway through: sign-in starts refusing connections and everything after fails with fetch errors (the same SSE/OTel memory growth being instrumented on main). Four shards, each booting its own stack, stay under the threshold. Re-merge into one job once the leak is fixed. * ci: split the cloud e2e job into eight shards Four shards still hit the dev-server degradation a few minutes in on 2-core runners; eight keeps each stack's lifetime under the threshold. * ci: retry flaky browser scenarios twice on the same stack The remaining shard failures are scattered single-test Playwright waitFor timeouts on 2-core runners, not systemic stack death; vitest --retry clears them without hiding real regressions (a consistent failure still fails after 3 attempts). * test(e2e): quarantine the Graph default-add scenario on CI runners Compiling the Graph spec inside dev workerd 500s on 2-core GitHub runners and takes the dev stack down for every scenario after it in the shard (the auth-hint/org-slug/docs-link failures in the same shard were all downstream of this). Local runs are unaffected; skip only under CI. * selfhost: read the local-network posture from env in the plugins seam plugins() runs per request; loadConfig() does filesystem work (data dir, secret key resolution) that should not ride the request path. The env read is the same computation loadConfig makes for the flag. * e2e: bump @executor-js/emulate to 0.10.0, unskip the seat-limit scenario 0.10.0 ships the Autumn balances.feature expansion autumn-js asserts (UsefulSoftwareCo/emulate#8), so the org page renders again and the scenario passes.
Invoking a tool through the Run/Test panel paused on approval-gated tools and dead-ended on a "This tool requires approval (a policy gates it)" message, with no way to actually run it from the panel. But the operator clicking Run is itself the approval, so the panel should just run the tool.
What changed
autoApproveis threaded through the execute path. When set, the execution engine runs the inline accept-all handler instead of intercepting the first elicitation as a pause, so an approval-gated tool runs to completion:executeWithPause(code, { autoApprove })in the enginePOST /executionsaccepts an optionalautoApproveautoApprove: trueSafety is preserved:
blockpolicies fail before any elicitation, so this never bypasses a hard block. The MCP host path is unchanged and still pauses for the model to approve.Before / after
Same tool, same
Require approval · npmdl.*policy badge. Before, the panel dead-ends; after, it returns the real result (HTTP 200, real download count).Before
After
Tests
e2e/scenarios/run-panel-auto-approve.test.ts: drives the samePOST /executionsendpoint the panel uses against a tool gated by its ownrequiresApprovalannotation. WithoutautoApprovethe call pauses and the side effect does not happen; withautoApproveit runs to completion and the side effect lands. Green against a live selfhost instance.tool-invoker.test.ts: the same eliciting tool that pauses withoutautoApproveruns straight to completion with it.autoApprovepath is untouched).