Skip to content

feat: add prompt_sandbox MCP tool for persistent sandbox prompting#244

Merged
sweetmantech merged 7 commits intotestfrom
sweetmantech/myc-4341-mcp-prompt_sandbox
Feb 25, 2026
Merged

feat: add prompt_sandbox MCP tool for persistent sandbox prompting#244
sweetmantech merged 7 commits intotestfrom
sweetmantech/myc-4341-mcp-prompt_sandbox

Conversation

@sweetmantech
Copy link
Contributor

@sweetmantech sweetmantech commented Feb 25, 2026

Summary

  • Adds new prompt_sandbox MCP tool that sends prompts to OpenClaw in a persistent per-account sandbox
  • Reuses a running sandbox (or creates one from the latest snapshot) instead of creating a new sandbox per call
  • Returns raw stdout/stderr synchronously — the sandbox stays alive for follow-up prompts

New Files

  • lib/sandbox/getActiveSandbox.ts — reconnects to the most recent running sandbox for an account
  • lib/sandbox/createSandboxFromSnapshot.ts — creates from latest snapshot or fresh (30m timeout, 4 vcpus, node22)
  • lib/sandbox/getOrCreateSandbox.ts — composes the above: try active first, create if needed
  • lib/sandbox/promptSandbox.ts — core domain logic: get/create sandbox → run OpenClaw prompt → return output
  • lib/mcp/tools/sandbox/registerPromptSandboxTool.ts — MCP tool registration with { prompt, account_id? } schema

Test plan

  • 20 new tests across 5 test files — all passing
  • Full test suite: 151 files, 1231 tests — all passing
  • pnpm build passes cleanly

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added a prompt sandbox execution tool to run prompts in isolated sandboxes.
    • Automatic sandbox management: reuse active sandboxes or create new ones from the latest snapshots.
    • Sandbox runs now return structured execution results including stdout, stderr, exit codes, and sandbox identifiers.

Adds a new `prompt_sandbox` MCP tool that sends prompts to OpenClaw in a
persistent per-account sandbox. Unlike `run_sandbox_command` which creates
a new sandbox each call and fires-and-forgets via Trigger.dev, this tool
reuses a running sandbox and returns stdout/stderr synchronously.

New domain logic (bottom-up):
- getActiveSandbox: reconnects to most recent running sandbox
- createSandboxFromSnapshot: creates from latest snapshot or fresh
- getOrCreateSandbox: composes the above two
- promptSandbox: orchestrates sandbox reuse + OpenClaw prompt execution

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vercel
Copy link
Contributor

vercel bot commented Feb 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
recoup-api Ready Ready Preview Feb 25, 2026 8:42pm

Request Review

@coderabbitai
Copy link

coderabbitai bot commented Feb 25, 2026

Warning

Rate limit exceeded

@sweetmantech has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 18 minutes and 43 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 28785b5 and bf7d549.

⛔ Files ignored due to path filters (2)
  • lib/mcp/tools/sandbox/__tests__/registerPromptSandboxTool.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/promptSandbox.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
📒 Files selected for processing (2)
  • lib/mcp/tools/sandbox/registerPromptSandboxTool.ts
  • lib/sandbox/promptSandbox.ts
📝 Walkthrough

Walkthrough

Adds a new "prompt_sandbox" MCP tool and supporting sandbox utilities to create, locate, or reuse per-account sandboxes and run the OpenClaw agent with forwarded credentials, returning structured execution results.

Changes

Cohort / File(s) Summary
MCP Tool Registration
lib/mcp/tools/sandbox/index.ts, lib/mcp/tools/sandbox/registerPromptSandboxTool.ts
Registers a new prompt_sandbox MCP tool (zod-validated prompt input). Resolves effective accountId from auth, calls promptSandbox, and maps results/errors to tool responses.
Sandbox Creation & Persistence
lib/sandbox/createSandboxFromSnapshot.ts, lib/sandbox/createSandbox.ts, lib/sandbox/processCreateSandbox.ts
Adds creation from latest snapshot, updates createSandbox return shape to include sandbox + nested response, and adapts creator usage to destructure the new result shape.
Sandbox Lookup & Lifecycle
lib/sandbox/getActiveSandbox.ts, lib/sandbox/getOrCreateSandbox.ts
Adds utilities to retrieve the most recent running sandbox or lazily create one (returns created flag and sandboxId).
Sandbox Execution
lib/sandbox/promptSandbox.ts
Adds promptSandbox to obtain/create per-account sandbox, run the OpenClaw agent with provided prompt and forwarded env vars, and return stdout/stderr/exitCode plus sandbox metadata.

Sequence Diagram

sequenceDiagram
    participant Client
    participant MCP as MCP Server
    participant PromptTool as Prompt Tool Handler
    participant SandboxMgmt as Sandbox Management
    participant Database
    participant Agent as OpenClaw Agent

    Client->>MCP: Call prompt_sandbox tool
    MCP->>PromptTool: Invoke handler with prompt & account_id
    PromptTool->>SandboxMgmt: getOrCreateSandbox(accountId)
    SandboxMgmt->>Database: selectAccountSandboxes(accountId)
    alt Active Sandbox Exists
        Database-->>SandboxMgmt: Return recent sandbox record
        SandboxMgmt->>Database: Fetch Sandbox instance by id
        Database-->>SandboxMgmt: Return Sandbox (running)
        SandboxMgmt-->>PromptTool: Return {sandbox, created: false}
    else No Active Sandbox
        Database-->>SandboxMgmt: Empty result
        SandboxMgmt->>Database: selectAccountSnapshots(accountId)
        Database-->>SandboxMgmt: Return latest snapshot (if any)
        SandboxMgmt->>SandboxMgmt: createSandboxFromSnapshot(...)
        SandboxMgmt->>Database: insertAccountSandbox(account_id, sandbox_id)
        Database-->>SandboxMgmt: Confirmed
        SandboxMgmt-->>PromptTool: Return {sandbox, created: true}
    end
    PromptTool->>Agent: Execute OpenClaw with prompt + env vars in sandbox
    Agent-->>PromptTool: {stdout, stderr, exitCode}
    PromptTool-->>MCP: Return tool result
    MCP-->>Client: Tool execution response
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🏜️ Prompts whisper into sand,

🧭 Snapshots seed a waiting land,

🚀 Agents wake and words take flight,

🔁 Lazy births to get it right,

✨ Small tools hum through the night.

🚥 Pre-merge checks | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Solid & Clean Code ⚠️ Warning Code quality violations detected: missing token validation in registerPromptSandboxTool.ts and silent error swallowing in getActiveSandbox.ts catch blocks. Add token validation guard clause and replace silent catches with explicit error handling that logs or re-throws appropriately.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch sweetmantech/myc-4341-mcp-prompt_sandbox

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
lib/sandbox/getActiveSandbox.ts (1)

31-33: Consider logging unexpected errors in the catch block.

The bare catch { return null; } swallows all errors from Sandbox.get(), making it impossible to distinguish a legitimately stopped sandbox from an SDK/network failure during debugging.

♻️ Suggested improvement
-  } catch {
+  } catch (err) {
+    console.error(`[getActiveSandbox] Failed to get sandbox ${mostRecent.sandbox_id}:`, err);
     return null;
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/sandbox/getActiveSandbox.ts` around lines 31 - 33, The catch in
getActiveSandbox currently swallows all errors (catch { return null; }) — modify
the catch to capture the error (e.g., catch (err)) and log it before returning
null so SDK/network failures are visible; update the getActiveSandbox function
to call your logger (or console.error/console.warn) with context like
"Sandbox.get failed" and the caught error, then return null as before.
lib/sandbox/getOrCreateSandbox.ts (1)

5-9: sandboxId in the result interface is redundant.

sandboxId duplicates sandbox.sandboxId — callers already have the sandbox instance. This is minor, but callers mixing both could cause confusion if the two ever diverge (they shouldn't, but the surface area exists).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/sandbox/getOrCreateSandbox.ts` around lines 5 - 9,
GetOrCreateSandboxResult currently exposes a redundant sandboxId that duplicates
sandbox.sandboxId; remove the sandboxId property from the
GetOrCreateSandboxResult interface and update all code that constructs or
consumes GetOrCreateSandboxResult (look for returns/builders that reference
GetOrCreateSandboxResult and the symbol sandboxId) to stop setting or reading
result.sandboxId and instead use result.sandbox.sandboxId wherever needed;
ensure any tests or type annotations are updated to the new shape and run type
checks to catch remaining usages.
lib/sandbox/promptSandbox.ts (1)

32-39: No timeout on runCommand — a hanging openclaw process will block the MCP request indefinitely.

If the openclaw agent stalls, runCommand will never resolve, eventually causing the caller's connection to time out without a useful error.

♻️ Suggested improvement using AbortSignal
+  const controller = new AbortController();
+  const timeoutId = setTimeout(() => controller.abort(), 5 * 60 * 1000); // 5 min

   const commandResult = await sandbox.runCommand({
     cmd: "openclaw",
     args: ["agent", "--agent", "main", "--message", prompt],
     env: {
       RECOUP_API_KEY: recoupApiKey,
       RECOUP_ACCOUNT_ID: accountId,
     },
+    signal: controller.signal,
   });
+  clearTimeout(timeoutId);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/sandbox/promptSandbox.ts` around lines 32 - 39, The call to
sandbox.runCommand can hang indefinitely; wrap it with an AbortController-based
timeout so the process is aborted if it takes too long: create an
AbortController before calling sandbox.runCommand, set a timer (e.g.,
configurable ms) that calls controller.abort(), pass controller.signal into
sandbox.runCommand (where commandResult is produced), and clear the timer after
the call resolves or rejects so you don't leak timers; handle the abort error
path to return a clear timeout/error response instead of leaving the request
hanging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/sandbox/createSandboxFromSnapshot.ts`:
- Around line 36-39: The call to insertAccountSandbox in
createSandboxFromSnapshot is awaited but its result (including any { error }) is
not checked, which can leave a running sandbox untracked; modify
createSandboxFromSnapshot to capture the result from insertAccountSandbox, check
for an error or failed insert, and if it fails rollback/delete the created
sandbox (or throw so caller can clean up) and log the failure; ensure
getActiveSandbox behavior remains consistent by only returning after a
successful insert and reference insertAccountSandbox, createSandboxFromSnapshot,
and getActiveSandbox when making these changes.

In `@lib/sandbox/promptSandbox.ts`:
- Around line 35-38: Validate process.env.RECOUP_API_KEY at the start of the
function that constructs the env object (where RECOUP_API_KEY and
RECOUP_ACCOUNT_ID are set) instead of using the non-null assertion; if
process.env.RECOUP_API_KEY is missing, throw or return a clear startup error
(e.g., throw new Error or call processLogger.error and exit) that mentions the
missing RECOUP_API_KEY so the sandbox doesn't start with an invalid key. Locate
the env creation in promptSandbox.ts (the env: { RECOUP_API_KEY:
process.env.RECOUP_API_KEY!, RECOUP_ACCOUNT_ID: accountId } block) and replace
the non-null assertion by an explicit check at function entry that
validates/processes the key before building the env object.

---

Nitpick comments:
In `@lib/sandbox/getActiveSandbox.ts`:
- Around line 31-33: The catch in getActiveSandbox currently swallows all errors
(catch { return null; }) — modify the catch to capture the error (e.g., catch
(err)) and log it before returning null so SDK/network failures are visible;
update the getActiveSandbox function to call your logger (or
console.error/console.warn) with context like "Sandbox.get failed" and the
caught error, then return null as before.

In `@lib/sandbox/getOrCreateSandbox.ts`:
- Around line 5-9: GetOrCreateSandboxResult currently exposes a redundant
sandboxId that duplicates sandbox.sandboxId; remove the sandboxId property from
the GetOrCreateSandboxResult interface and update all code that constructs or
consumes GetOrCreateSandboxResult (look for returns/builders that reference
GetOrCreateSandboxResult and the symbol sandboxId) to stop setting or reading
result.sandboxId and instead use result.sandbox.sandboxId wherever needed;
ensure any tests or type annotations are updated to the new shape and run type
checks to catch remaining usages.

In `@lib/sandbox/promptSandbox.ts`:
- Around line 32-39: The call to sandbox.runCommand can hang indefinitely; wrap
it with an AbortController-based timeout so the process is aborted if it takes
too long: create an AbortController before calling sandbox.runCommand, set a
timer (e.g., configurable ms) that calls controller.abort(), pass
controller.signal into sandbox.runCommand (where commandResult is produced), and
clear the timer after the call resolves or rejects so you don't leak timers;
handle the abort error path to return a clear timeout/error response instead of
leaving the request hanging.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between feff039 and 8532d60.

⛔ Files ignored due to path filters (5)
  • lib/mcp/tools/sandbox/__tests__/registerPromptSandboxTool.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/createSandboxFromSnapshot.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/getActiveSandbox.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/getOrCreateSandbox.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/promptSandbox.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
📒 Files selected for processing (6)
  • lib/mcp/tools/sandbox/index.ts
  • lib/mcp/tools/sandbox/registerPromptSandboxTool.ts
  • lib/sandbox/createSandboxFromSnapshot.ts
  • lib/sandbox/getActiveSandbox.ts
  • lib/sandbox/getOrCreateSandbox.ts
  • lib/sandbox/promptSandbox.ts

Comment on lines 36 to 39
await insertAccountSandbox({
account_id: accountId,
sandbox_id: sandbox.sandboxId,
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Silent DB insert failure creates an orphaned, untracked sandbox.

insertAccountSandbox is awaited but its { error } result is never inspected. If the insert fails, the sandbox is running and accruing cost, but getActiveSandbox will never find it — every subsequent call will create yet another sandbox.

🐛 Proposed fix
-  await insertAccountSandbox({
+  const { error: insertError } = await insertAccountSandbox({
     account_id: accountId,
     sandbox_id: sandbox.sandboxId,
   });
+
+  if (insertError) {
+    console.error(`[createSandboxFromSnapshot] Failed to record sandbox ${sandbox.sandboxId}:`, insertError);
+    // Optionally stop the sandbox to prevent orphaned resource charges:
+    // await sandbox.stop();
+    throw new Error(`Failed to record sandbox: ${insertError.message}`);
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
await insertAccountSandbox({
account_id: accountId,
sandbox_id: sandbox.sandboxId,
});
const { error: insertError } = await insertAccountSandbox({
account_id: accountId,
sandbox_id: sandbox.sandboxId,
});
if (insertError) {
console.error(`[createSandboxFromSnapshot] Failed to record sandbox ${sandbox.sandboxId}:`, insertError);
// Optionally stop the sandbox to prevent orphaned resource charges:
// await sandbox.stop();
throw new Error(`Failed to record sandbox: ${insertError.message}`);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/sandbox/createSandboxFromSnapshot.ts` around lines 36 - 39, The call to
insertAccountSandbox in createSandboxFromSnapshot is awaited but its result
(including any { error }) is not checked, which can leave a running sandbox
untracked; modify createSandboxFromSnapshot to capture the result from
insertAccountSandbox, check for an error or failed insert, and if it fails
rollback/delete the created sandbox (or throw so caller can clean up) and log
the failure; ensure getActiveSandbox behavior remains consistent by only
returning after a successful insert and reference insertAccountSandbox,
createSandboxFromSnapshot, and getActiveSandbox when making these changes.

Account ID should only be resolved from the API key via the MCP server,
not overridden by the caller.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add `sandbox` instance to `SandboxCreatedResponse` from `createSandbox`
- Rewrite `createSandboxFromSnapshot` to delegate to `createSandbox`
  instead of duplicating Sandbox.create logic and constants
- Destructure `sandbox` out in `processCreateSandbox` to keep its
  serialized response unchanged

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The sandbox now receives the caller's auth token as RECOUP_API_KEY
so it operates with their permissions, not the server's global key.
The MCP tool extracts the token from authInfo and passes it through
promptSandbox to runCommand's env.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Return { sandbox, response } from createSandbox instead of mixing the
Sandbox instance into SandboxCreatedResponse. This keeps the serializable
response type clean — consumers like processCreateSandbox no longer need
to destructure the instance out before spreading.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Let the API key resolve the account ID in the sandbox context instead
of passing it explicitly. The API key already carries the account
identity.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/mcp/tools/sandbox/registerPromptSandboxTool.ts`:
- Around line 47-50: The call to promptSandbox uses authInfo!.token without
validating presence; update registerPromptSandboxTool so before calling
promptSandbox you verify authInfo and authInfo.token (e.g., check
authInfo?.token) and handle the missing token by returning/throwing a clear
domain error (or logging and aborting) instead of proceeding with invalid
credentials; ensure the check is placed immediately before the promptSandbox
invocation and reference authInfo and promptSandbox in the fix so callers get a
deterministic error when token is absent.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8532d60 and 28785b5.

⛔ Files ignored due to path filters (6)
  • lib/mcp/tools/sandbox/__tests__/registerPromptSandboxTool.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/createSandbox.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/createSandboxFromSnapshot.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/createSandboxPostHandler.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/processCreateSandbox.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/promptSandbox.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
📒 Files selected for processing (5)
  • lib/mcp/tools/sandbox/registerPromptSandboxTool.ts
  • lib/sandbox/createSandbox.ts
  • lib/sandbox/createSandboxFromSnapshot.ts
  • lib/sandbox/processCreateSandbox.ts
  • lib/sandbox/promptSandbox.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • lib/sandbox/createSandboxFromSnapshot.ts
  • lib/sandbox/promptSandbox.ts

Comment on lines +47 to +50
const result = await promptSandbox({
accountId,
apiKey: authInfo!.token,
prompt: args.prompt,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard missing API key token before invoking promptSandbox.

At Line 49, authInfo!.token is used without validation. If account resolution succeeds but token is absent, the sandbox command gets invalid credentials and fails in a less controlled way.

🔧 Proposed fix
       if (!accountId) {
         return getToolResultError("Failed to resolve account ID");
       }

+      if (!authInfo?.token) {
+        return getToolResultError("Authentication required: missing API key token");
+      }
+
       try {
         const result = await promptSandbox({
           accountId,
-          apiKey: authInfo!.token,
+          apiKey: authInfo.token,
           prompt: args.prompt,
         });

As per coding guidelines: “For domain functions, ensure … proper error handling”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/mcp/tools/sandbox/registerPromptSandboxTool.ts` around lines 47 - 50, The
call to promptSandbox uses authInfo!.token without validating presence; update
registerPromptSandboxTool so before calling promptSandbox you verify authInfo
and authInfo.token (e.g., check authInfo?.token) and handle the missing token by
returning/throwing a clear domain error (or logging and aborting) instead of
proceeding with invalid credentials; ensure the check is placed immediately
before the promptSandbox invocation and reference authInfo and promptSandbox in
the fix so callers get a deterministic error when token is absent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sweetmantech sweetmantech merged commit b5430d5 into test Feb 25, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant