
feat: add runSandboxCommand task for background sandbox execution #17

Merged
sweetmantech merged 6 commits into main from
sweetmantech/myc-4099-task-runsandboxcommand
Feb 2, 2026


@sweetmantech sweetmantech commented Feb 2, 2026

Summary

  • Adds runSandboxCommandTask to move sandbox execution from Recoup-API to background processing
  • Creates Vercel Sandbox with Claude Code CLI installed
  • Executes prompts asynchronously so the API can respond faster

Changes

  • New task: src/tasks/runSandboxCommandTask.ts - Background task using schemaTask for typed payloads
  • New helpers: src/sandboxes/installClaudeCode.ts, src/sandboxes/runClaudeCode.ts
  • New schema: src/schemas/sandboxSchema.ts - Zod validation for task payload
  • Dependencies: Added @vercel/sandbox, ms, typescript, @types/ms

Task Configuration

  • Task ID: run-sandbox-command
  • Max duration: 15 minutes (sandbox runs for up to 10 minutes)
  • Max attempts: 1 (no retries, because sandbox operations are not idempotent)
  • Payload: { prompt: string, accountId: string (UUID) }
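The configuration above might translate into task options along these lines. This is a minimal sketch, not the merged code: `TaskOptions` is a hypothetical stand-in for the options object passed to Trigger.dev's `schemaTask`, and it assumes `maxDuration` is expressed in seconds and that a single attempt is configured through `retry.maxAttempts`.

```typescript
// Hypothetical stand-in for the options object passed to schemaTask;
// the real type comes from the Trigger.dev SDK.
interface TaskOptions {
  id: string;
  maxDuration: number; // assumed to be in seconds
  retry: { maxAttempts: number };
}

const SANDBOX_TIMEOUT_MINUTES = 10; // the sandbox runs for up to 10 minutes
const TASK_MAX_DURATION_MINUTES = 15; // the task limit leaves headroom for setup and cleanup

// Sandbox timeout in milliseconds (the PR adds the `ms` package for this kind of value).
const sandboxTimeoutMs = SANDBOX_TIMEOUT_MINUTES * 60 * 1000;

const runSandboxCommandOptions: TaskOptions = {
  id: "run-sandbox-command",
  maxDuration: TASK_MAX_DURATION_MINUTES * 60,
  // One attempt means a failed run is never re-executed,
  // since sandbox operations are not idempotent.
  retry: { maxAttempts: 1 },
};
```

Keeping the task limit above the sandbox timeout means the sandbox should expire on its own before the task is cut off, so cleanup still has time to run.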

How to trigger from Recoup-API

import { tasks } from "@trigger.dev/sdk/v3";
import type { runSandboxCommandTask } from "recoup-tasks";

await tasks.trigger<typeof runSandboxCommandTask>("run-sandbox-command", {
  prompt: "Your prompt here",
  accountId: "uuid-here",
});

Test plan

  • Deploy to Trigger.dev and verify task appears in dashboard
  • Trigger task manually from Trigger.dev dashboard
  • Verify sandbox creates, installs Claude Code, and executes prompt
  • Verify sandbox stops after completion

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Run Claude Code prompts in isolated sandboxes with automated setup, model-configured execution, logging, and environment cleanup.
    • Background task support that executes prompts and returns sandbox execution status and results.
  • Dependencies

    • Added runtime and dev packages to support sandbox execution tooling and TypeScript development.

Move sandbox command execution to Trigger.dev task to allow the API
to respond faster while sandbox operations run asynchronously.

- Add runSandboxCommandTask with schemaTask for typed payloads
- Add sandbox helper functions (installClaudeCode, runClaudeCode)
- Add Zod schemas for payload validation
- Add @vercel/sandbox and ms dependencies

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

coderabbitai bot commented Feb 2, 2026

📝 Walkthrough

Walkthrough

Adds Claude Code support: new dependencies, sandbox helpers to install and run Anthropic's Claude Code CLI and SDK inside Vercel sandboxes, Zod schemas for payload/result typing, and a background schemaTask that manages the sandbox lifecycle and executes prompts.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Dependencies: package.json | Added runtime deps @vercel/sandbox, ms; added devDeps @types/ms, typescript (minor formatting tweak). |
| Sandbox utilities: src/sandboxes/installClaudeCode.ts, src/sandboxes/runClaudeCode.ts | New helpers: installClaudeCode(sandbox) installs @anthropic-ai/claude-code and @anthropic-ai/sdk; runClaudeCode(sandbox, prompt) writes a shell script and runs the claude CLI inside the sandbox using ANTHROPIC_API_KEY. |
| Schema definitions: src/schemas/sandboxSchema.ts | Added Zod schemas and types: runSandboxCommandPayloadSchema (requires prompt, sandboxId) and sandboxResultSchema (sandboxId, sandboxStatus, timeout, createdAt). |
| Task orchestration: src/tasks/runSandboxCommandTask.ts | Added runSandboxCommandTask (created via schemaTask) that connects to a Vercel sandbox, installs Claude Code, runs a prompt, logs lifecycle events, returns a SandboxResult, and always stops the sandbox in finally; maxDuration 15m, single attempt. |

Sequence Diagram

sequenceDiagram
    actor Runner as Task Runner
    participant Task as runSandboxCommandTask
    participant SandboxLib as Vercel Sandbox
    participant Installer as installClaudeCode
    participant Executor as runClaudeCode
    participant Logger as Logger

    Runner->>Task: invoke(payload: { prompt, sandboxId })
    Task->>Logger: log start
    Task->>SandboxLib: Sandbox.get(sandboxId)
    Task->>Logger: log connected
    Task->>Installer: installClaudeCode(sandbox)
    Installer->>SandboxLib: run npm install commands
    Installer->>Logger: log install complete
    Task->>Executor: runClaudeCode(sandbox, prompt)
    Executor->>SandboxLib: write script & run claude with ANTHROPIC_API_KEY
    Executor->>Logger: log execution result
    Task->>Logger: construct SandboxResult / log success
    Task->>SandboxLib: stop sandbox (finally)
    Task->>Runner: return SandboxResult

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I hopped into a sandbox small,

I planted prompts and gave a call,
Claude stirred up and spoke its code,
I logged the prints and closed the node,
A joyful hop — the job's not tall.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: add runSandboxCommand task for background sandbox execution' directly and accurately summarizes the main change: adding a background task for sandbox execution. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |


sweetmantech and others added 2 commits February 2, 2026 14:11
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The API will create the sandbox and return the sandboxId immediately.
This task connects to the existing sandbox and runs the work asynchronously.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/sandboxes/runClaudeCode.ts`:
- Around line 30-40: After calling sandbox.runCommand(...) (the invocation that
runs "ralph-once.sh") you must check result.exitCode and fail the task when it's
non‑zero: inspect the returned result from sandbox.runCommand, and if
result.exitCode !== 0 throw an Error (include the exit code and any available
stdout/stderr from result in the error message) instead of only logging; keep
the existing logger.log for successful runs but ensure non‑zero exit codes cause
the function to throw so failures are not silent.
🧹 Nitpick comments (4)
src/sandboxes/installClaudeCode.ts (1)

10-31: Pin Claude Code and Anthropic SDK versions for reproducible sandbox installations.

Installing latest versions at runtime can introduce breaking changes. The project already pins critical dependencies (@trigger.dev/sdk@4.3.3, trigger.dev@4.3.3, pnpm@9.10.0); apply the same rigor to sandbox package installations by using environment variables or hardcoded versions (e.g., @anthropic-ai/claude-code@X.Y.Z).

Proposed diff
 export async function installClaudeCode(sandbox: Sandbox): Promise<void> {
+  const claudeCodeVersion = process.env.CLAUDE_CODE_VERSION;
+  const anthropicSdkVersion = process.env.ANTHROPIC_SDK_VERSION;
+
+  if (!claudeCodeVersion || !anthropicSdkVersion) {
+    throw new Error(
+      "Missing CLAUDE_CODE_VERSION or ANTHROPIC_SDK_VERSION for deterministic installs"
+    );
+  }
+
   logger.log("Installing Claude Code CLI globally");

   const installCLI = await sandbox.runCommand({
     cmd: "npm",
-    args: ["install", "-g", "@anthropic-ai/claude-code"],
+    args: ["install", "-g", `@anthropic-ai/claude-code@${claudeCodeVersion}`],
     sudo: true,
   });
@@
   const installSDK = await sandbox.runCommand({
     cmd: "npm",
-    args: ["install", "@anthropic-ai/sdk"],
+    args: ["install", `@anthropic-ai/sdk@${anthropicSdkVersion}`],
   });
src/schemas/sandboxSchema.ts (2)

3-6: Trim prompt before enforcing non-empty.

Line 4 allows whitespace-only prompts. Zod supports .trim() chaining on string schemas, so consider trimming first to avoid "blank" prompts passing validation.

Proposed diff
 export const runSandboxCommandPayloadSchema = z.object({
-  prompt: z.string().min(1, "prompt is required"),
+  prompt: z.string().trim().min(1, "prompt is required"),
   accountId: z.string().uuid("accountId must be a valid UUID"),
 });
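To see why the order of `.trim()` and `.min(1)` matters, the two checks can be mirrored with plain string operations. This is an illustrative sketch only: `passesMinLength` and `passesTrimmedMinLength` are hypothetical helpers that imitate the two schema variants without using Zod.

```typescript
// Mirrors z.string().min(1): any string with at least one character passes.
function passesMinLength(prompt: string): boolean {
  return prompt.length >= 1;
}

// Mirrors z.string().trim().min(1): whitespace is stripped before the length check.
function passesTrimmedMinLength(prompt: string): boolean {
  return prompt.trim().length >= 1;
}

const blankPrompt = "   \n";
console.log(passesMinLength(blankPrompt)); // true: the blank prompt is accepted
console.log(passesTrimmedMinLength(blankPrompt)); // false: the blank prompt is rejected
```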

12-16: Tighten result schema types (datetime + nonnegative timeout).

createdAt and timeout lack validation. Use z.iso.datetime() for ISO 8601 date validation and z.number().int().nonnegative() to ensure timeout is a nonnegative integer.

Proposed diff
 export const sandboxResultSchema = z.object({
   sandboxId: z.string(),
   sandboxStatus: z.string(),
-  timeout: z.number(),
-  createdAt: z.string(),
+  timeout: z.number().int().nonnegative(),
+  createdAt: z.iso.datetime(),
 });
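The intent of the tightened schema can be sketched without Zod as two small predicates. The names below are hypothetical, and the timestamp check is deliberately stricter than a general ISO 8601 validator: it only accepts the exact UTC form produced by `Date.prototype.toISOString()`.

```typescript
// Hypothetical result shape using the field names from sandboxResultSchema.
interface SandboxResult {
  sandboxId: string;
  sandboxStatus: string;
  timeout: number;
  createdAt: string;
}

// Accepts only timestamps that round-trip through toISOString(),
// e.g. "2026-02-02T22:16:00.000Z".
function isIsoUtcTimestamp(value: string): boolean {
  const parsed = new Date(value);
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString() === value;
}

function isNonNegativeInteger(value: number): boolean {
  return Number.isInteger(value) && value >= 0;
}

// Returns a list of problems; an empty list means the result is valid.
function validateSandboxResult(result: SandboxResult): string[] {
  const problems: string[] = [];
  if (!isNonNegativeInteger(result.timeout)) {
    problems.push("timeout must be a nonnegative integer");
  }
  if (!isIsoUtcTimestamp(result.createdAt)) {
    problems.push("createdAt must be an ISO 8601 UTC timestamp");
  }
  return problems;
}
```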
src/tasks/runSandboxCommandTask.ts (1)

66-69: Correct the line numbers and verify the actual risk.

The code in question is at lines 59-62 (not 66-69). While sandbox.stop() is idempotent and resolves even if already stopped, the concern about error masking is technically valid: if the underlying API fails during stop(), it could override the original error. However, @vercel/sandbox.stop() is designed to be safe to call multiple times—any failures would only come from the underlying API layer (auth/network issues), not because the sandbox is already stopped. In practice, when an error occurs in the try block, the sandbox is typically already in a failed or stopped state, making stop() likely to resolve cleanly.

The proposed fix is a reasonable defensive practice to ensure the original error isn't lost if the API layer fails during cleanup, though the practical risk is low. Consider this refactor as optional, not critical.
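The defensive pattern described here can be sketched as a small wrapper. This is an assumption-laden illustration rather than the task's actual code: `Stoppable` is a hypothetical minimal interface for the sandbox object, and `withSandboxCleanup` and `onCleanupError` are names invented for this example.

```typescript
// Hypothetical minimal interface; the real sandbox object comes from @vercel/sandbox.
interface Stoppable {
  stop(): Promise<void>;
}

// Runs `work`, then always attempts to stop the sandbox. A failure inside
// stop() is reported but not rethrown, so it cannot replace an error
// thrown by `work`.
async function withSandboxCleanup<T>(
  sandbox: Stoppable,
  work: () => Promise<T>,
  onCleanupError: (error: unknown) => void = (error) =>
    console.error("Failed to stop sandbox", error),
): Promise<T> {
  try {
    return await work();
  } finally {
    try {
      await sandbox.stop();
    } catch (stopError) {
      // Rethrowing from finally would mask the original error.
      onCleanupError(stopError);
    }
  }
}
```

One trade-off of this sketch: a stop() failure after successful work is also swallowed, so the reporting callback is the only place it becomes visible.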

Comment on lines +30 to +40
const result = await sandbox.runCommand({
  cmd: "sh",
  args: ["ralph-once.sh"],
  env: {
    ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY || "",
  },
});

logger.log("Claude Code execution completed", {
  exitCode: result.exitCode,
});

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

fd "runClaudeCode.ts" --type f

Repository: Recoupable-com/Recoup-Tasks

Length of output: 101


🏁 Script executed:

cat -n src/sandboxes/runClaudeCode.ts

Repository: Recoupable-com/Recoup-Tasks

Length of output: 1383


🌐 Web query:

@vercel/sandbox runCommand exitCode stderr stdout API

💡 Result:

In @vercel/sandbox, sandbox.runCommand(...) returns a CommandFinished (when not detached) that gives you:

  • exitCode: number (0 = success) [1][2]
  • stdout(): Promise<string> and stderr(): Promise<string> to read the full output after it runs [2][3]
  • logs(): AsyncIterable<{ stream: "stdout" | "stderr"; data: string }> to stream output incrementally [3]
  • output("stdout" | "stderr" | "both") to collect one/both streams as a string [2][3]

Basic usage:

import { Sandbox } from "@vercel/sandbox";

const sandbox = await Sandbox.create();

const result = await sandbox.runCommand("node", ["-e", "console.log('hi'); console.error('oops')"]);
console.log(result.exitCode);          // number
console.log(await result.stdout());    // "hi\n"
console.log(await result.stderr());    // "oops\n"

await sandbox.stop();

Detached mode (returns a running Command; call wait() to get CommandFinished with a populated exitCode): [1][3]

const cmd = await sandbox.runCommand({ cmd: "sleep", args: ["5"], detached: true });
for await (const log of cmd.logs()) { /* stream stdout/stderr */ }
const done = await cmd.wait();
console.log(done.exitCode);

runCommand overloads / params:

  • runCommand(command: string, args?: string[], opts?: { signal?: AbortSignal }): Promise<CommandFinished> [1]
  • runCommand({ cmd, args, cwd, env, sudo, stdout, stderr, signal }): Promise<CommandFinished> [1]
  • runCommand({ ..., detached: true }): Promise<Command> [1]

Sources: [1] Sandbox.runCommand reference, [2] CommandFinished reference, [3] Command reference.


Fail the task when Claude Code exits non‑zero.

The code currently logs the exit code but doesn't check for failure. When sandbox.runCommand() returns a non-zero exitCode, the function completes successfully instead of throwing an error, causing silent failures.

Proposed fix
   const result = await sandbox.runCommand({
     cmd: "sh",
     args: ["ralph-once.sh"],
     env: {
       ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY || "",
     },
   });
 
+  if (result.exitCode !== 0) {
+    logger.error("Claude Code execution failed", {
+      exitCode: result.exitCode,
+    });
+    throw new Error(`Claude Code failed with exit code ${result.exitCode}`);
+  }
+
   logger.log("Claude Code execution completed", {
     exitCode: result.exitCode,
   });
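The agent prompt for this comment also asks for stdout/stderr in the error message, which the diff above leaves out. A minimal sketch of that variant follows, assuming the finished command exposes `exitCode`, `stdout()`, and `stderr()` as described in the web query result; `FinishedCommand`, `assertCommandSucceeded`, and the output cap are hypothetical names and values chosen for illustration.

```typescript
// Minimal shape of a finished command, limited to the fields quoted above.
interface FinishedCommand {
  exitCode: number;
  stdout(): Promise<string>;
  stderr(): Promise<string>;
}

const MAX_OUTPUT_CHARS = 2000; // hypothetical cap to keep error messages readable

// Keeps only the last `limit` characters, where failures usually show up.
function tail(text: string, limit: number): string {
  return text.length > limit ? text.slice(-limit) : text;
}

// Throws when the command exited non-zero, attaching the captured output.
async function assertCommandSucceeded(
  result: FinishedCommand,
  label: string,
): Promise<void> {
  if (result.exitCode === 0) return;
  const [stdout, stderr] = await Promise.all([result.stdout(), result.stderr()]);
  throw new Error(
    `${label} failed with exit code ${result.exitCode}\n` +
      `stderr: ${tail(stderr, MAX_OUTPUT_CHARS)}\n` +
      `stdout: ${tail(stdout, MAX_OUTPUT_CHARS)}`,
  );
}
```

In runClaudeCode this would be called right after `sandbox.runCommand(...)` returns, before the success log line.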

Pass VERCEL_TOKEN, VERCEL_TEAM_ID, and VERCEL_PROJECT_ID explicitly
to Sandbox.get() for authentication in Trigger.dev environment.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
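Reading the three variables explicitly might look like the sketch below. `readVercelCredentials` and `VercelCredentials` are hypothetical names, and the property names `token`, `teamId`, and `projectId` are assumptions about what gets passed to `Sandbox.get()` alongside the sandbox ID; the merged code may be shaped differently.

```typescript
// Hypothetical credential bundle; property names are assumed, not taken from the PR.
interface VercelCredentials {
  token: string;
  teamId: string;
  projectId: string;
}

// Reads the three variables named in the commit message and fails fast,
// listing every missing one, instead of letting authentication fail later.
function readVercelCredentials(
  env: Record<string, string | undefined>,
): VercelCredentials {
  const required = ["VERCEL_TOKEN", "VERCEL_TEAM_ID", "VERCEL_PROJECT_ID"] as const;
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing Vercel credentials: ${missing.join(", ")}`);
  }
  return {
    token: env["VERCEL_TOKEN"] as string,
    teamId: env["VERCEL_TEAM_ID"] as string,
    projectId: env["VERCEL_PROJECT_ID"] as string,
  };
}
```

In the task, `process.env` would be passed as `env`; taking it as a parameter keeps the helper easy to test.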

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/schemas/sandboxSchema.ts`:
- Around line 3-6: The schema runSandboxCommandPayloadSchema currently only
enforces sandboxId as a non-empty string; change its sandboxId validator to
enforce a UUID (use z.string().uuid("sandboxId must be a valid UUID")) so
invalid IDs are rejected by Zod validation, leaving prompt validation unchanged
and keeping the schema defined in src/schemas/sandboxSchema.ts.

sweetmantech and others added 2 commits February 2, 2026 16:53
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
SDK doesn't auto-infer credentials in Trigger.dev environment.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@sweetmantech sweetmantech merged commit 8e99594 into main Feb 2, 2026
1 check passed
@sweetmantech sweetmantech deleted the sweetmantech/myc-4099-task-runsandboxcommand branch February 2, 2026 22:16