feat: inline setup for fresh sandbox onboarding (simplified) #256

Merged
sweetmantech merged 6 commits into test from sweetmantech/myc-4377-api-inline-setup-for-fresh-sandbox-onboarding
Mar 4, 2026

Conversation

@sweetmantech

@sweetmantech sweetmantech commented Mar 4, 2026

Summary

  • When promptSandboxStreaming detects a fresh sandbox (no snapshot), it delegates to the existing runSandboxCommandTask background task instead of duplicating the entire setup pipeline inline
  • createSandboxFromSnapshot now returns fromSnapshot boolean
  • getOrCreateSandbox propagates fromSnapshot to callers
  • New pollTaskRun utility wraps runs.poll() from Trigger.dev SDK
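
The pollTaskRun utility itself is not shown in this conversation. As a rough illustration of what waiting on a background run involves, the sketch below polls through a hypothetical `retrieve` callback and uses made-up status names; it is a stand-in for the idea, not the Trigger.dev SDK, and the PR's utility delegates to runs.poll() rather than looping itself.

```typescript
// Hypothetical run shape; the status names are illustrative, not the SDK's.
type RunStatus = "PENDING" | "COMPLETED" | "FAILED";

interface TaskRun<T> {
  id: string;
  status: RunStatus;
  output?: T;
}

// Calls `retrieve` until the run leaves PENDING or the attempts run out.
async function pollUntilDone<T>(
  runId: string,
  retrieve: (id: string) => Promise<TaskRun<T>>,
  options: { intervalMs?: number; maxAttempts?: number } = {},
): Promise<TaskRun<T>> {
  const { intervalMs = 1000, maxAttempts = 60 } = options;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const run = await retrieve(runId);
    if (run.status !== "PENDING") return run;
    // Wait before asking again so the backend is not hammered.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Run ${runId} did not finish after ${maxAttempts} attempts`);
}
```

Bounding the number of attempts keeps a stuck run from holding the caller open indefinitely, which is the same concern later commits in this PR address by returning the run ID instead of waiting.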

How it works

| Scenario | Behavior |
| --- | --- |
| Account has snapshot | Existing flow: create from snapshot, run prompt directly, stream output |
| Account has active sandbox | Existing flow: reuse sandbox, run prompt directly, stream output |
| Fresh account (no snapshot) | New: trigger runSandboxCommandTask (handles full setup + runs prompt), poll for completion |
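
The three scenarios reduce to a single branch on two flags. A minimal sketch, assuming the `created && !fromSnapshot` condition quoted in the review comments on this PR; the function and type names are hypothetical:

```typescript
type ExecutionPath = "background-task" | "direct-stream";

// Only a sandbox that was just created without a snapshot needs full setup.
function chooseExecutionPath(origin: { created: boolean; fromSnapshot: boolean }): ExecutionPath {
  return origin.created && !origin.fromSnapshot ? "background-task" : "direct-stream";
}

// Account has snapshot: created from snapshot, streamed directly.
chooseExecutionPath({ created: true, fromSnapshot: true });
// Account has active sandbox: reused, streamed directly.
chooseExecutionPath({ created: false, fromSnapshot: true });
// Fresh account: handed to the background task.
chooseExecutionPath({ created: true, fromSnapshot: false });
```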

Why this is simpler than PR #252

PR #252 duplicated the entire setup pipeline (3000+ lines, 38 files) inline in promptSandboxStreaming. This PR instead reuses the existing runSandboxCommandTask, which already handles OpenClaw install, GitHub repo setup, org repos, README, skills, push, and snapshot. The change touches ~8 files and adds ~250 net new lines.

Test plan

  • 165 tests pass across 26 test files
  • New tests for fromSnapshot in createSandboxFromSnapshot
  • New tests for fromSnapshot propagation in getOrCreateSandbox
  • New tests for fresh sandbox delegation in promptSandboxStreaming
  • New tests for pollTaskRun utility

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features
    • Fresh sandboxes now trigger background execution immediately, improving initialization speed.
    • System now tracks sandbox origin (created from snapshot or newly created) and exposes run identifiers for better execution monitoring.

When promptSandboxStreaming detects a fresh sandbox (no snapshot),
it delegates to the existing runSandboxCommandTask background task
which handles full setup (OpenClaw, GitHub, etc.) instead of
duplicating the setup pipeline inline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vercel

vercel bot commented Mar 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| recoup-api | Ready | Preview | Mar 4, 2026 1:14pm |

@coderabbitai
coderabbitai bot commented Mar 4, 2026

📝 Walkthrough

The changes implement snapshot-origin tracking throughout the sandbox creation and streaming pipeline. A new fromSnapshot flag is introduced and propagated from sandbox creation through prompt execution, with fresh sandboxes now triggering background run tasks instead of direct streaming.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Sandbox Creation Flow: lib/sandbox/createSandboxFromSnapshot.ts, lib/sandbox/getOrCreateSandbox.ts | New CreateSandboxFromSnapshotResult interface with fromSnapshot flag introduced; return types updated to expose snapshot origin. GetOrCreateSandboxResult now extends the new interface to propagate fromSnapshot through both existing and fresh sandbox paths. |
| Streaming & Tooling: lib/sandbox/promptSandboxStreaming.ts, lib/chat/tools/createPromptSandboxStreamingTool.ts | Added fromSnapshot and runId fields to streaming result interfaces. Fresh sandboxes (non-snapshot) now trigger background run via triggerRunSandboxCommand with early return; snapshotted sandboxes continue direct streaming. Tool updated to conditionally emit new fields in final result payload. |
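
The propagation described for the sandbox creation flow could look roughly like the following. This is a sketch under stated assumptions: the `Sandbox` shape and the injected `Deps` object are hypothetical and exist only so the example runs without a real sandbox provider; the interface names and returned fields follow the summary above and the snippets quoted in this review.

```typescript
// Hypothetical minimal sandbox shape; the real type comes from the sandbox SDK.
interface Sandbox {
  sandboxId: string;
}

interface CreateSandboxFromSnapshotResult {
  sandbox: Sandbox;
  fromSnapshot: boolean;
}

interface GetOrCreateSandboxResult extends CreateSandboxFromSnapshotResult {
  sandboxId: string;
  created: boolean;
}

// Injected dependencies (an assumption of this sketch, not the PR's signature).
interface Deps {
  getActiveSandbox(accountId: string): Promise<Sandbox | null>;
  createSandboxFromSnapshot(accountId: string): Promise<CreateSandboxFromSnapshotResult>;
}

async function getOrCreateSandbox(accountId: string, deps: Deps): Promise<GetOrCreateSandboxResult> {
  const existing = await deps.getActiveSandbox(accountId);
  if (existing) {
    // Reuse path: the flag is hardcoded to true, as in the reviewed code.
    return { sandbox: existing, sandboxId: existing.sandboxId, created: false, fromSnapshot: true };
  }
  // Creation path: forward whatever origin the creator reports.
  const { sandbox, fromSnapshot } = await deps.createSandboxFromSnapshot(accountId);
  return { sandbox, sandboxId: sandbox.sandboxId, created: true, fromSnapshot };
}
```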

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant GetOrCreateSandbox
    participant CreateSandboxFromSnapshot
    participant TriggerRunSandboxCommand
    participant DirectStreaming

    Client->>GetOrCreateSandbox: getOrCreateSandbox(accountId)
    
    alt Existing Sandbox Found
        GetOrCreateSandbox-->>Client: {sandbox, sandboxId, created: false, fromSnapshot: true}
    else Fresh Sandbox Created
        GetOrCreateSandbox->>CreateSandboxFromSnapshot: createSandboxFromSnapshot(accountId)
        CreateSandboxFromSnapshot-->>GetOrCreateSandbox: {sandbox, fromSnapshot: false}
        GetOrCreateSandbox->>TriggerRunSandboxCommand: trigger background run
        TriggerRunSandboxCommand-->>GetOrCreateSandbox: runId
        GetOrCreateSandbox-->>Client: {sandbox, sandboxId, created: true, fromSnapshot: false}
    end
    
    Client->>DirectStreaming: promptSandboxStreaming(sandbox, fromSnapshot)
    
    alt Fresh (fromSnapshot: false)
        DirectStreaming->>TriggerRunSandboxCommand: initiate background execution
        DirectStreaming-->>Client: {runId, fromSnapshot: false, stdout: "", stderr: ""}
    else Snapshot-based (fromSnapshot: true)
        DirectStreaming->>DirectStreaming: execute & stream directly
        DirectStreaming-->>Client: {stdout, stderr, fromSnapshot: true}
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🎯 From snapshots old or freshly spun,
A flag now tracks which path we've run—
The fresh ones leap to background tasks,
While snapshots stream what one might ask.
Clean contracts flow through every stage,
SOLID principles on every page! 📋✨

🚥 Pre-merge checks | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Solid & Clean Code | ⚠️ Warning | The pull request contains multiple SOLID and clean code violations including SRP violation in getOrCreateSandbox with incorrect fromSnapshot assumption, DRY violation with duplicated return structures, unused variable accumulation, overly complex conditional logic, and missing error handling. | Refactor getOrCreateSandbox to properly track sandbox origin; consolidate duplicate return statements; remove unused stdout variable; simplify conditional field assignment; add comprehensive error handling with try/catch blocks throughout all affected functions. |


…hotResult

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
lib/sandbox/getOrCreateSandbox.ts (1)

9-9: fromSnapshot is semantically overloaded in the existing-sandbox path.

On Line 26, fromSnapshot: true means “already has an active sandbox,” not strictly “originated from snapshot.” Consider renaming this flag (e.g., hasExistingSetup) to reduce future misuse.

As per coding guidelines, "**/*.{ts,tsx}: ... use specific names like 'artist', 'workspace', 'organization' when referring to specific types".

Also applies to: 26-26

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/sandbox/getOrCreateSandbox.ts` at line 9, The flag fromSnapshot is
overloaded in getOrCreateSandbox (used to mean "already has an active sandbox"
in the existing-sandbox path) — rename that boolean to a specific, unambiguous
name (e.g., hasExistingSetup) and update all places that construct or read this
property (the object created in the existing-sandbox branch where fromSnapshot:
true is set and any consumer code that checks the flag) to use the new name;
ensure the exported/returned type signature and any references in functions or
classes that expect fromSnapshot are updated accordingly to avoid breaking type
checks and follow the naming guideline for specific, semantic property names.
lib/sandbox/promptSandboxStreaming.ts (1)

31-37: Split this function into focused helpers to reduce complexity.

promptSandboxStreaming now orchestrates sandbox retrieval, fresh-background execution, and direct streaming in one flow. Extracting the fresh path and direct path into helpers will improve readability/testability and keep this under the recommended size.

As per coding guidelines, lib/**/*.ts: "Single responsibility per function", "Keep functions under 50 lines", and "DRY: Consolidate similar logic into shared utilities".

Also applies to: 43-70, 72-102

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/sandbox/promptSandboxStreaming.ts` around lines 31 - 37, The
promptSandboxStreaming function is doing too much—split its responsibilities
into focused helpers: extract sandbox retrieval into
getOrCreateSandbox(sandboxId|input) and separate the two execution flows into
handleFreshExecution(input, sandbox) and handleDirectStreaming(input, sandbox)
(or similarly named functions referenced from promptSandboxStreaming) so the
main function only routes to the appropriate helper and yields their
AsyncGenerator output; move any duplicated logic (e.g., stream event mapping,
error handling, logger usage) into a shared utility (e.g., streamEventMapper or
emitStreamChunk) so duplicate code in the fresh-path and direct-path sections is
consolidated, keep each helper under ~50 lines, update promptSandboxStreaming to
call these helpers and forward their results, and adjust tests to call the
helpers directly for focused unit tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/sandbox/promptSandboxStreaming.ts`:
- Around line 56-61: The code assumes pollTaskRun(handle.id) returns a run with
a non-null structured output and force-casts it; update the logic around
pollTaskRun and the local output variable to first check that run and run.output
exist and have the expected shape (stdout, stderr, exitCode) before casting or
accessing properties, and throw or return a clear structured error if missing or
malformed; specifically adjust the block using pollTaskRun, the const run =
await pollTaskRun(handle.id) and the subsequent use of output (and the handling
around lines 63–69) to validate types/fields (or use a safe parser/guard)
instead of force-casting.
- Around line 43-56: The generator currently awaits pollTaskRun(handle.id)
unconditionally and can outlive an aborted caller; update the created &&
!fromSnapshot branch to respect the provided abortSignal by (1) passing the
abortSignal into triggerRunSandboxCommand and/or pollTaskRun if those helpers
accept it (use the abortSignal parameter), and (2) if they don't, check
abortSignal.aborted immediately after obtaining handle (and subscribe to
abortSignal) to cancel the run and throw or return early; ensure you also call
any available cancellation API for the run (e.g., a cancelTaskRun or
handle.cancel) before exiting to avoid orphaned background tasks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8e40e5c5-5e51-4b7f-a2db-867fafac8b06

📥 Commits

Reviewing files that changed from the base of the PR and between 36d385e and 6d41770.

⛔ Files ignored due to path filters (4)
  • lib/sandbox/__tests__/createSandboxFromSnapshot.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/getOrCreateSandbox.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/promptSandboxStreaming.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/trigger/__tests__/pollTaskRun.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
📒 Files selected for processing (4)
  • lib/sandbox/createSandboxFromSnapshot.ts
  • lib/sandbox/getOrCreateSandbox.ts
  • lib/sandbox/promptSandboxStreaming.ts
  • lib/trigger/pollTaskRun.ts

Comment on lines +43 to +56
if (created && !fromSnapshot) {
  yield {
    data: "Setting up your sandbox for the first time...\n",
    stream: "stderr" as const,
  };

  const handle = await triggerRunSandboxCommand({
    command: "openclaw",
    args: ["agent", "--agent", "main", "--message", prompt],
    sandboxId,
    accountId,
  });

  const run = await pollTaskRun(handle.id);

⚠️ Potential issue | 🟠 Major

Background-task path currently ignores abortSignal.

Line 56 awaits polling unconditionally; if the caller aborts, this generator can continue waiting and outlive the request lifecycle.

Proposed cancellation wiring
+    if (abortSignal?.aborted) {
+      throw new DOMException("Operation aborted", "AbortError");
+    }
+
     const handle = await triggerRunSandboxCommand({
       command: "openclaw",
       args: ["agent", "--agent", "main", "--message", prompt],
       sandboxId,
       accountId,
     });
 
-    const run = await pollTaskRun(handle.id);
+    const runPromise = pollTaskRun(handle.id);
+    const run = abortSignal
+      ? await Promise.race([
+          runPromise,
+          new Promise<never>((_, reject) => {
+            abortSignal.addEventListener(
+              "abort",
+              () => reject(new DOMException("Operation aborted", "AbortError")),
+              { once: true },
+            );
+          }),
+        ])
+      : await runPromise;

As per coding guidelines, lib/**/*.ts: "For domain functions, ensure: ... Proper error handling".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/sandbox/promptSandboxStreaming.ts` around lines 43 - 56, The generator
currently awaits pollTaskRun(handle.id) unconditionally and can outlive an
aborted caller; update the created && !fromSnapshot branch to respect the
provided abortSignal by (1) passing the abortSignal into
triggerRunSandboxCommand and/or pollTaskRun if those helpers accept it (use the
abortSignal parameter), and (2) if they don't, check abortSignal.aborted
immediately after obtaining handle (and subscribe to abortSignal) to cancel the
run and throw or return early; ensure you also call any available cancellation
API for the run (e.g., a cancelTaskRun or handle.cancel) before exiting to avoid
orphaned background tasks.
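
The race in the proposed diff can be pulled into a small reusable helper that also removes its listener once the work settles. This is a sketch, not the reviewer's or the author's code: `raceWithAbort` and `abortError` are hypothetical names, and a plain `Error` named "AbortError" is used in place of `DOMException` to keep the example portable. As the comment notes, it only stops waiting; cancelling the background run itself would still need a separate call.

```typescript
function abortError(): Error {
  const error = new Error("Operation aborted");
  error.name = "AbortError";
  return error;
}

// Settles with `work`, or rejects as soon as `signal` aborts, whichever comes first.
// This stops the caller from waiting; it does not cancel the underlying work.
function raceWithAbort<T>(work: Promise<T>, signal?: AbortSignal): Promise<T> {
  if (!signal) return work;
  if (signal.aborted) {
    // Swallow a later rejection of the abandoned work so it is not reported as unhandled.
    work.catch(() => {});
    return Promise.reject(abortError());
  }
  return new Promise<T>((resolve, reject) => {
    const onAbort = () => reject(abortError());
    signal.addEventListener("abort", onAbort, { once: true });
    work
      .then(resolve, reject)
      .finally(() => signal.removeEventListener("abort", onAbort));
  });
}
```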

…oxes

Avoids API function timeouts by triggering the background task and
returning the runId immediately. The caller can track progress
without blocking the request.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
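
A rough sketch of the non-blocking shape this commit describes: the generator yields nothing on the fresh path and hands the run ID back as its return value. `triggerTask` and `streamDirect` are hypothetical stand-ins for triggerRunSandboxCommand and the direct streaming path, and the chunk and result types are simplified assumptions.

```typescript
interface StreamChunk {
  data: string;
  stream: "stdout" | "stderr";
}

interface StreamResult {
  stdout: string;
  stderr: string;
  exitCode: number;
  runId?: string;
}

async function* promptSketch(
  fresh: boolean,
  triggerTask: () => Promise<{ id: string }>,
  streamDirect: () => AsyncGenerator<StreamChunk, StreamResult>,
): AsyncGenerator<StreamChunk, StreamResult> {
  if (fresh) {
    const handle = await triggerTask();
    // Return immediately; the caller tracks progress through the run ID.
    return { stdout: "", stderr: "", exitCode: 0, runId: handle.id };
  }
  // Existing path: forward every chunk and the final result unchanged.
  return yield* streamDirect();
}

// Drains a generator, passing chunks to `onChunk`, and resolves with its return value.
async function drain<Y, R>(gen: AsyncGenerator<Y, R>, onChunk: (chunk: Y) => void): Promise<R> {
  while (true) {
    const step = await gen.next();
    if (step.done) return step.value;
    onChunk(step.value);
  }
}
```

A `for await` loop would discard the generator's return value, which is why the consumer here steps through `next()` by hand.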
Chat frontend will show the same tool component as run sandbox command,
so no need to yield a status message from the streaming generator.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/sandbox/getOrCreateSandbox.ts`:
- Around line 21-27: getOrCreateSandbox currently returns fromSnapshot: true
when reusing an existing Sandbox returned by getActiveSandbox(accountId), but
that branch has no snapshot-origin metadata so it can misreport origin; update
the Sandbox persistence/retrieval to include origin metadata (e.g., add an
origin/fromSnapshot field when creating and storing sandboxes) and change
getActiveSandbox and getOrCreateSandbox to read and propagate that real origin
value (use the Sandbox object's origin/fromSnapshot property instead of
hardcoding true) so callers receive the correct origin flag for
sandboxId/sandbox.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: efb2d170-e9c3-4470-b661-7ff1d1bf8375

📥 Commits

Reviewing files that changed from the base of the PR and between 6d41770 and 02ce1cb.

📒 Files selected for processing (1)
  • lib/sandbox/getOrCreateSandbox.ts

Comment on lines 21 to 27
if (existing) {
  return {
    sandbox: existing,
    sandboxId: existing.sandboxId,
    created: false,
    fromSnapshot: true,
  };

⚠️ Potential issue | 🟡 Minor

fromSnapshot is hardcoded for reused sandboxes and can misreport origin.

On Line 26, fromSnapshot: true is returned even though this branch only has Sandbox from getActiveSandbox(accountId) and no snapshot-origin metadata. That can leak incorrect state to callers if an active sandbox was originally created fresh.

Recommend persisting/retrieving sandbox origin metadata (e.g., alongside sandbox_id) and returning the real value here, rather than inferring it.
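
One possible reading of this recommendation, sketched with an in-memory store: record the origin when the sandbox is created and read it back on reuse. The record shape, the store class, and its method names are all hypothetical; the real persistence layer is not shown in this PR.

```typescript
// Hypothetical persisted record; the field names are assumptions.
interface SandboxRecord {
  sandboxId: string;
  fromSnapshot: boolean;
}

// In-memory stand-in for wherever active sandboxes are tracked per account.
class SandboxOriginStore {
  private readonly records = new Map<string, SandboxRecord>();

  save(accountId: string, record: SandboxRecord): void {
    this.records.set(accountId, record);
  }

  getActive(accountId: string): SandboxRecord | undefined {
    return this.records.get(accountId);
  }
}

// Reuse path that reports the stored origin instead of assuming true.
function describeReusedSandbox(store: SandboxOriginStore, accountId: string) {
  const record = store.getActive(accountId);
  if (!record) return null;
  return { sandboxId: record.sandboxId, created: false, fromSnapshot: record.fromSnapshot };
}
```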

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/sandbox/getOrCreateSandbox.ts` around lines 21 - 27, getOrCreateSandbox
currently returns fromSnapshot: true when reusing an existing Sandbox returned
by getActiveSandbox(accountId), but that branch has no snapshot-origin metadata
so it can misreport origin; update the Sandbox persistence/retrieval to include
origin metadata (e.g., add an origin/fromSnapshot field when creating and
storing sandboxes) and change getActiveSandbox and getOrCreateSandbox to read
and propagate that real origin value (use the Sandbox object's
origin/fromSnapshot property instead of hardcoding true) so callers receive the
correct origin flag for sandboxId/sandbox.

Allows the chat tool UI component to determine which UI to render
based on whether the sandbox was created from a snapshot or fresh.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The complete status now includes fromSnapshot and runId when a fresh
sandbox triggers a background setup task. This allows the chat frontend
to show the appropriate UI component (run sandbox command view vs
streaming view).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
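
The conditional fields and the frontend's choice of component might be sketched as follows. The payload shape, the `status` value, and the view names are assumptions made for illustration; only the `fromSnapshot` and `runId` fields come from the commit message.

```typescript
interface PromptSandboxResult {
  sandboxId: string;
  stdout: string;
  stderr: string;
  exitCode: number;
  created: boolean;
  fromSnapshot?: boolean;
  runId?: string;
}

// Adds fromSnapshot and runId only when the streaming result actually carries them.
function buildCompletePayload(result: PromptSandboxResult) {
  return {
    status: "complete" as const,
    sandboxId: result.sandboxId,
    stdout: result.stdout,
    stderr: result.stderr,
    exitCode: result.exitCode,
    created: result.created,
    ...(result.fromSnapshot !== undefined ? { fromSnapshot: result.fromSnapshot } : {}),
    ...(result.runId !== undefined ? { runId: result.runId } : {}),
  };
}

type SandboxView = "run-sandbox-command" | "streaming";

// A run ID signals that a background task owns the work, so show that task's view.
function chooseView(payload: { runId?: string }): SandboxView {
  return payload.runId ? "run-sandbox-command" : "streaming";
}
```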
@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
lib/sandbox/promptSandboxStreaming.ts (1)

45-51: Refactor shared command construction and split execution paths for maintainability.

This function now handles two execution strategies plus output aggregation, and the OpenClaw args are duplicated in Line 48 and Line 67. Extracting helpers will reduce branching noise and keep this domain function tighter.

♻️ Proposed refactor
+function buildOpenClawAgentArgs(prompt: string): string[] {
+  return ["agent", "--agent", "main", "--message", prompt];
+}
+
+async function triggerFreshSandboxRun(params: {
+  sandboxId: string;
+  accountId: string;
+  prompt: string;
+  created: boolean;
+  fromSnapshot: boolean;
+}): Promise<PromptSandboxStreamingResult> {
+  const handle = await triggerRunSandboxCommand({
+    command: "openclaw",
+    args: buildOpenClawAgentArgs(params.prompt),
+    sandboxId: params.sandboxId,
+    accountId: params.accountId,
+  });
+
+  return {
+    sandboxId: params.sandboxId,
+    stdout: "",
+    stderr: "",
+    exitCode: 0,
+    created: params.created,
+    fromSnapshot: params.fromSnapshot,
+    runId: handle.id,
+  };
+}
+
 export async function* promptSandboxStreaming(
   input: PromptSandboxStreamingInput,
 ): AsyncGenerator<
@@
   // Fresh sandbox: trigger background task for full setup + prompt execution
   if (created && !fromSnapshot) {
-    const handle = await triggerRunSandboxCommand({
-      command: "openclaw",
-      args: ["agent", "--agent", "main", "--message", prompt],
-      sandboxId,
-      accountId,
-    });
-
-    return {
-      sandboxId,
-      stdout: "",
-      stderr: "",
-      exitCode: 0,
-      created,
-      fromSnapshot,
-      runId: handle.id,
-    };
+    return triggerFreshSandboxRun({
+      sandboxId,
+      accountId,
+      prompt,
+      created,
+      fromSnapshot,
+    });
   }
@@
   const cmd = await sandbox.runCommand({
     cmd: "openclaw",
-    args: ["agent", "--agent", "main", "--message", prompt],
+    args: buildOpenClawAgentArgs(prompt),

As per coding guidelines, lib/**/*.ts: "For domain functions, ensure: Single responsibility per function ... Keep functions under 50 lines ... DRY: Consolidate similar logic into shared utilities" and **/*.{ts,tsx}: "Extract shared logic into reusable utilities following Don't Repeat Yourself (DRY) principle".

Also applies to: 65-67

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/sandbox/promptSandboxStreaming.ts` around lines 45 - 51, The OpenClaw
command args are duplicated in promptSandboxStreaming (used in the
triggerRunSandboxCommand call and again in the alternate execution path);
extract a small helper like buildOpenClawArgs(prompt) and a
runOpenClaw(sandboxId, accountId, prompt) wrapper that calls
triggerRunSandboxCommand with the constructed args, then replace the two inline
arg arrays and direct calls with those helpers, and refactor the two execution
branches to both call the same run helper and a single output-aggregation
routine so promptSandboxStreaming remains focused and DRY.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 56e6743e-7a4e-414e-b9f8-e352e551d53b

📥 Commits

Reviewing files that changed from the base of the PR and between 02ce1cb and 4cca737.

⛔ Files ignored due to path filters (2)
  • lib/chat/tools/__tests__/createPromptSandboxStreamingTool.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/promptSandboxStreaming.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
📒 Files selected for processing (2)
  • lib/chat/tools/createPromptSandboxStreamingTool.ts
  • lib/sandbox/promptSandboxStreaming.ts

@sweetmantech sweetmantech merged commit 7555cfb into test Mar 4, 2026
3 checks passed