Skip to content

feat(sandbox): port buildOrgSnapshotWorkflow + Vercel Workflow runtime#531

Merged
sweetmantech merged 3 commits intotestfrom
feat/sandbox-build-org-snapshot-workflow
May 7, 2026
Merged

feat(sandbox): port buildOrgSnapshotWorkflow + Vercel Workflow runtime#531
sweetmantech merged 3 commits intotestfrom
feat/sandbox-build-org-snapshot-workflow

Conversation

@sweetmantech
Copy link
Copy Markdown
Contributor

@sweetmantech sweetmantech commented May 7, 2026

Summary

Closes the biggest remaining cutover gap from the gap analysis: when a recoupable org repo is requested but no snapshot exists yet, fire a durable background workflow that builds one. The current request still pays the slow full-clone path; what's protected is that future sessions for the same org warm-boot from the new snapshot in seconds.

This is the first Vercel Workflow integration in api. Per the direction of using Vercel Workflow as the runner going forward (deprecating Trigger.dev over time), this PR lands the infra alongside the first workflow.

What's new

File Purpose
`next.config.ts` Wraps the config with `withWorkflow` from `workflow/next` so workflow files become deployable
`app/workflows/buildOrgSnapshot.ts` Preserves open-agents' workflow code structure. One step that calls api's existing `refreshBaseSnapshot` (inlined via #507) — provision sandbox named ``, run `git clone --depth=1 `, snapshot
`lib/sandbox/kickBuildOrgSnapshotWorkflow.ts` Fire-and-forget `start(buildOrgSnapshotWorkflow, [input])` from `workflow/api`. Failures logged, never surfaced
`lib/sandbox/defaultBaseSnapshotId.ts` `DEFAULT_SANDBOX_BASE_SNAPSHOT_ID` constant, env-overridable. Used as the base image (bun + jq + agent-browser + chromium + code-server) so the workflow only does the clone
`lib/sandbox/createSandboxHandler.ts` Wired in: when `extractOrgRepoName` returns non-null and `findOrgSnapshot` misses, kick the workflow

Plus the `workflow@^4.2.0-beta.72` package and lockfile.

Behavior

POST `/api/sandbox` flow now:

  1. Auth + ownership (unchanged)
  2. `extractOrgRepoName(repoUrl)` (unchanged)
  3. `findOrgSnapshot(orgRepoName)` (unchanged) → hit OR miss
  4. NEW: on a recoupable miss → `kickBuildOrgSnapshotWorkflow({cloneUrl, sandboxName})` fire-and-forget
  5. `connectSandbox` provisions the session sandbox (unchanged — this request gets the full-clone path)
  6. The workflow runs durably in the background; future requests for the same org hit the snapshot

The workflow itself is a faithful port of open-agents' `buildOrgSnapshotWorkflow`. Same one-step structure, same `refreshBaseSnapshot` call (already in api), same failure path (caught + logged, returns `{success: false, error}`).

TDD

Strict red → green. +3 net new tests (suite 2574 → 2577):

  • kicks the workflow on recoupable miss with the right input
  • does NOT kick when an existing snapshot is found (hit case)
  • does NOT kick for non-recoupable repos

Verification

Check Result
`pnpm test` ✅ 2577 / 2577 pass
`pnpm lint:check` ✅ clean
`tsc --noEmit` for new files ✅ clean

Test plan

  • CI green
  • Preview build succeeds (the `withWorkflow` wrapper is a build-time change — first-time workflow integration)
  • Preview: POST /api/sandbox with a recoupable org URL that has no snapshot → response 200 (cold-clone path), workflow run shows up in Vercel dashboard, and a few minutes later `vercel sandbox snapshots list` shows a new `created` snapshot for that org name
  • Preview: POST again with the same URL → `[findOrgSnapshot]` log line shows hit, `readyMs` drops dramatically, no second workflow kick

What follows (Phase 2)

This was Phase 1 (build-org-snapshot — small, self-contained). Phase 2 is the bigger one:

  • `sandboxLifecycleWorkflow` (123 lines, stateful while-loop with timed sleeps, `lifecycle_run_id` lease semantics)
  • `evaluateSandboxLifecycle` and 4-5 dependent helpers
  • Wire `sandbox-created` and `status-check-overdue` kicks

Will land in its own PR. Phase 1 already closes the largest user-visible cutover gap (~75s cold-start regression for new orgs).

🤖 Generated with Claude Code


Summary by cubic

Adds a background Vercel Workflow that builds an org snapshot when a recoupable repo is requested and no snapshot exists, so future sessions start in seconds. Enables the Vercel Workflow runtime and hardens the build step.

  • New Features

    • Added buildOrgSnapshotWorkflow + buildSnapshotStep to provision, shallow-clone, and snapshot org repos via refreshBaseSnapshot; uses DEFAULT_SANDBOX_BASE_SNAPSHOT_ID (env overridable).
    • Updated createSandboxHandler to fire-and-forget kickBuildOrgSnapshotWorkflow on a recoupable snapshot miss; current request still full-clones.
    • Enabled Vercel Workflows via withWorkflow(nextConfig); workflows live in app/workflows/*; added workflow@^4.2.4.
  • Bug Fixes

    • Pointed DEFAULT_SANDBOX_BASE_SNAPSHOT_ID to a Recoup team snapshot snap_RgVtpDO4y1BJHQiUbptMwS3Rt2EQ to stop 404s.
    • Hardened the build: shell-escape the git clone URL to prevent command injection; removed cloneUrl from logs to avoid leaking credentials.

Written for commit 27cce76. Summary will update on new commits.

Summary by CodeRabbit

  • New Features
    • Added organization snapshot building capability to accelerate sandbox initialization
    • Implemented background snapshot warm-up mechanism for faster subsequent sandbox provisioning
    • Enhanced sandbox creation flow with automatic organization snapshot detection and management

Closes the biggest remaining cutover gap: when a recoupable org repo
is requested via POST /api/sandbox and no snapshot exists yet, fire a
durable background workflow that builds one. The current request still
pays the slow full-clone path; what's protected is that *future*
sessions for the same org warm-boot from the new snapshot in seconds.

This is the first Vercel Workflow integration in api. Per the
direction of using Vercel Workflow as the workflow runner going
forward (deprecating Trigger.dev over time), this PR also lands the
infra: the `workflow` package, `withWorkflow(nextConfig)` wrapper, and
the `app/workflows/` convention.

Files:
- `next.config.ts` — wraps the config with `withWorkflow` from
  `workflow/next` so workflow files (with `"use workflow"` /
  `"use step"` directives) are deployable
- `app/workflows/buildOrgSnapshot.ts` — preserves the open-agents
  workflow code structure exactly. One step that calls api's existing
  `refreshBaseSnapshot` (inlined via PR #507) to provision a sandbox
  named `<orgRepoName>`, run `git clone --depth=1 <cloneUrl>`, and
  snapshot the result
- `lib/sandbox/kickBuildOrgSnapshotWorkflow.ts` — fire-and-forget
  invocation via `start(buildOrgSnapshotWorkflow, [input])` from
  `workflow/api`. Failures are logged but never surface in the
  request — the request always falls back to the slow full-clone path
- `lib/sandbox/defaultBaseSnapshotId.ts` — `DEFAULT_SANDBOX_BASE_SNAPSHOT_ID`
  constant, env-overridable. Used by the build workflow as the base
  image (bun + jq + agent-browser + chromium + code-server) so the
  workflow only does the clone, not the runtime install
- `lib/sandbox/createSandboxHandler.ts` — wired in: when
  `extractOrgRepoName` returns non-null and `findOrgSnapshot` misses,
  kick the workflow

TDD red -> green:
- 3 new createSandboxHandler tests:
  - kicks the workflow on recoupable miss with the right input
  - does NOT kick when an existing snapshot is found (hit case)
  - does NOT kick for non-recoupable repos
- All existing tests pass (the mock for kickBuildOrgSnapshotWorkflow
  is no-op so unchanged tests don't see the kick)
- Suite: 2574 -> 2577 (+3). pnpm lint:check + tsc --noEmit clean.

Phase 2 (sandboxLifecycleWorkflow + evaluateSandboxLifecycle) follows
in a separate PR — that one's much bigger and ports the stateful
auto-pause / status-check-overdue logic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown
Contributor

vercel Bot commented May 7, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
api Ready Ready Preview May 7, 2026 9:12pm

Request Review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR introduces a Vercel durable workflow system for building and caching organization repository snapshots. When a sandbox is created for an organization repo without a cached snapshot, the handler asynchronously initiates a background workflow to clone the repo into a base sandbox and save the result, enabling faster provisioning on subsequent accesses.

Changes

Organization Snapshot Build Workflow

Layer / File(s) Summary
Types and Configuration
lib/sandbox/defaultBaseSnapshotId.ts
Introduces DEFAULT_SANDBOX_BASE_SNAPSHOT_ID sourced from environment variable with hardcoded fallback snapshot ID; includes documentation on usage and refresh procedures.
Snapshot Build Step
app/workflows/buildSnapshotStep.ts
Defines BuildOrgSnapshotInput contract and implements buildSnapshotStep to provision a sandbox from the base snapshot, retrieve a service GitHub token, execute git clone --depth=1 of the repository, and return the new snapshot ID.
Workflow Orchestration
app/workflows/buildOrgSnapshotWorkflow.ts
Implements the durable workflow that invokes buildSnapshotStep, logs execution, normalizes exceptions to error strings, and returns typed payloads: { success: true, snapshotId } or { success: false, error }.
Fire-and-Forget Launcher
lib/sandbox/kickBuildOrgSnapshotWorkflow.ts
Provides kickBuildOrgSnapshotWorkflow to start the workflow asynchronously without blocking, logging workflow run IDs on success and suppressing errors to prevent cascading failures.
Sandbox Handler Integration
lib/sandbox/createSandboxHandler.ts
Integrates conditional workflow trigger: when an organization repository is identified but no cached snapshot exists, asynchronously kicks the build workflow before provisioning the current sandbox, enabling next-session warm-boot without blocking the current request.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

A snapshot blooms in the durable night,
Cloning repos with workflow delight,
Fire-and-forget, no need to wait—
Next session boots at a blazing rate! 🚀

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Solid & Clean Code ⚠️ Warning DRY violation: Identical input interfaces defined twice. Error handling prevents workflow auto-retries by catching and returning instead of rethrowing. Consolidate BuildOrgSnapshotInput/KickBuildOrgSnapshotInput to shared definition. Rethrow errors instead of returning success: false for proper Vercel Workflow retry semantics.
✅ Passed checks (2 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/sandbox-build-org-snapshot-workflow

Warning

Review ran into problems

🔥 Problems

Timed out fetching pipeline failures after 30000ms


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 issues found across 8 files

Confidence score: 2/5

  • There is a high-confidence security risk in app/workflows/buildOrgSnapshot.ts: interpolating cloneUrl directly into a shell command can enable command injection, which is a merge-blocking concern unless properly quoted/escaped.
  • lib/sandbox/kickBuildOrgSnapshotWorkflow.ts logs the full cloneUrl, which can expose embedded credentials/tokens and creates meaningful security and compliance risk.
  • lib/sandbox/createSandboxHandler.ts may trigger duplicate snapshot builds for the same org when a prior build is still running, introducing concrete regression risk (extra load, conflicting workflow state).
  • Pay close attention to app/workflows/buildOrgSnapshot.ts, lib/sandbox/kickBuildOrgSnapshotWorkflow.ts, lib/sandbox/createSandboxHandler.ts - command execution safety, secret leakage in logs, and duplicate workflow triggering.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="app/workflows/buildOrgSnapshot.ts">

<violation number="1" location="app/workflows/buildOrgSnapshot.ts:31">
P0: Avoid interpolating `cloneUrl` directly into a shell command; quote/escape it before execution to prevent command injection.</violation>

<violation number="2" location="app/workflows/buildOrgSnapshot.ts:38">
P2: Custom agent: **Module should export a single primary function whose name matches the filename**

Primary export name must match the filename basename; `buildOrgSnapshotWorkflow` does not match `buildOrgSnapshot`.</violation>
</file>

<file name="lib/sandbox/kickBuildOrgSnapshotWorkflow.ts">

<violation number="1" location="lib/sandbox/kickBuildOrgSnapshotWorkflow.ts:27">
P1: Avoid logging the full `cloneUrl`; it can leak embedded repository credentials or tokens.</violation>
</file>

<file name="lib/sandbox/createSandboxHandler.ts">

<violation number="1" location="lib/sandbox/createSandboxHandler.ts:75">
P2: The new snapshot-miss branch can repeatedly start duplicate build workflows for the same org while an earlier build is still in progress.</violation>
</file>
Architecture diagram
sequenceDiagram
    participant Client as External Client
    participant Handler as createSandboxHandler
    participant Extract as extractOrgRepoName
    participant Snap as findOrgSnapshot
    participant Kick as kickBuildOrgSnapshotWorkflow
    participant Workflow as Vercel Workflow Runtime
    participant WFCode as buildOrgSnapshotWorkflow
    participant Step as buildSnapshotStep
    participant Refresh as refreshBaseSnapshot
    participant Tokens as getServiceGithubToken
    participant SandboxAPI as Sandbox Service (Vercel)
    participant FutureClient as Future Client (same org)

    Note over Client,FutureClient: POST /api/sandbox - Recoupable Org Snapshot Miss

    Client->>Handler: POST /api/sandbox
    Handler->>Handler: Auth + ownership (unchanged)

    Handler->>Extract: extractOrgRepoName(repoUrl)
    Extract-->>Handler: orgRepoName or null

    alt Org repo detected
        Handler->>Snap: findOrgSnapshot(orgRepoName)
        
        alt No snapshot exists (miss)
            Snap-->>Handler: null

            Note over Handler,FutureClient: NEW: Fire-and-forget background workflow

            Handler->>Kick: kickBuildOrgSnapshotWorkflow(cloneUrl, sandboxName)
            Kick->>Kick: fire-and-forget (void promise)

            alt Workflow starts successfully
                Kick->>Workflow: start(buildOrgSnapshotWorkflow, [input])
                Workflow-->>Kick: run { runId }
                Kick->>Kick: log success
            else Workflow fails to start
                Workflow-->>Kick: error
                Kick->>Kick: log error (never surface to client)
            end

            par Background workflow execution (outside request lifecycle)
                Workflow->>WFCode: execute buildOrgSnapshotWorkflow(input)
                WFCode->>WFCode: use workflow
                WFCode->>Step: buildSnapshotStep(input)
                Step->>Step: use step

                Step->>Tokens: getServiceGithubToken()
                alt Token available
                    Tokens-->>Step: githubToken
                    Step->>Refresh: refreshBaseSnapshot(...)
                    Refresh->>SandboxAPI: provision sandbox from base snapshot
                    Refresh->>SandboxAPI: execute git clone
                    Refresh->>SandboxAPI: create snapshot
                    SandboxAPI-->>Refresh: snapshotId
                    Refresh-->>Step: result { snapshotId }
                    Step-->>WFCode: snapshotId
                    WFCode-->>Workflow: return { success: true, snapshotId }
                else Token missing
                    Tokens-->>Step: undefined
                    Step->>Step: throw Error
                    Step-->>WFCode: error caught
                    WFCode-->>Workflow: return { success: false, error }
                end
            end

        else Snapshot exists (hit)
            Snap-->>Handler: snapshotId
            Note over Handler: NEW: Do NOT kick workflow
        end
    end

    Note over Handler,FutureClient: Request continues with full-clone path (unchanged)

    Handler->>Handler: connectSandbox (cold-start full clone)

    Handler-->>Client: 200 OK

    Note over FutureClient,FutureClient: Future requests for same org

    FutureClient->>Handler: POST /api/sandbox (same org)
    Handler->>Snap: findOrgSnapshot(orgRepoName)
    Snap-->>Handler: snapshotId (created by workflow)
    Handler->>Handler: warm-boot from snapshot (seconds)

    Note over FutureClient,FutureClient: Dramatically reduced cold-start
Loading

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread app/workflows/buildOrgSnapshot.ts Outdated
Comment thread lib/sandbox/kickBuildOrgSnapshotWorkflow.ts Outdated
Comment thread app/workflows/buildOrgSnapshot.ts Outdated
// the *next* session warm-boots from it. This request still pays the
// full-clone cold-start path — the workflow runs durably outside the
// request lifecycle.
if (orgRepoName && !orgSnapshotId) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: The new snapshot-miss branch can repeatedly start duplicate build workflows for the same org while an earlier build is still in progress.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At lib/sandbox/createSandboxHandler.ts, line 75:

<comment>The new snapshot-miss branch can repeatedly start duplicate build workflows for the same org while an earlier build is still in progress.</comment>

<file context>
@@ -67,6 +68,17 @@ export async function createSandboxHandler(request: NextRequest): Promise<NextRe
+  // the *next* session warm-boots from it. This request still pays the
+  // full-clone cold-start path — the workflow runs durably outside the
+  // request lifecycle.
+  if (orgRepoName && !orgSnapshotId) {
+    kickBuildOrgSnapshotWorkflow({
+      cloneUrl: body.repoUrl,
</file context>

Comment thread app/workflows/buildOrgSnapshot.ts Outdated
sandboxName: string;
}

async function buildSnapshotStep(input: BuildOrgSnapshotInput): Promise<string> {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SRP - new lib file for buildSnapshotStep.ts

Comment thread app/workflows/buildOrgSnapshot.ts Outdated
return result.snapshotId;
}

export async function buildOrgSnapshotWorkflow(input: BuildOrgSnapshotInput) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SRP - new lib file for buildOrgSnapshotWorkflow.ts

The previous hardcoded id (snap_EjsphVxi07bFKrfojljJdIS41KHT) lived in
open-agents' Vercel team and returned 404 from api's project, breaking
the build-org-snapshot workflow on its first step.

Built a recoup-team equivalent snapshot manually via the @vercel/sandbox
SDK by provisioning a clean Amazon Linux 2023 sandbox and running:

  - sudo dnf install -y jq
  - curl -fsSL https://bun.sh/install | sudo BUN_INSTALL=/usr/local bash
  - sudo npm install -g agent-browser
  - curl -fsSL https://code-server.dev/install.sh | sudo sh

Then snapshotted via `vercel sandbox snapshot <id> --stop`. Result:
snap_RgVtpDO4y1BJHQiUbptMwS3Rt2EQ.

Tooling note (in the file comment): chromium is intentionally NOT in
this base. Amazon Linux 2023's default repo doesn't carry it, and
agent-browser fetches a managed Playwright browser on first use, so
having it in the base would just bloat the image.

The header comment now also documents the install commands and
refresh procedure so the next person who needs to add tooling has a
clear path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 issues found across 1 file (changes from recent commits).

Requires human review: Auto-approval blocked by 4 unresolved issues from previous reviews.

@sweetmantech
Copy link
Copy Markdown
Contributor Author

End-to-end workflow loop verified

You spotted that the first run failed in the dashboard. Caught the root cause via runtime logs (`vercel logs --query "build-org-snapshot" --expand`):

```
[build-org-snapshot] Creating sandbox from base snapshot snap_EjsphVxi07bFKrfojljJdIS41KHT.
Error: Status code 404 is not ok
[build-org-snapshot] Failed for 'api': FatalError: Step "buildSnapshotStep" failed after 3 retries
```

That snapshot id lives in open-agents' Vercel team, not recoup's, so the create-from-base call returned 404 and the workflow retried 3× then bubbled.

Fix: built a recoup-team base snapshot

Did this manually via the @vercel/sandbox SDK because we couldn't access open-agents' base directly. Probed the runtime first to determine the OS:

```
NAME="Amazon Linux 2023"
ID="amzn"
PLATFORM_ID="platform:al2023"
which: dnf yum (no apt, no apk)
```

Then provisioned a clean sandbox and ran:

```bash
sudo dnf install -y jq
curl -fsSL https://bun.sh/install | sudo BUN_INSTALL=/usr/local bash
sudo npm install -g agent-browser
curl -fsSL https://code-server.dev/install.sh | sudo sh
```

`vercel sandbox snapshot --stop` → `snap_RgVtpDO4y1BJHQiUbptMwS3Rt2EQ`.

Pushed `60265ee5` to point `defaultBaseSnapshotId.ts` at it. Comment in the file documents the install commands + refresh procedure for the next person who needs to add tooling. (Chromium intentionally NOT in this base — Amazon Linux 2023's default repo doesn't carry it, and `agent-browser` fetches a managed Playwright browser on first use anyway.)

Re-test — full loop succeeded

1) Cold POST → kick → workflow ran end-to-end in ~21s:

```
[findOrgSnapshot] 'api' → miss (0 total snapshots returned)
[build-org-snapshot] Started workflow run wrun_01KR231SC0EHH2D9TPQK98TCWE for 'api'
[build-org-snapshot] workflow:start
[build-org-snapshot] step:start
[build-org-snapshot] Creating sandbox from base snapshot snap_RgVtpDO4y1BJHQiUbptMwS3Rt2EQ
[build-org-snapshot] Running command 1/1: git clone --depth=1 https://github.com/recoupable/api .
[build-org-snapshot] Creating snapshot from prepared sandbox
[build-org-snapshot] Created snapshot snap_g6FiWyWxIA0g1psgHvLqY3MJiPhH
[build-org-snapshot] Built snap_g6FiWyWxIA0g1psgHvLqY3MJiPhH for 'api'
```

`vercel sandbox snapshots list` confirms: `snap_g6FiWyWxIA0g1psgHvLqY3MJiPhH` (697 MB, status `created`).

2) Subsequent POST → hit:

```
[findOrgSnapshot] 'api' → hit snap_g6FiWyWxIA0g1psgHvLqY3MJiPhH (1 total snapshots returned)
```

What's now runtime-proven

  • ✅ `withWorkflow` config wrapper builds and deploys cleanly (first Vercel Workflow integration in api)
  • ✅ Workflow kicks fire from `createSandboxHandler`'s misses
  • ✅ Workflow runs durably outside the request lifecycle (visible in dashboard at `/api/workflows/runs/wrun_...`)
  • ✅ Workflow's step succeeds against the recoup-team base
  • ✅ Resulting snapshot is findable by `findOrgSnapshot` on the next request
  • ✅ Subsequent POSTs hit the workflow-built snapshot

The Phase 1 cutover gap (org snapshot building for new orgs) is closed end-to-end.

Three review fixes bundled:

1) SRP per sweetmantech: split app/workflows/buildOrgSnapshot.ts into
   - app/workflows/buildSnapshotStep.ts (the "use step" function)
   - app/workflows/buildOrgSnapshotWorkflow.ts (the "use workflow"
     function — filename now matches the export, addressing the
     cubic P2 about filename mismatch)

2) P0 command injection (cubic): wrap cloneUrl with shellEscape() in
   the `git clone --depth=1 ...` command. The Zod + parseGitHubRepoUrl
   validators upstream of this workflow already reject anything that
   doesn't match `^https:\/\/github\.com\/recoupable\/...`, so this
   isn't exploitable today — but shell-escaping is defense-in-depth
   that survives validator changes. Zero behavior change for valid
   inputs (extra single quotes around an already-clean URL).

3) P1 credential leak (cubic): dropped `cloneUrl` from every log line
   in the workflow + kick. The `https://user:token@github.com/...`
   shape is also blocked by the regex validator, but log scrubbing
   is the cheaper of the two layers if the validator ever loosens.
   `sandboxName` (the regex-extracted repo name only) remains in logs
   — uniquely identifies the org for observability without
   exposing tokens.

Out of scope: cubic's P2 about duplicate workflow runs (same
fire-and-forget pattern as open-agents — duplicates are wasteful but
not incorrect; can add a Vercel Workflow idempotency key in a
follow-up if observed in practice).

Tests: existing handler tests still pass unchanged (the kick is mocked
at the lib path, which didn't change). Suite remains 2577 / 2577.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sweetmantech
Copy link
Copy Markdown
Contributor Author

Pushed `27cce760` addressing 3 of the 4 actionable comments.

Comment Resolution
SRP: new lib for buildSnapshotStep Extracted to `app/workflows/buildSnapshotStep.ts`
SRP: new lib for buildOrgSnapshotWorkflow Extracted to `app/workflows/buildOrgSnapshotWorkflow.ts` (filename now matches the export, also resolves cubic's filename-mismatch P2)
P0 command injection (cubic) `shellEscape(input.cloneUrl)` around the `git clone` arg. Zod + `parseGitHubRepoUrl` already reject shell-metachar URLs upstream, but defense-in-depth survives validator changes
P1 credential leak in logs (cubic) Dropped `cloneUrl` from every log line in the workflow + kick. The validator already strips `user:token@` from URLs, but log scrubbing is the cheaper of the two layers. `sandboxName` remains in logs (regex-extracted repo name, no tokens)

Skipped cubic's P2 about duplicate workflow runs: same fire-and-forget pattern as open-agents — duplicates waste compute but aren't incorrect. Worth a Vercel Workflow idempotency key in a follow-up if it's observed in practice.

Tests + lint + tsc all green. Suite stays at 2577 (no test changes — the existing handler test mocks the kick at `@/lib/sandbox/kickBuildOrgSnapshotWorkflow`, which didn't change path).

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

Caution

Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted.

Error details
{}

@sweetmantech
Copy link
Copy Markdown
Contributor Author

Smoke test re-run after `27cce760` (with new org URL)

Used the URL you suggested: `https://github.com/recoupable/org-myco-wtf-80263819-9dfd-4bbf-9371-60a6185122d6\`. Full miss → workflow → hit loop visible in the logs in correct chronological order:

```
16:13:26 POST → [findOrgSnapshot] 'org-myco-wtf-...' → miss (0 total snapshots returned)
[build-org-snapshot] Started workflow run wrun_01KR24KKHN3TMCGVRK914FCX99 for 'org-myco-wtf-...'

16:13:27 [step:start] name='org-myco-wtf-...'
Creating sandbox from base snapshot snap_RgVtpDO4y1BJHQiUbptMwS3Rt2EQ
Running command 1/1: git clone --depth=1 'https://github.com/recoupable/org-myco-wtf-...' .
Created snapshot snap_5w67z2k52zns9aocc7JjuLmItm46

16:13:48 Built snap_5w67z2k52zns9aocc7JjuLmItm46 for 'org-myco-wtf-...'
[workflow run total ~22s]

16:16:20 POST → [findOrgSnapshot] 'org-myco-wtf-...' → hit snap_5w67z2k52zns9aocc7JjuLmItm46 (1 total snapshots returned)
```

Both review fixes from `27cce760` are visible in the runtime evidence

  • shellEscape (P0 fix): The git clone log line shows 'https://github.com/recoupable/org-myco-wtf-...' — single quotes wrap the URL, which is shellEscape doing its job.
  • Drop cloneUrl from logs (P1 fix): The kick line, workflow:start, step:start, "Built …", and "Started workflow run …" lines all show only the `sandboxName`, no URL. (The "Running command" log still shows the URL, but that comes from refreshBaseSnapshot's underlying log callback echoing the actual command being executed — it's the command, not derivative logging. Scrubbing that would mean changing the abstraction's log surface, which is outside this PR.)

Negative paths

Test Got
no auth 401
missing repoUrl 400
non-github URL 400

Honest observation about readyMs

Path readyMs
Miss + cold-clone 5326ms
Hit + restore-from-snapshot 11474ms

The hit case was ~6s slower than the cold path. Restoring a 727MB snapshot has overhead that exceeds the time saved on a tiny repo's git clone --depth=1. The "~75s warm-boot saves" claim from open-agents only materializes when the snapshot contains expensive state (installed node_modules, build artifacts) — my workflow currently only does git clone --depth=1, so the snapshot doesn't carry cached work worth restoring for small repos.

The infrastructure works correctly end-to-end. The performance win will appear once the workflow is extended to include a post-clone install step (e.g., bun install). That's a follow-up — meaningful but separable from this PR's scope of "the workflow runs".

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
app/workflows/buildSnapshotStep.ts (1)

28-49: ⚡ Quick win

Trim buildSnapshotStep into smaller helpers for readability.

It’s just over the function-length threshold; extracting token resolution and refresh payload construction would keep the step tighter and easier to scan.

As per coding guidelines, "Flag functions longer than 20 lines or classes with >200 lines" and "Keep functions small and focused".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/workflows/buildSnapshotStep.ts` around lines 28 - 49, The
buildSnapshotStep function is too long; extract the token resolution and the
refreshBaseSnapshot payload construction into small helpers: create a
getGithubTokenOrThrow() that encapsulates getServiceGithubToken() and throws the
existing error if not present, and create a buildRefreshPayload(input) (or
buildRefreshOptions) that returns the object passed into refreshBaseSnapshot
(baseSnapshotId, sandboxName, sandboxTimeoutMs, commandTimeoutMs, githubToken,
commands, log) using shellEscape(input.cloneUrl) for the clone command; then
simplify buildSnapshotStep to call those two helpers and await
refreshBaseSnapshot(payload). Ensure helper names (getGithubTokenOrThrow,
buildRefreshPayload) match references in buildSnapshotStep.
lib/sandbox/createSandboxHandler.ts (1)

33-157: 🏗️ Heavy lift

Split createSandboxHandler into smaller focused units.

This handler is carrying too many responsibilities in one function. Please extract at least auth/session resolution, snapshot warmup decision, and provisioning/update paths into helpers.

As per coding guidelines, "Flag functions longer than 20 lines or classes with >200 lines", "Keep functions small and focused", and for lib/**/*.ts: "Keep functions under 50 lines" and "Apply Single Responsibility Principle (SRP): one exported function per file; each file should do one thing well".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/sandbox/createSandboxHandler.ts` around lines 33 - 157, The
createSandboxHandler is too large and does multiple responsibilities; split it
into focused helpers: extractSessionAndAuth(request) which wraps
validateCreateSandboxBody and selectSessions + permission checks (use symbols
validateCreateSandboxBody, selectSessions, sessionRow),
determineOrgSnapshot(repoUrl) which encapsulates extractOrgRepoName,
findOrgSnapshot and kickBuildOrgSnapshotWorkflow (use orgRepoName,
findOrgSnapshot, kickBuildOrgSnapshotWorkflow), and
provisionAndUpdateSandbox(params) which handles connectSandbox, updateSession
and installSessionGlobalSkills (use connectSandbox, updateSession,
installSessionGlobalSkills) and returns the final response payload; then have
createSandboxHandler orchestrate these smaller functions — keep each helper
under ~50 lines and export only createSandboxHandler per file to follow SRP.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/workflows/buildOrgSnapshotWorkflow.ts`:
- Around line 20-28: The current try/catch in buildOrgSnapshotWorkflow swallows
errors and returns { success: false } which prevents Vercel Workflow from
marking the run as failed; instead, after logging the error (preserve the
existing console.error call that formats message for input.sandboxName), rethrow
the original error (or throw a new Error with the message) so the workflow run
fails and can be retried; update the catch block around buildSnapshotStep so it
no longer returns { success: false } but logs then throws the error (or remove
the catch entirely to let the error bubble).

In `@lib/sandbox/createSandboxHandler.ts`:
- Around line 75-80: The current call to kickBuildOrgSnapshotWorkflow can start
duplicate workflows when parallel requests race; update
kickBuildOrgSnapshotWorkflow (and its internal call to start) to pass a
deterministic runId based on the org identifier (e.g., use `orgRepoName`) so
identical requests reuse the same workflow run; specifically, modify the
start(...) invocation inside kickBuildOrgSnapshotWorkflow to include an options
object with runId set to a stable string like `org-snapshot-${orgRepoName}`
derived from the same org identifier used in the if block, ensuring idempotent
workflow starts.

---

Nitpick comments:
In `@app/workflows/buildSnapshotStep.ts`:
- Around line 28-49: The buildSnapshotStep function is too long; extract the
token resolution and the refreshBaseSnapshot payload construction into small
helpers: create a getGithubTokenOrThrow() that encapsulates
getServiceGithubToken() and throws the existing error if not present, and create
a buildRefreshPayload(input) (or buildRefreshOptions) that returns the object
passed into refreshBaseSnapshot (baseSnapshotId, sandboxName, sandboxTimeoutMs,
commandTimeoutMs, githubToken, commands, log) using shellEscape(input.cloneUrl)
for the clone command; then simplify buildSnapshotStep to call those two helpers
and await refreshBaseSnapshot(payload). Ensure helper names
(getGithubTokenOrThrow, buildRefreshPayload) match references in
buildSnapshotStep.

In `@lib/sandbox/createSandboxHandler.ts`:
- Around line 33-157: The createSandboxHandler is too large and does multiple
responsibilities; split it into focused helpers: extractSessionAndAuth(request)
which wraps validateCreateSandboxBody and selectSessions + permission checks
(use symbols validateCreateSandboxBody, selectSessions, sessionRow),
determineOrgSnapshot(repoUrl) which encapsulates extractOrgRepoName,
findOrgSnapshot and kickBuildOrgSnapshotWorkflow (use orgRepoName,
findOrgSnapshot, kickBuildOrgSnapshotWorkflow), and
provisionAndUpdateSandbox(params) which handles connectSandbox, updateSession
and installSessionGlobalSkills (use connectSandbox, updateSession,
installSessionGlobalSkills) and returns the final response payload; then have
createSandboxHandler orchestrate these smaller functions — keep each helper
under ~50 lines and export only createSandboxHandler per file to follow SRP.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cd168a8d-c3d5-422f-a893-de6e8819b05d

📥 Commits

Reviewing files that changed from the base of the PR and between cdc9e5f and 27cce76.

⛔ Files ignored due to path filters (4)
  • lib/sandbox/__tests__/createSandboxHandler.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • next.config.ts is excluded by none and included by none
  • package.json is excluded by none and included by none
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml and included by none
📒 Files selected for processing (5)
  • app/workflows/buildOrgSnapshotWorkflow.ts
  • app/workflows/buildSnapshotStep.ts
  • lib/sandbox/createSandboxHandler.ts
  • lib/sandbox/defaultBaseSnapshotId.ts
  • lib/sandbox/kickBuildOrgSnapshotWorkflow.ts

Comment on lines +20 to +28
try {
const snapshotId = await buildSnapshotStep(input);
console.log(`[build-org-snapshot] Built ${snapshotId} for '${input.sandboxName}'`);
return { success: true as const, snapshotId };
} catch (error) {
const message = error instanceof Error ? error.message : String(error);
console.error(`[build-org-snapshot] Failed for '${input.sandboxName}':`, message);
return { success: false as const, error: message };
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In Vercel Workflow (workflow@4.2.0-beta.72), if a workflow catches an exception and returns a value instead of throwing, is the run considered failed and eligible for automatic retries?

💡 Result:

No. In Vercel Workflow, if you catch an exception and return a value instead of throwing a failure (i.e., you don’t let the step/workflow end in an error), the run/step is considered successful rather than failed, so it is not eligible for the automatic retry mechanism. What counts as retryable is tied to failures being surfaced as errors: by default, “errors thrown inside steps are retried” [1], and you can explicitly control retrying by throwing RetryableError (or stop retrying by throwing FatalError) [1][2]. If you catch the exception and simply return a value, you’re preventing the thrown error from occurring, so there’s no failure state to trigger retries. Notes: - If a step throws until it exhausts retries, the run is marked failed (and you can inspect the run status as “failed”) [3]. - Separately, retries may also occur due to certain infrastructure/runtime issues, but those still originate from an error path rather than a caught-and-returned success value [4].

Citations:


Rethrow errors to preserve run failure state and enable automatic retries.

Catching errors and returning { success: false } prevents Vercel Workflow from marking the run as failed, which disables automatic retries and obscures true failure conditions. Log the error, then rethrow it to surface the failure properly.

💡 Proposed fix
 catch (error) {
   const message = error instanceof Error ? error.message : String(error);
   console.error(`[build-org-snapshot] Failed for '${input.sandboxName}':`, message);
-  return { success: false as const, error: message };
+  throw error instanceof Error ? error : new Error(message);
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/workflows/buildOrgSnapshotWorkflow.ts` around lines 20 - 28, The current
try/catch in buildOrgSnapshotWorkflow swallows errors and returns { success:
false } which prevents Vercel Workflow from marking the run as failed; instead,
after logging the error (preserve the existing console.error call that formats
message for input.sandboxName), rethrow the original error (or throw a new Error
with the message) so the workflow run fails and can be retried; update the catch
block around buildSnapshotStep so it no longer returns { success: false } but
logs then throws the error (or remove the catch entirely to let the error
bubble).

Comment on lines +75 to +80
if (orgRepoName && !orgSnapshotId) {
kickBuildOrgSnapshotWorkflow({
cloneUrl: body.repoUrl,
sandboxName: orgRepoName,
});
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🌐 Web query:

In workflow@4.2.0-beta.72, what is the recommended way to deduplicate start(...) calls for the same logical job (idempotency key / unique run key / concurrency controls)?

💡 Result:

In workflow@4.2.0-beta.72, the recommended way to deduplicate start(...) calls for the same logical job (idempotency key / unique run key) is to provide a custom runId in the StartOptions object. If a run with that runId already exists, start() returns the existing run instead of creating a duplicate.[1][2] Example: const run = await start(myWorkflow, args, { runId: `order-${orderId}` }); This ensures idempotency: repeated calls with the same runId return the same run object.[1][2] For concurrency controls, there is an ongoing discussion about a native lock primitive (distributed semaphore) to limit concurrent executions for the same key inside workflows, but it appears not yet implemented in beta.72 (no mention in release notes).[3][4] Use application-level checks (e.g., check run status before starting) or the custom runId for deduplication as primary mechanism.[5][6]

Citations:


Use deterministic runId to prevent duplicate org snapshot workflow kicks.

When parallel requests miss the snapshot check simultaneously, both can trigger identical builds. Implement idempotency by passing a deterministic runId based on the org identifier to the start() call in kickBuildOrgSnapshotWorkflow. This ensures that concurrent identical requests reuse the same workflow run instead of spawning duplicates:

const run = await start(myWorkflow, args, { 
  runId: `org-snapshot-${orgRepoName}` 
});

This prevents redundant clones, snapshot operations, and avoidable cost/rate pressure from concurrent requests.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/sandbox/createSandboxHandler.ts` around lines 75 - 80, The current call
to kickBuildOrgSnapshotWorkflow can start duplicate workflows when parallel
requests race; update kickBuildOrgSnapshotWorkflow (and its internal call to
start) to pass a deterministic runId based on the org identifier (e.g., use
`orgRepoName`) so identical requests reuse the same workflow run; specifically,
modify the start(...) invocation inside kickBuildOrgSnapshotWorkflow to include
an options object with runId set to a stable string like
`org-snapshot-${orgRepoName}` derived from the same org identifier used in the
if block, ensuring idempotent workflow starts.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 4 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="app/workflows/buildOrgSnapshotWorkflow.ts">

<violation number="1" location="app/workflows/buildOrgSnapshotWorkflow.ts:27">
P2: Do not swallow workflow errors; rethrow after logging so failed runs are recorded correctly.</violation>
</file>

Tip: Review your code locally with the cubic CLI to iterate faster.

} catch (error) {
const message = error instanceof Error ? error.message : String(error);
console.error(`[build-org-snapshot] Failed for '${input.sandboxName}':`, message);
return { success: false as const, error: message };
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Do not swallow workflow errors; rethrow after logging so failed runs are recorded correctly.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At app/workflows/buildOrgSnapshotWorkflow.ts, line 27:

<comment>Do not swallow workflow errors; rethrow after logging so failed runs are recorded correctly.</comment>

<file context>
@@ -0,0 +1,29 @@
+  } catch (error) {
+    const message = error instanceof Error ? error.message : String(error);
+    console.error(`[build-org-snapshot] Failed for '${input.sandboxName}':`, message);
+    return { success: false as const, error: message };
+  }
+}
</file context>

Tip: Review your code locally with the cubic CLI to iterate faster.

@sweetmantech sweetmantech merged commit 05f321d into test May 7, 2026
6 checks passed
@sweetmantech sweetmantech deleted the feat/sandbox-build-org-snapshot-workflow branch May 7, 2026 21:23
sweetmantech added a commit that referenced this pull request May 7, 2026
* feat(sandbox): port sandbox lifecycle workflow + kicks (Phase 2)

Closes the auto-pause and lifecycle-self-heal gaps from the cutover
analysis. Sequel to PR #531 (build-org-snapshot workflow). The Vercel
Workflow runtime + withWorkflow already shipped — this PR adds the
second workflow on top.

What it does:
- POST /api/sandbox now kicks the sandbox-lifecycle workflow with
  reason="sandbox-created" after provisioning + skill install. The
  workflow sleeps until `hibernate_after` / `sandbox_expires_at`
  (whichever is sooner), then evaluates: pause if idle, retry on
  next iteration if a chat stream is active or the user extended
  the timing.
- GET /api/sandbox/status now kicks with reason="status-check-overdue"
  when the session row says lifecycle_state="active" but the
  workflow's due-time has already passed (its lease has died). The
  kick reclaims the stale lease and starts a fresh workflow.

Components ported faithfully from open-agents:

Sandbox state predicates (one fn per file per api convention):
- getPersistentSandboxName, getResumableSandboxName,
  hasResumableSandboxState, hasPausedSandboxState
- canOperateOnSandbox, clearSandboxState

Lifecycle types + config:
- sandboxLifecycleTypes.ts: SandboxLifecycleState/Reason types
- sandboxLifecycleConfig.ts: SANDBOX_INACTIVITY_TIMEOUT_MS (5min),
  SANDBOX_EXPIRES_BUFFER_MS (10s), SANDBOX_LIFECYCLE_MIN_SLEEP_MS
  (5s), SANDBOX_LIFECYCLE_STALE_RUN_GRACE_MS (2min)

Lifecycle update builders + due-at:
- buildLifecycleActivityUpdate, buildActiveLifecycleUpdate,
  buildHibernatedLifecycleUpdate
- getLifecycleDueAtMs (the wake-time calculator)
- getNextLifecycleVersion, getSandboxExpiresAtDate

Active-stream guard (don't pause mid-chat):
- hasActiveStreamForSession + Supabase helper selectChatsBySession

Concurrency:
- claimSessionLifecycleRunId — atomic UPDATE that fails when the
  expected current value doesn't match. Prevents two kicks from
  starting duplicate workflows.

The evaluator (the heart):
- evaluateSandboxLifecycle (~110 lines) — single-pass FSM that
  decides skip / hibernate / fail. Mirrors open-agents' evaluator
  with the same skip reasons and self-heal pattern.

The workflow:
- app/workflows/sandboxLifecycleWorkflow.ts — `while(true)` loop with
  `sleep(new Date(wakeAtMs))` between iterations, lease-claim
  pattern, and step boundaries via "use step". Loops on
  not-due-yet / active-workflow defers, terminates on
  hibernated / failed / sandbox-gone.

The kick:
- kickSandboxLifecycleWorkflow.ts — fire-and-forget invocation with
  stale-run detection, lease claiming, and graceful handling when
  start() fails (clears the lease so retry can succeed).

TDD red -> green:
- 2 new createSandboxHandler tests asserting the kick fires with the
  right reason on session-bound provisions, and is skipped without
  sessionId
- All existing handler tests continue to pass with the new mocks
- Suite: 2577 -> 2579 (+2 new tests). pnpm lint:check + tsc clean
  for new files.

Out of scope (deliberately deferred):
- Tests for the workflow itself + evaluator. The workflow primitives
  (`"use workflow"`, `"use step"`, `sleep(date)`) require a Vercel
  Workflow test harness we haven't introduced yet. The handler
  tests + open-agents' parity give enough confidence; if the
  workflow misbehaves at runtime, the smoke test will catch it.
- Reason-specific telemetry beyond what evaluateSandboxLifecycle
  already logs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sandbox): adopt open-agents scheduleBackgroundWork pattern for lifecycle kick

Smoke test caught: the lifecycle kick fired but its async chain
(selectSessions → claim lease → start workflow) never completed
because the serverless function tore down on response. No
[kickSandboxLifecycleWorkflow] log lines, no workflow run started.

Original PR placed `after()` from next/server inside the kick lib —
which violated open-agents' architecture: the kick is a generic
fan-out lib, the platform-specific scheduler belongs at the call
site.

Refactored to match open-agents' open / api conventions:
- `kickSandboxLifecycleWorkflow` accepts an optional
  `scheduleBackgroundWork: (task: Promise<void>) => void` parameter.
  Default behavior is `void task` (matches open-agents fallback),
  scheduler used when provided.
- createSandboxHandler + getSandboxStatusHandler pass
  `scheduleBackgroundWork: task => after(() => task)`. Same pattern
  api already uses in `lib/agents/createPlatformRoutes.ts:62` and
  `app/api/chat/slack/route.ts:16` for waitUntil-style background
  work.

Test updated to use `expect.objectContaining` since the kick now
receives an extra `scheduleBackgroundWork` callback. Suite stays at
2579.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(sandbox): KISS deletions + SRP extractions per review feedback

Addresses 17 review comments on PR #533. All KISS deletions and SRP
extractions in one pass — no behavior change, just file/function
reorganization to match api conventions (one exported function per
file, filename matches the export, no thin wrappers).

KISS deletions (3 files removed):
- canOperateOnSandbox.ts → callers use hasRuntimeSandboxState directly
  (was a one-line wrapper around it)
- getNextLifecycleVersion.ts → was unused; existing inline calc kept
- hasResumableSandboxState.ts → callers use
  `getResumableSandboxName(state) !== null` (one-line wrapper)

KISS rename:
- selectChatsBySession.ts → selectChats.ts with a generic
  `{ id?, sessionId? }` filter parameter, mirroring selectSessions

SRP split — Supabase queries:
- claimSessionLifecycleRunId.ts split into:
  - claimSessionLifecycleRunIdIfNull.ts (initial claim)
  - claimSessionLifecycleRunIdIfMatch.ts (lease refresh)
  - claimSessionLifecycleRunId.ts (combiner — picks the right one)

SRP extractions — kick logic:
- runKick.ts (the async chain)
- reclaimStaleLease.ts
- shouldStartLifecycle.ts
- isLifecycleRunStale.ts
- createLifecycleRunId.ts
- kickSandboxLifecycleWorkflow.ts now thin: just runKick + scheduler

SRP extractions — due-at math:
- computeInactivityDueAtMs.ts
- computeExpiryDueAtMs.ts
- getLifecycleDueAtMs.ts now thin: composes the two

SRP extractions — evaluator helpers:
- restoreActiveLifecycleState.ts
- wasLifecycleTimingExtended.ts

SRP extractions — workflow steps (each preserves "use step"):
- computeLifecycleWakeDecision.ts
- runLifecycleEvaluation.ts
- clearLifecycleRunIdIfOwned.ts
- sandboxLifecycleWorkflow.ts now thin: just composes the steps with
  the while-sleep-evaluate loop

Net delta: 16 new files, 4 deletions, no behavior change.
Tests: 2579 / 2579 still pass. pnpm lint:check + tsc clean for
new/changed files (pre-existing TS errors in processCreateSandbox /
updateSnapshotPatchHandler unrelated).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sandbox): actually delegate getLifecycleDueAtMs to the extracted helpers

In the previous SRP refactor I created computeInactivityDueAtMs.ts and
computeExpiryDueAtMs.ts but the Write to update getLifecycleDueAtMs.ts
errored on "File has not been read yet" and I missed it in the batch
output. The two helper files were left as orphans while the original
file kept its inline copies — visible by the SANDBOX_*_MS imports +
both functions still inlined.

Now properly imports computeInactivityDueAtMs / computeExpiryDueAtMs
and the parent file just composes them. No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sandbox): KISS lifecycle claim — combiner under lib/sessions, IfMatch via updateSession

Two review fixes:

1) Move `claimSessionLifecycleRunId.ts` from `lib/supabase/sessions/`
   to `lib/sessions/`. The combiner doesn't directly query Supabase —
   it composes two underlying helpers, so per api convention it
   belongs alongside other domain composers, not under the Supabase
   namespace.

2) KISS: delete `claimSessionLifecycleRunIdIfMatch.ts` and use the
   existing `updateSession` helper for the lease-refresh path. The
   refresh writes the same `lifecycle_run_id` value back, so an
   unconditional `updateSession({ lifecycle_run_id: runId })` does the
   work — accepts a small race where a concurrent stale-reclaim could
   be overwritten, but the kick path (which DOES use the atomic
   `claimSessionLifecycleRunIdIfNull`) remains the primary
   concurrency guard.

Two callers (runKick, computeLifecycleWakeDecision) updated to import
from the new path. No behavior change at the workflow level.

Tests: 2579 / 2579 still pass. lint + tsc clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sandbox): KISS lifecycle claim — drop IfNull helper, use updateSession

Per latest review: the `IfNull` Supabase helper is also just an
`updateSession` with a conditional WHERE — KISS it the same way the
`IfMatch` helper was simplified.

Changes:

- Delete `lib/supabase/sessions/claimSessionLifecycleRunIdIfNull.ts`.
- Delete `lib/sessions/claimSessionLifecycleRunId.ts` (combiner — no
  longer needed once both branches collapse to plain `updateSession`).
- `runKick`: write the new lease via
  `updateSession(id, { lifecycle_run_id: runId })`. The atomic
  `WHERE lifecycle_run_id IS NULL` guard is gone; `shouldStartLifecycle`
  already filters out rows that already hold a lease — best-effort
  concurrency, accepting the small race window.
- `computeLifecycleWakeDecision`: replace the IfMatch lease re-claim
  with a SELECT-based ownership check using the row already read in
  the same step. If `lifecycle_run_id` was overwritten by a concurrent
  kick, return `run-replaced` and bail.

Tests 2579 / 2579 still pass. Lint clean. The pre-existing tsc errors
in unrelated test files (orgId on ApiKeyDetails, SlackApiResponse
messages) are not introduced here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant