refactor(sandbox): callers use open-agents abstraction (Phase 2.2) #509

Merged
sweetmantech merged 2 commits into test from
feat/sandbox-callers-use-abstraction
May 4, 2026

Conversation

@sweetmantech sweetmantech commented May 4, 2026

Summary

Phase 2.2 of the open-agents migration. Replaces api's direct `@vercel/sandbox` SDK calls with the open-agents sandbox abstraction (inlined in PR #507) for the sandbox lifecycle (create + reconnect).

HTTP response shapes preserved exactly — `{sandboxId, sandboxStatus, timeout, createdAt}` unchanged, same enum values for `sandboxStatus`. Only the internals differ.

Scope (Option B — hybrid)

Per the agreed plan, only lifecycle creators get refactored. `installClaudeCode` / `runClaudeCode` / `getSandboxStatus` stay on the SDK directly because the abstraction doesn't cover their needs (sudo, stdout/stderr streaming, simple status reads). The first two are also confirmed dead orphans (defined but never called anywhere in api production code) and will be deleted entirely after the full migration.

Refactored files

| File | Change |
| --- | --- |
| `lib/sandbox/createSandbox.ts` | `Sandbox.create(...)` → `VercelSandbox.create(...)`. Input switched to `VercelSandboxConfig`. Snapshot trigger: `restoreSnapshotId` field (was `source: { type: "snapshot", ... }`). Returns `VercelSandbox`. |
| `lib/sandbox/createSandboxWithFallback.ts` | Passes `restoreSnapshotId` to `createSandbox` |
| `lib/sandbox/createSandboxFromSnapshot.ts` | Type cascade (`Sandbox` → `VercelSandbox`) |
| `lib/sandbox/getActiveSandbox.ts` | `Sandbox.get({name})` → `VercelSandbox.connect(name, {})`. Status check: `sandbox.status` → `sandbox.sdkStatus` |
| `lib/sandbox/getOrCreateSandbox.ts` | No code change; the type cascades automatically through `CreateSandboxFromSnapshotResult` |
| `lib/sandbox/processCreateSandbox.ts` | Reads `sandbox.sdkStatus`, defensive nullish on `createdAt` |
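
As a rough illustration of the `createSandbox.ts` input change, the sketch below maps an SDK-style `source: { type: "snapshot" }` argument onto a flat `restoreSnapshotId` field. Both interfaces are simplified stand-ins written for this example, not the real `@vercel/sandbox` or `VercelSandboxConfig` types, and `snapshotId` is an assumed field name (the description elides the rest of `source`).

```typescript
// Simplified stand-ins for illustration only; the real SDK params and
// VercelSandboxConfig types carry more fields than shown here.
interface LegacyCreateParams {
  // `snapshotId` is an assumed name; the PR text elides the rest of `source`.
  source?: { type: "snapshot"; snapshotId: string };
  timeout?: number;
}

interface SketchSandboxConfig {
  restoreSnapshotId?: string;
  timeout?: number;
}

// Translate the old nested snapshot trigger into the new flat field.
function toSketchConfig(params: LegacyCreateParams): SketchSandboxConfig {
  const config: SketchSandboxConfig = {};
  if (params.timeout !== undefined) {
    config.timeout = params.timeout;
  }
  if (params.source?.type === "snapshot") {
    config.restoreSnapshotId = params.source.snapshotId;
  }
  return config;
}
```

A caller with no snapshot simply omits `source`, and the resulting config has no `restoreSnapshotId` key at all.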

Abstraction extension (additive only)

Added two readonly getters to `vercel/sandbox/VercelSandbox.ts`, following the same pattern as the existing `host` / `environmentDetails` / `expiresAt` / `timeout` getters:

  • `get sdkStatus(): string` — raw SDK session status (`running` / `pending` / `stopped` / `failed` / `aborted` / `snapshotting`). Distinct from the abstraction's normalized `status` getter; exposed for callers that need to surface the exact SDK lifecycle state in HTTP responses or status polling.
  • `get createdAt(): Date | undefined` — SDK `session.createdAt`

These exist so api callers can construct the existing HTTP response shape (which uses SDK enum strings) without breaking the abstraction's typed interface.
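
A minimal sketch of the getter pattern described above, assuming a session object that carries `status` and `createdAt`. `SketchSession`, `SketchSandbox`, and `toStatusFields` are hypothetical names for this example; the real `VercelSandbox` class is much larger.

```typescript
// Hypothetical minimal session shape; the real SDK session has more fields.
interface SketchSession {
  status: string;
  createdAt?: Date;
}

class SketchSandbox {
  constructor(private readonly session: SketchSession) {}

  // Raw SDK lifecycle string, passed through without normalization.
  get sdkStatus(): string {
    return this.session.status;
  }

  // Creation time as reported by the session, if present.
  get createdAt(): Date | undefined {
    return this.session.createdAt;
  }
}

// Build the HTTP-facing status fields from the two getters.
function toStatusFields(sandbox: SketchSandbox) {
  return {
    sandboxStatus: sandbox.sdkStatus,
    createdAt: sandbox.createdAt?.toISOString(),
  };
}
```

Keeping `sdkStatus` separate from a normalized `status` lets the HTTP layer keep emitting the SDK's own strings while other callers use the abstraction's vocabulary.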

Tests updated (4 files)

| Test file | Change |
| --- | --- |
| `createSandbox.test.ts` | Mocks `VercelSandbox.create` instead of `Sandbox.create`; mock objects use `sdkStatus` instead of `status` |
| `createSandboxWithFallback.test.ts` | Asserts `restoreSnapshotId` pass-through |
| `getActiveSandbox.test.ts` | Mocks `VercelSandbox.connect`; `sdkStatus` on mock objects |
| `processCreateSandbox.test.ts` | `mockSandbox` uses `sdkStatus` |

Verification

| Check | Result |
| --- | --- |
| `pnpm lint:check` | ✅ clean |
| `pnpm test` | 2391/2391 pass |
| HTTP response shape | ✅ unchanged: same fields, same enum values for `sandboxStatus` (sourced via `sdkStatus` now, previously read directly from `Sandbox.status`; identical strings either way) |

What this unlocks

PR 2.3 — delete the now-dead-orphan files (`installClaudeCode.ts`, `runClaudeCode.ts`, and any other helpers nothing imports after this lands). Plus the inline of open-agents `packages/agent`, which depends on this abstraction being live in the call paths.

Risk

Bounded. The HTTP response shape stays exactly identical, so any external consumer (chat in prod) sees no change. Internal callers all update in lock-step. The only behavior change is that creates/reconnects now go through `VercelSandbox`'s lifecycle hooks (proactive timeout, network policy, etc.) instead of the bare SDK — which is the explicit point of the migration.

🤖 Generated with Claude Code


Summary by cubic

Refactors sandbox create/reconnect to use the open-agents VercelSandbox abstraction and tightens snapshot/createdAt handling. HTTP responses stay the same (sandboxId, sandboxStatus, timeout, createdAt).

  • Refactors

    • createSandbox calls VercelSandbox.create(...) with VercelSandboxConfig; always passes defaults (vcpus, runtime, timeout) and supports restoreSnapshotId.
    • getActiveSandbox reconnects via VercelSandbox.connect(...); checks sdkStatus === "running".
    • Response shape unchanged; sandboxStatus sourced from sdkStatus; createdAt comes from the abstraction (no fallback).
    • Adds sdkStatus and non-optional createdAt getters on VercelSandbox; both refresh session before read.
    • Scope is hybrid: only lifecycle creators moved; installClaudeCode, runClaudeCode, and getSandboxStatus remain on @vercel/sandbox.
  • Bug Fixes

    • Removed fabricated createdAt fallbacks in API; getter now throws if missing (SDK guarantees it).
    • Dropped misleading snapshot branching; snapshot restores now pass vcpus/runtime with restoreSnapshotId consistently.

Written for commit 7e614c0. Summary will update on new commits.

Summary by CodeRabbit

  • Refactor
    • Reworked sandbox backend to a unified abstraction for more consistent behavior.
    • Improved sandbox lifecycle/status reporting and created-at metadata for more reliable reconnection and monitoring.
  • Chores
    • Simplified sandbox creation paths and defaults to reduce unexpected variations.

Replaces direct @vercel/sandbox SDK calls with the open-agents sandbox
abstraction layer (inlined in Phase 2.1) for sandbox lifecycle (create
+ reconnect). HTTP response shapes preserved exactly.

Per the agreed Option B (hybrid): only the lifecycle creator helpers
get refactored. installClaudeCode / runClaudeCode / getSandboxStatus
stay on the SDK directly because the abstraction does not cover their
needs (sudo, stdout/stderr streaming, simple status reads). Those
two install/run files are also dead orphans (defined but never called)
and will be removed entirely after the full migration.

Production refactor:
  createSandbox.ts            Sandbox.create(...) -> VercelSandbox.create(...)
                              Input: VercelSandboxConfig (was SDK params)
                              Snapshot trigger: restoreSnapshotId field
                                (was source: { type: "snapshot", ... })
                              Returns VercelSandbox (was SDK Sandbox)
  createSandboxWithFallback.ts cascade — passes restoreSnapshotId to createSandbox
  createSandboxFromSnapshot.ts type cascade only (Sandbox -> VercelSandbox)
  getActiveSandbox.ts         Sandbox.get({name}) -> VercelSandbox.connect(name, {})
                              Status check: sandbox.status -> sandbox.sdkStatus
  getOrCreateSandbox.ts       no code change — type cascades automatically
  processCreateSandbox.ts     reads sandbox.sdkStatus instead of sandbox.status
                              defensive nullish on createdAt

Abstraction extension:
  vercel/sandbox/VercelSandbox.ts adds two readonly getters following
  the existing host/environmentDetails/expiresAt pattern:
    get sdkStatus(): string  — raw SDK session status (running/pending/
                                stopped/failed/aborted/snapshotting),
                                distinct from the abstraction's normalized
                                status getter
    get createdAt(): Date | undefined  — SDK session.createdAt

  These give api callers what they need to construct the existing
  HTTP response shape without breaking the abstraction's interface.

Tests updated:
  createSandbox.test.ts            mocks VercelSandbox.create instead of
                                    Sandbox.create; mock object uses
                                    sdkStatus instead of status
  createSandboxWithFallback.test.ts asserts restoreSnapshotId pass-through
  getActiveSandbox.test.ts         mocks VercelSandbox.connect; sdkStatus
                                    on mock objects
  processCreateSandbox.test.ts     mockSandbox uses sdkStatus

Verification:
  - pnpm lint:check: clean
  - pnpm test: 2391/2391 pass
  - HTTP response shape unchanged: same fields, same enum values for
    sandboxStatus (sourced from the SDK now via sdkStatus, was directly
    via SDK Sandbox.status before — identical strings either way)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel Bot commented May 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
api Ready Ready Preview May 4, 2026 1:33pm

coderabbitai Bot commented May 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 83a21401-6958-4b0d-a28c-2633ddea7647

📥 Commits

Reviewing files that changed from the base of the PR and between 375673a and 7e614c0.

⛔ Files ignored due to path filters (1)
  • lib/sandbox/__tests__/createSandbox.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
📒 Files selected for processing (3)
  • lib/sandbox/createSandbox.ts
  • lib/sandbox/processCreateSandbox.ts
  • lib/sandbox/vercel/sandbox/VercelSandbox.ts

Walkthrough

This PR replaces direct usage of the @vercel/sandbox SDK types with the project-specific VercelSandbox abstraction across sandbox creation, snapshot restore, reconnection, and result mapping; it adds two VercelSandbox getters (sdkStatus, createdAt) and updates function signatures and response shapes accordingly.

Changes

Sandbox Abstraction Migration

| Layer / File(s) | Summary |
| --- | --- |
| Data Shape / Types: lib/sandbox/vercel/sandbox/VercelSandbox.ts, lib/sandbox/createSandbox.ts, lib/sandbox/createSandboxFromSnapshot.ts | Introduced VercelSandbox getters sdkStatus: string and createdAt: Date. CreateSandboxParams now aliases VercelSandboxConfig. Public result types updated to reference VercelSandbox and sandboxStatus: string. |
| Core Implementation: lib/sandbox/vercel/sandbox/VercelSandbox.ts, lib/sandbox/createSandbox.ts | Added sdkStatus and createdAt getters that refresh session; createSandbox now always calls VercelSandbox.create(...) with explicit defaults (vcpus, runtime, timeout) merged with provided config and no longer branches on source.type === "snapshot". |
| Wiring / Callsites: lib/sandbox/createSandboxWithFallback.ts, lib/sandbox/getActiveSandbox.ts, lib/sandbox/processCreateSandbox.ts | Snapshot restore now uses restoreSnapshotId option. getActiveSandbox uses VercelSandbox.connect(...) and checks sandbox.sdkStatus === "running". processCreateSandbox maps sandboxStatus from sandbox.sdkStatus and serializes createdAt from sandbox.createdAt. |
| Consumers / API Surface: lib/sandbox/createSandboxFromSnapshot.ts, other callers...lib/sandbox/* | Public/external-facing signatures and returned shapes updated to reflect VercelSandbox and new field names (e.g., sandbox.name, sandbox.sdkStatus, sandbox.timeout, sandbox.createdAt). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

✨ An SDK gives way to a named class bright,
Getters whisper status, dates now take flight.
Snapshots restored with a single field,
Types align, and wiring is healed.
Small changes, clearer sandboxed light.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Solid & Clean Code | ✅ Passed | Pull request demonstrates strong adherence to SOLID principles and clean code practices with single responsibility functions, proper abstraction through VercelSandbox, eliminated duplication via centralized configuration, and maintainable architecture. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
lib/sandbox/processCreateSandbox.ts (1)

25-30: ⚡ Quick win

Extract a dedicated buildSandboxCreatedResponse helper.

This response mapping now duplicates the logic in lib/sandbox/createSandbox.ts, including the sdkStatus lookup and createdAt fallback. A small shared serializer would keep the direct-create and snapshot-create paths from drifting.

As per coding guidelines, "Extract shared logic into reusable utilities following Don't Repeat Yourself (DRY) principle".
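
One way the suggested `buildSandboxCreatedResponse` helper could look, as a sketch only. `SandboxLike` is a hypothetical structural stand-in for the parts of `VercelSandbox` the serializer reads, and mapping `sandboxId` from `sandbox.name` is an assumption based on the fields the walkthrough lists. The sketch treats `createdAt` as always present and therefore omits the fallback the comment mentions.

```typescript
// Structural stand-in for the VercelSandbox fields the serializer needs.
interface SandboxLike {
  name: string;
  sdkStatus: string;
  timeout: number;
  createdAt: Date;
}

// The HTTP response shape named in the PR description.
interface SandboxCreatedResponse {
  sandboxId: string;
  sandboxStatus: string;
  timeout: number;
  createdAt: string;
}

// Single place that maps a sandbox to the response shape, so the
// direct-create and snapshot-create paths cannot drift apart.
function buildSandboxCreatedResponse(sandbox: SandboxLike): SandboxCreatedResponse {
  return {
    sandboxId: sandbox.name, // assumption: the sandbox name doubles as its id
    sandboxStatus: sandbox.sdkStatus,
    timeout: sandbox.timeout,
    createdAt: sandbox.createdAt.toISOString(),
  };
}
```

Because the parameter is structural, test mocks only need to supply the four fields rather than a full sandbox instance.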

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/sandbox/processCreateSandbox.ts` around lines 25 - 30, Extract a shared
helper named buildSandboxCreatedResponse that centralizes the mapping from a
Sandbox entity to SandboxCreatedResponse (including resolving sdkStatus lookup
and applying the createdAt fallback to ISO string), then replace the inline
construction in processCreateSandbox (the const result assignment) and the
equivalent mapping in createSandbox with calls to buildSandboxCreatedResponse so
both paths use the same serializer; ensure the helper accepts the sandbox object
(and any dependencies needed for sdkStatus resolution) and returns the exact
shape used elsewhere (SandboxCreatedResponse).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/sandbox/createSandbox.ts`:
- Around line 41-53: The restore branch for VercelSandbox.create currently
spreads config but still allows vcpus/runtime to be forwarded (and the SDK will
default them), so restores overwrite snapshot-defined resources; when
config.restoreSnapshotId is true, build the payload so vcpus and runtime are
omitted (e.g., destructure and exclude vcpus/runtime or explicitly set them to
undefined) and only pass timeout and other non-resource fields to
VercelSandbox.create; reference VercelSandbox.create, config.restoreSnapshotId,
DEFAULT_VCPUS, and DEFAULT_RUNTIME when making the change so restores no longer
inherit the defaults.

In `@lib/sandbox/vercel/sandbox/VercelSandbox.ts`:
- Around line 975-982: The createdAt getter exposes potentially stale session
metadata because it reads this.session.createdAt directly; update the createdAt
accessor in VercelSandbox to call await this.refreshStateFromCurrentSession()
(or the synchronous equivalent used by sdkStatus) before returning
this.session.createdAt ?? undefined so it refreshes session state after
reconnect/resume; reference the createdAt getter,
refreshStateFromCurrentSession(), sdkStatus, and this.session.createdAt to
locate and modify the code.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 20b9b0ca-7cad-4fc2-ae87-8a679e344db9

📥 Commits

Reviewing files that changed from the base of the PR and between 3ea937a and 375673a.

⛔ Files ignored due to path filters (4)
  • lib/sandbox/__tests__/createSandbox.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/createSandboxWithFallback.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/getActiveSandbox.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/sandbox/__tests__/processCreateSandbox.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
📒 Files selected for processing (6)
  • lib/sandbox/createSandbox.ts
  • lib/sandbox/createSandboxFromSnapshot.ts
  • lib/sandbox/createSandboxWithFallback.ts
  • lib/sandbox/getActiveSandbox.ts
  • lib/sandbox/processCreateSandbox.ts
  • lib/sandbox/vercel/sandbox/VercelSandbox.ts

Comment thread lib/sandbox/createSandbox.ts Outdated
Comment on lines 41 to 53
+ const sandbox = await VercelSandbox.create(
+   config.restoreSnapshotId
      ? {
-         ...params,
-         timeout: params.timeout ?? DEFAULT_TIMEOUT,
+         ...config,
+         timeout: config.timeout ?? DEFAULT_TIMEOUT,
        }
      : {
-         resources: { vcpus: DEFAULT_VCPUS },
-         timeout: params.timeout ?? DEFAULT_TIMEOUT,
+         vcpus: DEFAULT_VCPUS,
+         timeout: config.timeout ?? DEFAULT_TIMEOUT,
          runtime: DEFAULT_RUNTIME,
-         ...params,
+         ...config,
        },
  );

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Snapshot restores still inherit the default runtime/resource config.

This branch omits DEFAULT_VCPUS and DEFAULT_RUNTIME, but VercelSandbox.create() still defaults vcpus = 4 and runtime = "node22" internally and always forwards both into the SDK create payload. Restores via restoreSnapshotId therefore do not actually preserve snapshot-defined resources/runtime the way this helper now documents.
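
The effect described here can be reproduced in isolation. The sketch below uses a hypothetical `buildSdkPayload` function in place of `VercelSandbox.create()`; only the default values `4` and `"node22"` come from the comment above, and everything else is illustrative.

```typescript
interface SketchCreateConfig {
  vcpus?: number;
  runtime?: string;
  restoreSnapshotId?: string;
}

// Stand-in for the abstraction's create step: it fills in defaults itself
// and always forwards both fields, so a caller cannot opt out by omission.
function buildSdkPayload(config: SketchCreateConfig) {
  const { vcpus = 4, runtime = "node22", restoreSnapshotId } = config;
  return { vcpus, runtime, restoreSnapshotId };
}

// The "restore" caller leaves vcpus/runtime out, hoping the snapshot decides.
const restorePayload = buildSdkPayload({ restoreSnapshotId: "snap_example" });

// The "fresh" caller passes the same values explicitly.
const freshPayload = buildSdkPayload({ vcpus: 4, runtime: "node22" });
```

Both payloads end up with the same `vcpus` and `runtime`, which is why omitting them in the restore branch has no effect on what reaches the SDK.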


Comment thread lib/sandbox/vercel/sandbox/VercelSandbox.ts

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 10 files

Confidence score: 3/5

  • There is a concrete regression risk in lib/sandbox/processCreateSandbox.ts: substituting new Date() when createdAt is missing fabricates creation metadata and can misrepresent sandbox state.
  • Because this issue is medium severity (6/10) with high confidence (8/10) and affects observable data correctness, this sits at some merge risk rather than a minor housekeeping concern.
  • Pay close attention to lib/sandbox/processCreateSandbox.ts - ensure missing createdAt is handled explicitly (or failed fast) instead of generating a synthetic timestamp.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="lib/sandbox/processCreateSandbox.ts">

<violation number="1" location="lib/sandbox/processCreateSandbox.ts:29">
P2: Do not substitute `new Date()` for a missing sandbox `createdAt`; this returns incorrect creation metadata. Fail fast (or explicitly handle missing SDK data) instead of fabricating a timestamp.</violation>
</file>
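
To make the difference concrete, here is a small sketch contrasting the two behaviors. The function names and the `SketchSession` type are hypothetical and the error message is illustrative; this is not the code in `processCreateSandbox.ts`.

```typescript
interface SketchSession {
  createdAt?: Date;
}

// Fallback variant: silently invents a timestamp when the field is missing,
// so the caller cannot tell real creation times from fabricated ones.
function createdAtWithFallback(session: SketchSession): string {
  return (session.createdAt ?? new Date()).toISOString();
}

// Fail-fast variant: surfaces the missing field instead of masking it.
function createdAtOrThrow(session: SketchSession): string {
  if (!session.createdAt) {
    throw new Error("sandbox session is missing createdAt");
  }
  return session.createdAt.toISOString();
}
```

With the fail-fast variant, a missing `createdAt` shows up as an error at the point of the read rather than as a plausible-looking but wrong timestamp in the API response.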
Architecture diagram
sequenceDiagram
    participant Client as API / Route Handler
    participant Lib as Sandbox Libs (create/getActive)
    participant DB as Supabase (Account Sandboxes)
    participant OA as NEW: VercelSandbox (Abstraction)
    participant SDK as @vercel/sandbox (SDK)

    Note over Client, SDK: Sandbox Lifecycle (Create/Reconnect)

    rect rgb(240, 240, 240)
    Note right of Client: Create Flow
    Client->>Lib: createSandbox(config)
    Lib->>OA: NEW: VercelSandbox.create(config)
    OA->>SDK: Sandbox.create()
    SDK-->>OA: Session object
    OA-->>Lib: VercelSandbox Instance
    Lib->>OA: NEW: get sdkStatus & createdAt
    Lib-->>Client: { sandboxId, sandboxStatus, createdAt, ... }
    end

    rect rgb(230, 240, 255)
    Note right of Client: Reconnect Flow
    Client->>Lib: getActiveSandbox(accountId)
    Lib->>DB: selectAccountSandboxes(accountId)
    DB-->>Lib: mostRecentSandboxId
    
    Lib->>OA: CHANGED: VercelSandbox.connect(id)
    OA->>SDK: Sandbox.get({ name: id })
    SDK-->>OA: Session object
    
    alt Status Check
        Lib->>OA: NEW: get sdkStatus
        OA-->>Lib: "running"
        Lib-->>Client: VercelSandbox Instance
    else Not Running or Error
        OA-->>Lib: "stopped" / Exception
        Lib-->>Client: null
    end
    end

    Note over Client, OA: Note: Legacy SDK logic (installClaudeCode/runClaudeCode)<br/>still calls SDK directly until Phase 2.3.
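
The reconnect branch of the diagram can be sketched roughly as follows. The `connect` function is injected so the example stays self-contained; `getActiveSandboxSketch`, `ConnectedSandbox`, and `Connect` are hypothetical names, and the database lookup that resolves an account to a sandbox id is left out.

```typescript
// Only the field the status check needs.
interface ConnectedSandbox {
  sdkStatus: string;
}

// Stand-in for a connect call such as VercelSandbox.connect.
type Connect = (id: string) => Promise<ConnectedSandbox>;

// Returns the sandbox only when it is running; any other status or a
// connection error yields null, mirroring the two diagram branches.
async function getActiveSandboxSketch(
  id: string,
  connect: Connect,
): Promise<ConnectedSandbox | null> {
  try {
    const sandbox = await connect(id);
    return sandbox.sdkStatus === "running" ? sandbox : null;
  } catch {
    return null;
  }
}
```

Returning `null` for both the "not running" and "error" cases keeps the caller's decision simple: reuse the sandbox if one comes back, otherwise create a new one.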

Comment thread lib/sandbox/processCreateSandbox.ts Outdated
@sweetmantech
Contributor Author

Manual verification on preview deployment

Tested PR #509's refactor end-to-end against `https://api-git-feat-sandbox-callers-use-abstraction-recoup.vercel.app`. Created a real Vercel Sandbox via the refactored path (`processCreateSandbox` → `createSandboxFromSnapshot` → `createSandboxWithFallback` → `createSandbox` → `VercelSandbox.create()`) and confirmed the HTTP response shape is byte-identical to the pre-refactor surface.

Setup

T1 — POST /api/sandboxes (the refactored path)

curl -X POST $PREVIEW/api/sandboxes \
  -H "x-api-key: $KEY" \
  -H "Content-Type: application/json" \
  -d '{}'

Response (HTTP 200):

{
  "status": "success",
  "sandboxes": [{
    "sandboxId": "coral-coming-newt-bZXeCe",
    "sandboxStatus": "running",
    "timeout": 1800000,
    "createdAt": "2026-05-04T12:53:40.458Z"
  }]
}

What this proves:

  • VercelSandbox.create() (the abstraction's create flow) succeeded against real Vercel infra
  • New sdkStatus getter on VercelSandbox returns the SDK enum value ("running") — identical to what Sandbox.status returned pre-refactor
  • New createdAt getter on VercelSandbox returns the SDK session's createdAt Date — .toISOString() round-trips correctly
  • sandbox.name and sandbox.timeout (already-existing public fields on VercelSandbox) still expose the right values
  • The restoreSnapshotId param-shape change in createSandboxWithFallback works (no snapshot existed for the test account → fell through to fresh sandbox path correctly)
  • account_sandboxes row inserted (otherwise T2/T3 would not see this sandbox)
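
A shape check like the one performed by eye above could be automated along these lines. This is a sketch: `isSandboxCreatedResponse` is a hypothetical helper, not part of the api codebase, and it only checks the four documented fields plus the ISO round trip of `createdAt`.

```typescript
interface SandboxCreatedResponse {
  sandboxId: string;
  sandboxStatus: string;
  timeout: number;
  createdAt: string;
}

// Type guard for one entry of the `sandboxes` array in the response.
function isSandboxCreatedResponse(value: unknown): value is SandboxCreatedResponse {
  if (typeof value !== "object" || value === null) return false;
  const { sandboxId, sandboxStatus, timeout, createdAt } = value as Record<string, unknown>;
  if (typeof sandboxId !== "string" || typeof sandboxStatus !== "string") return false;
  if (typeof timeout !== "number" || typeof createdAt !== "string") return false;
  // createdAt must parse and survive a Date round trip unchanged.
  const parsed = new Date(createdAt);
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString() === createdAt;
}
```

Running the guard over each element of `sandboxes` would flag a renamed field or a non-ISO timestamp without relying on manual comparison.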

T2 — GET /api/sandboxes (list, unchanged code path)

This goes through `getSandboxesHandler`, which uses `getSandboxStatus`; that helper is not refactored in this PR per Option B (it still uses the SDK directly). Run as a regression check that the unchanged path still works after the abstraction extension.

curl $PREVIEW/api/sandboxes -H "x-api-key: $KEY"

Returned a list of 14+ sandboxes for this account. The newly created one was first, with sandboxStatus: "running"; older sandboxes showed sandboxStatus: "stopped". All entries share the same response shape:

{
  "sandboxId": "coral-coming-newt-bZXeCe",
  "sandboxStatus": "running",
  "timeout": 1830000,
  "createdAt": "2026-05-04T12:53:40.574Z"
}

What this proves:

  • Sandboxes created via the new abstraction path are round-trippable via the unchanged SDK status read — they work together
  • Old sandboxes (created before the migration) still surface correctly
  • The two readonly getters I added to VercelSandbox (sdkStatus, createdAt) didn't break any existing read code (no regressions in 25 sandbox tests + 2391/2391 full suite)

T3 — GET /api/sandboxes?sandbox_id=<id> (filter)

curl "$PREVIEW/api/sandboxes?sandbox_id=coral-coming-newt-bZXeCe" -H "x-api-key: $KEY"

Result:

{
  "count": 1,
  "first": {
    "sandboxId": "coral-coming-newt-bZXeCe",
    "sandboxStatus": "running",
    "timeout": 1830000,
    "createdAt": "2026-05-04T12:53:40.574Z"
  }
}

What this proves: filtered status reads work for sandboxes created via the new abstraction path. Same code path as T2; just exercising the filter.

What this confirms

| Surface change | Verified live |
| --- | --- |
| VercelSandbox.create() replaces Sandbox.create() in api's create path | ✓ T1 succeeds, sandbox runs |
| VercelSandbox.connect() would replace Sandbox.get in getActiveSandbox (not directly exercised here because there was no existing active sandbox to reconnect to; covered by 4 unit tests) | ✓ unit tests |
| Two new readonly getters (sdkStatus, createdAt) populate the existing HTTP response shape correctly | ✓ T1 returned identical-shape JSON to pre-refactor |
| restoreSnapshotId param-shape change in createSandboxWithFallback | ✓ T1 worked (no-snapshot path); fallback test covers snapshot path |
| installClaudeCode / runClaudeCode (intentionally untouched per Option B) | ✓ no callers in api production code → no risk |

Cleanup

Sandbox coral-coming-newt-bZXeCe is on a 30-minute timeout — will self-stop without manual cleanup.

Local validation already complete (from PR description)

  • pnpm lint:check clean
  • pnpm test: 2391/2391 pass (181 sandbox tests, all green; same count as before this PR)
  • pnpm build succeeds (preview deployed cleanly)

Ready to merge.

Three real issues from CodeRabbit + cubic:

1. createdAt staleness (CodeRabbit minor)
   The new `createdAt` getter on VercelSandbox skipped the
   `refreshStateFromCurrentSession()` step that `sdkStatus` uses, so
   readers right after a reconnect could see stale session metadata.
   Add the refresh.

2. Fabricated createdAt (cubic P2)
   Both createSandbox.ts and processCreateSandbox.ts had a
   `?? new Date().toISOString()` fallback that fabricated creation
   metadata when sandbox.createdAt was missing. The SDK guarantees
   createdAt is populated for any reachable instance, so the fallback
   was both wrong (fabricates data) and unnecessary.

   Tighten the getter to return `Date` (not `Date | undefined`) and
   throw with an explicit "SDK contract violation" message if the
   field is missing — fail-fast surfaces a real contract bug instead
   of silently lying.

   Drop the `?? new Date()` fallbacks at both call sites.

3. Misleading snapshot-restore branching (CodeRabbit major)
   createSandbox.ts had two paths — a "snapshot" branch that omitted
   DEFAULT_VCPUS/DEFAULT_RUNTIME (intent: let snapshot dictate), and
   a "fresh" branch that applied defaults. But VercelSandbox.create
   internally defaults vcpus=4 and runtime="node22" regardless, so
   the omission was a no-op — the abstraction always forwarded those
   to the SDK.

   Drop the misleading branching. Document the actual behavior at
   the top of createSandbox: "VercelSandbox.create applies its own
   defaults regardless of source — those apply to the runtime
   resources of the new sandbox even when restoring from a snapshot."

   Updated the snapshot-restore test to assert the actual call shape
   (vcpus + runtime + timeout + restoreSnapshotId) instead of just
   the original SDK-style truncated args.
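
The single-path behavior from item 3 can be sketched as below. This is an illustration under assumptions: `buildCreateConfig` and `SketchConfig` are hypothetical names, and the three constant values are assumed (the text above only states that `VercelSandbox.create` defaults to vcpus=4 and runtime="node22").

```typescript
// Assumed values for illustration; the constants in createSandbox.ts may differ.
const DEFAULT_VCPUS = 4;
const DEFAULT_RUNTIME = "node22";
const DEFAULT_TIMEOUT = 30 * 60 * 1000; // assumption: 30 minutes in ms

interface SketchConfig {
  vcpus?: number;
  runtime?: string;
  timeout?: number;
  restoreSnapshotId?: string;
}

// One path for fresh creates and snapshot restores alike: defaults are
// always applied, and restoreSnapshotId is passed through when present.
function buildCreateConfig(config: SketchConfig): SketchConfig {
  return {
    ...config,
    vcpus: config.vcpus ?? DEFAULT_VCPUS,
    runtime: config.runtime ?? DEFAULT_RUNTIME,
    timeout: config.timeout ?? DEFAULT_TIMEOUT,
  };
}
```

A snapshot-restore test can then assert the full call shape (vcpus, runtime, timeout, and restoreSnapshotId together) rather than a truncated argument list.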

Verification:
- pnpm lint:check: clean
- pnpm test: 2391/2391 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@cubic-dev-ai cubic-dev-ai Bot left a comment

0 issues found across 4 files (changes from recent commits).

Requires human review: This is a significant refactor of the core sandbox lifecycle management, moving from direct SDK calls to a new internal abstraction (Phase 2.2).

@sweetmantech sweetmantech merged commit 06f0822 into test May 4, 2026
6 checks passed
@sweetmantech sweetmantech deleted the feat/sandbox-callers-use-abstraction branch May 4, 2026 13:54
sweetmantech added a commit that referenced this pull request May 4, 2026
) (#510)

* refactor(sandbox): callers use open-agents abstraction (Phase 2.2)

* fix: address PR #509 review feedback

Three real issues from CodeRabbit + cubic:

1. createdAt staleness (CodeRabbit minor)
   The new `createdAt` getter on VercelSandbox skipped the
   `refreshStateFromCurrentSession()` step that `sdkStatus` uses, so
   readers right after a reconnect could see stale session metadata.
   Add the refresh.

2. Fabricated createdAt (cubic P2)
   Both createSandbox.ts and processCreateSandbox.ts had a
   `?? new Date().toISOString()` fallback that fabricated creation
   metadata when sandbox.createdAt was missing. The SDK guarantees
   createdAt is populated for any reachable instance, so the fallback
   was both wrong (fabricates data) and unnecessary.

   Tighten the getter to return `Date` (not `Date | undefined`) and
   throw with an explicit "SDK contract violation" message if the
   field is missing — fail-fast surfaces a real contract bug instead
   of silently lying.

   Drop the `?? new Date()` fallbacks at both call sites.

3. Misleading snapshot-restore branching (CodeRabbit major)
   createSandbox.ts had two paths — a "snapshot" branch that omitted
   DEFAULT_VCPUS/DEFAULT_RUNTIME (intent: let snapshot dictate), and
   a "fresh" branch that applied defaults. But VercelSandbox.create
   internally defaults vcpus=4 and runtime="node22" regardless, so
   the omission was a no-op — the abstraction always forwarded those
   to the SDK.

   Drop the misleading branching. Document the actual behavior at
   the top of createSandbox: "VercelSandbox.create applies its own
   defaults regardless of source — those apply to the runtime
   resources of the new sandbox even when restoring from a snapshot."

   Updated the snapshot-restore test to assert the actual call shape
   (vcpus + runtime + timeout + restoreSnapshotId) instead of just
   the original SDK-style truncated args.
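
The refresh-then-fail-fast getter from items 1 and 2 could look roughly like the sketch below. `SessionLike`, `VercelSandboxSketch`, and the `loadSession` hook are assumptions made so the example is self-contained; the real class reaches its SDK session in its own way, and only the `createdAt` and `refreshStateFromCurrentSession` names and the "SDK contract violation" wording come from the notes above.

```typescript
// Hypothetical slice of the SDK session; only createdAt matters here.
interface SessionLike {
  createdAt?: Date;
}

class VercelSandboxSketch {
  private session: SessionLike;
  private readonly loadSession: () => SessionLike;

  // loadSession is an assumed hook standing in for however the real class
  // reads the current SDK session.
  constructor(loadSession: () => SessionLike) {
    this.loadSession = loadSession;
    this.session = loadSession();
  }

  // Stand-in for refreshStateFromCurrentSession(): re-read session metadata.
  private refreshStateFromCurrentSession(): void {
    this.session = this.loadSession();
  }

  get createdAt(): Date {
    // Refresh first so a read right after a reconnect is not stale.
    this.refreshStateFromCurrentSession();
    const createdAt = this.session.createdAt;
    if (createdAt === undefined) {
      // Fail fast instead of fabricating a timestamp.
      throw new Error("SDK contract violation: session.createdAt is missing");
    }
    return createdAt;
  }
}
```

With the return type narrowed to `Date`, call sites can call `sandbox.createdAt.toISOString()` directly and no longer need a `??` fallback.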

Verification:
- pnpm lint:check: clean
- pnpm test: 2391/2391 pass



---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sweetmantech added a commit that referenced this pull request May 4, 2026
* refactor(sandbox): callers use open-agents abstraction (Phase 2.2) (#509)

* chore(sandbox): delete dead Claude Code helpers (Phase 2.3) (#512)

* chore(sandbox): delete dead Claude Code helpers (Phase 2.3)

installClaudeCode and runClaudeCode were defined but never imported
anywhere in api production code — confirmed by grep on main:

  $ grep -rn "installClaudeCode\b\|runClaudeCode\b" lib/ app/
  lib/sandbox/installClaudeCode.ts:9: export async function installClaudeCode(...)
  lib/sandbox/runClaudeCode.ts:10:    export async function runClaudeCode(...)

Both files were skipped during the Phase 2.2 abstraction refactor
(per the agreed Option B — they used SDK features the abstraction
doesn't expose: sudo, stdout/stderr streaming, batched writes). With
the broader migration moving to Vercel Workflow + open-agents' agent
package for sandbox bootstrap, these orphans have no path to being
called again.

Removed:
  lib/sandbox/installClaudeCode.ts                (32 lines)
  lib/sandbox/runClaudeCode.ts                    (29 lines)
  lib/sandbox/__tests__/installClaudeCode.test.ts (4 tests)
  lib/sandbox/__tests__/runClaudeCode.test.ts     (6 tests)

Verification:
  - pnpm lint:check: clean
  - pnpm test: 2381/2381 pass (was 2391 — net -10 tests from the
    two deleted test files)

Note: getOrCreateSandbox.ts also has zero importers per the audit
and is similarly dead, but is intentionally NOT deleted in this PR
since it was not explicitly flagged as orphan in the migration plan.
Worth a separate follow-up decision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(sandbox): also delete getOrCreateSandbox + getActiveSandbox (YAGNI)

Cascade audit found two more truly-dead helpers per YAGNI:

  getOrCreateSandbox.ts    0 importers (self-only references)
  getActiveSandbox.ts      only called by getOrCreateSandbox — orphan
                            once that goes

Removed:
  lib/sandbox/getOrCreateSandbox.ts                (39 lines)
  lib/sandbox/getActiveSandbox.ts                  (33 lines)
  lib/sandbox/__tests__/getOrCreateSandbox.test.ts (3 tests)
  lib/sandbox/__tests__/getActiveSandbox.test.ts   (4 tests)

Live consumers of related helpers preserved:
  - createSandboxFromSnapshot still used by processCreateSandbox
  - selectAccountSandboxes still used by aggregateAccountSandboxStats,
    buildGetSandboxesParams, getSandboxesHandler, validateGetSandboxesRequest

Verification:
  - pnpm lint:check: clean
  - pnpm test: 2374/2374 pass (was 2381 — net -7 from the two deleted
    test files; -3 from getOrCreateSandbox.test.ts + -4 from
    getActiveSandbox.test.ts)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sweetmantech added a commit that referenced this pull request May 4, 2026
* refactor(sandbox): callers use open-agents abstraction (Phase 2.2) (#509)

* chore(sandbox): delete dead Claude Code helpers (Phase 2.3) (#512)

* feat(sessions): port POST /api/sessions from open-agents (#515)

* feat(sessions): port GET /api/sessions/[sessionId] from open-agents (Phase 2.4 — first route)

First route in the route-by-route cutover plan. Strategy: open-agents
frontend stays unchanged in shape; api ports each route it calls in
priority order (simplest first), and the open-agents frontend gets
cut over to api one route at a time.

Why this route first:
- Pure DB read (single-row select by id) — no agent runner, no Vercel
  Workflow, no sandbox runtime
- Hits sessions table already migrated in database PR #20
- Frontend usage: agents-frontend hits /api/sessions/{id} on session
  detail page navigation
- Smallest possible blast radius for proving the cutover pattern

Files added:
  lib/supabase/sessions/selectSession.ts  Single-row helper + SessionRow
                                          type (hand-typed; database.types.ts
                                          regen pending — flagged in code
                                          comment)
  app/api/sessions/[sessionId]/route.ts   GET handler matching open-agents
                                          response shape exactly (camelCase
                                          fields, "userId" preserved on the
                                          wire even though stored as
                                          account_id internally)
  app/api/sessions/[sessionId]/__tests__/route.test.ts (5 tests)

Auth: validateAuthContext (Privy Bearer or x-api-key). Response codes
match open-agents: 200 happy path, 401 no auth, 403 not owner, 404 not
found.

Wire-format translation: snake_case Supabase row -> camelCase response,
with account_id surfaced as userId so the existing open-agents frontend
fetches with zero code changes. Translation lives at the route boundary
(toSessionResponse) where it is easy to remove once chat absorbs this
UI and we can switch to schema-natural naming.
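
A minimal sketch of that boundary translation. The `account_id` to `userId` rename is taken from the notes above; the other columns and the `SessionRowSketch` / `SessionResponseSketch` type names are illustrative assumptions, not the real `sessions` schema.

```typescript
// Hypothetical subset of the sessions row; only account_id is named in the
// notes above, the other columns are illustrative.
interface SessionRowSketch {
  id: string;
  account_id: string;
  title: string;
  created_at: string;
}

// camelCase wire shape expected by the existing frontend (assumed fields).
interface SessionResponseSketch {
  id: string;
  userId: string;
  title: string;
  createdAt: string;
}

function toSessionResponse(row: SessionRowSketch): SessionResponseSketch {
  return {
    id: row.id,
    userId: row.account_id, // stored as account_id, surfaced as userId on the wire
    title: row.title,
    createdAt: row.created_at,
  };
}
```

Because the mapping is a single pure function at the route boundary, switching to schema-natural naming later would only mean changing or deleting this one function.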

Verification:
- pnpm lint:check: clean
- pnpm test: 2379/2379 pass (5 new for this route)

Up next:
- Cutover step (separate PR in open-agents): point the frontend at
  api's URL for this single route. Validate end-to-end before porting
  the next route.
- Next routes in priority order (still pure DB, no agent/workflow):
  GET /api/sessions (list with unread — needs Postgres RPC for the
  multi-table aggregation), GET /api/sessions/[id]/chats, GET
  /api/sessions/[id]/chats/[chatId].

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR review: SRP splits + use Tables<"sessions"> from regen'd types

Three review comments on PR #514:

1. SRP: extract toSessionResponse to its own file
   was: defined inline in app/api/sessions/[sessionId]/route.ts
   now: lib/sessions/toSessionResponse.ts (one exported fn per file)

2. SRP: add a handler function (mirroring api convention)
   was: GET handler logic inline in route.ts
   now: lib/sessions/getSessionByIdHandler.ts contains all the auth +
        ownership + DB lookup + response logic; route.ts is a thin
        shell that awaits options.params and delegates. Matches the
        pattern used by every other api route (e.g. socials/[id]/scrape,
        artists/[id]/...).

3. DRY: use existing db schema type
   was: hand-typed SessionRow interface in selectSession.ts
   now: Tables<"sessions"> from types/database.types.ts (regenerated
        via npx supabase gen types typescript --project-id ...
        --schema public)

The types regen also resolved the preview-build failure
("Type instantiation is excessively deep and possibly infinite") on
the .from("sessions") call; Supabase's type inference was choking
because the table was unknown to the generic.

Files added:
  lib/sessions/toSessionResponse.ts
  lib/sessions/getSessionByIdHandler.ts

Files modified:
  app/api/sessions/[sessionId]/route.ts        thin shell now
  app/api/sessions/[sessionId]/__tests__/
    route.test.ts                              type alias updated
  lib/supabase/sessions/selectSession.ts       Tables<"sessions">
  types/database.types.ts                      Supabase regen

Verification:
  - pnpm lint:check: clean
  - pnpm test: 2379/2379 pass (no test changes; same 5 route tests)
  - tsc compile clean (the local pnpm build progresses past compile
    into page-data collection where it fails on missing local env
    vars — Vercel preview will have those set, so the preview rebuild
    should now succeed)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): make 404/403 errors emit status:"error" for shape consistency

The 401 returned by validateAuthContext is shaped like
{status:"error", error:"..."}, but the 404/403 from this handler returned
{error:"..."} only. Same endpoint, two error shapes, which is inconsistent
for clients. Align all error responses on the validateAuthContext shape.

Tests now assert the full error body, not just the status code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sessions): port POST /api/sessions from open-agents

Implements the POST /api/sessions contract documented in
recoupable/docs PR #186 + #187. Creates a session row and an
initial chat row; rolls back the session if chat insert fails so
callers never observe an orphaned session.
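
The insert-then-roll-back flow can be sketched as follows. `SessionStore` and `createSessionWithChat` are hypothetical names, injected here so the example runs without a database; the real handler presumably talks to Supabase helpers instead.

```typescript
interface SessionRow {
  id: string;
}

interface ChatRow {
  id: string;
  session_id: string;
}

// Hypothetical persistence surface, injected so the sketch stays self-contained.
interface SessionStore {
  insertSession(): Promise<SessionRow | null>;
  insertChat(sessionId: string): Promise<ChatRow | null>;
  deleteSession(sessionId: string): Promise<void>;
}

async function createSessionWithChat(
  store: SessionStore,
): Promise<{ session: SessionRow; chat: ChatRow } | null> {
  const session = await store.insertSession();
  if (session === null) return null;

  const chat = await store.insertChat(session.id);
  if (chat === null) {
    // Roll back so no caller ever observes a session without its initial chat.
    await store.deleteSession(session.id);
    return null;
  }
  return { session, chat };
}
```

A `null` result would map to the 500 response in the handler; both inserts either succeed together or leave nothing behind.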

Auth: validateAuthContext (Privy Bearer or x-api-key).
Validation: Zod schema + GitHub repo segment regex. Body is
optional — empty body creates a session with sensible defaults
(status=running, lifecycle_state=provisioning, sandbox_state.type=
vercel, title="New session").

Out of scope (will follow once database catches up):
  auto_commit_push_override, auto_create_pr_override, pr_number,
  pr_status — these columns don't yet exist on api's sessions
  table, so the docs spec was trimmed accordingly in docs PR #187.

TDD: 9 handler tests cover 401, 400 (sandboxType / repoOwner /
repoName), 200 happy path, branch generation, title pass-through,
500 (insertSession failure), and 500-with-rollback (insertChat
failure). Plus 1 thin test on the route shell.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sessions): add OPTIONS handler + cache directives to POST route

Match the convention from app/api/sessions/[sessionId]/route.ts:
- OPTIONS handler returning 200 + CORS headers (preflight)
- dynamic="force-dynamic", fetchCache="force-no-store", revalidate=0

POST routes that mutate DB shouldn't be cached, and browsers issuing
preflight checks (POST with JSON body + custom auth headers) need
OPTIONS to respond.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): address PR review feedback

- SRP: extract insert-row construction to lib/sessions/buildSessionInsertRow.ts
- YAGNI: drop generateSessionBranchName + isNewBranch handling (sessions
  commit to whatever branch the client provides; auto-generation was
  speculative)
- Tighten isValidGitHubRepoOwner: GitHub's actual rules are alphanumeric
  + hyphen only (no `_` or `.`), 1-39 chars, no leading/trailing or
  consecutive hyphens
- Tighten isValidGitHubRepoName: reject reserved `.` and `..`, reject
  `.git` suffix, cap at 100 chars
- Add unit tests for both validators (15 cases) and for the new
  buildSessionInsertRow (4 cases)
- Split createSessionHandler tests into auth/validation + persistence
  files; share fixtures via createSessionHandlerFixtures.ts. All test
  files now under 100 lines.
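
A minimal sketch of the two tightened validators, following the rules listed above. The allowed character set for repo names (`A-Z`, `a-z`, `0-9`, `.`, `_`, `-`) and the case-insensitive `.git` check are assumptions; the notes only spell out the rejections and the length cap.

```typescript
// Owner: 1-39 chars, alphanumeric or hyphen, no leading/trailing hyphen,
// no consecutive hyphens. The lookahead only allows a hyphen when an
// alphanumeric character follows it.
const OWNER_PATTERN = /^[a-zA-Z0-9](?:[a-zA-Z0-9]|-(?=[a-zA-Z0-9])){0,38}$/;

function isValidGitHubRepoOwner(owner: string): boolean {
  return OWNER_PATTERN.test(owner);
}

function isValidGitHubRepoName(name: string): boolean {
  if (name.length === 0 || name.length > 100) return false; // cap at 100 chars
  if (name === "." || name === "..") return false; // reserved names
  if (name.toLowerCase().endsWith(".git")) return false; // reject .git suffix
  return /^[A-Za-z0-9._-]+$/.test(name); // assumed character set
}
```

For example, `isValidGitHubRepoOwner("a--b")` and `isValidGitHubRepoName("api.git")` are both rejected, while `"recoupable"` and `"api"` pass.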

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): address second round of PR review

- 500 message: "Failed to create session" → "Internal server error"
  (per cubic.dev standardized 500 envelope feedback)
- SRP: extract failedToCreateSession to lib/sessions/failedToCreateSession.ts
- YAGNI: drop repoOwner from request body and remove
  isValidGitHubRepoOwner helper entirely (recoupable is the only
  owner; no need to validate)
- YAGNI: drop repoName from request body and remove
  isValidGitHubRepoName helper (repo identity is derived server-side
  from the authenticated account, not accepted from user input)
- Single-export per file: split createSessionHandlerFixtures.ts into
  makeCreateSessionReq.ts, baseSessionRow.ts, baseChatRow.ts.
  okAuth constant inlined where used.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sessions): port random-city title fallback from open-agents

Generated session titles now match the open-agents UX — names like
"Anchorage", "Vienna", "Philadelphia" — instead of every untitled
session being called "New session". Closes a wire-shape gap with
open-agents production identified by the head-to-head test on PR.

Pieces:
- lib/sessions/cityNames.ts: ~200-city curated list (verbatim port)
- lib/sessions/getRandomCityName.ts: pick a city not in `usedNames`,
  numeric-suffix fallback when the curated list is exhausted
- lib/supabase/sessions/selectSessionTitlesByAccountId.ts: Supabase
  helper for collision avoidance
- lib/sessions/resolveSessionTitle.ts: orchestrates provided title
  (trimmed) > random city fallback. Async. Kept separate from the
  insert-row builder so that stays synchronous + pure.
- buildSessionInsertRow now takes `title` as a parameter
- createSessionHandler awaits resolveSessionTitle before building the
  row
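
The title pick with collision avoidance could be sketched like this. The three city names are the examples quoted above, standing in for the ~200-entry list; the injectable `random` parameter and the `"Vienna 2"` suffix format are assumptions, since the notes only say "numeric-suffix fallback".

```typescript
// Three names from the commit message; the real list has roughly 200 entries.
const CITY_NAMES: readonly string[] = ["Anchorage", "Vienna", "Philadelphia"];

function getRandomCityName(
  usedNames: ReadonlySet<string>,
  random: () => number = Math.random, // injectable for deterministic tests (assumption)
): string {
  // Prefer any curated name the account has not used yet.
  const available = CITY_NAMES.filter((city) => !usedNames.has(city));
  if (available.length > 0) {
    return available[Math.floor(random() * available.length)];
  }

  // List exhausted: pick a base city and append the first free number.
  const base = CITY_NAMES[Math.floor(random() * CITY_NAMES.length)];
  let suffix = 2;
  while (usedNames.has(`${base} ${suffix}`)) {
    suffix += 1;
  }
  return `${base} ${suffix}`;
}
```

A caller would pass the set of titles already used by the account, which is what the Supabase title lookup above exists to provide.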

TDD: 4 tests for getRandomCityName, 4 for resolveSessionTitle. Handler
tests updated to mock resolveSessionTitle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: remove GET-only files (scope this PR to POST)

The GET endpoint + handler + tests live in PR #514 and were
inadvertently brought in when this branch was rebased after #514's
work. This PR is scoped to POST only; GET ships in #514.

Shared infrastructure stays (types/database.types.ts regen +
lib/sessions/toSessionResponse.ts) — both are required by the POST
handler too. When either #514 or this PR merges to test first, the
other will see those files already present and resolve cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): consolidate request validation + DRY supabase select

Two reviewer asks rolled into one commit:

SRP — validateCreateSessionBody now owns the full validation flow.
The handler used to call safeParseJson, validateAuthContext, and the
Zod body schema separately; that was three places to short-circuit
and three places to duplicate the error envelope. Folded them into
validateCreateSessionBody so the handler does one call → success or
NextResponse error. Returns { body, auth } on success.

DRY — replaced lib/supabase/sessions/selectSession.ts and
selectSessionTitlesByAccountId.ts with a single
selectSessions({ id?, accountId? }) that supports both call sites.
resolveSessionTitle now derives titles from the general fetch.

Tests:
- New validateCreateSessionBody.test.ts covers auth-failure / 400 /
  success / malformed-JSON tolerance (4 cases)
- Handler tests now mock validateCreateSessionBody (single mock
  surface instead of three)
- resolveSessionTitle tests mock selectSessions

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): address automated review feedback

Four small fixes from the latest round:

1. Zod v4 migration: { message } → { error } on the sandboxType
   literal. v4 unified the error customization API; { message } is
   deprecated.

2. Orphan rollback observability: when insertChat fails AND the
   session-rollback delete also fails, log the session id so ops
   can detect orphaned rows. New persistence test asserts the log.

3. Defensive try/catch in selectSessions so a thrown exception
   (network-level rejection, not a Supabase {error} return) doesn't
   bubble up and 500 the entire session-creation flow.

4. Deterministic test for getRandomCityName suffix-increment: pin
   Math.random instead of looping until the random pick lands on
   baseCity. Previous test could pass without ever asserting if the
   loop cap was hit.
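
One framework-free way to pin `Math.random`, as in item 4, is sketched below. `withPinnedRandom` and `pickIndex` are hypothetical helpers written for this example; the actual test may well use the test runner's own spy or mock utilities instead.

```typescript
// Run fn with Math.random pinned to a fixed value, then restore the original.
function withPinnedRandom<T>(value: number, fn: () => T): T {
  const original = Math.random;
  Math.random = () => value;
  try {
    return fn();
  } finally {
    Math.random = original; // always restore, even if fn throws
  }
}

// Toy picker standing in for the code under test.
function pickIndex(length: number): number {
  return Math.floor(Math.random() * length);
}
```

Pinning the value makes the chosen index a known quantity, so the assertion always runs instead of depending on a retry loop landing on the right pick.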

Skipped: cubic-dev-ai's note about logging raw sessionId in
selectSession.ts — that file was deleted earlier in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: prettier format fix on persistence test

The new orphan-session test had a line that exceeded prettier's wrap
width. Auto-format fixed it; format-check now clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sweetmantech added a commit that referenced this pull request May 5, 2026
* refactor(sandbox): callers use open-agents abstraction (Phase 2.2) (#509)

* refactor(sandbox): callers use open-agents abstraction (Phase 2.2)

Replaces direct @vercel/sandbox SDK calls with the open-agents sandbox
abstraction layer (inlined in Phase 2.1) for sandbox lifecycle (create
+ reconnect). HTTP response shapes preserved exactly.

Per the agreed Option B (hybrid): only the lifecycle creator helpers
get refactored. installClaudeCode / runClaudeCode / getSandboxStatus
stay on the SDK directly because the abstraction does not cover their
needs (sudo, stdout/stderr streaming, simple status reads). Those
two install/run files are also dead orphans (defined but never called)
and will be removed entirely after the full migration.

Production refactor:
  createSandbox.ts            Sandbox.create(...) -> VercelSandbox.create(...)
                              Input: VercelSandboxConfig (was SDK params)
                              Snapshot trigger: restoreSnapshotId field
                                (was source: { type: "snapshot", ... })
                              Returns VercelSandbox (was SDK Sandbox)
  createSandboxWithFallback.ts cascade — passes restoreSnapshotId to createSandbox
  createSandboxFromSnapshot.ts type cascade only (Sandbox -> VercelSandbox)
  getActiveSandbox.ts         Sandbox.get({name}) -> VercelSandbox.connect(name, {})
                              Status check: sandbox.status -> sandbox.sdkStatus
  getOrCreateSandbox.ts       no code change — type cascades automatically
  processCreateSandbox.ts     reads sandbox.sdkStatus instead of sandbox.status
                              defensive nullish on createdAt
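
The change in snapshot trigger can be pictured as a small mapping from the old SDK-style params to the abstraction's config. This is only a sketch: `SdkCreateParams`, `AbstractionConfig`, and `toAbstractionConfig` are hypothetical local names, and the real `VercelSandboxConfig` likely carries more fields than shown here.

```typescript
// Hypothetical local shapes for illustration; the real SDK params and
// VercelSandboxConfig types may differ.
type SdkCreateParams = {
  timeout?: number;
  source?: { type: "snapshot"; snapshotId: string };
};

type AbstractionConfig = {
  timeout?: number;
  restoreSnapshotId?: string;
};

// Translate the old `source: { type: "snapshot", ... }` trigger into the
// flat `restoreSnapshotId` field described in the table above.
function toAbstractionConfig(params: SdkCreateParams): AbstractionConfig {
  const config: AbstractionConfig = {};
  if (params.timeout !== undefined) {
    config.timeout = params.timeout;
  }
  if (params.source?.type === "snapshot") {
    config.restoreSnapshotId = params.source.snapshotId;
  }
  return config;
}
```

A fresh (non-snapshot) create simply omits `restoreSnapshotId` in this sketch.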

Abstraction extension:
  vercel/sandbox/VercelSandbox.ts adds two readonly getters following
  the existing host/environmentDetails/expiresAt pattern:
    get sdkStatus(): string  — raw SDK session status (running/pending/
                                stopped/failed/aborted/snapshotting),
                                distinct from the abstraction's normalized
                                status getter
    get createdAt(): Date | undefined  — SDK session.createdAt

  These give api callers what they need to construct the existing
  HTTP response shape without breaking the abstraction's interface.

Tests updated:
  createSandbox.test.ts            mocks VercelSandbox.create instead of
                                    Sandbox.create; mock object uses
                                    sdkStatus instead of status
  createSandboxWithFallback.test.ts asserts restoreSnapshotId pass-through
  getActiveSandbox.test.ts         mocks VercelSandbox.connect; sdkStatus
                                    on mock objects
  processCreateSandbox.test.ts     mockSandbox uses sdkStatus

Verification:
  - pnpm lint:check: clean
  - pnpm test: 2391/2391 pass
  - HTTP response shape unchanged: same fields, same enum values for
    sandboxStatus (sourced from the SDK now via sdkStatus, was directly
    via SDK Sandbox.status before — identical strings either way)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR #509 review feedback

Three real issues from CodeRabbit + cubic:

1. createdAt staleness (CodeRabbit minor)
   The new `createdAt` getter on VercelSandbox skipped the
   `refreshStateFromCurrentSession()` step that `sdkStatus` uses, so
   readers right after a reconnect could see stale session metadata.
   Add the refresh.

2. Fabricated createdAt (cubic P2)
   Both createSandbox.ts and processCreateSandbox.ts had a
   `?? new Date().toISOString()` fallback that fabricated creation
   metadata when sandbox.createdAt was missing. The SDK guarantees
   createdAt is populated for any reachable instance, so the fallback
   was both wrong (fabricates data) and unnecessary.

   Tighten the getter to return `Date` (not `Date | undefined`) and
   throw with an explicit "SDK contract violation" message if the
   field is missing — fail-fast surfaces a real contract bug instead
   of silently lying.

   Drop the `?? new Date()` fallbacks at both call sites.

3. Misleading snapshot-restore branching (CodeRabbit major)
   createSandbox.ts had two paths — a "snapshot" branch that omitted
   DEFAULT_VCPUS/DEFAULT_RUNTIME (intent: let snapshot dictate), and
   a "fresh" branch that applied defaults. But VercelSandbox.create
   internally defaults vcpus=4 and runtime="node22" regardless, so
   the omission was a no-op — the abstraction always forwarded those
   to the SDK.

   Drop the misleading branching. Document the actual behavior at
   the top of createSandbox: "VercelSandbox.create applies its own
   defaults regardless of source — those apply to the runtime
   resources of the new sandbox even when restoring from a snapshot."

   Updated the snapshot-restore test to assert the actual call shape
   (vcpus + runtime + timeout + restoreSnapshotId) instead of just
   the original SDK-style truncated args.
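
The fail-fast `createdAt` getter from item 2 might look roughly like the following. It is a minimal sketch under assumptions: `SessionLike`, `SandboxHandle`, and `loadSession` are hypothetical stand-ins, and the private `refresh()` only mimics what `refreshStateFromCurrentSession()` is described as doing.

```typescript
// Hypothetical stand-in for the SDK session object.
interface SessionLike {
  status: string;
  createdAt?: Date;
}

class SandboxHandle {
  private session: SessionLike;
  private readonly loadSession: () => SessionLike;

  constructor(loadSession: () => SessionLike) {
    this.loadSession = loadSession;
    this.session = loadSession();
  }

  // Stand-in for refreshStateFromCurrentSession(): re-read session metadata.
  private refresh(): void {
    this.session = this.loadSession();
  }

  get sdkStatus(): string {
    this.refresh();
    return this.session.status;
  }

  // Returns Date (not Date | undefined) and throws instead of fabricating a value.
  get createdAt(): Date {
    this.refresh();
    const createdAt = this.session.createdAt;
    if (createdAt === undefined) {
      throw new Error("SDK contract violation: session.createdAt is missing");
    }
    return createdAt;
  }
}
```

With this shape, call sites can read `sandbox.createdAt.toISOString()` directly, without a `??` fallback.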

Verification:
- pnpm lint:check: clean
- pnpm test: 2391/2391 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(sandbox): delete dead Claude Code helpers (Phase 2.3) (#512)

* chore(sandbox): delete dead Claude Code helpers (Phase 2.3)

installClaudeCode and runClaudeCode were defined but never imported
anywhere in api production code — confirmed by grep on main:

  $ grep -rn "installClaudeCode\b\|runClaudeCode\b" lib/ app/
  lib/sandbox/installClaudeCode.ts:9: export async function installClaudeCode(...)
  lib/sandbox/runClaudeCode.ts:10:    export async function runClaudeCode(...)

Both files were skipped during the Phase 2.2 abstraction refactor
(per the agreed Option B — they used SDK features the abstraction
doesn't expose: sudo, stdout/stderr streaming, batched writes). With
the broader migration moving to Vercel Workflow + open-agents' agent
package for sandbox bootstrap, these orphans have no path to being
called again.

Removed:
  lib/sandbox/installClaudeCode.ts                (32 lines)
  lib/sandbox/runClaudeCode.ts                    (29 lines)
  lib/sandbox/__tests__/installClaudeCode.test.ts (4 tests)
  lib/sandbox/__tests__/runClaudeCode.test.ts     (6 tests)

Verification:
  - pnpm lint:check: clean
  - pnpm test: 2381/2381 pass (was 2391 — net -10 tests from the
    two deleted test files)

Note: getOrCreateSandbox.ts also has zero importers per the audit
and is similarly dead, but is intentionally NOT deleted in this PR
since it was not explicitly flagged as orphan in the migration plan.
Worth a separate follow-up decision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(sandbox): also delete getOrCreateSandbox + getActiveSandbox (YAGNI)

Cascade audit found two more truly-dead helpers per YAGNI:

  getOrCreateSandbox.ts    0 importers (self-only references)
  getActiveSandbox.ts      only called by getOrCreateSandbox — orphan
                            once that goes

Removed:
  lib/sandbox/getOrCreateSandbox.ts                (39 lines)
  lib/sandbox/getActiveSandbox.ts                  (33 lines)
  lib/sandbox/__tests__/getOrCreateSandbox.test.ts (3 tests)
  lib/sandbox/__tests__/getActiveSandbox.test.ts   (4 tests)

Live consumers of related helpers preserved:
  - createSandboxFromSnapshot still used by processCreateSandbox
  - selectAccountSandboxes still used by aggregateAccountSandboxStats,
    buildGetSandboxesParams, getSandboxesHandler, validateGetSandboxesRequest

Verification:
  - pnpm lint:check: clean
  - pnpm test: 2374/2374 pass (was 2381 — net -7 from the two deleted
    test files; -3 from getOrCreateSandbox.test.ts + -4 from
    getActiveSandbox.test.ts)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sessions): port POST /api/sessions from open-agents (#515)

* feat(sessions): port GET /api/sessions/[sessionId] from open-agents (Phase 2.4 — first route)

First route in the route-by-route cutover plan. Strategy: open-agents
frontend stays unchanged in shape; api ports each route it calls in
priority order (simplest first), and the open-agents frontend gets
cut over to api one route at a time.

Why this route first:
- Pure DB read (single-row select by id) — no agent runner, no Vercel
  Workflow, no sandbox runtime
- Hits sessions table already migrated in database PR #20
- Frontend usage: agents-frontend hits /api/sessions/{id} on session
  detail page navigation
- Smallest possible blast radius for proving the cutover pattern

Files added:
  lib/supabase/sessions/selectSession.ts  Single-row helper + SessionRow
                                          type (hand-typed; database.types.ts
                                          regen pending — flagged in code
                                          comment)
  app/api/sessions/[sessionId]/route.ts   GET handler matching open-agents
                                          response shape exactly (camelCase
                                          fields, "userId" preserved on the
                                          wire even though stored as
                                          account_id internally)
  app/api/sessions/[sessionId]/__tests__/route.test.ts (5 tests)

Auth: validateAuthContext (Privy Bearer or x-api-key). Response codes
match open-agents: 200 happy path, 401 no auth, 403 not owner, 404 not
found.

Wire-format translation: snake_case Supabase row -> camelCase response,
with account_id surfaced as userId so the existing open-agents frontend
fetches with zero code changes. Translation lives at the route boundary
(toSessionResponse) where it is easy to remove once chat absorbs this
UI and we can switch to schema-natural naming.
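
As an illustration of that boundary translation, a mapper along these lines would do the job. The row is reduced to a hypothetical subset of columns (`id`, `account_id`, `title`, `status`, `lifecycle_state`); the real `toSessionResponse` presumably maps more fields.

```typescript
// Hypothetical subset of the sessions row (snake_case, as stored).
type SessionRow = {
  id: string;
  account_id: string;
  title: string;
  status: string;
  lifecycle_state: string;
};

// Wire shape (camelCase); account_id is surfaced as userId.
type SessionResponse = {
  id: string;
  userId: string;
  title: string;
  status: string;
  lifecycleState: string;
};

function toSessionResponse(row: SessionRow): SessionResponse {
  return {
    id: row.id,
    userId: row.account_id,
    title: row.title,
    status: row.status,
    lifecycleState: row.lifecycle_state,
  };
}
```

Keeping the rename in one function means the later switch to schema-natural naming touches a single file.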

Verification:
- pnpm lint:check: clean
- pnpm test: 2379/2379 pass (5 new for this route)

Up next:
- Cutover step (separate PR in open-agents): point the frontend at
  api's URL for this single route. Validate end-to-end before porting
  the next route.
- Next routes in priority order (still pure DB, no agent/workflow):
  GET /api/sessions (list with unread — needs Postgres RPC for the
  multi-table aggregation), GET /api/sessions/[id]/chats, GET
  /api/sessions/[id]/chats/[chatId].

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR review, SRP splits + use Tables<"sessions"> from regen'd types

Three review comments on PR #514:

1. SRP: extract toSessionResponse to its own file
   was: defined inline in app/api/sessions/[sessionId]/route.ts
   now: lib/sessions/toSessionResponse.ts (one exported fn per file)

2. SRP: add a handler function (mirroring api convention)
   was: GET handler logic inline in route.ts
   now: lib/sessions/getSessionByIdHandler.ts contains all the auth +
        ownership + DB lookup + response logic; route.ts is a thin
        shell that awaits options.params and delegates. Matches the
        pattern used by every other api route (e.g. socials/[id]/scrape,
        artists/[id]/...).

3. DRY: use existing db schema type
   was: hand-typed SessionRow interface in selectSession.ts
   now: Tables<"sessions"> from types/database.types.ts (regenerated
        via npx supabase gen types typescript --project-id ...
        --schema public)

The types regen also resolved the preview-build failure
("Type instantiation is excessively deep and possibly infinite") on
the .from("sessions") call: Supabase's type inference was choking
because the table was unknown to the generic.

Files added:
  lib/sessions/toSessionResponse.ts
  lib/sessions/getSessionByIdHandler.ts

Files modified:
  app/api/sessions/[sessionId]/route.ts        thin shell now
  app/api/sessions/[sessionId]/__tests__/
    route.test.ts                              type alias updated
  lib/supabase/sessions/selectSession.ts       Tables<"sessions">
  types/database.types.ts                      Supabase regen

Verification:
  - pnpm lint:check: clean
  - pnpm test: 2379/2379 pass (no test changes; same 5 route tests)
  - tsc compile clean (the local pnpm build progresses past compile
    into page-data collection where it fails on missing local env
    vars — Vercel preview will have those set, so the preview rebuild
    should now succeed)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): make 404/403 errors emit status:"error" for shape consistency

The 401 returned by validateAuthContext is shaped like
{status:"error", error:"..."}, but the 404/403 responses from this
handler returned only {error:"..."}. One endpoint with two error shapes
is inconsistent for clients, so all error responses now align on the
validateAuthContext shape.

Tests now assert the full error body, not just the status code.
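
A sketch of the aligned envelope, assuming a pure helper that decides between 404, 403, and 200. The names `resolveSession` and `errorBody` and the message strings are hypothetical; only the `{status:"error", error:"..."}` shape comes from the commit above.

```typescript
type ErrorBody = { status: "error"; error: string };
type SessionRow = { id: string; account_id: string };
type Result =
  | { httpStatus: 200; body: SessionRow }
  | { httpStatus: 403 | 404; body: ErrorBody };

// Every error response shares the same envelope as validateAuthContext's 401.
const errorBody = (message: string): ErrorBody => ({ status: "error", error: message });

function resolveSession(row: SessionRow | null, accountId: string): Result {
  if (row === null) {
    return { httpStatus: 404, body: errorBody("Session not found") };
  }
  if (row.account_id !== accountId) {
    return { httpStatus: 403, body: errorBody("Forbidden") };
  }
  return { httpStatus: 200, body: row };
}
```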

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sessions): port POST /api/sessions from open-agents

Implements the POST /api/sessions contract documented in
recoupable/docs PR #186 + #187. Creates a session row and an
initial chat row; rolls back the session if chat insert fails so
callers never observe an orphaned session.
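
The insert-then-rollback flow could be sketched as below. The `Deps` type, the row shapes, and `createSessionWithChat` are hypothetical; the real insertSession / insertChat helpers in api may have different signatures and error conventions.

```typescript
// Hypothetical row shapes and injected helpers, for illustration only.
type SessionRow = { id: string; title: string };
type ChatRow = { id: string; session_id: string };

type Deps = {
  insertSession: (title: string) => Promise<SessionRow>;
  insertChat: (sessionId: string) => Promise<ChatRow>;
  deleteSession: (sessionId: string) => Promise<void>;
};

async function createSessionWithChat(
  deps: Deps,
  title: string,
): Promise<{ session: SessionRow; chat: ChatRow }> {
  const session = await deps.insertSession(title);
  try {
    const chat = await deps.insertChat(session.id);
    return { session, chat };
  } catch (chatError) {
    // Roll back so callers never observe a session without its initial chat.
    await deps.deleteSession(session.id);
    throw chatError;
  }
}
```

Injecting the three helpers keeps the sketch testable without a database; the handler would map the rethrown error to a 500.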

Auth: validateAuthContext (Privy Bearer or x-api-key).
Validation: Zod schema + GitHub repo segment regex. Body is
optional — empty body creates a session with sensible defaults
(status=running, lifecycle_state=provisioning, sandbox_state.type=
vercel, title="New session").

Out of scope (will follow once database catches up):
  auto_commit_push_override, auto_create_pr_override, pr_number,
  pr_status — these columns don't yet exist on api's sessions
  table, so the docs spec was trimmed accordingly in docs PR #187.

TDD: 9 handler tests cover 401, 400 (sandboxType / repoOwner /
repoName), 200 happy path, branch generation, title pass-through,
500 (insertSession failure), and 500-with-rollback (insertChat
failure). Plus 1 thin test on the route shell.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sessions): add OPTIONS handler + cache directives to POST route

Match the convention from app/api/sessions/[sessionId]/route.ts:
- OPTIONS handler returning 200 + CORS headers (preflight)
- dynamic="force-dynamic", fetchCache="force-no-store", revalidate=0

POST routes that mutate DB shouldn't be cached, and browsers issuing
preflight checks (POST with JSON body + custom auth headers) need
OPTIONS to respond.
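
For reference, a route file following that convention might look like this sketch. The three exports are Next.js route segment config options; the CORS header values are assumptions made for illustration, since the actual headers are not shown in this PR.

```typescript
// Next.js route segment config: opt this route out of caching.
export const dynamic = "force-dynamic";
export const fetchCache = "force-no-store";
export const revalidate = 0;

// Assumed CORS policy for illustration; the real header values may differ.
const CORS_HEADERS: Record<string, string> = {
  "Access-Control-Allow-Origin": "*",
  "Access-Control-Allow-Methods": "POST, OPTIONS",
  "Access-Control-Allow-Headers": "Content-Type, Authorization, x-api-key",
};

// Preflight handler: 200 with CORS headers and no body.
export function OPTIONS(): Response {
  return new Response(null, { status: 200, headers: CORS_HEADERS });
}
```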

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): address PR review feedback

- SRP: extract insert-row construction to lib/sessions/buildSessionInsertRow.ts
- YAGNI: drop generateSessionBranchName + isNewBranch handling (sessions
  commit to whatever branch the client provides; auto-generation was
  speculative)
- Tighten isValidGitHubRepoOwner: GitHub's actual rules are alphanumeric
  + hyphen only (no `_` or `.`), 1-39 chars, no leading/trailing or
  consecutive hyphens
- Tighten isValidGitHubRepoName: reject reserved `.` and `..`, reject
  `.git` suffix, cap at 100 chars
- Add unit tests for both validators (15 cases) and for the new
  buildSessionInsertRow (4 cases)
- Split createSessionHandler tests into auth/validation + persistence
  files; share fixtures via createSessionHandlerFixtures.ts. All test
  files now under 100 lines.
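
The tightened rules can be expressed compactly. This is a sketch, not the code from the PR: the regexes are one possible encoding of the rules listed above, and the repo-name character set (`A-Z a-z 0-9 . _ -`) is an assumption the commit message does not spell out.

```typescript
// Owner: alphanumeric + hyphen only, 1-39 chars, no leading/trailing or
// consecutive hyphens. A hyphen is only accepted when followed by an alphanumeric.
function isValidGitHubRepoOwner(owner: string): boolean {
  return /^[A-Za-z0-9](?:[A-Za-z0-9]|-(?=[A-Za-z0-9])){0,38}$/.test(owner);
}

// Repo name: reject reserved "." and "..", reject a ".git" suffix, cap at
// 100 chars. The allowed character set below is an assumption.
function isValidGitHubRepoName(name: string): boolean {
  if (name.length === 0 || name.length > 100) return false;
  if (name === "." || name === "..") return false;
  if (name.toLowerCase().endsWith(".git")) return false;
  return /^[A-Za-z0-9._-]+$/.test(name);
}
```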

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): address second round of PR review

- 500 message: "Failed to create session" → "Internal server error"
  (per cubic.dev standardized 500 envelope feedback)
- SRP: extract failedToCreateSession to lib/sessions/failedToCreateSession.ts
- YAGNI: drop repoOwner from request body and remove
  isValidGitHubRepoOwner helper entirely (recoupable is the only
  owner; no need to validate)
- YAGNI: drop repoName from request body and remove
  isValidGitHubRepoName helper (repo identity is derived server-side
  from the authenticated account, not accepted from user input)
- Single-export per file: split createSessionHandlerFixtures.ts into
  makeCreateSessionReq.ts, baseSessionRow.ts, baseChatRow.ts.
  okAuth constant inlined where used.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sessions): port random-city title fallback from open-agents

Generated session titles now match the open-agents UX — names like
"Anchorage", "Vienna", "Philadelphia" — instead of every untitled
session being called "New session". Closes a wire-shape gap with
open-agents production identified by the head-to-head test on PR.

Pieces:
- lib/sessions/cityNames.ts: ~200-city curated list (verbatim port)
- lib/sessions/getRandomCityName.ts: pick a city not in `usedNames`,
  numeric-suffix fallback when the curated list is exhausted
- lib/supabase/sessions/selectSessionTitlesByAccountId.ts: Supabase
  helper for collision avoidance
- lib/sessions/resolveSessionTitle.ts: orchestrates provided title
  (trimmed) > random city fallback. Async. Kept separate from the
  insert-row builder so that stays synchronous + pure.
- buildSessionInsertRow now takes `title` as a parameter
- createSessionHandler awaits resolveSessionTitle before building the
  row
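
The pick-with-fallback behaviour described for getRandomCityName can be sketched as follows. Several details are assumptions: the three-city list is a tiny stand-in for the curated one, the suffix format ("Vienna 2") is a guess, and the injectable `random` parameter is added here only to make the sketch easy to test deterministically.

```typescript
// Tiny stand-in for the ~200-city curated list.
const CITY_NAMES = ["Anchorage", "Vienna", "Philadelphia"] as const;

function getRandomCityName(
  usedNames: ReadonlySet<string>,
  random: () => number = Math.random,
): string {
  // Prefer any city the account has not used yet.
  const available = CITY_NAMES.filter((city) => !usedNames.has(city));
  const pick = available[Math.floor(random() * available.length)];
  if (pick !== undefined) {
    return pick;
  }
  // Curated list exhausted: pick a base city and increment a numeric
  // suffix until the name is unused (suffix format is an assumption).
  const base = CITY_NAMES[Math.floor(random() * CITY_NAMES.length)] ?? CITY_NAMES[0];
  let suffix = 2;
  while (usedNames.has(`${base} ${suffix}`)) {
    suffix += 1;
  }
  return `${base} ${suffix}`;
}
```

Passing a fixed `random` (or pinning `Math.random` in a test) makes the suffix-increment path deterministic instead of relying on a lucky random pick.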

TDD: 4 tests for getRandomCityName, 4 for resolveSessionTitle. Handler
tests updated to mock resolveSessionTitle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: remove GET-only files (scope this PR to POST)

The GET endpoint + handler + tests live in PR #514 and were
inadvertently brought in when this branch was rebased after #514's
work. This PR is scoped to POST only; GET ships in #514.

Shared infrastructure stays (types/database.types.ts regen +
lib/sessions/toSessionResponse.ts) — both are required by the POST
handler too. When either #514 or this PR merges to test first, the
other will see those files already present and resolve cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): consolidate request validation + DRY supabase select

Two reviewer asks rolled into one commit:

SRP — validateCreateSessionBody now owns the full validation flow.
The handler used to call safeParseJson, validateAuthContext, and the
Zod body schema separately; that was three places to short-circuit
and three places to duplicate the error envelope. Folded them into
validateCreateSessionBody so the handler does one call → success or
NextResponse error. Returns { body, auth } on success.

DRY — replaced lib/supabase/sessions/selectSession.ts and
selectSessionTitlesByAccountId.ts with a single
selectSessions({ id?, accountId? }) that supports both call sites.
resolveSessionTitle now derives titles from the general fetch.

Tests:
- New validateCreateSessionBody.test.ts covers auth-failure / 400 /
  success / malformed-JSON tolerance (4 cases)
- Handler tests now mock validateCreateSessionBody (single mock
  surface instead of three)
- resolveSessionTitle tests mock selectSessions

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): address automated review feedback

Four small fixes from the latest round:

1. Zod v4 migration: { message } → { error } on the sandboxType
   literal. v4 unified the error customization API; { message } is
   deprecated.

2. Orphan rollback observability: when insertChat fails AND the
   session-rollback delete also fails, log the session id so ops
   can detect orphaned rows. New persistence test asserts the log.

3. Defensive try/catch in selectSessions so a thrown exception
   (network-level rejection, not a Supabase {error} return) doesn't
   bubble up and 500 the entire session-creation flow.

4. Deterministic test for getRandomCityName suffix-increment: pin
   Math.random instead of looping until the random pick lands on
   baseCity. Previous test could pass without ever asserting if the
   loop cap was hit.

Skipped: cubic-dev-ai's note about logging raw sessionId in
selectSession.ts — that file was deleted earlier in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: prettier format fix on persistence test

The new orphan-session test had a line that exceeded prettier's wrap
width. Auto-format fixed it; format-check now clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sessions): port GET /api/sessions/[sessionId] from open-agents (#514)

Rebased onto current main (which now has the POST endpoint + shared
infra from PR #515). Three pieces of GET-specific work:

- app/api/sessions/[sessionId]/route.ts: thin shell delegating to the
  handler, plus OPTIONS for CORS preflight + cache directives
- lib/sessions/getSessionByIdHandler.ts: validates auth via
  validateAuthContext, reads via selectSessions({id}), enforces
  ownership (403 if account_id mismatch), 404 if missing
- app/api/sessions/[sessionId]/__tests__/route.test.ts: 5 cases —
  401 / 404 / 403 / 200 happy path / OPTIONS smoke

Uses the new general selectSessions({id}) reader rather than the
deleted single-purpose selectSession helper. All other shared infra
(types, toSessionResponse) is already on main from #515.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sweetmantech added a commit that referenced this pull request May 5, 2026
* refactor(sandbox): callers use open-agents abstraction (Phase 2.2) (#509)

* refactor(sandbox): callers use open-agents abstraction (Phase 2.2)

Replaces direct @vercel/sandbox SDK calls with the open-agents sandbox
abstraction layer (inlined in Phase 2.1) for sandbox lifecycle (create
+ reconnect). HTTP response shapes preserved exactly.

Per the agreed Option B (hybrid): only the lifecycle creator helpers
get refactored. installClaudeCode / runClaudeCode / getSandboxStatus
stay on the SDK directly because the abstraction does not cover their
needs (sudo, stdout/stderr streaming, simple status reads). Those
two install/run files are also dead orphans (defined but never called)
and will be removed entirely after the full migration.

Production refactor:
  createSandbox.ts            Sandbox.create(...) -> VercelSandbox.create(...)
                              Input: VercelSandboxConfig (was SDK params)
                              Snapshot trigger: restoreSnapshotId field
                                (was source: { type: "snapshot", ... })
                              Returns VercelSandbox (was SDK Sandbox)
  createSandboxWithFallback.ts cascade — passes restoreSnapshotId to createSandbox
  createSandboxFromSnapshot.ts type cascade only (Sandbox -> VercelSandbox)
  getActiveSandbox.ts         Sandbox.get({name}) -> VercelSandbox.connect(name, {})
                              Status check: sandbox.status -> sandbox.sdkStatus
  getOrCreateSandbox.ts       no code change — type cascades automatically
  processCreateSandbox.ts     reads sandbox.sdkStatus instead of sandbox.status
                              defensive nullish on createdAt

Abstraction extension:
  vercel/sandbox/VercelSandbox.ts adds two readonly getters following
  the existing host/environmentDetails/expiresAt pattern:
    get sdkStatus(): string  — raw SDK session status (running/pending/
                                stopped/failed/aborted/snapshotting),
                                distinct from the abstraction's normalized
                                status getter
    get createdAt(): Date | undefined  — SDK session.createdAt

  These give api callers what they need to construct the existing
  HTTP response shape without breaking the abstraction's interface.

Tests updated:
  createSandbox.test.ts            mocks VercelSandbox.create instead of
                                    Sandbox.create; mock object uses
                                    sdkStatus instead of status
  createSandboxWithFallback.test.ts asserts restoreSnapshotId pass-through
  getActiveSandbox.test.ts         mocks VercelSandbox.connect; sdkStatus
                                    on mock objects
  processCreateSandbox.test.ts     mockSandbox uses sdkStatus

Verification:
  - pnpm lint:check: clean
  - pnpm test: 2391/2391 pass
  - HTTP response shape unchanged: same fields, same enum values for
    sandboxStatus (sourced from the SDK now via sdkStatus, was directly
    via SDK Sandbox.status before — identical strings either way)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR #509 review feedback

Three real issues from CodeRabbit + cubic:

1. createdAt staleness (CodeRabbit minor)
   The new `createdAt` getter on VercelSandbox skipped the
   `refreshStateFromCurrentSession()` step that `sdkStatus` uses, so
   readers right after a reconnect could see stale session metadata.
   Add the refresh.

2. Fabricated createdAt (cubic P2)
   Both createSandbox.ts and processCreateSandbox.ts had a
   `?? new Date().toISOString()` fallback that fabricated creation
   metadata when sandbox.createdAt was missing. The SDK guarantees
   createdAt is populated for any reachable instance, so the fallback
   was both wrong (fabricates data) and unnecessary.

   Tighten the getter to return `Date` (not `Date | undefined`) and
   throw with an explicit "SDK contract violation" message if the
   field is missing — fail-fast surfaces a real contract bug instead
   of silently lying.

   Drop the `?? new Date()` fallbacks at both call sites.

3. Misleading snapshot-restore branching (CodeRabbit major)
   createSandbox.ts had two paths — a "snapshot" branch that omitted
   DEFAULT_VCPUS/DEFAULT_RUNTIME (intent: let snapshot dictate), and
   a "fresh" branch that applied defaults. But VercelSandbox.create
   internally defaults vcpus=4 and runtime="node22" regardless, so
   the omission was a no-op — the abstraction always forwarded those
   to the SDK.

   Drop the misleading branching. Document the actual behavior at
   the top of createSandbox: "VercelSandbox.create applies its own
   defaults regardless of source — those apply to the runtime
   resources of the new sandbox even when restoring from a snapshot."

   Updated the snapshot-restore test to assert the actual call shape
   (vcpus + runtime + timeout + restoreSnapshotId) instead of just
   the original SDK-style truncated args.

Verification:
- pnpm lint:check: clean
- pnpm test: 2391/2391 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(sandbox): delete dead Claude Code helpers (Phase 2.3) (#512)

* chore(sandbox): delete dead Claude Code helpers (Phase 2.3)

installClaudeCode and runClaudeCode were defined but never imported
anywhere in api production code — confirmed by grep on main:

  $ grep -rn "installClaudeCode\b\|runClaudeCode\b" lib/ app/
  lib/sandbox/installClaudeCode.ts:9: export async function installClaudeCode(...)
  lib/sandbox/runClaudeCode.ts:10:    export async function runClaudeCode(...)

Both files were skipped during the Phase 2.2 abstraction refactor
(per the agreed Option B — they used SDK features the abstraction
doesn't expose: sudo, stdout/stderr streaming, batched writes). With
the broader migration moving to Vercel Workflow + open-agents' agent
package for sandbox bootstrap, these orphans have no path to being
called again.

Removed:
  lib/sandbox/installClaudeCode.ts                (32 lines)
  lib/sandbox/runClaudeCode.ts                    (29 lines)
  lib/sandbox/__tests__/installClaudeCode.test.ts (4 tests)
  lib/sandbox/__tests__/runClaudeCode.test.ts     (6 tests)

Verification:
  - pnpm lint:check: clean
  - pnpm test: 2381/2381 pass (was 2391 — net -10 tests from the
    two deleted test files)

Note: getOrCreateSandbox.ts also has zero importers per the audit
and is similarly dead, but is intentionally NOT deleted in this PR
since it was not explicitly flagged as orphan in the migration plan.
Worth a separate follow-up decision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(sandbox): also delete getOrCreateSandbox + getActiveSandbox (YAGNI)

Cascade audit found two more truly-dead helpers per YAGNI:

  getOrCreateSandbox.ts    0 importers (self-only references)
  getActiveSandbox.ts      only called by getOrCreateSandbox — orphan
                            once that goes

Removed:
  lib/sandbox/getOrCreateSandbox.ts                (39 lines)
  lib/sandbox/getActiveSandbox.ts                  (33 lines)
  lib/sandbox/__tests__/getOrCreateSandbox.test.ts (3 tests)
  lib/sandbox/__tests__/getActiveSandbox.test.ts   (4 tests)

Live consumers of related helpers preserved:
  - createSandboxFromSnapshot still used by processCreateSandbox
  - selectAccountSandboxes still used by aggregateAccountSandboxStats,
    buildGetSandboxesParams, getSandboxesHandler, validateGetSandboxesRequest

Verification:
  - pnpm lint:check: clean
  - pnpm test: 2374/2374 pass (was 2381 — net -7 from the two deleted
    test files; -3 from getOrCreateSandbox.test.ts + -4 from
    getActiveSandbox.test.ts)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sessions): port POST /api/sessions from open-agents (#515)

* feat(sessions): port GET /api/sessions/[sessionId] from open-agents (Phase 2.4 — first route)

First route in the route-by-route cutover plan. Strategy: open-agents
frontend stays unchanged in shape; api ports each route it calls in
priority order (simplest first), and the open-agents frontend gets
cut over to api one route at a time.

Why this route first:
- Pure DB read (single-row select by id) — no agent runner, no Vercel
  Workflow, no sandbox runtime
- Hits sessions table already migrated in database PR #20
- Frontend usage: agents-frontend hits /api/sessions/{id} on session
  detail page navigation
- Smallest possible blast radius for proving the cutover pattern

Files added:
  lib/supabase/sessions/selectSession.ts  Single-row helper + SessionRow
                                          type (hand-typed; database.types.ts
                                          regen pending — flagged in code
                                          comment)
  app/api/sessions/[sessionId]/route.ts   GET handler matching open-agents
                                          response shape exactly (camelCase
                                          fields, "userId" preserved on the
                                          wire even though stored as
                                          account_id internally)
  app/api/sessions/[sessionId]/__tests__/route.test.ts (5 tests)

Auth: validateAuthContext (Privy Bearer or x-api-key). Response codes
match open-agents: 200 happy path, 401 no auth, 403 not owner, 404 not
found.

Wire-format translation: snake_case Supabase row -> camelCase response,
with account_id surfaced as userId so the existing open-agents frontend
fetches with zero code changes. Translation lives at the route boundary
(toSessionResponse) where it is easy to remove once chat absorbs this
UI and we can switch to schema-natural naming.

Verification:
- pnpm lint:check: clean
- pnpm test: 2379/2379 pass (5 new for this route)

Up next:
- Cutover step (separate PR in open-agents): point the frontend at
  api's URL for this single route. Validate end-to-end before porting
  the next route.
- Next routes in priority order (still pure DB, no agent/workflow):
  GET /api/sessions (list with unread — needs Postgres RPC for the
  multi-table aggregation), GET /api/sessions/[id]/chats, GET
  /api/sessions/[id]/chats/[chatId].

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR review, SRP splits + use Tables<"sessions"> from regen'd types

Three review comments on PR #514:

1. SRP: extract toSessionResponse to its own file
   was: defined inline in app/api/sessions/[sessionId]/route.ts
   now: lib/sessions/toSessionResponse.ts (one exported fn per file)

2. SRP: add a handler function (mirroring api convention)
   was: GET handler logic inline in route.ts
   now: lib/sessions/getSessionByIdHandler.ts contains all the auth +
        ownership + DB lookup + response logic; route.ts is a thin
        shell that awaits options.params and delegates. Matches the
        pattern used by every other api route (e.g. socials/[id]/scrape,
        artists/[id]/...).

3. DRY: use existing db schema type
   was: hand-typed SessionRow interface in selectSession.ts
   now: Tables<"sessions"> from types/database.types.ts (regenerated
        via npx supabase gen types typescript --project-id ...
        --schema public)

The types regen also resolved the preview-build failure
("Type instantiation is excessively deep and possibly infinite") on
the .from("sessions") call: Supabase's type inference was choking
because the table was unknown to the generic.

Files added:
  lib/sessions/toSessionResponse.ts
  lib/sessions/getSessionByIdHandler.ts

Files modified:
  app/api/sessions/[sessionId]/route.ts        thin shell now
  app/api/sessions/[sessionId]/__tests__/
    route.test.ts                              type alias updated
  lib/supabase/sessions/selectSession.ts       Tables<"sessions">
  types/database.types.ts                      Supabase regen

Verification:
  - pnpm lint:check: clean
  - pnpm test: 2379/2379 pass (no test changes; same 5 route tests)
  - tsc compile clean (the local pnpm build progresses past compile
    into page-data collection where it fails on missing local env
    vars — Vercel preview will have those set, so the preview rebuild
    should now succeed)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): make 404/403 errors emit status:"error" for shape consistency

The 401 returned by validateAuthContext is shaped like
{status:"error", error:"..."}, but the 404/403 from this handler returned
{error:"..."} only. Same endpoint, two error shapes: inconsistent for
clients. Align all error responses on the validateAuthContext shape.

Tests now assert the full error body, not just the status code.
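
A minimal sketch of the aligned envelope, assuming a small shared builder. The `{status:"error", error}` shape comes from the message above; the `buildErrorEnvelope` name and the example message text are hypothetical.

```typescript
// The shape every error response is aligned on.
interface ErrorEnvelope {
  status: "error";
  error: string;
}

// Hypothetical helper: one builder so 401, 403 and 404 cannot drift apart.
function buildErrorEnvelope(message: string): ErrorEnvelope {
  return { status: "error", error: message };
}

// Illustrative usage; the message wording is an assumption.
const notFoundBody = buildErrorEnvelope("Session not found");
const forbiddenBody = buildErrorEnvelope("Forbidden");
```

Asserting on the whole body in tests, rather than only the status code, is what catches a handler that forgets the `status` field.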

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sessions): port POST /api/sessions from open-agents

Implements the POST /api/sessions contract documented in
recoupable/docs PR #186 + #187. Creates a session row and an
initial chat row; rolls back the session if chat insert fails so
callers never observe an orphaned session.
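
The create-then-rollback flow can be sketched roughly as below. This is an illustration under assumptions, not the handler itself: the `SessionStoreSketch` interface, the `deleteSession` name and the result type are hypothetical, and the real code talks to Supabase rather than an injected store.

```typescript
// Hypothetical persistence surface, injected so the flow is testable.
interface SessionStoreSketch {
  insertSession(title: string): Promise<{ id: string } | null>;
  insertChat(sessionId: string): Promise<{ id: string } | null>;
  deleteSession(sessionId: string): Promise<boolean>;
}

type CreateSessionOutcome =
  | { ok: true; sessionId: string; chatId: string }
  | { ok: false; orphanedSessionId?: string };

async function createSessionWithInitialChat(
  store: SessionStoreSketch,
  title: string,
): Promise<CreateSessionOutcome> {
  const session = await store.insertSession(title);
  if (!session) return { ok: false };

  const chat = await store.insertChat(session.id);
  if (chat) return { ok: true, sessionId: session.id, chatId: chat.id };

  // Chat insert failed: remove the session so callers never see it.
  const rolledBack = await store.deleteSession(session.id);
  // If the rollback itself fails, surface the id so the caller can log it.
  return rolledBack ? { ok: false } : { ok: false, orphanedSessionId: session.id };
}
```

Returning the orphaned id instead of throwing keeps the decision about logging in the caller.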

Auth: validateAuthContext (Privy Bearer or x-api-key).
Validation: Zod schema + GitHub repo segment regex. Body is
optional — empty body creates a session with sensible defaults
(status=running, lifecycle_state=provisioning, sandbox_state.type=
vercel, title="New session").

Out of scope (will follow once database catches up):
  auto_commit_push_override, auto_create_pr_override, pr_number,
  pr_status — these columns don't yet exist on api's sessions
  table, so the docs spec was trimmed accordingly in docs PR #187.

TDD: 9 handler tests cover 401, 400 (sandboxType / repoOwner /
repoName), 200 happy path, branch generation, title pass-through,
500 (insertSession failure), and 500-with-rollback (insertChat
failure). Plus 1 thin test on the route shell.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sessions): add OPTIONS handler + cache directives to POST route

Match the convention from app/api/sessions/[sessionId]/route.ts:
- OPTIONS handler returning 200 + CORS headers (preflight)
- dynamic="force-dynamic", fetchCache="force-no-store", revalidate=0

POST routes that mutate DB shouldn't be cached, and browsers issuing
preflight checks (POST with JSON body + custom auth headers) need
OPTIONS to respond.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): address PR review feedback

- SRP: extract insert-row construction to lib/sessions/buildSessionInsertRow.ts
- YAGNI: drop generateSessionBranchName + isNewBranch handling (sessions
  commit to whatever branch the client provides; auto-generation was
  speculative)
- Tighten isValidGitHubRepoOwner: GitHub's actual rules are alphanumeric
  + hyphen only (no `_` or `.`), 1-39 chars, no leading/trailing or
  consecutive hyphens
- Tighten isValidGitHubRepoName: reject reserved `.` and `..`, reject
  `.git` suffix, cap at 100 chars
- Add unit tests for both validators (15 cases) and for the new
  buildSessionInsertRow (4 cases)
- Split createSessionHandler tests into auth/validation + persistence
  files; share fixtures via createSessionHandlerFixtures.ts. All test
  files now under 100 lines.
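
The tightened validator rules listed above could look roughly like this. The sketch encodes only the rules stated in this commit message; the `Sketch` function names are hypothetical, and the allowed character set for repo names (`A-Z a-z 0-9 . _ -`) is an assumption, since the message does not spell it out.

```typescript
// Owner: alphanumeric + hyphen, no leading/trailing/consecutive hyphens.
const OWNER_PATTERN = /^[A-Za-z0-9](?:-?[A-Za-z0-9])*$/;

function isValidGitHubRepoOwnerSketch(owner: string): boolean {
  return owner.length >= 1 && owner.length <= 39 && OWNER_PATTERN.test(owner);
}

// Assumed character set for repo names (not stated in the commit message).
const REPO_NAME_PATTERN = /^[A-Za-z0-9._-]+$/;

function isValidGitHubRepoNameSketch(repoName: string): boolean {
  if (repoName.length < 1 || repoName.length > 100) return false;
  if (repoName === "." || repoName === "..") return false; // reserved names
  if (repoName.endsWith(".git")) return false; // reject clone-URL suffix
  return REPO_NAME_PATTERN.test(repoName);
}
```

The owner pattern rejects consecutive hyphens structurally: every optional hyphen must be followed by an alphanumeric character, so no separate check is needed.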

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): address second round of PR review

- 500 message: "Failed to create session" → "Internal server error"
  (per cubic.dev standardized 500 envelope feedback)
- SRP: extract failedToCreateSession to lib/sessions/failedToCreateSession.ts
- YAGNI: drop repoOwner from request body and remove
  isValidGitHubRepoOwner helper entirely (recoupable is the only
  owner; no need to validate)
- YAGNI: drop repoName from request body and remove
  isValidGitHubRepoName helper (repo identity is derived server-side
  from the authenticated account, not accepted from user input)
- Single-export per file: split createSessionHandlerFixtures.ts into
  makeCreateSessionReq.ts, baseSessionRow.ts, baseChatRow.ts.
  okAuth constant inlined where used.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sessions): port random-city title fallback from open-agents

Generated session titles now match the open-agents UX — names like
"Anchorage", "Vienna", "Philadelphia" — instead of every untitled
session being called "New session". Closes a wire-shape gap with
open-agents production identified by the head-to-head test on PR.

Pieces:
- lib/sessions/cityNames.ts: ~200-city curated list (verbatim port)
- lib/sessions/getRandomCityName.ts: pick a city not in `usedNames`,
  numeric-suffix fallback when the curated list is exhausted
- lib/supabase/sessions/selectSessionTitlesByAccountId.ts: Supabase
  helper for collision avoidance
- lib/sessions/resolveSessionTitle.ts: orchestrates provided title
  (trimmed) > random city fallback. Async. Kept separate from the
  insert-row builder so that stays synchronous + pure.
- buildSessionInsertRow now takes `title` as a parameter
- createSessionHandler awaits resolveSessionTitle before building the
  row
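
A rough sketch of the pick-then-suffix behaviour described for getRandomCityName. The signature, the injectable `random` parameter and the `"City 2"` suffix format are assumptions made for illustration; the real helper and the ~200-city list live in the files named above.

```typescript
// Pick one element using the supplied random source; undefined if the list is empty.
function pickFrom(list: readonly string[], random: () => number): string | undefined {
  return list[Math.floor(random() * list.length)];
}

function getRandomCityNameSketch(
  usedNames: ReadonlySet<string>,
  cities: readonly string[],
  random: () => number = Math.random,
): string {
  // Prefer a city that has not been used yet.
  const fresh = pickFrom(cities.filter((city) => !usedNames.has(city)), random);
  if (fresh !== undefined) return fresh;

  // Curated list exhausted: fall back to "<city> <n>" with the lowest free n.
  const base = pickFrom(cities, random);
  if (base === undefined) throw new Error("city list is empty");
  let suffix = 2; // assumed starting suffix
  while (usedNames.has(`${base} ${suffix}`)) suffix += 1;
  return `${base} ${suffix}`;
}
```

Accepting the random source as a parameter is one way to keep the function deterministic under test without patching globals.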

TDD: 4 tests for getRandomCityName, 4 for resolveSessionTitle. Handler
tests updated to mock resolveSessionTitle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: remove GET-only files (scope this PR to POST)

The GET endpoint + handler + tests live in PR #514 and were
inadvertently brought in when this branch was rebased after #514's
work. This PR is scoped to POST only; GET ships in #514.

Shared infrastructure stays (types/database.types.ts regen +
lib/sessions/toSessionResponse.ts) — both are required by the POST
handler too. When either #514 or this PR merges to test first, the
other will see those files already present and resolve cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): consolidate request validation + DRY supabase select

Two reviewer asks rolled into one commit:

SRP — validateCreateSessionBody now owns the full validation flow.
The handler used to call safeParseJson, validateAuthContext, and the
Zod body schema separately; that was three places to short-circuit
and three places to duplicate the error envelope. Folded them into
validateCreateSessionBody so the handler does one call → success or
NextResponse error. Returns { body, auth } on success.
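
The "one call, success or error" contract can be sketched with a discriminated union. This is an illustration only: the real function returns a NextResponse on failure and uses Zod, whereas this sketch uses plain objects and hand-written checks; the type names, the order of the auth and body checks, and the error strings are assumptions.

```typescript
type AuthSketch = { accountId: string };
type CreateSessionBodySketch = { title?: string; sandboxType?: "vercel" };

type ValidationOutcome =
  | { ok: true; body: CreateSessionBodySketch; auth: AuthSketch }
  | { ok: false; httpStatus: 400 | 401; error: string };

function validateCreateSessionBodySketch(
  rawBody: string,
  auth: AuthSketch | null,
): ValidationOutcome {
  if (auth === null) return { ok: false, httpStatus: 401, error: "Unauthorized" };

  // Body is optional: empty or malformed JSON is treated as an empty object.
  let parsed: unknown = {};
  try {
    parsed = rawBody.trim() === "" ? {} : JSON.parse(rawBody);
  } catch {
    parsed = {};
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) parsed = {};
  const record = parsed as Record<string, unknown>;

  const sandboxType = record["sandboxType"];
  if (sandboxType !== undefined && sandboxType !== "vercel") {
    return { ok: false, httpStatus: 400, error: "Invalid sandboxType" };
  }
  const title = record["title"];
  if (title !== undefined && typeof title !== "string") {
    return { ok: false, httpStatus: 400, error: "Invalid title" };
  }

  const body: CreateSessionBodySketch = {};
  if (typeof title === "string") body.title = title;
  if (sandboxType === "vercel") body.sandboxType = "vercel";
  return { ok: true, body, auth };
}
```

With a single entry point the handler has one branch to short-circuit on, and tests need one mock instead of three.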

DRY — replaced lib/supabase/sessions/selectSession.ts and
selectSessionTitlesByAccountId.ts with a single
selectSessions({ id?, accountId? }) that supports both call sites.
resolveSessionTitle now derives titles from the general fetch.

Tests:
- New validateCreateSessionBody.test.ts covers auth-failure / 400 /
  success / malformed-JSON tolerance (4 cases)
- Handler tests now mock validateCreateSessionBody (single mock
  surface instead of three)
- resolveSessionTitle tests mock selectSessions

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sessions): address automated review feedback

Four small fixes from the latest round:

1. Zod v4 migration: { message } → { error } on the sandboxType
   literal. v4 unified the error customization API; { message } is
   deprecated.

2. Orphan rollback observability: when insertChat fails AND the
   session-rollback delete also fails, log the session id so ops
   can detect orphaned rows. New persistence test asserts the log.

3. Defensive try/catch in selectSessions so a thrown exception
   (network-level rejection, not a Supabase {error} return) doesn't
   bubble up and 500 the entire session-creation flow.

4. Deterministic test for getRandomCityName suffix-increment: pin
   Math.random instead of looping until the random pick lands on
   baseCity. Previous test could pass without ever asserting if the
   loop cap was hit.
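
   For item 4, pinning Math.random can be sketched without assuming any particular test framework. The `withPinnedRandom` helper below is hypothetical; the actual test may use the test runner's own spy or mock utilities instead.

```typescript
// Temporarily replace Math.random with a constant, then always restore it.
function withPinnedRandom<T>(value: number, run: () => T): T {
  const original = Math.random;
  Math.random = () => value;
  try {
    return run();
  } finally {
    Math.random = original;
  }
}

// Illustrative usage: the index chosen from a 4-element list is now fixed.
const pinnedIndex = withPinnedRandom(0.5, () => Math.floor(Math.random() * 4));
```

   Because the random pick is fixed, the assertion always runs, which removes the "loop until it happens" pattern and its silent-pass failure mode.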

Skipped: cubic-dev-ai's note about logging raw sessionId in
selectSession.ts — that file was deleted earlier in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: prettier format fix on persistence test

The new orphan-session test had a line that exceeded prettier's wrap
width. Auto-format fixed it; format-check now clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sessions): port GET /api/sessions/[sessionId] from open-agents (#514)

Rebased onto current main (which now has the POST endpoint + shared
infra from PR #515). Three pieces of GET-specific work:

- app/api/sessions/[sessionId]/route.ts: thin shell delegating to the
  handler, plus OPTIONS for CORS preflight + cache directives
- lib/sessions/getSessionByIdHandler.ts: validates auth via
  validateAuthContext, reads via selectSessions({id}), enforces
  ownership (403 if account_id mismatch), 404 if missing
- app/api/sessions/[sessionId]/__tests__/route.test.ts: 5 cases —
  401 / 404 / 403 / 200 happy path / OPTIONS smoke
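
The status decision made by the GET handler can be sketched as a pure function. The ordering (auth, then existence, then ownership) is an assumption inferred from the five test cases listed above; `decideSessionAccess` and its parameter shapes are hypothetical.

```typescript
type SessionAccessStatus = 200 | 401 | 403 | 404;

function decideSessionAccess(
  callerAccountId: string | null, // null when authentication failed
  session: { account_id: string } | null, // null when no row was found
): SessionAccessStatus {
  if (callerAccountId === null) return 401;
  if (session === null) return 404;
  if (session.account_id !== callerAccountId) return 403;
  return 200;
}
```

Separating the decision from the I/O makes each of the four outcomes a one-line test.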

Uses the new general selectSessions({id}) reader rather than the
deleted single-purpose selectSession helper. All other shared infra
(types, toSessionResponse) is already on main from #515.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ai/models): enrich response with context_window + cost (#518)

* feat(ai/models): enrich response with context_window + cost from models.dev

api's GET /api/ai/models previously returned just the gateway entries.
Open-agents' frontend depends on two extra fields per model that come
from the public models.dev catalog:

  - context_window (integer) — gates model selection in the picker
  - cost ({input, output}) — per-million-token pricing for display

Adds three pure helpers (TDD'd individually) plus a small refactor of
the existing fetcher to merge metadata in:

  - lib/ai/parseModelsDevMetadata.ts: tolerant unknown→Map parser
  - lib/ai/fetchModelsDevMetadata.ts: 750ms-bounded fetch with full
    error swallowing (metadata is best-effort, must never gate the
    underlying gateway response)
  - lib/ai/enrichGatewayModel.ts: pure, non-mutating merge

getAvailableModels now fetches gateway + metadata in parallel and
maps each non-embed model through enrichGatewayModel. If models.dev
is unreachable the response is identical to today (gateway models
unenriched).
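
A rough sketch of the tolerant parse and the non-mutating merge. Everything about the catalog layout here is an assumption made for illustration (provider → `models` → model → `limit.context` and `cost.{input,output}`, keyed as `provider/model`); the `Sketch` names are hypothetical and the real helpers may differ.

```typescript
interface ModelMetadataSketch {
  context_window?: number;
  cost?: { input: number; output: number };
}

function isRecordSketch(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Tolerant parser: anything malformed is skipped rather than thrown on.
function parseModelsDevMetadataSketch(payload: unknown): Map<string, ModelMetadataSketch> {
  const result = new Map<string, ModelMetadataSketch>();
  if (!isRecordSketch(payload)) return result;
  for (const [providerId, provider] of Object.entries(payload)) {
    if (!isRecordSketch(provider)) continue;
    const models = provider["models"];
    if (!isRecordSketch(models)) continue;
    for (const [modelId, model] of Object.entries(models)) {
      if (!isRecordSketch(model)) continue;
      const meta: ModelMetadataSketch = {};
      const limit = model["limit"];
      if (isRecordSketch(limit) && typeof limit["context"] === "number") {
        meta.context_window = limit["context"];
      }
      const cost = model["cost"];
      if (isRecordSketch(cost)) {
        const input = cost["input"];
        const output = cost["output"];
        if (typeof input === "number" && typeof output === "number") {
          meta.cost = { input, output };
        }
      }
      result.set(`${providerId}/${modelId}`, meta);
    }
  }
  return result;
}

// Pure merge: returns a new object and leaves the gateway entry untouched.
function enrichGatewayModelSketch<T extends { id: string }>(
  model: T,
  metadata: ReadonlyMap<string, ModelMetadataSketch>,
): T & ModelMetadataSketch {
  const extra: ModelMetadataSketch = metadata.get(model.id) ?? {};
  return { ...model, ...extra };
}
```

With an empty map (for example when the metadata fetch times out), the merge returns a copy with the same fields, which matches the "identical to today" fallback described above.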

Documented in recoupable/docs#188 (merged). Unblocks the eventual
open-agents frontend cutover for the model picker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ai): extract isRecord into its own lib (SRP)

Per PR feedback: each file should export one primary function.
Pulled isRecord out of parseModelsDevMetadata.ts into
lib/ai/isRecord.ts so the parser file is single-purpose.

Also includes the typecheck fix for enrichGatewayModel — the
`[key: string]: unknown` index signature on its generic constraint
was rejecting `GatewayLanguageModelEntry` and breaking the Vercel
build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>