fix(sandbox): derive sandbox_expires_at from getState() (open-agents parity) #537
…iveLifecycleUpdate

Restores parity with open-agents' (working) `handleCreateSandboxRequest`, which used `buildActiveLifecycleUpdate(nextState)` to derive the lifecycle write, reading `expiresAt` from the state object that `sandbox.getState()` returns. api's `createSandboxHandler` was instead reading from the top-level `sandbox.expiresAt` handle property, which the SDK doesn't reliably set on prebuilt-snapshot creation paths (org-snapshot restore). For those provisions, api was writing `sandbox_expires_at: null` to the session row.

Cascading consequence: the lifecycle workflow's `evaluateSandboxLifecycle` interprets a null expiry as "no live runtime" and immediately writes `lifecycle_state: "hibernated"` on the row, within a few seconds of provision and before the user has a chance to use the sandbox. The chat UI then sticks on "Sandbox is initializing…" because status reports `no_sandbox`.

Side benefits picked up by switching to `buildActiveLifecycleUpdate`:
- `hibernate_after` is now set on provision (was being left null, forcing the lifecycle workflow to fall back to inactivity-based due time computed from `last_activity_at + INACTIVITY_TIMEOUT`).
- `lifecycle_error` is explicitly cleared on provision, matching open-agents.
- `last_activity_at` is now sourced from the same `Date` instance as `hibernate_after`, so they're consistent.

TDD: red tests for (a) expires sourced from `getState().expiresAt` when the top-level handle property is undefined, (b) `hibernate_after` is set on the update. Both are green after the switch.

Tests 2627 / 2627 pass. Lint + tsc clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
No issues found across 2 files
Confidence score: 5/5
- Automated review surfaced no issues in the provided summaries.
- No files require special attention.
Architecture diagram
sequenceDiagram
participant Client as Chat UI
participant API as createSandboxHandler
participant SDK as Sandbox SDK
participant DB as Supabase Sessions
participant WF as Lifecycle Workflow
Note over Client,WF: Sandbox Provision Flow (Org-Snapshot Path)
Client->>API: POST /api/sandbox (org-snapshot config)
API->>SDK: connectSandbox(org-snapshot-id)
SDK-->>API: sandbox handle (expiresAt undefined, getState() returns {expiresAt, ...})
alt getState() available
API->>API: nextState = sandbox.getState() as Json
API->>API: buildActiveLifecycleUpdate(nextState)
Note over API: Derives sandbox_expires_at from nextState.expiresAt
Note over API: Sets hibernate_after, lifecycle_state="active"
Note over API: Clears lifecycle_error, updates last_activity_at
API->>DB: updateSession(sessionId, { sandbox_state, lifecycle_state, lifecycle_version, sandbox_expires_at, hibernate_after, last_activity_at, lifecycle_error, snapshot_url, snapshot_created_at })
end
DB-->>API: Session updated
API-->>Client: 200 OK
Note over Client,WF: Post-Provision Evaluation by Lifecycle Workflow
WF->>DB: evaluateSandboxLifecycle(sessionId)
DB-->>WF: session row (sandbox_expires_at=non-null ISO string)
WF->>WF: Check sandbox_expires_at against current time
alt sandbox_expires_at is non-null (happy path)
Note over WF: Recognizes live runtime, keeps lifecycle_state="active"
WF-->>DB: No state change
else sandbox_expires_at is null (previous bug)
Note over WF: Interprets as "no live runtime"
WF->>DB: update lifecycle_state="hibernated"
DB-->>WF: Hibernated
WF-->>Client: (via polling) Sandbox stuck on "initializing..."
end
Note over DB,WF: Side Benefits from buildActiveLifecycleUpdate
alt hibernate_after set
Note over DB: Lifecycle workflow has explicit deadline
else hibernate_after null (previous behavior)
Note over WF: Falls back to last_activity_at + INACTIVITY_TIMEOUT
end
Auto-approved: Focused bug fix with clear tests; uses existing helper to derive sandbox_expires_at from state, preventing premature hibernation. Low risk, isolated to sandbox creation handler.
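The two workflow branches in the diagram can be sketched as a small decision function. This is only an illustration under stated assumptions: `evaluateSketch`, `SessionRowSketch`, and the 30-minute `INACTIVITY_TIMEOUT_MS` are hypothetical names and values, and the real `evaluateSandboxLifecycle` may check more than this (for example, whether the expiry is already in the past).

```typescript
// Placeholder value; the real INACTIVITY_TIMEOUT is not stated in this PR.
const INACTIVITY_TIMEOUT_MS = 30 * 60 * 1000;

// Hypothetical subset of the session row, limited to fields named in the diagram.
interface SessionRowSketch {
  sandbox_expires_at: string | null;
  hibernate_after: string | null;
  last_activity_at: string;
}

type LifecycleDecision =
  | { action: "hibernate"; reason: string }
  | { action: "keep"; dueAt: Date };

function evaluateSketch(row: SessionRowSketch, now: Date): LifecycleDecision {
  // Previous bug: a null expiry is read as "no live runtime".
  if (row.sandbox_expires_at === null) {
    return { action: "hibernate", reason: "no live runtime" };
  }

  // Prefer the explicit deadline; otherwise fall back to
  // last_activity_at + INACTIVITY_TIMEOUT.
  const dueAt =
    row.hibernate_after !== null
      ? new Date(row.hibernate_after)
      : new Date(new Date(row.last_activity_at).getTime() + INACTIVITY_TIMEOUT_MS);

  // Assumption: the sandbox hibernates once the deadline has passed.
  return now.getTime() >= dueAt.getTime()
    ? { action: "hibernate", reason: "inactivity deadline passed" }
    : { action: "keep", dueAt };
}

const evalNow = new Date("2030-01-01T00:00:00.000Z");

// Row as written before the fix on the org-snapshot path.
const buggyRow: SessionRowSketch = {
  sandbox_expires_at: null,
  hibernate_after: null,
  last_activity_at: evalNow.toISOString(),
};

// Row as written after the fix.
const fixedRow: SessionRowSketch = {
  sandbox_expires_at: "2030-01-01T01:00:00.000Z",
  hibernate_after: "2030-01-01T00:30:00.000Z",
  last_activity_at: evalNow.toISOString(),
};

console.log(evaluateSketch(buggyRow, evalNow).action); // "hibernate"
console.log(evaluateSketch(fixedRow, evalNow).action); // "keep"
```

With a null expiry the function returns before any deadline is considered, which matches the symptom of hibernation within seconds of provision.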
Smoke test on preview `api-lshviat85-recoup.vercel.app` (commit `1e235cd7`):

What I tested

The org-snapshot path that surfaced the bug: `https://github.com/recoupable/org-rostrum-pacific-cebcc866-34c3-451c-8cd7-f63309acff0a`. This is the repo URL that the open-agents UI was passing when "Sandbox is initializing…" was sticking.

Sequence

```bash
# 1. Create session
SESSION_ID=$(curl -sS -X POST "$PREVIEW/api/sessions" …)
# → SESSION_ID=2104c4ed-5509-45e2-8995-ec1ab48057c8

# 2. Provision sandbox bound to that session
curl -sS -X POST "$PREVIEW/api/sandbox" …
# → { mode: "vercel", currentBranch: "main", timing.readyMs: 21520 }

# 3. Status immediately after POST
curl -sS "$PREVIEW/api/sandbox/status?sessionId=$SESSION_ID" -H "x-api-key: $API_KEY"

# 4. Sleep 20s: covers the window where the lifecycle workflow's
#    first iteration was previously writing "hibernated"
sleep 20

# 5. Status after 20s
curl -sS "$PREVIEW/api/sandbox/status?sessionId=$SESSION_ID" -H "x-api-key: $API_KEY"
```

Result

Status response after 20s for reference:

How to validate

Session id used: `2104c4ed-5509-45e2-8995-ec1ab48057c8`. The sandbox is still alive at the time of this comment if you want to inspect it directly.
Summary
Restores parity with open-agents' (working) `handleCreateSandboxRequest`, which read `expiresAt` from `sandbox.getState()` (always populated) via `buildActiveLifecycleUpdate(nextState)`. api was reading from the top-level `sandbox.expiresAt` handle property, which the SDK doesn't reliably set on prebuilt-snapshot creation paths (org-snapshot restore).
For org-snapshot provisions api was writing `sandbox_expires_at: null` → the lifecycle workflow's `evaluateSandboxLifecycle` interprets a null expiry as "no live runtime" → immediately writes `lifecycle_state: "hibernated"` on the row → chat UI sticks on "Sandbox is initializing…".
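The difference between the two read paths can be shown with a minimal sketch. The `SandboxHandleSketch` and `SandboxStateSketch` shapes below are assumptions made for illustration (including `expiresAt` being epoch milliseconds), not the SDK's actual types, and the helper names are hypothetical.

```typescript
// Hypothetical shapes for illustration; the real SDK types may differ.
interface SandboxStateSketch {
  expiresAt?: number; // assumed to be epoch milliseconds
}

interface SandboxHandleSketch {
  expiresAt?: number; // described above as not reliably set on org-snapshot restore
  getState(): SandboxStateSketch;
}

// Old derivation: reads the top-level handle property.
function expiryFromHandle(sandbox: SandboxHandleSketch): string | null {
  return typeof sandbox.expiresAt === "number"
    ? new Date(sandbox.expiresAt).toISOString()
    : null;
}

// New derivation: reads the state object returned by getState().
function expiryFromState(sandbox: SandboxHandleSketch): string | null {
  const { expiresAt } = sandbox.getState();
  return typeof expiresAt === "number"
    ? new Date(expiresAt).toISOString()
    : null;
}

// A handle shaped like the org-snapshot case: no top-level expiresAt,
// but getState() carries one.
const orgSnapshotSandbox: SandboxHandleSketch = {
  getState: () => ({ expiresAt: Date.UTC(2030, 0, 1) }),
};

console.log(expiryFromHandle(orgSnapshotSandbox)); // null
console.log(expiryFromState(orgSnapshotSandbox)); // "2030-01-01T00:00:00.000Z"
```

The first call reproduces the `sandbox_expires_at: null` write; the second yields the ISO string the lifecycle workflow expects.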
Diff
api previously:
```ts
const nextState = sandbox.getState() as Json;
const expiresAt = typeof sandbox.expiresAt === "number"
  ? new Date(sandbox.expiresAt).toISOString()
  : null;
await updateSession(sessionRow.id, {
  sandbox_state: nextState,
  lifecycle_state: "active",
  sandbox_expires_at: expiresAt, // ← null for org-snapshot path
  last_activity_at: new Date().toISOString(),
  ...
});
```
After:
```ts
const nextState = sandbox.getState() as Json;
await updateSession(sessionRow.id, {
  sandbox_state: nextState,
  lifecycle_version: sessionRow.lifecycle_version + 1,
  ...buildActiveLifecycleUpdate(nextState), // sets active + activity + expires from state
  snapshot_url: null,
  snapshot_created_at: null,
});
```
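For readers without the helper's source at hand, here is a rough sketch of what a function like `buildActiveLifecycleUpdate` could return, based only on the fields this PR says it sets. The name `buildActiveLifecycleUpdateSketch`, the 30-minute timeout, and the choice of `now + timeout` for `hibernate_after` are all assumptions, not the actual implementation.

```typescript
// Placeholder value; the real INACTIVITY_TIMEOUT is not stated in this PR.
const SKETCH_INACTIVITY_TIMEOUT_MS = 30 * 60 * 1000;

interface ActiveLifecycleUpdateSketch {
  lifecycle_state: "active";
  sandbox_expires_at: string | null;
  hibernate_after: string;
  last_activity_at: string;
  lifecycle_error: null;
}

// Hypothetical re-implementation for illustration only.
function buildActiveLifecycleUpdateSketch(
  state: unknown,
  now: Date = new Date(),
): ActiveLifecycleUpdateSketch {
  // Read expiresAt from the state object, tolerating unexpected shapes.
  const expiresAt =
    typeof state === "object" && state !== null && "expiresAt" in state
      ? (state as { expiresAt: unknown }).expiresAt
      : undefined;

  return {
    lifecycle_state: "active",
    sandbox_expires_at:
      typeof expiresAt === "number" ? new Date(expiresAt).toISOString() : null,
    // Assumption: the deadline is one inactivity timeout after `now`.
    hibernate_after: new Date(
      now.getTime() + SKETCH_INACTIVITY_TIMEOUT_MS,
    ).toISOString(),
    // Same Date instance as hibernate_after, so the two stay consistent.
    last_activity_at: now.toISOString(),
    lifecycle_error: null,
  };
}

const provisionTime = new Date("2030-01-01T00:00:00.000Z");
const sketchUpdate = buildActiveLifecycleUpdateSketch(
  { expiresAt: Date.UTC(2030, 0, 1, 1) },
  provisionTime,
);

console.log(sketchUpdate.sandbox_expires_at); // "2030-01-01T01:00:00.000Z"
console.log(sketchUpdate.hibernate_after); // "2030-01-01T00:30:00.000Z"
console.log(sketchUpdate.last_activity_at); // "2030-01-01T00:00:00.000Z"
```

Passing `now` as a parameter keeps the sketch deterministic in tests and makes the shared-`Date` property explicit.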
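Side benefits from switching to the helper:
- `hibernate_after` is now set on provision (it was previously left null, forcing the lifecycle workflow to fall back to `last_activity_at + INACTIVITY_TIMEOUT`).
- `lifecycle_error` is explicitly cleared on provision, matching open-agents.
- `last_activity_at` is sourced from the same `Date` instance as `hibernate_after`, so the two are consistent.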
Test plan
🤖 Generated with Claude Code
Summary by cubic
Fixes premature hibernation for org-snapshot sandboxes by deriving `sandbox_expires_at` and lifecycle fields from `sandbox.getState()` via `buildActiveLifecycleUpdate`. Sessions from org snapshots now stay active, and the chat no longer sticks on “Sandbox is initializing…”.
- … `expiresAt` from `sandbox.getState()` and use `buildActiveLifecycleUpdate` for lifecycle updates.
- … `sandbox_expires_at: null` on org-snapshot provisions that triggered immediate hibernation.
- … `hibernate_after`, clears `lifecycle_error`, and updates `last_activity_at` consistently.

Written for commit 1e235cd. Summary will update on new commits.