
fix(sandbox): derive sandbox_expires_at from getState() (open-agents parity) #537

Merged
sweetmantech merged 1 commit into `test` from `fix/sandbox-expires-from-state`
May 8, 2026

Conversation


@sweetmantech sweetmantech commented May 8, 2026

Summary

Restores parity with open-agents' (working) `handleCreateSandboxRequest`, which read `expiresAt` from `sandbox.getState()` (always populated) via `buildActiveLifecycleUpdate(nextState)`. api was reading from the top-level `sandbox.expiresAt` handle property, which the SDK doesn't reliably set on prebuilt-snapshot creation paths (org-snapshot restore).

For org-snapshot provisions, api was writing `sandbox_expires_at: null` → the lifecycle workflow's `evaluateSandboxLifecycle` interprets a null expiry as "no live runtime" → it immediately writes `lifecycle_state: "hibernated"` on the row → the chat UI sticks on "Sandbox is initializing…".
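The failure chain can be illustrated with a minimal sketch. `evaluateSandboxLifecycle` is the real function name, but its body is not part of this PR, so `evaluateLifecycleSketch`, the `SessionRowSketch` type, and the decision rule below are assumptions reconstructed from the description only.

```ts
// Hypothetical shapes: only the column names come from this PR.
type LifecycleState = "active" | "hibernated";

interface SessionRowSketch {
  lifecycle_state: LifecycleState;
  sandbox_expires_at: string | null; // ISO timestamp, or null when never written
}

// Assumed decision rule: a null expiry is treated as "no live runtime".
function evaluateLifecycleSketch(row: SessionRowSketch, now: Date): LifecycleState {
  if (row.sandbox_expires_at === null) {
    return "hibernated";
  }
  const expiresAt = new Date(row.sandbox_expires_at).getTime();
  return expiresAt > now.getTime() ? row.lifecycle_state : "hibernated";
}

const now = new Date("2026-05-08T00:00:00.000Z");

// Row as written before the fix on the org-snapshot path.
const brokenRow: SessionRowSketch = { lifecycle_state: "active", sandbox_expires_at: null };
// Row as written after the fix.
const fixedRow: SessionRowSketch = {
  lifecycle_state: "active",
  sandbox_expires_at: "2026-05-08T00:30:00.000Z",
};

const brokenResult = evaluateLifecycleSketch(brokenRow, now);
const fixedResult = evaluateLifecycleSketch(fixedRow, now);
console.log(brokenResult, fixedResult);
```

Under these assumptions the null row is hibernated on the first evaluation, while the row with a future expiry keeps its `active` state.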

Diff

api previously:
```ts
const nextState = sandbox.getState() as Json;
const expiresAt = typeof sandbox.expiresAt === "number"
  ? new Date(sandbox.expiresAt).toISOString() : null;
await updateSession(sessionRow.id, {
  sandbox_state: nextState,
  lifecycle_state: "active",
  sandbox_expires_at: expiresAt, // ← null for org-snapshot path
  last_activity_at: new Date().toISOString(),
  ...
});
```

After:
```ts
const nextState = sandbox.getState() as Json;
await updateSession(sessionRow.id, {
  sandbox_state: nextState,
  lifecycle_version: sessionRow.lifecycle_version + 1,
  ...buildActiveLifecycleUpdate(nextState), // sets active + activity + expires from state
  snapshot_url: null,
  snapshot_created_at: null,
});
```

Side benefits from switching to the helper:

  • `hibernate_after` is now set on provision (was null, forcing the lifecycle workflow to fall back to `last_activity_at + INACTIVITY_TIMEOUT`).
  • `lifecycle_error` is explicitly cleared.
  • `last_activity_at` and `hibernate_after` share a `Date` instance.
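`buildActiveLifecycleUpdate` itself is not shown in this PR. As a rough sketch of the behaviour listed above, assuming the state object carries `expiresAt` as epoch milliseconds and assuming a 5-minute inactivity timeout (a guess; the real `INACTIVITY_TIMEOUT` constant is not shown here), the helper might look roughly like this. The `...Sketch` names are hypothetical.

```ts
// Hypothetical reconstruction; the real helper may differ.
const INACTIVITY_TIMEOUT_MS = 5 * 60 * 1000; // assumed value

interface ActiveLifecycleUpdateSketch {
  lifecycle_state: "active";
  lifecycle_error: null;
  last_activity_at: string;
  hibernate_after: string;
  sandbox_expires_at: string | null;
}

function buildActiveLifecycleUpdateSketch(
  state: unknown,
  now: Date = new Date(),
): ActiveLifecycleUpdateSketch {
  // Read expiresAt from the state object, never from the sandbox handle.
  const raw =
    typeof state === "object" && state !== null
      ? (state as Record<string, unknown>).expiresAt
      : undefined;
  return {
    lifecycle_state: "active",
    lifecycle_error: null, // explicitly cleared on provision
    // Both timestamps derive from the same `now`, so they stay consistent.
    last_activity_at: now.toISOString(),
    hibernate_after: new Date(now.getTime() + INACTIVITY_TIMEOUT_MS).toISOString(),
    sandbox_expires_at: typeof raw === "number" ? new Date(raw).toISOString() : null,
  };
}

const provisionedAt = new Date("2026-05-08T00:00:00.000Z");
const update = buildActiveLifecycleUpdateSketch(
  { expiresAt: provisionedAt.getTime() + 30 * 60 * 1000 },
  provisionedAt,
);
console.log(update);
```

Spreading the result into the `updateSession` payload, as the diff does, sets all five fields in one write.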

Test plan

  • TDD red→green: "derives sandbox_expires_at from sandbox.getState().expiresAt, not the top-level handle" + "sets hibernate_after on the session row".
  • `pnpm test` — 2627 / 2627 pass
  • `pnpm lint:check` — clean
  • `npx tsc --noEmit` — clean for changed files
  • Smoke on preview: provision a session via the open-agents UI against a recoupable org repo, verify status reports `active` (not `hibernated`) for >30s, confirm `sandbox_expires_at` is non-null on the row.
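The first test in the plan can be pictured with a fake sandbox whose top-level `expiresAt` is undefined while `getState()` still reports one. The `FakeSandbox` type and both helper functions are illustrative stand-ins, not the code in `createSandboxHandler.test.ts`.

```ts
// Illustrative stand-in for the SDK handle on the org-snapshot path.
interface FakeSandbox {
  expiresAt: number | undefined;
  getState(): { expiresAt: number | undefined };
}

const toIso = (ms: number | undefined): string | null =>
  typeof ms === "number" ? new Date(ms).toISOString() : null;

// Old behaviour: read the top-level handle property.
const expiresFromHandle = (sandbox: FakeSandbox): string | null =>
  toIso(sandbox.expiresAt);

// New behaviour: read from the state object returned by getState().
const expiresFromState = (sandbox: FakeSandbox): string | null =>
  toIso(sandbox.getState().expiresAt);

const orgSnapshotSandbox: FakeSandbox = {
  expiresAt: undefined, // not set by the SDK on this path
  getState: () => ({ expiresAt: Date.UTC(2026, 4, 8, 0, 30, 0) }),
};

const fromHandle = expiresFromHandle(orgSnapshotSandbox); // null
const fromState = expiresFromState(orgSnapshotSandbox); // "2026-05-08T00:30:00.000Z"
console.log(fromHandle, fromState);
```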

🤖 Generated with Claude Code


Summary by cubic

Fixes premature hibernation for org-snapshot sandboxes by deriving sandbox_expires_at and lifecycle fields from sandbox.getState() via buildActiveLifecycleUpdate. Sessions from org snapshots now stay active, and the chat no longer sticks on “Sandbox is initializing…”.

  • Bug Fixes
    • Read expiresAt from sandbox.getState() and use buildActiveLifecycleUpdate for lifecycle updates.
    • Prevent writing sandbox_expires_at: null on org-snapshot provisions that triggered immediate hibernation.
    • Also sets hibernate_after, clears lifecycle_error, and updates last_activity_at consistently.

Written for commit 1e235cd. Summary will update on new commits.

…iveLifecycleUpdate

Restores parity with open-agents' (working) `handleCreateSandboxRequest`,
which used `buildActiveLifecycleUpdate(nextState)` to derive the
lifecycle write — reading `expiresAt` from the state object that
`sandbox.getState()` returns.

api's `createSandboxHandler` was instead reading from the top-level
`sandbox.expiresAt` handle property, which the SDK doesn't reliably
set on prebuilt-snapshot creation paths (org-snapshot restore). For
those provisions, api was writing `sandbox_expires_at: null` to the
session row.

Cascading consequence: the lifecycle workflow's
`evaluateSandboxLifecycle` interprets a null expiry as "no live
runtime" and immediately writes `lifecycle_state: "hibernated"` on
the row — within a few seconds of provision, before the user has a
chance to use the sandbox. The chat UI then sticks on "Sandbox is
initializing…" because status reports `no_sandbox`.

Side benefits picked up by switching to `buildActiveLifecycleUpdate`:
- `hibernate_after` is now set on provision (was being left null,
  forcing the lifecycle workflow to fall back to inactivity-based
  due time computed from `last_activity_at + INACTIVITY_TIMEOUT`).
- `lifecycle_error` is explicitly cleared on provision, matching
  open-agents.
- `last_activity_at` is now sourced from the same `Date` instance
  as `hibernate_after`, so they're consistent.

TDD: red tests for (a) expires sourced from `getState().expiresAt`
when the top-level handle property is undefined, (b) `hibernate_after`
is set on the update — both green after the switch.

Tests 2627 / 2627 pass. Lint + tsc clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel Bot commented May 8, 2026

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| api | Ready | Preview | May 8, 2026 0:29am |

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 2 files

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.
Architecture diagram
sequenceDiagram
    participant Client as Chat UI
    participant API as createSandboxHandler
    participant SDK as Sandbox SDK
    participant DB as Supabase Sessions
    participant WF as Lifecycle Workflow

    Note over Client,WF: Sandbox Provision Flow (Org-Snapshot Path)

    Client->>API: POST /api/sandbox (org-snapshot config)
    API->>SDK: connectSandbox(org-snapshot-id)
    SDK-->>API: sandbox handle (expiresAt undefined, getState() returns {expiresAt, ...})

    alt getState() available
        API->>API: nextState = sandbox.getState() as Json
        API->>API: buildActiveLifecycleUpdate(nextState)
        Note over API: Derives sandbox_expires_at from nextState.expiresAt
        Note over API: Sets hibernate_after, lifecycle_state="active"
        Note over API: Clears lifecycle_error, updates last_activity_at
        API->>DB: updateSession(sessionId, { sandbox_state, lifecycle_state, lifecycle_version, sandbox_expires_at, hibernate_after, last_activity_at, lifecycle_error, snapshot_url, snapshot_created_at })
    end

    DB-->>API: Session updated
    API-->>Client: 200 OK

    Note over Client,WF: Post-Provision Evaluation by Lifecycle Workflow

    WF->>DB: evaluateSandboxLifecycle(sessionId)
    DB-->>WF: session row (sandbox_expires_at=non-null ISO string)
    WF->>WF: Check sandbox_expires_at against current time
    alt sandbox_expires_at is non-null (happy path)
        Note over WF: Recognizes live runtime, keeps lifecycle_state="active"
        WF-->>DB: No state change
    else sandbox_expires_at is null (previous bug)
        Note over WF: Interprets as "no live runtime"
        WF->>DB: update lifecycle_state="hibernated"
        DB-->>WF: Hibernated
        WF-->>Client: (via polling) Sandbox stuck on "initializing..."
    end

    Note over DB,WF: Side Benefits from buildActiveLifecycleUpdate
    alt hibernate_after set
        Note over DB: Lifecycle workflow has explicit deadline
    else hibernate_after null (previous behavior)
        Note over WF: Falls back to last_activity_at + INACTIVITY_TIMEOUT
    end

Auto-approved: Focused bug fix with clear tests; uses existing helper to derive sandbox_expires_at from state, preventing premature hibernation. Low risk, isolated to sandbox creation handler.

@sweetmantech

Smoke test on preview `api-lshviat85-recoup.vercel.app` (commit `1e235cd7`):

What I tested

The org-snapshot path that surfaced the bug: `https://github.com/recoupable/org-rostrum-pacific-cebcc866-34c3-451c-8cd7-f63309acff0a`. This is the repo URL that the open-agents UI was passing when "Sandbox is initializing…" was sticking.

Sequence

```bash
PREVIEW=https://api-lshviat85-recoup.vercel.app
API_KEY='<recoup_sk_…>'  # any active personal api key works
REPO=https://github.com/recoupable/org-rostrum-pacific-cebcc866-34c3-451c-8cd7-f63309acff0a

# 1. Create session
SESSION_ID=$(curl -sS -X POST "$PREVIEW/api/sessions" \
  -H "x-api-key: $API_KEY" -H "content-type: application/json" \
  -d '{}' | jq -r '.session.id')
# → SESSION_ID=2104c4ed-5509-45e2-8995-ec1ab48057c8

# 2. Provision sandbox bound to that session
curl -sS -X POST "$PREVIEW/api/sandbox" \
  -H "x-api-key: $API_KEY" -H "content-type: application/json" \
  -d "{\"sessionId\": \"$SESSION_ID\", \"repoUrl\": \"$REPO\"}"
# → { mode: "vercel", currentBranch: "main", timing.readyMs: 21520 }

# 3. Status immediately after POST
curl -sS "$PREVIEW/api/sandbox/status?sessionId=$SESSION_ID" -H "x-api-key: $API_KEY"

# 4. Sleep 20s: covers the window where the lifecycle workflow's
#    first iteration was previously writing "hibernated"
sleep 20

# 5. Status after 20s
curl -sS "$PREVIEW/api/sandbox/status?sessionId=$SESSION_ID" -H "x-api-key: $API_KEY"
```

Result

| Field | Before (broken on `main`) | After (this PR) |
| --- | --- | --- |
| `lifecycle.state` (immediately) | flipped to `hibernated` within seconds | `active` ✓ |
| `lifecycle.state` (after 20s) | `hibernated` | `active` ✓ |
| `sandbox_expires_at` | `null` (root cause) | `1778245359519` (~30 min future) ✓ |
| `hibernate_after` | `null` (omitted) | `1778243859519` (~5 min future) ✓ |
| `lastActivityAt` | snapshot from before POST returned | recent ✓ |
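As a quick sanity check on the two timestamps in the table, the snippet below converts them from epoch milliseconds and computes the gap between them. Only the two numbers come from the smoke test; the variable names are illustrative.

```ts
// Values copied from the "After" column of the result table.
const sandboxExpiresAtMs = 1778245359519;
const hibernateAfterMs = 1778243859519;

// Human-readable UTC forms of both deadlines.
const sandboxExpiresAtIso = new Date(sandboxExpiresAtMs).toISOString();
const hibernateAfterIso = new Date(hibernateAfterMs).toISOString();

// Gap between the inactivity deadline and the hard expiry, in minutes.
const gapMinutes = (sandboxExpiresAtMs - hibernateAfterMs) / 60_000;

console.log(sandboxExpiresAtIso, hibernateAfterIso, gapMinutes);
```

The gap works out to 25 minutes, which is consistent with the "~30 min" and "~5 min" annotations in the table.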

Status response after 20s for reference:
```json
{
  "status": "active",
  "hasSnapshot": false,
  "lifecycleVersion": 1,
  "lifecycle": {
    "state": "active",
    "sandboxExpiresAt": 1778245359519
  }
}
```

How to validate

  1. Reproduce the previous failure on `main` (still broken there): run the same sequence above against `https://recoup-api.vercel.app` (production) or any `main`-based preview. Step 5 returns `status: "no_sandbox"` and `lifecycle.state: "hibernated"`.
  2. Verify the fix on this preview: same sequence against `https://api-lshviat85-recoup.vercel.app`. Step 5 should return `status: "active"` with a non-null `sandboxExpiresAt` ~30 min in the future.
  3. Use a recoupable org repo (not `vercel/next.js` or any non-recoupable repo) — the bug only fires on the `prebuilt` org-snapshot path. Non-org repos worked even on `main`.
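Steps 3 to 5 of the sequence can also be scripted. The sketch below takes the status fetcher as a parameter so it can be pointed at a preview or at a stub. The `StatusResponse` type mirrors only the fields of the status JSON that are checked here, while `staysActive`, `FetchStatus`, and the delay parameter are hypothetical names.

```ts
// Subset of the status response fields that the check relies on.
interface StatusResponse {
  status: string;
  lifecycle: { state: string; sandboxExpiresAt: number | null };
}

type FetchStatus = () => Promise<StatusResponse>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Returns true when the sandbox is active both immediately and after the delay.
async function staysActive(fetchStatus: FetchStatus, delayMs: number): Promise<boolean> {
  const isHealthy = (r: StatusResponse): boolean =>
    r.status === "active" &&
    r.lifecycle.state === "active" &&
    r.lifecycle.sandboxExpiresAt !== null;

  const first = await fetchStatus();
  if (!isHealthy(first)) return false;
  await sleep(delayMs);
  return isHealthy(await fetchStatus());
}

// Against a real deployment, fetchStatus could wrap the status endpoint used above
// (PREVIEW, SESSION_ID and API_KEY as in the shell sequence):
// const fetchStatus: FetchStatus = () =>
//   fetch(`${PREVIEW}/api/sandbox/status?sessionId=${SESSION_ID}`, {
//     headers: { "x-api-key": API_KEY },
//   }).then((res) => res.json() as Promise<StatusResponse>);
```

Calling `staysActive(fetchStatus, 20_000)` would reproduce the 20-second window from step 4.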

Session id used: `2104c4ed-5509-45e2-8995-ec1ab48057c8`. The sandbox is still alive at the time of this comment if you want to inspect it directly.

@sweetmantech sweetmantech merged commit 815c996 into test May 8, 2026
6 checks passed
@sweetmantech sweetmantech deleted the fix/sandbox-expires-from-state branch May 8, 2026 12:34