feat(rivetkit): redesign sandbox actor module#4425
Conversation
The loop over c.vars.activeSessionIds never executes because createVars produces a fresh empty Set on each wake cycle. Vars are ephemeral and recreated fresh on each wake, so there are no sessions to restart timers for. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
|
🚅 Deployed to the rivet-pr-4425 environment in rivet-frontend
|
|
PR Review: feat(rivetkit): redesign sandbox actor module This is a well-architected redesign with excellent documentation and clear separation of concerns. The JSDoc in actor/index.ts is particularly thorough and helpful. Here are observations ranging from bugs to minor nits. Bugs and Correctness
In actor/index.ts, the destroy action resets sandboxId but not providerName. After a destroy action call, resolveProvider would still pass the providerName mismatch check (since state.providerName is still set), but this is a latent inconsistency that could confuse future readers and behaves differently from onDestroy.
Two concurrent calls to destroy can both pass the guard before either sets sandboxDestroyed = true, potentially calling provider.destroy() twice. Consider using a per-wake-cycle flag in vars (which is synchronous / non-async-interleaved) to guard the critical section, or add a destroyInProgress state field.
addSubscribedSession still persists session IDs to state.subscribedSessionIds even if the subscription itself could not be set up. This means on next wake, ensureAgent will try to re-subscribe to sessions retrieved by listSessions even if those sessions were never actively monitored. Consider only persisting sessions that are truly active (created/resumed). Design Concerns
The Zod schema correctly enforces exactly one of provider | createProvider via .refine(). But the TypeScript type SandboxActorProviderConfig is a discriminated union where provider?: never in the second variant allows neither to be provided at the type level. TypeScript accepts sandboxActor({}) with no compile error, but Zod will reject it at runtime. Consider making both branches use required fields with never on the opposing field.
The WebSocket close() call after sendFrame may not guarantee the server processes the JSON control frame before the connection drops. If the server uses the close frame to initiate a graceful process shutdown, this could cause race conditions. Consider waiting for a server acknowledgment frame or relying solely on the standard WebSocket close handshake.
In consumeProcessLogSse, the abort signal check happens only at the top of the while loop. reader.read() is not passed the signal, so if the fetch is aborted while waiting on a read, the AbortError will only be caught in the outer catch. This works but means the reader may block indefinitely until a byte arrives in environments where the underlying stream does not auto-close on abort. Test Coverage
The single test only covers getSandboxUrl and post-destroy behavior. Missing coverage for sleep/wake reconnection, provider mismatch detection, concurrent destroy calls, the createProvider dynamic callback path, and onSessionEvent / onPermissionRequest hook invocation. The client.test.ts file, by contrast, has thorough coverage. Consider adding more actor-level tests, even with mocked providers. Minor Nits
This is intentional (internal helper). clearAllActiveSessions (exported) is the public way to bulk-clear, while clearSessionActiveInMemory is intentionally internal. A brief comment to that effect would help future readers.
The double cast (as unknown as) is expected here given the dynamic proxy construction, but a comment explaining why this is safe would help.
Since these tables were never shipped in a release, consider whether this migration cleanup code can have a TTL (i.e., removed in a future release once all dev databases have been migrated). Keeping dead migration code forever adds noise. Overall The architecture is sound and the code is production-quality. The proxy action pattern cleanly wraps the sandbox-agent SDK, the session lifecycle management is well-thought-out, and the client.ts direct-access layer for binary/streaming operations is a good design choice. The main issues to address before merge are the destroy action inconsistency (1), the double-destroy race (2), and the TypeScript type gap (4). |
Replace in-house sandbox providers with re-exports from the upstream sandbox-agent package, which now ships built-in providers. Each provider is available as a separate subpackage import (e.g. rivetkit/sandbox/docker). - Upgrade sandbox-agent from 0.3.1 to 0.4.0-rc.1 - Remove custom docker, e2b, daytona provider implementations - Re-export upstream providers: docker, e2b, daytona, local, vercel, modal, computesdk - Replace SandboxActorProvider interface with upstream SandboxProvider - Use SandboxAgent.start() for create+connect lifecycle instead of manual provider.create() + provider.connectAgent() - Upstream ensureServer() replaces our wake() hook - Add destroySandbox to action methods (new in 0.4.0-rc.1) - Adapt SessionPersistDriver to upstream API changes (null → undefined, insertEvent signature) - Cloudflare provider not supported (architecture incompatible with direct sandbox URL access) - Update docs with new import paths, provider docs, and custom provider guide
Summary
Testing