Local Container readiness times out before app container starts when using Sandbox terminal #6790

@iGmainC

What happened?

When running a Container-enabled Worker locally with Vite/Miniflare, opening a Sandbox terminal endpoint consistently times out during container readiness checks and the Worker returns 503 after about 21 seconds.

The same Docker image runs correctly when started directly, and the Sandbox control plane / PTY endpoint inside the image works when tested outside workerd. In the failing local dev path, Docker events show that the cloudflare/proxy-everything container starts, but the application container for the Sandbox image is not created before workerd reports a readiness timeout.

Environment

  • OS: Linux
  • Docker: 29.5.1
  • wrangler: 4.95.0
  • @cloudflare/vite-plugin: 1.39.0
  • @cloudflare/sandbox: 0.10.2
  • Base image: docker.io/cloudflare/sandbox:0.10.2
  • Dev command: vite --host 127.0.0.1
  • Worker uses a container-backed Durable Object class extending Sandbox
  • Container config uses a local Dockerfile image, instance_type: "standard-1", and exposes ports 3000 and 8080

Sanitized logs

Error checking if container is ready: The operation was aborted
[ERROR] e = kj/timer.c++:30: overloaded: operation timed out

stack:
<project>/node_modules/@cloudflare/workerd-linux-64/bin/workerd@2087e9f
<project>/node_modules/@cloudflare/workerd-linux-64/bin/workerd@2088193
<project>/node_modules/@cloudflare/workerd-linux-64/bin/workerd@20ada5f
<project>/node_modules/@cloudflare/workerd-linux-64/bin/workerd@341a1d0
<project>/node_modules/@cloudflare/workerd-linux-64/bin/workerd@3413e80
<project>/node_modules/@cloudflare/workerd-linux-64/bin/workerd@24ff2b0;
sentryErrorContext = jsgInternalError; wdErrId = <workerd-error-reference>

[ERROR] Uncaught Error: internal error; reference = <workerd-error-reference>

Error checking if container is ready: The operation was aborted
[ERROR] {
  level: 'error',
  message: 'Sandbox error',
  component: 'sandbox-do',
  sandboxId: '<redacted-sandbox-id>',
  traceId: '<redacted-trace-id>',
  serviceVersion: undefined,
  instanceId: undefined,
  timestamp: '<timestamp>',
  error: {
    message: 'Container failed to start',
    stack: 'Error: Container failed to start',
    name: 'Error'
  }
}

[WARNING] {
  level: 'warn',
  message: 'container.startup',
  component: 'sandbox-do',
  sandboxId: '<redacted-sandbox-id>',
  traceId: '<redacted-trace-id>',
  serviceVersion: undefined,
  instanceId: undefined,
  outcome: 'unrecognized_error',
  staleStateDetected: false,
  error: 'Container failed to start',
  timestamp: '<timestamp>'
}

--> GET /api/container/terminal 503 21s

Docker events during the failing request only showed the proxy container lifecycle, with no app container startup event before the timeout:

container create/start cloudflare/proxy-everything:<tag> name=workerd-<app>-<class>-<redacted-sandbox-id>-proxy
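The check described above can be automated. The following is a minimal sketch that scans the JSON output of `docker events --format '{{json .}}'` and reports whether a proxy container and a non-proxy container were started. It assumes that the `workerd-` prefix and `-proxy` suffix seen in the sanitized event above are stable, and that the application container would share the prefix without the suffix; that naming is an assumption, not something confirmed by the logs. `summarizeStartup` is a hypothetical helper name.

```typescript
// Subset of the JSON emitted by `docker events --format '{{json .}}'`.
interface DockerEvent {
  Type: string;
  Action: string;
  Actor: { Attributes: Record<string, string | undefined> };
}

interface StartupSummary {
  proxyStarted: boolean;
  appStarted: boolean;
}

// Classify container "start" events by name.
// ASSUMPTION: workerd-managed containers are named "workerd-...", and only the
// proxy container ends in "-proxy" (taken from the sanitized event above).
function summarizeStartup(jsonLines: string[]): StartupSummary {
  const summary: StartupSummary = { proxyStarted: false, appStarted: false };
  for (const line of jsonLines) {
    if (line.trim() === "") continue; // skip blank lines in the captured output
    const event = JSON.parse(line) as DockerEvent;
    if (event.Type !== "container" || event.Action !== "start") continue;
    const name = event.Actor.Attributes.name ?? "";
    if (!name.startsWith("workerd-")) continue; // unrelated container
    if (name.endsWith("-proxy")) {
      summary.proxyStarted = true;
    } else {
      summary.appStarted = true;
    }
  }
  return summary;
}
```

For the failing request in this report, such a summary would be expected to show the proxy as started and the application container as not started.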

Expected behavior

The local runtime should start the application container and complete the readiness check, or return an actionable error explaining why the app container could not be created or connected.
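To illustrate what an "actionable error" could look like, here is a rough sketch of a readiness loop that keeps the last probe failure and includes it in the final error. This is not workerd's or Miniflare's actual implementation; `waitUntilReady`, `Probe`, and the option names are hypothetical, and the attempt count and timeout are placeholders rather than the values used by the runtime.

```typescript
// Hypothetical probe: resolves when the container answers, rejects otherwise.
// It is expected to honor the AbortSignal so that a timed-out attempt rejects.
type Probe = (signal: AbortSignal) => Promise<void>;

interface ReadinessOptions {
  attempts: number; // how many probes to try (placeholder value)
  timeoutMs: number; // per-probe timeout in milliseconds (placeholder value)
}

// Poll until the probe succeeds. On failure, report how many attempts were
// made and what the last probe failed with, instead of a bare timeout.
async function waitUntilReady(probe: Probe, opts: ReadinessOptions): Promise<void> {
  let lastError: unknown = undefined;
  for (let attempt = 1; attempt <= opts.attempts; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), opts.timeoutMs);
    try {
      await probe(controller.signal);
      return; // the container answered
    } catch (err) {
      lastError = err; // remember why this attempt failed
    } finally {
      clearTimeout(timer);
    }
  }
  const reason = lastError instanceof Error ? lastError.message : String(lastError);
  throw new Error(
    `Container not ready after ${opts.attempts} attempts ` +
      `(${opts.timeoutMs} ms each); last probe error: ${reason}`,
  );
}
```

With an error shaped like this, a missing application container would surface as the underlying connection failure rather than only "The operation was aborted".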

Actual behavior

The request waits for about 21 seconds and logs repeated "Error checking if container is ready: The operation was aborted" messages. workerd then reports an internal "kj/timer.c++:30: overloaded: operation timed out" error, and the Worker route returns 503.

Reproduction shape

  1. Configure a Worker with a container-backed Durable Object class extending @cloudflare/sandbox's Sandbox.
  2. Use a local Dockerfile based on docker.io/cloudflare/sandbox:0.10.2.
  3. Expose at least one port in the Dockerfile for local development.
  4. Start local dev with vite --host 127.0.0.1.
  5. Hit a Worker route that calls sandbox.getSession(<id>).terminal(request) for a WebSocket upgrade.
  6. The local request returns 503 after the readiness timeout.
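The route in step 5 can be sketched roughly as follows. The interfaces below are hypothetical stand-ins that mirror only the `getSession(<id>).terminal(request)` call chain named above; the real types come from @cloudflare/sandbox and may differ. `handleTerminal` is an illustrative name, and how the sandbox instance is obtained from the Durable Object binding is left out.

```typescript
// Hypothetical minimal shapes mirroring only the call chain named in step 5.
// The real types come from @cloudflare/sandbox and may differ.
interface TerminalSession {
  terminal(request: Request): Promise<Response>;
}
interface SandboxLike {
  getSession(id: string): TerminalSession | Promise<TerminalSession>;
}

// Forward a WebSocket upgrade request to the session terminal and reject
// anything that is not an upgrade.
async function handleTerminal(
  request: Request,
  sandbox: SandboxLike,
  sessionId: string,
): Promise<Response> {
  const upgrade = request.headers.get("Upgrade");
  if (upgrade === null || upgrade.toLowerCase() !== "websocket") {
    return new Response("Expected a WebSocket upgrade", { status: 426 });
  }
  // `await` works whether getSession returns a session or a promise of one.
  const session = await sandbox.getSession(sessionId);
  return session.terminal(request);
}
```

In the failing setup, the 503 is produced while the container behind the sandbox is being started, before the terminal connection itself is established.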

Additional checks

Direct Docker checks suggest the image itself is not the immediate failure point:

  • Running the built image directly starts the Sandbox control plane on port 3000.
  • Posting to the control plane execute endpoint directly succeeds.
  • Connecting directly to /ws/pty?sessionId=<id>&cols=80&rows=24 over WebSocket succeeds and can run echo in the PTY.
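For anyone repeating the direct PTY check, the endpoint URL can be built as in the sketch below. The path and query parameters are taken from the bullet above. The base address is an assumption: it presumes the PTY endpoint is served on the same port 3000 as the control plane and that the port is published on 127.0.0.1. `buildPtyUrl` is a hypothetical helper name.

```typescript
// Build the PTY WebSocket URL used for the direct check.
// ASSUMPTION: `base` points at the container's published control-plane port,
// e.g. "ws://127.0.0.1:3000"; adjust it to match how the image was run.
function buildPtyUrl(base: string, sessionId: string, cols = 80, rows = 24): string {
  const url = new URL("/ws/pty", base);
  url.searchParams.set("sessionId", sessionId); // query values are percent-encoded as needed
  url.searchParams.set("cols", String(cols));
  url.searchParams.set("rows", String(rows));
  return url.toString();
}
```

The resulting string can be passed to any WebSocket client to confirm that the PTY answers outside workerd.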

This makes the failure look specific to the local workerd/Miniflare container startup or readiness orchestration path, not the Sandbox image's PTY implementation.
