
fix(server): use asyncio.sleep instead of time.sleep in sandbox create #489

Merged
Pangjiping merged 5 commits into alibaba:main from skyler0513:fix/async-sandbox-create-blocks-event-loop
Mar 20, 2026

Conversation

@skyler0513

Summary

_wait_for_sandbox_ready in kubernetes_service.py used time.sleep() inside an async def call chain, which blocks the entire asyncio event loop thread during each poll interval.

With a 3-pod pool and 3 concurrent requests, each request was ~2s slower than a single request because the polls were effectively serialized.

With 4 concurrent requests (exceeding the pool size), all 4 timed out: the 4th request's blocking sleep held the event loop for the entire timeout duration, so the responses to the first 3 could not be sent.

Fix: make _wait_for_sandbox_ready and create_sandbox async def, replace time.sleep with await asyncio.sleep, and add the missing await in lifecycle.py.
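The serialization effect described above can be reproduced outside the server with a minimal sketch. The function names here (blocking_poll, cooperative_poll, run_concurrently) are hypothetical stand-ins for illustration, not code from kubernetes_service.py.

```python
import asyncio
import time


async def blocking_poll(interval: float) -> None:
    # Hypothetical stand-in for the old polling step: time.sleep() holds the
    # event loop thread, so no other coroutine can run until it returns.
    time.sleep(interval)


async def cooperative_poll(interval: float) -> None:
    # Stand-in for the fixed polling step: awaiting asyncio.sleep() yields to
    # the event loop, so other requests keep making progress.
    await asyncio.sleep(interval)


async def run_concurrently(poll, count: int, interval: float) -> float:
    """Run `count` polls concurrently and return the elapsed wall-clock time."""
    start = time.perf_counter()
    await asyncio.gather(*(poll(interval) for _ in range(count)))
    return time.perf_counter() - start


if __name__ == "__main__":
    blocked = asyncio.run(run_concurrently(blocking_poll, 3, 0.2))
    cooperative = asyncio.run(run_concurrently(cooperative_poll, 3, 0.2))
    print(f"time.sleep:    {blocked:.2f}s (polls run one after another)")
    print(f"asyncio.sleep: {cooperative:.2f}s (polls overlap)")
```

With the blocking variant the three sleeps add up, while the cooperative variant finishes in roughly one interval, which matches the per-request slowdown reported in the summary.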

Testing

  • Unit tests
  • e2e / manual verification (4 concurrent requests against a 3-pod pool: requests 1–3 complete in ~3s, request 4 completes in ~6s after a pod is released)

Breaking Changes

  • None

Checklist

  • Linked Issue or clearly described motivation
  • Added/updated docs (if needed)
  • Added/updated tests (if needed)
  • Security impact considered
  • Backward compatibility considered

Copilot AI review requested due to automatic review settings March 19, 2026 08:42

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7c529d9f1c

Contributor

Copilot AI left a comment

Pull request overview

This PR removes event-loop blocking during Kubernetes sandbox creation by making readiness polling fully asynchronous and awaiting sandbox creation from the API layer.

Changes:

  • Convert create_sandbox / _wait_for_sandbox_ready to async def and replace time.sleep() with await asyncio.sleep().
  • Update the lifecycle API endpoint to await sandbox creation.
  • Update unit tests to await async service methods.
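The shape of a non-blocking readiness poll can be sketched as follows. This is an illustration under assumptions, not the PR's actual code: wait_until_ready, its parameters, and the async is_ready callable are hypothetical names, and the real _wait_for_sandbox_ready may check status differently.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def wait_until_ready(
    is_ready: Callable[[], Awaitable[bool]],
    timeout: float,
    poll_interval: float,
) -> bool:
    """Poll `is_ready` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if await is_ready():
            return True
        # Yield to the event loop between polls instead of blocking the thread.
        await asyncio.sleep(poll_interval)
    return False


if __name__ == "__main__":
    polls = {"count": 0}

    async def ready_on_third_poll() -> bool:
        # Hypothetical status check that reports ready on the third call.
        polls["count"] += 1
        return polls["count"] >= 3

    print(asyncio.run(wait_until_ready(ready_on_third_poll, 1.0, 0.01)))
```

Because the loop only suspends at await points, other requests handled by the same event loop can run during every poll interval.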

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| server/src/services/k8s/kubernetes_service.py | Makes sandbox readiness polling non-blocking via asyncio.sleep and awaits readiness in create_sandbox. |
| server/src/services/sandbox_service.py | Changes the abstract create_sandbox contract to async. |
| server/src/api/lifecycle.py | Awaits the now-async sandbox_service.create_sandbox. |
| server/tests/k8s/test_kubernetes_service.py | Updates tests to run under asyncio and await service methods. |

@CLAassistant

CLAassistant commented Mar 19, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ Pangjiping
❌ skyler.su


skyler.su does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

@skyler0513 requested a review from Copilot March 19, 2026 13:28
Contributor

Copilot AI left a comment

Pull request overview

This PR addresses event-loop blocking during Kubernetes sandbox creation by making the sandbox creation path fully async and replacing blocking sleeps with non-blocking asyncio.sleep.

Changes:

  • Convert SandboxService.create_sandbox (and implementations) to async def and update call sites to await it.
  • Update Kubernetes sandbox readiness polling to use await asyncio.sleep(...) during polling.
  • Update affected unit tests to be async (pytest.mark.asyncio) and to use AsyncMock where needed.
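A test double for an async service method can be sketched with unittest.mock.AsyncMock from the standard library. The handler create_endpoint and the payloads are hypothetical; the PR's tests use pytest.mark.asyncio, whereas this sketch drives the coroutine with asyncio.run so it runs without pytest.

```python
import asyncio
from unittest.mock import AsyncMock


async def create_endpoint(sandbox_service, request: dict) -> dict:
    # Hypothetical route handler: the service method is a coroutine function,
    # so it must be awaited to obtain the result rather than a coroutine object.
    return await sandbox_service.create_sandbox(request)


async def check_endpoint_awaits_service() -> None:
    # Child attributes of an AsyncMock are AsyncMocks, so create_sandbox is awaitable.
    service = AsyncMock()
    service.create_sandbox.return_value = {"id": "sbx-1"}

    result = await create_endpoint(service, {"image": "python:3.12"})

    assert result == {"id": "sbx-1"}
    # Fails if the handler called the method but never awaited it.
    service.create_sandbox.assert_awaited_once_with({"image": "python:3.12"})


if __name__ == "__main__":
    asyncio.run(check_endpoint_awaits_service())
```

A plain Mock would return a non-awaitable value here, so the await in the handler would raise a TypeError; that is why stubs need to become async once the method does.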

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| server/src/services/k8s/kubernetes_service.py | Makes _wait_for_sandbox_ready / create_sandbox async and switches polling sleep to await asyncio.sleep. |
| server/src/services/sandbox_service.py | Updates the abstract service interface to require an async create_sandbox. |
| server/src/services/docker.py | Updates Docker service create_sandbox signature to async to match the interface. |
| server/src/api/lifecycle.py | Fixes the endpoint to await sandbox_service.create_sandbox(...). |
| server/tests/k8s/test_kubernetes_service.py | Converts tests to async and updates calls to await. |
| server/tests/test_docker_service.py | Converts relevant tests to async and awaits create_sandbox; updates mocking to AsyncMock where appropriate. |
| server/tests/test_routes_create_delete.py | Updates route test stub service to provide an async create_sandbox. |
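Why the Docker service signature changes along with the Kubernetes one can be illustrated with a simplified interface. All class names, request fields, and return values below are hypothetical and much smaller than the real SandboxService.

```python
import asyncio
from abc import ABC, abstractmethod


class SandboxServiceSketch(ABC):
    """Hypothetical, simplified stand-in for the abstract service interface."""

    @abstractmethod
    async def create_sandbox(self, request: dict) -> dict:
        """Create a sandbox; every implementation is a coroutine function."""


class FakeDockerService(SandboxServiceSketch):
    async def create_sandbox(self, request: dict) -> dict:
        # Even if the body does no async work, declaring it `async def`
        # lets callers uniformly write `await service.create_sandbox(...)`.
        return {"backend": "docker", "image": request["image"]}


class FakeKubernetesService(SandboxServiceSketch):
    async def create_sandbox(self, request: dict) -> dict:
        await asyncio.sleep(0)  # placeholder for non-blocking readiness polling
        return {"backend": "kubernetes", "image": request["image"]}


async def create_with(service: SandboxServiceSketch, image: str) -> dict:
    # The caller does not need to know which backend it is talking to.
    return await service.create_sandbox({"image": image})


if __name__ == "__main__":
    for svc in (FakeDockerService(), FakeKubernetesService()):
        print(asyncio.run(create_with(svc, "python:3.12")))
```

If one implementation stayed synchronous, the shared call site would await a plain return value and fail at runtime, so the interface and all implementations have to move together.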

@Pangjiping
Collaborator

Great fix! Please sign the CLA and run "uv run ruff check --fix" in the server directory to fix the linter error. 😊😊

@Pangjiping Pangjiping requested a review from ninan-nn as a code owner March 20, 2026 10:04
Collaborator

@Pangjiping Pangjiping left a comment

LGTM. I will fix the unit test error.

@Pangjiping Pangjiping merged commit 4685fde into alibaba:main Mar 20, 2026
10 of 14 checks passed
Pangjiping added a commit to Pangjiping/OpenSandbox that referenced this pull request Mar 20, 2026

Labels

bug (Something isn't working), component/server

5 participants