Skip to content

test(control-plane): stabilize cancel integration test#643

Merged
ColeMurray merged 1 commit into
ColeMurray:mainfrom
donnfelker:upstream-stabilize-cancel-integration-test
May 17, 2026
Merged

test(control-plane): stabilize cancel integration test#643
ColeMurray merged 1 commit into
ColeMurray:mainfrom
donnfelker:upstream-stabilize-cancel-integration-test

Conversation

@donnfelker
Copy link
Copy Markdown
Contributor

@donnfelker donnfelker commented May 15, 2026

Summary

The control-plane integration test in packages/control-plane/test/integration/do-internal-routes.test.ts has been flaky. The cancel-then-assert-status test occasionally fails because the sandbox row reaches failed before the assertion runs, rather than stopped as expected.

This PR stabilizes the suite with two changes, both confined to that one test file:

  • Mock outbound Modal HTTP calls at the fetch layer. A beforeEach/afterEach pair installs/removes a vi.stubGlobal("fetch", ...) mock that intercepts requests to *.modal.run and returns a synthetic running sandbox response. Non-Modal fetch calls pass through to the real fetch.
  • Add a waitForSandboxSpawn helper. Polls the DO's sandbox table until the row reaches connecting (1s timeout, 10ms interval), so the test waits for the async spawn to land before driving subsequent state transitions.

Why this matters

The integration suite has been making real outbound Modal HTTP calls in CI. Modal returns 404 modal-http: invalid function call (most likely because MODAL_WORKSPACE is the dummy test workspace), so no real sandboxes are being created — but the fetch calls themselves are leaving the workerd runtime.

Recent successful upstream control-plane integration jobs each contained 132 modal-client request log lines:

  • run 25871996174 / job 76029563892
  • run 25846191019 / job 75941960701
  • run 25844133580 / job 75935538363
  • run 25873416058 / job 76034453557

Each shows the same pattern:

"component":"modal-client","msg":"modal.request","endpoint":"createSandbox","http_status":404,"outcome":"error"
"msg":"Sandbox spawn failed"
"Failed to create sandbox: Modal API error: 404 modal-http: invalid function call"

These createSandbox requests originate from the background ctx.waitUntil() callback that session init kicks off. Most tests do not assert on the final sandbox status, so the background 404 has just been noise. The cancel test was one of the few that did assert on status after cancel, which made it race-prone:

  • Sometimes the cancel path wrote stopped and the assertion ran before the background Modal failure wrote failed.
  • Sometimes the Modal failure won the race and the assertion observed failed.

By mocking Modal at the fetch boundary, the test no longer depends on MODAL_WORKSPACE being unset or on Modal's response timing, and the deterministic connecting → running path lets waitForSandboxSpawn synchronize the assertion against the spawn write.

Test plan

  • npm run test:integration -w @open-inspect/control-plane passes locally and in CI.
  • CI logs for this test file show zero outbound *.modal.run requests after the change.
  • No other integration tests regress; the mock is scoped to do-internal-routes.test.ts only.

Summary by CodeRabbit

Release Notes

Tests

  • Enhanced internal test suite for DO routes with improved sandbox spawning verification and deterministic request handling.

Note: This release contains no user-facing changes. Updates are limited to internal testing infrastructure improvements.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 15, 2026

📝 Walkthrough

Walkthrough

The PR enhances the DO internal routes integration test suite by adding fetch mocking and sandbox spawn synchronization. It introduces a fetch stub intercepting modal.run requests and a polling helper that waits for sandbox status to transition to "connecting", then wires these into the test lifecycle and updates the session cancellation scenario to synchronize with sandbox spawning.

Changes

DO Internal Routes Test Infrastructure

Layer / File(s) Summary
Mock and polling helper functions
packages/control-plane/test/integration/do-internal-routes.test.ts
installModalFetchMock stubs globalThis.fetch to detect .modal.run URLs, extract sandbox_id from request bodies, and return deterministic modal-run JSON responses. waitForSandboxSpawn polls the DO sandbox table via queryDO until status reaches "connecting" or throws a timeout error after ~1 second.
Test setup and sandbox synchronization
packages/control-plane/test/integration/do-internal-routes.test.ts
Test lifecycle hooks call installModalFetchMock in beforeEach and unstubAllGlobals in afterEach. The "stops sandbox when cancelling an active session" test calls waitForSandboxSpawn before updating session status to active and issuing the cancel request, ensuring sandbox spawn completes before the cancellation flow.

Possibly related PRs

  • ColeMurray/background-agents#585: Both PRs modify integration test logic to add/replace polling helpers that wait for DO sandbox table status transitions, stabilizing sandbox-related flows.

Poem

🐰 A rabbit in the modal runs with glee,
Mocking fetch calls, keeping tests carefree,
Polling sandboxes through the spawning night,
Till "connecting" blooms—the cancellation's right! 🌙✨

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'test(control-plane): stabilize cancel integration test' directly aligns with the main change: fixing a flaky integration test for the cancel scenario by adding deterministic mocking and synchronization helpers.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/control-plane/test/integration/do-internal-routes.test.ts (1)

46-46: ⚡ Quick win

Consider including actual status in timeout error.

The error message could include the actual sandbox status to aid debugging when the timeout occurs.

🔍 Proposed enhancement to error message
+  const finalStatus = rows[0]?.status ?? "no row";
-  throw new Error("Timed out waiting for sandbox spawn");
+  throw new Error(`Timed out waiting for sandbox spawn (last status: ${finalStatus})`);
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/control-plane/test/integration/do-internal-routes.test.ts` at line
46, The timeout throw currently uses a static message ("Timed out waiting for
sandbox spawn"); update the Error construction in do-internal-routes.test.ts to
include the actual sandbox status/state when timing out (e.g., append the value
of the relevant status variable in scope such as status or sandbox.status) so
the thrown Error message reports both the timeout and the current sandbox status
for easier debugging.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/control-plane/test/integration/do-internal-routes.test.ts`:
- Around line 35-47: Extract the magic timeout literals in waitForSandboxSpawn
into named constants (e.g., SANDBOX_SPAWN_TIMEOUT_MS and
SANDBOX_POLL_INTERVAL_MS) defined once at the top of the file (after imports)
with the Ms suffix to indicate milliseconds, then replace the hardcoded 1000 and
10 in waitForSandboxSpawn with those constants; if these timeouts are needed
elsewhere, export the constants for reuse instead of repeating literals.

---

Nitpick comments:
In `@packages/control-plane/test/integration/do-internal-routes.test.ts`:
- Line 46: The timeout throw currently uses a static message ("Timed out waiting
for sandbox spawn"); update the Error construction in do-internal-routes.test.ts
to include the actual sandbox status/state when timing out (e.g., append the
value of the relevant status variable in scope such as status or sandbox.status)
so the thrown Error message reports both the timeout and the current sandbox
status for easier debugging.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cd256119-1d4b-4089-bac5-f4bae9600936

📥 Commits

Reviewing files that changed from the base of the PR and between 24ae9bf and df15c30.

📒 Files selected for processing (1)
  • packages/control-plane/test/integration/do-internal-routes.test.ts

Comment on lines +35 to +47
async function waitForSandboxSpawn(stub: DurableObjectStub): Promise<void> {
const deadline = Date.now() + 1000;

while (Date.now() < deadline) {
const rows = await queryDO<{ status: string }>(stub, "SELECT status FROM sandbox LIMIT 1");
if (rows[0]?.status === "connecting") {
return;
}
await new Promise((resolve) => setTimeout(resolve, 10));
}

throw new Error("Timed out waiting for sandbox spawn");
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Extract timeout literals to named constants with unit suffixes.

Lines 36 and 43 use literal timeout values (1000 and 10). As per coding guidelines, TypeScript timeout values should be defined as named constants with Ms suffix and defined once rather than restated as literals.

⏱️ Proposed fix to extract timeout constants

Add constants at the top of the file (after imports):

+const SANDBOX_SPAWN_TIMEOUT_MS = 1000;
+const SANDBOX_SPAWN_POLL_INTERVAL_MS = 10;
+
 const originalFetch = globalThis.fetch;

Then update the function:

 async function waitForSandboxSpawn(stub: DurableObjectStub): Promise<void> {
-  const deadline = Date.now() + 1000;
+  const deadline = Date.now() + SANDBOX_SPAWN_TIMEOUT_MS;
 
   while (Date.now() < deadline) {
     const rows = await queryDO<{ status: string }>(stub, "SELECT status FROM sandbox LIMIT 1");
     if (rows[0]?.status === "connecting") {
       return;
     }
-    await new Promise((resolve) => setTimeout(resolve, 10));
+    await new Promise((resolve) => setTimeout(resolve, SANDBOX_SPAWN_POLL_INTERVAL_MS));
   }
 
   throw new Error("Timed out waiting for sandbox spawn");
 }

As per coding guidelines: Use seconds for Python timeouts and milliseconds for TypeScript timeouts, encoding the unit in variable names (TypeScript: timeoutMs, TIMEOUT_MS). Define each default timeout value exactly once as a named constant and import it everywhere rather than restating literal values.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/control-plane/test/integration/do-internal-routes.test.ts` around
lines 35 - 47, Extract the magic timeout literals in waitForSandboxSpawn into
named constants (e.g., SANDBOX_SPAWN_TIMEOUT_MS and SANDBOX_POLL_INTERVAL_MS)
defined once at the top of the file (after imports) with the Ms suffix to
indicate milliseconds, then replace the hardcoded 1000 and 10 in
waitForSandboxSpawn with those constants; if these timeouts are needed
elsewhere, export the constants for reuse instead of repeating literals.

@ColeMurray ColeMurray merged commit e735131 into ColeMurray:main May 17, 2026
18 checks passed
ColeMurray added a commit that referenced this pull request May 17, 2026
## Summary

Follow-up to #643. The `waitForSandboxSpawn` helper polls for
`status = 'connecting'` every 10ms (1s deadline), but with the fetch
mock installed in #643 the `createSandbox` response resolves
synchronously — so on a fast runner the `connecting → running`
transition can complete between two polls. The loop would never
observe `connecting`, then throw `"Timed out waiting for sandbox
spawn"`.

This was the residual race I flagged at review time. Broadening the
predicate to accept either `connecting` or `running` removes it —
both states indicate the spawn has progressed past init, which is the
actual precondition the cancel test needs before driving
`status = 'active'`.

```diff
-    if (rows[0]?.status === "connecting") {
+    const status = rows[0]?.status;
+    if (status === "connecting" || status === "running") {
       return;
     }
```

## Test plan

- [x] `npm run test:integration -w @open-inspect/control-plane --
test/integration/do-internal-routes.test.ts` passes locally.
- [x] Ran the test file 30 times in a loop — 30/30 green.
- [x] `npm run typecheck -w @open-inspect/control-plane` clean.
- [x] `npm run lint -w @open-inspect/control-plane` clean.
- [x] `npx prettier --check` clean.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Tests**
* Improved integration test reliability by updating sandbox readiness
detection to recognize additional valid states before proceeding with
test execution.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/ColeMurray/background-agents/pull/651?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants