fix: avoid Worker CPU limit (1043) in /api/status gateway start #342
Merged
andreasjansson merged 7 commits into main on Mar 29, 2026
ensureGateway's waitForPort blocks for up to 180s. Even with a 25s Promise.race timeout, the underlying RPC continues running and exhausts the Worker's 30s CPU limit (error 1043). Fix: add waitForReady option to ensureGateway. When false, it starts the process but returns immediately without waitForPort. The /api/status handler uses this — the loading page polls every 2s and subsequent polls check if the port is up via the existing process check.
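To illustrate why the 25s Promise.race timeout alone does not help, here is a minimal sketch showing that the losing promise in a race keeps running. `slowRpc` is a hypothetical stand-in for the non-cancellable waitForPort RPC, and the millisecond values are scaled down for illustration; none of this is the actual project code.

```typescript
// Hypothetical stand-in for a long-running, non-cancellable RPC such as waitForPort.
let rpcFinished = false;

function slowRpc(ms: number): Promise<string> {
  return new Promise((resolve) => {
    setTimeout(() => {
      rpcFinished = true; // the work completes even if nobody is awaiting it anymore
      resolve("port ready");
    }, ms);
  });
}

function timeout(ms: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve("timed out"), ms));
}

async function raceDemo(): Promise<{
  winner: string;
  finishedAtRace: boolean;
  finishedLater: boolean;
}> {
  // The timeout wins the race, but slowRpc is not cancelled.
  const winner = await Promise.race([slowRpc(60), timeout(20)]);
  const finishedAtRace = rpcFinished;
  // Wait long enough for the losing promise to run to completion.
  await new Promise((resolve) => setTimeout(resolve, 80));
  return { winner, finishedAtRace, finishedLater: rpcFinished };
}
```

The race only decides which result the caller sees. The work behind the losing promise still runs, which matches the behavior described above, where the RPC continues after the timeout fires.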
containerFetch blocks until the container responds, which can take 30-60s+ on cold start. The browser gets a blank page because the Worker times out before containerFetch returns. Add a 15s timeout for HTML requests — on timeout, the catch block serves the loading page.
If containerFetch returns headers but the body stream hangs (gateway partially initialized), httpResponse.text() blocks forever. The browser gets a blank page. Add a 10s timeout — on timeout, serve the loading page instead.
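The two timeouts described in the commits above (15s for the response, 10s for the body) could be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: `withTimeout`, `fetchHtmlOrLoadingPage`, `UpstreamResponse`, and `LOADING_PAGE` are hypothetical names, and `fetchUpstream` stands in for the real containerFetch call.

```typescript
// Hypothetical minimal shape of the upstream response; only text() is needed here.
interface UpstreamResponse {
  text(): Promise<string>;
}

const LOADING_PAGE = "<html><body>Starting gateway...</body></html>";

// Rejects if `promise` has not settled within `ms`. The underlying work is NOT cancelled.
async function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([promise, deadline]);
  } finally {
    clearTimeout(timer); // avoid leaving a stray timer behind
  }
}

async function fetchHtmlOrLoadingPage(
  fetchUpstream: () => Promise<UpstreamResponse>,
  headerTimeoutMs = 15000,
  bodyTimeoutMs = 10000,
): Promise<string> {
  try {
    // First timeout: waiting for the container to respond at all.
    const response = await withTimeout(fetchUpstream(), headerTimeoutMs);
    // Second timeout: headers arrived, but the body stream may hang.
    return await withTimeout(response.text(), bodyTimeoutMs);
  } catch (_err) {
    // Any failure, including a timeout, falls back to the loading page.
    return LOADING_PAGE;
  }
}
```

Note that a timeout of this kind only stops the caller from waiting. It does not cancel the underlying request.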
The 'base' variant in CI was showing the 'Configuration Required' error page instead of the loading page because E2E_TEST_MODE didn't skip env validation. The AI gateway keys may not be set for all variants. The validateRequiredEnv function already checks isTestMode for CF Access vars — extend the middleware to skip validation entirely in E2E mode, matching the existing behavior in dev mode.
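A rough sketch of the skip logic, assuming the modes are exposed as string-valued env vars set to "true". E2E_TEST_MODE comes from the commit message; `DEV_MODE`, `shouldSkipEnvValidation`, `missingRequiredEnv`, and the example key name are hypothetical and may not match the actual middleware.

```typescript
// Hypothetical env shape; the real Worker bindings are likely richer.
interface Env {
  [key: string]: string | undefined;
}

// Assumption: both modes are signalled with the literal string "true".
function shouldSkipEnvValidation(env: Env): boolean {
  return env.DEV_MODE === "true" || env.E2E_TEST_MODE === "true";
}

// Returns the names of required vars that are unset, or [] when validation is skipped.
function missingRequiredEnv(env: Env, required: string[]): string[] {
  if (shouldSkipEnvValidation(env)) return [];
  return required.filter((name) => !env[name]);
}

// Example: a CI variant without AI gateway keys no longer trips validation.
const ciEnv: Env = { E2E_TEST_MODE: "true" };
const prodEnv: Env = {};
const required = ["EXAMPLE_AI_GATEWAY_KEY"]; // hypothetical variable name
const ciMissing = missingRequiredEnv(ciEnv, required);
const prodMissing = missingRequiredEnv(prodEnv, required);
```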
The blank page was caused by containerFetch hanging when the gateway wasn't ready. Even with timeouts, the uncancelled background RPC would exhaust the Worker CPU limit. Fix: for HTML requests, check if the gateway process exists first (3s timeout on findExistingGatewayProcess). If not running, serve the loading page immediately without calling containerFetch. The loading page handles polling, probing, and reloading. This completely avoids calling containerFetch when the gateway isn't ready, eliminating both the blank page and CPU limit issues.
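The process-first check described in this commit might look something like the sketch below. The names findExistingGatewayProcess and containerFetch come from the commit message, but their signatures here are assumptions, and `GatewayDeps`, `resolveWithin`, `handleHtmlRequest`, and `LOADING_PAGE` are hypothetical helpers introduced for illustration.

```typescript
// Assumed signatures; the real functions likely take more arguments.
interface GatewayDeps {
  findExistingGatewayProcess: () => Promise<object | null>;
  containerFetch: () => Promise<string>;
}

const LOADING_PAGE = "<html><body>Starting gateway...</body></html>";

// Resolves with `fallback` if `promise` has not settled within `ms`.
async function resolveWithin<T>(promise: Promise<T>, ms: number, fallback: T): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  try {
    return await Promise.race([promise, deadline]);
  } finally {
    clearTimeout(timer);
  }
}

async function handleHtmlRequest(deps: GatewayDeps, processCheckTimeoutMs = 3000): Promise<string> {
  let proc: object | null = null;
  try {
    // Cheap check first; a slow or failing check is treated as "not running".
    proc = await resolveWithin(deps.findExistingGatewayProcess(), processCheckTimeoutMs, null);
  } catch (_err) {
    proc = null;
  }
  // No gateway process: serve the loading page and never call containerFetch.
  if (proc === null) return LOADING_PAGE;
  return deps.containerFetch();
}
```

The key property is that containerFetch is never invoked unless a gateway process was found, so no long-running call is left behind when the gateway is not ready.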

Summary
Fixes the intermittent error 1043 (Worker exceeded CPU time limit) that causes the blank page timeout in CI.
Root cause
/api/status called ensureGateway() synchronously with a 25s Promise.race timeout. But ensureGateway internally calls waitForPort(timeout: 180s), which is an RPC that can't be cancelled. Even after the Promise.race timeout fires, the waitForPort RPC continues running in the background, exhausting the Worker's 30s CPU limit and causing error 1043. After a 1043, the Worker is completely unresponsive and the browser gets a blank page.
Fix
Add a waitForReady option to ensureGateway(). When false, it starts the gateway process (fast RPC, ~2-5s) but skips the waitForPort step. /api/status uses waitForReady: false so it returns quickly. The loading page polls every 2s; subsequent polls find the running process and check port readiness via the existing health check.
Other callers (crash retry, non-HTML catch-all) continue using waitForReady: true (the default) since they need to proxy immediately after the gateway starts.
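As a rough sketch of the option's shape (not the actual diff), the following shows one way the flag could gate the wait. The option name waitForReady and the 180s timeout come from the description above. `GatewayRuntime`, `GatewayProcess`, `startGatewayProcess`, and the exact waitForPort signature are assumptions made for illustration.

```typescript
interface EnsureGatewayOptions {
  waitForReady?: boolean; // defaults to true, matching the description above
}

// Hypothetical abstractions over the container API.
interface GatewayProcess {
  waitForPort(timeoutMs: number): Promise<void>;
}

interface GatewayRuntime {
  findExistingGatewayProcess(): Promise<GatewayProcess | null>;
  startGatewayProcess(): Promise<GatewayProcess>;
}

async function ensureGateway(
  runtime: GatewayRuntime,
  options: EnsureGatewayOptions = {},
): Promise<GatewayProcess> {
  const waitForReady = options.waitForReady !== false;

  // Reuse a running process if there is one; otherwise start it (the fast RPC).
  const existing = await runtime.findExistingGatewayProcess();
  const proc = existing !== null ? existing : await runtime.startGatewayProcess();

  // Only callers that must proxy immediately pay for the long, uncancellable wait.
  if (waitForReady) {
    await proc.waitForPort(180000);
  }
  return proc;
}
```

With this shape, a status handler would call `ensureGateway(runtime, { waitForReady: false })` and rely on the loading page's next poll to observe readiness, while other callers keep the default.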