Skip to content

[miniflare] Recover from corrupted @puppeteer/browsers cache in launchBrowser#13980

Merged
petebacondarwin merged 2 commits into
mainfrom
pbd/miniflare/recover-from-corrupted-chrome-cache
May 20, 2026
Merged

[miniflare] Recover from corrupted @puppeteer/browsers cache in launchBrowser#13980
petebacondarwin merged 2 commits into
mainfrom
pbd/miniflare/recover-from-corrupted-chrome-cache

Conversation

@petebacondarwin
Copy link
Copy Markdown
Contributor

@petebacondarwin petebacondarwin commented May 20, 2026

No tracked issue; follow-up to #13971.

What

When Miniflare's local Browser Run binding launches Chrome, it calls @puppeteer/browsers' install() to ensure the binary is present. If a previous install() was interrupted mid-extraction (test timeout, process kill, antivirus quarantine), the cache directory can be left partially populated — the folder exists but the executable inside it is missing. install() then throws on every subsequent call within the same process for the rest of the test session:

Error: The browser folder (C:\Users\runneradmin\AppData\Local\xdg.cache\.wrangler\chrome\win64-126.0.6478.182) exists but the executable (...\chrome.exe) is missing
    at installUrl (...@puppeteer/browsers/src/install.ts:309:15)
    at install (...@puppeteer/browsers/src/install.ts:157:18)
    at launchBrowser (.../packages/miniflare/src/plugins/browser-rendering/index.ts:141:35)

This was visible on a previous CI run for #13971 (job log) and recurs intermittently on Windows runners. With retry: 3 configured on the affected tests, all four attempts hit the same corrupted state and all four fail.

How

Wrap the install() call in launchBrowser with installWithCorruptedCacheRecovery, which:

  1. Calls install(options).
  2. If it throws and the message matches /The browser folder \((.+?)\) exists but the executable .+? is missing/, captures the corrupted path.
  3. Logs a warning, removeDirs the corrupted path, retries install() once.
  4. If cleanup itself fails, rethrows with a clear message asking for manual intervention.
  5. Any other install error rethrows unchanged.

The regex is matched against the literal error string from @puppeteer/browsers@2.10.6/src/install.ts:309-311. If @puppeteer/browsers ever rewords that error, this branch becomes a no-op (we'd rethrow the original error, just without recovery) — no risk of silently swallowing unrelated install failures.

Test plan

  • pnpm -F miniflare check:type
  • pnpm check:lint
  • pnpm check:format
  • pnpm -F miniflare test:ci test/plugins/browser/index.spec.ts ✅ — all 21 browser tests pass on the happy path
  • Manual recovery verification: deleted the chrome executable from the local wrangler cache (~/Library/Caches/.wrangler/chrome/mac_arm-126.0.6478.182/.../Google Chrome for Testing.app/Contents/MacOS/Google Chrome for Testing), then ran pnpm -F miniflare test:ci test/plugins/browser/index.spec.ts -t "it creates a browser session". The test passed — the install error was detected, the cache directory was cleared, and Chrome was re-downloaded on the retry.

Why no new tests

The recovery path is only hit when an external system (filesystem state, antivirus, OS) puts the cache into an invalid state. Writing a unit test that synthetically corrupts the cache would either (a) duplicate the @puppeteer/browsers state machine in mock code, or (b) actually invoke the real install() against a real cache, which makes it slow and flaky. The manual verification above exercises the real flow against a real corrupted cache, which is what we care about.


  • Tests
    • Tests included/updated
    • Automated tests not possible - manual testing has been completed as follows: deleted the chrome executable from the local wrangler cache, ran pnpm -F miniflare test:ci test/plugins/browser/index.spec.ts -t "it creates a browser session", confirmed the install error was detected, the cache directory was cleared, and Chrome was re-downloaded. See "Test plan" above.
    • Additional testing not necessary because:
  • Public documentation
    • Cloudflare docs PR(s):
    • Documentation not necessary because: internal recovery for a Miniflare-internal @puppeteer/browsers cache failure mode. No public API or user-facing behaviour change beyond a more helpful warning log when the recovery fires.

Open in Devin Review

…hBrowser

When Miniflare's local Browser Run binding launches Chrome, it calls
`@puppeteer/browsers`' `install()` to ensure the binary is present.
If a previous `install()` was interrupted mid-extraction (test timeout,
process kill, antivirus quarantine), the cache directory can be left
partially populated — the folder exists but the executable inside it
is missing. `install()` then throws `The browser folder (...) exists
but the executable (...) is missing` on every subsequent call within
the same process and the entire test session, breaking every later
Browser Run operation until the cache is manually cleared.

`launchBrowser` now catches that specific error, removes the corrupted
cache directory, and retries `install()` once. If the corruption
persists after cleanup, the original error is rethrown with a clearer
message.

This complements #13971, which surfaced the original error from inside
the binding worker. With that diagnostic in place and this self-healing
layer, the previously-intermittent "browser folder exists but
executable missing" failure mode should no longer fail an entire CI
run.

Manually verified by deleting the chrome executable from the local
wrangler cache, then running the browser tests — the install error is
detected, the cache directory is cleared, and Chrome is re-downloaded
on the retry.
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 20, 2026

🦋 Changeset detected

Latest commit: c5c1999

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 5 packages
Name Type
miniflare Patch
@cloudflare/pages-shared Patch
@cloudflare/vite-plugin Patch
@cloudflare/vitest-pool-workers Patch
wrangler Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-project-automation github-project-automation Bot moved this to Untriaged in workers-sdk May 20, 2026
@workers-devprod workers-devprod requested review from a team and NuroDev and removed request for a team May 20, 2026 08:04
@workers-devprod
Copy link
Copy Markdown
Contributor

workers-devprod commented May 20, 2026

Codeowners approval required for this PR:

  • ✅ @cloudflare/wrangler
Show detailed file reviewers

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 20, 2026

✅ All changesets look good

@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk Bot commented May 20, 2026

Now I have a full picture. Let me verify the type signature. The install function has overloads:

export function install(options: InstallOptions & {unpack?: true}): Promise<InstalledBrowser>;
export function install(options: InstallOptions & {unpack: false}): Promise<string>;

The wrapper function signature InstallOptions & { unpack?: true } correctly matches the first overload that returns Promise<InstalledBrowser>. Good.

Now let me verify the error message in the source matches the regex:

The error in @puppeteer/browsers is:

`The browser folder (${outputPath}) exists but the executable (${installedBrowser.executablePath}) is missing`

The regex is:

/The browser folder \((.+?)\) exists but the executable .+? is missing/

This correctly captures outputPath (the corrupted folder path) in group 1. The regex uses non-greedy .+? which is appropriate here.

Let me also verify that the error path captured (outputPath) is the correct directory to remove. Looking at the @puppeteer/browsers source, outputPath = cache.installationDir(...) and the check is if (existsSync(outputPath)) — so outputPath is the installation directory. Removing it would allow install() to re-download and re-extract, which is exactly what we want.

One thing I notice: the downloadProgressCallback parameter typing was changed from inferred to explicit (downloadedBytes: number, totalBytes: number). Looking at the upstream InstallOptions, the downloadProgressCallback type is:

downloadProgressCallback?: 'default' | ((downloadedBytes: number, totalBytes: number) => void);

The explicit annotations on the callback parameters are fine — they were likely needed because extracting the options object into a separate variable lost the contextual typing from install()'s parameter type. This is correct.

Let me also check for any potential issues with the removeDir call — specifically, whether it could accidentally remove something it shouldn't if the regex match is somehow wrong:

The regex captures the path between ( and ) in The browser folder (<path>) exists.... This path comes directly from @puppeteer/browsers's outputPath which is cache.installationDir(browser, platform, buildId) — a subdirectory under the cache dir. So it's safe to remove.

Let me also verify the error thrown in the cleanup failure case preserves the original error information:

throw new Error(
    `Failed to clear corrupted Chrome cache at ${corruptedPath} after detecting "${(e as Error).message}". Manual cleanup may be required.`,
    { cause: cleanupError }
);

This preserves the cleanup error as cause and includes the original error message in the text. However, it does not preserve the original error e as a cause — the cause is only the cleanupError. The original install error message is embedded in the string, but the original error object (with its stack trace) is lost.

Actually, looking more carefully — this is a minor concern. The original error message is fully quoted in the new error message string, which is sufficient for diagnostics. The cause is the cleanup error, which is the more actionable piece (why cleanup failed). This is reasonable.

Overall this is a clean, well-scoped change. The logic is sound, the recovery is single-retry (no infinite loops), non-matching errors pass through unchanged, and the regex degradation mode (if @puppeteer/browsers changes the error message) is a safe no-op.

LGTM

github run

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

Open in Devin Review

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented May 20, 2026

create-cloudflare

npm i https://pkg.pr.new/create-cloudflare@13980

@cloudflare/kv-asset-handler

npm i https://pkg.pr.new/@cloudflare/kv-asset-handler@13980

miniflare

npm i https://pkg.pr.new/miniflare@13980

@cloudflare/pages-shared

npm i https://pkg.pr.new/@cloudflare/pages-shared@13980

@cloudflare/unenv-preset

npm i https://pkg.pr.new/@cloudflare/unenv-preset@13980

@cloudflare/vite-plugin

npm i https://pkg.pr.new/@cloudflare/vite-plugin@13980

@cloudflare/vitest-pool-workers

npm i https://pkg.pr.new/@cloudflare/vitest-pool-workers@13980

@cloudflare/workers-editor-shared

npm i https://pkg.pr.new/@cloudflare/workers-editor-shared@13980

@cloudflare/workers-utils

npm i https://pkg.pr.new/@cloudflare/workers-utils@13980

wrangler

npm i https://pkg.pr.new/wrangler@13980

commit: c5c1999

Copy link
Copy Markdown
Contributor

@workers-devprod workers-devprod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codeowners reviews satisfied

@github-project-automation github-project-automation Bot moved this from Untriaged to Approved in workers-sdk May 20, 2026
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

- Stop spinner with a failure message if install throws after a
  download was already in progress, so the spinner does not stay
  spinning on error.
- Wrap the post-recovery retry install() error with a message that
  mentions the corruption-recovery context, preserving the original
  error via `Error.cause`.

Addresses Devin review comments on #13980.
@petebacondarwin petebacondarwin enabled auto-merge (squash) May 20, 2026 10:04
@petebacondarwin petebacondarwin merged commit e8c2031 into main May 20, 2026
63 of 65 checks passed
@petebacondarwin petebacondarwin deleted the pbd/miniflare/recover-from-corrupted-chrome-cache branch May 20, 2026 10:29
@github-project-automation github-project-automation Bot moved this from Approved to Done in workers-sdk May 20, 2026
petebacondarwin added a commit that referenced this pull request May 20, 2026
Browser Run tests in `packages/miniflare/test/plugins/browser/index.spec.ts`
and the `fixtures/browser-run` fixture call into `@puppeteer/browsers`
to ensure Chrome is downloaded into the global Wrangler cache. Every
CI run currently re-downloads ~150 MB of Chrome from scratch because
the cache directory is per-runner-instance and not shared between runs.

Add an `actions/cache@v4` step keyed on the OS + the Chrome version
hardcoded in `packages/miniflare/src/index.ts` (`126.0.6478.182")`,
restoring/saving `~/.cache/.wrangler/chrome` (Linux),
`~/Library/Caches/.wrangler/chrome` (macOS), and
`~/AppData/Local/xdg.cache/.wrangler/chrome` (Windows).

Benefits:
- Cuts ~150 MB and the associated download time off cold CI runs.
- Reduces the surface area for the intermittent partial-extraction
  race that surfaces as `The browser folder (...) exists but the
  executable (...) is missing` (see #13971 for the diagnostic that
  exposed this, #13980 for the in-process recovery layer). When the
  cache is warm and the binary is already extracted, this race can't
  fire at all because `install()` short-circuits.

The cache step runs for the `packages-and-tools` suite on all three
OSes and for the `fixtures` suite on macOS + Windows (the Browser Run
fixture is excluded on Ubuntu because of AppArmor).

When the Chrome version in `packages/miniflare/src/index.ts` changes,
the cache key here needs to be bumped manually. A miss only triggers a
fresh download — no functional impact.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants