Skip to content

[browser][coreCLR] Enable download retry by default and improve retry sequencing#127559

Draft
pavelsavara wants to merge 1 commit intodotnet:mainfrom
pavelsavara:enableDownloadRetry
Draft

[browser][coreCLR] Enable download retry by default and improve retry sequencing#127559
pavelsavara wants to merge 1 commit intodotnet:mainfrom
pavelsavara:enableDownloadRetry

Conversation

@pavelsavara
Copy link
Copy Markdown
Member

Summary

Enables enableDownloadRetry by default in the new coreCLR WASM loader and improves the retry strategy so that retries only begin after all first-attempt downloads have been queued. This prevents retry storms from starving initial download slots during startup.

Fixes #124946

Changes

Retry sequencing (assets.ts)

  • Introduces an allDownloadsQueuedPCS promise completion source that gates retry attempts — no retry fires until every first-attempt download has been kicked off.
  • resolveAllDownloadsQueued() is called from createRuntime() in run.ts after all forEachResource / fetch calls are queued.
  • Adds dotnetLogger.debug messages (Retrying download / Retrying download (2)) so retries are observable in diagnostic tracing.

Error re-wrapping (assets.ts)

  • fetchBytes and fetchText now catch errors thrown by loadResource and re-throw a plain Error with the asset name and URL. This strips the .silent flag that loadBootResource callbacks may attach, ensuring the failure surfaces through the normal exit/error listeners instead of being silently swallowed.

Default config (config.ts)

  • Sets enableDownloadRetry to true when not explicitly configured, matching the legacy loader's default.

Tests (ModuleConfigTests.cs)

  • Re-enables the [InlineData(true)] case for DownloadProgressFinishes (was disabled via #124946).
  • Adds DownloadRetryRecoversFromFailure — a dedicated [Fact] that publishes in Release, triggers simulated assembly download failures, and asserts that:
    • Progress reaches "Finished" (retry recovered the download).
    • At least one "Retrying download" log appears.
    • No second retry ("Retrying download (2)") is needed (first retry succeeds because the test harness only fails the first attempt per resource).

@pavelsavara pavelsavara added this to the 11.0.0 milestone Apr 29, 2026
@pavelsavara pavelsavara self-assigned this Apr 29, 2026
Copilot AI review requested due to automatic review settings April 29, 2026 13:19
@pavelsavara pavelsavara added arch-wasm WebAssembly architecture area-Host os-browser Browser variant of arch-wasm labels Apr 29, 2026
@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara
See info in area-owners.md if you want to be subscribed.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enables download retry by default for the CoreCLR WASM loader and adjusts retry sequencing so second-attempt downloads don’t begin until the initial set of downloads has been queued, reducing the risk of retry storms during startup.

Changes:

  • Enable enableDownloadRetry by default in loader configuration.
  • Gate retry attempts on a “all first-attempt downloads queued” signal, and add diagnostic retry logging.
  • Re-enable and expand WASM build tests to validate download-retry recovery and progress completion.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
src/native/libs/Common/JavaScript/loader/run.ts Signals when initial asset download work has been queued, to gate retry attempts.
src/native/libs/Common/JavaScript/loader/config.ts Defaults enableDownloadRetry to true when not explicitly configured.
src/native/libs/Common/JavaScript/loader/assets.ts Adds gating for retry attempts and wraps resource-load exceptions so they aren’t silently suppressed.
src/mono/wasm/Wasm.Build.Tests/ModuleConfigTests.cs Re-enables retry-related coverage and adds a dedicated retry recovery test.

Comment on lines +340 to +346
let response: Response;
try {
response = await loadResource(asset);
} catch (err: any) {
// Strip .silent flag from download errors so they are properly reported via exit listeners
throw new Error(`Failed to load resource '${asset.name}' from '${asset.resolvedUrl}': ${err.message || err}`);
}
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fetchBytes, the catch block wraps and rethrows a new Error but drops the original exception details/stack (and doesn’t set cause). Since this file already uses new Error(..., { cause: err }) elsewhere (e.g., callLibraryInitializerOnRuntimeConfigLoaded), consider using the same pattern here while still stripping .silent. Also, err.message || err can yield [object Object]; prefer normalizing via err instanceof Error ? err.message : String(err) (or similar) for a stable message.

Copilot uses AI. Check for mistakes.
Comment on lines +360 to +366
let response: Response;
try {
response = await loadResource(asset);
} catch (err: any) {
// Strip .silent flag from download errors so they are properly reported via exit listeners
throw new Error(`Failed to load resource '${asset.name}' from '${asset.resolvedUrl}': ${err.message || err}`);
}
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same concern as fetchBytes: fetchText wraps and rethrows a new Error without preserving the original error as a cause, which makes debugging harder and loses the original stack. Consider adding cause: err (while still removing .silent) and normalizing the error message via err instanceof Error ? err.message : String(err) instead of err.message || err.

Copilot uses AI. Check for mistakes.
Comment on lines +135 to +136
// Signal that all first-attempt downloads have been queued.
// Retry logic waits on this before attempting second downloads.
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says this signals that “all first-attempt downloads have been queued”, but modulesAfterRuntimeReady JS module loads are initiated after this point when downloadOnly is false. Either adjust the comment to match what’s actually being gated (e.g., asset fetches queued via forEachResource/fetch*), or move the signal to after any additional first-attempt download work you intend to include.

Suggested change
// Signal that all first-attempt downloads have been queued.
// Retry logic waits on this before attempting second downloads.
// Signal that all first-attempt asset fetches queued above via forEachResource/fetch* have been queued.
// Retry logic waits on this before attempting second downloads for those asset fetches.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arch-wasm WebAssembly architecture area-Host os-browser Browser variant of arch-wasm

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[coreclr] DownloadProgressFinishes(failAssemblyDownload: True) times out

2 participants