Clear stale SW caches before reloading on ChunkLoadError #92339
Both the manual Refresh path (usePageRefresh) and the automatic lazyRetry path were calling window.location.reload() without first clearing the Workbox service worker caches. In Safari PWA standalone mode the new SW (skipWaiting + clientsClaim) intercepts the reload and re-serves the same stale precached shell, reproducing the identical ChunkLoadError on every refresh attempt. Route both reloads through clearWorkboxRecoveryCaches() first, which unregisters the SW and deletes all Cache Storage so the reload fetches a fresh, internally-consistent shell from the CDN. Adds regression tests for both paths.
|
🚧 @mjasikowski has triggered a test Expensify/App build. You can view the workflow run here. |
|
@codex review |
|
Codex Review: Didn't find any major issues. 🎉
ℹ️ About Codex in GitHub: Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you open a pull request for review, mark a draft as ready, or comment "@codex review". If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
|
@MelvinBot create an issue for this |
|
🤖 Created an issue for this: #92407 It describes the Safari PWA |
index.html is served with Cache-Control: max-age=86400, so after the service worker and Cache Storage are cleared, a plain window.location.reload() can still be handed a stale shell from Safari's HTTP cache (or the CDN edge) that references dead chunk hashes, keeping the loop alive on Safari PWA. Add reloadWithCacheBust(), which navigates to a unique URL via location.replace so the document misses the HTTP/edge cache and is fetched fresh, and use it in both recovery paths after clearing caches. Falls back to a plain reload if the navigation throws.
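A minimal sketch of the cache-busting reload this commit describes, assuming the unique URL is produced by appending a `forceReload` query parameter. `LocationLike`, `buildCacheBustUrl`, and the `token` argument are hypothetical names introduced so the logic can run outside a browser; the PR's actual `reloadWithCacheBust()` may be structured differently.

```typescript
// Hypothetical stand-in for the parts of window.location used below.
type LocationLike = {
    href: string;
    replace: (url: string) => void;
    reload: () => void;
};

// Append a unique forceReload value, keeping any existing query string and hash.
function buildCacheBustUrl(href: string, token: string): string {
    const hashIndex = href.indexOf('#');
    const base = hashIndex === -1 ? href : href.slice(0, hashIndex);
    const hash = hashIndex === -1 ? '' : href.slice(hashIndex);
    const separator = base.indexOf('?') === -1 ? '?' : '&';
    return `${base}${separator}forceReload=${encodeURIComponent(token)}${hash}`;
}

// Navigate to the unique URL; if the navigation throws, fall back to a plain reload.
function reloadWithCacheBust(loc: LocationLike, token: string = String(Date.now())): void {
    try {
        loc.replace(buildCacheBustUrl(loc.href, token));
    } catch {
        loc.reload();
    }
}
```

In a browser this would presumably be called as `reloadWithCacheBust(window.location)`. Using `replace` rather than assigning `href` keeps the cache-busted URL from adding an extra history entry.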
|
🚧 @mjasikowski has triggered a test Expensify/App build. You can view the workflow run here. |
Codecov Report: ❌ Looks like you've decreased code coverage for some files. Please write tests to increase, or at least maintain, the existing level of code coverage. See our documentation here for how to interpret this table.
|
The service-worker precache is the confirmed source of the stale shell, and clearWorkboxRecoveryCaches() before a plain reload addresses it. A reload navigation already bypasses Safari's local HTTP cache (cache-mode "reload"), so the cache-busting URL added little beyond defeating a stale CDN edge entry (better handled by edge no-cache headers) while leaving a forceReload query param in the address bar. Revert it back to a plain window.location.reload().
A chunk load can fail from a transient network blip, not just a stale post-deploy shell. Routing the first failure through clearWorkboxRecoveryCaches() nuked all caches and re-precached the full app unnecessarily for a problem a plain reload would fix. First failure: plain window.location.reload() (cheap, handles blips). Second failure: clearWorkboxRecoveryCaches() then reload (handles the stale SW precache scenario where the plain reload did not help). lazyRetry no longer rejects on second failure - it always reloads.
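The two-step policy above can be sketched as a small function. This is an illustration rather than the PR's `lazyRetry`: `StorageLike`, `RecoveryActions`, `RETRY_FLAG_KEY`, and `recoverFromChunkFailure` are hypothetical names, and the real code works with `sessionStorage` and `clearWorkboxRecoveryCaches()` directly.

```typescript
// Hypothetical stand-ins so the policy can be exercised outside a browser.
type StorageLike = {
    getItem: (key: string) => string | null;
    setItem: (key: string, value: string) => void;
};

type RecoveryActions = {
    plainReload: () => void;
    clearCachesThenReload: () => void;
};

const RETRY_FLAG_KEY = 'retryLazyRefreshed'; // hypothetical key name

function recoverFromChunkFailure(storage: StorageLike, actions: RecoveryActions): 'plain-reload' | 'clear-caches-then-reload' {
    const alreadyReloaded = storage.getItem(RETRY_FLAG_KEY) === 'true';
    if (!alreadyReloaded) {
        // First failure: possibly a transient network blip, so try the cheap fix.
        storage.setItem(RETRY_FLAG_KEY, 'true');
        actions.plainReload();
        return 'plain-reload';
    }
    // Second failure: the plain reload did not help, so assume a stale SW precache.
    actions.clearCachesThenReload();
    return 'clear-caches-then-reload';
}
```

Keeping the flag in session storage is what lets the decision survive the reload between the first and second failure.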
|
🚧 @mjasikowski has triggered a test Expensify/App build. You can view the workflow run here. |
window.location.reload();
sessionStorage.removeItem(CONST.SESSION_STORAGE_KEYS.LAST_REFRESH_TIMESTAMP);
clearWorkboxRecoveryCaches().then(() => window.location.reload());
If caches.keys or navigator.serviceWorker.getRegistrations fails on Safari (the exact platform this bug targets), this promise never resolves, so the reload won't happen. Is this expected?
Both of them are wrapped in try/catch, so the promise will resolve
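To illustrate why the promise still resolves when those calls fail, here is a minimal sketch of a clear function whose steps are each wrapped in try/catch. It is not the PR's `clearWorkboxRecoveryCaches()`: the function name and the `*Like` types are hypothetical, and the Cache Storage and service worker objects are passed in as parameters so the sketch can run outside a browser.

```typescript
// Hypothetical minimal shapes of the browser objects involved.
type CacheStorageLike = {
    keys: () => Promise<string[]>;
    delete: (name: string) => Promise<boolean>;
};
type RegistrationLike = {unregister: () => Promise<boolean>};
type ServiceWorkerLike = {getRegistrations: () => Promise<readonly RegistrationLike[]>};

async function clearRecoveryCachesSketch(cacheStorage: CacheStorageLike, serviceWorker: ServiceWorkerLike): Promise<void> {
    try {
        const registrations = await serviceWorker.getRegistrations();
        await Promise.all(registrations.map((registration) => registration.unregister()));
    } catch {
        // Swallowed: a failed unregister must not block the reload that follows.
    }
    try {
        const names = await cacheStorage.keys();
        await Promise.all(names.map((name) => cacheStorage.delete(name)));
    } catch {
        // Swallowed: a failed cache delete must not block the reload that follows.
    }
}
```

Because both steps catch their own errors, a chained `.then(() => window.location.reload())` runs whether or not the cleanup succeeded.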
function isChunkLoadError(error: unknown): boolean {
    if (!(error instanceof Error)) {
        return false;
    }
    return error.name === CONST.CHUNK_LOAD_ERROR || /Loading chunk \S+ failed/i.test(error.message);
}
This function is duplicated. Consider extracting a shared function and using it everywhere.
App/src/components/LazyModalSlot.tsx
Line 6 in 321ab18
Also here:
One minor difference is [\d]+ vs \S+. The latter seems better.
Extracted isChunkLoadError.ts to a separate file
window.location.reload();
sessionStorage.removeItem(CONST.SESSION_STORAGE_KEYS.LAST_REFRESH_TIMESTAMP);
clearWorkboxRecoveryCaches().then(() => window.location.reload());
When isChunkLoadError is false but the user hits Refresh twice within ERROR_WINDOW_RELOAD_TIMEOUT, it now falls through to clearWorkboxRecoveryCaches() here even for non-chunk errors. Hope this is fine.
Changed it so clearWorkboxRecoveryCaches() is now only called when isChunkLoadError is true. Non-chunk errors use a plain window.location.reload(). Also added a test to cover that scenario.
… path
isChunkLoadError was defined three times with slight differences (LazyModalSlot.tsx, GenericErrorPage.tsx, lazyRetry.ts). Extract it to src/libs/isChunkLoadError.ts and import from there in all three places. Uses CONST.CHUNK_LOAD_ERROR and the broader \S+ pattern everywhere.
usePageRefresh was calling clearWorkboxRecoveryCaches() for all reload paths, including non-chunk errors. When Refresh was clicked twice within ERROR_WINDOW_RELOAD_TIMEOUT for a generic render error, the fall-through branch would unnecessarily clear the service worker cache. Gate the cache-clear on isChunkLoadError and use a plain reload otherwise.
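A minimal sketch of the shared helper, adapted from the snippet under review, showing why the broader `\S+` pattern matters. `CHUNK_LOAD_ERROR` is assumed here to be the string `'ChunkLoadError'` (the real value lives in `CONST`), and `isNumericChunkLoadError` is a hypothetical helper that exists only to contrast the narrower `[\d]+` pattern.

```typescript
const CHUNK_LOAD_ERROR = 'ChunkLoadError'; // assumed value of CONST.CHUNK_LOAD_ERROR

function isChunkLoadError(error: unknown): boolean {
    if (!(error instanceof Error)) {
        return false;
    }
    return error.name === CHUNK_LOAD_ERROR || /Loading chunk \S+ failed/i.test(error.message);
}

// Narrower variant, for comparison only: matches numeric chunk ids but not named ones.
function isNumericChunkLoadError(error: unknown): boolean {
    return error instanceof Error && /Loading chunk [\d]+ failed/i.test(error.message);
}

const numericChunkError = new Error('Loading chunk 42 failed.');
const namedChunkError = new Error('Loading chunk src_pages_Example failed.');
```

Both helpers accept `numericChunkError`, but only `isChunkLoadError` accepts `namedChunkError`, so the `\S+` version also covers builds whose chunk ids are not purely numeric.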
Reviewer Checklist
Screenshots/Videos
Android: HybridApp: android.mov
Android: mWeb Chrome: mchrome.mov
iOS: HybridApp: ios.mov
iOS: mWeb Safari: msafari.mov
MacOS: Chrome / Safari: safari.mov, web.mov
|
Please merge main to fix failing tests |
|
@MelvinBot merge main |
Co-authored-by: Michał Jasikowski <mjasikowski@users.noreply.github.com>
|
🤖 Merged |
Regression Analysis 🔴 1.
| | Before | After |
|---|---|---|
| Land on error page | flag stays 'true' | flag reset to 'false' |
| User taps Refresh → reload | next lazyRetry → 'true' branch → immediate reject, 1 reload total | INITIAL → reload, RELOADED → clear caches + reload, then reject → 2 automatic reloads + a cache wipe per click |
So every manual refresh now triggers a 2-reload dance and another Cache Storage wipe before the error page returns, versus a single reload before.
The reset is also redundant for recovery — the success branch already sets INITIAL on the next good import.
So reset-on-reject buys nothing for the happy path and only weakens loop-prevention.
Recommend dropping the reset in the else branch (keep CACHE_CLEARED latched, like the old code latched 'true').
Summary: Lower-severity UX/cache-churn regression with a clean fix (don't reset on reject)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: aabcf88d5c
.then((component) => {
    // Reset the retry status to 'false' on successful import
    sessionStorage.setItem(CONST.SESSION_STORAGE_KEYS.RETRY_LAZY_REFRESHED, 'false'); // success so reset the refresh
    sessionStorage.setItem(CONST.SESSION_STORAGE_KEYS.RETRY_LAZY_REFRESHED, RETRY_STATE.INITIAL);
Preserve retry state across unrelated lazy successes
When a nested lazy import fails after the first reload, another lazy import can succeed first and reset the global RETRY_LAZY_REFRESHED flag here. In the inspected navigation path, AppNavigator is lazy-loaded before AuthScreens/PublicScreens, so a stale AuthScreens chunk can be treated as a first failure again after AppNavigator succeeds, meaning the new cache-clearing branch never runs and the app can keep plain-reloading against the stale SW cache. Keep this retry state until the failing import recovers, or scope it per chunk/import.
// The error page is shown after lazyRetry has already done a plain reload
// and it did not fix the problem. Clear the service worker cache so the
// next load fetches a fresh app shell from the CDN.
clearWorkboxRecoveryCaches().then(() => window.location.reload());
Avoid clearing offline precache from Refresh
When lazyRetry rejects a second ChunkLoadError because navigator.onLine is false, the error page still calls refreshPage(true). Pressing Refresh while still offline now unconditionally deletes Cache Storage and unregisters the service worker before reloading, removing the offline app shell that the lazyRetry offline guard deliberately preserved. Gate this cache clear on being online, or fall back to the previous reload/reset behavior while offline.
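The guard suggested here could be sketched as a pure decision function. `chooseRefreshAction` and its parameters are hypothetical; in the app the inputs would presumably come from `isChunkLoadError` and `navigator.onLine`.

```typescript
type RefreshAction = 'clear-caches-then-reload' | 'plain-reload';

function chooseRefreshAction(isChunkError: boolean, isOnline: boolean): RefreshAction {
    // Only wipe the service worker and Cache Storage when that can actually help:
    // the failure is a chunk load error and a fresh shell can be fetched.
    if (isChunkError && isOnline) {
        return 'clear-caches-then-reload';
    }
    // Offline, the cached shell is all the PWA has, and for non-chunk errors
    // a cache wipe is unnecessary, so fall back to a plain reload.
    return 'plain-reload';
}
```

Keeping the decision separate from the side effects makes the offline case easy to cover in a unit test without mocking Cache Storage.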
|
🤖 Code review: 7 files, +275/−25. The fix is well-structured: the three-state …
🟠 The manual Refresh button clears the SW cache while offline, the exact scenario the automatic path guards against
The automatic path only clears the service-worker cache when online, and the code comment calls this guard "critical": clearing it offline "would destroy the cached app shell that is the only thing keeping the PWA usable until connectivity returns". But the manual Refresh button has no such guard. This is reachable end-to-end: when a chunk fails offline, …
Suggested fix: mirror the guard on the manual path, e.g. …
Smaller considerations (non-blocking) …
Static review only: I did not run the app or the test suite. The headline finding is verified by reading the code paths; the items above are lower-confidence. |
situchan left a comment
Looks good.
Please check and address above comments if applicable.
|
@mjasikowski let me know once you resolve those last comments, thanks! |
- Scope the retry state by a per-import key so a sibling chunk's success (e.g. AppNavigator resolving) can no longer reset the retry flag of a still-failing chunk (e.g. AuthScreens), which otherwise restarted that chunk's cycle forever instead of reaching the cache-clearing branch. Callers now pass a stable retryKey.
- Gate usePageRefresh's cache clear on navigator.onLine. When lazyRetry rejects a second ChunkLoadError offline, tapping Refresh would otherwise wipe the cached app shell that the offline guard deliberately preserved.
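A minimal sketch of per-import retry state combined with the online gate, under stated assumptions. The state names INITIAL, RELOADED, and CACHE_CLEARED appear in this PR, but their string values, the storage-key prefix, and the helper names below are hypothetical. The sketch also leaves the state unchanged on reject, which is the behavior the PR settles on in a later commit.

```typescript
const RETRY_STATE = {INITIAL: 'initial', RELOADED: 'reloaded', CACHE_CLEARED: 'cacheCleared'} as const; // values assumed
type RetryState = (typeof RETRY_STATE)[keyof typeof RETRY_STATE];
type RetryAction = 'plain-reload' | 'clear-caches-then-reload' | 'reject';
type StateStore = {[storageKey: string]: RetryState}; // stand-in for sessionStorage

// One storage entry per import, so siblings cannot reset each other (prefix is hypothetical).
function retryStorageKey(retryKey: string): string {
    return `retryLazyRefreshed_${retryKey}`;
}

function nextRetryStep(state: RetryState, isOnline: boolean): {action: RetryAction; nextState: RetryState} {
    if (state === RETRY_STATE.INITIAL) {
        // Assumption: the first plain reload is not gated on connectivity.
        return {action: 'plain-reload', nextState: RETRY_STATE.RELOADED};
    }
    if (state === RETRY_STATE.RELOADED && isOnline) {
        return {action: 'clear-caches-then-reload', nextState: RETRY_STATE.CACHE_CLEARED};
    }
    // Offline after a reload, or caches already cleared: hand over to the error boundary.
    return {action: 'reject', nextState: state};
}

function recordFailure(store: StateStore, retryKey: string, isOnline: boolean): RetryAction {
    const key = retryStorageKey(retryKey);
    const step = nextRetryStep(store[key] ?? RETRY_STATE.INITIAL, isOnline);
    store[key] = step.nextState;
    return step.action;
}

function recordSuccess(store: StateStore, retryKey: string): void {
    store[retryStorageKey(retryKey)] = RETRY_STATE.INITIAL;
}
```

With this shape, a successful `AppNavigator` import only resets its own entry, so a still-failing `AuthScreens` import keeps advancing toward the cache-clearing step.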
…chal-fix-safari-crash
|
@codex review |
|
Codex Review: Didn't find any major issues. Another round soon, please! |
|
@mountiny I think we're good here |
Regression review
No crashing/loop regressions found: the new retry state machine is bounded (max 2 automatic reloads per import key, then it rejects to the error boundary), all 3 …
1.
|
- Do not reset the retry flag to INITIAL when lazyRetry exhausts recovery and rejects to the error boundary. Resetting made a later failure of the same import restart the full reload cycle; leaving the advanced state means it fails fast (already cache-cleared) or retries the cache clear once back online. A successful import still resets the flag.
- Add a .catch to clearWorkboxRecoveryCaches() before reload in both call sites so the reload still fires if that function is ever changed to reject (today it cannot, but the coupling was fragile).
clearWorkboxRecoveryCaches()
    .catch(() => undefined)
    .then(() => window.location.reload());
I think this catch will never run, since errors are already caught inside clearWorkboxRecoveryCaches.
clearWorkboxRecoveryCaches already swallows errors from caches.keys/delete and serviceWorker.getRegistrations/unregister internally, so the promise it returns never rejects. The .catch(() => undefined) was dead code.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
✋ This PR was not deployed to staging yet because QA is ongoing. It will be automatically deployed to staging after the next production release. |
|
🚧 @mountiny has triggered a test Expensify/App build. You can view the workflow run here. |
Explanation of Change
Fixed Issues
$ #92407
PROPOSAL:
Tests
(if the QA fails and it breaks, run the following script to fix)
Offline tests
QA Steps
Same as tests