
Allow on-demand precaching for cache recovery #2858

Closed
justinbayctxs opened this issue Jun 2, 2021 · 4 comments · Fixed by #2921
Labels
Discuss An open question, where input from the community would be appreciated. workbox-precaching

Comments

@justinbayctxs

Library Affected:
workbox-precaching

Browser & Platform:
All

Issue or Feature Request Description:
In Workbox v6, PrecacheController changed such that precaching can only be performed during certain SW lifecycle events. This unfortunately breaks a use case that was supported in v5 and before -- cache recovery. If SW cache entries are removed by a browser while the SW stays active, offline mode will not work even though the SW seems healthy. We were doing cache validation based on the embedded manifest during app startup, and triggering a precache run with controller.install() if any entries from the manifest were missing from the cache.

A workaround for this case is to unregister the service worker such that reinstallation will trigger precaching, but critically, this won't happen until the next application load, which is problematic for long-lived app sessions.

A top-level API like validateAndRecoverPrecache next to precacheAndRoute would be ideal, but barring that, restoring ad-hoc usage of the PrecacheController's installer would also work.

@jeffposnick
Contributor

A few random thoughts:

  • Do you know why your users end up with inconsistent precaches? Is this due to users explicitly messing around with things in DevTools, or does it happen organically for some reason? FWIW, the methods I'm aware of for clearing site storage will also unregister any service workers.

  • Performing operations each time the service worker starts up (which I'm assuming is what you mean by "during app startup") can end up negatively impacting your web app's performance, especially for late-loaded subresource requests. I don't know how long your revalidation logic takes, but it would need to run to completion before any fetch handlers get a chance to respond to network events when a service worker is revived after the initial page load.

  • I've been thinking more about enabling subresource integrity by default in our build tools with Workbox v7, with an opt-out mode for deployment scenarios in which it really wouldn't work. One of the things that would open the door for is, if there's a precache miss (due to a deleted entry, etc.) and the user is online, using the network fallback response to repopulate the cache—assuming the network response has the expected SRI hash. We could actually enable this behavior prior to v7 if you think it would be useful, but it would be your responsibility to add in the SRI hashes to the manifest at build time.

  • Finally, I don't necessarily recommend it, but if you really wanted to implement the same logic that you were previously using to "heal" the precache each time the service worker started up, it shouldn't be too difficult. You'd find the missing entries, use fetch() to retrieve the latest responses from the server (hopefully getting the expected version back if there's been a redeployment in the interim...), and then use cache.put() to store each response in the precache, with the appropriate ?__WB_REVISION__=<revision> query string added if the manifest entry has a revision field that isn't null. That's basically what's happening inside of the install handler anyway. I could share some actual code that would do this if you really need it, but since I'm not sure it's a good idea, I'm not going to write it up now.
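[Editor's note] The steps in that last bullet can be sketched as follows. This is not code from the thread; the precache name, base URL, and manifest shape are assumptions, and only the `?__WB_REVISION__` cache-key convention comes from the description above.

```javascript
// Hypothetical cache name; Workbox derives the real precache name from a
// configurable prefix/suffix plus the registration scope.
const PRECACHE_NAME = 'workbox-precache-v2-https://example.com/';

// Mirror the cache-key convention described above: append
// ?__WB_REVISION__=<revision> when the manifest entry's revision is non-null.
function toCacheKey(entry, baseUrl = 'https://example.com/') {
  const url = new URL(entry.url, baseUrl);
  if (entry.revision) {
    url.searchParams.set('__WB_REVISION__', entry.revision);
  }
  return url.href;
}

// Find manifest entries missing from the precache and re-fetch them.
// Intended to run in a service worker, where `caches` and `fetch` are globals.
async function healPrecache(manifest) {
  const cache = await caches.open(PRECACHE_NAME);
  for (const entry of manifest) {
    const cacheKey = toCacheKey(entry);
    if (await cache.match(cacheKey)) continue; // entry still present
    // 'reload' bypasses the HTTP cache, so this gets the server's current
    // copy (which may be newer than the revision in the manifest).
    const response = await fetch(entry.url, {cache: 'reload'});
    if (response.ok) {
      await cache.put(cacheKey, response);
    }
  }
}
```

As Jeff cautions, this duplicates what the install handler already does, so treat it as a stopgap rather than a supported pattern.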

@jeffposnick added the Discuss and workbox-precaching labels Jun 2, 2021
@justinbayctxs
Author

justinbayctxs commented Jun 3, 2021

Thanks for the considered response! There are some interesting things here.

  1. So far exposure to this problem is quite limited, but it was not due to explicit end-user action. My understanding is that browsers are free to nuke "origin storage" as they see fit, so on low-spec devices some kind of quota restriction may presumably be hit more often. Critically, I'm not aware of anything tying a storage clear by the browser to SW uninstallation.
    I would also note this was mostly observed in a fairly exotic environment -- a WebKit webview in a native app, running on Linux.
  2. Yes, I've seen that revalidation can take a few seconds, though it at least runs off the main thread. We've positioned it roughly after initial app load -- the intent is that each "app session" heals itself for offline capability, but we don't intend it to block anything at the moment. Whether and how to notify the user that their app might not be ready for offline usage is far from a solved problem for us.
  3. We are actually doing resource integrity checks as indicated in the doc :) I think healing the precache as requests are made may be helpful, but we are looking primarily for a periodic full revalidation of the cached app assets.
  4. This did occur to me, but I was slightly put off by the complexity of the precache controller and its strategy handlers (as far as trying to replicate its functionality). From your description though it sounds like the only additional contract of the workbox precache to be concerned about matching is the "revision" query string bit... is that safe to say?
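[Editor's note] One way to get the "after initial app load" positioning mentioned in point 2, without running the check at every service worker startup, is to let the page trigger it via postMessage once the page has finished loading. A minimal sketch under stated assumptions: the message type, cache name, and helper names are invented, and the missing-entry check takes the cache as a parameter so the logic itself is plain.

```javascript
// Return the cache keys from `expectedKeys` that are absent from `cache`.
// `cache` is a Cache (or anything with an async match()).
async function findMissingEntries(expectedKeys, cache) {
  const missing = [];
  for (const key of expectedKeys) {
    if (!(await cache.match(key))) {
      missing.push(key);
    }
  }
  return missing;
}

// Service worker side: run the check only when the page asks for it.
// Guarded so this file can also be evaluated outside a worker scope.
if (typeof self !== 'undefined' && typeof self.addEventListener === 'function') {
  self.addEventListener('message', (event) => {
    if (event.data && event.data.type === 'REVALIDATE_PRECACHE') {
      // waitUntil() keeps the worker alive until the check finishes,
      // without delaying any in-flight fetch handlers.
      event.waitUntil(
        caches.open('my-precache').then((cache) =>
          findMissingEntries(event.data.expectedKeys, cache)),
      );
    }
  });
}
```

From the page, after load, something like `navigator.serviceWorker.controller.postMessage({type: 'REVALIDATE_PRECACHE', expectedKeys})` would kick it off. Note that the single-threaded caveat Jeff raises below still applies while the check runs.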

@jeffposnick
Contributor

Sure—this is a complex space where some browser environments might not always behave as expected, so it's good to talk it through.

  1. The service worker spec isn't normative about this, but a couple of its sections cover the recommended behaviors for browsers—basically, service worker registrations should be deleted when a user explicitly clears out an origin's storage, and individual entries in the Cache Storage API shouldn't just disappear unless explicitly deleted. But sure, I could imagine how a WebKit WebView might end up a bit funky.

  2. While revalidation takes place off the main thread, it's unfortunately taking place on the thread that's responsible for generating responses to all of your network requests. So if your code runs outside of an event handler, it will execute each time the service worker starts up after a period of idleness, introducing a few seconds of latency before responding to any (late-loaded) subresource requests. Maybe you've already worked around this by triggering your revalidation code to run inside of a fetch handler after responding to a navigation request, but because the service worker is single-threaded, even that could delay loading the critical early subresource requests.

  3. I think I might add in that auto-repair functionality anyway, just because using subresource integrity works around the one reason we had for not doing auto-repair. But yes, I understand that you'd like to pre-emptively address this, rather than wait for runtime.

  4. There's a lot of extra stuff going on in the PrecacheStrategy, around reporting what gets installed, handling both caching and serving, and more. But if you really wanted to do this repair stuff yourself, using getCacheKeyForURL() to find the appropriate cache key and calling cache.put() to add it should be "okay". An alternative approach, if you're really worried about this, might be to include a timestamp within your service worker file and redeploy it each day/week (depending on your preferred cadence). The updated timestamp will trigger a new service worker installation, even if nothing has changed in the precache manifest, and if there is a client with an incomplete precache, it will automatically be populated when the new service worker is installed. Clients that already have a complete precache will effectively have a no-op install event. The benefit of this approach (which, again, shouldn't be generally necessary) is that the repopulation will happen in a separate thread for the installing service worker, and won't block network requests handled by the currently active service worker.
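[Editor's note] A minimal sketch of the getCacheKeyForURL()-plus-cache.put() repair described in point 4. The Workbox pieces are injected as parameters so the loop stays self-contained; in a real service worker you would pass `getCacheKeyForURL` from workbox-precaching and the cache opened with workbox-core's `cacheNames.precache`. Helper names are illustrative.

```javascript
// Re-fetch and re-cache any manifest URLs whose entries have gone missing.
// `deps.getCacheKeyForURL(url)` should return the full cache key (with any
// __WB_REVISION__ parameter already applied), or undefined for URLs that
// aren't in the precache manifest.
async function repairPrecache(urls, deps) {
  const {getCacheKeyForURL, cache, fetchFn} = deps;
  const repaired = [];
  for (const url of urls) {
    const cacheKey = getCacheKeyForURL(url);
    if (!cacheKey) continue;                   // not a precached URL
    if (await cache.match(cacheKey)) continue; // entry still present
    const response = await fetchFn(url, {cache: 'reload'});
    if (response.ok) {
      await cache.put(cacheKey, response);
      repaired.push(url);
    }
  }
  return repaired;
}
```

In a service worker this might be wired up as `repairPrecache(manifestUrls, {getCacheKeyForURL, cache: await caches.open(cacheNames.precache), fetchFn: fetch})`. The timestamp-redeploy alternative Jeff describes needs no code beyond bumping a constant in the service worker file on each deploy.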

@jeffposnick
Contributor

You can try out the precaching repair functionality in https://github.com/GoogleChrome/workbox/releases/tag/v6.3.0
