Pattern for handling fingerprinted assets in cache? #657
Is there any discussion around handling fingerprinted assets in the SW cache? Say your app's build process adds a fingerprint to an asset’s filename like
I get that Service Workers would remove the need for fingerprinting files like that, but for people who already have their build process in place/want fingerprinting where SW is not supported...what might be the best pattern for handling this?
Initially I imagine you have to:
Search the cache for a "match" on your asset
If an exact match is found (the fingerprints are equal):
If an exact match isn't found, but an older version of the resource is available (
If no match is found:
Talked with @wanderview briefly about this and he mentioned there used to be a
Perhaps solving this is decidedly outside the scope of the spec for the time being -- in which case I'll just write something to sit on top -- but I also wouldn't be surprised if this turns out to be a more common issue.
@annevk thanks for the response -- unfortunately I was hoping to avoid doing it like that. I feel like most build processes don't construct URLs that way. (I could be wrong.) There's also 1,001 opinions about how RESTful it is to use query strings, but people seem to be less averse to the idea of avoiding them.
Regardless of how people prefer to build their URLs, if I'm going to write something to sit on top, I'd want to take as many scenarios into account as possible. Wouldn't want to force people into changing their build processes.
Yea, this is true. As an official spec, ServiceWorkers would have the power to make a certain strategy more advantageous...but in saying that, I realize Service Workers shouldn't really be in the business of persuading URL structures. Which I guess is what it would have to do in order to cater to the initial use case I brought up.
I'll explain why the implementation is hard below, but if we can make this use case easier it will help developers adopt SWs faster. Vary headers are harder to configure and understand. Search URL components are not very RESTful and don't seem to fit what is often done today. Currently the only other option is to call cache.keys() and search manually.
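For illustration, the manual search over cache.keys() could look roughly like the sketch below. It assumes a hypothetical naming convention of "<name>-<hex fingerprint>.<ext>" (the thread does not say which convention the build uses), and the helper names parseFingerprintedUrl and findRelatedVersion are made up for this example. The matching is kept as a pure function over URL strings so it can be tried outside a service worker.

```typescript
// Hypothetical convention: "<name>-<fingerprint>.<ext>", e.g. "/static/app-3f9a1c.js".
// A real build process may use a different scheme; adjust the pattern to match it.
const FINGERPRINT_PATTERN = /^(.*)-([0-9a-f]+)(\.[^./]+)$/;

interface FingerprintedUrl {
  base: string;        // everything before the fingerprint
  fingerprint: string; // the hex fingerprint itself
  ext: string;         // file extension, including the dot
}

// Split a URL into base, fingerprint, and extension, or return null
// when the URL does not follow the assumed convention.
function parseFingerprintedUrl(url: string): FingerprintedUrl | null {
  const m = FINGERPRINT_PATTERN.exec(url);
  if (m === null) return null;
  const [, base = "", fingerprint = "", ext = ""] = m;
  return { base, fingerprint, ext };
}

// Linear scan over the cached URLs, looking for an entry that is the same
// asset with a different fingerprint. Exact matches are skipped because
// cache.match() already covers them.
function findRelatedVersion(requestUrl: string, cachedUrls: string[]): string | null {
  const wanted = parseFingerprintedUrl(requestUrl);
  if (wanted === null) return null;
  for (const cached of cachedUrls) {
    if (cached === requestUrl) continue;
    const candidate = parseFingerprintedUrl(cached);
    if (candidate !== null && candidate.base === wanted.base && candidate.ext === wanted.ext) {
      return cached;
    }
  }
  return null;
}
```

Inside a service worker, cachedUrls would come from something like (await cache.keys()).map(request => request.url). The scan is linear in the number of cache entries, which is the cost the rest of this comment is concerned with.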
I wonder if we could add a simple string "tag" to a request and then let
The implementation problem with
We either have to create an index on the URL in our storage DB, which can use a lot of disk space, or we have to keep everything in memory to scan it quickly. It just kind of sucks. :-(
Exact matches are not as bad because we can hash them which keeps the DB index a reasonable size.
If the tag option isn't too messy to implement, I like that a lot. The problem isn't really about structuring URLs (as I was thinking earlier). It's just a matter of having Service Workers be smart enough to know that 2 files are related/loose equivalents.
If Service Workers can identify a relationship like that between files, developers can easily handle which assets to serve, determine their fallbacks, etc.
To clarify the tag idea a bit:
Notably, there is no way to inspect a request or response for the tag. If content needs that, then it would have to add a header to the response object, etc.
@brittanystoroz (firstly, really sorry it's taken so long to get to this)
I don't think that's true. You can fight good http caching with SW, but it's not the optimal thing to do.
Tag feels a bit too specific as a fix here. If we see this pattern we can bring back
Going to file this under future ideas.
If there's asset fingerprinting then there's almost certainly a build process for the site. If that's the case, then modifying the service worker script during the build process is an option, and that modification can include injecting an array of the up-to-date fingerprinted URLs into the script file.
This also plays nicely with the service worker lifecycle, in that the modified script will kick off a new
You do have to think a bit about timing in order to ensure that the cache is properly populated with the latest fingerprinted assets by the time the URLs in the controlled page are updated to refer to those assets.
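As a rough sketch of the build-time injection described above: the function below rewrites a service worker template so that it contains the current list of fingerprinted URLs. The placeholder comment, the function name injectPrecacheUrls, and the idea of a template file are all assumptions made for this example, not something prescribed in this thread.

```typescript
// Hypothetical placeholder that the service worker template is assumed to contain,
// e.g. a line such as: const PRECACHE_URLS = /* __PRECACHE_URLS__ */[];
const PLACEHOLDER = "/* __PRECACHE_URLS__ */[]";

// Return the template with the placeholder replaced by a JSON array of URLs.
function injectPrecacheUrls(swTemplate: string, fingerprintedUrls: string[]): string {
  if (swTemplate.indexOf(PLACEHOLDER) === -1) {
    throw new Error("service worker template is missing the precache placeholder");
  }
  // Sort a copy so the generated script is identical whenever the set of
  // assets is unchanged, regardless of the order the build emits them in.
  const sorted = fingerprintedUrls.slice().sort();
  // A replacer function is used so "$" sequences in URLs are not treated
  // as special replacement patterns.
  return swTemplate.replace(PLACEHOLDER, () => JSON.stringify(sorted));
}
```

Because the browser checks the service worker script for byte differences when deciding whether to install a new version, a changed fingerprint changes the generated script, and an unchanged asset list leaves it untouched.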
That being said, I'd love to see some additional metadata, be it
From an implementation point of view, something like a tag would be easier to implement.
Unless I misunderstand the point, having the build process inject an array of updated fingerprinted URLs into the service worker script assumes an on-install strategy for cache population?
Whereas I think the behaviour described by the OP is more like the stale-while-revalidate pattern.
Except that in this context, we don't get a cache hit that we can respond with, while we go out to the network to revalidate the cached asset.
Rather, we get a cache miss, and we want the serviceworker to respond with a different cache entry while we go to the network and fetch the missing resource (and later remove that cache entry once the missing resource has successfully fetched/cached).
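A minimal sketch of that flow, under stated assumptions: CacheLike is a simplified stand-in for the real Cache API (keyed by URL strings rather than Request objects), stripFingerprint assumes a hypothetical "<name>-<hex fingerprint>.<ext>" naming convention, and staleFallback is a made-up name. It is meant to show the order of operations, not a finished implementation.

```typescript
// Simplified stand-in for the parts of the Cache API this sketch needs.
interface CacheLike<T> {
  match(url: string): Promise<T | undefined>;
  put(url: string, value: T): Promise<void>;
  delete(url: string): Promise<boolean>;
  keys(): Promise<string[]>;
}

// Hypothetical convention: two URLs are related when they differ only in a
// hex fingerprint, as in "/static/app-111aaa.js" and "/static/app-999fff.js".
function stripFingerprint(url: string): string {
  return url.replace(/-[0-9a-f]+(\.[^./]+)$/, "$1");
}

interface FallbackResult<T> {
  response: T;                 // what to hand to the page right away
  revalidation: Promise<void>; // settles once the cache has been updated
}

async function staleFallback<T>(
  url: string,
  cache: CacheLike<T>,
  fetcher: (url: string) => Promise<T>,
): Promise<FallbackResult<T>> {
  // 1. Exact hit: the fingerprints are equal, nothing else to do.
  const exact = await cache.match(url);
  if (exact !== undefined) {
    return { response: exact, revalidation: Promise.resolve() };
  }

  // 2. Cache miss, but an entry for the same asset with another fingerprint exists:
  //    answer with it, fetch the requested version in the background, and
  //    evict the old entry only after the new one has been stored.
  const keys = await cache.keys();
  const staleUrl = keys.find((k) => stripFingerprint(k) === stripFingerprint(url));
  if (staleUrl !== undefined) {
    const stale = await cache.match(staleUrl);
    if (stale !== undefined) {
      const revalidation = fetcher(url).then(async (value) => {
        await cache.put(url, value);
        await cache.delete(staleUrl);
      });
      return { response: stale, revalidation };
    }
  }

  // 3. Nothing related in the cache: wait for the network and store the result.
  const fresh = await fetcher(url);
  await cache.put(url, fresh);
  return { response: fresh, revalidation: Promise.resolve() };
}
```

In a real fetch handler, the revalidation promise would typically be passed to event.waitUntil() so the worker stays alive until the cache is updated, and a real Response would need clone() before being both cached and returned in step 3. Whether it is acceptable to answer a request for one fingerprint with the body of another depends on the asset, so this is only appropriate where a stale version is known to be safe.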
FWIW, this was the same stumbling block I hit when looking at introducing ServiceWorker to an existing app that uses fingerprinted assets (which led me here to this issue).
Switching to a cache-falling-back-to-network strategy would work OK with fingerprinted assets, but you lose the ability to provide an early (but stale) response in the event of a cache miss, and there's still the question of how to clean up stale cache entries once a newer fingerprinted version of the asset is available in the cache.
FWIW, @jeffposnick ...just watched your talk at Chrome Dev Summit, which convinced me to look again at
Under Cache Storage on the DevTools Resources tab, I can see what you mean by using multiple caches, with a single entry per cache; and I agree it's not ideal...but hey, it does the job.