-
Notifications
You must be signed in to change notification settings - Fork 313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
consider allowing a non-scope identifier for registrations #1512
Comments
This seems reasonable. Presumably this would be spec'ed so that this couldn't be used as an end-run around the https://w3c.github.io/ServiceWorker/#path-restriction mechanism by stealing a registration with actively controlled clients that could not otherwise be controlled by a given script? |
I'm not sure I understand, can you expand on the risk you are thinking of? Perhaps the question is around what happens in the scenario:
What happens? It seems skipWaiting() should either (a) cause the clients that no longer match the scope to stop being controlled or (b) fail to promote new version to active if there are non-matching controlled clients in play. |
Yes, that. Thanks! |
I guess it would reject if a service worker changed scope to a scope that already existed on the origin? await navigation.serviceWorker.register('/sw.js', {
scope: '/product/bar',
id: 'product-bar',
});
await navigation.serviceWorker.register('/sw2.js', {
scope: '/bar',
});
await navigation.serviceWorker.register('/sw.js', {
scope: '/bar',
id: 'product-bar',
}); |
As a compatibility path, what if // Past me did this:
await navigation.serviceWorker.register('/sw.js', {
scope: '/product/bar',
});
// But it's ok, I can move it:
await navigation.serviceWorker.register('/sw.js', {
scope: '/bar',
id: '/product/bar',
}); |
It could, but I think it would be consistent with current behavior to unregister the Tangentially related to this is something I've been thinking about for the scopes pattern matching effort. If we add multiple scope strings for a registration (as recommended from the TPAC meeting), then the overlapping scope scenario here could have just a partial overlap. I think we would probably want to treat them the same as what we do here, though; error or replace.
This makes sense to me, but I haven't thought it fully through. |
I would like to move forward on this with the following:
@jakearchibald @youennf @asutherland @jungkees please let me know if you have any comments or concerns with this. Thanks! |
@wanderview Elaborating on my original concern in #1512 (comment) I'm not sure from your description in #1512 (comment) how this prevents an end-run around https://w3c.github.io/ServiceWorker/#path-restriction by stealing existing controlled clients. Can you expand on how your prior comments interact with your current proposal given the following scenario:
Note that I'm not suggesting that https://w3c.github.io/ServiceWorker/#path-restriction is going to moot the massive badness of such a hypothetical compromise given that same-origin is where the actual security boundary is, but https://w3c.github.io/ServiceWorker/#path-restriction is an existing protection that is stricter than same-origin and we should be intentional about weakening or removing it. |
Sorry, I did not address it in my previous comment. I propose:
The main ugliness here is that the scope is now associated with a version of the service worker and not the registration as a whole. Its a worse form of the updateViaCache change on the registration previously. Should I expose a ServiceWorker.scope to reflect the scope of each version? I'm not sure it would be used in the algorithm, but it might help folks understand that it can change. Not sure on that one. |
Actually, I kind of like being explicit about the ServiceWorker having a scope and I think I would make some of the algorithms use that. |
That sounds reasonable to me, thank you. I recall the path-restriction check to be self-consistent and having the scope be per-ServiceWorker should make things work nicely. Pedantnit: I am assuming s/controlchange/controllerchange/ was intended for the event. |
Moving scope to a service worker level thing sounds fine, and I agree it should be exposed to script.
I agree this as the only sensible thing to do. However, I don't think
I don't think this is the right model. Calling const reg1 = await navigator.serviceWorker.register('/sw1.js');
const reg2 = await navigator.serviceWorker.register('/sw2.js', { id: 'foo' }); I'm assuming that:
Either:
In the second case, My only worry about this behaviour is it involves data loss. Things like push registrations, bg fetch, associated with |
I can make it reject for now and we can consider loosening it in the future. I do worry a bit about some sharp edges for sites in the future with my pattern matching scopes work. We want to support registrations with multiple scopes. If a site is migrating from having two registrations to a single registration that covers both of the old scopes how do they reasonably do that migration? It may be hard to ensure the timing of the registration/move of the final registration vs the removal of the old registration. They certainly can't do it atomically. Maybe we could consider adding a further |
Work-in-progress draft PR here: #1526 |
If you're going from two registrations to one, 'merge' might be a better option than 'overwrite'. await navigator.serviceWorker.register('/sw2.js', {
id: 'foo',
conflictResolution: 'merge',
}); Where the other options are |
Can you say more about how you would expect "merge" to work? Which service worker script is chosen? Similarly, when other attributes are different, which is selected? |
Hmm yeah it's not as easy as it seemed to me this morning. The idea would be to unregister the 'overwritten' registration, but move its inner registrations (like push, bg-fetch) to the new registration. But the timing of that is not at all clear. |
I also think we probably want to add a |
While thinking about this today I realized that I need to map the global "id to registration" map really have {origin,id} tuples as the key. Different origins can use the same id values and they should not conflict. The alternative where we restrict id values to URLs like scope could instead just use the id as the key since the origin would be directly embedded in it. This would be a bit simpler to implement. I think I still prefer the ergonomics of the non-URL id, though. I personally find it confusing when URLs are used as identifiers, but not actually used for anything functional like fetching resources, etc. Just throwing this out there in case it changes anyone's mind. |
I agree it should be a non-URL id. Implementations can try and make these URLs internally if it's easier (and they can do so without creating security issues). |
And just to confirm, we are ok deviating from what manifest is doing? I believe they want to use a URL for the id in w3c/manifest#586. I think deviating should be ok since the manifest id value will still be usable as a service worker id value. And there is nothing requiring them to be identical for a PWA beyond code hygiene as far as I know. |
I think I am going to try to spec an option to replace registrations with conflicting scopes. Thinking about my conversations with partners I think its very high probability to be used/needed. I'm thinking an enum, though, instead of a boolean. So something like: navigator.serviceWorker.register('sw.js', {
scope: '/foo',
scopeConflict: 'unregister',
}); This would auto-unregister any conflicting service workers before installing the new service worker. If the installation fails, then you would end up with neither the new service worker nor the old conflicting registration. Doing an atomic replace after install is much more complex and I'm not sure its worth it. I think "unregister" will communicate this behavior a bit better than "replace". If we find we need an atomic replace in the future we can add another enum value for it. Thoughts? |
Hmm, but if its not an atomic replace, then the only value in doing it natively over in user js would be doing an immediate purge of the conflicting registration. But the purge PR in #1506 is not landed yet. And if it did land, we could just expose an So I guess ignore the last comment. I will not pursue the scopeConflict option for now. |
…ixes w3c#1512) This commit adds support for a registration `id` value. This will be used to uniquely identify registrations instead of `scope`. This will make it easier for a developer to migrate a registration from one `scope` to another. The commit consists of the following changes: * Add a DOMString `id` and associated getter to the registration. * Add an `origin` to the internal registration type. * Move the `scope` internal representation to `ServiceWorker` and add a new getter. * Make the `ServiceWorkerRegistration.scope` return the oldest associated worker's `scope`. * Migrate the registration map to be keyed by the tuple `(origin,id)`. * Migrate the job queue map to be keyed by the tuple `(origin,id)`. * Adjust job equality checks to treat register jobs to account for new id and scope semantics. * Support changing the scope during register operations. * Reject register operations that change the scope to a value that is already in use by another registration. * Properly un-control clients whose URL no longer matches a new active worker's scope.
I believe the PR is now ready for review if anyone is up for it. I'd like to do WPT tests as part of chromium implementation if that is ok. |
I'll aim to review this week |
I think I like the idea of relaxing the link between scopes to service workers. AIUI, the issue is about the ability to dynamically change service worker scopes. Let's say we introduce the possibility for a service worker to have a 'null' scope (aka a service worker is registered but has no entry in the scope to registration map). |
What is the use case for this? I am having a hard time coming up with one based on the partners I have talked to. A more realistic scenario is collapsing multiple old registrations into a single list scope registration. But I think we could tackle that by allowing the new registration to simply overwrite old registrations with a "force:true" option or something. That would be much less complex than a general locking mechanism. If we want to allow registrations to change scope, it seems necessary to me for the scope not to be used as the unique identifier for the registration. I don't see how trying to have mutable unique identifiers for registrations makes things any easier. If anything it seems to make things dramatically more complex.
Can you please elaborate on this? I see you've identified two scopes, but I don't understand what this is supposed to do to the service worker registrations.
This sort of thing was proposed long ago by mozilla with |
I do not have a particular use case. It is just an extrapolation of the use cases you were mentioning.
Agreed lock/unlock might be too complex. I fear force:true (or other booleans proposed in this issue) might be too restrictive.
As I said earlier, I agree with the idea to relax the relationship between scopes and registrations.
Let's say we have two old service workers sw1 and sw2 and want to replace them with a brand new sw3. A weakness of this proposal is the fact that sw3 would be registered to an initial scope, hence why I was mentioning the idea of an empty scope. |
Without exposing the id there is no way for the page to reliably lookup and refer to registrations. I really don't understand the concern with exposing the id either. What negative consequence are you concerned about? |
To refer to registrations, we currently have ServiceWorkerRegistration. |
If the scope is mutable, than getRegistration(clientURL) is non-deterministic as to what will be returned. Has the scope already changed? Similarly, getRegistrations() returns all registrations, but you now have to divine which one is the one you want when there is no fixed identifier you can rely on being unchanged. Again I ask, what specific risk or harm are you concerned about with exposing registration.id? Trying to think of objections I can come up with these assurances:
As to whether register() is the right place to make a change I feel pretty strongly that it is. All other multiple properties directly on ServiceWorkerRegistration and ServiceWorker are modified through register() as the mutating function. It works for both new registrations and modify existing registrations. (The only exception to this are things like NavigationPreloadManager which are on separate manager objects, but scope is a property and not on a manager object.) Finally, the partners I have talked to don't seem concerned about the multiple registration case. Perhaps we will get future feedback that its a use case that needs to be solved, but it can be done incrementally on top of this proposal. Nothing here precludes dealing with that use case later. We can add additional options to register() or a separate method as you propose. Right now, though, I don't have any developers telling me that multiple registration scope mutations is an important problem. And it will be very complex and costly to make work atomically due to life cycle issues; i.e. you need to remove conflicting registrations before you know if the new installation works. I strongly feel we should not commit to this use case without developers providing data that its worth paying the cost. @jakearchibald and @asutherland as other folks who have chimed here do you have any opinions? |
This is part of my concern or my misunderstanding of how this is expected to work. If there is a one-to-one mapping between IDs and ServiceWorkerRegistration objects, why not passing ServiceWorkerRegistration directly to the API? Would there be a difference in behavior?
Are we confident in that? How is IDs exposure expected to help there? A service worker might support several scopes in the future, we might end up wanting to migrate some but maybe not all of the scopes to a different service worker. |
I think #1085 made the case for why id's are potentially quite useful. If you have a complex site using a complex ServiceWorker, having registrations named using a site-provided id lets one create "v1" and "v2" registrations and then identify what version is currently controlling a scope, determine whether it's installed, pivot it into place, etc. The current ServiceWorker model for registrations and updates doesn't correspond well with the reality of how software/websites are developed. This enhancement helps address that and having the id makes that much more straightforward. From a testing perspective I think the id is also invaluable. This can be reverse engineered by ensuring you have a global that lives for the entire duration of a test that leverages the object identity of the ServiceWorkerRegistrations to assign persistent names to them, but that's a lot of work in a world where it's already difficult to write, let alone understand, ServiceWorker tests. |
A website can get the registration for a given scope, post message to the corresponding service worker which will post message back its version. For instance, the version value could be baked into JS script, which might be handy to handle DOM cache versioning. I am not sure how a web site can be sure that an ID it gave to register will exactly refer to the registration object the page is thinking of in multi-process architectures? Say we get registration.id and plan to use it, but take enough time that another page uses to unregister/register registrations with the same ID. Do IDs need to be given by web pages? Can/should they be generated by the user agent? |
postMessage-ing a ServiceWorker which potentially could bring both processes and threads into existence and involve evaluating megabytes of JS to avoid exposing a string identifier seems like a bad battery trade-off? Especially if such an idiom might result in a page postMesssaging potentially N registrations for its whole origin?
My understanding of the proposal was that basically everything that is currently "scope to FOO map" becomes "id to FOO map". So the existing scope to job queue map and its ordering guarantees and coalescing and idempotency semantics will be maintained. When coordination that exceeds this weak ordering mechanism is required, there's https://wicg.github.io/web-locks/ (which Firefox plans to implement).
I believe the prior suggestion was to default to using the scope as the id if not explicitly specified, which seems reasonable and equivalent to current behavior. |
In addition to Andrew's comments I would also note that the web platform is rife with precedent for developer-set identifiers; e.g. Cache API cache names, IDB database names, |
I would feel more comfortable if we could validate this will work out well with the more complex cases that will arise in the future (scope sets, partial migrations...). In particular, as I asked earlier, do we have a rough understanding on how
For instance, what is the implication of scope sets to that proposal?
This could be optimised, say service worker at install time adds its scope and version in some database.
In terms of terminology, it seems that name is a tad better than ID here. |
What is a "scope set"? Do you mean the list from the scope pattern effort? If so, my current plan is the id value would get the serialized string form of the scope pattern, whether its a list or not. Since that work is not finalized I don't have a finalized serialization. I don't see why that should stop the addition of an id, though. We could also make an id mandatory for new scope pattern or pattern list uses. There is no backward compat issues to deal with there since its a new feature. I don't have a strong preference on that vs stringifying the scope pattern.
I don't care if we call this a
I still don't fully understand what use cases you are exactly trying to cover. The more clearly you define those the better I can address them. Speaking to the possible "many-to-one" registration convergence case we discussed above I've already briefly mentioned on incremental approach: navigator.serviceWorker.register(script, {
id: 'foo',
scopePattern: patternList,
replaceConflicts: true // replace any conflicting registrations
}); Or we could make it more explicit: navigator.serviceWorker.register(script, {
id: 'foo',
scopePattern: patternList,
unregister: { sw_id_1, sw_id_2, sw_id_3 } // remove other registrations we are replacing
}); Or we could create a separate function like you proposed above, but it would take id's instead of scopes: navigator.serviceWorker.updateScopes({sw1: sw_id_1, sw2: sw_id_2, ...}) Again, I don't think we have justification to pay the complexity cost for any of these solutions yet. However, I really don't see any limitations imposed by having a name/id to reference registrations. |
xref #1331 and in particular Asa's comment #1331 (comment) which cited latencies of up to 1 second with IndexedDB. Also xref #1157 |
This comment was done 2 years ago. Since then, I know some implementation efforts have been made to speed up IDB.
This seems like a good plan, good enough that web developers might not feel the need to set IDs very often. For instance, if we take your example, we could pass scopes instead of IDs: Another possibility is to pass registration objects directly: One problem I have with the ID proposal is that ID is optional, because we already have scope and scope is mandatory. |
I would argue it would be a best practice to set an id if you are using a pattern or pattern list. This makes me more inclined to go with your previous suggestion of requiring the id in those cases.
I think the id is much more developer friendly. Typing a serialized scope list may not be very easy or readable. Also, an API which requires you to type the list as it was and the list as it will is going to be very verbose. Allowing the developer to provide a meaningful product name like "my-product-foo" is going to be a lot more ergonomic. Also, the id makes it clear we are modifying a single registration and not creating a new registration to replace an old one. This is important when you consider other state that sits in managers (push, nav preload, etc) attached to the registration. We don't want to lose all that when the scope changes. Finally, I think its just bad design to allow mutation of the scope on a registration and also try to keep using the scope as a key. How is the spec supposed to sanely deal with a registration where the active worker is scope A and the installing worker is for scope B? We need a stable, immutable key. |
If web app does not provide the identifier, the element is not in the map.
Let's get back to your first example. The same argument applies if we try to sneak version info in the IDs. If a web app uses 'product-v1' and wants to go to 'product-v2', how can the web app do without removing the 'product-v1' registration? Another point related to the versioning use case. Also, from a readability point of view, using 'id' in the register method does not make it clear at all that sw2 is 'replacing' sw1.
Agreed we do not want automatic dropping of managers. Another aspect that does not look great with the ID proposal is that it makes it easy to 'replace' one existing registration but not two existing registrations. If we introduce a way for one service worker registration to lock another job queue, this paves the way for multiple registration removal support, should we feel the need.
It is not yet clear to me whether we want to mutate scopes of a registration or whether we want to easily replace some registrations by a new registration.
I am not against IDs per se. |
Moving from a sub-product up to a top level product is not something I've heard any developers tell me they want to do. But even, so the fact that they could do this if the need arose is a feature. The fact that the name becomes sub-optimal should not prevent us from supporting the other more common use cases. In the more general case, I don't think the browser should say developers can't name things because they might choose bad names.
Taking stateful things that used to have an immutable relation with a registration and now allowing them to move seems to add a lot of complexity. In particular, you would have to move the job queue while you are in the middle of a job running from that queue. This does not seem like a good design option to me.
I'm not sure how this helps to be honest. I guess you are saying allow "old scope" and "new scope" to point to the same registration for some time? I think the complexity comes from when "old scope" stops working and how to handle that transition without edge cases. That problem seems to persist regardless of how long you delay dropping the "old scope". Maybe it would be possible to make these work with a lot of effort, but the cost benefit does not seem to make sense to me. Identifier/name values are such a common, simple thing in web APIs already and would be relatively straightforward to add here. Anyway, if there is time I'd like to discuss this proposal at one of the TPAC meetings this being discussed in #1536. |
Sorry, I was not clear.
It was not really old scope vs. new scope but more a way of implementing your idea of a registration with a list of scopes.
Sounds good to me. |
Yes, I agree implementations will need a lookup index from each individual scope to the registration to perform navigation url matching. The spec, though, needs a unique, immutable key in various places to lookup registrations. For example, you can have multiple jobs in the queue with keys baked into the structures to lookup registrations. You can't just hold registrations instead because the registrations may be removed before the job runs. If you try to use the scope as a key the scope could change before the job executes. Therefore I don't think we can use scope here without a lot of added complexity to detect and handle these edge cases. |
Interesting. Identifiying these edge cases seem useful. |
I'm not sure I agree. I've already done the work to write a PR that meets the use cases in a more straightforward manner without these edge cases. I'm not sure what benefit there is to justify the cost of working through every edge case of a more complex approach when we have a simple solution at hand already. |
This was discussed on the November 2020 virtual call. I believe there was positive support from Mozilla, Microsoft, and LinkedIn. |
FYI, I'm finally starting work on implementing this. |
…ixes w3c#1512) This commit adds support for a registration `id` value. This will be used to uniquely identify registrations instead of `scope`. This will make it easier for a developer to migrate a registration from one `scope` to another. The commit consists of the following changes: * Add a DOMString `id` and associated getter to the registration. * Add an `origin` to the internal registration type. * Move the `scope` internal representation to `ServiceWorker` and add a new getter. * Make the `ServiceWorkerRegistration.scope` return the oldest associated worker's `scope`. * Migrate the registration map to be keyed by the tuple `(origin,id)`. * Migrate the job queue map to be keyed by the tuple `(origin,id)`. * Adjust job equality checks to treat register jobs to account for new id and scope semantics. * Support changing the scope during register operations. * Reject register operations that change the scope to a value that is already in use by another registration. * Properly un-control clients whose URL no longer matches a new active worker's scope.
Currently registrations are identified and keyed based on their scope string. This works well for most cases, but can create difficulties for folks that want to modify the structure of their site.
For example, consider a site that has a product hosted at a location like:
foo.com/product/bar
But they want to re-organize their site to remove the "/product" sub-path since its needlessly verbose. They want to move to:
foo.com/bar
In this case they must unregister their old scope and register their new scope. Both the removal of the old registration and the installation of the new registration are async operations that can take time to complete. In addition, both of these service workers may wish to manage the same underlying resources. Trying to coordinate between the service workers to avoid accidentally breaking something can be difficult.
As an alternative, we could allow the service worker to specify a separate origin-unique identifier. For example:
Then when the site registers a service worker on a different scope with the same
id
, it would be treated as an update of the service worker that modifies the scope.While there are uses for this today, this sort of thing may be a lot more useful in a world where the scopes pattern matching has been implemented. With the scopes pattern matching sites may want to mutate their scope pattern over time to cover more sites or to fix mistakes in the pattern, etc.
Note, there is also an effort to add a unique identifier to manifests for webapps in w3c/manifest#586. I asked if they wanted to try to unify with service workers, but the feedback was that they should be separate.
The manifest issue also suggests the identifier should be a URL instead of an opaque string. I don't have a strong feeling about, although I feel it might confuse people into thinking it somehow operates as a scope.
Edit:
Final proposal is summarized in #1526.
Security/Privacy Considerations:
This proposal does not change the security or privacy characteristics of service workers:
The text was updated successfully, but these errors were encountered: