
Do service/shared workers and BroadcastChannel deserve a special strategy? #9

Closed
annevk opened this issue May 15, 2020 · 26 comments

@annevk
Collaborator

annevk commented May 15, 2020

From https://bugzilla.mozilla.org/show_bug.cgi?id=1495241#c1 (more context at https://privacycg.github.io/storage-partitioning/):

A problem with isolating service workers is that they are somewhat intrinsically linked to globalThis.caches, aka the Cache API, a typical origin-scoped storage API. And that in turn is expected to be treated the same as localStorage or Indexed DB, as sites might have interdependencies between the data they put in each.

Possible solutions:

  1. Using the "storage access" principal, which is what dFPI does, creates a strange transition scenario: you have the old and the new service worker, each of which can talk to a different group. At that point all the third parties the old service worker is in touch with can be given the first-party data from the new service worker. Also, once B embedded in A is granted storage access, A might be able to tell some additional things about B, but I'm not sure how avoidable that is anyway.
  2. We could attempt to disable service workers (as well as BroadcastChannel and shared workers) when a document does not have storage access to avoid the weirdness of being able to communicate with documents in a third party and first party state at the same time. (An assumption here is that sites do not assume that if they have storage they also have service workers (as well as BroadcastChannel and shared workers).)
  3. We could scope service workers (as well as BroadcastChannel and shared workers) to the agent cluster (or perhaps browsing context group).
    1. If we did this unconditionally it would largely defeat the point of BroadcastChannel and shared workers, which is to be able to share work across many instances of an application (e.g., consider having multiple editable documents open in separate tabs). And it might also defeat the clients API in service workers.
    2. If we only did this for third parties we would again hit the problematic transition scenario when there's a popup. Though perhaps it's reasonable to treat an opener popup (as opposed to a noopener popup) in a special way to encourage sites to adopt Cross-Origin-Opener-Policy and get their own browsing context group (see the header sketch below)? I.e., while you get first-party storage, you still don't get top-shelf communication channels.
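
For reference, that opt-in is just the existing Cross-Origin-Opener-Policy response header. A site that serves its top-level documents with, for example:

```http
Cross-Origin-Opener-Policy: same-origin
```

gets its own browsing context group, severing the opener relationship with cross-origin pages it opens or is opened by.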

Based on this I still favor 2, but 3.2 is also interesting.

cc @andrewsutherland @jakearchibald @inexorabletash @jkarlin @johnwilander

@jakearchibald

What's the difference between service workers, BroadcastChannel, and IndexedDB in terms of this?

With IDB:

  • Many clients from an origin may have different storage buckets, due to isolation.
  • This means multiple instances of IDB storage.
  • At some point, one of these clients may switch from an isolated bucket to a first party bucket, eg through requestStorageAccess.
  • Something needs to happen with existing IDB connections & transactions, and that switch needs to be atomic to prevent a client having access to both first and third party storage at the same time.

I figured it'd be the same with service workers & cache API instances:

  • Many clients from an origin may have different storage buckets, due to isolation.
  • This means multiple instances of the service worker.
  • At some point, one of these clients may switch from an isolated bucket to a first party bucket, eg through requestStorageAccess.
  • Something needs to happen with existing service worker instances, and that switch needs to be atomic to prevent a client having access to both first and third party storage at the same time.

With IDB, it probably means abruptly terminating connections, which may impact in-progress transactions. With service workers, it means abruptly relinquishing control of the clients that are changing storage (and those clients may now be controlled by a first-party service worker). This may abort ongoing fetches.

This would need to be atomic across all storage, service worker, and broadcast channel, but don't we have some of the mechanisms for this already with Clear-Site-Data, or is that just hand-waved?
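
As a rough illustration of the IDB side of that switch (a minimal sketch only; whether the switch would actually surface as the connection closing is exactly what would need specifying):

```js
// Sketch: react to a connection being forcibly closed, e.g. if the browser
// atomically swapped the partitioned bucket for the first-party one.
const req = indexedDB.open("app-db", 1);
req.onupgradeneeded = () => req.result.createObjectStore("store");
req.onsuccess = () => {
  const db = req.result;
  // "close" fires when the connection is terminated abnormally
  // (i.e. not via db.close()).
  db.onclose = () => {
    console.log("connection lost; re-open to reach the current bucket");
  };
  const tx = db.transaction("store", "readwrite");
  // In-progress transactions would abort if the connection is terminated.
  tx.onabort = () => console.log("transaction aborted:", tx.error);
};
```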

@annevk
Collaborator Author

annevk commented May 15, 2020

In terms of defining primitives, see whatwg/storage#18 (comment) and the first step of that is whatwg/storage#86 which I need to land so I can work on defining Clear-Site-Data (and everyone can update their storage endpoint to be a defined storage endpoint). I'm not sure it's going to be fully atomic, but as close as can be.

And yeah, perhaps termination for all of these is the way to go. But that is somewhat disruptive and might not work that well if sometimes storage access is granted implicitly.

@jakearchibald

> And yeah, perhaps termination for all of these is the way to go. But that is somewhat disruptive

I guess you could have a middle phase where new bits of first-party access (IDB connections, caches.open, fetches) are queued behind things from the current storage ending naturally. Sync storage access like localStorage and cookies would still use the third-party storage. Then, once usage of the third-party storage ends, all the first-party stuff is unblocked.

However, this is tricky with long-running things like:

  • HTTP/websocket connections.
  • IndexedDB.
  • Transferable streams?
  • Shared array buffers?

So maybe there just needs to be a timeout.
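
Purely to illustrate the shape of that middle phase (this models browser-internal behavior as ordinary JS; none of it is a real hook):

```js
// Queue new first-party access behind outstanding third-party usage ending
// naturally, with a timeout as the escape hatch for long-running things.
const outstandingThirdParty = new Set();

function trackThirdPartyUse(promise) {
  outstandingThirdParty.add(promise);
  const cleanup = () => outstandingThirdParty.delete(promise);
  promise.then(cleanup, cleanup); // don't leak rejections as unhandled
  return promise;
}

async function waitToUnblockFirstParty(timeoutMs = 5000) {
  const drained = Promise.allSettled([...outstandingThirdParty]);
  const deadline = new Promise((resolve) => setTimeout(resolve, timeoutMs));
  await Promise.race([drained, deadline]);
  // At this point new first-party IDB connections, caches.open() calls,
  // fetches, etc. would be allowed to proceed.
}
```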

@inexorabletash

Web Locks will need similar consideration. (Only a single implementer so far, though)

@asutherland

asutherland commented May 15, 2020

Service Workers, Shared Workers, and BroadcastChannel should all be storage bottles in the same "default" storage bucket. (Conveniently, the choice of storage bottles as a metaphor works well with "message in a bottle".)

It is an option to forbid the use of specific bottles. Alternatively, the origin can just be forbidden from accessing the storage shed entirely.

In terms of the specific solutions proposed:

  1. Storage Shed Moving Day.
    • It's a lot of work to try and atomically switch storage sheds/buckets such that globals don't have access to multiple buckets simultaneously. It's also not clear what the benefit is. Under a tracking threat model, the concern is being able to say "uniquely generated identifier token A and uniquely generated identifier token B are really the same". Even if you totally destroyed the global during the transition, it seems likely a server endpoint can uniquely stitch together the tuples of [IP address, "hey, I'm transitioning from 3rd party with token B"] and [IP address, "hey, I just transitioned to 1st party token A"] given the time locality.
    • Given that information leaks are almost certain, why not let the global move itself into its new 1st party storage shed shelf and then destroy the old storage shelf and all its buckets once moving day is over?
  2. Forbid access to the storage shed entirely or outlaw specific storage bottles in specific sheds.
    • This seems fine.
  3. Creation of agent cluster/browsing context group-specific storage sheds with session lifetime. At least, once we've agreed on the storage bucket inviolable identity invariant.
    • This seems like a browser specific decision with the browser choosing to do this when it thinks things will break if not given access to a storage shed. (And that the global can technically already communicate with other instances of the origin so why not scope the storage shed to the agent cluster/browsing context group.)
    • Storage shed moving day would still apply.

@annevk
Collaborator Author

annevk commented Jun 9, 2020

So, to address Jake's question once more, my thinking was that there's a subtle difference. Say you have two browsing context groups, each containing a single top-level browsing context. Each top-level browsing context contains a document A that embeds a cross-site document B. We will call the groups 1 and 2.

I think it's different for B in 1 and 2 to have an active communication channel, say, through service workers or broadcast channels, than to have a shared storage backend. But maybe that's primarily because sites are unlikely to use shared storage as a communication channel, though they would if you took the "real" communication channels away.


It sounds like everyone is in favor of 1 here, which is also fine, and yeah, we should mark the old storage shelf for clearance somehow.


I guess I hadn't even mentioned Safari's strategy, which is to always partition them in third-party contexts. That leads to somewhat strange situations with popups (as they are not partitioned and therefore cannot communicate with you, even though you have synchronous script access and thus totally can in other ways), but it's also still a possibility if we keep these aligned with the other storage APIs, as seems to be the preference.

@wanderview

To make a potential storage transition easier for the service worker case we have internally talked about possibly requiring the page to trigger a reload in order to get the new first party service worker controller. Essentially, if they want to be loaded by the first party service worker fetch event handler then they have to reset their page state.
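
In page terms that might look roughly like this (a sketch of the idea only; tying the reload to the first-party controller is the proposal under discussion, not current behavior):

```js
// The embedded document asks for storage access, then resets its state so the
// next load can be intercepted by the first-party service worker.
const button = document.querySelector("#use-first-party");
button.addEventListener("click", async () => {
  await document.requestStorageAccess(); // requires a user gesture
  location.reload();
});
```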

@annevk
Collaborator Author

annevk commented Jun 9, 2020

@wanderview what happens with Clear-Site-Data? But yeah, the transition scenarios for some of these seem complex. Need to think about them some more.

@asutherland

@wanderview Would this preclude use of Clients.claim somehow, such as having Match Service Worker Registration in step 3.1.3 ignore the change in the "storage access" principal?

@wanderview

> @wanderview what happens with Clear-Site-Data? But yeah, the transition scenarios for some of these seem complex. Need to think about them some more.

We have at least been thinking of the transition being from a sharded state to a sharded+1st-party state. So I would think clear-site-data in the sharded+1st-party state would wipe both the sharded and the 1st-party storage.

> @wanderview Would this preclude use of Clients.claim somehow, such as having Match Service Worker Registration in step 3.1.3 ignore the change in the "storage access" principal?

Unsure. We could prevent it or not. In the case of claim, the site is already opting in to switching service workers in the middle of the life of the page. The reload is mainly needed to get into the "load me completely from 1st party state" condition.

It seems like a reload could possibly be reasonable UX if we are already requiring an interaction. Push a button, then it loads what you were looking for, etc.

Just to be clear, these are early thoughts, but they make sense to me.

@wanderview

wanderview commented Jun 9, 2020

I'm unsure about switching SharedWorker and BroadcastChannel. We've been talking about maybe having a separate bucket exposed for your newly revealed 1st-party storage instead of replacing the sharded API on the default bucket. I guess that bucket could also have something for SharedWorker and BroadcastChannel. Or we could likewise require a reload to get 1st party access to SharedWorker and BroadcastChannel.
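
Purely as an illustration of that direction (every name below is hypothetical, not a shipped API):

```js
// Hypothetical sketch of the additive approach: first-party storage shows up
// as an extra bucket instead of the default getters being swapped out.
await document.requestStorageAccess();

// openFirstPartyBucket() is invented for this sketch; some bucket-opening API
// would hand back an object exposing its own storage endpoints.
const bucket = await openFirstPartyBucket();
const cache = await bucket.caches.open("v1");
const dbRequest = bucket.indexedDB.open("app-db");

// Meanwhile, the plain caches / indexedDB globals keep pointing at the
// partitioned ("sharded") storage the page already had.
```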

@annevk
Collaborator Author

annevk commented Jun 10, 2020

What I meant with Clear-Site-Data is that I was curious what happens to existing service workers and such when that comes in. The way I have been thinking about it (and discussed on whatwg/storage in various issues) is that going from partitioned to unpartitioned is equivalent to replacing your data with nothing.

@wanderview

Not sure I completely follow.

> I was curious what happens to existing service workers and such when that comes in

Note, the service worker spec was recently changed to support immediately removing service workers in response to clear-site-data. (And Chrome shipped that in M83, although there are some known bugs.)

> going from partitioned to unpartitioned is equivalent to replacing your data with nothing.

Not sure I agree these are equivalent. If there is a 1st-party service worker then you are not going to "nothing"? Also, there seems to be a difference between switching between storage buckets and deleting a storage bucket.

In our internal conversations at least, though, we have been wary of trying to replace existing storage endpoints and have been leaning more towards any transition additively exposing access to 1st-party storage through some bucket API surface. I think this approach is even less like clear-site-data.

But again, maybe I'm still missing your point. Sorry if that is the case.

@asutherland

> In our internal conversations at least, though, we have been wary of trying to replace existing storage endpoints and have been leaning more towards any transition additively exposing access to 1st-party storage through some bucket API surface. I think this approach is even less like clear-site-data.

I like this idea. It helps normalize the concept of multiple buckets and reduces the likelihood of developer confusion and breakage that would come from the localStorage, sessionStorage, indexedDB, caches, and serviceWorker getters all starting to return different instances after requestStorageAccess() resolves. (In slide 24 of the multiple storage buckets proposal slides we'd already included serviceWorker on the buckets.)

There's still the subject of this issue then... how to expose SharedWorker and BroadcastChannel construction capabilities on the bucket, given that they're interfaces with constructors, exposed to JS as functions, with all the stuff at https://heycam.github.io/webidl/#interface-object. It seems hard or awkward to re-expose them as bucket.BroadcastChannel so that new bucket.BroadcastChannel("foo") would behave sanely, especially when doing var BC = bucket.BroadcastChannel; new BC("foo");.

Would we make the constructors take optional storageBucket arguments (in WorkerOptions for SharedWorker and a new BroadcastChannelOptions dictionary for BroadcastChannel) and then expose bucket.makeBroadcastChannel and bucket.makeSharedWorker or some other helper method to provide some symmetry/discoverability that sets the storageBucket argument as it passes through to the constructor?
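
Spelled out, the two shapes might look like this (all of these dictionary members and helpers are hypothetical, per the question above; `bucket` stands in for a first-party storage bucket object):

```js
// Hypothetical API shapes only; nothing here is spec'd or shipped.

// (a) optional storageBucket members on the existing constructors
const channel = new BroadcastChannel("sync", { storageBucket: bucket });
const worker = new SharedWorker("worker.js", {
  name: "sync",
  storageBucket: bucket, // hypothetical extension of WorkerOptions
});

// (b) factory helpers on the bucket that set storageBucket on the way through
//     to the same constructors
const channel2 = bucket.makeBroadcastChannel("sync");
const worker2 = bucket.makeSharedWorker("worker.js", { name: "sync" });
```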

@wanderview

> There's still the subject of this issue then... how to expose SharedWorker and BroadcastChannel construction capabilities on the bucket, given that they're interfaces with constructors, exposed to JS as functions

Good point about them being constructors. I think I would lean towards just not exposing them on transition for now and forcing pages to reload to get them if they want them. If there is a strong use case for access without a reload then we can figure out how to expose them, perhaps with an options param as you say.

@annevk
Collaborator Author

annevk commented Jun 11, 2020

Both buckets and reloads would only work for non-heuristic based transitions so I'm not convinced that's the way to go. If there are problems with shared/service/broadcast that are particularly hard to overcome I'd rather we try to disable them in third parties that don't have unpartitioned storage.

> Also, there seems to be a difference between switching between storage buckets and deleting a storage bucket.

To be clear, the plan is not to delete it. The plan is to replace it with an empty bucket.

@wanderview

What kind of heuristic-based transition are you anticipating? I think one of our goals is to make things predictable, and we have been trying to avoid non-deterministic storage access.

Providing no storage seems to foreclose a large number of use cases. For example, how do you build any kind of offline experience? Maybe it's fine to drop those use cases for sites on the disconnect.me list, etc., but it seems too restrictive for sites in general.

@annevk
Collaborator Author

annevk commented Jun 12, 2020

Both Firefox and Safari have a number of heuristics by which the storage-access permission is granted. https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Privacy/Storage_access_policy documents this for Firefox.

I wasn't trying to say they should not get storage, but that perhaps they should not get shared/service/broadcast. And offline could still work, if the user grants them the storage-access permission upon first use.

I'm still somewhat hopeful about figuring out the replacement operation though. Could you perhaps expand on why "need to reload" was the conclusion you all reached?

@wanderview

> I'm still somewhat hopeful about figuring out the replacement operation though. Could you perhaps expand on why "need to reload" was the conclusion you all reached?

I wouldn't say we've reached a conclusion, but it's a leaning. It's my personal preference at least. And we've only really discussed it internally around service workers.

I think we expect any replacement to be 1) difficult to get right and 2) difficult for developers to understand. That's why we are trying an additive approach where you end up with both sharded and first-party storage.

Additive, however, does not work well with something like service worker controllers. The page can only have one controller. We still want to avoid replacement, though. In addition to the reasons above, service workers were designed with the idea that we cannot assume sites can handle the controlling service worker changing out from under them. That has always been opt-in via clients.claim().

Leaving the sharded service worker in place and only switching the controller to the first-party service worker on reload seems like the most straightforward approach. As @asutherland suggested, we could also permit clients.claim() from the first-party service worker to opt in to a controller change without a reload.
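
For reference, that opt-in is the existing pattern in the (first-party) service worker script; nothing here is new API:

```js
// Opt in to taking over already-open clients instead of waiting for a reload.
self.addEventListener("install", (event) => {
  event.waitUntil(self.skipWaiting()); // activate without waiting
});
self.addEventListener("activate", (event) => {
  event.waitUntil(self.clients.claim()); // take control of existing clients
});
```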

For BroadcastChannel and SharedWorker I don't really have strong feelings yet. I do think we want to find an additive way to get first-party use, but the constructor API shape makes that more awkward. Maybe just accepting the weirdness of factory methods on the first-party bucket would be the easiest way out of the situation. I think these uses are going to be pretty niche anyway.

@wanderview

Deleted my previous comment since I re-read this part:

> I wasn't trying to say they should not get storage, but that perhaps they should not get shared/service/broadcast. And offline could still work, if the user grants them the storage-access permission upon first use.

I guess this depends on whether you are coming from some kind of storage in the partitioned state or not. In our case we are leaning towards sharded storage instead of blocking storage. So I don't think we would want to go from sharded access to those APIs to taking them away.

Perhaps it would be reasonable to say "whatever access was available for shared/service/broadcast in the partitioned state should continue and then first party is available on the first load after access is granted". This would work for both blocking access and sharded access approaches in the partitioned state.

@annevk
Collaborator Author

annevk commented Jun 12, 2020

FWIW, requiring claim() makes sense to me. I'm still curious what happens to existing shared/service/broadcast when Clear-Site-Data is seen on a response as I hope we can reuse that pattern here. I guess I need to start testing that.

@wanderview

> I'm still curious what happens to existing shared/service/broadcast when Clear-Site-Data is seen on a response as I hope we can reuse that pattern here. I guess I need to start testing that.

Note that there is in-progress spec work to make the service worker purge from clear-site-data more immediate:

w3c/ServiceWorker#1506

Microsoft implemented the purge in Chromium already and it's in M83, which is now stable. It has some known bugs, though.
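
For context, the directive in question is the `"storage"` value of the header; it covers service worker registrations along with the rest of the origin's storage, and the spec change above makes that purge take effect immediately:

```http
Clear-Site-Data: "storage"
```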

@johnwilander

johnwilander commented Jun 12, 2020

@annevk
Collaborator Author

annevk commented Feb 11, 2022

I'm going to close this issue on the basis that all these storage APIs should be permanently partitioned, with no departitioning allowed. Safari is already there. Chrome is planning to get there. And Firefox is now planning that as well.

There might still be room for document.requestStorageAccess() to affect storage, but that should be done through a dedicated bucket or some such, not the normal storage APIs.

annevk closed this as completed Feb 11, 2022
@jespertheend

It seems like document.requestStorageAccess() does not currently have any effect on SharedWorkers.
But since browsers have now started partitioning SharedWorkers, this is causing some breakage.

Is there extra spec work that needs to be done in order for requestStorageAccess() to work in this case, or are these browser bugs that should be reported?
I've already reported https://bugzilla.mozilla.org/show_bug.cgi?id=1859128 and there's some discussion going on in https://crbug.com/1490528.
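
For concreteness, this is the kind of thing that breaks (a sketch; `worker.js`, the button, and the iframe setup are placeholders):

```js
// Inside a cross-site iframe, after a user gesture:
const button = document.querySelector("#connect");
button.addEventListener("click", async () => {
  await document.requestStorageAccess();

  // Even with storage access granted, this still connects to the iframe's
  // partitioned SharedWorker rather than the one the same origin sees when
  // loaded first-party, so the two contexts can't coordinate through it.
  const worker = new SharedWorker("worker.js", { name: "coordination" });
  worker.port.start();
  worker.port.postMessage("hello");
});
```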

@Trikolon

It's not a bug. In its current form the Storage Access API does not unpartition storage, only cookies. See https://privacycg.github.io/storage-access/#storage
