Promptless storage access with constrained communication capabilities #41

Closed
johnwilander opened this issue May 21, 2020 · 22 comments

@johnwilander
Collaborator

@jkarlin brought up some ideas for the Storage Access API during the May 2020 virtual face-to-face meeting. From the meeting notes:

Josh Karlin: We want to avoid showing prompts to users when they'd have trouble making an informed decision. One possible idea: new types of iframes that might have constrained communication ability. e.g. think of an iframe that can't talk to the embedder. Maybe it's like a navigation: the user clicks, the constrained frame gets storage. Could work for a like button. Could work for a video that identifies that the user has paid and so they shouldn't see ads. Wouldn't work in cases where the embedding page needs to talk to the iframe. Still trying to investigate use cases, and we'll fall back to Storage Access if necessary.

Josh: We've been talking about having the 1st party embed it in a way that disclaims communication, which allows the 3rd party to have access. Could also imagine the 3rd party messaging it in response headers.

I thought about this and would like to share my view to see if we can get some progress here.

My interpretation of what Josh said is that there should be a way for a third-party iframe to request and be granted storage access without the purpose and means to communicate to the top frame, probably also not to sibling frames, possibly not even to child frames. I believe the intent is to stop migration of user identities between an iframe with storage access and the first party context of the top frame.

Is that accurate, Josh?

Say we were to offer an opt-in mode of the Storage Access API where the iframe gives up all its capabilities to talk to the rest of the page once it gets storage access. You could even envision this opt-in mode to be relaxed in some way, for instance be promptless which I think Google is shooting for given the "we want to avoid showing prompts to users" comment.

A straw man:
document.requestStorageAccess(options { scope : { singleIframeWithoutCommuncation, singleIframeWithCommunication, allSubresourcesWithCommunication } });

The vision for singleIframeWithoutCommuncation would be "Render the iframe with cookies as if it was a first party, isolated webpage with no further capabilities." This would be great for authenticated video embeds (henceforth videos.example) or authenticated document embeds that don't need to talk to anything else.

To achieve this opt-in isolation we'd:

  • Cut off postMessage.
  • Cut off scripting access across same-origin frames.
  • Return the empty string for document.referrer and any other signal of the iframe being embedded. (It might be impossible to fully fool the iframe into thinking that it's a top frame context.)
  • Cut off read access to the iframe's URL from the top frame.
  • Reload the iframe as part of granting it storage access so as to get rid of any temporary state stored in its JavaScript state or DOM.

The goal is to prevent any user identity leakage between the iframe and the top frame after storage access is granted.

Now imagine this:

  1. videos.example runs script in the first party context of a games.example webpage.
  2. The videos.example script stores a new random user ID in games.example's first party storage.
  3. The videos.example script opens an iframe with this URL: videos.example/?[videoClipID]&[new random user ID]
  4. The user taps to play the video, the iframe requests storage access with the optional singleIframeWithoutCommuncation, the browser skips the prompt, isolates the iframe, and reloads the iframe with cookies. Now videos.example has connected the new random user ID in games.example's storage with its own cookie user ID.
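
The four steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: a plain Map stands in for games.example's first-party storage, the `clip` and `uid` parameter names are made up, and `serverSideJoin` is a hypothetical stand-in for what videos.example's server could do once the reloaded request arrives with cookies. No browser API is involved.

```javascript
// Simulation only: all function and parameter names here are hypothetical.

// Steps 1-2: the videos.example script, running in games.example's
// first-party context, mints a random ID and stores it there.
function mintFirstPartyId(firstPartyStorage) {
  const id = Math.random().toString(36).slice(2);
  firstPartyStorage.set("videos_uid", id);
  return id;
}

// Step 3: the same script decorates the iframe URL with that ID.
function buildDecoratedFrameUrl(clipId, firstPartyId) {
  const url = new URL("https://videos.example/");
  url.searchParams.set("clip", clipId);
  url.searchParams.set("uid", firstPartyId);
  return url.toString();
}

// Step 4: after the promptless grant and reload, the request carries both
// the decorated URL and the videos.example cookie, so the server can pair them.
function serverSideJoin(requestUrl, cookieUserId) {
  const firstPartyId = new URL(requestUrl).searchParams.get("uid");
  return { firstPartyId, cookieUserId };
}

const gamesStorage = new Map();
const mintedId = mintFirstPartyId(gamesStorage);
const decoratedUrl = buildDecoratedFrameUrl("clip-123", mintedId);
const joined = serverSideJoin(decoratedUrl, "cookie-user-42");
console.log(joined);
```

Note that none of the isolation measures listed earlier come into play: the identifier travels in the URL before isolation starts.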

I don't see a way to stop that kind of user ID leakage. Do you?

Pinging @othermaciej @michaelkleber and @englehardt who I think are interested in this too.

@jkarlin

jkarlin commented May 21, 2020

Thanks for bringing this up John. @shivanigithub is putting together an explainer for the idea that she'll post in a week or two, but you have the gist of it.

My interpretation of what Josh said is that there should be a way for a third-party iframe to request and be granted storage access without the purpose and means to communicate to the top frame, probably also not to sibling frames, possibly not even to child frames. I believe the intent is to stop migration of user identities between an iframe with storage access and the first party context of the top frame.
Is that accurate, Josh?

That's correct. We think of it as having its own frame tree. It can talk to its children, but not to any frame from the embedder's frame tree. And the only input it gets is its URL (which must be protected from link decoration) and its frame size.

Say we were to offer an opt-in mode of the Storage Access API where the iframe gives up all its capabilities to talk to the rest of the page once it gets storage access. You could even envision this opt-in mode to be relaxed in some way, for instance be promptless which I think Google is shooting for given the "we want to avoid showing prompts to users" comment.

It's similar to a third party creating a popup to get at its 1p state and display its content. Only, the popup is visually embedded in the page instead of being in a new tab or window.

Instead of reloading, we've been thinking of requiring the iframe to be in its isolated state from creation. Then, once the user clicks, it gets access to its 1p storage. So this would be more of an iframe attribute or perhaps some kind of a document policy. We think this type of frame will be useful for cases other than requestStorageAccess as well, so we would prefer it to be a more general frame option.

Now imagine this:
videos.example runs script in the first party context of a games.example webpage.
The videos.example script stores a new random user ID in games.example's first party storage.
The videos.example script opens an iframe with this URL: videos.example/?[videoClipID]&[new random user ID]
The user taps to play the video, the iframe requests storage access with the optional singleIframeWithoutCommuncation, the browser skips the prompt, isolates the iframe, and reloads the iframe with cookies. Now videos.example has connected the new random user ID in games.example's storage with its own cookie user ID.
I don't see a way to stop that kind of user ID leakage. Do you?

When you think about this as an embedded popup, you can see how isolated iframes are akin to a user navigation. As with any cross-site user navigation, link decoration will need to be identified and addressed. So we should use whatever methods UAs are using to limit link decoration leakage.

@jackfrankland

I like the sound of this. Just to think of ways that it could still be exploited in its current state (don't want to come across as negative though):

  1. the size of the frame could be changed over time, in accordance with a unique identifier.
  2. if the frames still share the same JS event loop with outer frames, you could use timing mechanisms as a tracking vector maybe

@othermaciej

Because a unique ID can be passed in through the URL, I don't think this is good enough for promptlessness because it allows linking of identities.

At the very least, communication needs to be blocked before the frame requests storage access. Cutting off only after the call is clearly inadequate, even setting aside the URL issue.

Many embeds also have a reasonable basis to want to know the top URL (e.g. like buttons, comment sections) so we need to allow for that, which allows yet another communication side channel.

@shivanigithub

I like the sound of this. Just to think of ways that it could still be exploited in its current state (don't want to come across as negative though):

  1. the size of the frame could be changed over time, in accordance with a unique identifier.

The idea is to also disallow resizing and positioning APIs for the frames in this restricted/isolated frame tree.

  2. if the frames still share the same JS event loop with outer frames, you could use timing mechanisms as a tracking vector maybe

Do you mean the JS event loop in the browser's implementation? Could you clarify the timing attack based on that?

@jackfrankland

The idea is to also disallow resizing and positioning APIs for the frames in this restricted/isolated frame tree.

I imagine the frame would have to be able to be resized/repositioned by the top-level document. Do you mean the aim is that the frame wouldn't have access to its resize events, or access to the document and child elements' width and height?

Do you mean the JS event loop in the browser's implementation? Could you clarify the timing attack based on that?

Sorry, I wrote this hastily and am not fully aware of the different browsers' implementations, or whether cross-origin frames run on separate threads. If script in an iframe is able to block execution on a thread for a specific time period, the parent window could perhaps read that.

@johnwilander
Collaborator Author

Thanks for bringing this up John. @shivanigithub is putting together an explainer for the idea that she'll post in a week or two, but you have the gist of it.

My interpretation of what Josh said is that there should be a way for a third-party iframe to request and be granted storage access without the purpose and means to communicate to the top frame, probably also not to sibling frames, possibly not even to child frames. I believe the intent is to stop migration of user identities between an iframe with storage access and the first party context of the top frame.
Is that accurate, Josh?

That's correct. We think of it as having its own frame tree. It can talk to its children, but not to any frame from the embedder's frame tree.

We'd have to block it from talking to first-party child frames. Imagine this:

  • Top frame: games.example
  • Sub frame with isolated storage access: videos.example
  • Sub frame of the sub frame with isolated storage access: games.example

Depending on how third-party cookie blocking is implemented, the sub sub frame from games.example will be considered first party and have access to its cookies by default.
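
To make the dependency on the blocking implementation concrete, here is a minimal sketch comparing two hypothetical first-party checks against the frame chain above. Neither policy is taken from a specific browser; the function names and the array representation of the frame tree are assumptions for illustration.

```javascript
// Each frame is represented only by its site; index 0 is the top frame.
const frameChain = ["games.example", "videos.example", "games.example"];

// Policy A (hypothetical): a frame is first party if its site matches
// the top frame, regardless of what sits in between.
function isFirstPartyByTopFrame(chain, index) {
  return chain[index] === chain[0];
}

// Policy B (hypothetical): a frame is first party only if every frame
// from the top down to it belongs to the top frame's site.
function isFirstPartyByAncestorChain(chain, index) {
  return chain.slice(0, index + 1).every((site) => site === chain[0]);
}

console.log(isFirstPartyByTopFrame(frameChain, 2));      // true
console.log(isFirstPartyByAncestorChain(frameChain, 2)); // false
```

Under a policy like A, the nested games.example frame keeps its cookies, so the isolated videos.example frame would have a first-party child to talk to.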

And the only input it gets is its URL (which must be protected from link decoration) and its frame size.

Say we were to offer an opt-in mode of the Storage Access API where the iframe gives up all its capabilities to talk to the rest of the page once it gets storage access. You could even envision this opt-in mode to be relaxed in some way, for instance be promptless which I think Google is shooting for given the "we want to avoid showing prompts to users" comment.

It's similar to a third party creating a popup to get at its 1p state and display its content. Only, the popup is visually embedded in the page instead of being in a new tab or window.

Instead of reloading, we've been thinking of requiring the iframe to be in its isolated state from creation. Then, once the user clicks, it gets access to its 1p storage. So this would be more of an iframe attribute or perhaps some kind of a document policy. We think this type of frame will be useful for cases other than requestStorageAccess as well, so we would prefer it to be a more general frame option.

Now imagine this:
videos.example runs script in the first party context of a games.example webpage.
The videos.example script stores a new random user ID in games.example's first party storage.
The videos.example script opens an iframe with this URL: videos.example/?[videoClipID]&[new random user ID]
The user taps to play the video, the iframe requests storage access with the optional singleIframeWithoutCommuncation, the browser skips the prompt, isolates the iframe, and reloads the iframe with cookies. Now videos.example has connected the new random user ID in games.example's storage with its own cookie user ID.
I don't see a way to stop that kind of user ID leakage. Do you?

When you think about this as an embedded popup, you can see how isolated iframes are akin to a user navigation. As with any cross-site user navigation, link decoration will need to be identified and addressed. So we should use whatever methods UAs are using to limit link decoration leakage.

I think there is a significant difference between popups and iframes. The latter can be transparent or made to fully blend in with the first party webpage. You can imagine misuse through iframes that never show any visual cue; they just collect a user gesture to pull off the user ID linking trick. Popups are visual even if they auto-dismiss, especially on mobile where there's no distinction between tabs and popups.

Further, third-party popups may not be allowed to continue to get access to their first-party data automatically. A first step away from that is to require user interaction in the popup. A second is to either require a call to the Storage Access API or the browser automatically doing something that effectively calls the API on behalf of the popup. These things have already been discussed but not formalized.

In summary, we should not say "promptless iframe storage access with link decoration is no worse than link decoration in popups" because 1) iframes can be much less visible than popups and 2) we are likely going to have to restrict popups.

@johnwilander added the agenda+ label May 26, 2020
@annevk
Collaborator

annevk commented May 28, 2020

Indeed, popups also show the address. (Note that on desktop there's also mostly not a difference between tabs and popups. E.g., <a target=foo> creates an auxiliary browsing context as well.)

Session history is another side channel by virtue of being scoped to the top-level browsing context. There's a long-standing issue around this leaking shadow trees that contain frames that still isn't fixed.

As is indexing of child frames (this is addressed for shadow trees, iirc).

@jkarlin

jkarlin commented Jun 9, 2020

All fair points. Popups are an imperfect analogy as we miss the visual indications (url, new window) that inform the user of the navigation.

Our goal is to find a situation in which it's safe to provide unpartitioned storage to the subframe without a prompt. For simplicity, the primary threat I'm concerned with here is unwanted cross-site recognition. In an ideal world where there is zero communication between embedder and embeddee, then it should always be safe for the isolated subframe to have unpartitioned storage.

Given that some information is expected to leak between the two (some bits of url, frame size, timing attacks), we need to constrain those leaks (e.g., link decoration mitigation, frame size limitations, separate threads) as much as possible. Even constrained, at least a few bits could flow. So we further add a user gesture requirement. The idea is to make it difficult enough to transmit user identity that sites won’t see enough value in trying.

@annevk
Collaborator

annevk commented Jun 10, 2020

Sizing is another communication channel and IntersectionObserver might be too. I think you'd have to effectively reinvent embedding documents from scratch with this goal in mind.

@michael-oneill

michael-oneill commented Jun 23, 2020

As @othermaciej said, any value can be appended to the iframe's url, allowing cookie-synching of first-party UIDs via the third-party cookies. How would you get over that?

@shivanigithub

As @othermaciej said, any value can be appended to the iframe's url, allowing cookie-synching of first-party UIDs via the third-party cookies. How would you get over that?

For the url there will need to be some form of link decoration detection applied to the url and a mitigation based on that detection.

@michael-oneill

Hard to detect general link decoration. Could only work if automatic storage access is always denied for dynamically created (or amended) iframes.
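
As a rough sketch of that rule: the frame record and its two flags below are hypothetical, standing in for whatever provenance a browser would track internally, and the function name is made up.

```javascript
// Hypothetical rule: automatic (promptless) access is only considered for
// iframes that came from the parsed markup and were never touched by script.
function eligibleForAutomaticAccess(frame) {
  return frame.insertedByParser && !frame.srcChangedByScript;
}

const staticFrame = { insertedByParser: true, srcChangedByScript: false };
const scriptedFrame = { insertedByParser: false, srcChangedByScript: false };
const amendedFrame = { insertedByParser: true, srcChangedByScript: true };

console.log(eligibleForAutomaticAccess(staticFrame));   // true
console.log(eligibleForAutomaticAccess(scriptedFrame)); // false
console.log(eligibleForAutomaticAccess(amendedFrame));  // false
```

Note that this only covers URLs assembled in script; it says nothing about markup that the server personalizes per user.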

@hober removed the agenda+ label Jul 6, 2020
@colinclerk

One potential use-case for promptless storage access is Google's one-tap sign in:
https://developers.google.com/identity/one-tap/web

A solution that relies on a .well-known resource could enable this while disallowing link decoration. Unfortunately, I don't think the mechanism extrapolates out to videos.example, where there is always a unique token in the URL.

Is there any concern that third-party javascript will usually be adding the third-party iframe to the DOM? It might be dangerous for the third-party javascript (running in the parent context) to know which IP & browser is about to load an iframe with promptless storage access.

@jkarlin

jkarlin commented Aug 12, 2020

Sorry for the long delay here. We've explored this idea a bit over the last few weeks. Our goals are still to isolate a frame with unpartitioned storage from the rest of the page so its identifiers can't be joined with the embedder and to reduce prompting where possible.
We've posted the fenced frame (what we're now calling these isolated frames) explainer. It discusses its potential use for third party storage but for now is focused on other isolation cases.

As discussed earlier, part of what makes this so difficult for storage is link decoration, both of the URL of the isolated frame and the URL of the embedder. We've had some ideas but nothing that covers all of the use cases we want, and they all have their own challenges. Some examples:

  1. Don't allow query params in the URL. This is simple but doesn't stop anyone willing to put a small amount of work in to edit their paths.
  2. The URL of the isolated frame must be the same regardless of the user. We can check this by making a second uncredentialed request to the embedder's document. Of course, if the URL of the uncredentialed request has the user id in it, then it's not really uncredentialed now is it? So certainly not perfect, but one step further along.
  3. A .well-known file on the site lists all of the fenced frame URLs the site will ever link to. This could work, but has scalability concerns (this file will monotonically grow) and is a bit of a pain for developers.
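
A minimal sketch of options 1 and 3, to show where each one bites. The check functions, the example URLs, and the allowlist format are all assumptions for illustration; no format for such a .well-known file has been defined in this thread.

```javascript
// Option 1 (hypothetical check): reject frame URLs that carry a query string.
function hasNoQueryParams(frameUrl) {
  return new URL(frameUrl).search === "";
}

// Option 3 (hypothetical check): the frame URL must exactly match an entry
// the site has published in its allowlist.
function isListedFrameUrl(frameUrl, allowlist) {
  const normalized = new URL(frameUrl).href;
  return allowlist.some((entry) => new URL(entry).href === normalized);
}

const leakedUserId = "u-8f3a";
const viaQuery = `https://videos.example/embed?uid=${leakedUserId}`;
const viaPath = `https://videos.example/embed/${leakedUserId}`;
const publishedList = ["https://videos.example/embed/clip-123"];

console.log(hasNoQueryParams(viaQuery));                 // false: blocked
console.log(hasNoQueryParams(viaPath));                  // true: ID moved into the path
console.log(isListedFrameUrl(viaPath, publishedList));   // false: not a published URL
```

The query-string check is defeated by moving the identifier into the path, while the exact-match list rejects it at the cost of enumerating every embeddable URL up front.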

The last two options don't address sites with dynamic 3p widgets (e.g., programmatic ads) nor do they address sites that need the full referer URL (e.g., like buttons or comment widgets). But they could work for cases such as embedded maps, videos, or documents.

There is also a side-channel for fenced frames, which is timing attacks (hat tip @colinclerk). The time that the frame is created is known to both the embedder and the new frame, which can be used as a joining key for their identifiers. For example, the embedder sends its user id to matcher.example with the timestamp, and the new frame does the same. matcher.example can join the two based on the timestamp (and IP address, UA, etc.).
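
The join described above might look like the following on the matcher.example side. This is a sketch under assumptions: the report shape (`userId`, `ip`, `timestampMs`), the tolerance window, and the sample data are all invented for illustration.

```javascript
// Hypothetical join at matcher.example: pair an embedder report with a frame
// report when they share an IP and their timestamps fall within a tolerance.
function joinByCreationTime(embedderReports, frameReports, toleranceMs) {
  const matches = [];
  for (const e of embedderReports) {
    const candidates = frameReports.filter(
      (f) => f.ip === e.ip && Math.abs(f.timestampMs - e.timestampMs) <= toleranceMs
    );
    // Only join when exactly one frame report fits, to avoid ambiguous pairs.
    if (candidates.length === 1) {
      matches.push({ embedderId: e.userId, frameId: candidates[0].userId });
    }
  }
  return matches;
}

const embedderSide = [
  { userId: "games-user-A", ip: "203.0.113.7", timestampMs: 1000 },
  { userId: "games-user-B", ip: "203.0.113.9", timestampMs: 5000 },
];
const frameSide = [
  { userId: "videos-user-X", ip: "203.0.113.7", timestampMs: 1004 },
  { userId: "videos-user-Y", ip: "203.0.113.9", timestampMs: 5003 },
];

const timingMatches = joinByCreationTime(embedderSide, frameSide, 10);
console.log(timingMatches);
```

Nothing here requires a channel between the two frames; both sides only need network access and a roughly shared clock.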

@hober
Member

hober commented Aug 12, 2020

Sorry for the long delay here. We've explored this idea a bit over the last few weeks. Our goals are still to isolate a frame with unpartitioned storage from the rest of the page so its identifiers can't be joined with the embedder and to reduce prompting where possible.
We've posted the fenced frame (what we're now calling these isolated frames) explainer. It discusses its potential use for third party storage but for now is focused on other isolation cases.

Would you like to propose Fenced Frames to the Privacy CG as a work item to take up?

@johnwilander
Collaborator Author

Thanks, Josh and team! I appreciate the detailed threat analysis.

@johannhof
Member

@jkarlin Hi Josh, I'd be interested in how you see the relationship between fenced frames and SAA going forward and whether you'd like to discuss that in the Privacy CG meeting some time. As Tess already said it might make sense to have FF as a separate work item for the CG, but in the meantime I'd like to get clarity on what to do with this issue specifically :)

I know it's quite late notice but I'm also happy to try and make some time in the SAA slot of the Privacy CG F2F tomorrow if you're planning to attend that.

Thanks!

@jkarlin

jkarlin commented May 18, 2021

@krgovind has taken over the work of trying to provide third-party storage on the Chrome side, so will leave those questions for her.

With regard to where FF will land, it's likely to head to WICG due to FLoC and FLEDGE being its primary use cases at first. We may increase its scope for things like third-party storage down the road.

@krgovind

@johannhof - Sorry, I don't have anything new to share on this front yet; but I'll be sure to bring my notes back here after my team has a chance to dig into this further.

@jkarlin

jkarlin commented Jun 11, 2021

@johannhof I've given this some more thought, and I think PrivacyCG is a good fit for Fenced Frames. While our initial primary use cases are advertising based, the intent is to be a general purpose API and most of the design engagement needs to be with browsers to make sure we get the privacy/security separation right. If y'all think that makes sense then we'd be happy to introduce it at the next meeting.

@jkarlin

jkarlin commented Jun 16, 2021

I created privacycg/proposals#25.

@johannhof
Member

Thanks @jkarlin, that seems great at a glance! I think chairs will get this on the agenda for the next CG meetings.

I'll tentatively close this issue now, we can discuss integration with SAA as part of the proposal.
