Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

request: allow to retrieve a frameID from an <iframe> element #12

Open
therealglazou opened this issue Jun 17, 2021 · 26 comments
Open

request: allow to retrieve a frameID from an <iframe> element #12

therealglazou opened this issue Jun 17, 2021 · 26 comments
Labels
Chrome: follow-up enhancement

Comments

@therealglazou
Copy link

@therealglazou therealglazou commented Jun 17, 2021

One of the problems we faced at Dashlane is dealing with frameIDs. It's currently super tricky to associate a frameID to a given <iframe> element and it's even too complicated to retrieve a frameID at all since the only existing way to do that ATM is to send a message, from the inner context to that iframe, to the background, collect the frameID there and send it back to the origin iframe.

I think there should be an easy way to retrieve a frameID for a given iframe element, from the enclosing context of that iframe element. Similarly, from the JS context inside an iframe, there should be an easy way to retrieve the current frameID without having to ping-pong the background.

@therealglazou therealglazou changed the title request: impossible to retrieve a frameID from an <iframe> element request: allow to retrieve a frameID from an <iframe> element Jun 17, 2021
@therealglazou
Copy link
Author

@therealglazou therealglazou commented Jun 18, 2021

Side note: even if <embed> is deprecated, some sites still use it (yeah, fools...). Example: https://www2.rxnt.com/phr/

Getting the frameID for a document embedded through <embed> is even more complicated because it's not possible to postMessage() the inner window.

@dotproto
Copy link
Collaborator

@dotproto dotproto commented Jun 18, 2021

… it's even too complicated to retrieve a frameID at all since the only existing way to do that ATM is …

Another way to do this is to use the webNavigation.getAllFrames(), but this requires the webNavigation permission. Offhand I'm not sure what permission prompt other browsers display for this, but Chrome warns users that this permission allows extensions to "Read your browsing history."

Can you share a bit more about why the content script needs to know the ID of its host frame? As opposed to, say, the background script using this data.

Exposing a way for the host frame to get it's own frame ID sounds reasonable. We may also want to investigate a more ergonomic way of getting all frame IDs for a given tab than the current webNavigation approach, as that grants access to much more data than necessary to, say, properly target script injection for a given frame.

@therealglazou
Copy link
Author

@therealglazou therealglazou commented Jun 19, 2021

Can you share a bit more about why the content script needs to know the ID of its host frame? As opposed to, say, the background script using thi data.

We're a Password Manager. When a form contains iframes themselves containing form fields (well-known example visible at https://account.protonmail.com/signup?language=en but many banks do that too), we need to "replace" an iframe in such a form by its inner form fields when we "recognize" forms at the outermost level. Reaching the frameElement is often impossible and a direct postMessage() to the iframe's window can be unreliable if the contents of the iframe rely themselves on messages and do not accept easily third-party messages (well-known example is Office365, ahem...). So one of the only reliable things to associate an iframe element and its inner contents is the frameID. All Password Managers share the same burden.

As of today, of course, we ping/pong the background to get the frameID but that feels like a hack: in short you send a dummy message to the background to grab the originating frameID from the message and send it back. getAllFrames() is hardly usable because of the permission constraints it adds on our webextension. Maybe getAllFrames() should not live inside webNavigation. And even if you can capture the frameID that way, you can capture from within the iframe, it's still not directly associated to the iframe element at the upper level...

@dylanb
Copy link

@dylanb dylanb commented Jun 21, 2021

100% agree with this suggestion, this change would make it easier to implement secure messaging between content scripts in cross domain iframe environments. It is currently very difficult with the messaging hack mentioned above being one of the ways to do it.

@oliverdunk
Copy link
Member

@oliverdunk oliverdunk commented Jun 22, 2021

We face exactly the same problem at 1Password, working on our own password manager. Some sites split username and password fields in login forms across frames, and for the purposes of filling, we need a reliable way of knowing where each is located.

Additionally, some of the UI we display is added to the DOM of the top frame, but positioned based on the location of a field within an iframe. For this, we need to add the offset of the field within the frame to the offset of the frame itself in the page. Doing so requires that we identify the particular iframe we are showing UI for. Workarounds include comparing the size of an iframe with the size of the window within it. Ideally, however, this would be as simple as using the frame ID that sent us the message.

@bershanskiy
Copy link

@bershanskiy bershanskiy commented Jul 12, 2021

As of today, of course, we ping/pong the background to get the frameID

Dark Reader does this too.

Maybe getAllFrames() should not live inside webNavigation.

It would be great solution. Perhaps, it can live under tabs? Also, "tabs" permission could gate some information (like URL)like it already does Tab.url, Tab.title, and Tab.faviconUrl.

@dotproto dotproto added the enhancement label Aug 5, 2021
@xeenon xeenon changed the title request: allow to retrieve a frameID from an <iframe> element request: allow to retrieve a frameID from an <iframe> element Sep 24, 2021
@zombie
Copy link
Collaborator

@zombie zombie commented Sep 29, 2021

I tried to design something to address all mentioned use cases, but still keep it minimal to enable quick and easy adoption by other implementations. My assumptions are:

  • Content scripts want frameId of the current frame.
  • Also a way to get frameId for all descendant <iframes>.
  • (probably nice to have) Get frameId for parent and ancestor frames.
  • (desirable because permissions) Avoid the webNavigation namespace.
  • (if possible) Make it a sync method, it would be used for communication between frames on page load.

So here's the proposal:

browser.runtime.getFrameId(target: WindowProxy | HTMLIFrameElement): number

This returns the frameId from the passed Window or container element, or for the current (caller) frame if omitted. Since the Browsing Context subtree already needs to be available in the process, it should be possible to make this a sync method in all implementations.

You can pass the <iframe> directly if you're interested in positioning inside the page, window.parent.parent... to get all ancestor frames, or any descendant WindowProxy from the postMessage() event.source.

One potential issue is that AFAIKT this would be the first extension API that receives a DOM element as an argument. I had to work around a minor issue (for cross-origin values), and it might be a larger issue in other engines.

I've put together a quick prototype, you can test it yourself if you're able to build Firefox locally.

(That patch also includes an alternative design for getFrameInfo I was considering, which enables enumerating all child frames, but it turned out overall more complex and less capable, and it's not clear that's a real use case).

@therealglazou
Copy link
Author

@therealglazou therealglazou commented Sep 30, 2021

  • (probably nice to have) Get frameId for parent and ancestor frames.

Not only "probably nice to have" but absolutely necessary. Let me explain....

  • injected JS code needs to use the various origins of the frames in a page (or worse "*") to postMessage() a parent frame because otherwise the messages will always trigger an origin error.
  • because of that, Website JS cannot make any difference between messages sent by the site itself and messages sent by an extension
  • most websites, including MAJOR ones (Microsoft, Google, etc.) do not make any coherence control on incoming messages, ie, verify that incoming messages are conformant to an expected grammar AND do nothing if they don't. As an example, a major website, a bank I won't name, has a very weird behaviour: it does one thing on incoming string messages it understands, another one on incoming messages being a number and always refresh the page in ALL other cases... So if a WebExtension sends itself, from a frame to another, a JSON, k-boom the website gets a refresh. No, we won't be able to evangelize the whole world on good postMessage() handling practices, that's too late.
  • in other words, postMessage() to communicate between frames is totally unusable in WebExtensions.
  • a way to associate a <frame> or <iframe> element to a frameId goes through finding the index of a frame's window into its window.parent.frames array. BUT window.frames and document.querySelectorAll('frame,iframe') are NOT in sync, contrarily to what developer docs are saying. Frame windows seem to be added to that array at creation time while the other list is in tree traversal order.... It's then completely impossible to reliably use that path.
  • conclusions: 1. only sendMessage() is reliable between frames inside WebExtension code, postMessage() just cannot be used, it breaks websites; 2. it's crucial for each frame to have its frameId and its parentFrameId (communication upwards); 3. it's crucial for extension code to be able, for any arbitrary <frame> or <iframe> element, to get its frameId (communication downwards)

This is doable but only through webNavigation permission (that many extensions are reluctant to ask for given the scary message browsers show) or non-blocking webRequest. In both cases, that's hacky and complicated for something that should be trivial. And we have absolutely nothing to fix the conclusion3 issue ATM.

@zombie
Copy link
Collaborator

@zombie zombie commented Sep 30, 2021

you can test it yourself if you're able to build Firefox locally.

I forgot you can actually download builds from our CI infra for Linux, OSX or Windows -- though I only tested the last one, please ping me if something doesn't work for you.

Not only "probably nice to have" but absolutely necessary. Let me explain....

Thank you for the clarification.

I didn't notice that requirement mentioned in above use cases, but while trying to work through them, I realized it's probably important, so thanks for confirming.

@therealglazou
Copy link
Author

@therealglazou therealglazou commented Sep 30, 2021

browser.runtime.getFrameId(target?: WindowProxy | HTMLIFrameElement): number

That would perfectly fit our needs. Thanks a lot @zombie !!!

@xeenon
Copy link
Collaborator

@xeenon xeenon commented Sep 30, 2021

This works for the Safari team.

@Rob--W
Copy link
Collaborator

@Rob--W Rob--W commented Sep 30, 2021

One point of feedback on the API design.

The parameter being a mixed type and potentially optional may be problematic. Consider the following example:

let frameId = chrome.runtime.getFrameId(window.opener);

The intention behind this code is to get the frameId of the window that opened this page.
However, window.opener can be null, in which case the API would return the current frame.
This unexpected result can potentially cause the extension to use the wrong frameId, and therefore the wrong frame/window.

This could be resolved by disambiguating the parameters, by being explicit about which frame is meant.

For example by always requiring a parameter instead of null (i.e. getFrameId(self) aka getFrameId(window) to get the current frame's windowId).

Or by being even more explicit, taking a dictionary that does not have "unsafe" fallbacks (i.e. overlap between optional parameters and different results). For example:

// Get frameId of current window.
let frameId = chrome.runtime.getFrameId({
  type: "self",
});

let frameId = chrome.runtime.getFrameId({
  type: "windowProxy",
  windowProxy: window, // or parent / opener / top / frames[x] / iframe.contentWindow / event.source, ...
});

let frameId = chrome.runtime.getFrameId({
  type: "htmlElement",
  htmlElement: iframe, // One of: <iframe> / <frame> / <embed> / <object>
});

@therealglazou
Copy link
Author

@therealglazou therealglazou commented Oct 1, 2021

I wonder if the dual type argument, WindowProxy | HTMLIFrameElement, is really necessary. Since chrome.runtime.getFrameId(frameElement) and chrome.runtime.getFrameId(frameElement.contentWindow) should be strictly equivalent, maybe WindowProxy is enough.

@Rob--W
Copy link
Collaborator

@Rob--W Rob--W commented Oct 1, 2021

I wonder if the dual type argument, WindowProxy | HTMLIFrameElement, is really necessary. Since chrome.runtime.getFrameId(frameElement) and chrome.runtime.getFrameId(frameElement.contentWindow) should be strictly equivalent, maybe WindowProxy is enough.

As mentioned in the example in my last comment, there are more ways to embed a document other than <iframe>. For frames (<iframe> and <frame>), it's indeed possible to get the window with iframe.contentWindow. It's even possible to do the same for <object>. For <embed>, that's not possible. Interestingly, documents from embeds do appear in the window.frames object.

<embed> used to be used to embed plugins in web pages. Plugins can expose a scripting interface, so up until a few years ago, it was not really safe to expose more properties. Now that support for plugins have been dropped from browsers, it may be feasible to add properties such as contentWindow and contentDocument to <embed>, which would indeed make the htmlElement kind of parameter redundant. I filed an issue with a request to add .contentWindow to <embed> at whatwg/html#7140

If that issue gets resolved, then getFrameId only needs to take a window.

@therealglazou
Copy link
Author

@therealglazou therealglazou commented Oct 1, 2021

Ok, thanks a lot @Rob--W .

@zombie
Copy link
Collaborator

@zombie zombie commented Oct 1, 2021

For example by always requiring a parameter instead of null (i.e. getFrameId(self) aka getFrameId(window) to get the current frame's windowId).

Good callout, the <no params to get current frame's id> is a leftover from when the method only worked with EmbedderElements, and since you suggested including WindowProxy, it's not really needed anymore (I've edited the original proposal to make it clearer).

@zombie
Copy link
Collaborator

@zombie zombie commented Oct 1, 2021

@dotproto Can you please check with the Chrome team if this would present implementation difficulties, or if they have any other feedback to this proposal?

@theseanl
Copy link

@theseanl theseanl commented Oct 2, 2021

This discussion sounds similar to what has been done for closed shadow roots. Closed shadow roots are meant to be inaccessible from main world, and it has been requested to allow extensions to access them. It is now available on chrome.dom.openOrClosedShadowRoot from Chrome's side, and Element.openOrClosedShadowRoot from Firefox's side.

I mostly agree with spec writers' rationale for not wanting to expose contentWindow on HTMLEmbedElement. I propose that browser venders should implement a corresponding chrome.dom.contentWindow / HTMLEmbedElement.contentWindow API to allow content scripts to access content window of <embed> nodes.

Previous proposals add unnecessary overhead to the API call just for one edge case: Making API polymorphic will likely require browser's wrapper code to introduce a type check, and requiring consumers to pass a one-off dictionary seems bad for garbage collection.

Moreover, once we have an API for accessing contentWindow of <embed> nodes, developers may find other good uses of it.

For completeness's sake, below is one possible API design that I just came up with.

chrome.dom.contentWindow(element:HTMLIFrameElement | HTMLFrameElement |
    HTMLObjectElement | HTMLEmbedElement): Window | null

interface HTMLEmbedElement {
	contentWindow: Window | null
}

For extending existing HTMLEmbedElement, one may consider using a different property name to signal developers that this is an extension-only API.

@carlosjeurissen
Copy link
Contributor

@carlosjeurissen carlosjeurissen commented Oct 7, 2021

As mentioned here:
#77 (comment)

@dotproto created a crbug for the proposal here:
https://bugs.chromium.org/p/chromium/issues/detail?id=1256872

@rdcronin
Copy link

@rdcronin rdcronin commented Oct 14, 2021

Chiming in from the Chromium side.

I was able to bring this up with our security team, and there were a few different considerations. First off, to explain a few of our primary considerations:

First, we often try to assume that almost any renderer process can be compromised (and, it turns out, that this is pretty much true : )). Because of these, we try to limit the amount of information and capabilities we give to renderer processes. In particular, it ideally shouldn’t be possible for a compromised renderer process to access or affect other renderer processes.

Second, the boundary between content scripts and the main page is also weaker, since it’s only a JS context boundary (and not a hard process boundary). In either the case of a compromised renderer or just a “pierced” boundary between JS contexts, an attacker can potentially gain access to any power or knowledge that a content script has.

These are the reasons that the vast majority of extension APIs aren’t accessible from a content script context and that we’ve adjusted the cross-origin fetching capabilities of content scripts, and that we are generally opposed to expanding those capabilities.

In this case, though, it’s a bit of a blurred line. For one thing, I can absolutely see the benefit and desire here for extension developers - dealing with frame IDs is painful, and if there’s frame-specific logic you need in content scripts, it’s even more so (and sometimes impractical, if you need it synchronously for a script running at document_start). There's also an argument that, at least for retrieving the current frame's ID, it's not really an "extra" capability (though this is somewhat inaccurate for Chromium; discussed below). From a platform viewpoint, I’m supportive of exposing this information to ease development.

Now, though, we get into the nitty gritty. In Chromium, frame IDs (apart from the main frame, which is always 0) are currently associated with something called the Frame Tree Node ID. This a globally-unique, monotonically-increasing identifier given to a frame. If we expose this information to a compromised renderer, it can give away information about the current state of the browser, as well as being able to detect happenings in other tabs (by adding new frames and seeing the delta from the previous ones). This type of information can then help attackers mount timing attacks, among others. A second risk here is that if an attacker has control over a process and the knowledge of the frame tree node IDs within it, that attacker could try and leverage an inter-process message to the browser targeting different frame tree node IDs that are likely to exist.

It’s true that these are largely only security concerns when one (or frequently, more than one) thing already goes wrong, but we very much strive to have defense-in-depth. We’ve historically seen many vulnerabilities that chain together multiple things going wrong.

Now, for the path forward. On the Chromium side, I think many of these concerns would be assuaged if we had an unguessable identifier associated with frame IDs. I think this is an approach that we could take (I’ll investigate it), but we’d likely block runtime.getFrameId() on that being completed. Even then, there could be a slight (though significantly mitigated) concern around allowing frames to retrieve the unique identifier of other frames in the hierarchy, especially ones that might be cross-origin and/or cross-process; I’m continuing the discussion there to see if, with an extension-specific identifier here, that’s a reasonably low risk.

I’m curious for the other browsers’ implementations here - are frame IDs (apart from the main frame) already random? Or are any of these also concerns for other implementations?

@zombie
Copy link
Collaborator

@zombie zombie commented Oct 28, 2021

Thank you Devlin for the Chromium perspective. I think your proposal for using random frame IDs is compatible with the getFrameId() design, and something we could implement if/when we decide to (though it might not be a priority for us, see notes below for mitigating circumstances in Gecko).

Since additional exposure seems marginal (at least for Firefox), and using random IDs would be an implementation-level change transparent to extension, it doesn't look like a blocker for the proposed design. I think we can proceed with shipping this as an experimental API to get it into the hands of developers, to test it and provide any potential feedback.

A second risk here is that if an attacker has control over a process and the knowledge of the frame tree node IDs within it, that attacker could try and leverage an inter-process message to the browser targeting different frame tree node IDs that are likely to exist.

I'm somewhat confused by this. My presumption based on talking to Gecko's site isolation team is that these frame node IDs (or Browsing Context IDs I mentioned in the proposal above) would already be available in each render process, since we already need to support WindowProxy references to each frame in the same subtree (or perhaps even the whole Browsing Context Group), including cross-origin ones. In case of a compromised render process you mention, these IDs would already be in the process's memory and available to the attacker.

Of course this could be mitigated by parent process generating a per-process mapping of frame node IDs, or maybe with something similar to what you're proposing about randomized IDs exposed to extensions, and Chromium might already be doing something along those lines, but at least Gecko isn't (perhaps partially because of the second point below).

I’m curious for the other browsers’ implementations here - are frame IDs (apart from the main frame) already random? Or are any of these also concerns for other implementations?

They are not random in Gecko either, but they're (mostly) monotonically-increasing only per-process. At least for the case you described, a compromised child process wouldn't be able to observe the creation of frames initiated from other processes.

@zombie
Copy link
Collaborator

@zombie zombie commented Jan 4, 2022

Quick update, we landed this in Firefox 96, which is currently in beta or dev edition, and will be released on January 11th.

Not documented on MDN yet, in case we need to adjust the design based on feedback from developers or other implementations, so please try it now and report any issues or bugs.

@zombie
Copy link
Collaborator

@zombie zombie commented Jan 4, 2022

And for clarity, since there's no documentation, this is the implemented API:

browser.runtime.getFrameId(target: WindowProxy | HTMLIFrameElement): number

Should be fairly obvious, returns the frameId for any passed Window reference or an embedder element, and throws if passed anything else.

@therealglazou
Copy link
Author

@therealglazou therealglazou commented Jan 5, 2022

Quick update, we landed this in Firefox 96, which is currently in beta or dev edition, and will be released on January 11th.

Not documented on MDN yet, in case we need to adjust the design based on feedback from developers or other implementations, so please try it now and report any issues or bugs.

This is wonderful ! Thanks a lot @zombie, we'll make sure to test it ASAP.

Chrome, the ball is now in your hands. 😄

@oliverdunk
Copy link
Member

@oliverdunk oliverdunk commented Jan 5, 2022

Hey @zombie! We jumped on this at 1Password and I'm happy to share that I was able to use the new API as a drop-in replacement for our existing matching logic. That involved both getting the id of a window as well as an iframe from the parent frame. I don't think I have any extra feedback at the moment - a huge thanks for implementing this and we look forward to it reaching other browsers.

@therealglazou
Copy link
Author

@therealglazou therealglazou commented Jan 10, 2022

@zombie we tested it too at Dashlane inside our browser extension and it works absolutely fine! Thanks a zillion again and we're waiting for Chrome and Safari implementations now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Chrome: follow-up enhancement
Projects
None yet
Development

No branches or pull requests