Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Transition 'sync-xhr' feature to Document Policy #410

Open
clelland opened this issue Oct 28, 2020 · 13 comments
Open

Proposal: Transition 'sync-xhr' feature to Document Policy #410

clelland opened this issue Oct 28, 2020 · 13 comments

Comments

@clelland
Copy link
Collaborator

'sync-xhr' is currently implemented as a policy-controlled feature whose default allowlist is '*'. This means that it follows the model of "available in every frame until it disabled somewhere, and then disabled from there down". That is, disabling sync-xhr at one node in the frame tree means that it is necessarily disabled in all descendants of that node. (@martinthomson opines in #248 that "'sync-xhr' seems like its primary application is creating bustage in framed content.").

Moving this to Document Policy would provide a better model; the availability could be controlled per-document, without necessarily affecting framed content. (Required policies could be used to provide guarantees of non-usage in subframes if absolutely necessary, with their cooperation.) It establishes sync-xhr as following the 'document-sandboxing' model, as discussed in #296.

This would also support an eventual deprecation strategy: Switching the default value from true to false will disable synchronous XHR in documents, but would allow the same Document Policy mechanism to re-enable it where required. The additional configuration required should serve to discourage new uses of the API, and should help drive down existing usage, hopefully to the point where it can eventually be removed from the platform.

@martinthomson
Copy link
Member

Not sure I like your chances of deprecating, but this is the right choice.

@yoavweiss
Copy link
Contributor

(@martinthomson opines in #248 that "'sync-xhr' seems like its primary application is creating bustage in framed content.").

That use case will no longer be enabled, right? That means that a top-level frame will no longer be able to make sure its 3P resources (or even devs from The Other Team™) are not using sync XHR. That seems like the use case for the policy.

What would be the use-case for the equivalent Document Policy? How do we foresee it being used?

@annevk
Copy link
Member

annevk commented Oct 29, 2020

You can still enforce it, but the framed document has to allow that restriction be imposed upon it. Meaning that the parent cannot alter its control flow in unexpected ways.

@mikewest
Copy link
Member

This seems right to me. We should likely move/copy document-domain as well.

@arturjanc
Copy link

A big +1 from me for both this change and for document-domain. In terms of the web's threat model, Document Policy seems like a better fit for these features because -- as Anne said -- it requires consent from an iframe to have its control flow affected by the embedder.

@clelland
Copy link
Collaborator Author

Not sure I like your chances of deprecating

It's completely possible that features like this end up as optional legacy features of the platform, that require a bit of configuration to enable, and stay like that for a long time. That's probably not a bad outcome, as long as we can provide better primitives so that new content doesn't need it by default.

I'm still hopeful, though :)

@RByers
Copy link

RByers commented Feb 11, 2024

I don't think I'm willing to give up in the idea of parent frames removing this power from their children unilaterally. I think that would effectively mean giving up on deprecating sync XHR. I see no realistic path for solving the chicken and egg problem of getting 3Ps to opt into the policy such that embedders can make it required. Instead, I believe we must give parent frames some limited tools to pragmatically reign in the undesirable behavior of their child frames. In the case the child is blocking the parents ability to navigate - that's totally unacceptable IMHO.

The unexpected altered control flow argument seems theoretical to me. Are there any real examples? XHR can already fail for network conditions, why not give parent frames the power to induce such failures? It also seems like an argument to say that we can never deprecate any API ever in a way that causes new failures. If we decide that it's safe from a security perspective to deprecate sync XHR entirely, then it should also be safe to roll that out incrementally under the control of parent frames, right?

I think we're going through a similar debate with unload events right now. I appreciate that WebKit and Gecko are opposed to our permissions policy for control over unload events (and I probably would be in their shoes too). But lacking such pragmatic "hacks" for managing a deprecation responsibly, it leaves the other engines in a position of either 1) causing high breakage such that a big class of sites tells their users to use another browser (WebKit and enterprise sites) or 2) having no realistic path to being able to fix such flaws in the web platform (Gecko).

Personally I think Chromium has a responsibility to push hard on a pragmatic middle ground here, even if it's messy and hacky. I'm not going to argue that phasing out sync xhr is urgent, but I really don't want chromium taking steps backward from making it possible in the long run.

@arturjanc
Copy link

I definitely support the goal of deprecating sync XHR, but I don't think the reasoning above accurately describes the situation from a security perspective.

The concern is that any time an application has sensitive functionality dependent on data retrieved via sync XHR, and assumes the fetch succeeds (which is not an outrageous assumption), an attacker which disables sync XHR will be able to bypass application logic that expects the data to be present. For example, imagine that an application fetches an array of regular expressions to validate data it will receive via postMessage and appends it to some variable; then for every message it gets, it runs all the validation rules against the value to make sure it's well-formed. An attacker who can zero-out this configuration will be able to bypass the application's security logic which can have application-dependent consequences, up to and including compromise through script execution (e.g. an XSS can happen if the application expects to run a regexp such as /^[a-z]+$/ on the data and then passes the data to an XSS sink such as innerHTML).

XHR can already fail for network conditions, why not give parent frames the power to induce such failures?

In the web's security model attackers don't have the ability to selectively prevent the loading of chosen resources in a target cross-origin document. The fact that any load could potentially fail doesn't mean that it's safe to give attackers the capability to induce such failures -- and creating such a general capability would actually be a significant vulnerability because there are many situations where external scripts contain security-relevant functionality without which the application becomes unsafe. (Sync XHR seems less security-sensitive than scripts, but I'd still be a bit uneasy about giving cross-origin documents control over it.)

It also seems like an argument to say that we can never deprecate any API ever in a way that causes new failures.

This depends on how likely a given capability is to give attackers an interesting primitive that lets them affect cross-origin behavior. For example, I wouldn't be particularly worried about exposing a switch to disable setting document.domain; but I imagine sync XHR is used fairly commonly to load interesting data, so the blast radius of letting attackers selectively break loading such data seems larger.

If we decide that it's safe from a security perspective to deprecate sync XHR entirely, then it should also be safe to roll that out incrementally under the control of parent frames, right?

Sadly, I don't think so. As one example: if the platform disables sync XHR then developers will generally quickly realize that their requests are failing and likely update their application logic to fix this. If we give cross-origin attackers the ability to induce such failures, this will not be visible to application developers -- their application will work as before, but it now might be vulnerable in scenarios like the one I mentioned above.

Practically, my guess is that the risk is probably relatively low and it's possible that the value of removing sync XHR from the platform outweighs the security concerns. But IMHO we need to be clear about the risks of doing this.

@RByers
Copy link

RByers commented Feb 18, 2024

Thanks @arturjanc, you make some good points I hadn't thought of. Do you think there's some way to help quantify the level of risk here? Like if we were able to reduce usage to some lowish level via outreach and then do a spot check of the top remaining usage to manually evaluate the security risk in each case? And/or maybe we do a GitHub search for uses of sync XHR and come up with a bound on the fraction which could have security implications? If we can find real examples of what you're describing then I'm much more amenable to it being a blocking concern. But if, after trying, we can't find a single example then I'd be quite sad to block progress fixing a platform mistake on a pure hypothetical.

Here's a slightly different idea. Maybe we could get to the point where we could launch this policy with a default allowlist of self so that it's off by default in 3P iframes, but top frames can turn it back on. Given your argument about platform-wide deprecation being safer, this would be significantly less risky right? Of course I don't know what it would take to meet our compat bar for this, but maybe a bit of analysis and outreach could get there...

@yoavweiss
Copy link
Contributor

Thanks @arturjanc, you make some good points I hadn't thought of.

Same. Thank you for the clear example!

If we can find real examples of what you're describing then I'm much more amenable to it being a blocking concern. But if, after trying, we can't find a single example then I'd be quite sad to block progress fixing a platform mistake on a pure hypothetical.

I agree it'd be good for us to find out if this is a hypothetical or a real-world risk.
We can probably classify sync XHR usage by cross-origin iframes into categories:

  1. Analytics beaconing
  2. "freeze the world" (where the side effect of stopping the main thread is what developers are trying to accomplish)
  3. Fetching of data.

Only (3) may suffer from this issue, depending on what the data is used for. But (3) is also the one easiest to move to an async fetch without any "functionality" loss (but admittedly, with some refactoring to bubble up asynchronicity up the call stack).

So I'm somewhat optimistic that analysis will reveal that (1) and (2) are the main "use-cases" for sync XHR in cross-origin iframes, where (1) is being replaced by fetchLater and (2) is a user-hostile pattern rather than a legitimate use case.

Here's a slightly different idea. Maybe we could get to the point where we could launch this policy with a default allowlist of self so that it's off by default in 3P iframes, but top frames can turn it back on.

If we can pull that off from a compatibility perspective, it might be better to simply deprecate & remove. Giving top-level frames the ability to turn this back on doesn't give us much over e.g. a temporary deprecation trial. (unless we want to keep sync XHR as a platform feature)

@arturjanc
Copy link

arturjanc commented Feb 19, 2024

So I'm somewhat optimistic that analysis will reveal that (1) and (2) are the main "use-cases" for sync XHR in cross-origin iframes, where (1) is being replaced by fetchLater and (2) is a user-hostile pattern rather than a legitimate use case.

This sounds reasonable, but one snag here is that, at least from a security perspective, we need to think not only about the usage of sync XHR in iframes, but in any document. Cross-origin embedding is possible by default in the platform, so any document that relies on sync XHR -- even if it's not explicitly meant to be embedded -- can be iframed by an attacker and have its behavior modified through the sync-xhr switch (unless it has taken steps to protect itself by setting, XFO, frame-ancestors, etc - which only a minority of web-facing content does).

This ties in with @RByers's question about measuring the risk; we can catalog existing uses of sync XHR (via crawls, UKM, or poking around GitHub), but even if we don't find any examples of it being used in a security-sensitive context we'll at most be able to say we don't think this pattern is common. But there still might exist long tail sites that this behavior would make vulnerable; and we sadly don't have a great way to distinguish a hypothetical concern with zero real-world impact from a problem that would affect a small number of niche sites (some of which could be sensitive for users).

Perhaps the best argument for doing this is that AFAIK Chrome has had this Feature Policy flag for a long time and the sky seemingly hasn't fallen, or at least I'm not aware of application-level vulnerabilities that were enabled by this switch.

If we can pull that off from a compatibility perspective, it might be better to simply deprecate & remove. Giving top-level frames the ability to turn this back on doesn't give us much over e.g. a temporary deprecation trial.

+1. Y'all have thought about this much more, but I'm wondering if there's some way to deprecate it globally, but divide this into stages that would limit breakage? For example, say we capped sync XHR to some timeout and only errored out if that limit is exceeded -- this could let us start ratcheting up restrictions on cases that are presumably worst for performance and eventually deprecate it altogether.

@yoavweiss
Copy link
Contributor

This sounds reasonable, but one snag here is that, at least from a security perspective, we need to think not only about the usage of sync XHR in iframes, but in any document. Cross-origin embedding is possible by default in the platform, so any document that relies on sync XHR -- even if it's not explicitly meant to be embedded -- can be iframed by an attacker and have its behavior modified through the sync-xhr switch (unless it has taken steps to protect itself by setting, XFO, frame-ancestors, etc - which only a minority of web-facing content does).

Right, we'd need to do this analysis for documents in general, not only cross-origin frames.

+1. Y'all have thought about this much more, but I'm wondering if there's some way to deprecate it globally, but divide this into stages that would limit breakage? For example, say we capped sync XHR to some timeout and only errored out if that limit is exceeded -- this could let us start ratcheting up restrictions on cases that are presumably worst for performance and eventually deprecate it altogether.

I like the approach of failing these fetches based on relative user harm. Ratcheting that down can nudge users of (1) and (3) to look for less harmful alternatives.

@annevk
Copy link
Member

annevk commented Mar 16, 2024

Couple things I wanted to note:

  • It seems WebKit has supported sync-xhr as well for about five years: https://bugs.webkit.org/show_bug.cgi?id=202098.
  • sync-xhr was removed from XMLHttpRequest quite a while ago, it's been non-standard ever since (perhaps it wouldn't have happened with the current WHATWG Working Mode though, as there seemingly was no follow through with tests and implementer agreement): whatwg/xhr@9157397

Given that two browser engines have deployed this for quite a while I think I'm not opposed to keeping it in, it's definitely the more straightforward option, but I do think we all need to agree that it's not a precedent we can build on going forward for the reasons Artur helpfully laid out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants