Proposal: Transition 'sync-xhr' feature to Document Policy #410

clelland · 2020-10-28T13:09:11Z

'sync-xhr' is currently implemented as a policy-controlled feature whose default allowlist is '*'. This means that it follows the model of "available in every frame until it disabled somewhere, and then disabled from there down". That is, disabling sync-xhr at one node in the frame tree means that it is necessarily disabled in all descendants of that node. (@martinthomson opines in #248 that "'sync-xhr' seems like its primary application is creating bustage in framed content.").

Moving this to Document Policy would provide a better model; the availability could be controlled per-document, without necessarily affecting framed content. (Required policies could be used to provide guarantees of non-usage in subframes if absolutely necessary, with their cooperation.) It establishes sync-xhr as following the 'document-sandboxing' model, as discussed in #296.

This would also support an eventual deprecation strategy: Switching the default value from true to false will disable synchronous XHR in documents, but would allow the same Document Policy mechanism to re-enable it where required. The additional configuration required should serve to discourage new uses of the API, and should help drive down existing usage, hopefully to the point where it can eventually be removed from the platform.

The text was updated successfully, but these errors were encountered:

martinthomson · 2020-10-29T01:09:24Z

Not sure I like your chances of deprecating, but this is the right choice.

yoavweiss · 2020-10-29T07:48:06Z

(@martinthomson opines in #248 that "'sync-xhr' seems like its primary application is creating bustage in framed content.").

That use case will no longer be enabled, right? That means that a top-level frame will no longer be able to make sure its 3P resources (or even devs from The Other Team™) are not using sync XHR. That seems like the use case for the policy.

What would be the use-case for the equivalent Document Policy? How do we foresee it being used?

annevk · 2020-10-29T08:27:19Z

You can still enforce it, but the framed document has to allow that restriction be imposed upon it. Meaning that the parent cannot alter its control flow in unexpected ways.

mikewest · 2020-10-29T08:46:47Z

This seems right to me. We should likely move/copy document-domain as well.

arturjanc · 2020-10-29T10:57:28Z

A big +1 from me for both this change and for document-domain. In terms of the web's threat model, Document Policy seems like a better fit for these features because -- as Anne said -- it requires consent from an iframe to have its control flow affected by the embedder.

clelland · 2020-10-30T12:52:43Z

Not sure I like your chances of deprecating

It's completely possible that features like this end up as optional legacy features of the platform, that require a bit of configuration to enable, and stay like that for a long time. That's probably not a bad outcome, as long as we can provide better primitives so that new content doesn't need it by default.

I'm still hopeful, though :)

RByers · 2024-02-11T14:17:25Z

I don't think I'm willing to give up in the idea of parent frames removing this power from their children unilaterally. I think that would effectively mean giving up on deprecating sync XHR. I see no realistic path for solving the chicken and egg problem of getting 3Ps to opt into the policy such that embedders can make it required. Instead, I believe we must give parent frames some limited tools to pragmatically reign in the undesirable behavior of their child frames. In the case the child is blocking the parents ability to navigate - that's totally unacceptable IMHO.

The unexpected altered control flow argument seems theoretical to me. Are there any real examples? XHR can already fail for network conditions, why not give parent frames the power to induce such failures? It also seems like an argument to say that we can never deprecate any API ever in a way that causes new failures. If we decide that it's safe from a security perspective to deprecate sync XHR entirely, then it should also be safe to roll that out incrementally under the control of parent frames, right?

I think we're going through a similar debate with unload events right now. I appreciate that WebKit and Gecko are opposed to our permissions policy for control over unload events (and I probably would be in their shoes too). But lacking such pragmatic "hacks" for managing a deprecation responsibly, it leaves the other engines in a position of either 1) causing high breakage such that a big class of sites tells their users to use another browser (WebKit and enterprise sites) or 2) having no realistic path to being able to fix such flaws in the web platform (Gecko).

Personally I think Chromium has a responsibility to push hard on a pragmatic middle ground here, even if it's messy and hacky. I'm not going to argue that phasing out sync xhr is urgent, but I really don't want chromium taking steps backward from making it possible in the long run.

arturjanc · 2024-02-17T21:22:49Z

I definitely support the goal of deprecating sync XHR, but I don't think the reasoning above accurately describes the situation from a security perspective.

The concern is that any time an application has sensitive functionality dependent on data retrieved via sync XHR, and assumes the fetch succeeds (which is not an outrageous assumption), an attacker which disables sync XHR will be able to bypass application logic that expects the data to be present. For example, imagine that an application fetches an array of regular expressions to validate data it will receive via postMessage and appends it to some variable; then for every message it gets, it runs all the validation rules against the value to make sure it's well-formed. An attacker who can zero-out this configuration will be able to bypass the application's security logic which can have application-dependent consequences, up to and including compromise through script execution (e.g. an XSS can happen if the application expects to run a regexp such as /^[a-z]+$/ on the data and then passes the data to an XSS sink such as innerHTML).

XHR can already fail for network conditions, why not give parent frames the power to induce such failures?

In the web's security model attackers don't have the ability to selectively prevent the loading of chosen resources in a target cross-origin document. The fact that any load could potentially fail doesn't mean that it's safe to give attackers the capability to induce such failures -- and creating such a general capability would actually be a significant vulnerability because there are many situations where external scripts contain security-relevant functionality without which the application becomes unsafe. (Sync XHR seems less security-sensitive than scripts, but I'd still be a bit uneasy about giving cross-origin documents control over it.)

It also seems like an argument to say that we can never deprecate any API ever in a way that causes new failures.

This depends on how likely a given capability is to give attackers an interesting primitive that lets them affect cross-origin behavior. For example, I wouldn't be particularly worried about exposing a switch to disable setting document.domain; but I imagine sync XHR is used fairly commonly to load interesting data, so the blast radius of letting attackers selectively break loading such data seems larger.

If we decide that it's safe from a security perspective to deprecate sync XHR entirely, then it should also be safe to roll that out incrementally under the control of parent frames, right?

Sadly, I don't think so. As one example: if the platform disables sync XHR then developers will generally quickly realize that their requests are failing and likely update their application logic to fix this. If we give cross-origin attackers the ability to induce such failures, this will not be visible to application developers -- their application will work as before, but it now might be vulnerable in scenarios like the one I mentioned above.

Practically, my guess is that the risk is probably relatively low and it's possible that the value of removing sync XHR from the platform outweighs the security concerns. But IMHO we need to be clear about the risks of doing this.

RByers · 2024-02-18T02:07:49Z

Thanks @arturjanc, you make some good points I hadn't thought of. Do you think there's some way to help quantify the level of risk here? Like if we were able to reduce usage to some lowish level via outreach and then do a spot check of the top remaining usage to manually evaluate the security risk in each case? And/or maybe we do a GitHub search for uses of sync XHR and come up with a bound on the fraction which could have security implications? If we can find real examples of what you're describing then I'm much more amenable to it being a blocking concern. But if, after trying, we can't find a single example then I'd be quite sad to block progress fixing a platform mistake on a pure hypothetical.

Here's a slightly different idea. Maybe we could get to the point where we could launch this policy with a default allowlist of self so that it's off by default in 3P iframes, but top frames can turn it back on. Given your argument about platform-wide deprecation being safer, this would be significantly less risky right? Of course I don't know what it would take to meet our compat bar for this, but maybe a bit of analysis and outreach could get there...

yoavweiss · 2024-02-19T06:24:10Z

Thanks @arturjanc, you make some good points I hadn't thought of.

Same. Thank you for the clear example!

If we can find real examples of what you're describing then I'm much more amenable to it being a blocking concern. But if, after trying, we can't find a single example then I'd be quite sad to block progress fixing a platform mistake on a pure hypothetical.

I agree it'd be good for us to find out if this is a hypothetical or a real-world risk.
We can probably classify sync XHR usage by cross-origin iframes into categories:

Analytics beaconing
"freeze the world" (where the side effect of stopping the main thread is what developers are trying to accomplish)
Fetching of data.

Only (3) may suffer from this issue, depending on what the data is used for. But (3) is also the one easiest to move to an async fetch without any "functionality" loss (but admittedly, with some refactoring to bubble up asynchronicity up the call stack).

So I'm somewhat optimistic that analysis will reveal that (1) and (2) are the main "use-cases" for sync XHR in cross-origin iframes, where (1) is being replaced by fetchLater and (2) is a user-hostile pattern rather than a legitimate use case.

Here's a slightly different idea. Maybe we could get to the point where we could launch this policy with a default allowlist of self so that it's off by default in 3P iframes, but top frames can turn it back on.

If we can pull that off from a compatibility perspective, it might be better to simply deprecate & remove. Giving top-level frames the ability to turn this back on doesn't give us much over e.g. a temporary deprecation trial. (unless we want to keep sync XHR as a platform feature)

arturjanc · 2024-02-19T10:13:39Z

So I'm somewhat optimistic that analysis will reveal that (1) and (2) are the main "use-cases" for sync XHR in cross-origin iframes, where (1) is being replaced by fetchLater and (2) is a user-hostile pattern rather than a legitimate use case.

This sounds reasonable, but one snag here is that, at least from a security perspective, we need to think not only about the usage of sync XHR in iframes, but in any document. Cross-origin embedding is possible by default in the platform, so any document that relies on sync XHR -- even if it's not explicitly meant to be embedded -- can be iframed by an attacker and have its behavior modified through the sync-xhr switch (unless it has taken steps to protect itself by setting, XFO, frame-ancestors, etc - which only a minority of web-facing content does).

This ties in with @RByers's question about measuring the risk; we can catalog existing uses of sync XHR (via crawls, UKM, or poking around GitHub), but even if we don't find any examples of it being used in a security-sensitive context we'll at most be able to say we don't think this pattern is common. But there still might exist long tail sites that this behavior would make vulnerable; and we sadly don't have a great way to distinguish a hypothetical concern with zero real-world impact from a problem that would affect a small number of niche sites (some of which could be sensitive for users).

Perhaps the best argument for doing this is that AFAIK Chrome has had this Feature Policy flag for a long time and the sky seemingly hasn't fallen, or at least I'm not aware of application-level vulnerabilities that were enabled by this switch.

If we can pull that off from a compatibility perspective, it might be better to simply deprecate & remove. Giving top-level frames the ability to turn this back on doesn't give us much over e.g. a temporary deprecation trial.

+1. Y'all have thought about this much more, but I'm wondering if there's some way to deprecate it globally, but divide this into stages that would limit breakage? For example, say we capped sync XHR to some timeout and only errored out if that limit is exceeded -- this could let us start ratcheting up restrictions on cases that are presumably worst for performance and eventually deprecate it altogether.

yoavweiss · 2024-02-19T11:30:13Z

This sounds reasonable, but one snag here is that, at least from a security perspective, we need to think not only about the usage of sync XHR in iframes, but in any document. Cross-origin embedding is possible by default in the platform, so any document that relies on sync XHR -- even if it's not explicitly meant to be embedded -- can be iframed by an attacker and have its behavior modified through the sync-xhr switch (unless it has taken steps to protect itself by setting, XFO, frame-ancestors, etc - which only a minority of web-facing content does).

Right, we'd need to do this analysis for documents in general, not only cross-origin frames.

+1. Y'all have thought about this much more, but I'm wondering if there's some way to deprecate it globally, but divide this into stages that would limit breakage? For example, say we capped sync XHR to some timeout and only errored out if that limit is exceeded -- this could let us start ratcheting up restrictions on cases that are presumably worst for performance and eventually deprecate it altogether.

I like the approach of failing these fetches based on relative user harm. Ratcheting that down can nudge users of (1) and (3) to look for less harmful alternatives.

annevk · 2024-03-16T08:06:53Z

Couple things I wanted to note:

It seems WebKit has supported sync-xhr as well for about five years: https://bugs.webkit.org/show_bug.cgi?id=202098.
sync-xhr was removed from XMLHttpRequest quite a while ago, it's been non-standard ever since (perhaps it wouldn't have happened with the current WHATWG Working Mode though, as there seemingly was no follow through with tests and implementer agreement): whatwg/xhr@9157397

Given that two browser engines have deployed this for quite a while I think I'm not opposed to keeping it in, it's definitely the more straightforward option, but I do think we all need to agree that it's not a precedent we can build on going forward for the reasons Artur helpfully laid out.

This was referenced Nov 6, 2020

Replace Feature Policy integration with Document Policy. whatwg/xhr#295

Open

Replace Feature Policy integration with Document Policy whatwg/xhr#296

Open

clelland mentioned this issue Nov 17, 2020

Feature Policy: Document Policies w3ctag/design-reviews#408

Closed

5 tasks

theblue888 mentioned this issue Dec 6, 2020

Proposal: Transition 'document-domain' feature to Document Policy #414

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: Transition 'sync-xhr' feature to Document Policy #410

Proposal: Transition 'sync-xhr' feature to Document Policy #410

clelland commented Oct 28, 2020

martinthomson commented Oct 29, 2020

yoavweiss commented Oct 29, 2020

annevk commented Oct 29, 2020

mikewest commented Oct 29, 2020

arturjanc commented Oct 29, 2020

clelland commented Oct 30, 2020

RByers commented Feb 11, 2024

arturjanc commented Feb 17, 2024

RByers commented Feb 18, 2024

yoavweiss commented Feb 19, 2024

arturjanc commented Feb 19, 2024 •

edited

Loading

yoavweiss commented Feb 19, 2024

annevk commented Mar 16, 2024

Proposal: Transition 'sync-xhr' feature to Document Policy #410

Proposal: Transition 'sync-xhr' feature to Document Policy #410

Comments

clelland commented Oct 28, 2020

martinthomson commented Oct 29, 2020

yoavweiss commented Oct 29, 2020

annevk commented Oct 29, 2020

mikewest commented Oct 29, 2020

arturjanc commented Oct 29, 2020

clelland commented Oct 30, 2020

RByers commented Feb 11, 2024

arturjanc commented Feb 17, 2024

RByers commented Feb 18, 2024

yoavweiss commented Feb 19, 2024

arturjanc commented Feb 19, 2024 • edited Loading

yoavweiss commented Feb 19, 2024

annevk commented Mar 16, 2024

arturjanc commented Feb 19, 2024 •

edited

Loading