Consider a different process for standardizing registerProtocolHandler schemes #9158

Open
zcorpan opened this issue Apr 13, 2023 · 33 comments
Labels
meta Changes to the ecosystem around the standard, not its contents. topic: custom protocols

Comments

@zcorpan
Member

zcorpan commented Apr 13, 2023

Related: #8503 #9154

It's not clear that Chromium, Gecko and WebKit should be gatekeepers for adding schemes to registerProtocolHandler(), since the interest in adding a new scheme may come from something that is not relevant to browsers. I received feedback internally at Mozilla that we'd prefer a process where we don't need to be consulted for adding to the list of supported schemes.

Thoughts?

@zcorpan added the meta label on Apr 13, 2023

@rastislavcore

IMHO RFC schemes should be approved. For the rest, there should be a discussion prior to implementation.

@domenic
Member

domenic commented Apr 13, 2023

> I received feedback internally at Mozilla that we'd prefer a process where we don't need to be consulted for adding to the list of supported schemes.

I'd like to hear more about this. What process would you prefer to use for landing code in https://searchfox.org/mozilla-central/source/dom/base/Navigator.cpp#886 , which does not involve consulting anyone at Mozilla?

@rastislavcore

There are two ways to go about this question:

  • formal
  • decentralized

If we want to follow the standards process strictly, we should rely on the streams that the RFC series accepts: the Internet Engineering Task Force (IETF), the Internet Research Task Force (IRTF), the Internet Architecture Board (IAB), and Independent Submissions. Independent Submissions are available, but the process is very long.

Another way to go (which is my favorite) is to create a DAO entity or project based on a voting system, or even a new repo here on GitHub, that collects all the pros and cons of implementing a proposed protocol.

I assume the formal way is now out of the question, because schemes such as bitcoin and matrix have already been introduced.

Then the only way is to keep the freedom of choice but require sufficient explanation and use cases. Obviously, an RFC counts as a use case as well.

Just my 10 cents, but I will be happy to elaborate.

@zcorpan
Member Author

zcorpan commented Apr 13, 2023

@domenic Once it has landed in the spec it's fine to need engineering and review from Mozilla to land in Gecko, as it's a trivial change to add a new scheme.

For preferred process for adding schemes to the spec, maybe something like:

  • a standards-track spec for the scheme exists
  • evidence that there are 2+ interested independent parties for using the scheme with registerProtocolHandler
  • the HTML spec editors are trusted to not allow schemes that are unsafe

@domenic
Member

domenic commented Apr 13, 2023

So you're saying, Gecko would pledge to automatically lend engineering support for anything in that kind of category? And would like to have the editors be responsible for gathering and collating that information (or asking the proposers to do so)? That seems reasonable, but is largely a Gecko policy for how they want to manage their codebase.

For Chromium, it seems like there's very little willingness to add schemes. I think at least three Chromium engineers have bounced off trying to do so, unable to get appropriate approvals: @fred-wang, @mgiuca, and @mustaqahmed, if I recall correctly. (I know @javifernandez is also working in this area but I'm not sure if he's tried to add any new schemes.) So I'm not hopeful for Chromium making a similar policy in how they manage https://source.chromium.org/chromium/chromium/src/+/refs/heads/main:third_party/blink/common/custom_handlers/protocol_handler_utils.cc;l=66;drc=837cc12de25a288edf3ac222f7265c9936e69552;bpv=1;bpt=1 .

@martinthomson

So as the instigator here, I should probably weigh in.

The concern here is not really about preserving engineering resources on our end; if things started getting high volume, we might automate something. The concern has a few aspects:

  1. Gatekeeping, as Simon said. Having people ask Mozilla, Apple, and Google for permission to do a thing here (and web+ doesn't count if they have a scheme of their own), is not a good use of anyone's time. Proposers and WHATWG both.

  2. Endorsement/disendorsement. We don't want to be in a position to have to endorse schemes. I personally have very strong objections to bitcoin: as something connected to a larger idea, but I don't want to have Mozilla conclude one way or the other regarding that technology by virtue of having to endorse its inclusion in this list. The same goes for the next planet-destroying technology that is referenced in this way.

This part of the specification isn't central to HTML; it is a point where it interfaces with the messy, non-interoperable world outside of HTML. A different process is appropriate.

Our experience with this sort of list (a registry if you will) in the IETF is that the higher the bar you set[1], the more damage it causes. That's counterintuitive, because bad designs are bad and the use of a registry seems like a great way to head off bad ideas. But when the bar is high, people just don't register their scheme.

The harm that comes from not registering something is immediately obvious here. After all, if it is not on the list, browsers won't implement, because browsers are good at tracking WHATWG decisions and specifications. However, the IETF has a somewhat looser policy for registering new URI schemes (though the registration process is pretty onerous, to the point that only Dave Thaler seems to have bothered to register many of them). But the IETF policy isn't the operative one: operating systems exercise no such control, so there are many URI schemes that are in use. But web applications cannot access those schemes. That harms the web, at least relative to native applications.

So I think that this allowlist would be better off as a blocklist (that lists "http", "https", "file", "about", "chrome", and anything we can identify as being outright dangerous). Then we would either have no further restriction or a restriction to only permit registration of schemes from the IANA registry. The advantage of having no allowlist is that maintenance is much easier. The advantage of letting IANA/IETF manage this is that they have procedures that already deal with the business of requesting specifications and whatnot[2], so maybe you get better adherence to standards. I personally prefer no further restriction, on the basis that IANA/IETF have not been able to attract registrations for many of the schemes that are in wide use[3].
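
As a rough sketch of the two options in the paragraph above, the check could look something like the following TypeScript. The `BLOCKED` list copies the five schemes named there. `ianaRegistered` is a hypothetical stand-in for a lookup against the IANA URI scheme registry (a tiny illustrative sample, not the real registry), and `mayRegister` and `Policy` are names made up for this example.

```typescript
// Schemes named above as never registrable. Anything "outright dangerous"
// would be added here too; this list is illustrative, not normative.
const BLOCKED: string[] = ["http", "https", "file", "about", "chrome"];

// Hypothetical sample standing in for the IANA URI scheme registry.
const ianaRegistered: string[] = ["mailto", "magnet", "bitcoin", "geo"];

type Policy = "blocklist-only" | "blocklist-plus-iana";

function mayRegister(scheme: string, policy: Policy): boolean {
  const s = scheme.toLowerCase();
  if (BLOCKED.indexOf(s) !== -1) {
    return false; // blocked under either policy
  }
  if (policy === "blocklist-plus-iana") {
    // Second option: the scheme must also appear in the registry.
    return ianaRegistered.indexOf(s) !== -1;
  }
  return true; // first option: no further restriction
}
```

Under these assumptions, an unregistered but widely used scheme such as zoommtg passes the "blocklist-only" policy and fails the "blocklist-plus-iana" one, which is the trade-off being weighed.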

However, neither option addresses the risk that a scheme might appear on the list that is manifestly unsafe to expose to the web. That's a risk that increases here. Though URI schemes are supposed to be safe, they manifestly are not. For instance, capability URIs are a thing, regardless of scheme, and having a web site intercept a capability URI intended for a native application might cause real problems that are not adequately counterbalanced by any prompting that a user agent might do in response to a call to registerProtocolHandler. I think that browsers generally do a reasonable job here, and that we should defer more to our users in these cases on the edges of the Web[4], but I do want to make that decision explicit.

Footnotes

  1. This is currently a very high bar, because it simply follows other WHATWG procedures that exist for a very good reason. For most of what we do here, those procedures are absolutely necessary. Not so here.

  2. Specifications are encouraged, but not mandatory for provisional registrations. Some of the things on the current allowlist are provisional and not specified to any satisfactory degree; for example, the magnet registration cites a Wikipedia page.

  3. zoommtg for instance isn't listed, but it is very widely used. That clearly doesn't fit with the principles for registration set out in RFC 7595. I just take that as more evidence that trying to control how these are used is futile.

  4. I would also contend that the harm caused by a website that handles a sensitive scheme is far less than the harm caused if a website can activate a scheme that is handled by a native application.

@rastislavcore

Just a note. Instead of bitcoin: there is ethereum: and many more listed here https://developer.mozilla.org/en-US/docs/Web/API/Navigator/registerProtocolHandler#browser_compatibility

In your point 2 I somehow agree that the opinion is tendentious, but strongly disagree with "The same goes for the next planet-destroying technology that is referenced in this way." I assume this is off-topic now.

@javifernandez
Contributor

> For Chromium, it seems like there's very little willingness to add schemes. I think at least three Chromium engineers have bounced off trying to do so, unable to get appropriate approvals: @fred-wang, @mgiuca, and @mustaqahmed, if I recall correctly. (I know @javifernandez is also working in this area but I'm not sure if he's tried to add any new schemes.) So I'm not hopeful for Chromium making a similar policy in how they manage https://source.chromium.org/chromium/chromium/src/+/refs/heads/main:third_party/blink/common/custom_handlers/protocol_handler_utils.cc;l=66;drc=837cc12de25a288edf3ac222f7265c9936e69552;bpv=1;bpt=1 .

My colleague @fred-wang previously worked on a proposal to add a few dweb schemes to the HTML spec's safe-list. Ultimately, I followed up, but split the effort to focus first on ipfs. I've already filed standards position requests for Firefox.

However, on the question of leaving browsers out of the scheme standardization process, I don't have a strong opinion. I see good points on both positions, tbh.

@RByers

RByers commented Apr 21, 2023

I only have a little context in this topic and will have to get caught up with the positions of other experts in Chrome, especially around security. But I wanted to say that @martinthomson's points really resonate with me. I think we should always actively work to avoid the temptation to gatekeep unless there's a very compelling and concrete reason. I think we can protect users from abuse without having to explicitly decide to endorse each experiment with a new scheme, and we can always add stronger warnings or outright block schemes that turn out to be a problem for our users in practice.

I don't really like it, but note that Chrome already has an extremely wide open model on Android where any website can invoke an intent: URL to talk to any Android app, usually without any browser UI warning the user. Having a standardized list of schemes for registerProtocolHandler while having a completely wide open communication channel to Android's equivalent proprietary mechanism seems to me to be at best inconsistent, and at worst one more reason to encourage exploration and innovation to occur in proprietary app platforms instead of on the web.

@mgiuca
Contributor

mgiuca commented Apr 24, 2023

Some housekeeping: we have a still-open issue #3998 for converting the allowlist to a blocklist (essentially allowing all schemes to be used as handlers, other than a handful of harmful or problematic schemes like http and about). I would encourage people to read over that as it has a lot of useful background and arguments.

This bug was originally filed as streamlining the process for adding schemes to the allowlist, but @martinthomson's and @RByers's comments seem to be moving in the direction of converting the allowlist to a blocklist. Just so we're clear, both of those options are on the table here.

(FWIW: I agree with Martin and Rick and I think we'd be in a much better place if we just switch to a blocklist, and I made several arguments for that on #3998. As I said there, I don't think the current scheme buys us any security advantage, though I think we need to consult with security folks on that. There are potential compat risks with opening it wide up, but I think that ship has already sailed by allowing native apps to register whatever they want. This just allows web apps to catch up.)

@annevk
Member

annevk commented Apr 24, 2023

Is the expectation here that end users understand arbitrary schemes and can make reasonable decisions about them?

@martinthomson

The understanding is that the content of arbitrary schemes is not an inherent risk to users if it is passed to an agent of their choice.

The main risk is if the link itself carries information that the site should not be able to access (see also capability URIs). That is, the user erroneously decides to allow a site to receive URIs for a given scheme, when those URIs carry information intended for a different entity. This is a risk that exists regardless of the scheme. mailto URIs carry this risk, too. All URI schemes potentially do.

That makes the user decision less about the security of links and more about delegation of agent authority, annoyance mitigation, and other factors. Delegating authority to act as an agent for a URI scheme is a powerful capability, certainly, but I'd suggest that the only way to deal with that is to disable registerProtocolHandler entirely as the scheme is the wrong axis along which to exert control.
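
To make the "link carries information" risk concrete: a handler registered through registerProtocolHandler receives the entire clicked URL, substituted into the `%s` placeholder of its handler URL. The sketch below approximates that substitution with `encodeURIComponent` (the HTML spec defines its own percent-encode set, so treat this as an approximation). The handler template, the `web+doc` scheme, and the `token` parameter are made-up examples, and `buildHandlerUrl` is a hypothetical helper, not a browser API.

```typescript
function buildHandlerUrl(template: string, clickedUrl: string): string {
  // Replace the first "%s" with the escaped URL. A replacer function is used
  // so that "$" sequences in the URL are not treated as replacement patterns.
  return template.replace("%s", () => encodeURIComponent(clickedUrl));
}

// A capability-style link: whoever holds the token gets access.
const handlerUrl = buildHandlerUrl(
  "https://handler.example/open?url=%s",
  "web+doc:report?token=s3cret"
);
// handlerUrl is now
// "https://handler.example/open?url=web%2Bdoc%3Areport%3Ftoken%3Ds3cret"
```

Whatever the scheme, the registered site sees everything in the link, including the token, which is why the scheme name alone is a weak axis for controlling this risk.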

@annevk
Member

annevk commented Apr 24, 2023

So giving arbitrary websites control over say, zoommtg, is not a concern?

@martinthomson

It depends on what you think you are protecting. Zoom might be unhappy if someone built a client that handled zoommtg (their client is entirely proprietary, and that appears to be a deliberate choice on their part), but if a user chooses to have a site intercept those links, that would be their choice (in my view).

There is potential for this to be used by sites in an attempt to hijack access, which a user might accidentally permit. That would be annoying, but we need mechanisms for correcting that sort of error, even when the choice was deliberate. For that, operating systems do provide some interfaces for high-value stuff and browsers offer UX for managing the small number of schemes that they have registered. Firefox has a list in settings that covers both links to apps (like zoommtg) and links to sites (like mailto for webmail sites). In your hypothetical case a Zoom client could re-associate when it detects this has happened.

@zcorpan
Member Author

zcorpan commented May 3, 2023

I'm warming up to the idea of switching to a blocklist. Disallowing anything not in the IANA registry means there's at least some friction to start using a new scheme, which makes it harder to hijack typo/lookalike schemes. The disadvantage is that there's still the delay from minting a new scheme to being able to use it in browsers.

I also think we should add all browser-specific schemes like "chrome" to the spec's blocklist, for reasons @annevk outlined in #3998 (improve interop and web compat). If Zoom would like zoommtg to be on the blocklist, we can add it to the blocklist. The IANA registry has contact information for all registered schemes, so we could proactively ask if any of them should also be on the blocklist, so that they're not surprised when other websites can suddenly register their scheme.

@annevk
Member

annevk commented May 3, 2023

I have to say that it's still very much unclear to me how you can have a good user interface beyond a limited set of schemes that correspond to tasks end users understand.

> In your hypothetical case a Zoom client could re-associate when it detects this has happened.

So you are comfortable forcing this on all apps (which I think includes most macOS/iOS apps)? Presumably they would now all have to start caring about websites in Chrome or Firefox hijacking their schemes.

@martinthomson

I tend to think that we should deny Zoom the ability to be added to a blocklist. We're not in the business of protecting apps from their competition. Also, that would stop the Zoom website itself from handling those links, which is something that it could do, should it choose to. Finally, in this case at least, there are probably enough technical hurdles to clear for a functional site to intercept the link that we're only looking at an abuse scenario.

Better UX seems to be the answer to most of these issues. For instance, browsers might add a confirmation notice before opening the handler site on the first launch. That might help with accidental interceptions. And I don't see why a browser couldn't do more, maybe with a short list of schemes that they understand and for which they loosen any stricter handling. Schemes that are natively understood can maybe be explained better ("Allow this to become your default email sender? [y/n]"), so that makes sense.
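
One way to picture the "short list of schemes that they understand" idea is a lookup that returns a tailored prompt for known schemes and a generic one otherwise. This is only a sketch: `KNOWN_PROMPTS`, `promptFor`, and the wording of the generic fallback are all hypothetical, and only the mailto wording comes from the paragraph above.

```typescript
// Hypothetical per-scheme prompts for schemes the browser "understands".
const KNOWN_PROMPTS: { [scheme: string]: string } = {
  mailto: "Allow this to become your default email sender?",
};

function promptFor(scheme: string, siteOrigin: string): string {
  const s = scheme.toLowerCase();
  const known = KNOWN_PROMPTS[s];
  if (typeof known === "string") {
    return known; // tailored explanation for a task users recognize
  }
  // Unknown scheme: fall back to a generic, more cautious message.
  return `Allow ${siteOrigin} to open all ${s}: links?`;
}
```

A browser could pair the generic branch with stricter handling, such as the first-launch confirmation mentioned above, and relax it only for schemes on the known list.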

Is there anything other than the potential for users to understand a scheme that is getting in the way here?

@mgiuca
Contributor

mgiuca commented May 4, 2023

> Presumably [native apps] would now all have to start caring about websites in Chrome or Firefox hijacking their schemes.

Yes, but they already have to care about other native apps hijacking their schemes. (e.g. what's to stop a native app from claiming zoommtg:, in fact without asking the user on some systems). This just lets websites do the same, but in accordance with the web security model which is asking the user for permission to do so.

@domenic
Member

domenic commented May 24, 2023

So with my editor hat on, what I'm seeing here is implementer interest from both Mozilla and Chromium to switch to a blocklist-based approach. That meets our criteria for changing the standard. (Especially since those are the only two implementers of registerProtocolHandler!)

It seems like the remaining work is for someone to drive the creation of the blocklist, and updates to the spec and web platform tests. Do we have any volunteers?

@zcorpan
Member Author

zcorpan commented Jun 16, 2023

I started looking into what the blocklist should contain based on what browsers support and comments in #3998. PTAL

https://docs.google.com/spreadsheets/d/1_jA5brgGVwTNvCzVDagfwCAZQIH2gYq6RlBS4QtSX1o/edit#gid=0

copy of the sheet's current state
| Scheme | Firefox | Chrome | Safari | IANA | URL standard | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| Containing any character other than ASCII alphanumeric, U+002B (+), U+002D (-) (don't allow .) | FALSE | FALSE | FALSE | FALSE | TRUE | https://url.spec.whatwg.org/#scheme-state |
| Containing only numbers | FALSE | FALSE | FALSE | FALSE | FALSE | #3998 (comment) |
| A single ASCII alpha | TRUE | TRUE | FALSE | FALSE | TRUE | https://url.spec.whatwg.org/#windows-drive-letter |
| localhost | FALSE | FALSE | FALSE | FALSE | FALSE | #3998 (comment) |
| localdomain | FALSE | FALSE | FALSE | FALSE | FALSE | #3998 (comment) |
| chrome | TRUE | TRUE | FALSE | FALSE | FALSE | #3998 (comment) |
| chrome- prefix | FALSE | TRUE | FALSE | FALSE | FALSE | #3998 (comment) |
| cros | FALSE | TRUE | FALSE | FALSE | FALSE | #3998 (comment) |
| android-app | FALSE | TRUE | FALSE | FALSE | FALSE | #3998 (comment) |
| content | FALSE | TRUE | FALSE | FALSE | FALSE | #3998 (comment) |
| cid | FALSE | TRUE | FALSE | FALSE | FALSE | #3998 (comment) |
| view-source | TRUE | TRUE | FALSE | FALSE | FALSE | #3998 (comment) |
| about | TRUE | TRUE | TRUE | FALSE | FALSE | |
| example | FALSE | FALSE | FALSE | TRUE | FALSE | https://www.rfc-editor.org/rfc/rfc7595.html#section-8.1 |
| filesystem | FALSE | TRUE | FALSE | TRUE | FALSE | https://source.chromium.org/chromium/chromium/src/+/main:url/gurl.cc;l=389?q=schemeis&ss=chromium%2Fchromium%2Fsrc&start=31 |
| file | TRUE | TRUE | TRUE | TRUE | FALSE | |
| http | TRUE | TRUE | TRUE | TRUE | TRUE | |
| https | TRUE | TRUE | TRUE | TRUE | TRUE | |
| ws | TRUE | TRUE | TRUE | TRUE | TRUE | |
| wss | TRUE | TRUE | TRUE | TRUE | TRUE | |
| blob | TRUE | TRUE | TRUE | TRUE | TRUE | |
| data | FALSE | FALSE | FALSE | FALSE | FALSE | |
| javascript | FALSE | FALSE | FALSE | FALSE | FALSE | |
| moz- prefix | TRUE | FALSE | FALSE | FALSE | FALSE | |
| page-icon | TRUE | FALSE | FALSE | FALSE | FALSE | |
| resource | TRUE | FALSE | FALSE | FALSE | FALSE | https://searchfox.org/mozilla-central/source/netwerk/base/nsNetUtil.cpp#1914 |
| indexeddb | TRUE | FALSE | FALSE | FALSE | FALSE | https://searchfox.org/mozilla-central/source/netwerk/base/nsNetUtil.cpp#1922 |
| uuid | TRUE | FALSE | FALSE | FALSE | FALSE | https://searchfox.org/mozilla-central/source/netwerk/base/nsNetUtil.cpp#1922 |
| jar | TRUE | FALSE | FALSE | FALSE | FALSE | https://searchfox.org/mozilla-central/source/netwerk/base/nsNetUtil.cpp#1959 |
| smb | TRUE | FALSE | FALSE | FALSE | FALSE | https://searchfox.org/mozilla-central/source/netwerk/base/nsNetUtil.cpp#1972 |
| android | TRUE | FALSE | FALSE | FALSE | FALSE | https://searchfox.org/mozilla-central/source/netwerk/base/nsNetUtil.cpp#1980 |
| app-settings | FALSE | TRUE | FALSE | FALSE | FALSE | https://source.chromium.org/chromium/chromium/src/+/main:ios/chrome/browser/app_launcher/app_launcher_tab_helper.mm;l=43?q=schemeis&ss=chromium%2Fchromium%2Fsrc |
| fuchsia-pkg | FALSE | FALSE | FALSE | FALSE | FALSE | https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/fuchsia/element_manager_impl.cc;l=26?q=schemeis&ss=chromium%2Fchromium%2Fsrc&start=11 |
| isolated-app | FALSE | FALSE | FALSE | FALSE | FALSE | https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/ui/chrome_pages.cc;l=223?q=schemeis&ss=chromium%2Fchromium%2Fsrc&start=11 |
| ipps | FALSE | TRUE | FALSE | FALSE | FALSE | https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/ash/printing/server_printers_fetcher.cc;l=247?q=schemeis&ss=chromium%2Fchromium%2Fsrc&start=51 |
| ipp | FALSE | TRUE | FALSE | FALSE | FALSE | https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/ash/printing/server_printers_fetcher.cc;l=247?q=schemeis&ss=chromium%2Fchromium%2Fsrc&start=51 |
| urn | FALSE | TRUE | FALSE | FALSE | FALSE | https://source.chromium.org/chromium/chromium/src/+/refs/heads/main:url/url_constants.cc;l=45;drc=f80633b34538615fcb73515ad8c4bc56a748abfe |
| uuid-in-package | FALSE | TRUE | FALSE | FALSE | FALSE | https://source.chromium.org/chromium/chromium/src/+/refs/heads/main:url/url_constants.cc;l=47;drc=f80633b34538615fcb73515ad8c4bc56a748abfe |
| applewebdata | FALSE | FALSE | TRUE | FALSE | FALSE | https://github.com/WebKit/WebKit/blob/515867b8e7c87f14b42d625c9b5181740568944c/Source/WebCore/page/SecurityOriginData.cpp#L198 |
| x-apple- prefix | FALSE | FALSE | TRUE | FALSE | FALSE | https://github.com/WebKit/WebKit/blob/515867b8e7c87f14b42d625c9b5181740568944c/Source/WebCore/page/SecurityOriginData.cpp#L198 |
| webkit- prefix | FALSE | FALSE | TRUE | FALSE | FALSE | https://github.com/WebKit/WebKit/blob/515867b8e7c87f14b42d625c9b5181740568944c/Source/WebCore/page/SecurityOriginData.cpp#L206 |
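
For illustration, the three syntactic rows at the top of the sheet, plus exact names and prefix rows, could be checked roughly as follows. This is a sketch that assumes every row is a blocklist candidate. `BLOCKED_NAMES` is only a subset of the exact-match rows, and the function and constant names are hypothetical.

```typescript
// Exact-match entries copied from a subset of the sheet (not exhaustive).
const BLOCKED_NAMES: string[] = [
  "about", "blob", "data", "file", "http", "https", "javascript", "ws", "wss",
];

// Prefix rows from the sheet.
const BLOCKED_PREFIXES: string[] = ["chrome-", "moz-", "webkit-", "x-apple-"];

function isBlockedCandidate(scheme: string): boolean {
  const s = scheme.toLowerCase();
  if (!/^[a-z0-9+-]+$/.test(s)) return true; // chars other than alnum, "+", "-"
  if (/^[0-9]+$/.test(s)) return true; // containing only numbers
  if (/^[a-z]$/.test(s)) return true; // single ASCII alpha (drive letter)
  if (BLOCKED_NAMES.indexOf(s) !== -1) return true; // exact-match rows
  // Prefix rows: compare the start of the scheme against each prefix.
  return BLOCKED_PREFIXES.some((p) => s.slice(0, p.length) === p);
}
```

With these assumptions, a scheme such as zoommtg or web+coffee is not a candidate, while "c", "123", "web.foo", and anything starting with "moz-" are.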

@zcorpan
Member Author

zcorpan commented Jun 22, 2023

Should intent be on the blocklist?

@RByers

RByers commented Jun 27, 2023

For the record, I'm supportive of this for chromium and pretty sure it would pass an I2S (though we'll need a spec change to go through the formal process).

Yes I think intent should be on the blocklist, see documentation.

More important than the specific list of schemes to block is, I think, the principles we'd use for deciding what should be on the blocklist. Perhaps just "any scheme that is implemented strictly internally by major browser engines and host OSes"? Critically, I agree that schemes commonly used by applications should not be on the blocklist.

@RByers

RByers commented Jun 27, 2023

Searching chromium code it's unfortunately not likely we can come up with an exhaustive list, but this seems to be a pretty good superset. There's a bunch of obscure things, eg. ChromeOS-specific implementation details, I'm not sure how much we should worry about trying to figure out if they need to be on the blocklist or not? Since they are impl details, if we find a conflict with a website we could consider treating it as a chrome bug and just rename to use the chrome- prefix.

@zcorpan
Member Author

zcorpan commented Jun 29, 2023

Thanks @RByers!

> More important than the specific list of schemes to block is, I think, the principles we'd use for deciding what should be on the blocklist. Perhaps just "any scheme that is implemented strictly internally by major browser engines and host OSes"? Critically, I agree that schemes commonly used by applications should not be on the blocklist.

This makes sense to me.

Outside of Android, are there OS-specific schemes we should add?

@RByers

RByers commented Jun 29, 2023 via email

@mgiuca
Contributor

mgiuca commented Jun 30, 2023

Looking through those search results @RByers posted, with an eye to the ChromeOS ones, they feel to me like they won't be affected.

This will only be a problem for URL schemes that the system uses for navigations in the browser, but I suspect most or all of these are used by other systems outside of the browser.

The list of ChromeOS-specific schemes that I can find in that list:

  • smb (standard scheme, we probably want websites to be able to take over this at the user's discretion)
  • opentab
  • keyboard_shortcut
  • vmfile
  • steam (specifically intercepted from the browser, but if the user wants a different app to handle these URLs, they should)
  • drivefs
  • intent (discussed above)
  • invalid, direct, socks, socks5, quic

We would probably need to go through each one in detail, but I doubt most of these are things that you click links to in the browser and are intercepted by the OS. If some of these are used for internal UI pages (similar to chrome:), I'm not sure whether having a handler register them would pose a problem.

Either way, I wouldn't want a blocklist in the spec that blocks ChromeOS-specific things, nor common things like steam which belong to a third-party app that happens to be special-cased at the OS level. I'd want the spec to have a carve-out that says the user agent can block any additional schemes that it wants. This creates a slight incompatibility issue, but it's more along the lines of features that happen not to work in certain user agents.

@martinthomson

ChromeOS is maybe not the benchmark I would use. smb is a particularly good example. I interpret that as equivalent to file, which ChromeOS might want to make available to sites/apps as well. I would not want to open up file in that way.

A site that handles smb is unlikely to be able to access the identified files. But the content of the URI (i.e., the identity of those files) is worth protecting. registerProtocolHandler can result in OS-level changes to how these schemes are handled, so it's not just about what is clicked in the browser. If existing systems will generate file or smb links and expect the system handler to deal with those appropriately when clicked, having the URI passed to a web site is at best surprising.

It is possible at least that the OS would also act to protect file URLs so that apps can't take them over, so this point might be moot for file.

My general position here is that if there is an identifiable risk to users, a scheme is a candidate for the blocklist. Not blocking something because sites might need access should be assessed on the basis of the risk to users of standard browsers, not something like ChromeOS. ChromeOS can selectively loosen the list if it deems things to be safe, but it can do that in the same way that it exposes other APIs that should not be exposed to the web (I don't want to start a fight here, so I won't start listing examples). I consider the above to be sufficient risk, but I can see ChromeOS might need a very different threshold. I would prefer if ChromeOS maintained an exception list.

Similar logic applies to intent, if indeed ChromeOS wants those passed to web sites. I would look for additional safeguards in all those cases, though, like ensuring that apps^Wsites can only receive intents for which they are approved.

I have insufficient information on the other items that are listed there. None appear on the list so far.

I agree that steam definitely doesn't go on a blocklist any more than zoommtg does. I have no problem with browsers recognizing widely used schemes and applying extra treatment (different UX for instance) but that's outside of this.

quic/socks/direct/etc... don't belong here. They appear as a consequence of being used in proxy.pac. Those aren't URI schemes though.

@zcorpan
Member Author

zcorpan commented Jun 30, 2023

>   • smb (standard scheme, we probably want websites to be able to take over this at the user's discretion)

Removed smb from the spreadsheet. Thanks!

@mgiuca
Contributor

mgiuca commented Jul 3, 2023

@martinthomson sorry, I didn't mean to imply we should special-case all those. I was responding to @RByers by going through the schemes used by ChromeOS, concluding that they are either schemes that aren't used in the browser at all (but by other parts of the OS) or that we shouldn't prevent sites from registering as an alternative. I generally agree that we should not blocklist schemes just because some OS or other makes special use of the scheme. But we should (perhaps obviously) leave it open for a user agent or OS to add its own blocklist, in case there are special schemes that, in the context of that OS, should not be registrable by sites.

> If existing systems will generate file or smb links and expect the system handler to deal with those appropriately when clicked, having the URI passed to a web site is at best surprising.

For file I agree: file is defined as "find this file on my system" and an application should not be able to override that. However, smb is just a protocol, and a web app might be better suited to opening it than a native application (as websites become more powerful, gaining app-like capabilities). The purpose of registerProtocolHandler is to allow sites to handle things traditionally in the domain of an application. Saying "having the URI passed to a web site is at best surprising" is relegating websites to second-class citizens. My goal is to make them first-class citizens, at least when properly installed. So I don't think we should blocklist schemes when they would merely "surprise" a user to have them open in a web browser. We should expect that users are in many cases not thinking of the thing that opens as a web browser, since it may be styled like an app, and thought of as an app.

Consider: In 2006, Firefox added registerProtocolHandler for mailto, and you could have made a similar claim "The user expects a mailto URL to be handled by their mail client. Having it open in a web browser is surprising." This claim seems absurd now, as we don't really have mail clients outside the browser, but that's only because registerProtocolHandler led the way. The purpose of this change is to stop gatekeeping, at the spec level, what schemes can be handled by a website, and let developers and users decide which schemes they are OK letting a website handle.

> My general position here is that if there is an identifiable risk to users, a scheme is a candidate for the blocklist.

I think that is too high a bar. We can always identify a risk to users of allowing a website to intercept clicked URLs, for any scheme, especially if we consider merely leaking the contents of a URI to a site as a "risk". mailto allows the registered site to read email addresses. geo allows the registered site to read potentially very accurate geo coordinates. We can't blocklist schemes based on the possibility that there is sensitive information in the URL. Remember that users must opt in to registering a site, and part of that opt-in process is accepting that any URLs of that scheme will leak to the site.

I think the bar for the blocklist should be "having any application (website or a "real" app) intercepting this URL would break basic assumptions about how the web works" (e.g. http, file, about, ws, blob), with individual UAs able to block additional schemes for the same reason (e.g. chrome, moz, intent), but those shouldn't be on a spec blocklist. Anything else is up to the user if they want to expose URLs of that type to the app.

Not blocking something on because sites might need access should be assessed on the basis of the risk to users of standard browsers, not something like ChromeOS. ChromeOS can selectively loosen the list if it deems things to be safe...

This seems to suggest that we have a standard blocklist, but user agents can selectively remove schemes from the blocklist if they think they are safe. I think we could do that, but it would be akin to adding non-standard APIs to the web. My preference would be to have a minimal blocklist, and let user agents selectively add schemes to it, if they have a system-specific reason why it is unsafe, which would be akin to a user agent choosing not to implement a given API, in general much less of a compatibility problem.

Looking at Simon's list, I would put on the blocklist just the ones that are common to Chrome, Firefox and Safari: about, file, http, https, ws, wss, blob. I'm also not sure why data and javascript aren't blocked: those are pretty important. Everything else is fair game, IMO, to be registered by an app, or blocked by individual user agents for being system-critical on that particular system.
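
To make the "minimal blocklist, UAs may only add to it" model concrete, here is a rough sketch. The names `CORE_BLOCKLIST`, `normalizeScheme` and `makeSchemeChecker` are hypothetical and are not taken from any spec or browser; the core list simply collects the schemes named in this comment.

```typescript
// Hypothetical core list: the schemes named in this comment, nothing more.
const CORE_BLOCKLIST: readonly string[] = [
  "about", "blob", "data", "file", "http", "https", "javascript", "ws", "wss",
];

// Accepts "FILE:", " file " or "file" and returns "file".
function normalizeScheme(scheme: string): string {
  return scheme.trim().replace(/:$/, "").toLowerCase();
}

// Hypothetical helper: a user agent can extend the core list but never shrink it.
// Returns a predicate that says whether a site may ask to handle a scheme.
function makeSchemeChecker(uaExtras: readonly string[] = []): (scheme: string) => boolean {
  const blocked = new Set<string>(CORE_BLOCKLIST);
  uaExtras.forEach((s) => blocked.add(normalizeScheme(s)));
  return (scheme: string) => !blocked.has(normalizeScheme(scheme));
}

// A generic UA uses the core list only; a UA with system-critical schemes adds its own.
const genericUa = makeSchemeChecker();
const strictUa = makeSchemeChecker(["chrome", "intent"]);
```

With this shape, a scheme that registers in the strictest UA also registers everywhere else, which is the compatibility property argued for above.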

@martinthomson

That's a pretty reasonable position. I do want to push back on your stated inclusion principle a little though:

having any application (website or a "real" app) intercepting this URL would break basic assumptions about how the web works

On the basis that registered handlers involve changing how the entire system operates, not just the browser, we do need to consider the system as a whole as part of this. That is, "having any application intercept this URL would break basic assumptions about how the web - or operating system - operates".

I think that file fits that definition better than anything that involves the browser itself. But if you consider it, the reason isn't down to the mechanisms involved in resolution. https://example.com is no more able to resolve /etc/passwd than any local application, whether they are acting as confused deputy or not. The real risks are twofold:

  1. Some components of the system might depend on certain actions occurring when resolution of file URIs occurs. For instance, running an executable. Sites aren't really in any position to perform those functions.
  2. The identity of files might be secret and revealing their identity might compromise system integrity.

I tend to think that smb and intent also fit that definition, on the basis that each is a system that has not considered the potential for references to become available to arbitrary websites. The former for reason 2 (ipp also, I'm going to hazard), the latter for reason 1 (sometimes; ChromeOS probably differs from Android in that it relies on sites being recipients of intents).

Perhaps then the answer is that there is, as you say, a relatively short core blocklist plus an adjunct list of things that we say "should" be blocked in most cases, so that sites cannot expect those to work. I don't think that Firefox would have any problem with blocking webkit-* or chrome, if only to ensure some amount of consistency in behaviour between platforms. On the other hand, sites building ChromeOS applications generally know that that is what they are doing and so they can proceed with UA detection or whatever to ensure that their request to handle intent is likely to work.
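
One way to picture the two-tier idea is the sketch below. Everything in it is an assumption for illustration: the tier names, the contents of `MUST_BLOCK` and `SHOULD_BLOCK`, and the `uaExemptions` parameter are not from the spec, and prefix patterns such as webkit-* are left out for brevity.

```typescript
type Tier = "must-block" | "should-block" | "allowed";

// Assumed tier contents, loosely based on the schemes discussed in this thread.
const MUST_BLOCK: readonly string[] = [
  "about", "blob", "data", "file", "http", "https", "javascript", "ws", "wss",
];
const SHOULD_BLOCK: readonly string[] = ["chrome", "intent", "ipp", "smb"];

function tierOf(scheme: string): Tier {
  const s = scheme.toLowerCase();
  if (MUST_BLOCK.indexOf(s) !== -1) return "must-block";
  if (SHOULD_BLOCK.indexOf(s) !== -1) return "should-block";
  return "allowed";
}

// A UA may exempt schemes from the "should" tier only (e.g. a ChromeOS-like
// platform exempting intent); exemptions never reach the "must" tier.
function canRegister(scheme: string, uaExemptions: readonly string[] = []): boolean {
  const tier = tierOf(scheme);
  if (tier === "must-block") return false;
  if (tier === "should-block") return uaExemptions.indexOf(scheme.toLowerCase()) !== -1;
  return true;
}
```

Sites would then treat anything in the "should" tier as unlikely to work unless they know which platform they are targeting.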

What I don't want to have is wide divergence in lists - that's not a great experience for site authors.

I definitely agree regarding data and javascript. Intercepting those would have far-reaching implications. Good catch.

@zcorpan
Member Author

zcorpan commented Jul 4, 2023

data and javascript are there (rows 23 and 24). I had forgotten to tick some checkboxes for those (now done).

We can allow UAs to add more to the blocklist, but I think some interop on the blocklist is useful. The reason is to prevent a situation where one browser allows a scheme to be registered and sites start using that scheme (e.g. content or resource), but it's blocked in another browser and used for a different purpose. That could be a tricky web compat issue.

@ben221199

In my opinion there are more possibilities:

Other questions can arise:

  • What about a blocklist? Should data: and javascript: be on it? What about http: and https:, because the browser itself handles them? What about chrome:, edge:, about:?
  • What to do with web+ schemes?

Also note:

  • Users have to allow websites to register a protocol handler, so registerProtocolHandler can fail if the user says no. Maybe we have to trust browser users a little bit more...
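
For reference, a site-side sketch of a registration request that tolerates rejection. The wrapper `requestHandler` and the `ProtocolHandlerRegistrar` interface are hypothetical; only `registerProtocolHandler(scheme, url)` itself is a real method on `navigator`. As far as I understand, the call throws when the user agent rejects the scheme or URL outright, while the user's own answer to the prompt is not reported back to the page, so a clean return only means the request was made.

```typescript
// Minimal structural type so the sketch can also run outside a browser.
interface ProtocolHandlerRegistrar {
  registerProtocolHandler(scheme: string, url: string): void;
}

interface RegistrationResult {
  requested: boolean; // true if the UA accepted the request (not the user's decision)
  reason?: string;    // error name when the UA rejected it, e.g. "SecurityError"
}

// Hypothetical wrapper: never throws, reports why the UA refused.
function requestHandler(
  nav: ProtocolHandlerRegistrar,
  scheme: string,
  url: string,
): RegistrationResult {
  try {
    nav.registerProtocolHandler(scheme, url);
    return { requested: true };
  } catch (err) {
    const reason = err instanceof Error ? err.name : String(err);
    return { requested: false, reason };
  }
}

// In a page this might be called as (example.com is a placeholder):
//   requestHandler(navigator, "web+example", "https://example.com/handle?u=%s");
```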

@ben221199

I also agree with point 2 of #9158 (comment) from @martinthomson. This can also be seen in a political way. Some people might conclude that a specific scheme is used much more on the right wing or the left wing and therefore want to disendorse it. I don't think we should want that.
