Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relax requirements around the outer SNI value #575

Merged
merged 6 commits into from Nov 6, 2023

Conversation

chris-wood
Copy link
Collaborator

@ekr
Copy link
Collaborator

ekr commented Oct 19, 2023

I don't think this quite works. As I understand it, we:

  1. Are silent on what clients have to do, but just tell them "this may not work" if you don't make the names match, but only for the retry mechanism.
  2. We tell the servers that they SHOULD check that the names match.

The result here is that a conformant client can fail to interoperate with a server that behaves as we RECOMMEND. That seems bad.

@chris-wood
Copy link
Collaborator Author

I thought the silence was to be interpreted as a SHOULD, but I would be fine to explicitly say that.

draft-ietf-tls-esni.md Outdated Show resolved Hide resolved
Co-authored-by: Christopher Patton <cpatton@cloudflare.com>
@sftcd
Copy link
Collaborator

sftcd commented Oct 19, 2023

I'm fine with the change from MUST to SHOULD for the client, and I think calling out the SHOULD for clients as in the most recent commit is better than omitting 2119 terms for that.

The addition of a SHOULD for servers is wrong. (I would prefer a SHOULD NOT for servers but am ok with less.) The ECH protocol works fine without that, (unlike the reasonable SHOULD for clients related to fallback), so I'd at most be ok with a MAY for servers. Noting that servers that do not enforce that that check (the behaviour or all servers I've tested so far) has consequences is fine, but there's no need to add a SHOULD recommending making that check.

@chris-wood
Copy link
Collaborator Author

The addition of a SHOULD for servers is wrong. (I would prefer a SHOULD NOT for servers but am ok with less.) The ECH protocol works fine without that, (unlike the reasonable SHOULD for clients related to fallback), so I'd at most be ok with a MAY for servers. Noting that servers that do not enforce that that check (the behaviour or all servers I've tested so far) has consequences is fine, but there's no need to add a SHOULD recommending making that check.

It's not wrong -- it just doesn't match what you'd like. As has been clearly stated, failure to implement this check admits domain fronting. The normative language is here to discourage that behavior for servers.

@sftcd
Copy link
Collaborator

sftcd commented Oct 19, 2023

What's "wrong" is certainly debatable. However, while there may be good or bad reasons to (dis)like domain fronting, making this check a SHOULD is not necessary to make the ECH protocol to work. That's (I claim, not that strongly) demonstrated as current servers do not make that check.

@chris-wood
Copy link
Collaborator Author

It's not just about making things work. It's incumbent on us to write the spec in a way that discourages undesirable outcomes.

@sftcd
Copy link
Collaborator

sftcd commented Oct 20, 2023

OK, so we seem to agree that server-side check is not necessary to make the protocol work.

It's not clear to me we that we on here (or the TLS WG) have rough consensus as to the desirability of encouraging implementations to make domain fronting less or more possible. Deployments can of course validly make such choices, but that's different from a SHOULD that recommends that implementations make that choice for those deploying the protocol.

So overall it seems to me the spec here is better saying that servers MAY make such checks, calling out the consequences, but no more.

@chris-wood
Copy link
Collaborator Author

Implementations can of course make a decision as they see fit, which is precisely why this is phrased in the way it is. That said, I don't have any interest in debating what is the right verb to use here.

@sftcd
Copy link
Collaborator

sftcd commented Oct 20, 2023

I do have such a verbal interest:-) That;s a 2119 verb, so a SHOULD implies (to me) that servers really ought have code to make that check and send that error, which I think is a less good outcome.

@sftcd
Copy link
Collaborator

sftcd commented Oct 20, 2023

So here's a different argument against such a SHOULD for the server-side check. Not hugely strong, but worth thinking about.

Say a server supports thousands of public_names but very few ECH private keys. That will create config_id collisions and likely lead to erroneous protocol failures if the server-side SHOULD check we're discussing is enforced, especially if/as public_name values change and servers and clients are or are not up to date with the latest settings.

I'm not claiming that's a winning argument, but it's worth pondering as the (un)desirability of domain fronting may not be the only thing in play here.

@ekr
Copy link
Collaborator

ekr commented Oct 20, 2023

Some further thoughts.

  1. I don't find the domain fronting argument for requiring this check on the server to be very persuasive. Why is ECH the right place for the TLS WG to take a position on this, even assuming we have consensus to do so?

  2. ISTM that the stronger rationale for this check is that (1) clients which don't out the correct value in ClientHelloOuter will not interoperate properly in case of the need for fallback (2) fallback will be uncommon and so this is unlikely to be detected in normal operation (3) the check ensures that clients do the right thing (this is the argument of RFC 9413). From this perspective, @sftcd's observation that nothing untoward has happened from existing server's failure to check doesn't tell us much, because it's in the nature of this kind of issue that nonconformance is mostly not a problem until it is, at which point it is hard to repair.

  3. I still don't think that the SHOULD/SHOULD combination works because, as noted before, RFC 2119 allows for implementations to violate SHOULDs and we don't really provide reasons why this might be OK in this case. I think the following combinations are reasonable (I.e., they provide interoperability by conformant implementations):

  • MUST send and any behavior by the server
  • MUST send unless you have some out of band reason to know that the server doesn't check and that it won't do fallback, combined with any behavior on the server.
  • Any behavior by the client combined with MUST ignore on the server

With that said, for the RFC 9413 reasons above, I believe we should require the client to send and encourage the server to check. I could be persuaded to have an exception for clients who have some out of band reason to believe that the server doesn't want it, as in my second option above.

@chris-wood
Copy link
Collaborator Author

With that said, for the RFC 9413 reasons above, I believe we should require the client to send and encourage the server to check. I could be persuaded to have an exception for clients who have some out of band reason to believe that the server doesn't want it, as in my second option above.

The intent of the PR is exactly this. It requires clients to do this, with exceptions (hence the SHOULD), and encourages servers to check (hence the SHOULD). It's possible there are better ways to phrase it, but that's the goal here.

@ekr
Copy link
Collaborator

ekr commented Oct 20, 2023

Thanks. In that case, I don't think that the "if you don't do this these are the consequences" works. Rather it actually needs to be a MUST and lay out the specific conditions under which clients can violate that MUST. E.g.,

"It MUST place the value of ECHConfig.contents.public_name in the
"server_name" extension unless it has a specific reason to know that
the server in question does not implement the retry
mechanism described in {{rejected-ech}} and does not enforce the
above requirement {{client-facing-server}}."

@sftcd
Copy link
Collaborator

sftcd commented Oct 20, 2023

"It MUST place the value of ECHConfig.contents.public_name in the
"server_name" extension unless it has a specific reason to know that
the server in question does not implement the retry
mechanism described in {{rejected-ech}} and does not enforce the
above requirement {{client-facing-server}}."

I don't think the above quite makes sense, because the client (as a
TLS library) won't know things, in particular cannot know what the
server may or may not check and, once a correct ECHConfig is used,
and ECH decryption has succeeded, the retry scheme doesn't come
into play so there's no benefit to the server in that case in making
the check.

I'm fine with a SHOULD (or "MUST ... unless") for clients, but the
"unless" clause above seems wrong. How would this be?

'Clients SHOULD place the the value of ECHConfig.contents.public_name
in the (outer) "server_name" extension unless specifically directed otherwise
by application layer code. Clients that do not include that value will not
benefit from the retry mechanism described in {{rejected-ech}}, therefore
applications need to exercise caution if sending any other value, and e.g.
only use this approach with servers for which the application client can be
confident a correct ECHConfig value has been used and where the server
does not make a check comparing the outer "server_name" to an "
ECHConfig.contents.public_name value.'

@ekr
Copy link
Collaborator

ekr commented Oct 20, 2023

I don't think the above quite makes sense, because the client (as a
TLS library) won't know things, in particular cannot know what the
server may or may not check and, once a correct ECHConfig is used,
and ECH decryption has succeeded, the retry scheme doesn't come
into play so there's no benefit to the server in that case in making
the check.

No, I don't think this is correct. The purpose of this mandate is to restrain the behavior of TLS clients on the wire, not the behavior of individual TLS libraries.

@sftcd
Copy link
Collaborator

sftcd commented Oct 20, 2023

My argument is not only only about libraries. The part of my argument about libraries referred only to the specific wording you proposed. (Maybe github issues aren't the best way to argue such things though, the quoting is hard to do and interpret IMO.)

@ekr
Copy link
Collaborator

ekr commented Oct 20, 2023

Well, your proposed wording is about the interaction of the application and the library, and I think that this text should refer solely to the behavior of the client as a unit. And the only normative text is about the library's behavior and my point is that the text should refer to the unit.

@sftcd
Copy link
Collaborator

sftcd commented Oct 20, 2023

So you'd prefer this (if forced to choose)?

'Clients SHOULD place the the value of ECHConfig.contents.public_name
in the (outer) "server_name" extension unless specifically directed otherwise.
Clients that do not include that value will not benefit from the retry mechanism
described in {{rejected-ech}}, therefore clients need to exercise caution if
sending any other value, and e.g. only use this approach with servers for
which the client can be confident a correct ECHConfig value has been used
and where the server does not make a check comparing the outer
"server_name" to an ECHConfig.contents.public_name value.'

I could live with something like that but think it's less useful than the first text I suggested myself.

@ekr
Copy link
Collaborator

ekr commented Oct 20, 2023

So you'd prefer this (if forced to choose)?

'Clients SHOULD place the the value of ECHConfig.contents.public_name in the (outer) "server_name" extension unless specifically directed otherwise. Clients that do not include that value will not benefit from the retry mechanism described in {{rejected-ech}}, therefore clients need to exercise caution if sending any other value, and e.g. only use this approach with servers for which the client can be confident a correct ECHConfig value has been used and where the server does not make a check comparing the outer "server_name" to an ECHConfig.contents.public_name value.'

I could live with something like that but think it's less useful than the first text I suggested myself.

As I already provided text, I'm not really sure why I would be forced to choose between this and your prior suggestion.

@sftcd
Copy link
Collaborator

sftcd commented Oct 20, 2023 via email

@martinthomson
Copy link
Contributor

I'm OK with the stricter language @ekr suggests. A stack can conform with this requirement if it has an option to disable the outer SNI. A stack can conform with this requirement without implementing the outer SNI if it is clearly stated that it can only be used in cases where servers don't need the outer SNI (for the listed reasons).

@sftcd
Copy link
Collaborator

sftcd commented Oct 23, 2023 via email

@cjpatton
Copy link
Contributor

cjpatton commented Oct 23, 2023

If I understand @sftcd's point correctly (please let me know if I'm missing anything), it's that "SHOULD" is better because it permits clients to do something different "as long as they know what they're doing". But I'm worried that a careless reader might ignore the SHOULD and fill in the unspecified behavior in arbitrary ways. This could lead to bugs and/or increase the attack surface for applications built on top of TLS in ways we can't anticipate right now.

I'd also prefer @ekr's stricter language, as a future-proofing measure. Simpler semantics is always better. That said, I actually think this PR is a strict improvement over the current text, since it makes the consequences of ignoring the MUST (or SHOULD) clear.

@dennisjackson
Copy link
Contributor

dennisjackson commented Oct 26, 2023

The current PR for the client reads:

[The Client] SHOULD place the value of ECHConfig.contents.public_name in the
"server_name" extension. Clients that do not follow this step, or place a
different value in the "server_name" extension, risk breaking the retry
mechanism described in {{rejected-ech}} or failing to interoperate with
servers that require this step to be done; see {{client-facing-server}}.

I think this is fine. It's very clear as to what the expected behaviour is and what the consequences might be. For the server:

Once the server has chosen the correct ECHConfig, it SHOULD verify that the
value in the ClientHelloOuter "server_name" extension matches the value of
ECHConfig.contents.public_name, and abort with an "illegal_parameter" alert
if these do not match. Failure to enforce this check can allow clients to lie
about the ClientHelloOuter SNI, possibly with the intent of bypassing network
policies that might otherwise block ECH.

I don't think this text is great for two reasons:

  1. It's not clear how to choose the correct ECHConfig, especially if multiple configs use the same keypair and distinct public names (as @sftcd pointed out).
  2. I don't think there's going to be any kind of consensus for recommending this check (with a SHOULD) or the rationale for why. Implementers are going to be split between those doing the check and those refusing to.

I see we've already had plenty of suggested changes with little support, so one more can't hurt:

Once the server has chosen the correct ECHConfig, it MAY verify that the value in the ClientHelloOuter "server_name" extension matches the value of ECHConfig.contents.public_name, and abort with an "illegal_parameter" alert if these do not match. This optional check allows the server to limit ECH connections to only use the public SNI values advertised in its ECHConfigs. The server must be careful not to unnecessarily reject connections if the same ECHConfig id or keypair is used in multiple ECHConfigs with distinct public names.

The intent here is to permit the check without taking a stand on whether its desirable or necessary.

@sftcd
Copy link
Collaborator

sftcd commented Oct 26, 2023 via email

draft-ietf-tls-esni.md Outdated Show resolved Hide resolved
@chris-wood
Copy link
Collaborator Author

Took @dennisjackson's suggestion. I'm fine with this as well.

Copy link
Contributor

@cjpatton cjpatton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this. I think it's strictly better than the status quo regardless of the verb we choose. Clarifying what can go wrong if the verb is ignored is the most important thing.

@roelfdutoit
Copy link

I believe this PR is incomplete without adding a corresponding warning to the "8.2 Middleboxes" section. The logic in that section is: Proxy does not understand ECH -> Proxy must act as conforming TLS server -> Proxy forges the disable signal by being authoritative for public name. If outer SNI != public_name then Proxy does not know public name (because it does not understand ECH) and can only be "authoritative" for the outer SNI.
At a mimimum the following phrase in that section would have to change from the proxy will act as if connecting to the public name to the proxy will act as if connecting to the ClientHelloOuter server_name, which SHOULD match the public name

@taddhar
Copy link
Contributor

taddhar commented Nov 1, 2023

Funny the timing coincidence @roelfdutoit I was about to come back on this section as well for a number of other reasons that for non educated people and non english native people this part is very very hard to parse. Am trying to edit it in my corner since yesterday to find the right words but in short a few paused definitions would help naive readers.

@chris-wood
Copy link
Collaborator Author

Good catch @roelfdutoit! I fixed that text.

@ckcr4lyf
Copy link

ckcr4lyf commented Nov 6, 2023

Is there any strong reason for a client to be SHOULD instead of MAY?

public_name may help the ECH server to determine which config to use, however, it may be that the server has only one config, and doesn't need additional public_name logic to determine which config to use.

I am coming from the angle of censorship-resistance. When SNI was introduced initially, many websites would work without it just fine, however browsers started sending it for any and every website, which lead to leaks where it wasn't even necessary.

I've talked about this on the mailing list: https://mailarchive.ietf.org/arch/msg/tls/HUG1CU0Q4PorZ7fD0yafVfj7VUY/ , and even in the original Github Issue for this PR: #572 (comment)

Since I didn't receive replies there and it seems this PR is active, I'd like to raise this point again. If this is not the appropriate place, please let me know where the WG group prefers to have such discussions.

@ekr
Copy link
Collaborator

ekr commented Nov 6, 2023

The reason to have it be SHOULD is that, as listed, failure to do so causes compatibility issues.

I am coming from the angle of censorship-resistance. When SNI was introduced initially, many websites would work without it just fine, however browsers started sending it for any and every website, which lead to leaks where it wasn't even necessary.

I'm not following this argument. SNI is necessary except in cases where the required certificate is completely determined by the IP address. Similarly, when ECH is used, the same public SNI will be used but it's just enough to determine the required certificate, which doesn't tell you anything that you didn't know already from the IP address and the DNS records.

@ckcr4lyf
Copy link

ckcr4lyf commented Nov 6, 2023

Similarly, when ECH is used, the same public SNI will be used but it's just enough to determine the required certificate, which doesn't tell you anything that you didn't know already from the IP address and the DNS records.

Assuming you used DoH, then someone in the middle (e.g. an ISP) cannot know which domain you're connecting to. However passing the "public name" SNI in plaintext would reveal this.

In the initial ESNI draft / implementation (prior to ECH), this was not a problem, since ESNI replaces SNI completely (if ESNI was used).

@chris-wood
Copy link
Collaborator Author

Similarly, when ECH is used, the same public SNI will be used but it's just enough to determine the required certificate, which doesn't tell you anything that you didn't know already from the IP address and the DNS records.

Assuming you used DoH, then someone in the middle (e.g. an ISP) cannot know which domain you're connecting to. However passing the "public name" SNI in plaintext would reveal this.

In the initial ESNI draft / implementation (prior to ECH), this was not a problem, since ESNI replaces SNI completely (if ESNI was used).

@ekr's point is that the IP address reveals effectively the same information as the public name.

@ckcr4lyf
Copy link

ckcr4lyf commented Nov 6, 2023

An IP address however can be rotated much more easily (in the context of end users of a service), e.g. for a VPS using IPs owned(?) by a cloud provider, where the actual cloud customer behind it could change from one day to the next. Whereas if I am operating a service, the end-users would expect the domain to be static.

In such scenarios, the ECH public_name would still leak the domain to those in the middle

@ckcr4lyf
Copy link

ckcr4lyf commented Nov 6, 2023

Cloudflare has a blog post talking about something tangential: Identity is attached to names, never addresses

@chris-wood chris-wood merged commit ad58289 into master Nov 6, 2023
2 checks passed
@chris-wood
Copy link
Collaborator Author

Merging after discussion in ECH.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Are we quite right now with "MUST use public_name"?
9 participants