[6265bis] Require ASCII for domain attributes (fixes #1707) #1969

johannhof · 2022-02-16T12:07:25Z

See #1707 for discussion and rationale.

I wasn't entirely sure where to put this new restriction, but it does seem like the UA should parse the domain attribute list and only reject the cookie if the final domain attribute contains invalid characters (as opposed to, say, reject if any domain attribute contains invalid characters). So I went with that.
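As a rough illustration of the behaviour described above (not the spec's algorithm text), the sketch below assumes the Set-Cookie attributes have already been split into name/value pairs. The helper names `is_ascii`, `final_domain_attribute`, and `accept_cookie` are hypothetical and only serve to show that the last Domain attribute decides whether the cookie is rejected.

```python
from typing import Iterable, Tuple


def is_ascii(value: str) -> bool:
    # True when every character is in the US-ASCII range %x00-7F.
    return all(ord(ch) <= 0x7F for ch in value)


def final_domain_attribute(attributes: Iterable[Tuple[str, str]]) -> str:
    # Later Domain attributes overwrite earlier ones, so only the last counts.
    domain = ""
    for name, value in attributes:
        if name.lower() == "domain":
            domain = value
    return domain


def accept_cookie(attributes: Iterable[Tuple[str, str]]) -> bool:
    # Reject only if the final domain attribute contains a non-ASCII character;
    # an earlier non-ASCII Domain attribute that was overwritten is harmless.
    return is_ascii(final_domain_attribute(attributes))
```

Under these assumptions, a cookie with `Domain=éxample.com; Domain=example.com` is kept, while the reverse order causes it to be ignored.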

johannhof · 2022-02-16T12:09:57Z

@sbingler @martinthomson @annevk ^ happy to get feedback on this :)

annevk

Looks good!

annevk · 2022-02-16T12:50:04Z

draft-ietf-httpbis-rfc6265bis.md

+as enhanced by {{RFC1123}}, Section 2.1. The domain-value MUST be a string of
+{{USASCII}} characters, such as one obtained by applying the "ToASCII" operation
+defined in {{Section 4 of RFC3490}}.


I think this is redundant with how subdomain is defined in the referenced RFCs, so I'd not include this. (If you really want to include it, I'd make it a statement of fact: "Thus, the domain-value is a string of {{USASCII}} characters, such as one ...")

I think that's a good suggestion! I might be reading the referenced RFCs wrong but for me they operate more under an assumption of ASCII than strictly requiring it? In any case, I do think it's worthwhile calling this out in the server section so I'll go with what you suggested.
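For servers, a minimal sketch of producing an ASCII domain-value might look like the following. It assumes Python's built-in `idna` codec, which implements the RFC 3490 ToASCII operation label by label; the function name `to_ascii_domain` is hypothetical.

```python
def to_ascii_domain(domain: str) -> str:
    # str.encode("idna") applies ToASCII (RFC 3490) to each dot-separated label
    # and raises UnicodeError if a label cannot be converted.
    return domain.encode("idna").decode("ascii")


if __name__ == "__main__":
    # A domain that is already ASCII passes through unchanged; an
    # internationalized label comes back as an "xn--" A-label.
    print(to_ascii_domain("example.com"))
    print(to_ascii_domain("éxample.com"))
```

The server would then emit the converted value in the Domain attribute instead of the Unicode form.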

sbingler · 2022-02-16T21:37:23Z

draft-ietf-httpbis-rfc6265bis.md

@@ -1470,7 +1472,10 @@ user agent MUST process the cookie as follows:

    1.  Let the domain-attribute be the empty string.

-8.  If the user agent is configured to reject "public suffixes" and the
+8.  If the domain-attribute contains a character that is not in the range of {{USASCII}}
+    characters (%x00 - %x7F), abort these steps and ignore the cookie entirely.


It feels odd to include characters we specifically avoid elsewhere as valid (such as %x00). I guess the earlier parts of the spec handle those cases where necessary so this should be ok, it just strikes me as odd.

Just noting this for posterity. No change required.

I think that we can use a narrower range for acceptance here that will ensure that rejected characters (%x00) don't get included. A domain name can be just limited to LDH (letter, digit, hyphen) + '.'.

It seems we're very deliberately avoiding mention of IP addresses, so we can probably not specify tighter validation rules (along the lines of what RFC 1034 specifies), but somewhat tighter is probably wise.

I agree it's a bit weird to explicitly list those. I'm not sure to what extent it's better to just remove the explicit mention of the character range vs. having a narrower definition. The UA will perform an equality check to a canonicalized URL host right after this step, so there's not a lot of worth to UAs in checking for LDH explicitly. Browsers have very optimized isASCII functions that would have to be sidestepped for this.

(Even this step is mostly added to make it clear to UAs that they should not perform canonicalization on the domain attribute, strictly speaking it wouldn't be necessary since the spec doesn't mandate canonical/A-label domain attributes)

If we do go for referencing LDH it might be useful to declare it in the terminology, I filed #1979 in any case.
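To make the two options in this thread concrete, here is a small sketch comparing the plain ASCII check with the narrower "LDH plus dot" check suggested above. Both function names are hypothetical, and neither is taken from any browser's implementation.

```python
import string

# Letters, digits, hyphen, plus the "." label separator.
LDH_AND_DOT = frozenset(string.ascii_letters + string.digits + "-.")


def is_ascii_only(domain_attribute: str) -> bool:
    # The check in the proposed step 8: every character within %x00-7F.
    return all(ord(ch) <= 0x7F for ch in domain_attribute)


def is_ldh_only(domain_attribute: str) -> bool:
    # The narrower alternative: only letters, digits, hyphen, and ".".
    return all(ch in LDH_AND_DOT for ch in domain_attribute)
```

With these definitions, `"exa\x00mple.com"` passes the ASCII check but fails the LDH check, `".."` passes both, and a bracketed IPv6 literal such as `"[::1]"` passes the ASCII check while failing the LDH check.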

I like this simpler check for that reason too. We don't actually want to reject .. or some such at this point.

(IP addresses we should tackle as part of #1939.)

I'm OK with that, but this is a good place to point out that the domain canonicalization will ensure that invalid domain names are rejected. The oddness (and inconsistency) is pretty apparent, and you've had two people tripped up by it already.

What kind of canonicalization are you thinking of? The lowercasing and leading dot removal? The rejection happens due to Domain Matching failing, but that only does relatively straightforward comparison operations.
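The comparison being discussed can be sketched roughly as below. This is a simplified reading of the leading dot removal, lowercasing, and Domain Matching steps, not a complete implementation; `normalize_domain_attribute`, `is_ip_address`, and `domain_matches` are hypothetical names, and the request host is assumed to be canonicalized already.

```python
import ipaddress


def normalize_domain_attribute(value: str) -> str:
    # Drop a single leading "." and lowercase; no IDNA conversion is applied.
    if value.startswith("."):
        value = value[1:]
    return value.lower()


def is_ip_address(host: str) -> bool:
    # Treat IPv4 literals and (optionally bracketed) IPv6 literals as IPs.
    try:
        ipaddress.ip_address(host.strip("[]"))
        return True
    except ValueError:
        return False


def domain_matches(request_host: str, domain: str) -> bool:
    # Identical strings match; otherwise the domain must be a suffix of the
    # host on a "." boundary, and the host must not be an IP address.
    if request_host == domain:
        return True
    return (
        domain != ""
        and request_host.endswith("." + domain)
        and not is_ip_address(request_host)
    )
```

Because the comparison is purely string based, a non-ASCII attribute such as `éxample.com` does not match a request host in A-label form, which is why the mismatch surfaces as a Domain Matching failure rather than as a canonicalization error.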

Added a note to help clear the confusion, lmk what you think :)

sbingler

Looks good, aside from Anne's suggestion.

Thanks!

martinthomson

I think that removing %x00 from the allowed set of characters is worthwhile, if only because it means that the logic and the explanatory text (with Anne's tweak) line up better.

martinthomson · 2022-02-17T00:09:37Z

draft-ietf-httpbis-rfc6265bis.md

@@ -1470,7 +1472,10 @@ user agent MUST process the cookie as follows:

    1.  Let the domain-attribute be the empty string.

-8.  If the user agent is configured to reject "public suffixes" and the
+8.  If the domain-attribute contains a character that is not in the range of {{USASCII}}
+    characters (%x00 - %x7F), abort these steps and ignore the cookie entirely.


I think that we can use a narrower range for acceptance here that will ensure that rejected characters (%x00) don't get included. A domain name can be just limited to LDH (letter, digit, hyphen) + '.'.

It seems we're very deliberately avoiding mention of IP addresses, so we can probably not specify tighter validation rules (along the lines of what RFC 1034 specifies), but somewhat tighter is probably wise.

johannhof · 2022-02-17T10:57:09Z

Thanks all for taking a look! I made the minimal necessary changes first but would still want to wait for @martinthomson to respond to my comment on his LDH suggestion. I don't have strong feelings on that either way.

annevk · 2022-02-18T16:36:42Z

draft-ietf-httpbis-rfc6265bis.md

+    NOTE: The user agent MAY also reject any cookies that contain characters not
+    in the LDH (letter, digit, hyphen) range of %x41-5A / %x61-7A / %x30-39 / %x2D,
+    which is the character range of the canonicalized request-host the
+    domain-attribute is compared against.


I don't think this note works until we have a more conclusive answer on what user agents do with IPv6 addresses.

Alright, let's punt this on the IP address discussion then.

This reverts commit 2e29dc3.

johannhof · 2022-02-22T11:53:22Z

@chlily1 @johnwilander @mikewest I think we've reached some level of agreement on this PR, would any of you mind giving this a quick look and merging if you don't have any concerns?

chlily1 · 2022-02-22T15:21:54Z

Thanks, LGTM.

With this change, cookies with a "Domain" attribute that contains a non-ASCII character will be rejected (e.g. Domain=éxample.com). This makes Chromium adhere to the recently agreed-upon change to RFC 6265 (httpwg/http-extensions#1969) and reflects what Firefox has been shipping already. The web platform tests for this change were landed in web-platform-tests/wpt#32620 and are now passing.

Bug: 1296537
Change-Id: I8fc29fcc57d7a8218e7fb257d56272adc86ffe46
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3583727
Reviewed-by: Steven Bingler <bingler@chromium.org>
Commit-Queue: Johann Hofmann <johannhof@chromium.org>
Reviewed-by: Matt Menke <mmenke@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1010491}

[6265bis] Require ASCII for domain attributes (fixes httpwg#1707)

ba38826

johannhof requested review from chlily1, mikewest and johnwilander as code owners February 16, 2022 12:07

annevk reviewed Feb 16, 2022

sbingler reviewed Feb 16, 2022

sbingler approved these changes Feb 16, 2022

martinthomson reviewed Feb 17, 2022

johannhof mentioned this pull request Feb 17, 2022

Editorial: 6265bis should explain ASCII and LDH in terminology #1979

Closed

Tweak based on review

f241b40

Add a note that LDH is safe to reject as well.

2e29dc3

annevk reviewed Feb 18, 2022

Revert "Add a note that LDH is safe to reject as well."

fddeae5

This reverts commit 2e29dc3.

chlily1 merged commit d30c6c7 into httpwg:main Feb 22, 2022

marcoscaceres mentioned this pull request Jun 29, 2022

Deprecating Non-ASCII characters in cookie domain attributes WebKit/standards-positions#7

Closed
