Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[6265bis] Require ASCII for domain attributes (fixes #1707) #1969

Merged
merged 4 commits into from
Feb 22, 2022

Conversation

johannhof
Copy link
Collaborator

See #1707 for discussion and rationale.

I wasn't entirely sure where to put this new restriction, but it does seem like the UA should parse the domain attribute list and only reject the cookie if the final domain attribute contains invalid characters (as opposed to, say, reject if any domain attribute contains invalid characters). So I went with that.

@johannhof
Copy link
Collaborator Author

@sbingler @martinthomson @annevk ^ happy to get feedback on this :)

Copy link

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Comment on lines 533 to 535
as enhanced by {{RFC1123}}, Section 2.1. The domain-value MUST be a string of
{{USASCII}} characters, such as one obtained by applying the "ToASCII" operation
defined in {{Section 4 of RFC3490}}.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is redundant with how subdomain is defined in the referenced RFCs so I'd not include this. (If you really want to include it I'd make it a statement of fact. ("Thus, the domain-value is a string of {{USASCII}} characters, such as one ...")

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a good suggestion! I might be reading the referenced RFCs wrong but for me they operate more under an assumption of ASCII than strictly requiring it? In any case, I do think it's worthwhile calling this out in the server section so I'll go with what you suggested.

@@ -1470,7 +1472,10 @@ user agent MUST process the cookie as follows:

1. Let the domain-attribute be the empty string.

8. If the user agent is configured to reject "public suffixes" and the
8. If the domain-attribute contains a character that is not in the range of {{USASCII}}
characters (%x00 - %x7F), abort these steps and ignore the cookie entirely.
Copy link
Collaborator

@sbingler sbingler Feb 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels odd to include characters we specifically avoid elsewhere as valid (such as %x00). I guess the earlier parts of the spec handle those cases where necessary so this should be ok, it just strikes me as odd.

Just noting this for posterity. No change required.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we can use a narrower range for acceptance here that will ensure that rejected characters (%x00) don't get included. A domain name can be just limited to LDH (letter, digit, hyphen) + '.'.

It seems we're very deliberately avoiding mention of IP addresses, so we can probably not specify tighter validation rules (along the lines of what RFC 1034 specifies), but somewhat tighter is probably wise.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's a bit weird to explicitly list those. I'm not sure to what extent it's better to just remove the explicit mention of the character range vs. having a narrower definition. The UA will perform an equality check to a canonicalized URL host right after this step, so there's not a lot of worth to UAs in checking for LDH explicitly. Browsers have very optimized isASCII functions that would have to be sidestepped for this.

(Even this step is mostly added to make it clear to UAs that they should not perform canonicalization on the domain attribute, strictly speaking it wouldn't be necessary since the spec doesn't mandate canonical/A-label domain attributes)

If we do go for referencing LDH it might be useful to declare it in the terminology, I filed #1979 in any case.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this simpler check for that reason too. We don't actually want to reject .. or some such at this point.

(IP addresses we should tackle as part of #1939.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with that, but this is a good point to point out that the domain canonicalization will ensure that invalid domain names will be rejected. The oddness (and inconsistency) is pretty apparent and you've had two people tripped up by it already.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of canonicalization are you thinking of? The lowercasing and leading dot removal? The rejection happens due to Domain Matching failing, but that only does relatively straightforward comparison operations.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a note to help clear the confusion, lmk what you think :)

Copy link
Collaborator

@sbingler sbingler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, aside from Anne's suggestion.

Thanks!

Copy link
Contributor

@martinthomson martinthomson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that removing %x00 from the allowed set of characters is worthwhile, if only because it means that the logic and the explanatory text (with Anne's tweak) line up better.

@@ -1470,7 +1472,10 @@ user agent MUST process the cookie as follows:

1. Let the domain-attribute be the empty string.

8. If the user agent is configured to reject "public suffixes" and the
8. If the domain-attribute contains a character that is not in the range of {{USASCII}}
characters (%x00 - %x7F), abort these steps and ignore the cookie entirely.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we can use a narrower range for acceptance here that will ensure that rejected characters (%x00) don't get included. A domain name can be just limited to LDH (letter, digit, hyphen) + '.'.

It seems we're very deliberately avoiding mention of IP addresses, so we can probably not specify tighter validation rules (along the lines of what RFC 1034 specifies), but somewhat tighter is probably wise.

@johannhof
Copy link
Collaborator Author

Thanks all for taking a look! I made the minimal necessary changes first but would still want to wait for @martinthomson to respond to my comment on his LDH suggestion. I don't have strong feelings on that either way.

NOTE: The user agent MAY also reject any cookies that contain characters not
in the LDH (letter, digit, hyphen) range of %x41-5A / %x61-7A / %x30-39 / %x2D,
which is the character range of the canonicalized request-host the
domain-attribute is compared against.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this note works until we have a more conclusive answer on what user agents do with IPv6 addresses.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, let's punt this on the IP address discussion then.

@johannhof
Copy link
Collaborator Author

@chlily1 @johnwilander @mikewest I think we've reached some level of agreement on this PR, would any of you mind giving this a quick look and merging if you don't have any concerns?

@chlily1
Copy link
Contributor

chlily1 commented Feb 22, 2022

Thanks, LGTM.

@chlily1 chlily1 merged commit d30c6c7 into httpwg:main Feb 22, 2022
aarongable pushed a commit to chromium/chromium that referenced this pull request Jun 3, 2022
With this change, cookies with a "Domain" attribute that contains a
non-ASCII character will be rejected (e.g. Domain=éxample.com).

This makes Chromium adhere to the recently agreed-upon change to
RFC 6265 (httpwg/http-extensions#1969) and
reflects what Firefox has been shipping already.

The web platform tests for this change were landed in
web-platform-tests/wpt#32620 and are now
passing.

Bug: 1296537
Change-Id: I8fc29fcc57d7a8218e7fb257d56272adc86ffe46
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3583727
Reviewed-by: Steven Bingler <bingler@chromium.org>
Commit-Queue: Johann Hofmann <johannhof@chromium.org>
Reviewed-by: Matt Menke <mmenke@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1010491}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants