Fix #4562: add support for internationalized email addresses #5799
…Infra Fix #5067 Adds the (new) definition of string equality using terms 'is' or 'identical to' from Infra, replacing the term 'case-sensitive'. The phrase "case sensitive" is retained in the text where it provides clarity (particularly in sections with names like "case-sensitivity of XXX"). I may need to go back and define an inverse term: there are a number of places where "case-sensitive" was used with "unordered set of unique space-separated tokens". A number of these remain in the commit.
- Restore the case-sensitive head
- Add a note about string comparison/case-sensitivity
- Add a reference to charmod-norm
- Fix references to Infra string-is
…t type=email mirroring what browsers currently do. Note that this change removes the reference regular expression for valid email address, since it would be difficult to write such a regex and have it approximate correctness.
Crumpets. I forgot to rebase first.
- Removed trailing whitespaces
- Removed class=note from a paragraph because it contains normative text that ought to be normative.
@domenic is it really
I wasn't under the impression that this was what browsers do. For example, it doesn't seem to match https://source.chromium.org/chromium/chromium/src/+/master:third_party/blink/renderer/core/html/forms/email_input_type.cc;l=47-53;drc=1dea8d7ba92cfec5e553d3a4699115b1dc8dd707 . Web platform tests would certainly help, in any case.
As I documented in the issue, browsers currently have somewhat different behavior, but none seem to allow non-ASCII before
@domenic, @annevk You're right. I was having a brain cramp: I spent time testing the right-hand side and neglected the left-hand side (which we are changing here). Would it make sense to break this into two PRs? One to fix the current misalignment of the spec with support for IDNs and one to fix 4562?
(Clicked the close button by mistake.) I think that would be good, although I'd suggest writing web platform tests before writing any spec text.
"+" / "-" / "/" / "=" / "?" / "^" /
"_" / "`" / "{" / "|" / "}" / "~" /
%80-D7FF / %E000-10FFFF ; or any non-ASCII Unicode
domain = < a "valid host string", see URL section 3.4 ></code></pre>
The valid host string rule is not compatible with RFC 5321, section 4.1.2. In SMTP, IPv4 addresses must be wrapped in square brackets, e.g. mailbox@[10.0.0.1].
I just verified that Postfix enforces this rule.
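To make the mismatch concrete, here is a minimal Python sketch. The function name `classify_domain_part` and its return labels are hypothetical and come from neither the spec nor this PR. It only separates a bare IPv4 address (the form SMTP rejects) from a bracketed IPv4 address literal; it ignores IPv6 literals and does no domain validation.

```python
import ipaddress


def classify_domain_part(domain: str) -> str:
    """Label the text after '@' as 'address-literal', 'bare-ipv4', or 'other'.

    Hypothetical helper for illustration only.
    """
    # Bracketed form, e.g. "[10.0.0.1]": the address-literal shape SMTP expects.
    if domain.startswith("[") and domain.endswith("]"):
        try:
            ipaddress.IPv4Address(domain[1:-1])
        except ValueError:
            return "other"
        return "address-literal"

    # Unbracketed dotted-decimal IPv4, e.g. "10.0.0.1".
    try:
        ipaddress.IPv4Address(domain)
    except ValueError:
        return "other"
    return "bare-ipv4"
```

Under these assumptions, `mailbox@10.0.0.1` has a `bare-ipv4` domain part, while `mailbox@[10.0.0.1]` has an `address-literal` one.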
The currently published spec forbids the use of IP addresses in the domain part anyway. We could just keep recommending that. If we do, then this is sufficient, I think:
domain = < a "valid host string", see URL section 3.4 ></code></pre>
domain = < a "valid domain string", see URL section 3.4 ></code></pre>
https://url.spec.whatwg.org/#valid-domain (which is correct for IDNA, but also see whatwg/url#245).
Currently, the spec text evidently isn't clear enough for the domain part, since Blink, Gecko, and WebKit all do different things. My reading of the current spec text is that it best supports what Blink does. However, https://www.ietf.org/archive/id/draft-ietf-emailcore-as-09.html#name-use-and-validation-in-html- ends up attributing what appears to be just the WebKit interpretation as a general HTML problem. The current PR does not make things as clear as I'd like from an implementor's perspective. I think it would be good, as a basis for discussion, if the upcoming PR update took a position on each of the questions I listed on the issue. (As noted there, I think specifying Blink's general approach of replacing the domain with the ToASCII form when the ToASCII operation does not raise errors is the least risky way forward. I do think tweaks are needed to the Blink specifics. In particular, I believe UTS 46 processing needs to be run in the non-transitional mode, as Blink also does for URLs.)
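As a rough illustration of the "replace the domain with its ToASCII form when ToASCII does not raise errors" approach, here is a minimal Python sketch. It is not Blink's code: `to_ascii_email` is a hypothetical helper, and Python's built-in `idna` codec implements IDNA 2003 (RFC 3490), not UTS 46 non-transitional processing, so it only stands in for the real ToASCII operation in simple cases.

```python
def to_ascii_email(address: str) -> str:
    """Return the address with its domain replaced by the ToASCII form.

    Hypothetical helper. The local part is left untouched, and the
    address is returned unchanged if there is no '@' or if the
    conversion raises an error.
    """
    # Split on the last '@' so the domain is everything after it.
    local, at, domain = address.rpartition("@")
    if not at:
        return address

    try:
        # Stand-in for ToASCII: Python's IDNA 2003 codec (an assumption,
        # not the UTS 46 non-transitional processing discussed above).
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        return address

    return f"{local}@{ascii_domain}"
```

One visible difference from the non-transitional mode: IDNA 2003 maps "ß" to "ss", whereas UTS 46 non-transitional processing preserves it, so this stand-in would give a different result for such domains.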
Closing in favor of #10522.
Changes the ABNF and surrounding text to match what current browser implementations do, which includes support for non-ASCII characters on both the left and right side of the email address. Discussion in #4562 includes the genesis of these changes.
(See WHATWG Working Mode: Changes for more details.)
💥 Error: Wattsi server error 💥
PR Preview failed to build. (Last tried on Jan 15, 2021, 8:00 AM UTC).