net/http: update the IDNA implementation from x/text #17268
Comments
First question: what standards do Chrome, Firefox and Safari implement? According to local domain experts (one of the authors of UTS #46), most, if not all, browsers currently use UTS #46, which was introduced by the Unicode Consortium as a transition standard between IDNA 2003 and IDNA 2008, minimizing the substantial incompatibility between them. The UTS #46 standard provides (roughly) two versions: with and without compatibility processing. Browsers may be in different modes w.r.t. removing compatibility processing in the transition to IDNA 2008, but for the browsers I tested compatibility processing seems to be the norm. Looking at the spec, it is not quite trivial to implement, but also not too hard. I'll look into supporting UTS #46. |
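To make the "two versions" concrete, here is a minimal Go sketch. It assumes the two versions correspond to what UTS #46 calls transitional and nontransitional processing, and it covers only the four deviation characters (ß, ς, ZWJ, ZWNJ), which are mapped or dropped in transitional mode and kept otherwise. The name mapDeviations is hypothetical and not part of any Go package; the real mapping table is far larger.

```go
package main

import "strings"

// mapDeviations is a hypothetical helper, not an x/text or x/net API.
// It handles only the four UTS #46 "deviation" characters and ignores
// every other mapping step (case folding, normalization, validation).
func mapDeviations(label string, transitional bool) string {
	if !transitional {
		// Nontransitional processing keeps the deviation characters.
		return label
	}
	var b strings.Builder
	for _, r := range label {
		switch r {
		case 'ß':
			b.WriteString("ss") // U+00DF maps to "ss"
		case 'ς':
			b.WriteRune('σ') // final sigma U+03C2 maps to U+03C3
		case '\u200C', '\u200D':
			// ZWNJ and ZWJ are dropped in transitional mode.
		default:
			b.WriteRune(r)
		}
	}
	return b.String()
}
```

With this sketch, mapDeviations("straße", true) yields "strasse", while mapDeviations("straße", false) leaves the label unchanged, so the two modes can resolve to different domain names.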
CL https://golang.org/cl/30392 mentions this issue. |
CL https://golang.org/cl/30550 mentions this issue. |
The two CLs mentioned here are close to passing all tests defined in http://www.unicode.org/Public/idna/9.0.0/IdnaTest.txt. That's the easy part. Now comes the hard part: defining what to do with errors. An error does not mean a domain name cannot be used. For example, a browser may decide to show a label as punycode, instead of Unicode, in case of some errors. An implementation may also add additional constraints based on confusables and cross-script spoofing, for example. The policy for Chrome is described here: https://www.chromium.org/developers/design-documents/idn-in-google-chrome. This also gives a decent overview of how other browsers behave. We have to choose which policies we want to implement here. I would not pick something that is language-dependent. Other than that, something like Firefox (and what Chrome will be) seems the nicest to me. Safari's approach looks like it's the easiest to implement. |
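The "show a label as punycode in case of some errors" idea can be sketched as a per-label fallback. This is only an illustration of the policy shape, not what any browser or the Go package does; the label type, errBidi, and displayHost are all hypothetical names, and the caller is assumed to have already computed both forms and the validation result.

```go
package main

import (
	"errors"
	"strings"
)

// errBidi is a hypothetical validation error used for illustration.
var errBidi = errors.New("hypothetical bidi rule violation")

// label is a hypothetical record holding both forms of one label
// plus the result of whatever validation the caller ran.
type label struct {
	unicode string
	ascii   string
	err     error
}

// displayHost joins labels for display, falling back to the ASCII
// (punycode) form for any label that failed validation.
func displayHost(labels []label) string {
	parts := make([]string, len(labels))
	for i, l := range labels {
		if l.err != nil {
			parts[i] = l.ascii
		} else {
			parts[i] = l.unicode
		}
	}
	return strings.Join(parts, ".")
}
```

The point of the design is that an error changes how a name is displayed rather than whether it can be resolved.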
The code is now internal to x/text. The generation code will probably stay there for now. The generated code can end up in x/net/idna. Note that this package has its own tables that do not rely on unicode/norm, width, cases, etc. UTS #46 (and IDNA2008 to some extent) define slight variations for all of these and dealing with that gets very tedious and error prone. Advantages:
- much less error prone
- avoids security issues from having mixed Unicode versions
- it is considerably faster
- it is more compact than if all the other tables are pulled in
- spec-conformant invalid UTF-8 handling, supported by the trie but not by core unicode/utf8
Updates golang/go#17268 Change-Id: I6b4cfbcfd4386c5e005cef23365e5dd327eb972c Reviewed-on: https://go-review.googlesource.com/30392 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Nigel Tao <nigeltao@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
CL https://golang.org/cl/30552 mentions this issue. |
This passes all the “positive” tests and most “negative” tests (about 2*800 remaining). This is not yet optimized for performance. The Punycode code was copied from x/net/idna. Updates golang/go#17268 Change-Id: Ia8b64483ebb6bb23a5b2b9f5ad4727b80754e43d Reviewed-on: https://go-review.googlesource.com/30550 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
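For readers unfamiliar with the Punycode step mentioned above, the following is a compact sketch of the RFC 3492 encoding procedure for a single label. It is written from the RFC, not copied from x/net/idna, and the name encodeLabel is hypothetical. It omits overflow checks and all IDNA mapping and validation.

```go
package main

import "strings"

// Punycode parameters from RFC 3492, section 5.
const (
	base        = 36
	tmin        = 1
	tmax        = 26
	skew        = 38
	damp        = 700
	initialBias = 72
	initialN    = 128
)

// adapt is the bias adaptation function from RFC 3492, section 6.1.
func adapt(delta, numPoints int, firstTime bool) int {
	if firstTime {
		delta /= damp
	} else {
		delta /= 2
	}
	delta += delta / numPoints
	k := 0
	for delta > ((base-tmin)*tmax)/2 {
		delta /= base - tmin
		k += base
	}
	return k + (base-tmin+1)*delta/(delta+skew)
}

// digit maps 0..25 to 'a'..'z' and 26..35 to '0'..'9'.
func digit(d int) byte {
	if d < 26 {
		return byte('a' + d)
	}
	return byte('0' + d - 26)
}

// encodeLabel is a hypothetical helper: it returns an all-ASCII label
// unchanged and otherwise returns "xn--" plus the Punycode encoding.
// Overflow detection is omitted for brevity.
func encodeLabel(label string) string {
	runes := []rune(label)
	var out strings.Builder
	for _, r := range runes {
		if r < 0x80 {
			out.WriteRune(r) // copy basic code points first
		}
	}
	basic := out.Len()
	if basic == len(runes) {
		return label
	}
	if basic > 0 {
		out.WriteByte('-') // delimiter between basic and encoded part
	}
	n, delta, bias, h := initialN, 0, initialBias, basic
	for h < len(runes) {
		m := int(^uint(0) >> 1) // smallest code point >= n not yet handled
		for _, r := range runes {
			if int(r) >= n && int(r) < m {
				m = int(r)
			}
		}
		delta += (m - n) * (h + 1)
		n = m
		for _, r := range runes {
			if int(r) < n {
				delta++
			}
			if int(r) != n {
				continue
			}
			// Emit delta as a generalized variable-length integer.
			q := delta
			for k := base; ; k += base {
				t := k - bias
				if t < tmin {
					t = tmin
				} else if t > tmax {
					t = tmax
				}
				if q < t {
					break
				}
				out.WriteByte(digit(t + (q-t)%(base-t)))
				q = (q - t) / (base - t)
			}
			out.WriteByte(digit(q))
			bias = adapt(delta, h+1, h == basic)
			delta = 0
			h++
		}
		delta++
		n++
	}
	return "xn--" + out.String()
}
```

For example, encodeLabel("bücher") returns "xn--bcher-kva". A production implementation would also reject inputs that overflow and would run the IDNA mapping and validation steps first.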
Only ContextJ errors are not detected yet (219 errors). Updates golang/go#17268 Change-Id: Ic5ddbc7cd6f9218b5edfafd636afdcd7fa47c26b Reviewed-on: https://go-review.googlesource.com/30552 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
CL https://golang.org/cl/31273 mentions this issue. |
CL https://golang.org/cl/31274 mentions this issue. |
All tests now pass. Also includes modifier information in the table. Updates golang/go#17268 Change-Id: I7a7a9cdad9a654e0826617925dc9cdf0537c217e Reviewed-on: https://go-review.googlesource.com/31273 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Also include a profile for displaying IDNs as recommended by UTS 46. Updates golang/go#17268 Change-Id: I33189fa8115e7891dbf21ba222ca28ce294437e2 Reviewed-on: https://go-review.googlesource.com/31274 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
CL https://golang.org/cl/32155 mentions this issue. |
Ideally the new package should live in net/idna, as managing Unicode dependencies for three repos seems more tedious and prone to security issues than doing so for two. |
Made the API configuration internal. This is a little bit safer but also gives some flexibility to add features that may require computed setup and/or tables that one should not link in by default. Examples of such possible features are confusable and spoofing detection. Updates golang/go#17268 Change-Id: Ia205d17feb385a4e9fea47287ca32786f80da0e1 Reviewed-on: https://go-review.googlesource.com/32155 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Nigel Tao <nigeltao@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@mpvl, so you want me to vendor x/text/internal/export/idna (and its dependencies, including bidi stuff?) into std now? Or are you going to do it? Feel free to reassign if you want me to. |
I'm going to punt to Go 1.9. I think the support we have now is sufficient for all but the corner cases. |
Yeah, I'll work on this. I'll do so by first updating the implementation in
x/net. But let's wait till start of 1.9, as you said.
On Thu, Dec 1, 2016 at 7:42 PM Brad Fitzpatrick wrote:
I'm going to punt to Go 1.9. I think the support we have now is sufficient
for all but the corner cases.
|
CL https://golang.org/cl/37111 mentions this issue. |
Custom logic from request.go has been removed. Created by running: “go run gen.go -core” from x/text at fc7fa097411d30e6708badff276c4c164425590c. Fixes golang/go#17268 Change-Id: Ie440d6ae30288352283d303e5126e5837f11bece Reviewed-on: https://go-review.googlesource.com/37111 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Opening this bug for @mpvl to sanity check the new net/http implementation of IDNA.
It's all been committed. See func idnaASCII in net/http/request.go.
I'm not a domain expert.
In particular, do we do what Chrome and Firefox do? A user should be able to type in any case with any widths in their URL bar (or in an HTML file's <a href="....."> attribute) and they should all canonicalize the same way.
Thanks!
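The "any case with any widths" requirement can be illustrated with a deliberately simplified Go sketch. It folds fullwidth ASCII variants (U+FF01 to U+FF5E) to ASCII, treats the ideographic full stops as dots, and lowercases the result. The name foldHost is hypothetical, and this is not the UTS #46 mapping table, which covers far more characters and also normalizes.

```go
package main

import (
	"strings"
	"unicode"
)

// foldHost is a hypothetical helper showing width and case folding only.
// It is an assumption-laden sketch, not the mapping used by net/http.
func foldHost(host string) string {
	var b strings.Builder
	for _, r := range host {
		switch {
		case r >= 0xFF01 && r <= 0xFF5E:
			// Fullwidth forms sit at a fixed offset from ASCII.
			r -= 0xFEE0
		case r == 0x3002 || r == 0xFF61:
			// Ideographic full stops act as label separators.
			r = '.'
		}
		b.WriteRune(unicode.ToLower(r))
	}
	return b.String()
}
```

Under this sketch, "Example.COM" and "ＥＸＡＭＰＬＥ。ＣＯＭ" both fold to "example.com", which is the behavior the question above asks about.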