-
Notifications
You must be signed in to change notification settings - Fork 17.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net/url: make URL parsing return an error on IPv6 literal without brackets #31024
Comments
Your ipv6-implicit-2, ipv6-implicit-3, and ipv6-implicit-4 aren't valid URLs. IPv6 literals need to be in square brackets. |
Closing for now. Let me know if I'm wrong, though. |
Thank you for your response @bradfitz ! The rfc2732 recommends the use of brackets in all cases. However, it is not mandatory and can be technically omitted if no explicit port needs to be defined. The thing is: In many cases the URL might not be built by the developer, but be received from some other source. Using url.Parse() is already an small indicator that URLs are received and parsed rather than built by the developer. One common example, would be parsing URLs returned by a website. The website is free to return any URL format (can be decided by the web application developer). Relative URLs but also absolute ones. If the website returns absolute URLs, they might not follow the recommendation of the RFC but omit the brackets. url.Parse() would not return an error but garbage difficult to catch. Sanitizing every received URL before parsing would be a fix, but unnecessasry overhead. It would be related to re-implementing the url.Parse() function. The following might be a less valid argument in the views of some, but there is already the net.SplitHostPort() function in a closeby package. It is a little more forgiving and having duplicate code might not be the best practice either. It would be also okay if parsing "http://1::2008" and calling rfc2732 : https://www.ietf.org/rfc/rfc2732.txt |
Well, that SHOULD is news to me. I thought it was a requirement. I think it might've been upgraded to a requirement in later (post 1999) RFCs. In any case,
Who decided that the valid reasons were to not use square brackets? Did they carefully weigh that libraries (such as Go) might reject them? Is there any other software today that generates them? That is, how did this bug come about? Is this real or hypothetical? |
It was hypothetical based on the RFC's definition and that IPv6 addresses go without brackets by default unless a specific endpoint (port) is to be specified. However, I now tried how browsers (Chrome, FF, IE, Edge) react on IPv6 URLs without brackets. It seems they don't accept it and understand such input as search terms. Obviously, the general understanding is that brackets are mandatory for IPv6 URLs. Hence, no working website could apply it without. Web developers could build it but it wouldn't be useful. If being forgiving is not desired in this case, I'm suggesting, that url.Parse() throws an error in order to prevent from continuing with garbage data! See garbage output in sample below.
(If the port is not empty or an integer, something went bad.) |
Okay, I've repurposed this bug to be about rejecting them instead. |
I'm pretty sure that RFC 3986 (which obsoletes RFC 2732) requires square brackets for IPv6 literals. The ABNF rules defined in RFC 3986 only allow an This silence on IPv6 addresses without square brackets is notable, because the text does acknowledge the ambiguity between
|
Well, an IPv6 address cannot be interpreted as a hostname due to the colons, in contrast to an IPv4 address. Anyways, it seems clear now that brackets are required to be compliant. Any suggestions how to make the function detect and reject mentioned invalid URLs? Validating the host part might be error prone because its content could be IPv4, IPv6 or a hostname. The error actually materializes in the port part. So I thought, a test whether the port is either empty or a valid integer would catch the issue. It would also catch potential other issues not directly related with IPv6 and brackets. But I can't help it feels too cheap. |
I would suggest that RFC 3986 provides the solution - it is the literal source of truth for what can be a valid URL. That does mean parsing it as IPv4, IPv6, IPvFuture (yes, that's a thing), or a reg-name. I note also that currently, net/url thinks that |
Change https://golang.org/cl/184117 mentions this issue: |
Change https://golang.org/cl/189258 mentions this issue: |
When Host is not valid per RFC 3986, the behavior of Hostname and Port was wildly unpredictable, to the point that Host could have a suffix that didn't appear in neither Hostname nor Port. This is a security issue when applications are applying checks to Host and expecting them to be meaningful for the contents of Hostname. To reduce disruption, this change only aims to guarantee the following two security-relevant invariants. * Host is either Hostname or [Hostname] with Port empty, or Hostname:Port or [Hostname]:Port. * Port is only decimals. The second invariant is the one that's most likely to cause disruption, but I believe it's important, as it's conceivable an application might do a suffix check on Host and expect it to be meaningful for the contents of Hostname (if the suffix is not a valid port). There are three ways to ensure it. 1) Reject invalid ports in Parse. Note that non-numeric ports are already rejected if and only if the host starts with "[". 2) Consider non-numeric ports as part of Hostname, not Port. 3) Allow non-numeric ports, and hope they only flow down to net/http, which will reject them (#14353). This change adopts both 1 and 2. We could do only the latter, but then these invalid hosts would flow past port checks, like in http_test.TestTransportRejectsAlphaPort. Non-numeric ports weren't fully supported anyway, because they were rejected after IPv6 literals, so this restores consistency. We could do only the former, but at this point 2) is free and might help with manually constructed Host values (or if we get something wrong in Parse). Note that net.SplitHostPort and net.Dial explicitly accept service names in place of port numbers, but this is an URL package, and RFC 3986, Section 3.2.3, clearly specifies ports as a number in decimal. net/http uses a mix of net.SplitHostPort and url.Parse that would deserve looking into, but in general it seems that it will still accept service names in Addr fields as they are passed to net.Listen, while rejecting them in URLs, which feels correct. This leaves a number of invalid URLs to reject, which however are not security relevant once the two invariants above hold, so can be done in Go 1.14: IPv6 literals without brackets (#31024), invalid IPv6 literals, hostnames with invalid characters, and more. Tested with 200M executions of go-fuzz and the following Fuzz function. u, err := url.Parse(string(data)) if err != nil { return 0 } h := u.Hostname() p := u.Port() switch u.Host { case h + ":" + p: return 1 case "[" + h + "]:" + p: return 1 case h: fallthrough case "[" + h + "]": if p != "" { panic("unexpected Port()") } return 1 } panic("Host is not a variant of [Hostname]:Port") Fixes CVE-2019-14809 Updates #29098 Change-Id: I7ef40823dab28f29511329fa2d5a7fb10c3ec895 Reviewed-on: https://go-review.googlesource.com/c/go/+/189258 Reviewed-by: Ian Lance Taylor <iant@golang.org>
…ictable for invalid Host values When Host is not valid per RFC 3986, the behavior of Hostname and Port was wildly unpredictable, to the point that Host could have a suffix that didn't appear in neither Hostname nor Port. This is a security issue when applications are applying checks to Host and expecting them to be meaningful for the contents of Hostname. To reduce disruption, this change only aims to guarantee the following two security-relevant invariants. * Host is either Hostname or [Hostname] with Port empty, or Hostname:Port or [Hostname]:Port. * Port is only decimals. The second invariant is the one that's most likely to cause disruption, but I believe it's important, as it's conceivable an application might do a suffix check on Host and expect it to be meaningful for the contents of Hostname (if the suffix is not a valid port). There are three ways to ensure it. 1) Reject invalid ports in Parse. Note that non-numeric ports are already rejected if and only if the host starts with "[". 2) Consider non-numeric ports as part of Hostname, not Port. 3) Allow non-numeric ports, and hope they only flow down to net/http, which will reject them (#14353). This change adopts both 1 and 2. We could do only the latter, but then these invalid hosts would flow past port checks, like in http_test.TestTransportRejectsAlphaPort. Non-numeric ports weren't fully supported anyway, because they were rejected after IPv6 literals, so this restores consistency. We could do only the former, but at this point 2) is free and might help with manually constructed Host values (or if we get something wrong in Parse). Note that net.SplitHostPort and net.Dial explicitly accept service names in place of port numbers, but this is an URL package, and RFC 3986, Section 3.2.3, clearly specifies ports as a number in decimal. net/http uses a mix of net.SplitHostPort and url.Parse that would deserve looking into, but in general it seems that it will still accept service names in Addr fields as they are passed to net.Listen, while rejecting them in URLs, which feels correct. This leaves a number of invalid URLs to reject, which however are not security relevant once the two invariants above hold, so can be done in Go 1.14: IPv6 literals without brackets (#31024), invalid IPv6 literals, hostnames with invalid characters, and more. Tested with 200M executions of go-fuzz and the following Fuzz function. u, err := url.Parse(string(data)) if err != nil { return 0 } h := u.Hostname() p := u.Port() switch u.Host { case h + ":" + p: return 1 case "[" + h + "]:" + p: return 1 case h: fallthrough case "[" + h + "]": if p != "" { panic("unexpected Port()") } return 1 } panic("Host is not a variant of [Hostname]:Port") Fixes CVE-2019-14809 Updates #29098 Change-Id: I7ef40823dab28f29511329fa2d5a7fb10c3ec895 Reviewed-on: https://go-review.googlesource.com/c/go/+/189258 Reviewed-by: Ian Lance Taylor <iant@golang.org> (cherry picked from commit 61bb56a) Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/526408 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> (cherry picked from commit 3226f2d) Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/526409
…ictable for invalid Host values When Host is not valid per RFC 3986, the behavior of Hostname and Port was wildly unpredictable, to the point that Host could have a suffix that didn't appear in neither Hostname nor Port. This is a security issue when applications are applying checks to Host and expecting them to be meaningful for the contents of Hostname. To reduce disruption, this change only aims to guarantee the following two security-relevant invariants. * Host is either Hostname or [Hostname] with Port empty, or Hostname:Port or [Hostname]:Port. * Port is only decimals. The second invariant is the one that's most likely to cause disruption, but I believe it's important, as it's conceivable an application might do a suffix check on Host and expect it to be meaningful for the contents of Hostname (if the suffix is not a valid port). There are three ways to ensure it. 1) Reject invalid ports in Parse. Note that non-numeric ports are already rejected if and only if the host starts with "[". 2) Consider non-numeric ports as part of Hostname, not Port. 3) Allow non-numeric ports, and hope they only flow down to net/http, which will reject them (#14353). This change adopts both 1 and 2. We could do only the latter, but then these invalid hosts would flow past port checks, like in http_test.TestTransportRejectsAlphaPort. Non-numeric ports weren't fully supported anyway, because they were rejected after IPv6 literals, so this restores consistency. We could do only the former, but at this point 2) is free and might help with manually constructed Host values (or if we get something wrong in Parse). Note that net.SplitHostPort and net.Dial explicitly accept service names in place of port numbers, but this is an URL package, and RFC 3986, Section 3.2.3, clearly specifies ports as a number in decimal. net/http uses a mix of net.SplitHostPort and url.Parse that would deserve looking into, but in general it seems that it will still accept service names in Addr fields as they are passed to net.Listen, while rejecting them in URLs, which feels correct. This leaves a number of invalid URLs to reject, which however are not security relevant once the two invariants above hold, so can be done in Go 1.14: IPv6 literals without brackets (#31024), invalid IPv6 literals, hostnames with invalid characters, and more. Tested with 200M executions of go-fuzz and the following Fuzz function. u, err := url.Parse(string(data)) if err != nil { return 0 } h := u.Hostname() p := u.Port() switch u.Host { case h + ":" + p: return 1 case "[" + h + "]:" + p: return 1 case h: fallthrough case "[" + h + "]": if p != "" { panic("unexpected Port()") } return 1 } panic("Host is not a variant of [Hostname]:Port") Fixes CVE-2019-14809 Updates #29098 Change-Id: I7ef40823dab28f29511329fa2d5a7fb10c3ec895 Reviewed-on: https://go-review.googlesource.com/c/go/+/189258 Reviewed-by: Ian Lance Taylor <iant@golang.org> (cherry picked from commit 61bb56a) Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/526408 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Have you progressed on this bug? |
Not to go off-tangent too much. But what is the intent in url.Parse on strings that have no scheme? I came across this in a library that was using url.Parse on ipv4 addresses and would fail when given implicit or explicit ipv6 addresses. The library should not have been using url.Parse on plain ip address strings, but it did make me wonder why net.Parse was not returning an error. For example, The behavior I'm looking at is demonstrated here: https://play.golang.org/p/MzW9lCD7Min |
I'm using url.Parse to parse hrefs extracted from web sites, but they can be defined in various ways (and are seldomly with scheme):
Different web developers or content management systems may chose different variants and or a combination for reasons. |
5.2. Relative Resolution allows those as URI references. The ResolveReference method on url.URL strongly hints at the intention to support this in net/url. The parsing of |
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
url.URL provides the methods
.Hostname()
and.Port()
which schould split the URL's host (.Host
) into its address and port part. In certain cases, these functions are not able to interpret IPv6 addresses correctly, leading to invalid output.Here is a test function feeding different sample URLs, demonstrating the issue (all test URLs are valid and should succeed!):
Output:
What did you expect to see?
All sample URLs in the test cases above are valid ones, hence, all tests should succeed as defined
What did you see instead?
The top test samples work as expected, however, the bottom three return incorrect results. The bottom three samples are valid IPv6 URLs with inplicit port specification, but
.Hostname()
and.Port()
interpres them as IPv4 addresses returning parts of the IPv6 address as if it was the explicit port of an IPv4 input. E.g., in one of the test outputs, ":2008" is returned as the port, but it is actually part of the IPv6 address.Where is the bug?
.Hostname()
and.Port()
implement their own logic to split the port from the hostname. I've found that there is already a close function in the net package, callednet.SplitHostPort()
, which does it's job correctly. If.Hostname()
and.Port()
just called that function instead of re-implementing the logic, everything should work as expected. Below is the proof in form of a test function with exactly the same inputs as above, but usingnet.SplitHostPort()
instead of.Hostname()
/.Port()
.Output:
The text was updated successfully, but these errors were encountered: