Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/url: mysql://boulder@tcp(localhost:3306)/foo no longer parses #12023

Closed
jmhodges opened this issue Aug 5, 2015 · 15 comments
Closed

net/url: mysql://boulder@tcp(localhost:3306)/foo no longer parses #12023

jmhodges opened this issue Aug 5, 2015 · 15 comments
Milestone

Comments

@jmhodges
Copy link
Contributor

jmhodges commented Aug 5, 2015

The URL mysql://boulder@tcp(localhost:3306)/boulder_test?parseTime=true would parse and re-encode correctly in go 1.4.2.

However, in 1.5.0-beta3, url.Parse("mysql://boulder@tcp(localhost:3306)/boulder_test?parseTime=true") now returns the error message:

parse mysql://boulder@tcp(localhost:3306)/boulder_test?parseTime=true: invalid port ":3306)" after host

This was found in go-sql-driver/mysql#362.

@adg
Copy link
Contributor

adg commented Aug 5, 2015

Is that even a valid URL?

The RFC described the authority component like this:

   authority   = [ userinfo "@" ] host [ ":" port ]

So I don't think there can be parens that span the host and port section.

@liviosoares
Copy link

I have a similar issue with MongoDB URIs.

mongodb://ip1,ip2,ip3 used to parse and re-encode fine in various Go versions (including 1.4.2)

Now, when re-encoding into string, I get the mongodb://ip1%2Cip2%2Cip3.

As far as I understand the code for 'encodeHost' was added as part of commit 8e95654. The decision in the commit was to allow only reserved characters to be unencoded. The sub-delimiters that could be left unescaped (including commas and parenthesis) were chosen to be escaped.

This choice is what is causing issues for us.

@jmhodges
Copy link
Contributor Author

jmhodges commented Aug 5, 2015

I'm totally fine hearing this was a bug fix and having this bug get closed. If that happens, as a user, I'd love for the database/sql driver maintainers to get some heads up.

I'm currently in a bit of a discussion with one for the mysql driver about this in that other ticket. They were surprised by the change in behavior, but, unfortunately, more surprised by their users wanting to control and modify the strings before they go in. I think the postgres folks are more familiar with their users using and working with db strings (including heroku's DATABASE env variable stuff, etc), but some of the mysql folks are, uh, not so clear they want to support this use case.

There might be more issues like we're seeing from monogodb and mysql and a little bit of foreknowledge might make driver devs amenable to changes in their APIs to accommodate their users or maybe just some documentation on really solid workarounds.

@cespare
Copy link
Contributor

cespare commented Aug 5, 2015

The fix for all of these issues is not to use a URL parsing routine for URL-like DSNs that are not actually URLs.

@jmhodges
Copy link
Contributor Author

jmhodges commented Aug 5, 2015

Right, I was hoping the mysql and other folks would give us a nice way to use an actual URL instead of a DSN. Something like a mysqltcp:// scheme and such. Thanks for your interest!

@bradfitz bradfitz added this to the Go1.5Maybe milestone Aug 5, 2015
@bradfitz
Copy link
Contributor

bradfitz commented Aug 5, 2015

/cc @rsc @mikioh

@jmhodges jmhodges changed the title net/url: certain url's with mysql schemes no longer parse in 1.5-beta3 net/url: certain urls with mysql schemes no longer parse in 1.5-beta3 Aug 5, 2015
@mikioh mikioh changed the title net/url: certain urls with mysql schemes no longer parse in 1.5-beta3 net/url: database source name or connection string (certain urls with mysql schemes) no longer parse in 1.5-beta3 Aug 5, 2015
@mikioh
Copy link
Contributor

mikioh commented Aug 5, 2015

In terms of RFC 3986,

  • mysql://tcp(localhost)/ is valid // '(' and ')' are sub-delims and sub-delims are valid for reg-name
  • mongodb://host1,host2,host3/ is valid // ',' is a sub-delim and sub-delims are valid for reg-name
  • mongodb://host1,host2,host3:27017/ is valid // ',' is a sub-delim and sub-delims are valid for reg-name

but

  • mysql://tcp(localhost:3306)/ is not valid // position of the sub-delim ':'
  • mongodb://host1,host2:27017,host3:27018/ is not valid // position of the sub-delim ':'
  • mongodb://1.0.0.1,1.0.0.2,1.0.0.3/ is not valid // no literal address
  • mongodb://1.0.0.1,1.0.0.2,1.0.0.3:27018/ is not valid // no literal address

Looks like it would be a bit hard to support various database source names, connection strings.

@mikioh
Copy link
Contributor

mikioh commented Aug 5, 2015

mongodb://ip1,ip2,ip3 -> mongodb://ip1%2Cip2%2Cip3 is a bug comes from #5684.

@liviosoares
Copy link

Couple of comments on this thread:

  1. @cespare: Not sure I understand the point of bringing up data source names (DSNs). DSN is an abstract term, while URLs and URIs have RFCs where we can derive an actual specification. So, if you want to implement a DSN, using an URI/URL is not necessarily a bad way to do it. In fact, I think it's a very common way to implement DSNs is through the use of URI/URL.

  2. @mikioh: yes, net/url: url.Parse("example.com/oid/[order_id]") escapes [ ] in String() method #5684 is one of root causes. OTOH, that bug became relevant for host names only in the 1.5 cycle, due to host name encoding support added in 8e95654.

Your broader comment about RFC 3986 and it's suitability to a wide range of connection strings is understandable. However, it is common to use URIs as connection strings. I have managed to do so for connecting to several services (redis, cassandra, mongodb, ceph/rados), without using the invalid forms you posted above.

Bottom line is: as a user, I would much rather make an effort to make sure my URIs are complaint with RFC 3986, and be able to rely on net/url to do URI manipulation rather than to design custom connection string parsers, which are likely 99% similar to net/url.

In any case, whichever way the Go team decides, I think a note is warranted in the release notes. There may be others, like me, who have strived to keep URIs RFC compliant, who will have broken code when upgrading from Go <= 1.4.2 to 1.5.

@mikioh mikioh changed the title net/url: database source name or connection string (certain urls with mysql schemes) no longer parse in 1.5-beta3 net/url: database source name or connection string (certain? urls with mysql schemes) no longer parse in 1.5-beta3 Aug 5, 2015
@mikioh
Copy link
Contributor

mikioh commented Aug 5, 2015

@liviosoares,

Please open a new issue for escaping host subcomponent of URL.

@rsc
Copy link
Contributor

rsc commented Aug 5, 2015

@liviosoares I created #12036 for your mongodb issue. I have some questions there for you. Thanks.

@rsc rsc changed the title net/url: database source name or connection string (certain? urls with mysql schemes) no longer parse in 1.5-beta3 net/url: mysql://boulder@tcp(localhost:3306)/boulder_test?parseTime=true no longer parses Aug 5, 2015
@rsc rsc changed the title net/url: mysql://boulder@tcp(localhost:3306)/boulder_test?parseTime=true no longer parses net/url: mysql://boulder@tcp(localhost:3306)/foo no longer parses Aug 5, 2015
@cespare
Copy link
Contributor

cespare commented Aug 5, 2015

@liviosoares I agree that it's nice to use URIs to specify DSNs, but if your database/driver does that, it should support URI semantics:

  • Invalid URIs like mysql://tcp(localhost:3306)/ ought to be rejected
  • Equivalent URIs like mongodb://ip1,ip2,ip3 and mongodb://ip1%2Cip2%2Cip3 ought to be interchangeable

If these semantics don't hold for these DSNs, then such strings are not URIs but are merely URI-like, and net/url shouldn't be relied upon to parse or encode them.

@rsc
Copy link
Contributor

rsc commented Aug 6, 2015

Mikioh, please see my question on #12036 about host encoding.

@rsc
Copy link
Contributor

rsc commented Aug 6, 2015

The code breaking this URL seems to me a classic example of fuzz testing hurting instead of helping, by inducing real bugs when we fix hypothetical bugs. I am inclined to move the validHostPort check up into the [ipv6]:port syntax case. I don't see any reason for it to be here. It is not relevant to the original fuzz report (#11208).

diff --git a/src/net/url/url.go b/src/net/url/url.go
index abcd23b..1cec43b 100644
--- a/src/net/url/url.go
+++ b/src/net/url/url.go
@@ -478,7 +478,6 @@ func parseAuthority(authority string) (user *Userinfo, host string, err error) {
 // information. That is, as host[:port].
 func parseHost(host string) (string, error) {
    litOrName := host
-   var colonPort string // ":80" or ""
    if strings.HasPrefix(host, "[") {
        // Parse an IP-Literal in RFC 3986 and RFC 6874.
        // E.g., "[fe80::1], "[fe80::1%25en0]"
@@ -490,7 +489,10 @@ func parseHost(host string) (string, error) {
        if i < 0 {
            return "", errors.New("missing ']' in host")
        }
-       colonPort = host[i+1:]
+       colonPort := host[i+1:]
+       if !validOptionalPort(colonPort) {
+           return "", fmt.Errorf("invalid port %q after host", colonPort)
+       }
        // Parse a host subcomponent without a ZoneID in RFC
        // 6874 because the ZoneID is allowed to use the
        // percent encoded form.
@@ -500,11 +502,8 @@ func parseHost(host string) (string, error) {
        } else {
            litOrName = host[1:j]
        }
-   } else {
-       if i := strings.Index(host, ":"); i != -1 {
-           colonPort = host[i:]
-       }
    }
+
    // A URI containing an IP-Literal without a ZoneID or
    // IPv4address in RFC 3986 and RFC 6847 must not be
    // percent-encoded.
@@ -517,9 +516,6 @@ func parseHost(host string) (string, error) {
    if strings.Contains(litOrName, "%") {
        return "", errors.New("percent-encoded characters in host")
    }
-   if !validOptionalPort(colonPort) {
-       return "", fmt.Errorf("invalid port %q after host", colonPort)
-   }
    var err error
    if host, err = unescape(host, encodeHost); err != nil {
        return "", err

@gopherbot
Copy link

CL https://golang.org/cl/13253 mentions this issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants