Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: ServeMux uses URL.Path instead of URL.EscapedPath(), leading it to treat %2F as a path separator #14815

Closed
gelraen opened this issue Mar 14, 2016 · 7 comments

Comments

Projects
None yet
4 participants
@gelraen
Copy link

commented Mar 14, 2016

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
    go version go1.6 darwin/amd64
  2. What did you do?
    http://play.golang.org/p/60StDlOnw8
    It doesn't work on play.golang.org, but when you run it on local machine it results in:
2016/03/14 14:40:10 http: panic serving 127.0.0.1:58546: /a/b
  1. What did you expect to see?
    I expected HTTP handler to get a request with URL equivalent to URL in original request, i.e. /a%2F%2Fb. Per https://tools.ietf.org/html/rfc3986#section-2.2 /a/b and even /a//b are NOT EQUIVALENT to /a%2F%2Fb

This happens because http.ServeMux uses URL.Path, which already contains wrong data /a//b, does some normalization on that and issues a redirect to /a/b, so user handler is never invoked during the initial request, only after this redirect.

I suggest one of these ways to fix this:

  1. Make net/url percent-decode only unreserved characters, making it compliant with RFC3986
  2. Use EscapedPath() in http.ServeMux, so it will see the data before net/url spoiled it

Both options (and probably any other) break API in one way or another.

Yes, I saw the response on #8248 that this is documented behavior, but it directly contradicts relevant Internet standard, so it's worth to be changed.

@bradfitz

This comment has been minimized.

Copy link
Member

commented Mar 14, 2016

This comes up repeatedly in various forms, but compatibility and not breaking old programs likely prevents us from doing anything here. Any such proposal needs to be thoroughly investigated to consider its impact on existing programs.

/cc @rsc for any opinions.

@bradfitz bradfitz added this to the Unplanned milestone Mar 14, 2016

@gelraen

This comment has been minimized.

Copy link
Author

commented Mar 14, 2016

I can hardly imagine an application that would rely on %2F to be decoded and used as path separator. In most cases you either don't care about those weirdos sending you requests with %2F or want it to be preserved as is and not used as path separator. But that's just my opinion based on a very limited sample.

@rsc

This comment has been minimized.

Copy link
Contributor

commented Mar 28, 2016

You cannot just leave the %2F's in place, because that will encourage people to unescape the URL.Path again, which will lead to double-unescaping problems. And you do need to unescape the other things. You do want /%78yz to turn into /xyz. Also if you decide not to unescape %2F then you also can't unescape %252F, and because you're not unescaping that, you can't unescape %25252F, or %2525252F, and so on. It gets into a rathole pretty quickly. And if we started doing this all of a sudden then you'd have to worry about people seeing %2F and not knowing if it was a preserved %2F or a decoded %252F being process by an earlier version of Go. There are no good answers here.

Fundamentally, the idea of 'Path string' is at odds with the "internet standard". It should have been 'Path []string' and people who wanted the full path would have to strings.Join(u.Path, "") or something like that. But obviously we can't do that now.

For now I think we should just leave things alone. In the long term I think very limited wildcard support in ServeMux might help provide an API with the convenience of the single string form and still have the ability to pull out these kinds of mangled paths from individual elements. I wrote more about this at https://www.reddit.com/r/golang/comments/46bd5h/ama_we_are_the_go_contributors_ask_us_anything/d05lewi.

The workaround is to pass your own handler to Serve or ListenAndServe and have that handler pick off the %2F paths you care about before handing the rest to ServeMux. One reason the root handler is just an http.Handler instead of a ServeMux is so that you can do this when it is necessary. (Sadly, this unexpected usage of REST has made it more necessary than we anticipated.)

@gelraen

This comment has been minimized.

Copy link
Author

commented Mar 28, 2016

Also if you decide not to unescape %2F then you also can't unescape %252F, and because you're not unescaping that, you can't unescape %25252F, or %2525252F, and so on. It gets into a rathole pretty quickly. And if we started doing this all of a sudden then you'd have to worry about people seeing %2F and not knowing if it was a preserved %2F or a decoded %252F being process by an earlier version of Go.

True. However, RFC3986 defines a concept of equivalence, which goes as far as unescaping unreserved characters. Problem with URL.Path is that it maps URLs from distinct equivalence classes to the same string value, and any change in how things are unescaped for URL.Path is API-breaking change. So it's either URL.Path is left broken or someone's code exhibits behavior change; you can't avoid both of these because there may be existing code that relies on this particular brokenness.

It is possible to preserve equivalence while mapping URL paths to string values by unescaping only unreserved characters (and "%" is not unreserved, it's funny^Wannoying that "reserved" + "unreserved" sets don't cover full ASCII), and then having UnescapeNonUnreserved(string) string function to be used on separate path components. But it would be less fragile to expose parsed and unescaped path as []string.

https://www.reddit.com/r/golang/comments/46bd5h/ama_we_are_the_go_contributors_ask_us_anything/d05lewi.
Similarly, the decision to use unescaped URLs in the API has been overall very helpful: no handler has to worry about turning index%2Ehtml into index.html, for example.

I totally agree that unescaping (more precisely – normalization https://tools.ietf.org/html/rfc3986#section-6 ) is very useful and simplifies user code, but it was taken a bit too far and now it's too late to fix it in-place without breaking someone's code.

In the long term I think very limited wildcard support in ServeMux might help provide an API with the convenience of the single string form and still have the ability to pull out these kinds of mangled paths from individual elements.

In my case it doesn't mess the routing (i.e. it invokes the exact handler I want it to), but once it does, there's already no way to distinguish between / (path separator) and %2F (escaped "/" character). So unless you plan to bring preserving of escaped reserved characters (i.e. using URL.RawPath instead of URL.Path) along with wildcards, it won't help.

It should have been 'Path []string' and people who wanted the full path would have to strings.Join(u.Path, "") or something like that. But obviously we can't do that now.

How about adding URL.ParsedPath []string? Just like URL.RawPath was added in 1.5. It may require some considerations about what to do when it gets inconsistent with URL.Path and URL.RawPath, but I don't see any fundamental reason why it's not possible. Oh, and before joining path components you'd need to escape non-unreserved characters in each component, so it's a bit more complicated than simple strings.Join.

It's also possible to add it as URL.ParsedPath() []string (that will work off URL.EscapedPath()) and transition to use that instead of URL.Path in http.ServeMux and other standard packages.

The workaround is to pass your own handler to Serve or ListenAndServe and have that handler pick off the %2F paths you care about before handing the rest to ServeMux.

Yeah, for now I'm stuck with my own clone of http.ServeMux that preserves only %2F in the path.

@rsc

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2016

I don't think we're going to add more complexity to url.URL now. RawPath suffices to preserve all the information one might need to derive or need control over. As of Go 1.5, clients in this (not everyday) situation can build what they need from the pieces provided. When it's time for a rethink of the overall big picture, we'll be sure to consider this. But I'd rather not add more piecemeal.

@gelraen

This comment has been minimized.

Copy link
Author

commented Mar 30, 2016

Okay, then what do you think about making http.ServeMux use EscapedPath() ? Cause my issue right now is that after passing through http.ServeMux information I need is not preserved even in RawPath.

@rsc

This comment has been minimized.

Copy link
Contributor

commented Mar 30, 2016

I understand the issue you are having, but I don't think we can change
anything today. http.ServeMux is defined to issue certain redirects back to
the client and it is doing that (the request for /a/b%2f%2fc redirects the
client to /a/b/c; that second request is what your handler sees). If you
need to get in ahead of http.ServeMux and prevent the defined redirects,
the way to do that today is to define your own mux in front of it and pass
that to http.Serve or http.ListenAndServe.

On Wed, Mar 30, 2016 at 4:53 AM Maxim Ignatenko notifications@github.com
wrote:

Okay, then what do you think about making http.ServeMux use EscapedPath()
? Cause my issue right now is that after passing through http.ServeMux
information I need is not preserved even in RawPath.


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#14815 (comment)

@bradfitz bradfitz closed this Mar 31, 2016

@golang golang locked and limited conversation to collaborators Mar 31, 2017

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.