net/url: fails to decode %ya whereas browsers are more tolerant #29808
Comments
According to the standard RFC 3986, Section 2.1, a percent encoded character must be of the form:
And, the percent sign (
So the character sequence |
Thanks, I'll be waiting for a decision.
A shorter version is What does Apache do, or Nginx? |
I think yes, it is. But I've tested this case with Nginx (directory listing mode and reverse proxy), and it accepts only URLs like
I looked up what WHATWG URL Standard says about this. I'm not implying that Go must follow it, but just a case in point. https://url.spec.whatwg.org/#path-state
First, a "validation error" is not a hard error: the parser may continue and report the error separately. Second, I think they don't want the parser to automatically decode percent-encoding when parsing the path portion of the URL. Point 3 means that certain characters, like
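The "report the error but keep going" behaviour described in this comment can be sketched in Go. This is only an illustration: `lenientUnescape` is a hypothetical helper, not part of `net/url` and not the WHATWG path-state algorithm (which, as noted, does not decode the path at all). It decodes well-formed `%XX` sequences, copies any other `%` through unchanged, and returns the offsets of the malformed ones as non-fatal validation errors.

```go
package main

import "fmt"

// isHex reports whether c is an ASCII hexadecimal digit.
func isHex(c byte) bool {
	return '0' <= c && c <= '9' || 'a' <= c && c <= 'f' || 'A' <= c && c <= 'F'
}

// unhex converts one ASCII hex digit to its value; callers check isHex first.
func unhex(c byte) byte {
	switch {
	case '0' <= c && c <= '9':
		return c - '0'
	case 'a' <= c && c <= 'f':
		return c - 'a' + 10
	default:
		return c - 'A' + 10
	}
}

// lenientUnescape is a hypothetical helper for illustration only.
// It decodes valid %XX sequences and keeps any other '%' as a literal byte.
// The second result lists the byte offsets of the '%' signs that were not
// followed by two hex digits (the "validation errors").
func lenientUnescape(s string) (string, []int) {
	out := make([]byte, 0, len(s))
	var bad []int
	for i := 0; i < len(s); i++ {
		if s[i] != '%' {
			out = append(out, s[i])
			continue
		}
		if i+2 < len(s) && isHex(s[i+1]) && isHex(s[i+2]) {
			out = append(out, unhex(s[i+1])<<4|unhex(s[i+2]))
			i += 2 // skip the two hex digits just consumed
			continue
		}
		bad = append(bad, i) // record the problem, but do not abort
		out = append(out, '%')
	}
	return string(out), bad
}

func main() {
	decoded, bad := lenientUnescape("/a%20b/%ya")
	fmt.Printf("%q %v\n", decoded, bad) // prints "/a b/%ya" [7]
}
```

A caller can then decide whether a non-empty offset list should be logged, rejected, or ignored, which is the choice a hard parse error takes away.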
One could work around this by using a separate URL parser, like https://github.com/nlnwa/whatwg-url, but unfortunately, |
Go net/http cannot send HTTP requests containing invalid URL encoding in the path (e.g. a bare percent) at all[1]. Browsers send a bare percent in such a scenario and do not implicitly autoencode it. Until the upstream issue is resolved somehow, we have only two alternatives: either fail to fetch such URLs, or at least attempt the autoencoded variant. Lots of web servers handle them the same way, so it's worth trying. There aren't too many websites with invalid URL encoding in the path component, though. [1] golang/go#29808
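The "autoencoded variant" mentioned in this commit message could look roughly like the sketch below. It is not the code from that commit: `escapeBarePercents` is a hypothetical helper, and the sample URL is a stand-in. The idea is to rewrite every `%` that is not followed by two hex digits as `%25`, so that `net/url` accepts the result.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isHex reports whether c is an ASCII hexadecimal digit.
func isHex(c byte) bool {
	return '0' <= c && c <= '9' || 'a' <= c && c <= 'f' || 'A' <= c && c <= 'F'
}

// escapeBarePercents is a hypothetical helper for illustration only.
// It replaces each '%' that does not start a valid %XX sequence with "%25"
// and leaves everything else, including valid escapes, untouched.
func escapeBarePercents(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] == '%' && (i+2 >= len(s) || !isHex(s[i+1]) || !isHex(s[i+2])) {
			b.WriteString("%25")
			continue
		}
		b.WriteByte(s[i])
	}
	return b.String()
}

func main() {
	raw := "/search/%ya?x=1" // stand-in URL, not the one from the report

	if _, err := url.ParseRequestURI(raw); err != nil {
		fmt.Println("original rejected:", err)
	}

	fixed := escapeBarePercents(raw)
	u, err := url.ParseRequestURI(fixed)
	if err != nil {
		fmt.Println("still rejected:", err)
		return
	}
	fmt.Println(fixed, u.Path)
}
```

Note the trade-off: the server receives `%25ya` rather than the bare `%ya` a browser would send, so this only helps when the server treats the two forms the same way.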
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
go env output:
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="freebsd"
GOOS="freebsd"
GOPATH="/root/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/freebsd_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build246043989=/tmp/go-build -gno-record-gcc-switches"
What did you do?
The Yandex web application (https://yandex.ru/search) periodically sends requests like:
I parse this URL with url.ParseRequestURI() and it returns an error, but as I understand it, the URL is valid.
What did you expect to see?
Parsed URL.
What did you see instead?
The error:
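A minimal reproduction might look like the sketch below. The path `/%ya` is a stand-in taken from the issue title, not the actual Yandex URL, and `invalidEscape` is a hypothetical wrapper added for illustration. It relies on `errors.As`, so it assumes Go 1.13 or newer, which is later than the release this report was filed against.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// invalidEscape is a hypothetical wrapper for illustration only.
// It parses raw with url.ParseRequestURI and, if parsing failed because of
// a malformed percent escape, returns the offending sequence.
func invalidEscape(raw string) (string, bool) {
	_, err := url.ParseRequestURI(raw)
	var esc url.EscapeError
	if errors.As(err, &esc) {
		return string(esc), true
	}
	return "", false
}

func main() {
	// Stand-in request URI from the issue title; 'y' is not a hex digit.
	seq, bad := invalidEscape("/%ya")
	fmt.Println(seq, bad)
}
```

`url.EscapeError` is the error type `net/url` uses for malformed percent escapes; `ParseRequestURI` wraps it in a `*url.Error`, which is why `errors.As` is used to unwrap it.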