Description
What version of Go are you using (go version)?
$ go version
go version go1.17 linux/amd64
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/anon/.cache/go-build"
GOENV="/home/anon/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/anon/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/anon/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build697905611=/tmp/go-build -gno-record-gcc-switches"
What did you do?
We ran into an issue where Go's net/url unescape functions return an error on a variant of double encoding that other programming languages decode without complaint. A specific example is using %%32%65 as an alternative to %252e; both are meant to decode to %2e. Here's a demonstration of the error in the Go playground:
https://play.golang.org/p/bfPg9f_oGxF
The docs are fairly clear about this and say:
It returns an error if any % is not followed by two hexadecimal digits.
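For readers who cannot open the playground link, the behavior can be reproduced with a minimal sketch like the one below. The helper name tryUnescape is hypothetical and exists only for this illustration; url.PathUnescape is used here as one of the net/url unescape functions, on the assumption that it is representative of the call in the original playground snippet.

```go
package main

import (
	"fmt"
	"net/url"
)

// tryUnescape is a hypothetical helper for this illustration. It reports
// whether net/url accepts the input and, if so, what it decodes to.
func tryUnescape(s string) string {
	out, err := url.PathUnescape(s)
	if err != nil {
		return "error: " + err.Error()
	}
	return "ok: " + out
}

func main() {
	// The leading % is followed by another %, not by two hex digits,
	// so decoding fails before %32 and %65 are ever looked at.
	fmt.Println(tryUnescape("%%32%65"))

	// The conventional double encoding decodes cleanly: %25 becomes %.
	fmt.Println(tryUnescape("%252e"))
}
```

The first call reports an escape error for the sequence starting at the stray %, while the second returns %2e.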
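The issue, however, is that other languages use an algorithm similar to the one specified in WHATWG's URL spec, which passes a % through unchanged when it is not followed by two hexadecimal digits instead of failing.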
An example from Python's urllib:
$ python3
Python 3.9.5 (default, May 11 2021, 08:20:37)
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.parse
>>> urllib.parse.unquote("%%32%65")
'%2e'
>>>
Another example, using the Servo percent-encoding crate for Rust:
What did you expect to see?
I would expect it to be handled similarly to the WHATWG URL spec's percent-decode algorithm. In this very specific example, the input would decode to %2e.
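To make the expected behavior concrete, here is a rough sketch of a lenient decoder written in the spirit of the WHATWG percent-decode algorithm. It is not part of net/url and not a proposed patch; the names lenientUnescape, isHex, and unhex are made up for this illustration.

```go
package main

import "fmt"

// isHex reports whether c is an ASCII hexadecimal digit.
func isHex(c byte) bool {
	return ('0' <= c && c <= '9') || ('a' <= c && c <= 'f') || ('A' <= c && c <= 'F')
}

// unhex converts an ASCII hexadecimal digit to its numeric value.
// The caller must have checked the digit with isHex first.
func unhex(c byte) byte {
	switch {
	case '0' <= c && c <= '9':
		return c - '0'
	case 'a' <= c && c <= 'f':
		return c - 'a' + 10
	default:
		return c - 'A' + 10
	}
}

// lenientUnescape is a hypothetical decoder: a % followed by two hex digits
// is decoded to one byte, and any other % is copied through unchanged
// rather than reported as an error.
func lenientUnescape(s string) string {
	out := make([]byte, 0, len(s))
	for i := 0; i < len(s); i++ {
		c := s[i]
		if c == '%' && i+2 < len(s) && isHex(s[i+1]) && isHex(s[i+2]) {
			out = append(out, unhex(s[i+1])<<4|unhex(s[i+2]))
			i += 2 // skip the two hex digits just consumed
			continue
		}
		out = append(out, c)
	}
	return string(out)
}

func main() {
	fmt.Println(lenientUnescape("%%32%65")) // stray % kept, %32 and %65 decoded
	fmt.Println(lenientUnescape("%252e"))   // conventional double encoding
}
```

Both inputs produce %2e under this sketch, which matches the Python urllib result shown earlier.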
What did you see instead?
A decoding error, shown here: