New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/url: URL allows malformed query round trip #22907

Open
artyom opened this Issue Nov 28, 2017 · 10 comments

Comments

Projects
None yet
6 participants
@artyom
Contributor

artyom commented Nov 28, 2017

What did you do?

package main

import (
	"fmt"
	"log"
	"net/url"
)

func main() {
	u, err := url.Parse("http://example.com/bad path/?bad query#bad fragment")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(u.String())
}

https://play.golang.org/p/hdX1zpv3BN

What did you expect to see?

I expect either url.Parse return a non-nil error or URL.String method return fully escaped url representation — http://example.com/bad%20path/?bad%20query#bad%20fragment — with query being escaped the same way as path or fragment.

What did you see instead?

http://example.com/bad%20path/?bad query#bad%20fragment

For the reference, such url is rejected by net/http.Server: https://play.golang.org/p/2gujmbXZlu

Does this issue reproduce with the latest release (go1.9.2)?

Yes

System details

go version devel +9a13f8e11c Tue Nov 28 06:47:50 2017 +0000 darwin/amd64
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/artyom/Library/Caches/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/tmp/go:/Users/artyom/go"
GORACE=""
GOROOT="/Users/artyom/Repositories/go"
GOTMPDIR=""
GOTOOLDIR="/Users/artyom/Repositories/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/lb/3rk8rqs53czgb4v35w_342xc0000gn/T/go-build624293827=/tmp/go-build -gno-record-gcc-switches -fno-common"
GOROOT/bin/go version: go version devel +9a13f8e11c Tue Nov 28 06:47:50 2017 +0000 darwin/amd64
GOROOT/bin/go tool compile -V: compile version devel +9a13f8e11c Tue Nov 28 06:47:50 2017 +0000
uname -v: Darwin Kernel Version 17.2.0: Fri Sep 29 18:27:05 PDT 2017; root:xnu-4570.20.62~3/RELEASE_X86_64
ProductName:	Mac OS X
ProductVersion:	10.13.1
BuildVersion:	17B48
lldb --version: lldb-900.0.57
  Swift-4.0

https://tools.ietf.org/html/rfc3986#section-3.4 states that query component should be defined as (appendix A):

  query       = *( pchar / "/" / "?" )
  pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
  unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
  pct-encoded   = "%" HEXDIG HEXDIG
  sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="

There's no whitespace character in this list. whatwg agrees on that:

A URL-query string must be zero or more URL units.

[...]

The URL units are URL code points and percent-encoded bytes.

[...]

The URL code points are ASCII alphanumeric, U+0021 (!), U+0024 ($), U+0026 (&), U+0027 ('), U+0028 LEFT PARENTHESIS, U+0029 RIGHT PARENTHESIS, U+002A (*), U+002B (+), U+002C (,), U+002D (-), U+002E (.), U+002F (/), U+003A (:), U+003B (;), U+003D (=), U+003F (?), U+0040 (@), U+005F (_), U+007E (~), and code points in the range U+00A0 to U+10FFFD, inclusive, excluding surrogates and noncharacters.

@bradfitz bradfitz added this to the Go1.11 milestone Nov 28, 2017

@namusyaka

This comment has been minimized.

Member

namusyaka commented Dec 8, 2017

note:
This seems to be related to this line, and its behavior looks reasonable.

@namusyaka

This comment has been minimized.

Member

namusyaka commented Dec 8, 2017

I have investigated the issue and confirmed that this comment is incorrect, or at least the current (*URL) String() implementation is wrong.
In fact, query components aren't escaped at all, not only in this example.
If the comment is positive, I think we should escape the query component appropriately.
/cc @tombergan

@fraenkel

This comment has been minimized.

Contributor

fraenkel commented Dec 8, 2017

@namusyaka I think you are mistaken. The problem is with all forms of Parse which does no validation on the query or fragment pieces. The parse of the uri should have failed. Since it does not, the Fragment or RawQuery fields are incorrect and hence String() which is doing the right thing appears to provide a bad url.

All that has to happen is to apply validation of the query and fragment pieces in parse, and that should solve the issues. It might start breaking code that is sending around "bad" URLs.

@fraenkel

This comment has been minimized.

Contributor

fraenkel commented Dec 9, 2017

Its not clear that this can be fixed given the behavior of shouldEscape() which states that most of the sub-delims should escape.

@namusyaka

This comment has been minimized.

Member

namusyaka commented Dec 9, 2017

@fraenkel Yes, I knew it's more easy to understand the behavior.
However, if your opinion will reflect to the implementation, another cases such as http://example.com/foo bar, http://example.com/foo? bar and http://example.com/foo?bar# baz should be failed on parse, and then I'm worried that breaks backword-compatibility.

@gopherbot

This comment has been minimized.

gopherbot commented Mar 7, 2018

Change https://golang.org/cl/99135 mentions this issue: net/url: reject invalid query strings when parsing URLs.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Jul 11, 2018

Copying my comment from the CL:

I changed the commit message to accurately reflect what later versions of this CL did. It now says:

net/url: escape URL.RawQuery on Parse if it contains invalid characters

But while reviewing this, I'm worried it might change the behavior of:

https://golang.org/pkg/net/url/#URL.Query
which says:

"It silently discards malformed value pairs. To check errors use ParseQuery."

Before it silently discarded things, and with this CL it would start returning escaped versions instead. Do we care?


I think we probably should wait for Go 1.12 at this point. This needs a decision on what to do.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Sep 26, 2018

Reopening, as the fix is being reverted.

@bradfitz bradfitz reopened this Sep 26, 2018

@gopherbot

This comment has been minimized.

gopherbot commented Sep 26, 2018

Change https://golang.org/cl/137716 mentions this issue: Revert "net/url: escape URL.RawQuery on Parse if it contains invalid characters"

gopherbot pushed a commit that referenced this issue Sep 26, 2018

Revert "net/url: escape URL.RawQuery on Parse if it contains invalid …
…characters"

This reverts commit CL 99135 (git rev 1040626).

Reason for revert: breaks valid code; see #27302

Fixes #27302
Updates #22907

Change-Id: I82bb0c28ae1683140c71e7a2224c4ded3f4acea1
Reviewed-on: https://go-review.googlesource.com/137716
Reviewed-by: Ian Lance Taylor <iant@golang.org>

komuw added a commit to komuw/go that referenced this issue Dec 4, 2018

net/url: escape URL.RawQuery on Parse if it contains invalid characters
Fixes golang#22907

Change-Id: I7abcf53ab92768514e13ce2554a6c25dcde8218e
Reviewed-on: https://go-review.googlesource.com/99135
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

@agnivade agnivade added NeedsDecision and removed NeedsFix labels Dec 9, 2018

@agnivade

This comment has been minimized.

Member

agnivade commented Dec 9, 2018

@bradfitz - Any thoughts on this ? Or do we want to push to 1.13 ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment