Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/url: don't parse ';' as a separator in query string [freeze exception] #25192

Closed
zhyale opened this issue May 1, 2018 · 38 comments
Closed

net/url: don't parse ';' as a separator in query string [freeze exception] #25192

zhyale opened this issue May 1, 2018 · 38 comments

Comments

@zhyale
Copy link

@zhyale zhyale commented May 1, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go1.10.1 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

CentOS 7 with kernel 3.10.0-514.10.2.el7.x86_64

What did you do?

I create a program to detect attack such as SQL Injection, when test case is:
http://..../testcase?id=1%27;--

and I use:
r.ParseForm()
params := r.Form
fmt.Println("params:", params, "count:", len(params))
for key, values := range params {
fmt.Println("param", key, ":", values)
}

Got:
params: map[--:[] id:[1']] count: 2
param id : [1']
param -- : []

What did you expect to see?

expect only one expression in this case:
key: id
value: 1';--

What did you see instead?

I got two key:[value] pairs.

@agnivade agnivade changed the title r.Form parse error when parameter include SQL Injection test net/http: ParseForm incorrectly parses parameters during an SQL Injection test May 1, 2018
@agnivade
Copy link
Contributor

@agnivade agnivade commented May 1, 2018

Complete repro -

package main

import (
	"fmt"
	"net/http"
)

func main() {
	http.HandleFunc("/foo", func(w http.ResponseWriter, req *http.Request) {
		err := req.ParseForm()
		if err != nil {
			fmt.Println(err)
			w.Write([]byte("error"))
			return
		}
		params := req.Form
		fmt.Println("params:", params, "count:", len(params))
		for key, values := range params {
			fmt.Println("param", key, ":", values)
		}
		w.Write([]byte("OK"))
	})

	fmt.Println("starting on port 9999")
	server := &http.Server{
		Addr: ":9999",
	}
	server.ListenAndServe()
}

curl 'http://localhost:9999/foo?id=1%27;--'

@bradfitz - Is this expected ?

Loading

@agnivade agnivade added this to the Go1.11 milestone May 1, 2018
@fraenkel
Copy link
Contributor

@fraenkel fraenkel commented May 1, 2018

Both & and ; are used to split key value pairs. The ; is optional since it allows you to provide a URL as a value with a query string.
See https://en.wikipedia.org/wiki/Query_string under Web Forms

Loading

@odeke-em
Copy link
Member

@odeke-em odeke-em commented May 1, 2018

Hello @zhyale, thank you for filing this issue and welcome to the Go project!

@agnivade and @fraenkel thank you for the responses too.

So I believe, the root cause of the question here is rather: Why is a semi colon being used as a separator in url.Parse?

If you run https://play.golang.org/p/QVz18jWspPF or inlined below:

package main

import (
	"log"
	"net/url"
)

func main() {
	log.SetFlags(0)
	u, err := url.Parse("http://localhost:9999/foo?id=1%27;--")
	if err != nil {
		log.Fatalf("Failed to parse URL: %v", err)
	}
	log.Printf("%#v\n", u.Query())
}

You'll see the real symptom

url.Values{"--":[]string{""}, "id":[]string{"1'"}}

A W3C recommendation https://www.w3.org/TR/html401/appendix/notes.html#h-B.2.2 recommended using ; as a separator for url-encoded params for form-data
screen shot 2018-05-01 at 5 08 58 am

and that was what we used to add ; as a separator, making ';' synoymous with '&' in https://golang.org/cl/4973062 #2210 (comment)

However, unfortunately that recommendation just got superseded very recently in mid-December 2017 by https://www.w3.org/TR/2017/REC-html52-20171214/

and now that points to https://url.spec.whatwg.org/#urlsearchparams
in which we can see that the only separator here is '&'
screen shot 2018-05-01 at 5 19 53 am

Node.js doesn't seem to recognize ';' as a separator

url.parse('http://localhost:9999/foo?id=1%27;--', true);
Url {
  protocol: 'http:',
  slashes: true,
  auth: null,
  host: 'localhost:9999',
  port: '9999',
  hostname: 'localhost',
  hash: null,
  search: '?id=1%27;--',
  query: { id: '1\';--' },
  pathname: '/foo',
  path: '/foo?id=1%27;--',
  href: 'http://localhost:9999/foo?id=1%27;--' }

and the old survey of what other languages do while technically still correct per #2210 (comment), any future adoptions of the new recommendation from W3C will mean that those languages will also change their behavior.

Now the big question is: this behavior has been around since 2011, changing it 7 years later might massively break code for many users. Perhaps we need a survey of the make-up and interpretation of query strings from some frontend server?

Also this perhaps will need some security considerations as well or maybe a wide scale adoption of the recommendation first?

In addition to those already paged, I will page some more folks to help pitch in thoughts about the fate of the new W3C recommendation /cc @ianlancetaylor @andybons @tombergan @agl @rsc @mikesamuel

Loading

@odeke-em odeke-em changed the title net/http: ParseForm incorrectly parses parameters during an SQL Injection test net/url: ';' is parsed as a separator in query string May 1, 2018
@bradfitz bradfitz removed this from the Go1.11 milestone May 1, 2018
@bradfitz bradfitz added this to the Go2 milestone May 1, 2018
@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented May 1, 2018

The Go 1 docs say:

Query is expected to be a list of key=value settings separated by ampersands or semicolons.

So we can't change it during Go 1.x.

Repurposing this to be a Go 2.x issue.

Loading

@mikesamuel
Copy link
Contributor

@mikesamuel mikesamuel commented May 1, 2018

IIUC, the problem is that

Array.from(new URL(`https://a/?a&b;c`).searchParams.keys())
// [ "a", "b;c" ]

but

values, _ := url.ParseQuery(`a&b;c`)
fmt.Println(reflect.ValueOf(values).MapKeys())  // [c a b]

Assuming that's right:

Now the big question is: this behavior has been around since 2011, changing it 7 years later might massively break code for many users. Perhaps we need a survey of the make-up and interpretation of query strings from some frontend server?

+1

We might be able to get a handle on the size of potential breakage by surveying URL composition libraries to get an idea of how many sources are likely to produce URLs that are substantially different given non-crafted inputs.

For example:

// html/template
t, _ := template.New("T").Parse(`<a href="/?a={{.}}&b=b">`)
t.Execute(&out, `foo;bar&baz`)
out.String() == `<a href="/?a=foo%3bbar%26baz&b=b">`

// net/url
url.QueryEscape(`foo;bar&baz`) == `foo%3Bbar%26baz`
url.QueryUnescape(`foo%3Bbar%26baz`) == `foo;bar&baz`

In vanilla JS

encodeURIComponent(`foo;bar&baz`) == `foo%3Bbar%26baz`
encodeURI(`foo;bar&baz`) == `foo;bar&baz`

I think this is mostly a correctness and interop issue but that the security consequences are not severe.

That said, there are risks to inferring more structure than the spec allows.

If a survey finds that there are widely used url.QueryEscape / encodeURIComponent counterparts that escape & but do not escape ;, and a trusted partner composes URLs thus:

goServiceUrl = "//example.com/?"
    + "securityIrrelevantParam=" + laxEscapeQuery(untrustedString)
    + "&importantParam=" + laxEscapeQuery(trustedString)

then untrustedString might inject a ;importantParam which would be available as the zero-th entry in Go's query map.

The risk is pretty small as long as laxEscapeQuery escapes '=' and one workaround is to consistently enforce has-exactly-one constraints.

Loading

@bigluck
Copy link

@bigluck bigluck commented Oct 23, 2018

Hi everybody,
I'm having the same problem by using AWS API Gateway in front of my application.
When the proxed endpoint is configured with the HTTP Proxy integration configuration on, AWS API Gateway transform a valid request like this one:

GET /service?cid=002125b3-26bd-48c4-ac90-9683d804be2d&tm=1532974971748&ua=Mozilla%2F5.0%20(Macintosh%3B%20Intel%20Mac%20OS%20X%2010_13_6)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Chrome%2F67.0.3396.99%20Safari%2F537.36&z=b075dd02-77f8-44bb-a389-866dcc80c9b6 HTTP/1.1

into:

GET /service?cid=002125b3-26bd-48c4-ac90-9683d804be2d&tm=1532974971748&ua=Mozilla/5.0+(Macintosh;+Intel+Mac+OS+X+10_13_6)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/67.0.3396.99+Safari/537.36&z=b075dd02-77f8-44bb-a389-866dcc80c9b6 HTTP/1.1

And Golang parse the ua parameter into Mozilla/5.0 (Macintosh

Here a Go Playground to test it: https://play.golang.org/p/BeKy_UuSyoO

Loading

@FiloSottile FiloSottile removed this from the Go2 milestone Mar 3, 2021
@FiloSottile FiloSottile added this to the Go1.17 milestone Mar 3, 2021
@FiloSottile FiloSottile changed the title net/url: ';' is parsed as a separator in query string proposal: net/url: don't parse ';' as a separator in query string Mar 3, 2021
@FiloSottile
Copy link
Contributor

@FiloSottile FiloSottile commented Mar 3, 2021

It was recently pointed out that this behavior divergence can lead to cache poisoning attacks in reverse proxies that cache based on a subset of query parameters: if Go treats https://example.com/?x;id=1337&id=123 as a request for ID 1337, while the cache treats it as a request for ID 123, the cache can be made to serve the wrong response.

At the risk of repeating myself, I want to point out that relying on parser alignment for security is doomed, so this will break again, and often in subtler ways that we can't fix universally. However, this is probably broken most of the time now, so we should fix it.

I did a quick survey of other popular languages and frameworks.

  • PHP seems to have defaulted to & forever. php/php-src@c34d2b9
  • Rails (via Rack) currently supports both, but landed a change in January following the Snyk blog post. Unclear when it will reach a release. rack/rack#1732
  • Python currently supports both, but will switch in 3.10 and assigned it a CVE. python/cpython#24297

A more interesting question would be how caching proxies behave, but a lot of them are ad-hoc and hard to test. Snyk seems to think at least Varnish uses only &. As pointed out in #25192 (comment), AWS is not only ignoring ;, but it's unescaping it while proxying.

I think the W3C and WHATWG recommendations are sort of red herrings: they are about x-www-form-urlencoded, not URL queries. On the other hand, the entire reason anyone supported ; is that it was more convenient to use for x-www-form-urlencoded forms, as it did not require escaping & in HTML.

Given Rails and Python are waiting for the next scheduled release to fix this, we will too, and ship the fix in Go 1.17. It would be too harsh of a behavior change for too vague of a security benefit to land in a security release anyway.

This is technically a backwards compatibility promise violation, but I am invoking the security exception. More broadly, we should think whether the thing we want to provide compatibility with is "the exact self-contained behavior of the library when it was introduced" or "correctly interoperate with the current ecosystem". Realistically, we're already doing more of the latter with crypto/tls and crypto/x509, because doing the exact same thing as 5 years ago would be useless and insecure. HTTP and web specifications are similarly living and evolving, and we should figure out what our policy about that is.

Loading

@rsc
Copy link
Contributor

@rsc rsc commented Mar 3, 2021

It seems wrong that every single HTTP server implementation in the world is being charged with coping with a decision made by a few proxies. If the proxy doesn't think ; is special, then it could easily enforce that view by escaping the ; as it passes the request through. And if it did think ; was special, it could enforce that by rewriting to &. Then just a few proxies would need changing instead of all the HTTP server implementations in the world.

The fact that we're having this discussion and not the nginx developers is utterly backward.
(And AWS unescaping the ; is even more backward.)

Anyway, that rant aside, if we do make this change, can we find some way to make it not a silent change? I worry about people updating to Go 1.17 and having their servers misbehave but with no idea why. Maybe a logging print for each request that ignores ; as separator, limited to one per second or something like that?

It also seems like a hard requirement to be able to opt back in to the old behavior. Users with client apps sending ; that don't have the misfortune of being behind nginx will not like being forced to rewrite and redeploy their apps just to get Go 1.17.

Loading

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals Mar 10, 2021
@rsc
Copy link
Contributor

@rsc rsc commented Mar 10, 2021

Adding to the minutes. It's getting a bit late to make a change in Go 1.17 so we should try to converge on a plan soon.

@bigluck, if you still use the AWS API Gateway, does it still unescape semicolons? Or does anyone else know?

Loading

@rsc
Copy link
Contributor

@rsc rsc commented Mar 10, 2021

@ianlancetaylor found private mail saying that https://angular.io/api/common/http/HttpUrlEncodingCodec did not escape semicolons at least in 2019, leading to confusion as well. Does anyone know if that is still the case?

And does anyone know of any clients that do use semicolons for separators and would break if we stopped recognizing them?

Loading

@rsc rsc moved this from Incoming to Active in Proposals Mar 10, 2021
@rsc
Copy link
Contributor

@rsc rsc commented Mar 10, 2021

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

Loading

@antichris
Copy link

@antichris antichris commented Apr 30, 2021

[Opinionated rant]

I agree that Go should be fit for use in as many ecosystems as possible. And you're right about that part of the sentence you quoted dripping in my resentment towards those who work tirelessly on reducing the signal-to-noise ratio. There is nothing more I want for code monkeys than to be finally able to graduate to engineers.

I'm really not sure about the validity of that ecosystem consistency argument though: if you were residing in a region where the rest of the population consistently painted their faces with human excrement, would you adopt this practice just for the sake of consistency? It's an extreme example, but reflects how I feel about the ecosystem consistency debate in this context where there is no benefit, only drawbacks, to the approach that everyone sticks to, but they do it just because everyone else is doing it, in a runaway feedback loop.

Semicolons as separators reduce HTML payload generation and parsing complexity and resource use in terms of processing time, memory and bandwidth consumption in humans and machines alike. But I suppose you could say that's a microoptimization that no one should care about.

Most importantly, you're right — it somehow entirely got lost on me that what we're discussing calls for changes in url, which is a static package, and I can't think of a way to fix that without a massive redesign of huge chunks of the standard library.

Loading

@andrius4669
Copy link
Contributor

@andrius4669 andrius4669 commented Apr 30, 2021

somehow people are forgetting that it's often not necessary at all to escape & in case of url parameter usage, because the usual &x=y has very little chance to match &<something>; sequence for character references, unless parameter key contains ; AND character reference is actually meaningful AND it's not escaped for some reason (url.QueryEscape does escape ;).

edit: apparently there are some legacy character references which don't need to end with ; (reference), so I was too optimistic about this. it's quite a minefield.

edit2: except they only apply when I use them in usual text, but when I use them in <a href=" on firefox they get parsed as parameters??? this seems actually "fine" but quite complicated indeed.

Loading

@rsc
Copy link
Contributor

@rsc rsc commented May 5, 2021

@AGWA's handler converter is nice for opt-in use. I filed #45973 for that.

Loading

@rsc rsc moved this from Likely Accept to Accepted in Proposals May 5, 2021
@rsc
Copy link
Contributor

@rsc rsc commented May 5, 2021

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

Loading

@rsc rsc changed the title proposal: net/url: don't parse ';' as a separator in query string net/url: don't parse ';' as a separator in query string May 5, 2021
@rsc rsc removed this from the Proposal milestone May 5, 2021
@rsc rsc added this to the Backlog milestone May 5, 2021
@FiloSottile FiloSottile removed this from the Backlog milestone Jun 7, 2021
@FiloSottile FiloSottile added this to the Go1.17 milestone Jun 7, 2021
@FiloSottile FiloSottile changed the title net/url: don't parse ';' as a separator in query string net/url: don't parse ';' as a separator in query string [freeze exception] Jun 7, 2021
@FiloSottile
Copy link
Contributor

@FiloSottile FiloSottile commented Jun 7, 2021

I'd like to request a freeze exception for this fix. We deemed it not a backportable security fix (also because of the potential for disruption and the limited real-world impact), but it's conceivable that it could have security value for some applications, so we'd like not to wait for Go 1.18.

Russ, should we also land AllowQuerySemicolons (#45973) during the freeze? I'm leaning towards yes.

/cc @golang/release @rsc

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Jun 7, 2021

Change https://golang.org/cl/325697 mentions this issue: net/url: reject query values with semicolons

Loading

@dmitshur
Copy link
Contributor

@dmitshur dmitshur commented Jun 7, 2021

Thank you for making this freeze exception request. It is approved.

Loading

@gopherbot gopherbot closed this in e6dda19 Jun 9, 2021
zmb3 added a commit to gravitational/teleport that referenced this issue Sep 3, 2021
Go 1.17 (by default) disallows using ; as a separator. Our tests fail
on Go 1.17 with the following error. Since we only do this in test
code, it's easiest just to switch to & so we don't have a breakage
when we start compiling with 1.17.

See also: golang/go#25192

Error:

    2021/09/03 10:21:57 http: URL query contains semicolon, which is no longer a supported separator; parts of the query may be stripped when parsed; see golang.org/issue/25192

    ----------------------------------------------------------------------
    FAIL: apiserver_test.go:473: WebSuite.TestSAMLSuccess

    apiserver_test.go:524:
        c.Assert(u.Scheme+"://"+u.Host+u.Path, Equals, fixtures.SAMLOktaSSO)
    ... obtained string = ":///web/msg/error/login"
    ... expected string = "https://dev-813354.oktapreview.com/app/gravitationaldev813354_teleportsaml_1/exkafftca6RqPVgyZ0h7/sso/saml"
zmb3 added a commit to gravitational/teleport that referenced this issue Sep 7, 2021
Go 1.17 (by default) disallows using ; as a separator. Our tests fail
on Go 1.17 with the following error. Since we only do this in test
code, it's easiest just to switch to & so we don't have a breakage
when we start compiling with 1.17.

See also: golang/go#25192

Error:

    2021/09/03 10:21:57 http: URL query contains semicolon, which is no longer a supported separator; parts of the query may be stripped when parsed; see golang.org/issue/25192

    ----------------------------------------------------------------------
    FAIL: apiserver_test.go:473: WebSuite.TestSAMLSuccess

    apiserver_test.go:524:
        c.Assert(u.Scheme+"://"+u.Host+u.Path, Equals, fixtures.SAMLOktaSSO)
    ... obtained string = ":///web/msg/error/login"
    ... expected string = "https://dev-813354.oktapreview.com/app/gravitationaldev813354_teleportsaml_1/exkafftca6RqPVgyZ0h7/sso/saml"
zmb3 added a commit to gravitational/teleport that referenced this issue Sep 23, 2021
Go 1.17 (by default) disallows using ; as a separator. Our tests fail
on Go 1.17 with the following error. Since we only do this in test
code, it's easiest just to switch to & so we don't have a breakage
when we start compiling with 1.17.

See also: golang/go#25192

Error:

    2021/09/03 10:21:57 http: URL query contains semicolon, which is no longer a supported separator; parts of the query may be stripped when parsed; see golang.org/issue/25192

    ----------------------------------------------------------------------
    FAIL: apiserver_test.go:473: WebSuite.TestSAMLSuccess

    apiserver_test.go:524:
        c.Assert(u.Scheme+"://"+u.Host+u.Path, Equals, fixtures.SAMLOktaSSO)
    ... obtained string = ":///web/msg/error/login"
    ... expected string = "https://dev-813354.oktapreview.com/app/gravitationaldev813354_teleportsaml_1/exkafftca6RqPVgyZ0h7/sso/saml"
michaelmcallister added a commit to gravitational/teleport that referenced this issue Sep 27, 2021
Go 1.17 (by default) disallows using ; as a separator. Our tests fail
on Go 1.17 with the following error. Since we only do this in test
code, it's easiest just to switch to & so we don't have a breakage
when we start compiling with 1.17.

See also: golang/go#25192

Error:

    2021/09/03 10:21:57 http: URL query contains semicolon, which is no longer a supported separator; parts of the query may be stripped when parsed; see golang.org/issue/25192

    ----------------------------------------------------------------------
    FAIL: apiserver_test.go:473: WebSuite.TestSAMLSuccess

    apiserver_test.go:524:
        c.Assert(u.Scheme+"://"+u.Host+u.Path, Equals, fixtures.SAMLOktaSSO)
    ... obtained string = ":///web/msg/error/login"
    ... expected string = "https://dev-813354.oktapreview.com/app/gravitationaldev813354_teleportsaml_1/exkafftca6RqPVgyZ0h7/sso/saml"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Proposals
Accepted
Linked pull requests

Successfully merging a pull request may close this issue.

None yet