Return to using stdlib's URL escaping #14

ghost · 2015-08-23T03:27:44Z

Issue 5684 is now closed, so it might be time to revert PR 9.

mna · 2015-08-30T23:36:34Z

Yeah, it is now fixed in Go1.5 but it's unlikely people will all upgrade that fast, so I'd be tempted to leave it in for a while, at least until Go1.6, maybe even another release after that. I think a conditional build tag could be used, I may take a look at that when I have a moment, so I'll leave that open.

Thanks for the heads up!

beeker1121 · 2016-11-15T13:36:31Z

I tried making this change locally (returning u.String() in the NormalizeURL() method and using url.QueryEscape in the sortQuery() method), however quite a bit of the tests fail after doing so.

Should this be expected, and the tests just need to be fixed? I'm just unsure which package (net/url or urlesc) is handling everything correctly, as per the RFC stuff.

Here's the output of a portion of the tests:

purell_test.go:715: running LowerScheme...
purell_test.go:715: running LowerScheme2...
purell_test.go:715: running LowerHost...
purell_test.go:715: running UpperEscapes...
purell_test.go:731: UpperEscapes - FAIL expected 'http://www.whatever.com/Some%AA%20Special%8Ecases/', got 'http://www.whatever.com/Some%aa%20Special%8Ecases/'
purell_test.go:715: running UnnecessaryEscapes...
purell_test.go:731: UnnecessaryEscapes - FAIL expected 'http://www.toto.com/AB.D/23R-/_~', got 'http://www.toto.com/%41%42%2E%44/%32%33%52%2D/%5f%7E'
purell_test.go:715: running RemoveDefaultPort...
purell_test.go:715: running RemoveDefaultPort2...
purell_test.go:715: running RemoveDefaultPort3...
purell_test.go:715: running Safe...
purell_test.go:731: Safe - FAIL expected 'http://www.src.ca/to%1Ato%8B%EE/OKnowABC~', got 'http://www.src.ca/to%1ato%8b%ee/OKnow%41%42%43%7e'
purell_test.go:715: running BothLower...
purell_test.go:731: BothLower - FAIL expected 'http://www.src.ca:80/to%1Ato%8B%EE/OKnowABC~', got 'http://www.src.ca:80/to%1ato%8b%ee/OKnow%41%42%43%7e'
purell_test.go:715: running RemoveTrailingSlash...
purell_test.go:715: running RemoveTrailingSlash2...
purell_test.go:715: running RemoveTrailingSlash3...
purell_test.go:715: running AddTrailingSlash...
purell_test.go:731: AddTrailingSlash - FAIL expected 'http://www.SRC.ca:80/', got 'http://www.SRC.ca:80%2F'

Edit: After researching more into it, it appears that this commit golang/go@874a605 changed how the String() method decides how to encode the Path field. A new RawPath field was introduced in this commit, and this field is used to compare against the Path field. If RawPath is a valid encoding of Path, it simply uses it instead of Path.

One fix that resolves most of the test errors is just setting u.RawPath = u.Path before calling u.String() (if taking urlesc package out of the picture). However, while this solves all of the main tests, the TestDecodeUnnecessaryEscapesAll and TestEncodeNecessaryEscapesAll tests fail.

While the standard String() method gets the majority of the escaping correct after doing the path fix, it still does not handle the !, ', (, ), [, ], and * characters as we would probably want. I believe this is due to Go trying to follow the newest RFC specs. RFC 2396 labels these characters (marks) as unreserved (except [ and ] which most browsers accept, so we include), while RFC 3986 does not allow them anymore.

So, what do you think would be best? AFAIK, most browsers and other clients will still allow the unreserved characters labeled in RFC 2396, which the current urlesc package handles escaping correctly. Maybe it's best to keep using that package and forget about the standard lib for now?

mna · 2016-11-16T18:21:48Z

Yeah, I did give it a shot a while ago and ran into those issues but I didn't dig deeper at that moment, sorry I should've documented it a bit here. Thanks for looking into it, I think that regardless of what browsers support, the change should try to maintain behaviour with what purell does now.

If it's down to only those few characters, maybe it could be done in purell itself, modifying the *url.URL struct directly, so we could get rid of the external dep.

AlekSi · 2017-08-10T06:44:14Z

Would like to see the latest comment implemented.

mna · 2017-08-10T14:36:41Z

Me too! I'd be happy to accept a PR that implements this and makes all tests pass.

mna mentioned this issue Jun 27, 2016

Add FlagRemoveQuery (strips entire query part) #16

Closed

mna mentioned this issue Oct 22, 2016

Return same url.URL struct instead of string #18

Closed

natasha-jarus linked a pull request Mar 19, 2024 that will close this issue

Use net/url .String() #40

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Return to using stdlib's URL escaping #14

Return to using stdlib's URL escaping #14

ghost commented Aug 23, 2015

mna commented Aug 30, 2015

beeker1121 commented Nov 15, 2016 •

edited

mna commented Nov 16, 2016

AlekSi commented Aug 10, 2017

mna commented Aug 10, 2017

Return to using stdlib's URL escaping #14

Return to using stdlib's URL escaping #14

Comments

ghost commented Aug 23, 2015

mna commented Aug 30, 2015

beeker1121 commented Nov 15, 2016 • edited

mna commented Nov 16, 2016

AlekSi commented Aug 10, 2017

mna commented Aug 10, 2017

beeker1121 commented Nov 15, 2016 •

edited