Make normalize ignore %2B in query strings #99

tps12 · 2012-12-28T14:01:26Z

In a query string, '+' is reserved as a shorthand for space, so "real"
pluses encoded as %2b should be preserved during normalization:

http://example.com/one%2btwo/calc?q=1%2b2+2%2b3

is normalized as:

http://example.com/one+two/calc?q=1%2B2+2%2B3

Previously this would have been normalized to:

http://example.com/one+two/calc?q=1+2+2+3

making '+' ambiguous.

This fixes #50.
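
To make the ambiguity concrete, here is a small sketch using only Ruby's standard `uri` library rather than this project's code. It assumes the receiving server decodes the query with form-style rules, where a literal '+' means a space; the variable names are illustrative.

```ruby
require "uri"

# Form-style decoding treats a literal "+" as a space and "%2B" as a real plus.
preserved = URI.decode_www_form_component("1%2B2+2%2B3")

# Once %2B has been collapsed to "+", the distinction is lost.
collapsed = URI.decode_www_form_component("1+2+2+3")

puts preserved # => 1+2 2+3
puts collapsed # => 1 2 2 3
```

The two inputs decode to different values, which is why normalization should not rewrite one into the other.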


sporkmonger · 2012-12-28T14:06:45Z

I'm torn. This is a bug I've wanted fixed for ages and I haven't found a good way to fix it myself. But on the other hand, I really don't like the way you've used upper-case vs. lower-case as semantically meaningful, and I simply can't merge this as-is.

tps12 · 2012-12-28T14:15:46Z

Hi, thanks for the response. Case should not be semantically meaningful:

...?q=1%2b2+2%2B3

is normalized to

...?q=1%2B2+2%2B3

for example. Percent encodings are upcased as part of normalization, which I believe is the current/expected behavior.

sporkmonger · 2012-12-28T14:23:07Z

Oh man, total code-read fail on my part. I read leave_encoded.include?(c) ? sequence.upcase : c as something completely different. OK, in that case, I'm much happier with the commit, with one minor quibble. The unencode method should not perform any kind of normalization. So no upcasing. Just leave case as-is.
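
A minimal sketch of the behaviour requested here, assuming a standalone helper: `unencode_sketch` is a hypothetical name and is not the library's implementation. It decodes percent-escapes except those whose decoded character is listed in `leave_encoded`, and it returns preserved sequences exactly as written, without upcasing.

```ruby
# Hypothetical helper for illustration only.
def unencode_sketch(str, leave_encoded = "")
  # Work on raw bytes so that multi-byte escapes can be reassembled afterwards.
  decoded = str.b.gsub(/%\h{2}/) do |sequence|
    char = sequence[1, 2].hex.chr
    # Preserved sequences are returned untouched, including their hex case.
    leave_encoded.b.include?(char.b) ? sequence : char
  end
  decoded.force_encoding(Encoding::UTF_8)
end

mixed_case = unencode_sketch("1%2b2+2%2B3", "+")
tilde      = unencode_sketch("%7e", "+")

puts mixed_case # => 1%2b2+2%2B3
puts tilde      # => ~
```

Keeping case untouched here leaves any upcasing to a later normalization step.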

tps12 · 2012-12-28T14:29:16Z

Awesome, thanks, I'll fix that.

sporkmonger · 2012-12-28T14:31:30Z

Also, I'd like to see tests that include both % and %25 in the same query string as %2B. I like my edge cases. The example "?v=%7E&w=%&x=%25&y=%2B&z=C%CC%A7" should normalize to "?v=~&w=%25&x=%25&y=%2B&z=%C3%87", while "?v=%7E&w=%&x=%25&y=+&z=C%CC%A7" should still normalize to "?v=~&w=%25&x=%25&y=+&z=%C3%87".
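
The two expectations above can be reproduced with a rough, self-contained sketch. `normalize_query_sketch` and `NEEDS_ENCODING` are hypothetical names, the allowed-character set is taken from the RFC 3986 query rule, and NFC composition is an assumption of this sketch; none of this is the library's actual code.

```ruby
# Anything outside the RFC 3986 query characters needs encoding, as does a
# "%" that does not begin the preserved escape %2B / %2b.
NEEDS_ENCODING = /[^A-Za-z0-9\-._~!$&'()*+,;=:@\/?%]|%(?!2[bB])/

def normalize_query_sketch(query)
  # Decode every escape except %2B / %2b, working on raw bytes.
  decoded = query.b.gsub(/%(?!2[bB])\h{2}/) { |seq| seq[1, 2].hex.chr }
  # Compose characters such as "C" + U+0327 into a single code point.
  composed = decoded.force_encoding(Encoding::UTF_8).unicode_normalize(:nfc)
  # Re-encode disallowed characters and stray "%" signs with uppercase hex.
  encoded = composed.gsub(NEEDS_ENCODING) do |char|
    char.bytes.map { |byte| format("%%%02X", byte) }.join
  end
  # Upcase the preserved escape as the final normalization step.
  encoded.gsub("%2b", "%2B")
end

puts normalize_query_sketch("?v=%7E&w=%&x=%25&y=%2B&z=C%CC%A7")
puts normalize_query_sketch("?v=%7E&w=%&x=%25&y=+&z=C%CC%A7")
```

One known limitation of this two-pass sketch: an input such as "%252B" decodes to "%2B" in the first pass and is then mistaken for a preserved escape, so it is not a substitute for the real implementation.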

sporkmonger · 2012-12-28T14:46:30Z

There should probably be a test for any method that takes a leave_encoded parameter that ensures it's behaving correctly around characters that aren't on the list. Currently you're just testing strings that contain a percent-encoded "+" character, but it needs to verify all three character categories are encoded correctly in a single return value. So use something like "%%25~%7E+%2B" as an input. I'd like to see some unit tests of unencode directly and not just tests of methods that happen to call it downstream, since it's part of the public API (unlike normalize_component, which I don't expect anyone to ever call directly).
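
As a sketch of what such a direct check might look like, the snippet below runs the suggested input through a hypothetical stand-in for unencode (`unencode_sketch`, defined here only so the example is self-contained) and compares results with and without a leave_encoded list. The expected strings follow from this stand-in, not from the library's test suite.

```ruby
# Hypothetical stand-in: decodes escapes unless the decoded character is in
# leave_encoded; a bare "%" is not a valid escape and passes through.
def unencode_sketch(str, leave_encoded = "")
  decoded = str.b.gsub(/%\h{2}/) do |sequence|
    char = sequence[1, 2].hex.chr
    leave_encoded.b.include?(char.b) ? sequence : char
  end
  decoded.force_encoding(Encoding::UTF_8)
end

input = "%%25~%7E+%2B"

# Each case pairs a leave_encoded list with the expected return value.
cases = {
  ""  => "%%~~++",   # everything decodable is decoded
  "+" => "%%~~+%2B"  # only the encoded plus is left alone
}

cases.each do |leave_encoded, expected|
  actual = unencode_sketch(input, leave_encoded)
  raise "expected #{expected.inspect}, got #{actual.inspect}" unless actual == expected
end
puts "all cases pass"
```

The single input covers an invalid bare "%", escapes that should be decoded, and an escape that should be preserved, so one return value exercises all three categories.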

tps12 · 2012-12-28T14:58:29Z

Awesome, will add those.

Instead of upcasing leave_encoded characters inside the unencode call, leave them as they are and pass the list on to encode_component for upcasing. This encapsulation keeps unencode free of any normalization logic. Also added some more test cases around leave_encoded handling.

tps12 · 2012-12-28T16:46:21Z

I moved the upcasing out of unencode and added some test cases.

sporkmonger · 2012-12-28T17:17:08Z

LGTM.

Make normalize ignore %2B in query strings

sporkmonger closed this Dec 28, 2012

sporkmonger reopened this Dec 28, 2012

sporkmonger added a commit that referenced this pull request Dec 28, 2012

Merge pull request #99 from tps12/leave-%2b-in-query

72bf6c0

Make normalize ignore %2B in query strings

sporkmonger merged commit 72bf6c0 into sporkmonger:master Dec 28, 2012

sporkmonger mentioned this pull request Dec 28, 2012

Inconsistent normalization of % escaping #50

Closed

douglara mentioned this pull request Apr 23, 2020

Ignore %2B in normalize #386

Open
