Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

Add a decoder that supports ECMA unicode uris #450

Closed
wants to merge 1 commit into from

4 participants

@raggi
Owner

References #360 #323

More detail here:

http://en.wikipedia.org/wiki/Percent-encoding#Non-standard_implementations

I would appreciate reviews, if at all possible, today. This problem affects many people, but the solution may be imperfect.

@chneukirchen @manveru @tenderlove @nurse @nahi

@gioele

Please note that allowing non-standard escaping sequences, you are opening a quite big security hole. Right now firewalls check for common unsafe sequences (e.g., ../, .., ;/) encoded or partially encoded in URLs. With this change you are allowing a new set of such exploitable unsafe sequences.

-1 from me, since this is undocumented in RFC and has actually been rejected by the W3C.

I agree that %-decoding should not crash in these cases.
We probably should not decode cookies.

Could we decode wrong %-encoded strings as is and push this decoding to the application?

@raggi
Owner

@chneukirchen yes, that's what 1.2 used to do (just pass by these silently). The problem is, I'm not sure I can so safely remove the stdlib stuff that @tenderlove persuaded me to accept, before 1.5.

stdlib has been a total and utter fail for us, and I want to move back to our own escape and unescape functions in 1.5 onward.

If someone from core (e.g. @tenderlove ) wants to keep a single canonical standard, they can take ours instead.

@nurse

-1

Such old feature shouldn't be newly added to libraries.
Moreover the patch doesn't consider non BMP Unicode character.

@raggi
Owner

This is rejected. Other issues fixed using other methods. We'll not add an ecma unescape for now.

@raggi raggi closed this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
This page is out of date. Refresh to see the latest.
Showing with 18 additions and 2 deletions.
  1. +13 −2 lib/rack/utils.rb
  2. +5 −0 test/spec_utils.rb
View
15 lib/rack/utils.rb
@@ -39,15 +39,26 @@ def escape_path(s)
# target encoding of the string returned, and it defaults to UTF-8
if defined?(::Encoding)
def unescape(s, encoding = Encoding::UTF_8)
- URI.decode_www_form_component(s, encoding)
+ URI.decode_www_form_component(unescape_unicode(s), encoding)
end
else
def unescape(s, encoding = nil)
- URI.decode_www_form_component(s, encoding)
+ URI.decode_www_form_component(unescape_unicode(s), encoding)
end
end
module_function :unescape
+ # See:
+ # http://en.wikipedia.org/wiki/Percent-encoding#Non-standard_implementations
+ # Issue 337
+ # Issue 360
+ def unescape_unicode(s)
+ s.gsub(/((?:%u[0-9a-fA-F]{4})+)/n){
+ [$1.delete('%u')].pack('H*').unpack("n*").pack("U*")
+ }
+ end
+ module_function :unescape_unicode
+
DEFAULT_SEP = /[&;] */n
class << self
View
5 test/spec_utils.rb
@@ -101,6 +101,11 @@ def kcodeu
should.equal "q1!2\"'w$5&7/z8)?\\"
end
+ should "unescape non-standard unicode uri escaping (e.g. ECMA-262)" do
+ Rack::Utils.unescape_unicode("%u3042").should.equal ""
+ Rack::Utils.unescape("%u3042").should.equal ""
+ end
+
should "parse query strings correctly" do
Rack::Utils.parse_query("foo=bar").
should.equal "foo" => "bar"
Something went wrong with that request. Please try again.