AcitveSupport::JSON::Encoding.escape implementation doesn't clean and escape properly #7764

rkh opened this Issue Sep 26, 2012 · 8 comments


None yet
5 participants

rkh commented Sep 26, 2012


        def escape(string)
          string = string.encode(::Encoding::UTF_8, :undef => :replace).force_encoding(::Encoding::BINARY)

This line does not work. Due to a bug in ruby, unless you change a strings encoding, the undef is ignored. You need to first change the encoding to something else. You also want to pass in :invalid => :replace, :replace => "".

And then it still doesn't work on JRuby and Rubinius.

          json = string.
            gsub(escape_regex) { |s| ESCAPED_CHARS[s] }.
                   [\xF0-\xF7][\x80-\xBF]{3})+/nx) { |s|
            s.unpack("U*").pack("n*").unpack("H*")[0].gsub(/.{4}/n, '\\\\u\&')

This assumes codepoints have 4 bytes top. Codepoints can have 5 or 6 bytes. You could even inject the overlong form, for instance of ", and thus totally mess with the parser.

Since you already assume you're on a Ruby with encoding support, you could use codepoints or each_codepoint instead, which should deal with overlong versions.

Also, why the strange indentation? This makes reading it really hard.

          json = %("#{json}")

jaggederest commented Oct 21, 2012

Could you compose a tiny test for me? I want to fix this, and I think I have a reasonable solution, but I'd like to check it against a test written by someone else rather than being solipsistic.


steveklabnik commented Nov 18, 2012

Hey @rkh, do you have a test for @jaggederest ?


rkh commented Nov 18, 2012

I still have the tab open, believe it or not. Sorry, so much else to do.


steveklabnik commented Nov 18, 2012

Ha! No worries. ❤️ just going through old issues.


JonRowe commented Mar 31, 2013

Did this ever get a test written for it?


rkh commented Apr 2, 2013

Not yet, no, sorry.


JonRowe commented Jun 13, 2013

Did anyone ever manage to create test data for this?


chancancode commented Dec 6, 2013

Hi everyone! Encoding.escape has been removed in master, so I'm closing this. However, if you have a test case for this, please help test against the new encoder to see if this is still a problem! Thanks guys! ❤️ 💚 💙 💛 💜

chancancode closed this Dec 6, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment