Escape multibyte line terminators in JSON encoding #10057

Merged
merged 1 commit into from May 10, 2013

Conversation

Projects
None yet
8 participants
Contributor

zackham commented Apr 2, 2013

Currently, json/encoding respects the JSON spec (as it should) which disallows \n and \r inside strings, escaping them as expected.

Unfortunately, ECMA-262 (Javascript) disallows not only \n and \r in strings, but "Line Terminators" which includes U+2028 and U+2029. See here: http://bclary.com/2004/11/07/#a-7.3

This pull request adds U+2028 and U+2029 to be escaped.

Why? 

It's very common to see something like this in a Rails template:

<script type="text/javascript"> 
var posts = <%= @posts.to_json %>;
</script>

If U+2028 or U+2029 are part of any attributes output in the to_json call, you will end up with an exception. In Chrome: Uncaught SyntaxError: Unexpected token ILLEGAL.

In other words, if one of your users pastes something into a textarea that happens to include these fancy unicode line terminators, and you run to_json on that model and stick it in a template, that page is probably broken now.

Why not?

This is JSON encoding, and the JSON spec is specific about how to encode strings. U+2028 and U+2029 don't get special treatment.

That being said, this is non-obvious, counterintuitive, and can be tough to debug (https://www.google.com/?q=u2028).

What do you do in your apps to deal with this? Is there a convention I'm missing?

Escape multibyte line terminators in JSON encoding
Currently, json/encoding respects the JSON spec (as it should) which 
disallows \n and \r inside strings, escaping them as expected.

Unfortunately, ECMA-262 (Javascript) disallows not only \n and \r in 
strings, but "Line Terminators" which includes U+2028 and U+2029. 
See here: http://bclary.com/2004/11/07/#a-7.3

This pull request adds U+2028 and U+2029 to be escaped.

# Why? 

It's very common to see something like this in a Rails template:

<script type="text/javascript"> 
var posts = <%= @posts.to_json %>;
</script>

If U+2028 or U+2029 are part of any attributes output in the to_json
call, you will end up with an exception.
In Chrome: Uncaught SyntaxError: Unexpected token ILLEGAL 

# Why not?

This is JSON encoding, and the JSON spec is specific about how to 
encode strings. U+2028 and U+2029 don't get special treatment.

Just trying to start a discussion... what do you do in your apps
to deal with this? Is there a convention I'm missing?
Contributor

knapo commented Apr 16, 2013

+1

+3

Contributor

cmaruz commented May 9, 2013

The following error is raised in the activesupport's test suite:

/vagrant/patch1/rails/activesupport/lib/active_support/dependencies.rb:228:in `require': /vagrant/patch1/rails/activesupport/lib/active_support/json/encoding.rb:126: too short escaped multibyte character: /\xe2\x80(\xa8|\xa9)|[\x00-\x1F"\\><&]/ (SyntaxError)
too short escaped multibyte character: /\xe2\x80(\xa8|\xa9)|[\x00-\x1F"\\]/
    from /vagrant/patch1/rails/activesupport/lib/active_support/dependencies.rb:228:in `block in require'
    from /vagrant/patch1/rails/activesupport/lib/active_support/dependencies.rb:213:in `load_dependency'
    from /vagrant/patch1/rails/activesupport/lib/active_support/dependencies.rb:228:in `require'
    from /vagrant/patch1/rails/activesupport/lib/active_support/json.rb:2:in `<top (required)>'
    from /vagrant/patch1/rails/activesupport/lib/active_support/dependencies.rb:228:in `require'
    from /vagrant/patch1/rails/activesupport/lib/active_support/dependencies.rb:228:in `block in require'
    from /vagrant/patch1/rails/activesupport/lib/active_support/dependencies.rb:213:in `load_dependency'
    from /vagrant/patch1/rails/activesupport/lib/active_support/dependencies.rb:228:in `require'
    from /vagrant/patch1/rails/activesupport/test/core_ext/duration_test.rb:4:in `<top (required)>'
    from /vagrant/patch1/rails/activesupport/lib/active_support/dependencies.rb:228:in `require'
    from /vagrant/patch1/rails/activesupport/lib/active_support/dependencies.rb:228:in `block in require'
    from /vagrant/patch1/rails/activesupport/lib/active_support/dependencies.rb:213:in `load_dependency'
    from /vagrant/patch1/rails/activesupport/lib/active_support/dependencies.rb:228:in `require'
    from /home/vagrant/.rvm/gems/ruby-2.0.0-p0@global/gems/rake-10.0.4/lib/rake/rake_test_loader.rb:10:in `block (2 levels) in <main>'
    from /home/vagrant/.rvm/gems/ruby-2.0.0-p0@global/gems/rake-10.0.4/lib/rake/rake_test_loader.rb:9:in `each'
    from /home/vagrant/.rvm/gems/ruby-2.0.0-p0@global/gems/rake-10.0.4/lib/rake/rake_test_loader.rb:9:in `block in <main>'
    from /home/vagrant/.rvm/gems/ruby-2.0.0-p0@global/gems/rake-10.0.4/lib/rake/rake_test_loader.rb:4:in `select'
    from /home/vagrant/.rvm/gems/ruby-2.0.0-p0@global/gems/rake-10.0.4/lib/rake/rake_test_loader.rb:4:in `<main>'
rake aborted!

solnic commented May 10, 2013

+9001

rafaelfranca added a commit that referenced this pull request May 10, 2013

Merge pull request #10057 from zackham/patch-1
Escape multibyte line terminators in JSON encoding

rafaelfranca added a commit that referenced this pull request May 10, 2013

Merge branch 'fix-json-encoding'
This is the compination of #10057 and 10534.

Closes #10320

@rafaelfranca rafaelfranca merged commit 9b8ee8e into rails:master May 10, 2013

Contributor

trevorturk commented May 16, 2013

👍 😁

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment