Escape multibyte line terminators in JSON encoding

Currently, json/encoding follows the JSON spec (as it should), which
disallows raw \n and \r inside strings, and escapes them as expected.

Unfortunately, ECMA-262 (JavaScript) disallows not only \n and \r in
string literals but all "Line Terminators", which include U+2028 (LINE
SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR). See the "Line Terminators"
section of ECMA-262.

This pull request adds U+2028 and U+2029 to the set of characters that get escaped.
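
A minimal sketch, using Ruby's standard json library rather than ActiveSupport, of why the escaped form should be harmless for JSON consumers: a document containing raw U+2028 and one containing the \u2028 escape decode to the same string. The variable names are illustrative only.

```ruby
require 'json'

# A JSON document with a raw U+2028 inside a string value.
raw_doc = "{\"body\":\"line one\u2028line two\"}"

# The same document with the character written as a \u escape
# (single quotes keep the backslash literal in Ruby).
escaped_doc = '{"body":"line one\u2028line two"}'

raw_value     = JSON.parse(raw_doc)["body"]
escaped_value = JSON.parse(escaped_doc)["body"]

# Both spellings decode to the same Ruby string.
puts raw_value == escaped_value
```

Only the escaped spelling is also safe to paste into JavaScript source, which is the case this change targets.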

# Why? 

It's very common to see something like this in a Rails template:

<script type="text/javascript">
var posts = <%= @posts.to_json %>;
</script>

If U+2028 or U+2029 appears in any attribute emitted by the to_json
call, the browser throws a syntax error when it parses the script.
In Chrome: Uncaught SyntaxError: Unexpected token ILLEGAL
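
For apps that cannot wait for a change in the encoder, one possible application-level workaround is to post-process the JSON before inlining it. This is only a sketch: `js_safe_json` is a hypothetical helper, not part of Rails, and the `json` literal stands in for the output of a to_json call.

```ruby
require 'json'

# Hypothetical helper (not a Rails API): escape the two ECMA-262 line
# terminators that JSON allows unescaped inside strings.
def js_safe_json(json)
  # The block form avoids any backslash handling in gsub replacement strings.
  json.gsub(/[\u2028\u2029]/) { |char| format('\u%04x', char.ord) }
end

# Stand-in for the output of @posts.to_json, containing a raw U+2028.
json = "[{\"title\":\"Hello\u2028world\"}]"
safe = js_safe_json(json)

puts safe
```

The result is still valid JSON, so it can be inlined in the template in place of the raw to_json output.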

# Why not?

This is JSON encoding, and the JSON spec is specific about how to 
encode strings. U+2028 and U+2029 don't get special treatment.

Just trying to start a discussion... what do you do in your apps
to deal with this? Is there a convention I'm missing?
1 parent 4ae089b commit 9b8ee8e006db581eb34dc0fa1d230653b7a1c956 @zackham committed Apr 2, 2013
Showing with 4 additions and 2 deletions.
  1. +4 −2 activesupport/lib/active_support/json/encoding.rb
@@ -98,6 +98,8 @@ def check_for_circular_references(value)
"\010" => '\b',
"\f" => '\f',
"\n" => '\n',
+ "\xe2\x80\xa8" => '\u2028',
+ "\xe2\x80\xa9" => '\u2029',
"\r" => '\r',
"\t" => '\t',
'"' => '\"',
@@ -121,9 +123,9 @@ class << self
def escape_html_entities_in_json=(value)
self.escape_regex = \
if @escape_html_entities_in_json = value
- /[\x00-\x1F"\\><&]/
+ /\xe2\x80(\xa8|\xa9)|[\x00-\x1F"\\><&]/
else
- /[\x00-\x1F"\\]/
+ /\xe2\x80(\xa8|\xa9)|[\x00-\x1F"\\]/
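
A rough sketch of how an escape table and an escape regex like the ones in this diff can work together. It makes several assumptions: the constant and method names are illustrative, the table is only a small subset, the strings are handled as UTF-8 (the diff matches raw bytes instead), and the regex is derived from the table keys rather than written out by hand.

```ruby
# Illustrative subset of an escape table; single-quoted values keep
# their backslashes literal.
ESCAPED_CHARS = {
  "\n"     => '\n',
  "\r"     => '\r',
  "\t"     => '\t',
  '"'      => '\"',
  '\\'     => '\\\\',
  "\u2028" => '\u2028',
  "\u2029" => '\u2029'
}.freeze

# Build the regex from the table keys so that every match has an entry.
ESCAPE_REGEX = Regexp.union(ESCAPED_CHARS.keys)

# Hypothetical helper: wrap the escaped text in double quotes to form a
# JSON string literal.
def escape_json_string(string)
  '"' + string.gsub(ESCAPE_REGEX) { |char| ESCAPED_CHARS[char] } + '"'
end

puts escape_json_string("first\u2028second")
```

Deriving the regex from the table keeps the two in sync, so adding a character such as U+2028 means touching only one place.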
