Forward slashes should not be escaped #110

ktkonrad opened this Issue Oct 15, 2013 · 7 comments





The JSON spec says that forward slashes may be escaped or not escaped. Unless there is a compelling reason to escape them, it would be preferable to follow the behavior of Python's json module and not escape them. The following surprised me:

>>> import json
>>> import ujson
>>> json.dumps('/')
>>> ujson.dumps('/')
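
For reference, the standard-library half of this comparison can be reproduced without ujson installed. This is a minimal sketch that uses only `json`, so it says nothing about what ujson itself returns:

```python
import json

# Standard-library encoder: the forward slash is emitted as-is.
encoded = json.dumps('/')
print(encoded)  # "/"

# Decoding gives back the original one-character string.
assert json.loads(encoded) == '/'
```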

It appears the reason this is optional in the JSON spec is to allow embedding JSON in HTML. See

ESN Social Software member

I agree. Not only is it rather confusing, but the output is also not compatible with all third-party components.

Keeping it as an option would be fine, though.

ESN Social Software member

@jskorpan, The JSON spec says that forward slashes can appear either escaped or not escaped. Not escaping forward slashes does not violate the spec. Because this is supposed to be a "drop in replacement" for other Python JSON libraries I would expect it to have the same behavior in this case.

Can you point me to earlier issues about ujson not escaping forward slashes? I could only find #62, which reports the same issue I am seeing, that the behavior is different than Python's json module.

I'd like to propose that forward slashes not be escaped by default but are escaped when encode_html_chars=True is specified. I will submit a pull request in the next day or two.
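
To make the proposal concrete, here is a rough sketch of the intended behavior written on top of the standard `json` module. The `dumps` wrapper below is hypothetical and is not ujson's implementation; it only borrows the `encode_html_chars` name from the proposal and handles forward slashes, nothing else:

```python
import json

def dumps(obj, encode_html_chars=False):
    """Hypothetical wrapper: escape '/' only when HTML-safe output is requested."""
    text = json.dumps(obj)
    if encode_html_chars:
        # In serialized JSON a '/' can only occur inside a string,
        # so a plain replace is enough for this sketch.
        text = text.replace('/', '\\/')
    return text

print(dumps('a/b'))                          # "a/b"
print(dumps('a/b', encode_html_chars=True))  # "a\/b"
```

Either output decodes to the same string, so the default can match Python's json module without losing the HTML-embedding use case.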

ESN Social Software member

I appreciate your commitment to this issue, but the spec is very clear that forward slashes should be escaped.

What other specs are you referring to?


Spec on is not precise. It says:

A string is any unicode character except " or \ or 'control character'.

It then lists examples of escapes, which are a mix of escaped control characters, escaped ", escaped \, escaped four-hex-digit Unicode, and escaped /.

This is a more precise spec: . It says:

The representation of strings is similar to conventions used in the C
family of programming languages. A string begins and ends with
quotation marks. All Unicode characters may be placed within the
quotation marks except for the characters that must be escaped:
quotation mark, reverse solidus, and the control characters (U+0000
through U+001F).

Note: the phrase "control characters" is defined as the ASCII characters between 00h and 1Fh. In other words, it includes backspace, form feed, and newline, but stops before 20h (space). Forward slash (2Fh) isn't in the range of "control characters".

It seems a JSON encoder can choose to escape any character, but it must escape certain ones. / falls in the "choose to escape" category, not in "must escape".
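
The "must escape" set from the quoted passage is small enough to write down as a predicate. This is only a sketch of that rule; `must_escape` is a name made up for illustration:

```python
def must_escape(ch):
    """True if the character cannot appear literally inside a JSON string."""
    # Quotation mark, reverse solidus, and control characters U+0000..U+001F.
    return ch == '"' or ch == '\\' or ord(ch) <= 0x1F

for ch in ['"', '\\', '\n', '\x1f', ' ', '/']:
    print(repr(ch), must_escape(ch))
```

The forward slash comes out as `False`, alongside the space character.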


Regardless of the interpretation of the spec, it seems to me that it is desirable to have an option to change this behavior, much like encode_html_chars, simply because different JSON implementations apparently do this differently.

Working with json and simplejson on the Python side, as well as several APIs based on JSON, I have not seen people or JSON implementations escaping forward slashes. This does not mean that it's the right or wrong approach, but it would seem that in the wild, at least, there are different interpretations of it. However, it can become a source of confusion when dealing with third parties.

How much would it impact the speed of ujson if we add that extra parameter for this? Maybe even as a global setting, if that helps to optimize for both cases.


@nickva, you are correct, the spec says that a forward slash may appear escaped or unescaped and the meaning is the same. (As opposed to 't' which may also appear escaped or unescaped but with different meanings)
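
A quick check of that distinction with the standard-library decoder (a sketch using only `json`):

```python
import json

# '\/' and '/' decode to the same string...
assert json.loads('"\\/"') == json.loads('"/"') == '/'

# ...while '\t' and 't' do not: the escape denotes a tab character.
assert json.loads('"\\t"') == '\t'
assert json.loads('"t"') == 't'
```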

@trbs, I have not done a speed comparison but based on the change I made (#114) I would expect no measurable difference. I essentially added a single branch for each forward slash in the string. Doing some timings with long strings of only forward slashes should give an idea of the worst case difference if you want to get some numbers.
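
Something along these lines could produce those numbers. It is only a sketch: `time_dumps`, the string length, and the repeat count are arbitrary choices of mine, and ujson is timed only if it happens to be installed.

```python
import json
import timeit

def time_dumps(dumps, payload, number=200):
    """Return total seconds spent on `number` calls of dumps(payload)."""
    return timeit.timeit(lambda: dumps(payload), number=number)

# Worst case for a per-slash branch versus a slash-free string of equal length.
slashes = '/' * 100_000
letters = 'a' * 100_000

encoders = {'json': json.dumps}
try:
    import ujson  # optional; only timed if installed
    encoders['ujson'] = ujson.dumps
except ImportError:
    pass

for name, dumps in encoders.items():
    print(name,
          'slashes:', time_dumps(dumps, slashes),
          'letters:', time_dumps(dumps, letters))
```

Running it before and after the change, on the same machine, would show whether the extra branch is measurable.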

@jskorpan, hopefully @nickva's answer helps to clarify what the spec says about escaping forward slashes. It seems there are a significant number of people who were also surprised by the different behavior of ujson as compared to Python's json module and most other JSON libraries. I agree it should be left as an option.

That said, I would expect the default value of the option to conform to Python's json module. I also like the idea of using the existing encode_html_chars option for this purpose, as the primary use case for escaping forward slashes is embedding JSON or JavaScript in HTML. If you would prefer it to be a separate option, it can be changed.

jskorpan closed this Apr 11, 2014