Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forward slashes should not be escaped #110

Closed
ktkonrad opened this issue Oct 15, 2013 · 8 comments · Fixed by demisto/demisto-sdk#2005
Closed

Forward slashes should not be escaped #110

ktkonrad opened this issue Oct 15, 2013 · 8 comments · Fixed by demisto/demisto-sdk#2005

Comments

@ktkonrad
Copy link

The JSON spec says that forward slashes may be escaped or not escaped. Unless there is a compelling reason to escape them it would be preferable to follow the behavior of Python's json module and not escape them. The following surprised me:

>>> import json
>>> import ujson
>>> json.dumps('/')
'"/"'
>>> ujson.dumps('/')
'"\\/"'

It appears the reason this is optional in the JSON spec is to allow embedding JSON in HTML. See http://stackoverflow.com/questions/1580647/json-why-are-forward-slashes-escaped/1580664#1580664.

@ronniekk
Copy link

I agree. Not only is it rather confusing but also the output is not compliant with all third-party components.

Keep it as an option would be fine though.

@jskorpan
Copy link

The JSON spec is the JSON spec. Had issues reported earlier about ujson not escaping front end slashes.

In a case like this I think it is better to stay with the spec

On 15 okt 2013, at 08:49, "ronniekk" <notifications@github.commailto:notifications@github.com> wrote:

I agree. Not only is it rather confusing but also the output is not compliant with all third-party components.

Keep it as an option would be fine though.


Reply to this email directly or view it on GitHubhttps://github.com//issues/110#issuecomment-26312508.

@ktkonrad
Copy link
Author

@jskorpan, The JSON spec says that forward slashes can appear either escaped or not escaped. Not escaping forward slashes does not violate the spec. Because this is supposed to be a "drop in replacement" for other Python JSON libraries I would expect it to have the same behavior in this case.

Can you point me to earlier issues about ujson not escaping forward slashes? I could only find #62, which reports the same issue I am seeing, that the behavior is different than Python's json module.

I'd like to propose that forward slashes not be escaped by default but are escaped when encode_html_chars=True is specified. I will submit a pull request in the next day or two.

@jskorpan
Copy link

I embrace your commitment to this issue, but the JSON.org spec is very clear on that forward slashes should be escaped. http://json.org/

What other specs are you referring to?

@nickva
Copy link
Contributor

nickva commented Oct 18, 2013

Spec on http://json.org/ is not precise. It says:

A string is any unicode character except " or \ or 'control character'.

Then it lists examples of escapes which is a mix of: escaped control characters, escape ", escaped , escaped 4 hex digit unicode and escaped / .

This is a more precise spec: http://www.ietf.org/rfc/rfc4627.txt . It says:

The representation of strings is similar to conventions used in the C
family of programming languages. A string begins and ends with
quotation marks. All Unicode characters may be placed within the
quotation marks except for the characters that must be escaped:
quotation mark, reverse solidus, and the control characters (U+0000
through U+001F).

Note: "control characters" phrase is defined as ASCII characters between 00h and 1fh. In other words, it includes backspace, formfeed, newline, but stops before 20h (space). Forward slash isn't in the range of "control characters".

It seems a JSON encoder can choose to escape any character, it must escape certain ones. / falls in "choose to escape" section not in "must escape".

@trbs
Copy link

trbs commented Oct 18, 2013

Regardless of the interpretation of the spec. It seems to me that it is desirable to have an option to change this behavior much like the encode_html_chars. Just because different implementations of JSON do this differently apparently.

Working with json and simplejson on the Python side as well as several api's based on JSON I have not seen people or JSON implementation escaping the forward slashes. This does not mean that it's the right or wrong approach.. but it would seem that in the wild as least there different interpretations of it. However it can become a source of confusion when dealing with third-parties.

How much would it impact the speed of ujson if we add that extra parameter for this ? maybe even as a global setting if that helps to optimize for both cases.

@ktkonrad
Copy link
Author

@nickva, you are correct, the spec says that a forward slash may appear escaped or unescaped and the meaning is the same. (As opposed to 't' which may also appear escaped or unescaped but with different meanings)

@trbs, I have not done a speed comparison but based on the change I made (#114) I would expect no measurable difference. I essentially added a single branch for each forward slash in the string. Doing some timings with long strings of only forward slashes should give an idea of the worst case difference if you want to get some numbers.

@jskorpan, hopefully @nickva's answer helps to clarify what the spec says about escaping forward slashes. It seems there are a significant number of people who were also surprised by the different behavior of ujson as compared to Python's json module and most other JSON libraries. I agree it should be left as an option.

That said, I would expect the default value of the option to conform to Python's json module. I also like the idea of using the existing escape_html_chars option for this purpose as the primary use case for escaping forward quotes is when embedding JSON or JavaScript in HTML. If you would prefer it to be a separate option it can be changed.

@radzhome
Copy link

radzhome commented Oct 23, 2018

So now as per the doc, you can use escape_forward_slashes.

Controls whether forward slashes (/) are escaped. Default is True:

>>> ujson.dumps("http://esn.me")
'"http:\/\/esn.me"'
>>> ujson.dumps("http://esn.me", escape_forward_slashes=False)
'"http://esn.me"'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants