Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: Compact should be consistent with escaping #34070

Closed
tmthrgd opened this issue Sep 4, 2019 · 3 comments
Closed

encoding/json: Compact should be consistent with escaping #34070

tmthrgd opened this issue Sep 4, 2019 · 3 comments
Milestone

Comments

@tmthrgd
Copy link
Contributor

@tmthrgd tmthrgd commented Sep 4, 2019

As requested by @mvdan I'm filing an issue to decide on the escaping behaviour of Compact for go1.14.

#30357 was originally filed as Compact escapes U+2028 and U+2029 which increases the output size without being documented. The fix for that issue was documentation (CL 173417) that turned out to be incorrect and was reverted by CL 188717.

Compact currently escapes U+2028 and U+2029, but it doesn't escape <, > or &. Compact is thus only performing a subset of HTML escaping and it's output is not safe to embed inside HTML <script> tags as CL 173417 had documented.

If I understand correctly, the escaping of U+2028 and U+2029 was added not for HTML but rather for JavaScript interpreters that might consume a directly embedded JSON literal.

The options seem to be:

  • Stop performing any escaping in Compact as the original issue requested. The escaping of escaping U+2028 and U+2029 was added in CL 10883045, and reading over the comments, it does seem like it was intentional to have Compact escape U+2028 and U+2029.

  • Have Compact properly perform HTML escaping, as HTMLEscape, does and be documented as such. This is likely the safest option, as the only consequence to overly aggressive escaping is larger encoded JSON.

  • Document that Compact escapes U+2028 and U+2029, but not <, > or & and is thus not safe to embed in HTML without separately calling HTMLEscape.

  • Do nothing~

/cc @PhilipBorgesen @rsc @bradfitz

@PhilipBorgesen

This comment has been minimized.

Copy link
Contributor

@PhilipBorgesen PhilipBorgesen commented Sep 4, 2019

I think each of the options has merit, except doing nothing. In #30357 I complained that Compact behaved surprisingly, doing more than what was documented, and returning a less compact result in contrast to what the function name implied. Doing nothing means that issue persists.

As for the three other options I think:

  • Making Compact do proper HTML escaping may be the safest.
  • Making Compact do no escaping is the most useful. Far from all JSON is intended consumption by browsers, and depending on the application, character escaping may be seen as a corruption. Users wanting compacted and “HTML escaped” JSON can always apply both of Compact and HTMLEscape on their data, which is what the current documentation implies one should do. I note that this behavior was what the Go 1 compatibility guarantee originally promised going forward.
  • Simply documenting the current behavior is the lowest effort solution. It is also the best bet that no programs will break, which both of the above options in practice could, although unlikely.
@katiehockman katiehockman added this to the Go1.14 milestone Sep 4, 2019
@mvdan

This comment has been minimized.

Copy link
Member

@mvdan mvdan commented Sep 22, 2019

I agree that we need to do something here.

I think the best solution is to stop doing any escaping at all. Like others have mentioned in this thread, that's what is documented, HTMLEscape already exists, and Compact has never done any complete form of escaping. Moreover, compacting has nothing to do with escaping to begin with.

I doubt that any programs would break. If any was depending on undocumented behavior, one could argue that their program should have been using HTMLEscape to begin with, to do proper escaping.

My opinion is that we can give this a try in the current cycle for 1.14, and reconsider or revert if a large amount of users complains once it lands in master or in any of the pre-releases. I'm happy to review a CL.

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Oct 10, 2019

Change https://golang.org/cl/200217 mentions this issue: encoding/json: stop escaping U+2028 and U+2029 in Compact

@gopherbot gopherbot closed this in 900ebcf Oct 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.