Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: Compact should be consistent with escaping #34070

Open
tmthrgd opened this issue Sep 4, 2019 · 1 comment

Comments

@tmthrgd
Copy link
Contributor

commented Sep 4, 2019

As requested by @mvdan I'm filing an issue to decide on the escaping behaviour of Compact for go1.14.

#30357 was originally filed as Compact escapes U+2028 and U+2029 which increases the output size without being documented. The fix for that issue was documentation (CL 173417) that turned out to be incorrect and was reverted by CL 188717.

Compact currently escapes U+2028 and U+2029, but it doesn't escape <, > or &. Compact is thus only performing a subset of HTML escaping and it's output is not safe to embed inside HTML <script> tags as CL 173417 had documented.

If I understand correctly, the escaping of U+2028 and U+2029 was added not for HTML but rather for JavaScript interpreters that might consume a directly embedded JSON literal.

The options seem to be:

  • Stop performing any escaping in Compact as the original issue requested. The escaping of escaping U+2028 and U+2029 was added in CL 10883045, and reading over the comments, it does seem like it was intentional to have Compact escape U+2028 and U+2029.

  • Have Compact properly perform HTML escaping, as HTMLEscape, does and be documented as such. This is likely the safest option, as the only consequence to overly aggressive escaping is larger encoded JSON.

  • Document that Compact escapes U+2028 and U+2029, but not <, > or & and is thus not safe to embed in HTML without separately calling HTMLEscape.

  • Do nothing~

/cc @PhilipBorgesen @rsc @bradfitz

@PhilipBorgesen

This comment has been minimized.

Copy link
Contributor

commented Sep 4, 2019

I think each of the options has merit, except doing nothing. In #30357 I complained that Compact behaved surprisingly, doing more than what was documented, and returning a less compact result in contrast to what the function name implied. Doing nothing means that issue persists.

As for the three other options I think:

  • Making Compact do proper HTML escaping may be the safest.
  • Making Compact do no escaping is the most useful. Far from all JSON is intended consumption by browsers, and depending on the application, character escaping may be seen as a corruption. Users wanting compacted and “HTML escaped” JSON can always apply both of Compact and HTMLEscape on their data, which is what the current documentation implies one should do. I note that this behavior was what the Go 1 compatibility guarantee originally promised going forward.
  • Simply documenting the current behavior is the lowest effort solution. It is also the best bet that no programs will break, which both of the above options in practice could, although unlikely.

@katiehockman katiehockman added this to the Go1.14 milestone Sep 4, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.