Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add escape json #204

Merged
merged 7 commits into from May 8, 2017
Merged

Add escape json #204

merged 7 commits into from May 8, 2017

Conversation

@g-k
Copy link
Contributor

g-k commented Apr 4, 2017

Change:

  • Add an escapeJson function

The first commit message describes its behavior (basically the same as golang's json marshal):

Unicode escapes <, >, & as \u003c, \u003e, \u0026 "to keep some browsers from misinterpreting JSON output as HTML."

https://golang.org/src/encoding/json/encode.go?s=6456:6499#L48

Also, unicode escapes line and paragraph separators as \u2028, \u2029 "so that the JSON will be safe to embed inside HTML <script> tags."

https://golang.org/src/encoding/json/encode.go?s=6456:6499#L184
http://timelessrepo.com/json-isnt-a-javascript-subset

Note: hex or unicode escaping all unsafe char codes is an option too (implemented in the first commit) but probably incurs a larger performance hit.

My end goal here is to write middleware to escape all JSON from an API or make it an option in hapi core if there's interest.

g-k added 2 commits Apr 4, 2017
Unicode escapes <, >, & as \u003c, \u003e, \u0026 "to keep some browsers
from misinterpreting JSON output as HTML."

https://golang.org/src/encoding/json/encode.go?s=6456:6499#L48

Also, unicode escapes line and paragraph separators as \u2028, \u2029 "so
that the JSON will be safe to embed inside HTML <script> tags."

https://golang.org/src/encoding/json/encode.go?s=6456:6499#L184
http://timelessrepo.com/json-isnt-a-javascript-subset
@hueniverse

This comment has been minimized.

Copy link
Member

hueniverse commented Apr 5, 2017

I'm open to it.

nvm install node fails

refs: nvm-sh/nvm#1349
@g-k

This comment has been minimized.

Copy link
Contributor Author

g-k commented Apr 6, 2017

Cool, I got tests passing.

@nlf nlf self-assigned this Apr 6, 2017
@nlf nlf added the feature label Apr 6, 2017
return '';
}

return input.replace(/</g, '\\u003c')

This comment has been minimized.

Copy link
@DavidTPate

DavidTPate Apr 10, 2017

The current implementation here seems like it could be made quite a bit more performant by reducing the number of calls to String.replace(...) which would search the entire string again.

Something along these lines would be more performant:

return input.replace(/[<>&\u2028\u2029]/g, (match) => {
  if (match === '<') {
    return '\\u003c';
  } else if (...) {
    ...
  }
  // etc
});

This comment has been minimized.

Copy link
@g-k

g-k Apr 10, 2017

Author Contributor

Yeah, that's looking about 3x faster: https://gist.github.com/g-k/c1efc1527cd74305b1e3cd272a49453a

Updated in: cf70d6d

This comment has been minimized.

Copy link
@DavidTPate

DavidTPate Apr 11, 2017

If we wanted to squeak out a tiny bit more performance we could utilize char codes for the comparison as well. It's not a drastic improvement like we saw with the previous switch (only ~5% improvement) and I'm not sure if it fits the style of the rest of the library but figured I'd suggest it and leave it up to you guys.

I put together some benchmarks here to show an example: https://jsperf.com/string-replacement-methods/1

The code would then look something like this:

//static variables
var lessThan = 0x3C;
var greaterThan = 0x3E;
var andSymbol = 0x26;
var lineSeperator = 0x2028;
var paragraphSeparator = 0x2029;

//replace method
var charCode;
twitterJson = twitterJson.replace(/[<>&\u2028\u2029]/g, (match) => {
    charCode = match.charCodeAt(0);

    if (charCode === lessThan) {
        return '\\u003c';
    }
    else if (charCode === greaterThan) {
        return '\\u003e';
    }
    else if (charCode === andSymbol) {
        return '\\u0026';
    }
    else if (charCode === lineSeperator) {
        return '\\u2028';
    }
    return '\\u2029';
});

This comment has been minimized.

Copy link
@nlf

nlf Apr 11, 2017

Member

I got a difference of less than 1% looking at that benchmark, I think where it's at now is pretty solid

This comment has been minimized.

Copy link
@g-k

g-k Apr 11, 2017

Author Contributor

Happy to use either version.

I'm seeing +2% w/ the char codes for Chrome 57.0.2987.133 on OSX 10.11.6.

We could run against a larger corpus and figure out how similar Chrome V8 perf on jsperf.com is too node.js V8 and dump JIT output for IRHydra2, but I doubt there's another 3x win in there.

This comment has been minimized.

Copy link
@DavidTPate

DavidTPate Apr 12, 2017

Agreed on there not being a big gain there. I was on the fence about even mentioning it.

There will be a slightly higher improvement in node land from what I recall, but less than 10% improvement from previous tests with similar changes.

This comment has been minimized.

Copy link
@g-k

g-k Apr 12, 2017

Author Contributor

Cool! Switched to the char code version with a couple changes (var -> const/let, etc.) to make the linter happy. Small improvements add up and nobody saw this version running slower.

g-k added 2 commits Apr 10, 2017
@g-k g-k force-pushed the g-k:add-escape-json branch from 41b3865 to d1231c9 Apr 10, 2017
@nlf

This comment has been minimized.

Copy link
Member

nlf commented Apr 11, 2017

this is looking really great performance-wise, i think all we're really missing here is documentation

g-k added 2 commits Apr 11, 2017
1-10% faster on the twitter.json
@geek

This comment has been minimized.

Copy link
Member

geek commented May 8, 2017

LGTM

Copy link

DavidTPate left a comment

LGTM

@nlf

This comment has been minimized.

Copy link
Member

nlf commented May 8, 2017

I concur

@nlf nlf merged commit 8a2e118 into hapijs:master May 8, 2017
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@g-k g-k deleted the g-k:add-escape-json branch May 9, 2017
@g-k g-k mentioned this pull request May 31, 2017
@nlf nlf added this to the 4.2.0 milestone Jul 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.