well formed unicode string (#147) #151

Merged (3 commits) on Mar 27, 2019

Conversation

srmarjani
Contributor

I changed the $asStringSmall function so that it can detect surrogate characters.
But in surrogate.test.js, if you uncomment lines 50 and 65, the test will fail.
I am using Node 11.12.0.
Is JSON.stringify() up to date, or is what @mathiasbynens described in issue 147 still in the test phase?

Member

@mcollina mcollina left a comment

How much does this impact our benchmarks?

Overall this sounds correct, even if we don’t have the fixed V8 yet.

@srmarjani
Contributor Author

In $asStringSmall there is a single for loop that iterates over the input string. I put the surrogate-detection condition inside this loop, so I think it will have no impact on the benchmarks.
But I have not tested it.

@mcollina
Member

Just run https://github.com/fastify/fast-json-stringify/blob/master/bench.js before and after your change.

@srmarjani
Contributor Author

Before:
FJS creation x 2,556 ops/sec ±8.41% (68 runs sampled)
JSON.stringify array x 1,323 ops/sec ±7.12% (65 runs sampled)
fast-json-stringify array x 2,723 ops/sec ±6.71% (67 runs sampled)
fast-json-stringify-uglified array x 1,729 ops/sec ±14.74% (56 runs sampled)
JSON.stringify long string x 7,548 ops/sec ±5.59% (68 runs sampled)
fast-json-stringify long string x 7,639 ops/sec ±5.20% (71 runs sampled)
fast-json-stringify-uglified long string x 5,762 ops/sec ±9.27% (54 runs sampled)
JSON.stringify short string x 2,427,132 ops/sec ±6.54% (63 runs sampled)
fast-json-stringify short string x 11,023,374 ops/sec ±4.68% (72 runs sampled)
fast-json-stringify-uglified short string x 10,934,208 ops/sec ±4.34% (70 runs sampled)
JSON.stringify obj x 682,282 ops/sec ±9.87% (61 runs sampled)
fast-json-stringify obj x 2,365,212 ops/sec ±7.54% (68 runs sampled)
fast-json-stringify-uglified obj x 2,571,663 ops/sec ±7.17% (69 runs sampled)

@srmarjani
Contributor Author

After:
FJS creation x 613 ops/sec ±14.72% (44 runs sampled)
JSON.stringify array x 455 ops/sec ±14.29% (40 runs sampled)
fast-json-stringify array x 1,506 ops/sec ±6.85% (55 runs sampled)
fast-json-stringify-uglified array x 1,489 ops/sec ±3.84% (65 runs sampled)
JSON.stringify long string x 4,807 ops/sec ±3.92% (64 runs sampled)
fast-json-stringify long string x 5,278 ops/sec ±5.35% (66 runs sampled)
fast-json-stringify-uglified long string x 4,972 ops/sec ±3.71% (63 runs sampled)
JSON.stringify short string x 1,555,898 ops/sec ±4.50% (64 runs sampled)
fast-json-stringify short string x 9,776,310 ops/sec ±5.19% (67 runs sampled)
fast-json-stringify-uglified short string x 9,768,031 ops/sec ±6.04% (64 runs sampled)
JSON.stringify obj x 461,803 ops/sec ±4.94% (60 runs sampled)
fast-json-stringify obj x 1,703,510 ops/sec ±6.36% (52 runs sampled)
fast-json-stringify-uglified obj x 1,420,784 ops/sec ±3.57% (59 runs sampled)

@mcollina
Member

You were likely running something else on your machine, since in the “after” run all the JSON.stringify benchmarks went down as well. Can you rerun?

@@ -236,16 +236,22 @@ function $asString (str) {
// magically escape strings for json
// relying on their charCodeAt
// everything below 32 needs JSON.stringify()
// every string that contain surrogate needs JSON.stringify()

This is not accurate. Every lone surrogate needs escaping. Surrogates that appear in valid surrogate pairs must not be escaped.
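
To make the distinction concrete, here is a minimal sketch of a check that flags only lone surrogates and accepts valid pairs. `hasLoneSurrogate` is a hypothetical helper written for illustration; it is not part of fast-json-stringify or of this PR.

```javascript
// Hypothetical helper (an assumption for illustration, not library code).
// Returns true when the string contains a surrogate code unit that is not
// part of a valid high+low surrogate pair.
function hasLoneSurrogate (str) {
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i)
    if (code >= 0xD800 && code <= 0xDBFF) {
      // High surrogate: well formed only when a low surrogate follows.
      const next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0
      if (next >= 0xDC00 && next <= 0xDFFF) {
        i++ // skip the low half of the valid pair
        continue
      }
      return true
    }
    if (code >= 0xDC00 && code <= 0xDFFF) {
      // Low surrogate that was not preceded by a high surrogate.
      return true
    }
  }
  return false
}
```

With this check, `'\uD834\uDF06'` (a valid pair) is accepted, while the reversed `'\uDF06\uD834'` is reported as containing lone surrogates.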

Contributor Author

@srmarjani srmarjani Mar 23, 2019

I know, @mathiasbynens.
I read your implementation in C, but I think implementing it in JavaScript would be heavy.
The approach @mcollina suggested is a heuristic, not a deterministic one.
Using surrogates in the keys of an object is not common in JS.
So I decided to fall back to the standard JSON.stringify whenever we encounter such a character.

Member

I think we should clarify that comment and write down why we use this heuristic: doing the full algorithm here would be too expensive. However, the penalty of trying this for shorter strings and falling back to JSON.stringify is OK.
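
For reference, a simplified sketch of the heuristic under discussion: scan the string once and hand the whole string to JSON.stringify as soon as a control character or any surrogate code unit shows up. `asStringSmallSketch` is an illustrative name, and this is not the code merged in the PR; it only assumes the behaviour described in the diff comments.

```javascript
// Sketch only: a simplified stand-in for the library's $asStringSmall.
// The name and exact structure are assumptions made for illustration.
function asStringSmallSketch (str) {
  let result = ''
  let last = 0
  for (let i = 0; i < str.length; i++) {
    const point = str.charCodeAt(i)
    // Control characters (< 32) and any surrogate code unit, paired or
    // lone: leave the fast path and let JSON.stringify do the escaping.
    if (point < 32 || (point >= 0xD800 && point <= 0xDFFF)) {
      return JSON.stringify(str)
    }
    // Double quote (34) and backslash (92) only need a backslash prefix.
    if (point === 34 || point === 92) {
      result += str.slice(last, i) + '\\'
      last = i
    }
  }
  return '"' + result + str.slice(last) + '"'
}
```

The trade-off is that valid surrogate pairs also take the slower path, which is accepted here because a full pair check in the hot loop would cost more than the occasional fallback.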

FWIW, I think this approach makes total sense. Lone surrogates are an edge case after all.

@srmarjani
Contributor Author

Hi @mcollina
Do you have time to run the benchmark?
I get different results on each run.
Maybe there is some problem with my laptop :))

const validate = validator(schema)
const stringify = build(schema)
const output = stringify('\uDF06\uD834')
// t.equal(output, '"\\udf06\\ud834"')
Member

It would be great to use a mirror test for this.

t.equal(output, JSON.stringify('\uDF06\uD834'))

Contributor Author

What do you mean by a mirror test? (Excuse me if this question is simple :))

Member

What I put above. Basically you have a target system A. You want to verify that B calls A and returns that output. You call B() and then A() and you compare results. In this way any change of output in A() will be automatically reflected in B() output.

Contributor Author

Thanks. I got it.

@mcollina
Member

It seems we are losing a bit:

~/repositories/fast-json-stringify (147-well-formed-unicode-string)
$ node bench.js
FJS creation x 8,953 ops/sec ±0.47% (87 runs sampled)
JSON.stringify array x 5,165 ops/sec ±0.23% (97 runs sampled)
fast-json-stringify array x 8,315 ops/sec ±0.48% (94 runs sampled)
fast-json-stringify-uglified array x 8,549 ops/sec ±0.27% (99 runs sampled)
JSON.stringify long string x 13,099 ops/sec ±0.13% (99 runs sampled)
fast-json-stringify long string x 13,104 ops/sec ±0.11% (98 runs sampled)
fast-json-stringify-uglified long string x 13,068 ops/sec ±0.22% (96 runs sampled)
JSON.stringify short string x 6,342,822 ops/sec ±0.08% (95 runs sampled)
fast-json-stringify short string x 40,577,317 ops/sec ±1.25% (92 runs sampled)
fast-json-stringify-uglified short string x 43,137,917 ops/sec ±0.74% (90 runs sampled)
JSON.stringify obj x 2,509,518 ops/sec ±0.53% (97 runs sampled)
fast-json-stringify obj x 8,766,504 ops/sec ±0.34% (97 runs sampled)
fast-json-stringify-uglified obj x 8,847,147 ops/sec ±0.26% (96 runs sampled)
~/repositories/fast-json-stringify (147-well-formed-unicode-string)

~/repositories/fast-json-stringify (master)
$ node bench.js
FJS creation x 8,868 ops/sec ±0.40% (92 runs sampled)
JSON.stringify array x 5,169 ops/sec ±0.23% (98 runs sampled)
fast-json-stringify array x 8,414 ops/sec ±0.66% (94 runs sampled)
fast-json-stringify-uglified array x 8,628 ops/sec ±0.36% (95 runs sampled)
JSON.stringify long string x 13,133 ops/sec ±0.06% (99 runs sampled)
fast-json-stringify long string x 13,111 ops/sec ±0.15% (99 runs sampled)
fast-json-stringify-uglified long string x 13,121 ops/sec ±0.08% (98 runs sampled)
JSON.stringify short string x 6,286,166 ops/sec ±0.11% (100 runs sampled)
fast-json-stringify short string x 44,704,671 ops/sec ±0.99% (93 runs sampled)
fast-json-stringify-uglified short string x 39,767,374 ops/sec ±1.41% (85 runs sampled)
JSON.stringify obj x 2,496,478 ops/sec ±0.48% (99 runs sampled)
fast-json-stringify obj x 9,197,710 ops/sec ±0.24% (94 runs sampled)
fast-json-stringify-uglified obj x 9,005,966 ops/sec ±0.30% (97 runs sampled)


I think this is the best we can get, overall +1 once the test is done.

@srmarjani srmarjani changed the title WIP-147 well formed unicode string (Please Don't Merge) well formed unicode string (#147) Mar 25, 2019
@mcollina mcollina merged commit eb5e9b1 into fastify:master Mar 27, 2019