Consider adopting the Ryu algorithm #130

Open
cyberphone opened this issue May 11, 2019 · 8 comments

3 participants
@cyberphone
commented May 11, 2019

The Ryu algorithm (https://github.com/ulfjack/ryu) for IEEE-754 serialization offers a number of interesting features: it is simple, fast, and (with a minor tweak) 100% compatible with ES6. Using the Ryu algorithm also makes https://tools.ietf.org/html/draft-rundgren-json-canonicalization-scheme-06 more realistic for JSONB, particularly with the proposed upgrade in eclipse-ee4j/jsonp#160. Here is a comparison between String.valueOf(double) and my Ryu adaptation for Java, where each value has been serialized 1M times.

IEEE-754           JDK   Ryu  JDK Serialization         Ryu Serialization
Selected values:
0000000000000000    62    34  0.0                       0
8000000000000000    31    16  -0.0                      0
0000000000000001   511    80  4.9E-324                  5e-324
8000000000000001   522    64  -4.9E-324                 -5e-324
7fefffffffffffff  1764   142  1.7976931348623157E308    1.7976931348623157e+308
ffefffffffffffff  1745   110  -1.7976931348623157E308   -1.7976931348623157e+308
4340000000000000    96   157  9.007199254740992E15      9007199254740992
c340000000000000    80    79  -9.007199254740992E15     -9007199254740992
4430000000000000   397   158  2.9514790517935283E20     295147905179352830000
44b52d02c7e14af5   349   176  9.999999999999997E22      9.999999999999997e+22
44b52d02c7e14af6   349   207  9.999999999999999E22      1e+23
44b52d02c7e14af7   350    96  1.0000000000000001E23     1.0000000000000001e+23
444b1ae4d6e2ef4e   302   159  9.999999999999997E20      999999999999999700000
444b1ae4d6e2ef4f   304    95  9.999999999999999E20      999999999999999900000
444b1ae4d6e2ef50   126   287  1.0E21                    1e+21
3eb0c6f7a0b5ed8c  1236   175  9.999999999999997E-7      9.999999999999997e-7
3eb0c6f7a0b5ed8d   333   206  1.0E-6                    0.000001
41b3de4355555553   349   191  3.333333333333332E8       333333333.3333332
41b3de4355555554   326   111  3.3333333333333325E8      333333333.33333325
41b3de4355555555   304   111  3.333333333333333E8       333333333.3333333
41b3de4355555556   300   111  3.333333333333334E8       333333333.3333334
41b3de4355555557   318   113  3.3333333333333343E8      333333333.33333343
becbf647612f3696   980   128  -3.3333333333333333E-6    -0.0000033333333333333333
IEEE-754           JDK   Ryu  JDK Serialization         Ryu Serialization
Random values:
34ff465fb5a29fd8  1077    95  2.0407823107657903E-53    2.0407823107657903e-53
0180aa7cc33fd23b  1846   112  1.9442174934288537E-301   1.9442174934288537e-301
2ceb54308f24234d  1140    95  2.620311665382176E-92     2.620311665382176e-92
b287eee46fdff09f  1035   115  -2.840738340340803E-65    -2.840738340340803e-65
2229562c860af2e6  1318   110  4.0580809869721453E-144   4.0580809869721453e-144
f80f61239d20259a  1710    96  -2.0721989565386525E270   -2.0721989565386525e+270
add9ab29d8081dba  1095    95  -8.06461389117999E-88     -8.06461389117999e-88
d14fba0dc766156e  1178    97  -4.8152042555373337E83    -4.8152042555373337e+83
6ab1e7ab55b21b95  1237   111  8.98194436781924E205      8.98194436781924e+205
5e25856f84be4190  1111    95  3.35919387840897E145      3.35919387840897e+145
66eeda83d5ec1ec3  1243    96  6.712322230840099E187     6.712322230840099e+187
1d037151a2eee09d  1284    97  6.439734202044909E-169    6.439734202044909e-169
8d4fb7c3d3902f2e  1681    95  -1.4516336465501977E-244  -1.4516336465501977e-244
2f23cd7caecbc18f  1207    95  1.3047738249326575E-81    1.3047738249326575e-81
2dde8dcdf93762ab  1144    95  9.599492442524916E-88     9.599492442524916e-88
84d3ff3c4b08ae2b  1735    96  -2.1012090586052427E-285  -2.1012090586052427e-285
54e868c970dcc690  1171    95  1.0677862209985186E101    1.0677862209985186e+101
d01ced65deab00bd   968    95  -8.37389158027637E77      -8.37389158027637e+77
37c9c83893ccce89  1000    95  5.919282917670671E-40     5.919282917670671e-40
8eb7a89ab8126107  1381    95  -9.08307027853398E-238    -9.08307027853398e-238
c25a4aee0c1c5cab   332   111  -4.5170505739344794E11    -451705057393.44794
056dd3cb849bdc76  1747   111  1.6046797594959664E-282   1.6046797594959664e-282
770cb51be8ffb6c9  1686    95  2.8926835254693333E265    2.8926835254693333e+265
0af991011d61cd97  1540   111  8.513608375142944E-256    8.513608375142944e-256
fd86b4cb740afd06  1651    96  -4.6405640207979055E296   -4.6405640207979055e+296
6770f6560ace46c8  1284   111  1.889386235130226E190     1.889386235130226e+190
29c6087173eaf5ab  1144    96  1.876310978051856E-107    1.876310978051856e-107
03854f4599dcf432  1765    96  1.0677034467937273E-291   1.0677034467937273e-291
1f1eebdbc2be423a  1302   111  8.797521803925928E-159    8.797521803925928e-159
589eae2afa44a8f6  1125    95  7.736749288152502E118     7.736749288152502e+118
c0b9ef54d02451c3   350   111  -6639.331300992929        -6639.331300992929
7c6092051b900ba6  1715    96  1.2918692647856208E291    1.2918692647856208e+291
92f59d018298a475  1535    95  -2.4490875310340735E-217  -2.4490875310340735e-217
b7aca47913b015f6  1001   111  -1.643997295019911E-40    -1.643997295019911e-40
d872908c76063477  1191   111  -1.1703747023547008E118   -1.1703747023547008e+118
1704d41ceb0a5bb3  1424   111  8.707474946109467E-198    8.707474946109467e-198
48d6e50167e817f8   950    95  7.977587285062442E42      7.977587285062442e+42
887e4ef213b04cf3  1538    95  -9.179237569720412E-268   -9.179237569720412e-268
98c01cf9a909b084  1379   111  -1.808231857146977E-189   -1.808231857146977e-189
6e8b2131628be78d  1537    95  3.1381314757793522E224    3.1381314757793522e+224
e828e9002b49fb99  1505    95  -5.6825560219219796E193   -5.6825560219219796e+193
96f87c499d218bcb  1351    96  -5.11813583331773E-198    -5.11813583331773e-198
70fa702a8bead87a  1566   111  1.6812320757231358E236    1.6812320757231358e+236
0edbea290c4de85e  1617    95  4.2868295597931197E-237   4.2868295597931197e-237
c0d6ff0aba2e5742   302   110  -23548.167613587582       -23548.167613587582
a709f0b9af623191  1283   111  -1.2557040640113272E-120  -1.2557040640113272e-120
d1a640cc4b91653b  1141    95  -2.1615219364130858E85    -2.1615219364130858e+85
3d54e9aec27017c1   937    95  2.9718908747180134E-13    2.9718908747180134e-13
4249669384146a70   319   111  2.1819025207283154E11     218190252072.83154
1d7b128021ba310c  1413   111  1.1477493239359762E-166   1.1477493239359762e-166
JDK Total=74325 Ryu Total=8075
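
For readers who want to reproduce the JDK column, here is a minimal sketch. The first table column is read as the raw IEEE-754 bit pattern in hex. The class and method names (SerializationTiming, fromHexBits, timeMillis) are made up for illustration, and the post does not state the unit of the timing columns, so reporting milliseconds is an assumption. A naive loop like this is only a rough indication; a harness such as JMH would give more trustworthy numbers. A Ryu port can be timed the same way by passing it as the serializer argument.

```java
import java.util.function.DoubleFunction;

final class SerializationTiming {
    /** Decodes a 16-digit hex IEEE-754 bit pattern, as in the first table column. */
    static double fromHexBits(String hex) {
        return Double.longBitsToDouble(Long.parseUnsignedLong(hex, 16));
    }

    /** Returns the elapsed milliseconds (assumed unit) for serializing one value `rounds` times. */
    static long timeMillis(DoubleFunction<String> serializer, double value, int rounds) {
        int sink = 0; // consume the results so the JIT cannot drop the calls
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            sink += serializer.apply(value).length();
        }
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        if (sink == 0) {
            throw new IllegalStateException("serializer produced no output");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // A few bit patterns taken from the table above.
        String[] samples = {"0000000000000000", "7fefffffffffffff", "c0b9ef54d02451c3"};
        for (String hex : samples) {
            double d = fromHexBits(hex);
            long ms = timeMillis(v -> String.valueOf(v), d, 1_000_000);
            System.out.println(hex + "  " + ms + "  " + String.valueOf(d));
        }
    }
}
```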
@rmannibucau

commented May 11, 2019

Hi @cyberphone, isn't this an implementation optimization? If so, it belongs in the Yasson bug tracker.

Side note: the current serialization must stay since we have shipped a 1.0, so if this is not an implementation detail, you can write Ryu adapters to achieve it.

@cyberphone

Author

commented May 11, 2019

@rmannibucau I don't know where this "belongs" but both Go and C# are in the process of replacing their current number serializers with Ryu.

My personal interest is more on the ES6 compatibility side than on performance.

@rmannibucau

commented May 11, 2019

Hmm, can you point out the ES6 (or even ES5?) incompatibilities? I have used it quite a lot with primitives already, and no issues have popped up yet in either direction.

@cyberphone

Author

commented May 11, 2019

ES6 compatibility is only needed for canonicalization. I just hoped to get this as a "bonus" 😀 since JCS is not a target for JSONB. The speed improvement was pretty impressive.

You'll find all links in the Internet-Draft.
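
To make the layout differences in the table concrete, here is a rough sketch, not the Ryu adaptation itself. It takes whatever digits Double.toString picks and lays them out following the ES6 Number-to-string rules (plain notation for decimal exponents from -6 up to 21, otherwise e+/e- notation, and both zeros printed as "0"). The class and method names are made up. The sketch only changes the layout: where the JDK picks different digits (for example 4.9E-324 versus 5e-324 in the table), the result will still differ from ES6, which is the part a shortest-digit algorithm such as Ryu takes care of.

```java
import java.math.BigDecimal;

final class Es6Layout {
    /** Lays out the digits chosen by Double.toString in ES6 style (layout only). */
    static String toEs6Layout(double d) {
        if (Double.isNaN(d) || Double.isInfinite(d)) {
            throw new IllegalArgumentException("NaN and Infinity are not valid JSON numbers");
        }
        if (d == 0) {
            return "0"; // ES6 prints both 0 and -0 as "0"
        }
        BigDecimal bd = new BigDecimal(Double.toString(Math.abs(d))).stripTrailingZeros();
        String digits = bd.unscaledValue().toString(); // significant digits only
        int k = digits.length();
        int n = k - bd.scale(); // value = 0.digits * 10^n
        StringBuilder sb = new StringBuilder(d < 0 ? "-" : "");
        if (k <= n && n <= 21) {
            // Integer: digits followed by n - k zeros.
            sb.append(digits);
            for (int i = 0; i < n - k; i++) sb.append('0');
        } else if (0 < n && n <= 21) {
            // Decimal point inside the digit string.
            sb.append(digits, 0, n).append('.').append(digits, n, k);
        } else if (-6 < n && n <= 0) {
            // Small fraction: "0." then -n zeros then the digits.
            sb.append("0.");
            for (int i = 0; i < -n; i++) sb.append('0');
            sb.append(digits);
        } else {
            // Exponential notation with an explicit sign on the exponent.
            sb.append(digits.charAt(0));
            if (k > 1) sb.append('.').append(digits, 1, k);
            sb.append('e').append(n > 0 ? "+" : "-").append(Math.abs(n - 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        double[] samples = {1e21, 1e-6, -0.0, 9007199254740992.0, -6639.331300992929};
        for (double d : samples) {
            System.out.println(Double.toString(d) + " -> " + toEs6Layout(d));
        }
    }
}
```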

@bravehorsie

Contributor

commented May 11, 2019

https://github.com/ulfjack/ryu states that the output of the Java implementation may differ from that of the Double#toString methods. Could that break section 3.3.2 of the JSON-B spec?

@rmannibucau

commented May 11, 2019

It looks compatible, but it is also a draft, which is quite bad for a Jakarta spec - keep in mind that JSON Schema, which is also still a draft, has already broken features. Also, nothing requires an implementation to use valueOf - I guess they all do, but it is not required AFAIK since it is not part of the user-facing API - which is why it is an implementation detail for me.

So the only question for me is the JSON number representation, and as long as any round trip (Java/JSON) works - which means JS can consume it - I guess we are covered at the spec level.

If a new final json spec pops up it could become a toggle - config property and annotation - IMHO.

Does it make sense?

@cyberphone

Author

commented May 12, 2019

@rmannibucau JS (as well as any correctly implemented JSON parser) can consume both the existing and the proposed format; it is only canonicalization that requires absolute ES6 compliance.

Regarding JDK roundtripping, the ES6/Ryu adaptation succeeds using my 100M test file:
https://github.com/cyberphone/json-canonicalization/tree/master/testdata#es6-numbers

https://github.com/cyberphone/json-canonicalization/blob/master/java/miscellaneous/src/ES6NumberTest.java#L52
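
As a small illustration of what such a round-trip check could look like (this is not the linked test program), the sketch below parses an ES6-style number with the JDK and compares the resulting bits with the expected IEEE-754 pattern. The "hexBits,es6Text" record layout and the class and method names are assumptions made for the example; the sample records are taken from the table in the first comment.

```java
final class RoundTripCheck {
    /** Checks one "hexBits,es6Text" record: parsing the text must give back the exact bits. */
    static boolean roundTrips(String record) {
        int comma = record.indexOf(',');
        long expectedBits = Long.parseUnsignedLong(record.substring(0, comma), 16);
        double parsed = Double.parseDouble(record.substring(comma + 1));
        return Double.doubleToRawLongBits(parsed) == expectedBits;
    }

    public static void main(String[] args) {
        // Assumed record layout: IEEE-754 bits in hex, a comma, then the ES6 text.
        String[] records = {
            "0000000000000001,5e-324",
            "4340000000000000,9007199254740992",
            "44b52d02c7e14af6,1e+23",
            "3eb0c6f7a0b5ed8d,0.000001",
            "c0b9ef54d02451c3,-6639.331300992929"
        };
        for (String record : records) {
            System.out.println(record + " -> " + (roundTrips(record) ? "ok" : "MISMATCH"));
        }
    }
}
```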

I would not bother with a toggle since the "problem" is rather in the spec.

The test program failed on C# but it turned out to be due to a bug in the .NET number parser. After reporting it, Microsoft fixed it as well!

JSON Canonicalization now works on 5 platforms.

@rmannibucau

commented May 12, 2019

Hmm, my point is I fail to see the problem in the spec. To be clear I can see it in some implementations but not the spec.
