You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I am suggesting, after some months of thought, to deprecate & encourage removal from future JSON5 parsers of the NaN/inf+/inf- format since I think they are the most obvious disadvantage of the format. Here is why:
It turns dropping JSON5 parsers in place of JSON parsers from a no-brainer into a security headache. Without the addition of NaN/inf+/inf-, enabling JSON5 for any input pipeline with no changes to what processes the data after is a no-brainer - because the possible output range is the same. This is single-handedly ruined by NaN/inf+/inf-, which suddenly introduce new values that were never possible to unexpectedly result from a classic JSON parser. (You could argue maybe inf+/inf- were depending on how it deals with values exceeding the allowed range, but certainly not NaN.) Especially NaN can badly trip up code that might be written to verify other things, but not that this special value isn't present since well, JSON does not usually support it. NaN is also particularly damaging due to its "infectious" silently corrupting nature in most math operations in most programming languages.
It solves a problem that was already trivial to deal with. I don't get who asked for this to be solved. An application can already trivially work around this by allowing values to be either a number, or a string specifying "inf+"/"inf-"/"nan". The implementation of this is trivial. And how many use cases actually need these special values anyway, rather than them indicating a major programming or calculation error that shouldn't pass in/out of a serialized data chunk anyway? I think blanket "solving" this is just not that useful, it reminds me a bit of YAML supporting complicated binary contents. Will some use it? Sure. Does it really help the majority of format users? IMHO doubtful. I would suggest there is an advantage in focusing a format in a lean way on just solving what is actually useful to most users, and otherwise just stepping aside and letting the application level handle it, and I see a missed opportunity in that regard here.
It makes JSON5 plain incompatible with some languages. Some programming languages may opt to not support all IEEE 754 features, especially not NaN. There is IMHO a point to be made that NaN is mostly a cheap hack to improve performance, and in a scripting language that is willing to sacrifice the math speed cost there might be good reasons to make all potentially NaN results throw an error instead and not providing this infectious, hidden bug inviting special value. However, these languages are now left with difficult decisions implementing JSON5, since they'll be in many ways now a 2nd class citizen. (Imagine writing a network API daemon in such a language that just passes through JSON5, now suddenly it'll 'eat' all NaN values even though all other endpoints connected might be able to understand it. Just not fully supporting all common numerical values isn't really an easy option sometimes, so this just results in lots of headaches.)
Summed up, I think this feature has more downsides than upsides and should be phased out.
The text was updated successfully, but these errors were encountered:
Thank you for your input. We had a discussion on this four years ago at json5/json5-spec#2. I encourage you to post there since it pertains more to the spec than the library.
I am suggesting, after some months of thought, to deprecate & encourage removal from future JSON5 parsers of the NaN/inf+/inf- format since I think they are the most obvious disadvantage of the format. Here is why:
It turns dropping JSON5 parsers in place of JSON parsers from a no-brainer into a security headache. Without the addition of NaN/inf+/inf-, enabling JSON5 for any input pipeline with no changes to what processes the data after is a no-brainer - because the possible output range is the same. This is single-handedly ruined by NaN/inf+/inf-, which suddenly introduce new values that were never possible to unexpectedly result from a classic JSON parser. (You could argue maybe inf+/inf- were depending on how it deals with values exceeding the allowed range, but certainly not NaN.) Especially NaN can badly trip up code that might be written to verify other things, but not that this special value isn't present since well, JSON does not usually support it. NaN is also particularly damaging due to its "infectious" silently corrupting nature in most math operations in most programming languages.
It solves a problem that was already trivial to deal with. I don't get who asked for this to be solved. An application can already trivially work around this by allowing values to be either a number, or a string specifying "inf+"/"inf-"/"nan". The implementation of this is trivial. And how many use cases actually need these special values anyway, rather than them indicating a major programming or calculation error that shouldn't pass in/out of a serialized data chunk anyway? I think blanket "solving" this is just not that useful, it reminds me a bit of YAML supporting complicated binary contents. Will some use it? Sure. Does it really help the majority of format users? IMHO doubtful. I would suggest there is an advantage in focusing a format in a lean way on just solving what is actually useful to most users, and otherwise just stepping aside and letting the application level handle it, and I see a missed opportunity in that regard here.
It makes JSON5 plain incompatible with some languages. Some programming languages may opt to not support all IEEE 754 features, especially not NaN. There is IMHO a point to be made that NaN is mostly a cheap hack to improve performance, and in a scripting language that is willing to sacrifice the math speed cost there might be good reasons to make all potentially NaN results throw an error instead and not providing this infectious, hidden bug inviting special value. However, these languages are now left with difficult decisions implementing JSON5, since they'll be in many ways now a 2nd class citizen. (Imagine writing a network API daemon in such a language that just passes through JSON5, now suddenly it'll 'eat' all NaN values even though all other endpoints connected might be able to understand it. Just not fully supporting all common numerical values isn't really an easy option sometimes, so this just results in lots of headaches.)
Summed up, I think this feature has more downsides than upsides and should be phased out.
The text was updated successfully, but these errors were encountered: