-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSON: improved deserialization into union types #8689
Conversation
src/json/from_json.cr
Outdated
return pull.read_string | ||
{% end %} | ||
when .int? | ||
{% type_order = [Int64, Int32, Int16, Int8, UInt64, UInt32, UInt16, UInt8, Float64, Float32] %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm interested in other's thoughts on ordering Int
s and UInt
s here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it matters. Maybe it should be from smaller to bigger. For example if you have Union(Int8, Int16).from_json(1)
, what would you expect? 1
fits in both types.
That's why I'm not sure whether we should do anything about it other than putting floats to the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is, my PR. But whenever I send a PR there are always thoughts, comments, etc. It's so slow to develop in Crystal. Meanwhile in Ruby they push commits without much thoughts, and it's a success.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it matters.
JSON::PullParser#read?
silently wraps in the case of an overflow so order is important. Combine that with the alphabetic type order and your end up with Int16
> Int32
> Int64
> Int8
which would likely lead to some interesting behaviour if someone was to parse into a union of indiscriminate types. That was the main motivation for this following your PR.
alias T = Int16 | Int32
a : T = Int16::MAX.to_i + 2
# => 32769
b : T = T.from_json(a.to_json)
# => -32767
An initial implementation prioritised smaller types, converted with {{type}}.new
and caught Overflow errors along the way, but this seemed like a lot of runtime overhead when the type could be satisfied without it.
There is likely a separate discussion needed on if that silent cast is the correct behaviour from #read?
, but in the interests of keeping this focused on Union
I've not touched that here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bcardiff i'd prefer to always use the largest type in the union, so that if you deserialize Array(Int32 | Int8)
, you get all int32's.
Again, this really doesn't matter. If you're doing that, you're doing it wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. for Int32 | Int8
it might be more stable to return always Int32
in that case.
I haven't checked how it currently works but whay is your opinion regarding Int32 | Float32
? Should it behave as it was Float32
alone? For the UInt64::MAX < Float32::MAX but that is no true for UInt128. So I would prefer to honor Int32 | Float32
depending if the input is a float or int based on what I said before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I would prefer to honor Int32 | Float32 depending if the input is a float or int based on what I said before.
This sounds acceptable, but I'm not sure if the JSON parser should read some context into number literals when there is absolutely no difference between 1.0
and 1
literals in JSON.
I'd probably prefer the largest float data type over any int.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a user of the std lib, a primary argument for reading into the that decimal point is to keep encode/decode behavior symmetric.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we still have some split opinions. In the interests of keeping things moving / forcing a decision I've amended the order to match what was mentioned to @RX14 above as well as adding spec to demonstrate stability when pulling into a (highly questionable) union of different integer types as well as symmetry of encode/decode for the Int | Float
case.
A re-review would be lovely as people have time.
Modified behaviour when reading into union types to prioritize: 1. native types used by JSON::PullParser 2. casts to numeric types of the same kind in order of precision 3. casts to numeric types of a different kind 4. non-primitive types
Modified behaviour when reading into union types to prioritize:
JSON::PullParser
Resolves #7333
Supersedes #8675
Extends #8686