New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSONSerialization: Use NSDecimalNumber for parsing json numbers. #1655
Conversation
- Replace strol()/strtod() parsing with Scanner.scanDecimal(). This allows parsing of Integers > Int64.max (upto UInt64.max) without losing information due to the fallback parsing provided by parsing as a Double.
@swift-ci please test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about three main things here:
-
The cost of parsing as
NSDecimalNumber
every time, especially by spinning up a newScanner
for every single read number -
Potential new differences in precision and representation. I haven't followed the implementation of
NSDecimalNumber
on swift-corelibs-foundation, but is its implementation significantly different fromNSDecimalNumber
on Darwin? Because on Darwin,NSDecimalNumber
has a significantly lower possible magnitude thanDouble
does (IIRC 10^±127 vs. 10^±308), with the tradeoff being that it allows for higher precision. On Darwin, we prefer to parse as aDouble
if the value fits both within the magnitude and precision of aDouble
— the C standard guarantees this to be true if the string representation of the mantissa is less thanDBL_DECIMAL_DIG
in length, so this is what we check. If we cannot fit the number in aDouble
without loss of precision, then we preferNSDecimalNumber
if the value fits within its magnitude; otherwise we're willing to lose precision to fit in aDouble
. If the number is too large to fit in either, we bail. Here we'd be giving up on the more efficient representation in anInt
/Double
, along with the greater range thatDouble
provides (unless I'm missing something andNSDecimalNumber
is more capable here than it is on Darwin) -
@millenomi — this one is more for you. With all of the bridging work you did, do we have any additional testing/coverage for
NSDecimalNumber
and how that works out? Folks probably have a lot of code which currently checksas? Int
andas? Double
since for a long time those were the only two supported options here — I'd hate for anNSDecimalNumber
to accidentally slip through with loss of precision or representation or something. Even so, the representation inNSDecimalNumber
is not good enough forInt
s. On Darwin, the following code printsnil
due to loss of precision making the number no longer "cleanly" representable in anInt
:let dec = Decimal(1123123123) let num = NSDecimalNumber(decimal: dec) let i = (num as NSNumber) as? Int print(i)
Overall I think the changes here are good in spirit but we'll likely need to maintain some of the existing representations we already have unless my concerns are unfounded due to my lack of experience here (which I would happily accept!). What we might be able to do is transition over to more of what we have on Darwin — prefer Double
unless we can get away with an NSDecimalNumber
.
// Validate the first few characters look like a JSON encoded number: | ||
// Optional '-' sign at start only 1 leading zero if followed by a decimal point. | ||
var index = input | ||
func nextDigit() -> UInt8? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this more clearly be called nextASCII()
/nextCharacter()
/similar?
let MINUS = UInt8(ascii: "-") | ||
|
||
func parseTypedNumber(_ string: String) throws -> (Any, IndexDistance)? { | ||
let scan = Scanner(string: string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth testing the performance of creating a new Scanner
for every single number we try to parse — I suspect that this is a potentially prohibitively expensive operation.
Note also that this conflicts with the work in #1654 so we might need to coordinate something here. |
@itaiferber Thanks for your feedback. I hadn't actually considered the cost of the As to the difference between As to the solution in #1654, I did originally implement that same solution and there are a number of issues with it: Firstly it can involve upto 3 separate string parses. Also care must be taken with As for the bridging issues, my tests showed up some problems but these should be resolved in #1653 . However thinking about this problem further, I realised the real issue is that we parse the string before knowing how we want it use it. So I came up with the following solution which is to store the number as a
This would ensure that full JSON number validation is performed, which currently doesn't happen, and the value is only parsed once using the correct parser. Does this seem reasonable or am missing anything here? |
@spevans I was quickly putting together an approach similar to #1654 — I agree that it can involve up to 3 parses but they should be relatively short and quick as the input is limited in length. Otherwise, though: how do we know how we want to use a parsed number? Aren't these values just returned to the developer directly? Or are you proposing a further deferral that will only parse the value once it's read and cast out (e.g. a number that totally defers evaluation until you try to access it)? (If so, it doesn't necessarily improve things as it's totally valid to write |
I was thinking out not parsing until its accessed and with Also, if it is a subclass of |
FYI, here is the version I had using the
|
Closed in favour of #1657 |
Replace strol()/strtod() parsing with Scanner.scanDecimal().
This allows parsing of Integers > Int64.max (upto UInt64.max)
without losing information due to the fallback parsing provided
by parsing as a Double.
Also allows parsing larger integers into a Decimal.
Requires PR #1653 for some
NSDecimalNumber
andNSNumber
fixes