-
Notifications
You must be signed in to change notification settings - Fork 121
Catch numeric overflow errors and parse the numbers as strings instead. #152
Conversation
Also, consider the ignoring-whitespace diff which kills most of the apparent changes to JSONParser.swift. |
// This would be more natural as `while true { ... }` with a meaningful .Done case, | ||
// but that causes compile time explosion in Swift 2.2. :-| | ||
while parser.state != .Done { | ||
switch parser.state { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems really weird to me that we're switching on parser.state
directly here, but none of the cases directly updates parser.state
. It feels like we've got half the parser implementation in this loop, and the other half inside the parser itself.
Is anything preventing moving all the number parsing logic into the parser itself?
This fails to address issues with 32-bit overflow on 32-bit platforms: diff --git a/Tests/JSONParserTests.swift b/Tests/JSONParserTests.swift
index 8aeee46..1141157 100644
--- a/Tests/JSONParserTests.swift
+++ b/Tests/JSONParserTests.swift
@@ -44,6 +44,14 @@ private func ==(lhs: JSONParser.Error, rhs: JSONParser.Error) -> Bool {
class JSONParserTests: XCTestCase {
+ func testParsingOutALargeNumberEspeciallyOn32BitPlatforms() {
+ let val = 1466719200000.0
+ let str = "{\"startDate\": 1466719200000}"
+ let data = str.dataUsingEncoding(NSUTF8StringEncoding)!
+ let json = try? (sizeof(Int) == sizeof(Int64)) ? JSON(data: data) : JSON(data: data, usingParser: NSJSONSerialization.self)
+ XCTAssertEqual(try? json!.double("startDate"), val)
+ }
+
private func JSONFromString(s: String) throws -> JSON {
var parser = JSONParser(string: s)
return try parser.parse() That's checking the path with NSJSONSerialization parser, but if you swap to |
Failure in the custom parser case is because it gets treated as a string, so that's expected behavior. It's only with the diff --git a/Tests/JSONParserTests.swift b/Tests/JSONParserTests.swift
index 8aeee46..f8c0412 100644
--- a/Tests/JSONParserTests.swift
+++ b/Tests/JSONParserTests.swift
@@ -44,6 +44,33 @@ private func ==(lhs: JSONParser.Error, rhs: JSONParser.Error) -> Bool {
class JSONParserTests: XCTestCase {
+ func test32BitOverflowResultsInDoubleWithNSJSONSerializationParser() {
+ let jsonString = "{\"startDate\": 1466719200000}"
+ let expectedValue = 1466719200000.0
+
+ let data = jsonString.dataUsingEncoding(NSUTF8StringEncoding)!
+ guard let json = try? JSON(data: data, usingParser: NSJSONSerialization.self) else {
+ XCTFail("Failed to even parse JSON: \(jsonString)")
+ return
+ }
+
+ XCTAssertEqual(try? json.double("startDate"), expectedValue)
+ }
+
+ func test32BitOverflowResultsInStringWithFreddyParser() {
+ let jsonString = "{\"startDate\": 1466719200000}"
+ let expectedValue = "1466719200000"
+
+ let data = jsonString.dataUsingEncoding(NSUTF8StringEncoding)!
+ guard let json = try? JSON(data: data) else {
+ XCTFail("Failed to even parse JSON: \(jsonString)")
+ return
+ }
+
+ XCTAssertEqual(try? json.double("startDate"), nil)
+ XCTAssertEqual(try? json.string("startDate"), expectedValue)
+ }
+
private func JSONFromString(s: String) throws -> JSON {
var parser = JSONParser(string: s)
return try parser.parse()
@@ -224,7 +251,7 @@ class JSONParserTests: XCTestCase {
let json = try JSONFromString(string)
// numbers overflow, but we should be able to get them out as strings
- XCTAssertEqual(try json.string(), string)
+ XCTAssertEqual(try? json.string(), string)
} catch {
XCTFail("Unexpected error \(error)")
} |
Prior to this, we would not crash when using the NSJSONSerialization parser, but we would mangle the value by incorrectly assuming "not a char and not a float means it fits in an Int", even when it was properly an Int64. We might want to force this overflow case to String for parity with the Freddy parser.
I've resolved that. We might want to push overflowing 32-bit values into String rather than the "promote to Double" approach I've taken here. That would be a simple fix after this. I had to switch from a |
Travis failed the build because my tests depend on architecture, and I only wrote them for 32-bit. TODO:
|
This standardizes behavior across the Freddy parser and the NSJSONSerialization parser. The new tests also pass under 64-bit devices, rather than just 32-bit devices, which should make Travis significantly happier. A commented out assertion captures an interesting fatal error, however, which should also be fixed as part of this bounds-handling escapade.
Otherwise, you get a String on 32-bit, and a Double on 64-bit. So our parser differ in overflowing value handling, but are at least now consistent across architectures.
@@ -57,20 +57,49 @@ extension NSJSONSerialization: JSONParserType { | |||
private static func makeJSON(object: AnyObject) -> JSON { | |||
switch object { | |||
case let n as NSNumber: | |||
switch n { | |||
case _ where CFNumberGetType(n) == .CharType || CFGetTypeID(n) == CFBooleanGetTypeID(): | |||
let numberType = CFNumberGetType(n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least as of 2.2, CFNumberType
is an enum
. We should switch over it exhaustively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might get ugly with the way the #if stuff works, but I'll give it a shot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think wrapping an entire case in #if
is still valid.
Once, this wasn't imported as an enum type, but it is now! [Thanks @zwaldowski.](https://github.com/bignerdranch/Freddy/pull/152/files/8e1e0626215d3c2a689619199020d4429e280c5f#r61319108)
The CFNumber switch is exhaustive, so there should be no need for it. Plus, after nuking it, no tests failed, so we must be good to go. Right?
TL;DR func testLongIntParsing() {
let val = 1470438000000.0
let jsonStr = "{\"big\":1470438000000}"
let json = try? JSON(jsonString: jsonStr)
XCTAssertEqual(try? json!.double("big"), val)
} should I use this workaround? let json = try? (sizeof(Int) == sizeof(Int64)) ? JSON(data: data) : JSON(data: data, usingParser: NSJSONSerialization.self) or this is better?
Either way, it is weird, that I have to use any workarounds. |
The first technique will let you pull it out as a double, but will be slower than the second technique (whether it's noticeably slower depends on your data). As far as having to work around it - we had to make a choice about what to do with numbers that don't fit in let bigNum = "36028797018963969".dataUsingEncoding(NSUTF8StringEncoding)!
let json = try! JSON(data: bigNum)
let double = try! json.double() // gives you 36028797018963968, off by one Other large numbers can be off by more. We chose to go the route that requires slightly more work but will not give you incorrect values; e.g., if you're hitting an API that returns large numbers as IDs where loss of precision is unacceptable, you still have the full original number available to do with what you will (e.g., putting it into an |
Closes #76.
Makes no changes to the "fast path" of numbers that don't overflow. If a numeric overflow is detected (via a thrown
NumberOverflow
, which is now a private error), we go back to the beginning of the number and re-parse it for validity only, returning it as a.String
.The good: Returns overflows as strings (e.g., see the unit tests) without slowing down the parser.
The bad: We would now have two nearly identical groups of methods that parse numbers - one accumulates values and the other just looks for the end.If we want to close #76 as part of 2.1, we could merge this and open a separate issue to clean up the internals. Not sure how to do that without sacrificing performance at the moment, but I'm sure we can come up with something.Update from last couple commits - extracted number parsing out into a common struct with an internal state machine. Much less duplication between parsing numbers as numbers and parsing numbers as strings. Minimal slowdown (~1% on benchmarking branch).
Also also:
fatalError
whenjson.int
is used with aDouble
that exceedsInt.max
.