Skip to content
This repository has been archived by the owner on Sep 19, 2018. It is now read-only.

Large integers cause overflows, especially on 32-bit hosts #76

Closed
zwaldowski opened this issue Dec 8, 2015 · 52 comments · Fixed by #152
Closed

Large integers cause overflows, especially on 32-bit hosts #76

zwaldowski opened this issue Dec 8, 2015 · 52 comments · Fixed by #152
Milestone

Comments

@zwaldowski
Copy link
Contributor

zwaldowski commented Dec 8, 2015

Follow-on from #57. Consensus was that JSON.Int should continue to use Swift.Int as its associated type, but that still leaves the issue of overflows at hand. The stdlib asserts inside the parser, generally bringing down the whole app as a result.

Options include:

  • Switch the parser to the &+ et. al. operators and explicitly allow overflows.
  • Switch the parser to the addWithOverflow(_:_:) et. al. functions, catch overflows, and throw errors, thus cancelling the parse.
  • Switch the parser to the addWithOverflow(_:_:) et. al. functions, catch overflows, and return Strings for big integers. This might require some workflow changes in the parser.
  • Switch the parser to the addWithOverflow(_:_:) et. al. functions, catch overflows, and return Doubles for big integers. (undesirable, but I mention it)

Workaround

If this is a problem for you today, you can work around it by using Apple's parser rather than Freddy's via JSON(data:usingParser:) and specifying the parser as NSJSONSerialization.self.

// assuming you have data: NSData
let json = try JSON(data: data, usingParser: NSJSONSerialization.self)

For a bit more discussion around this, see this comment below.

Discussion Summary

Eventual decision was to avoid overflow by treating numbers that would overflow as strings. The value can be pulled out and then parsed from the string if needed by the user of the API. Implemented in #152 and expected to land in Freddy v2.1.

@zwaldowski
Copy link
Contributor Author

Added extra option for arguments' sake.

@zwaldowski
Copy link
Contributor Author

The latest RFC his this to say:

   This specification allows implementations to set limits on the range
   and precision of numbers accepted.  Since software that implements
   IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is
   generally available and widely used, good interoperability can be
   achieved by implementations that expect no more precision or range
   than these provide, in the sense that implementations will
   approximate JSON numbers within the expected precision.  A JSON
   number such as 1E400 or 3.141592653589793238462643383279 may indicate
   potential interoperability problems, since it suggests that the
   software that created it expects receiving software to have greater
   capabilities for numeric magnitude and precision than is widely
   available.

   Note that when such software is used, numbers that are integers and
   are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
   sense that implementations will agree exactly on their numeric
   values.

@zwaldowski
Copy link
Contributor Author

And this:

   An implementation may set limits on the range and precision
   of numbers.

@zwaldowski
Copy link
Contributor Author

@mdmathias might be interested in that last comment. Ping.

@zwaldowski
Copy link
Contributor Author

/me waves @jeremy-w into the thread

@jeremy-w
Copy link
Member

jeremy-w commented Dec 9, 2015

Another option: saturate at min/max values rather than overflowing.

We definitely don't want to take out the process; having some way to indicate there was a loss of range or precision in a field for those interested would be desirable, though.

@jeremy-w
Copy link
Member

jeremy-w commented Dec 9, 2015

catch overflows, and return Strings for big integers.

What would this look like on the user side? Swapping the type out on the fly doesn't seem to play well with Swift's type system. :\

@zwaldowski
Copy link
Contributor Author

Re: string swap, it'd be a thrown error during deserialization rather than parsing. But you could compensate for it in those places (move var userId: Int to var userId: String and then provide a fallback path for the parser still sometimes returning Int).

It's not my favorite solution, but it provides recoverability without loss of data.

@mdmathias
Copy link
Contributor

@zwaldowski re: RFC, do you think we should set and document the range and precision? And then throw an error in the parser if these are not respected?

That does not feel unreasonable to me, given our options.

@zwaldowski
Copy link
Contributor Author

I think it's the most sane path, but it also leaves the user (of the library) with an unrecoverable parse failure.

@jgallagher
Copy link
Contributor

I think the most important criteria from what you've mentioned are, in order, "without loss of data" and a distant second is "recoverability". That'd rule out &+ and converting to Double, IMO. Returning a string instead is a really clever solution - for users who never run into this issue, it has no effect on them (if I'm understanding the solution correctly), but for users who do, it gives them an "out" that will allow the parser to work with their data.

@mdmathias
Copy link
Contributor

TBH, I'm having trouble imagining when that String would be returned. Perhaps I'm missing the picture, but wouldn't a client be writing client code expecting to convert a JSON.Int into some property on a model and instead get a JSON.String? They would have to change their model, right? Which would mean that they would have to have perfect knowledge on the times wherein a "REALLY BIG INT" will occur. Is that always going to be possible?

@zwaldowski
Copy link
Contributor Author

They would have to change their model, yes, but that's at least a possible solution rather than a hard error. Obviously the best solution would be "tell the server to fix it" but as consultants we know we don't always have that luxury.

@mdmathias
Copy link
Contributor

So, we would still throw an error, but the error would contain a Stringified version of the BIG INT?

@zwaldowski
Copy link
Contributor Author

No, in the case I'm describing the JSON.Int(SOME_BIG_INT) in the parsed payload would come back as JSON.String("SOME_BIG_INT"). You'd hit a TypeNotConvertible error while deserializing that you could recover from. (Or the string initializer could automatically convert from numbers.)

@jeremy-w
Copy link
Member

Or the string initializer could automatically convert from numbers.

👍 This would handle the common case of "I am treating your fooID as a cookie, and I don't care what type you think it should be". It would also let users default to String to start with where integer range might be a concern.

@mdmathias
Copy link
Contributor

Okay, so a client would do something like:

let json = getSomeDataAndMakeItJSON()

let peopleJSON = try! json.array("people")
for personJSON in peopleJSON {
    do {
        let person = try Person(json: personJSON)
    } catch JSON.Error.ValueNotConvertible(let theType) {
        // do what here if the `Int` is too big?
    }
}

This feels like it would be hard to debug. Also, this doesn't feel all that recoverable. Perhaps I'm foggy-headed after the holiday party...but the model object won't be made, which is kind of a problem. I mean, we can prevent the crash with these errors, but the app won't have the data.

In any case, the associated value on .ValueNotConvertible just gives the type that the client wanted to convert JSON into, right? The error's associated value should more helpfully say something like, "The property 'bigInt' is of type 'String', not 'Int'."

I'm sorry if I'm not contributing here; I'm just trying to wrap my head around the proposed solution.

@zwaldowski
Copy link
Contributor Author

No, in your Person.init(json:):

do {
    try self.reallyBigIdNumber = json.int("id")
} catch {
   try self.reallyBigIdNumber = json.string("id")
}

Or any combination of that. If we do automatic Int->String conversion, then if you were aware of potential for overflow you would only use self.reallyBigIdCookie = json.string("id").

@mdmathias
Copy link
Contributor

I see. I don't think that folks will intuitively default to writing that code. So, we'll have to provide better error information in my above example (which I think people will gravitate to more naturally) that will help folks get it write. That is, they will hit that .ValueNotConvertible error. Agreed?

@zwaldowski
Copy link
Contributor Author

I agree that it's not intuitive; this is not supposed to be a common case.

I'm concerned adding the from-type to ValueNotConvertible would introduce a lot of extra complexity to the base JSONDecodable implementations.

@lyricsboy
Copy link
Contributor

In my opinion, this is rare enough that we can start with option 2 (throw errors) and then revisit a workaround solution if people request it as a feature.

@zwaldowski
Copy link
Contributor Author

@lyricsboy: I don't disagree, but I've already run into this on a project, so I was thinking about it in advance. It definitely is the route of least effort (in a good way).

@lyricsboy
Copy link
Contributor

What was your resolution in that situation?

@zwaldowski
Copy link
Contributor Author

Bourbon, I think.

@lyricsboy
Copy link
Contributor

💥

@zwaldowski zwaldowski added this to the Next milestone Jan 18, 2016
@matthewpalmer
Copy link

Not sure if I misunderstand, but I'm running this on 10.11 in an iPhone 5 simulator.

From my perspective (missing a lot of the surrounding info that you guys have, of course), I'd prefer if it stayed as a number, or if Int64 was used throughout (I figure that was the topic of #57). That case line feels like an implicit check against that anyway. Or we could check if the value exceeds the max and bump it up to a Double if it does.

To me it makes more sense to keep a js/json number as a number for as long as possible, since ostensibly that's the type that the server intended the value to have. Although it is a good point that really large IDs and stuff might get messed around by this.

@jeremy-w
Copy link
Member

I coded it up, and the path doesn't get hit in 64-bit processes. When I feed my 64-bit OS X build Swift.Int.max + 1 as the date value, CFNumber reads it in as a double. (Then I hit a crash trying to put the too-big-for-Int into an Int.) I guess this comes down to CFNumber not supporting unsigned backing storage, so once the value exceeds the 64-bit signed range, it bumps up?

@jeremy-w
Copy link
Member

Trying to run the MobileFreddy tests on iPhone 5, and I get a "run destination not compatible" error. #ThanksXcode

@matthewpalmer
Copy link

Seeing the same here. I figure NSJSONSerialization is putting the value through as an Int64, but Freddy's trying to use an Int32, so the value overflows. I think this would affect any value in (Int32.max...Int64.max) on a 32-bit device

@jeremy-w
Copy link
Member

Oops, I fed it UINT64_MAX+1 earlier. But INT64_MAX+1 on a 64-bit machine still results in a double.

As a workaround, pulling out the value using the json.double(path) accessor should give you the value without blowing up, and it should be exact for the range you need to work with Unix timestamps. Seeing as NSTimeInterval is a double anyway, that should play well with munging into an NSDate, if that's where you're headed.

matthewpalmer added a commit to matthewpalmer/Charter that referenced this issue Feb 26, 2016
- Freddy currently has an issue parsing large integers on 32 bit
  devices. See discussion:
  bignerdranch/Freddy#76.
@zwaldowski zwaldowski modified the milestones: Future, Next Mar 23, 2016
@siuying
Copy link

siuying commented Apr 26, 2016

I noticed the issue when I test my app on 32bit device, all JSON parsing will just fail. I expected the big integer issue and workaround it by accessing the string version of the field, but Freddy will always fail even I didn't touch the big integer field.

let str = "{\"id\": 598458199574970368, \"id_str\": \"598458199574970368\"}"
let data = str.dataUsingEncoding(NSUTF8StringEncoding)!
let json = try? JSON(data: data)
XCTAssertEqual(try json!.string("id_str"), "598458199574970368")

There is currently no way for Freddy to parse a JSON contains a big integer, even we didn't use it.

IMHO the "parse as string" workaround is not necessary, if people need big integer in JSON, they probably using string value already. A Integer value is just not a String value.

@jeremy-w
Copy link
Member

@siuying There is actually: Use the configurable parser support to use Apple's parser if that's what you need do, as described here. I'll edit the issue description at the top to include the workaround.

@jeremy-w jeremy-w changed the title Large integers cause overflows Large integers cause overflows, especially on 32-bit hosts Apr 26, 2016
@siuying
Copy link

siuying commented Apr 26, 2016

@jeremy-w thanks for the pointer! I like Freddy, and I'll have to support parsing file with big integer, this definitely give me best of both worlds.

@tylerarnold
Copy link

tylerarnold commented Apr 26, 2016

so is there actually a way to currently parse big integers on 32-bit ARM even using the NSONSerialization workaround?
I get the following when I run the following unit test in an iPhone 5 simulator :
screen shot 2016-04-26 at 1 10 37 pm

@jeremy-w
Copy link
Member

jeremy-w commented Apr 26, 2016

You're right – that isn't working as a workaround because of the JSON building. It avoids the crash, but mangles the value:

we got us an nsnumber: 1466719200000 - type is 4, float type? false

That's .SInt64. The cases in NSJSONSerialization.makeJSON(object:) assume that if it ain't a char and ain't floating, then it fits in an Int, which is wrong on 32-bit platforms.

ETA: If your JSON has 1466719200000.0, though, it does work. So if you represent large floats as floats rather than integers, you should be fine.

@tylerarnold
Copy link

unfortunately I do not have control over that JSON format, I guess I could do some temporary hackery to force all the date formats to float before handing them off to the parser for now or try to fix this locally.

@jeremy-w
Copy link
Member

I've brought the 32-bit issue up on #152, so we should be able to fix this along with the other overflow issues.

@kas-kad
Copy link

kas-kad commented Aug 21, 2016

Due to this issue, In our project, we have to switch from Freddy to SwiftyJSON. Regrettably, you don't have case Int64 or NSNumber in your JSON.

@ericmaxwell2003
Copy link

ericmaxwell2003 commented Nov 10, 2016

Freddy turns Int and Double types into NSNumber as part of the serialization process. Couldn't the Freddy guys just add NSNumber as a type in the JSON enum and then return the raw value when serializing as shown below? Or am I missing something..

private func toJSONSerializationValue() -> Any {
        switch self {
       . . . 
        case .NSNumber(let nsNum) 
            return nsNum
        case .double(let num):
            return NSNumber(value: num)
        case .int(let int):
            return NSNumber(value: int)
       . . .
        }

@jeremy-w
Copy link
Member

jeremy-w commented Nov 14, 2016

NSNumber is not a bignum implementation, so the problem reappears on 64-bit platforms once you overflow the 64-bit range, and NSNumber still does not resolve the issue. It would add another layer of wrapping and clumsiness, but not solve the underlying issue.

@ericmaxwell2003
Copy link

Gotcha. So NSNumber would solve the problem of using Freddy to serialize/deserialize Int64s on a on a 32 bit platform but not serializing/deserializing integers larger than 64 bits on a 64 bit platform.

Could you just explicitly add support for Int64? Or Int64 and Int32?

@jgallagher
Copy link
Contributor

We haven't seen a way to add support for the sized integers without making the common case (Int) more painful/onerous. The stringification solution is far from perfect, but lets Freddy not explode on big ints and not lose information.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.