New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Large integer comparisons and msgpack encoding #176
Conversation
If we are comparing 2 numbers that are even integers, compare them via big.Int values rather than formatted strings. This avoids problems with large integers stored in big.Float with different precisions which could format differently.
There is a bug in the compact floats implementation, which could cause some integer values stored in a float to encode as an incorrect value. The result of the expression used to check for compact integer encoding is `float64(int64(n)) == n`, which is undefined when conversion is not exact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all makes sense to me. Thanks!
I'm going to merge this momentarily.
|
||
// UseCompactFloats can fail on some platforms due to undefined behavior of | ||
// float conversions | ||
enc.UseCompactFloats(false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note for the future, since I went and looked this all up to remind myself for review purposes...
UseCompactFloats
here means "encode floats as integers when possible", which can potentially be productive because MessagePack has some specialized compact representations of small integers, whereas floats are always encoded as either 5 bytes (for float32
) or 9 bytes (for float64
).
That optimization seemed useful because cty.Number
doesn't distinguish between integers and floats anyway -- it's just "the number type" -- and so we'd prefer to encode all numbers in the most compact form possible.
However, the problem arises in how the upstream library makes the decision about whether to do it. It uses float64(int64(n)) == n
(where n
is float64
), assuming that this will return true if and only if n
is a float representation of an integer. That doesn't hold true on all architectures, which is a specific allowance made in the current Go specification:
In all non-constant conversions involving floating-point or complex values, if the result type cannot represent the value the conversion succeeds but the result value is implementation-dependent.
It turns out that on some architectures this causes the expression above to return true
even when n
is not an integer if n
is too large to fit into int64
. That then causes the cty
MessagePack encoding to silently lose precision when round-tripping.
This change doesn't actually cause any serialization bloat for valid cases in practice anyway, because the code in this package that calls into enc
methods has its own special case for treating whole numbers as integers, using the math/big.Float.Int64
definition instead of relying on the CPU's built-in type conversion rules:
Lines 86 to 93 in 1348b69
bf := val.AsBigFloat() | |
if iv, acc := bf.Int64(); acc == big.Exact { | |
err = enc.EncodeInt(iv) | |
} else if fv, acc := bf.Float64(); acc == big.Exact { | |
err = enc.EncodeFloat64(fv) | |
} else { | |
err = enc.EncodeString(bf.Text('f', -1)) | |
} |
Because we're still setting UseCompactInts
, enc.EncodeInt
will still select the most compact integer representation. But now when we call enc.EncodeFloat64
we will definitely always get the 9-byte MessagePack float64 representation.
This is two related bugs merged into a single PR, but we can split this up if needed.
The first is that
Equals
uses the string formatting for number comparison, but the format depends on the precision of thebig.Float
which could be different even with equal numbers. While we rely on the formatting implementation for "fuzzy" matching of floats, integers can be compared exactly.The related bug was in the compact floats implementation of msgpack. When setting
UseCompactFloats(true)
, the msgpack encoder uses the expressionfloat64(int64(n)) == n
to check if the number con be represented by anint64
. this can fail for some values on certain architectures, because the result of a failed conversion is undefined and system dependent.