Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Large integer comparisons and msgpack encoding #176

Merged
merged 2 commits into from Jan 30, 2024

Conversation

jbardin
Copy link
Contributor

@jbardin jbardin commented Jan 30, 2024

This is two related bugs merged into a single PR, but we can split this up if needed.

The first is that Equals uses the string formatting for number comparison, but the format depends on the precision of the big.Float which could be different even with equal numbers. While we rely on the formatting implementation for "fuzzy" matching of floats, integers can be compared exactly.

The related bug was in the compact floats implementation of msgpack. When setting UseCompactFloats(true), the msgpack encoder uses the expression float64(int64(n)) == n to check if the number con be represented by an int64. this can fail for some values on certain architectures, because the result of a failed conversion is undefined and system dependent.

If we are comparing 2 numbers that are even integers, compare them via
big.Int values rather than formatted strings. This avoids problems with
large integers stored in big.Float with different precisions which could
format differently.
There is a bug in the compact floats implementation, which could cause
some integer values stored in a float to encode as an incorrect value.
The result of the expression used to check for compact integer encoding
is `float64(int64(n)) == n`, which is undefined when conversion is
not exact.
Copy link
Collaborator

@apparentlymart apparentlymart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all makes sense to me. Thanks!

I'm going to merge this momentarily.

Comment on lines +35 to +38

// UseCompactFloats can fail on some platforms due to undefined behavior of
// float conversions
enc.UseCompactFloats(false)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note for the future, since I went and looked this all up to remind myself for review purposes...


UseCompactFloats here means "encode floats as integers when possible", which can potentially be productive because MessagePack has some specialized compact representations of small integers, whereas floats are always encoded as either 5 bytes (for float32) or 9 bytes (for float64).

That optimization seemed useful because cty.Number doesn't distinguish between integers and floats anyway -- it's just "the number type" -- and so we'd prefer to encode all numbers in the most compact form possible.

However, the problem arises in how the upstream library makes the decision about whether to do it. It uses float64(int64(n)) == n (where n is float64), assuming that this will return true if and only if n is a float representation of an integer. That doesn't hold true on all architectures, which is a specific allowance made in the current Go specification:

In all non-constant conversions involving floating-point or complex values, if the result type cannot represent the value the conversion succeeds but the result value is implementation-dependent.

It turns out that on some architectures this causes the expression above to return true even when n is not an integer if n is too large to fit into int64. That then causes the cty MessagePack encoding to silently lose precision when round-tripping.


This change doesn't actually cause any serialization bloat for valid cases in practice anyway, because the code in this package that calls into enc methods has its own special case for treating whole numbers as integers, using the math/big.Float.Int64 definition instead of relying on the CPU's built-in type conversion rules:

bf := val.AsBigFloat()
if iv, acc := bf.Int64(); acc == big.Exact {
err = enc.EncodeInt(iv)
} else if fv, acc := bf.Float64(); acc == big.Exact {
err = enc.EncodeFloat64(fv)
} else {
err = enc.EncodeString(bf.Text('f', -1))
}

Because we're still setting UseCompactInts, enc.EncodeInt will still select the most compact integer representation. But now when we call enc.EncodeFloat64 we will definitely always get the 9-byte MessagePack float64 representation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants