New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
changefeedccl: Tolerate a lack of trailing zeroes when encoding decimal columns #59075
Conversation
9032cd5
to
97a3702
Compare
pkg/ccl/changefeedccl/avro.go
Outdated
// If the decimal happens to fit a smaller width than the | ||
// column allows, add trailing zeroes so the scale is constant | ||
if colDesc.Type.Width() > -dec.Exponent { | ||
tree.DecimalCtx.WithPrecision(uint32(prec)).Quantize(&dec, &dec, -int32(width)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably want to check the return values of Quantize
?
Also, Quantize
seems to offer the possibility of rounding – can we guarantee that doesn’t happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WithPrecision
(and the column definition) should be guaranteeing that doesn't happen. Added a return value check just in case. Thanks for reviewing!
97a3702
to
6041ecb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB and @shermanCRL)
pkg/ccl/changefeedccl/avro_test.go, line 485 at r2 (raw file):
{sqlType: `DECIMAL(12,2)`, sql: `1e2`, avro: `{"bytes.decimal":"'\u0010"}`},
perhaps fe more test cases? maybe w/ negative exponent?
pkg/ccl/changefeedccl/avro_test.go, line 488 at r2 (raw file):
} for _, test := range truncs {
isn't it more common to have truncs test cases defined outside of the t.Run...
truncs := ...
for i, test := range truncs{
t.Run(fmt.Sprintf("lossless-%d", i), func(....))
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB and @shermanCRL)
pkg/ccl/changefeedccl/avro.go, line 315 at r2 (raw file):
// If the decimal happens to fit a smaller width than the // column allows, add trailing zeroes so the scale is constant if colDesc.Type.Width() > -dec.Exponent {
Curious: Why are we checking for negative exponent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dismissed @shermanCRL from a discussion.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB, @miretskiy, and @shermanCRL)
pkg/ccl/changefeedccl/avro.go, line 316 at r1 (raw file):
Previously, HonoreDB (Aaron Zinger) wrote…
WithPrecision
(and the column definition) should be guaranteeing that doesn't happen. Added a return value check just in case. Thanks for reviewing!
Done.
pkg/ccl/changefeedccl/avro.go, line 315 at r2 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
Curious: Why are we checking for negative exponent?
In order to honor a column width of greater than 0, you need digits after the decimal point, and the only way to get those is with a negative exponent. For example if you need a width of two, you need 600 to be rendered as 600.00, so it needs to be 60000e-2 rather than 6e+2.
pkg/ccl/changefeedccl/avro_test.go, line 488 at r2 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
isn't it more common to have truncs test cases defined outside of the t.Run...
truncs := ... for i, test := range truncs{ t.Run(fmt.Sprintf("lossless-%d", i), func(....)) }
Maybe in general, but I'm just copying the style in this file, e.g.
cockroach/pkg/ccl/changefeedccl/avro_test.go
Lines 348 to 349 in 6041ecb
t.Run("value_goldens", func(t *testing.T) { | |
goldens := []struct { |
…al columns Our avro encoder errors out when given a decimal with a fixed width that normalizes to a different width. This does not work with how we currently store and retrieve decimal data, so when we know we've hit this case, normalize to the stored width before encoding. Release note (bug fix): Changefeeds no longer error with "1e2 is not roundtrippable at scale 2" when 100 is stored in a column with width 2.
6041ecb
to
0cd693a
Compare
pkg/ccl/changefeedccl/avro_test.go, line 485 at r2 (raw file): Previously, miretskiy (Yevgeniy Miretskiy) wrote…
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB and @shermanCRL)
pkg/ccl/changefeedccl/avro.go, line 315 at r2 (raw file):
Previously, HonoreDB (Aaron Zinger) wrote…
In order to honor a column width of greater than 0, you need digits after the decimal point, and the only way to get those is with a negative exponent. For example if you need a width of two, you need 600 to be rendered as 600.00, so it needs to be 60000e-2 rather than 6e+2.
Argh... Why, oh why did we choose to use width instead of scale?
Can the dec.Exponent be negative?
The confusing part about this if statement (and the comment) has to do with the fact that we have a datum (which is an arbitrary precision decimal), and avro -- which may or may not have same scale/precision...
pkg/ccl/changefeedccl/avro.go, line 315 at r2 (raw file): Previously, miretskiy (Yevgeniy Miretskiy) wrote…
Yeah, I don't know why that discrepancy since it's our decimal library. In a bug-free world, the Avro datatype has been mapped directly from the column datatype, and the datum being encoded comes from that column, so the scale, precision, width, whatever should all be identical. That means that dec.Exponent should always be negative, and in fact equal to negative the column's width, as that's the only way to encode a fixed-width decimal. 600 should have been encoded as a decimal with coefficient (600 * 10^width) and exponent |
See also #59489 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy and @shermanCRL)
bors r=[miretskiy] |
Build failed (retrying...): |
Bors run got messed up by a bad merge of #59407. Resubmitting on your behalf. bors r=[miretskiy] |
Build succeeded: |
Our avro encoder errors out when given a decimal with a fixed width
that normalizes to a different width. This does not work with how
we currently store and retrieve decimal data, so when we know we've
hit this case, normalize to the stored width before encoding.
Release note (bug fix): Changefeeds no longer error with
"1e2 is not roundtrippable at scale 2" when 100 is stored
in a column with width 2.
Closes #38058