Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changefeedccl: Tolerate a lack of trailing zeroes when encoding decimal columns #59075

Merged
merged 1 commit into from Jan 28, 2021

Conversation

HonoreDB
Copy link
Contributor

Our avro encoder errors out when given a decimal with a fixed width
that normalizes to a different width. This does not work with how
we currently store and retrieve decimal data, so when we know we've
hit this case, normalize to the stored width before encoding.

Release note (bug fix): Changefeeds no longer error with
"1e2 is not roundtrippable at scale 2" when 100 is stored
in a column with width 2.

Closes #38058

@HonoreDB HonoreDB requested review from miretskiy and a team January 15, 2021 23:01
@cockroach-teamcity
Copy link
Member

This change is Reviewable

// If the decimal happens to fit a smaller width than the
// column allows, add trailing zeroes so the scale is constant
if colDesc.Type.Width() > -dec.Exponent {
tree.DecimalCtx.WithPrecision(uint32(prec)).Quantize(&dec, &dec, -int32(width))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably want to check the return values of Quantize?

Also, Quantize seems to offer the possibility of rounding – can we guarantee that doesn’t happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WithPrecision (and the column definition) should be guaranteeing that doesn't happen. Added a return value check just in case. Thanks for reviewing!

Copy link
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB and @shermanCRL)


pkg/ccl/changefeedccl/avro_test.go, line 485 at r2 (raw file):

			{sqlType: `DECIMAL(12,2)`,
				sql:  `1e2`,
				avro: `{"bytes.decimal":"'\u0010"}`},

perhaps fe more test cases? maybe w/ negative exponent?


pkg/ccl/changefeedccl/avro_test.go, line 488 at r2 (raw file):

		}

		for _, test := range truncs {

isn't it more common to have truncs test cases defined outside of the t.Run...

truncs := ...
for i, test := range truncs{
   t.Run(fmt.Sprintf("lossless-%d", i), func(....))
}

Copy link
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB and @shermanCRL)


pkg/ccl/changefeedccl/avro.go, line 315 at r2 (raw file):

			// If the decimal happens to fit a smaller width than the
			// column allows, add trailing zeroes so the scale is constant
			if colDesc.Type.Width() > -dec.Exponent {

Curious: Why are we checking for negative exponent?

Copy link
Contributor Author

@HonoreDB HonoreDB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dismissed @shermanCRL from a discussion.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB, @miretskiy, and @shermanCRL)


pkg/ccl/changefeedccl/avro.go, line 316 at r1 (raw file):

Previously, HonoreDB (Aaron Zinger) wrote…

WithPrecision (and the column definition) should be guaranteeing that doesn't happen. Added a return value check just in case. Thanks for reviewing!

Done.


pkg/ccl/changefeedccl/avro.go, line 315 at r2 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Curious: Why are we checking for negative exponent?

In order to honor a column width of greater than 0, you need digits after the decimal point, and the only way to get those is with a negative exponent. For example if you need a width of two, you need 600 to be rendered as 600.00, so it needs to be 60000e-2 rather than 6e+2.


pkg/ccl/changefeedccl/avro_test.go, line 488 at r2 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

isn't it more common to have truncs test cases defined outside of the t.Run...

truncs := ...
for i, test := range truncs{
   t.Run(fmt.Sprintf("lossless-%d", i), func(....))
}

Maybe in general, but I'm just copying the style in this file, e.g.

t.Run("value_goldens", func(t *testing.T) {
goldens := []struct {

…al columns

Our avro encoder errors out when given a decimal with a fixed width
that normalizes to a different width. This does not work with how
we currently store and retrieve decimal data, so when we know we've
hit this case, normalize to the stored width before encoding.

Release note (bug fix): Changefeeds no longer error with
"1e2 is not roundtrippable at scale 2" when 100 is stored
in a column with width 2.
@HonoreDB
Copy link
Contributor Author


pkg/ccl/changefeedccl/avro_test.go, line 485 at r2 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

perhaps fe more test cases? maybe w/ negative exponent?

Done.

Copy link
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB and @shermanCRL)


pkg/ccl/changefeedccl/avro.go, line 315 at r2 (raw file):

Previously, HonoreDB (Aaron Zinger) wrote…

In order to honor a column width of greater than 0, you need digits after the decimal point, and the only way to get those is with a negative exponent. For example if you need a width of two, you need 600 to be rendered as 600.00, so it needs to be 60000e-2 rather than 6e+2.

Argh... Why, oh why did we choose to use width instead of scale?

Can the dec.Exponent be negative?

The confusing part about this if statement (and the comment) has to do with the fact that we have a datum (which is an arbitrary precision decimal), and avro -- which may or may not have same scale/precision...

@HonoreDB
Copy link
Contributor Author


pkg/ccl/changefeedccl/avro.go, line 315 at r2 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Argh... Why, oh why did we choose to use width instead of scale?

Can the dec.Exponent be negative?

The confusing part about this if statement (and the comment) has to do with the fact that we have a datum (which is an arbitrary precision decimal), and avro -- which may or may not have same scale/precision...

Yeah, I don't know why that discrepancy since it's our decimal library.

In a bug-free world, the Avro datatype has been mapped directly from the column datatype, and the datum being encoded comes from that column, so the scale, precision, width, whatever should all be identical. That means that dec.Exponent should always be negative, and in fact equal to negative the column's width, as that's the only way to encode a fixed-width decimal. 600 should have been encoded as a decimal with coefficient (600 * 10^width) and exponent -width. And this in fact usually happens, just not when the upstream bug occurs.

@HonoreDB
Copy link
Contributor Author

See also #59489

@miretskiy miretskiy self-requested a review January 28, 2021 17:11
Copy link
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy and @shermanCRL)

@HonoreDB
Copy link
Contributor Author

bors r=[miretskiy]

@craig
Copy link
Contributor

craig bot commented Jan 28, 2021

Build failed (retrying...):

@ajstorm
Copy link
Collaborator

ajstorm commented Jan 28, 2021

Bors run got messed up by a bad merge of #59407. Resubmitting on your behalf.

bors r=[miretskiy]

@craig
Copy link
Contributor

craig bot commented Jan 28, 2021

Build succeeded:

@craig craig bot merged commit b5f4391 into cockroachdb:master Jan 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cdc: Avro number roundtrip error on changefeed creation
5 participants