New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
norm: do not fold floor division with unit denominator #87808
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice fix!
Could we make an exception for the case when the left argument is an integer? Since there wouldn't be any precision issues in that case. Also, can you add a test for //
with both inputs constant to the FoldBinary
tests to make sure it's still folded?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner)
a4f5062
to
6e24247
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 3 files at r1, 4 of 4 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @msirek)
pkg/sql/opt/norm/general_funcs.go
line 69 at r2 (raw file):
// IsIntFamily returns true if the given scalar expression is of one of the // integer types. func (c *CustomFuncs) IsIntFamily(scalar opt.ScalarExpr) bool {
nit: I'd just name this IsInt
to match the other functions here
pkg/sql/opt/norm/rules/numeric.opt
line 51 at r2 (raw file):
(Cast $left (BinaryType (OpName) $left $right)) # FoldFloorDivOne folds $left / 1 for integer types.
nit: /
=> //
pkg/sql/logictest/testdata/logic_test/operator
line 52 at r2 (raw file):
statement ok CREATE TABLE table2 (col2 FLOAT8 NULL)
nit: it's common to name tables for regression tests after issue numbers, like t87605
, which ensures they'll never conflict with other tables in the file.
pkg/sql/logictest/testdata/logic_test/operator
line 63 at r2 (raw file):
---- 1.234567e+06 1.2345678901234e+13
I think pkg/sql/logictest/testdata/logic_test/float
would be a better place for this test.
I think we should probably backport this to 21.2, 22.1, and 22.2.1 after a baking period. I've added the labels. |
6e24247
to
51f987a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
pkg/sql/opt/norm/general_funcs.go
line 69 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: I'd just name this
IsInt
to match the other functions here
Done
pkg/sql/opt/norm/rules/numeric.opt
line 51 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit:
/
=>//
Done
pkg/sql/logictest/testdata/logic_test/operator
line 52 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: it's common to name tables for regression tests after issue numbers, like
t87605
, which ensures they'll never conflict with other tables in the file.
renamed
pkg/sql/logictest/testdata/logic_test/operator
line 63 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I think
pkg/sql/logictest/testdata/logic_test/float
would be a better place for this test.
moved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 4 files at r2, 5 of 5 files at r3, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should probably backport this to 21.2, 22.1, and 22.2.1 after a baking period. I've added the labels.
Does this fix all known occurences of #86790? If so, I think it's safe to backport immediately to silence the nightly failures.
Also, I can revert #87840 if you think it's safe to do so now.
Reviewed 5 of 5 files at r3, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
The floor division operator `//` is like division but drops the fractional portion of the result. Normalization rule FoldDivOne rewrites `(column // 1)` as `(column)` which is incorrect if the data in `column` has fractional digits, as those digits should be dropped. The solution is to only fold `(column // 1)` when `column` is one of the integer types. Release note (bug fix): This patch fixes incorrect results from the floor division operator, `//`, when the numerator is non-constant and the denominator is the constant 1.
51f987a
to
9c01a38
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't fix #86790. To do so we'd have to extend this fix to cover not just FloorDiv, but Div too.
If we're doing just a CAST, the exponent is set to zero:
cockroach/pkg/sql/sem/eval/cast.go
Lines 324 to 325 in 655ae28
case *tree.DInt: | |
dd.SetInt64(int64(*v)) |
If the division is done, we set the exponent to -1 here:
https://github.com/cockroachdb/apd/blob/e2030eb9d16dcd55cc79147ae1a94be1f3c2037e/context.go#L364
I think we could fix it by making sure the exponent is 0 when the remainder is 0 (by setting adjCoeffs
equal to adjExp10
).
This would impact many things outside just this case.
It is probably safer just to extend this current fix to also cover Div (only fold the Div if the right side is an integer).
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, skipping folding of Div if the right side is an integer would not be sufficient. We'd have to have a special CAST operation to decimal which always uses a scale of 1, no matter the final precision.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a benchmark for dividing int64 by constant int64(1). It is indeed slower than casting, so I guess we might not want to get rid of FoldDivOne, though I don't know how important it is to optimize divide by 1. I guess through constant folding we could have a constant expression which evaluates to 1, so maybe it's useful.
BenchmarkCastOp/useSel=true/hasNulls=true/int_to_decimal-32 123115 9526 ns/op 860.00 MB/s
BenchmarkCastOp/useSel=true/hasNulls=false/int_to_decimal-32 106650 10174 ns/op 805.15 MB/s
BenchmarkCastOp/useSel=false/hasNulls=true/int_to_decimal-32 67879 16494 ns/op 496.68 MB/s
BenchmarkCastOp/useSel=false/hasNulls=false/int_to_decimal-32 70771 16625 ns/op 492.75 MB/s
BenchmarkProjDivInt64ConstInt64Op/useSel=true/hasNulls=true/type=int-32 10000 101996 ns/op 80.32 MB/s
BenchmarkProjDivInt64ConstInt64Op/useSel=true/hasNulls=false/type=int-32 11337 109312 ns/op 74.94 MB/s
BenchmarkProjDivInt64ConstInt64Op/useSel=false/hasNulls=true/type=int-32 5575 189898 ns/op 43.14 MB/s
BenchmarkProjDivInt64ConstInt64Op/useSel=false/hasNulls=false/type=int-32 5844 202576 ns/op 40.44 MB/s
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball and @mgartner)
571616f
to
5d1b5f5
Compare
Yeah, it might seem silly that someone would write something like this, but besides the case you mentioned of constant-folding, it can also get generated by tools like ORMs that users don't really have much control over. |
Yeah, I wasn't thinking of hand-written queries having this. Definitely it's from either ORMs or other app-generated queries which are not smart enough to write better SQL (unless they're doing it for some other reason like formatting purposes). I was just hoping for an easy fix. If it's too tough for a quick fix, maybe I won't try to handle it in this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner, @msirek, and @yuzefovich)
pkg/sql/colexec/colexecbase/project_div_test.go
line 31 at r4 (raw file):
) func BenchmarkProjDivInt64ConstInt64Op(b *testing.B) {
I think colexecproj.BenchmarkProjOp
already does this, apart from using Decimal
as the input types. If the type is important, it's probably better to just modify it instead of adding a new benchmark.
5d1b5f5
to
9c01a38
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball, @mgartner, and @yuzefovich)
pkg/sql/colexec/colexecbase/project_div_test.go
line 31 at r4 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
I think
colexecproj.BenchmarkProjOp
already does this, apart from usingDecimal
as the input types. If the type is important, it's probably better to just modify it instead of adding a new benchmark.
Sure. I found this command seems to test the same thing:
./dev bench pkg/sql/colexec/colexecproj -f BenchmarkProjOp/projDivInt64Int64Const --cpus 1 --count 5 --verbose --ignore-cache
I've removed the second commit and will merge the original version soon.
The FoldDivOne issue is handled through #88485.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @mgartner, and @yuzefovich)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
bors r+
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @mgartner)
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 9c01a38 to blathers/backport-release-21.2-87808: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.2.x failed. See errors above. error creating merge commit from 9c01a38 to blathers/backport-release-22.1-87808: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.1.x failed. See errors above. error creating merge commit from 9c01a38 to blathers/backport-release-22.2-87808: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
@msirek can you please open these backports manually since Blathers failed? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale)
Fixes #87605
The floor division operator
//
is like division but drops thefractional portion of the result. Normalization rule FoldDivOne
rewrites
(column // 1)
as(column)
which is incorrect if thedata in
column
has fractional digits, as those digits should bedropped.
The solution is to only fold
(column // 1)
whencolumn
is one ofthe integer types.
Release note (bug fix): This patch fixes incorrect results from
the floor division operator,
//
, when the numerator is non-constantand the denominator is the constant 1.