sql/distsqlrun: don't inspect encoding output during distinct #37901

Merged
merged 1 commit into cockroachdb:master from mjibson:group-array May 30, 2019

3 participants
@mjibson
Member

commented May 29, 2019

Early in the project's history, a commit taught DISTINCT about the
difference between filter columns and selected columns. That commit
added a check [1] so that the seen marker would only be added if the
encoded version of the column contained > 0 bytes. The commit doesn't
give a reason for the check, and it is unclear to me now why it was
added. (Note that I don't have experience in the distsql directories,
so I may be missing some history.) This file has been improved since
then, but the check remained.

That check caused a GROUP BY over two rows, each containing an empty
array, to not consider those arrays equal (again, because the seen
marker was never added). An experiment removing the check showed that
no existing tests failed as a result, and the new, previously failing
test now passes. I can't find any evidence that this check was
necessary, or why it was present in the first place. I conclude that it
is safe to remove until we find a counterexample.
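
To make the failure mode concrete, here is a minimal sketch of the behavior described above. It is not the distsqlrun code: `encodeKey`, `distinctRows`, and the `skipEmptyEncodings` flag are hypothetical names, and the sketch assumes an encoder that produces zero bytes for an empty array.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeKey is a hypothetical stand-in for the row encoder. It simply joins
// the array elements, so an empty array encodes to zero bytes (an assumption
// made for this illustration only).
func encodeKey(arr []string) string {
	return strings.Join(arr, ",")
}

// distinctRows emits each row whose encoded key has not been seen before.
// When skipEmptyEncodings is true it mimics the removed check: the seen
// marker is only recorded if the encoding contains > 0 bytes.
func distinctRows(rows [][]string, skipEmptyEncodings bool) [][]string {
	seen := make(map[string]struct{})
	var out [][]string
	for _, row := range rows {
		key := encodeKey(row)
		if _, ok := seen[key]; ok {
			continue // duplicate of an earlier row
		}
		if !skipEmptyEncodings || len(key) > 0 {
			seen[key] = struct{}{}
		}
		out = append(out, row)
	}
	return out
}

func main() {
	rows := [][]string{{}, {}} // two rows, each an empty array

	// With the check, the empty encoding is never marked as seen,
	// so both rows are emitted as if they were distinct.
	fmt.Println(len(distinctRows(rows, true))) // 2

	// Without the check, the second empty array is recognized as a duplicate.
	fmt.Println(len(distinctRows(rows, false))) // 1
}
```

Under these assumptions, the only difference between the two runs is whether a zero-length encoding is recorded in the seen set, which is the behavior this change removes.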

Fixes #37544

Release note (bug fix): Fix GROUP BY for empty arrays.

[1]: 965107f#diff-6a63b13f6fae0ef7417b27292db3f04aR130

@mjibson mjibson requested a review from asubiotto May 29, 2019

@mjibson mjibson requested review from cockroachdb/distsql-prs as code owners May 29, 2019

@cockroach-teamcity

Member

commented May 29, 2019

This change is Reviewable

@asubiotto
Contributor

left a comment

:lgtm:

Reviewed 2 of 2 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

@mjibson

Member Author

commented May 30, 2019

bors r+

craig bot pushed a commit that referenced this pull request May 30, 2019

Merge #37901
37901: sql/distsqlrun: don't inspect encoding output during distinct r=mjibson a=mjibson

Co-authored-by: Matt Jibson <matt.jibson@gmail.com>
@craig


commented May 30, 2019

Build succeeded

@craig craig bot merged commit 335955f into cockroachdb:master May 30, 2019

3 checks passed

GitHub CI (Cockroach) TeamCity build finished
bors Build succeeded
license/cla Contributor License Agreement is signed.

@mjibson mjibson deleted the mjibson:group-array branch May 30, 2019
