release-20.2: colexec: make unordered distinct streaming-like #57643

yuzefovich · 2020-12-07T18:02:09Z

Backport 1/1 commits from #57579.

/cc @cockroachdb/release

Previously, when executing an unordered distinct, we would build the
whole hash table and consume the input source entirely before emitting
any output. This is suboptimal when the query has a limit, since
we're likely to reach the limit long before consuming the whole
input source.

This commit makes the unordered distinct more streaming-like: it builds
the hash table one batch at a time, and whenever a batch appends new
distinct tuples to the hash table, all of those tuples are emitted as
output.
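
To illustrate the idea, here is a minimal sketch of batch-at-a-time distinct with a limit on top. It is not the actual colexec code: `tuple`, `streamingDistinct`, `Next`, and `distinctWithLimit` are hypothetical names, and a plain Go map stands in for the vectorized hash table.

```go
package main

// tuple is a hypothetical two-column row. It is comparable, so it can be
// used directly as a map key.
type tuple struct {
	a int
	b string
}

// streamingDistinct is an illustrative stand-in for the unordered distinct
// operator (an assumption for this sketch, not the real implementation).
type streamingDistinct struct {
	input    [][]tuple          // input batches, in arrival order
	seen     map[tuple]struct{} // hash table, grown one batch at a time
	consumed int                // number of input batches read so far
}

func newStreamingDistinct(input [][]tuple) *streamingDistinct {
	return &streamingDistinct{input: input, seen: make(map[tuple]struct{})}
}

// Next reads input batches until one of them adds new distinct tuples to
// the hash table, then returns exactly those tuples. It returns nil once
// the input is exhausted.
func (d *streamingDistinct) Next() []tuple {
	for d.consumed < len(d.input) {
		batch := d.input[d.consumed]
		d.consumed++
		var out []tuple
		for _, t := range batch {
			if _, ok := d.seen[t]; !ok {
				d.seen[t] = struct{}{}
				out = append(out, t)
			}
		}
		if len(out) > 0 {
			return out
		}
	}
	return nil
}

// distinctWithLimit pulls from the distinct operator until limit rows are
// collected. It also reports how many input batches had to be read.
func distinctWithLimit(input [][]tuple, limit int) (rows []tuple, batchesRead int) {
	d := newStreamingDistinct(input)
	for len(rows) < limit {
		out := d.Next()
		if out == nil {
			break // input exhausted before the limit was reached
		}
		for _, t := range out {
			if len(rows) == limit {
				break
			}
			rows = append(rows, t)
		}
	}
	return rows, d.consumed
}
```

In this sketch, a limit that is satisfied by the first batch means the remaining batches are never read, whereas building the whole hash table up front would read all of them.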

Fixes: #57566.

Release note (performance improvement): Previously, when performing an
unordered DISTINCT operation via the vectorized execution engine,
CockroachDB would buffer up all tuples from the input, which is
suboptimal when the query has a LIMIT clause. This has now been fixed.
This behavior was introduced in 20.1. Note that the old row-by-row
engine doesn't have this issue.



asubiotto

LGTM

yuzefovich requested a review from asubiotto December 7, 2020 18:02

asubiotto approved these changes Dec 8, 2020


yuzefovich merged commit 97caaf6 into cockroachdb:release-20.2 Dec 8, 2020

yuzefovich deleted the backport20.2-57579 branch December 8, 2020 14:50
