release-20.1: colexec: fix performance inefficiency in materializer #48732

Merged
merged 1 commit into from
May 13, 2020

yuzefovich

Backport 1/2 commits from #48669.

/cc @cockroachdb/release


colexec: fix performance inefficiency in materializer

We were mistakenly passing sqlbase.DatumAlloc by value rather than by
pointer. As a result, we always allocated 16 datums but used only 1:
not only were we not pooling the allocations, we were also making many
useless allocations.

This inefficiency becomes noticeable when a vectorized query returns
many rows and when wrapped processors receive many input rows, that is,
in all cases where we need to materialize a lot. For example, with this
fix TPC-H query 16 sees about a 10% improvement (it returns 18k rows)
and TPC-DS query 6 sees a 2x improvement (it has a wrapped hash
aggregator with a decimal column).

Release note (performance improvement): A performance inefficiency in
the vectorized execution engine has been fixed, resulting in speedups
on all queries run via the vectorized engine, with the most noticeable
gains on queries that output many rows.
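
To illustrate the class of bug described in the commit message, here is a minimal, self-contained sketch. It does not reproduce the real `sqlbase.DatumAlloc` or the materializer code: `datum`, `datumAlloc`, `convertByValue`, `convertByPointer`, and `countBatches` are hypothetical names, and the batch size of 16 is taken from the commit message. The assumption is that the allocator hands out values from a pre-allocated batch and refills the batch when it runs out, so a copy passed by value never reports its refill back to the caller.

```go
package main

import "fmt"

// datum stands in for a SQL value; it is a hypothetical type for this sketch.
type datum struct{ v int64 }

// batchSize mirrors the "16 datums" figure from the commit message.
const batchSize = 16

// datumAlloc is a hypothetical batching allocator: it allocates batchSize
// datums at once and hands them out one at a time.
type datumAlloc struct {
	buf     []datum
	batches *int // shared counter of batch allocations, for illustration only
}

// newDatum returns a pointer to a fresh datum, refilling the batch when empty.
func (a *datumAlloc) newDatum(v int64) *datum {
	if len(a.buf) == 0 {
		a.buf = make([]datum, batchSize)
		*a.batches++
	}
	d := &a.buf[0]
	a.buf = a.buf[1:]
	d.v = v
	return d
}

// convertByValue receives a copy of the allocator, so the refilled batch is
// stored in the copy and lost when the function returns.
func convertByValue(a datumAlloc, v int64) *datum { return a.newDatum(v) }

// convertByPointer shares the caller's allocator, so the batch is reused.
func convertByPointer(a *datumAlloc, v int64) *datum { return a.newDatum(v) }

// countBatches converts the given number of rows and reports how many
// batches of batchSize datums were allocated along the way.
func countBatches(rows int, byPointer bool) int {
	batches := 0
	alloc := datumAlloc{batches: &batches}
	for i := 0; i < rows; i++ {
		if byPointer {
			convertByPointer(&alloc, int64(i))
		} else {
			convertByValue(alloc, int64(i))
		}
	}
	return batches
}

func main() {
	fmt.Println(countBatches(1000, false)) // by value: 1000 batches, one per row
	fmt.Println(countBatches(1000, true))  // by pointer: 63 batches, one per 16 rows
}
```

In this sketch the by-value variant allocates a full batch for every row and uses a single slot from it, which matches the "allocating 16 datums but using only 1" behavior described above; passing a pointer restores the intended pooling.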

@yuzefovich yuzefovich requested a review from asubiotto May 12, 2020 17:48
@cockroach-teamcity

This change is Reviewable

@yuzefovich yuzefovich merged commit a0d8007 into cockroachdb:release-20.1 May 13, 2020
@yuzefovich yuzefovich deleted the backport20.1-48669 branch May 13, 2020 14:30