-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve MarkDistinctHash #8967
Improve MarkDistinctHash #8967
Conversation
could the block builder take care of this? |
@findepi - I thought about it, and I'm not sure that complicating the general purpose usage of the block builder with a "no nulls" version handling is a win in general use cases. I looked around for other instances of "known to have no nulls" boolean blocks and the only ones I found were in the ORC reader and those were already just building |
i thought some block builders already do this, but maybe i misremember. |
Most blockbuilders (although not quite all, last time I checked) special case the null handling in that they will set |
@pettyjamesm do you have some perf numbers? |
assertTrue(builderBlock instanceof ByteArrayBlock); | ||
assertBlockEquals(BOOLEAN, wrappedBlock, builderBlock); | ||
// the wrapping instance does not copy the byte array defensively | ||
assertTrue(BOOLEAN.getBoolean(wrappedBlock, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AssertJ assertions would look much better here as you are actually comparing 0/1 byte values, not logical ones.
} | ||
|
||
@VisibleForTesting | ||
public int getCapacity() | ||
{ | ||
return groupByHash.getCapacity(); | ||
} | ||
|
||
private Block processNextGroupIds(GroupByIdBlock ids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The extraction itself might be a distinct commit. This way it is easier to compare the actual changes
core/trino-main/src/main/java/io/trino/operator/MarkDistinctHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/operator/TestMarkDistinctOperator.java
Outdated
Show resolved
Hide resolved
} | ||
byte[] distinctMask = new byte[positions]; | ||
for (int position = 0; position < distinctMask.length; position++) { | ||
if (ids.getGroupId(position) == nextDistinctId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if branchless version could perform better here (Consider benchmarking it).
Something like:
int distinctMaskValue = ids.getGroupId(position) == nextDistinctId ? 1 : 0;
distinctMask[position] = distinctMaskValue;
nextDistinctId += distinctMaskValue;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about doing that here, but I chose not to for a couple reasons:
- There are no existing benchmarks to test
MarkDistinctHash
performance (which is itself, hard to isolate from the work of theGroupByHash
). This also answers @sopel39's question- I don't have good benchmark numbers for this change - I don't think that the branchless (ie:
cmov
version of this code) is actually likely to be better than the branching version in practice. This is a hard to prove fact, but I suspect that the control flow here is largely predictable in most query workloads where either most inputs are or are not distinct (ie: most of the time, > 75% of branches will go in the same direction). We could contrive a benchmark that shows whatever we want in that regard, but I'm not sure it would hold for real workloads and so I chose to leave the general code shape as-is in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that the branchless (ie: cmov version of this code) is actually likely to be better than the branching version in practice.
We've seen branch reduction to have big impact actually. You might want to investigate it int follow-up work (with different distinct ratios).
ccee742
to
2a9911a
Compare
Adds a contract method to BooleanType that indicates that callers can legally choose to construct BooleanType blocks from byte[] values directly. Adding this method for bypassing BlockBuilders can be significant for BooleanType in particular, because the per-row overhead of ByteArrayBlockBuilder#valueIsNull is the same as for the values themselves, meaning each builder row is 2x larger when no values can be null. Also adds a similar utility to create single boolean value blocks directly from BooleanType.
Improves MarkDistinctHash by: - Detecting when no positions or all positions of the input are distinct and emitting a RunLengthEncoded block for the output mask - Building BooleanType mask blocks directly from byte[] instead of via the BlockBuilder, reducing allocation overhead per built mask by 1/2 since we know in advance no nulls will be present in the created block
2a9911a
to
3875070
Compare
Anything else I need to address on this one before it can be merged? |
} | ||
byte[] distinctMask = new byte[positions]; | ||
for (int position = 0; position < distinctMask.length; position++) { | ||
if (ids.getGroupId(position) == nextDistinctId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that the branchless (ie: cmov version of this code) is actually likely to be better than the branching version in practice.
We've seen branch reduction to have big impact actually. You might want to investigate it int follow-up work (with different distinct ratios).
* encoding changes such that {@link ByteArrayBlock} is not always a valid or efficient representation, then this method must be | ||
* removed and any usages changed | ||
*/ | ||
public static Block wrapByteArrayAsBooleanBlockWithoutNulls(byte[] booleansAsBytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we probably should have similar methods for other types and have readers use them. cc @skrzypo987
Improves performance of MarkDistinctHash (and therefore, MarkDistinctOperator) by:
BooleanType
mask blocks directly frombyte[]
instead of via aBlockBuilder
, reducing allocation overhead per built mask by 1/2 since we know in advance no nulls will be present in the created block