Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve MarkDistinctHash #8967

Merged
merged 2 commits into from
Sep 6, 2021

Conversation

pettyjamesm
Copy link
Member

Improves performance of MarkDistinctHash (and therefore, MarkDistinctOperator) by:

  • Detecting when no positions or all positions of the input are distinct and emitting a RunLengthEncoded block for the output mask (so long as the input position count is > 1).
  • Building BooleanType mask blocks directly from byte[] instead of via a BlockBuilder, reducing allocation overhead per built mask by 1/2 since we know in advance no nulls will be present in the created block

@findepi
Copy link
Member

findepi commented Aug 26, 2021

  • Building BooleanType mask blocks directly from byte[] instead of via a BlockBuilder, reducing allocation overhead per built mask by 1/2 since we know in advance no nulls will be present in the created block

could the block builder take care of this?

@pettyjamesm
Copy link
Member Author

pettyjamesm commented Aug 26, 2021

could the block builder take care of this?

@findepi - I thought about it, and I'm not sure that complicating the general purpose usage of the block builder with a "no nulls" version handling is a win in general use cases. I looked around for other instances of "known to have no nulls" boolean blocks and the only ones I found were in the ORC reader and those were already just building ByteArrayBlock instances directly. This seemed safer, although we could consider dropping the static helper methods and using ByteArrayBlock directly here too? Adding the explicit contract seemed better for longer term maintainability, but either is fine.

@findepi
Copy link
Member

findepi commented Aug 26, 2021

complicating the general purpose usage of the block builder with a "no nulls" version handling is a win in general use cases.

i thought some block builders already do this, but maybe i misremember.

@pettyjamesm
Copy link
Member Author

pettyjamesm commented Aug 26, 2021

i thought some block builders already do this, but maybe i misremember.

Most blockbuilders (although not quite all, last time I checked) special case the null handling in that they will set valueIsNull = null on the block output when no nulls were present in the builder- but none I've seen avoid initializing a boolean valueIsNull[] array within the builder which is what this change avoids by bypassing the builder entirely.

@findepi findepi requested review from sopel39, skrzypo987 and lukasz-stec and removed request for findepi September 1, 2021 09:22
@sopel39
Copy link
Member

sopel39 commented Sep 1, 2021

@pettyjamesm do you have some perf numbers?

assertTrue(builderBlock instanceof ByteArrayBlock);
assertBlockEquals(BOOLEAN, wrappedBlock, builderBlock);
// the wrapping instance does not copy the byte array defensively
assertTrue(BOOLEAN.getBoolean(wrappedBlock, 0));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AssertJ assertions would look much better here as you are actually comparing 0/1 byte values, not logical ones.

}

@VisibleForTesting
public int getCapacity()
{
return groupByHash.getCapacity();
}

private Block processNextGroupIds(GroupByIdBlock ids)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extraction itself might be a distinct commit. This way it is easier to compare the actual changes

}
byte[] distinctMask = new byte[positions];
for (int position = 0; position < distinctMask.length; position++) {
if (ids.getGroupId(position) == nextDistinctId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if branchless version could perform better here (Consider benchmarking it).
Something like:

int distinctMaskValue = ids.getGroupId(position) == nextDistinctId ? 1 : 0;
distinctMask[position] = distinctMaskValue;
nextDistinctId += distinctMaskValue;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about doing that here, but I chose not to for a couple reasons:

  1. There are no existing benchmarks to test MarkDistinctHash performance (which is itself, hard to isolate from the work of the GroupByHash). This also answers @sopel39's question- I don't have good benchmark numbers for this change
  2. I don't think that the branchless (ie: cmov version of this code) is actually likely to be better than the branching version in practice. This is a hard to prove fact, but I suspect that the control flow here is largely predictable in most query workloads where either most inputs are or are not distinct (ie: most of the time, > 75% of branches will go in the same direction). We could contrive a benchmark that shows whatever we want in that regard, but I'm not sure it would hold for real workloads and so I chose to leave the general code shape as-is in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the branchless (ie: cmov version of this code) is actually likely to be better than the branching version in practice.

We've seen branch reduction to have big impact actually. You might want to investigate it int follow-up work (with different distinct ratios).

@pettyjamesm pettyjamesm force-pushed the improve-mark-distinct-hash branch 2 times, most recently from ccee742 to 2a9911a Compare September 1, 2021 14:14
Adds a contract method to BooleanType that indicates that callers
can legally choose to construct BooleanType blocks from byte[]
values directly. Adding this method for bypassing BlockBuilders
can be significant for BooleanType in particular, because the
per-row overhead of ByteArrayBlockBuilder#valueIsNull is the same
as for the values themselves, meaning each builder row is 2x larger
when no values can be null.

Also adds a similar utility to create single boolean value blocks
directly from BooleanType.
Improves MarkDistinctHash by:
- Detecting when no positions or all positions of the input are
  distinct and emitting a RunLengthEncoded block for the output
  mask
- Building BooleanType mask blocks directly from byte[] instead
  of via the BlockBuilder, reducing allocation overhead per built
  mask by 1/2 since we know in advance no nulls will be present
  in the created block
@pettyjamesm
Copy link
Member Author

Anything else I need to address on this one before it can be merged?

}
byte[] distinctMask = new byte[positions];
for (int position = 0; position < distinctMask.length; position++) {
if (ids.getGroupId(position) == nextDistinctId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the branchless (ie: cmov version of this code) is actually likely to be better than the branching version in practice.

We've seen branch reduction to have big impact actually. You might want to investigate it int follow-up work (with different distinct ratios).

* encoding changes such that {@link ByteArrayBlock} is not always a valid or efficient representation, then this method must be
* removed and any usages changed
*/
public static Block wrapByteArrayAsBooleanBlockWithoutNulls(byte[] booleansAsBytes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we probably should have similar methods for other types and have readers use them. cc @skrzypo987

@sopel39 sopel39 merged commit 15c9c53 into trinodb:master Sep 6, 2021
@sopel39 sopel39 mentioned this pull request Sep 6, 2021
13 tasks
@pettyjamesm pettyjamesm deleted the improve-mark-distinct-hash branch September 6, 2021 12:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants