
[multistage] [enhancement] Split row data block when row size is too large #9485

Merged
walterddr merged 3 commits into apache:master from 61yao:split_block
Oct 5, 2022

Conversation

@61yao (Contributor) commented Sep 29, 2022

Split the data block when its size is too large (exceeds the gRPC message limit).

Set the limit to 4MB for now.
The size is estimated from the row size in bytes.

Columnar block split is not supported for now.
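The row-based split described above can be sketched roughly as follows. This is an illustrative sketch based on the description, not the merged Pinot code; `BlockSplitSketch` and `splitRows` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockSplitSketch {
  // Illustrative constant mirroring the 4MB limit described above.
  static final int MAX_BLOCK_SIZE_BYTES = 4 * 1024 * 1024;

  /**
   * Splits rows into chunks whose estimated size (rows per chunk times the
   * fixed per-row size in bytes) stays under the block size limit.
   */
  public static List<List<Object[]>> splitRows(List<Object[]> rows, int rowSizeInBytes) {
    int numRowsPerChunk = MAX_BLOCK_SIZE_BYTES / rowSizeInBytes;
    if (numRowsPerChunk <= 0) {
      throw new IllegalStateException("Row size too large: " + rowSizeInBytes + " bytes");
    }
    List<List<Object[]>> chunks = new ArrayList<>();
    for (int i = 0; i < rows.size(); i += numRowsPerChunk) {
      chunks.add(rows.subList(i, Math.min(i + numRowsPerChunk, rows.size())));
    }
    return chunks;
  }
}
```

For example, with a 1,000,000-byte row size the 4MB limit yields 4 rows per chunk, so 10 rows would split into chunks of 4, 4, and 2.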

@61yao 61yao force-pushed the split_block branch 2 times, most recently from fc9bd66 to 5e25385 on October 3, 2022 22:44
@codecov-commenter commented Oct 4, 2022

Codecov Report

Merging #9485 (649a4b2) into master (189fa18) will decrease coverage by 4.96%.
The diff coverage is 67.08%.

@@             Coverage Diff              @@
##             master    #9485      +/-   ##
============================================
- Coverage     68.72%   63.76%   -4.97%     
- Complexity     4860     5189     +329     
============================================
  Files          1924     1873      -51     
  Lines        102425   100317    -2108     
  Branches      15542    15304     -238     
============================================
- Hits          70392    63966    -6426     
- Misses        26994    31645    +4651     
+ Partials       5039     4706     -333     
Flag Coverage Δ
integration1 ?
unittests1 67.28% <67.08%> (+<0.01%) ⬆️
unittests2 15.62% <35.40%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...perator/dociditerators/ScanBasedDocIdIterator.java 0.00% <0.00%> (ø)
.../predicate/NotEqualsPredicateEvaluatorFactory.java 54.46% <0.00%> (-25.35%) ⬇️
...lter/predicate/NotInPredicateEvaluatorFactory.java 63.38% <0.00%> (-14.98%) ⬇️
.../operator/filter/predicate/PredicateEvaluator.java 0.00% <0.00%> (-25.00%) ⬇️
...ocal/segment/index/datasource/EmptyDataSource.java 0.00% <0.00%> (ø)
...nt/local/startree/v2/store/StarTreeDataSource.java 37.50% <0.00%> (-2.50%) ⬇️
...not/segment/spi/datasource/DataSourceMetadata.java 100.00% <ø> (ø)
...va/org/apache/pinot/spi/utils/CommonConstants.java 24.65% <0.00%> (-0.35%) ⬇️
...ter/predicate/EqualsPredicateEvaluatorFactory.java 78.57% <12.50%> (-5.88%) ⬇️
.../filter/predicate/InPredicateEvaluatorFactory.java 69.92% <12.50%> (-12.48%) ⬇️
... and 418 more


@agavra (Contributor) left a comment:


Thanks @61yao! Mostly minor comments; the only one I'd ask you to address is the one about deserializing/reserializing the block.


Suggested change:
- public static void testOnlySetMaxBlockSize(int maxBlockSize) {
+ public static void testOnlySetMaxBlockSizeMB(int maxBlockSizeMB) {

It's a bit confusing whether this is a row count or a size, but looking at the code it appears to be a size.

Comment on lines 74 to 76

it could be nice to make this configurable, with a 4MB default

Comment on lines 64 to 65

should we fail here as well? I think it's confusing to have a function that doesn't do what it says.


nit: we can just move this down and let it fall through to the default case


This will deserialize the entire block just to re-serialize it again later in buildFromRows, which I suspect will be a significant performance regression for large blocks. Is there any way to do this split without going through the serde twice?


This is the first step; we will optimize this out later.


It's nice not to have randomness in tests; if an error happens due to a specific random sequence we won't be able to reproduce and debug it. Can we make this deterministic?


We can set a random seed, but for this particular test we are fine introducing some randomness.
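For reference, a fixed seed keeps such a test reproducible while still exercising varied inputs. This is a minimal sketch with hypothetical names (`SeededRandomSketch`, `randomRowSizes`), not code from the PR.

```java
import java.util.Random;

public class SeededRandomSketch {
  // Illustrative: a fixed (or logged) seed lets a failing run be replayed exactly.
  static final long SEED = 42L;

  /** Generates pseudo-random row sizes deterministically from the fixed seed. */
  public static int[] randomRowSizes(int numRows) {
    Random random = new Random(SEED);
    int[] sizes = new int[numRows];
    for (int i = 0; i < numRows; i++) {
      sizes[i] = 1 + random.nextInt(1024); // row sizes in [1, 1024] bytes
    }
    return sizes;
  }
}
```

Because `java.util.Random` produces the same sequence for the same seed, two calls to `randomRowSizes` always return identical arrays, so a failure can be reproduced.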

@walterddr (Contributor) commented Oct 5, 2022:


This reminds me... it would be a good idea to add a printout of the row itself for reproducibility.

Comment on lines 47 to 41

why are we excluding these types from our test?

@walterddr (Contributor) commented Oct 4, 2022:


Because these types are not supported in data blocks.

@walterddr (Contributor) left a comment:


LGTM. I pushed a minor cleanup on top and it should be good to go.

@siddharthteotia (Contributor) left a comment:


To confirm: metadata block / error block / EOS block are not split, right?

} else {
int rowSizeInBytes = ((RowDataBlock) block.getDataBlock()).getRowSizeInBytes();
int numRowsPerChunk = maxBlockSize / rowSizeInBytes;
Preconditions.checkState(numRowsPerChunk > 0, "row size too large for query engine to handle, abort!");

Can we include the offending rowSize in the message / exception?
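One way the check could carry the offending size is sketched below. The names follow the quoted snippet, but this uses a plain `IllegalStateException` instead of Guava's `Preconditions`, and `SplitCheckSketch` is a hypothetical class name, not the merged code.

```java
public class SplitCheckSketch {
  /** Computes rows per chunk, failing with a message that names the offending row size. */
  public static int numRowsPerChunk(int maxBlockSize, int rowSizeInBytes) {
    int numRowsPerChunk = maxBlockSize / rowSizeInBytes;
    if (numRowsPerChunk <= 0) {
      throw new IllegalStateException(String.format(
          "Row size too large for query engine to handle: %d bytes (max block size: %d bytes)",
          rowSizeInBytes, maxBlockSize));
    }
    return numRowsPerChunk;
  }
}
```

Including both the row size and the limit in the message lets an operator see immediately how far over the budget the offending row is.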

@walterddr walterddr merged commit 5a2ed96 into apache:master Oct 5, 2022
@siddharthteotia (Contributor) commented:

Maybe add a TODO to split the columnar block as well in the future.

@siddharthteotia (Contributor) commented:

> To confirm: metadata block / error block / EOS block are not split, right?

Confirmed offline. Only the row data block is split at this point.

@siddharthteotia siddharthteotia added the multi-stage Related to the multi-stage query engine label Oct 5, 2022
walterddr added a commit that referenced this pull request Oct 10, 2022
- clean up the CSVs into AVROs so that it loads much faster
- create integration test against H2
- rollback a change no longer necessary after #9485

However, severe thread thrashing is observed when the JVM's available core count is < 4; tracked in #9563.

Co-authored-by: Rong Rong <rongr@startree.ai>

Labels

multi-stage Related to the multi-stage query engine


5 participants