
[multistage] [enhancement] Split row data block when row size is too large #9485

Merged
walterddr merged 3 commits into apache:master from 61yao:split_block
Oct 5, 2022

Conversation

@61yao (Contributor) commented Sep 29, 2022

Split the data block when its size is too large (exceeds the gRPC message limit).

Set the limit to 4MB for now.
The size is estimated from the row size in bytes.

Columnar block split is not supported for now.
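The row-based split described above can be sketched roughly as follows. This is an illustrative sketch based on the description, not the merged Pinot code; `BlockSplitSketch` and `splitRows` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockSplitSketch {
  // Illustrative constant mirroring the 4MB limit described above.
  static final int MAX_BLOCK_SIZE_BYTES = 4 * 1024 * 1024;

  /**
   * Splits rows into chunks whose estimated size (rows per chunk times the
   * fixed per-row size in bytes) stays under the block size limit.
   */
  public static List<List<Object[]>> splitRows(List<Object[]> rows, int rowSizeInBytes) {
    int numRowsPerChunk = MAX_BLOCK_SIZE_BYTES / rowSizeInBytes;
    if (numRowsPerChunk <= 0) {
      throw new IllegalStateException("Row size too large: " + rowSizeInBytes + " bytes");
    }
    List<List<Object[]>> chunks = new ArrayList<>();
    for (int i = 0; i < rows.size(); i += numRowsPerChunk) {
      chunks.add(rows.subList(i, Math.min(i + numRowsPerChunk, rows.size())));
    }
    return chunks;
  }
}
```

For example, with a 1,000,000-byte row size the 4MB limit yields 4 rows per chunk, so 10 rows would split into chunks of 4, 4, and 2.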

@61yao 61yao force-pushed the split_block branch 2 times, most recently from fc9bd66 to 5e25385 on October 3, 2022 22:44
@codecov-commenter commented Oct 4, 2022

Codecov Report

Merging #9485 (649a4b2) into master (189fa18) will decrease coverage by 4.96%.
The diff coverage is 67.08%.

@@             Coverage Diff              @@
##             master    #9485      +/-   ##
============================================
- Coverage     68.72%   63.76%   -4.97%     
- Complexity     4860     5189     +329     
============================================
  Files          1924     1873      -51     
  Lines        102425   100317    -2108     
  Branches      15542    15304     -238     
============================================
- Hits          70392    63966    -6426     
- Misses        26994    31645    +4651     
+ Partials       5039     4706     -333     
Flag Coverage Δ
integration1 ?
unittests1 67.28% <67.08%> (+<0.01%) ⬆️
unittests2 15.62% <35.40%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...perator/dociditerators/ScanBasedDocIdIterator.java 0.00% <0.00%> (ø)
.../predicate/NotEqualsPredicateEvaluatorFactory.java 54.46% <0.00%> (-25.35%) ⬇️
...lter/predicate/NotInPredicateEvaluatorFactory.java 63.38% <0.00%> (-14.98%) ⬇️
.../operator/filter/predicate/PredicateEvaluator.java 0.00% <0.00%> (-25.00%) ⬇️
...ocal/segment/index/datasource/EmptyDataSource.java 0.00% <0.00%> (ø)
...nt/local/startree/v2/store/StarTreeDataSource.java 37.50% <0.00%> (-2.50%) ⬇️
...not/segment/spi/datasource/DataSourceMetadata.java 100.00% <ø> (ø)
...va/org/apache/pinot/spi/utils/CommonConstants.java 24.65% <0.00%> (-0.35%) ⬇️
...ter/predicate/EqualsPredicateEvaluatorFactory.java 78.57% <12.50%> (-5.88%) ⬇️
.../filter/predicate/InPredicateEvaluatorFactory.java 69.92% <12.50%> (-12.48%) ⬇️
... and 418 more


@agavra (Contributor) left a comment:


Thanks @61yao! Mostly minor comments; the only one I'd ask you to address is the one about deserializing/reserializing the block.


Suggested change:
- public static void testOnlySetMaxBlockSize(int maxBlockSize) {
+ public static void testOnlySetMaxBlockSizeMB(int maxBlockSizeMB) {

It's a bit confusing whether this is a row count or a size, but looking at the code it appears to be a size.

Comment on lines 74 to 76

it could be nice to make this configurable, with a 4MB default

Comment on lines 64 to 65

should we fail here as well? I think it's confusing to have a function that doesn't do what it says.


nit: we can just move this down and let it fall through to the default case


This will deserialize the entire block just to re-serialize it again later in buildFromRows, which I suspect will be a significant performance regression for large blocks. Is there any way to do this split without going through the serde twice?


This is the first step; we will optimize this out later.


It's nice not to have randomness in tests; if an error happens due to a specific random sequence we won't be able to reproduce and debug it. Can we make this deterministic?


We can set a random seed, but for this particular test we are fine introducing some randomness.
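For reference, a fixed seed keeps such a test reproducible while still exercising varied inputs. This is a minimal sketch with hypothetical names (`SeededRandomSketch`, `randomRowSizes`), not code from the PR.

```java
import java.util.Random;

public class SeededRandomSketch {
  // Illustrative: a fixed (or logged) seed lets a failing run be replayed exactly.
  static final long SEED = 42L;

  /** Generates pseudo-random row sizes deterministically from the fixed seed. */
  public static int[] randomRowSizes(int numRows) {
    Random random = new Random(SEED);
    int[] sizes = new int[numRows];
    for (int i = 0; i < numRows; i++) {
      sizes[i] = 1 + random.nextInt(1024); // row sizes in [1, 1024] bytes
    }
    return sizes;
  }
}
```

Because `java.util.Random` produces the same sequence for the same seed, two calls to `randomRowSizes` always return identical arrays, so a failure can be reproduced.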

@walterddr (Contributor) commented Oct 5, 2022:


This reminds me... it would be a good idea to add a printout of the row itself for reproducibility.

Comment on lines 47 to 41

why are we excluding these types from our test?

@walterddr (Contributor) commented Oct 4, 2022:


Because these types are not supported in data blocks.

@walterddr (Contributor) left a comment:


LGTM. I pushed a minor cleanup on top and it should be good to go.

@siddharthteotia (Contributor) left a comment:


To confirm: metadata block / error block / EOS block are not split, right?

} else {
int rowSizeInBytes = ((RowDataBlock) block.getDataBlock()).getRowSizeInBytes();
int numRowsPerChunk = maxBlockSize / rowSizeInBytes;
Preconditions.checkState(numRowsPerChunk > 0, "row size too large for query engine to handle, abort!");

Can we include the offending rowSize in the message / exception?
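One way the check could carry the offending size is sketched below. The names follow the quoted snippet, but this uses a plain `IllegalStateException` instead of Guava's `Preconditions`, and `SplitCheckSketch` is a hypothetical class name, not the merged code.

```java
public class SplitCheckSketch {
  /** Computes rows per chunk, failing with a message that names the offending row size. */
  public static int numRowsPerChunk(int maxBlockSize, int rowSizeInBytes) {
    int numRowsPerChunk = maxBlockSize / rowSizeInBytes;
    if (numRowsPerChunk <= 0) {
      throw new IllegalStateException(String.format(
          "Row size too large for query engine to handle: %d bytes (max block size: %d bytes)",
          rowSizeInBytes, maxBlockSize));
    }
    return numRowsPerChunk;
  }
}
```

Including both the row size and the limit in the message lets an operator see immediately how far over the budget the offending row is.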

@walterddr walterddr merged commit 5a2ed96 into apache:master Oct 5, 2022
@siddharthteotia (Contributor) commented:

Maybe add a TODO to split the columnar block as well in the future.

@siddharthteotia (Contributor) commented:

> To confirm: metadata block / error block / EOS block are not split, right?

Confirmed offline. Only the row data block is split at this point.

@siddharthteotia siddharthteotia added the multi-stage Related to the multi-stage query engine label Oct 5, 2022
walterddr added a commit that referenced this pull request Oct 10, 2022
- clean up the CSVs into AVROs so that it loads much faster
- create integration test against H2
- rollback a change no longer necessary after #9485

However, severe thread thrashing is observed when the JVM's available core count is < 4; tracked in #9563.

Co-authored-by: Rong Rong <rongr@startree.ai>

Labels

multi-stage Related to the multi-stage query engine


5 participants