Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MV raw forward index and MV BYTES data type #7595

Merged
merged 16 commits into from
Oct 22, 2021

Conversation

richardstartin
Copy link
Member

@richardstartin richardstartin commented Oct 19, 2021

Description

Co-authored with @kishoreg

  • Introduces BYTES_ARRAY type
  • Introduces forward index writers/readers for MV fixed and variable length raw bytes columns
  • Collects metadata for the largest row in bytes so it can be used for forward index chunk sizing (we choose the largest chunk size of 1MB or the largest row in bytes, so we guarantee a single row fits in a chunk, and that we don't risk estimating enormous chunk sizes based on the other statistics available)
  • Removes the compression buffer, compress directly into the forward index file.

Upgrade Notes

Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)

  • Yes (Please label as backward-incompat, and complete the section below on Release Notes)

Does this PR fix a zero-downtime upgrade introduced earlier?

  • Yes (Please label this as backward-incompat, and complete the section below on Release Notes)

Does this PR otherwise need attention when creating release notes? Things to consider:

  • New configuration options
  • Deprecation of configurations
  • Signature changes to public methods/interfaces
  • New plugins added or old plugins removed
  • Yes (Please label this PR as release-notes and complete the section on Release Notes)

Release Notes

Documentation

@codecov-commenter
Copy link

codecov-commenter commented Oct 19, 2021

Codecov Report

Merging #7595 (d8bd2ad) into master (6fef210) will decrease coverage by 40.58%.
The diff coverage is 0.85%.

Impacted file tree graph

@@              Coverage Diff              @@
##             master    #7595       +/-   ##
=============================================
- Coverage     71.59%   31.01%   -40.59%     
=============================================
  Files          1559     1553        -6     
  Lines         79025    79022        -3     
  Branches      11702    11710        +8     
=============================================
- Hits          56579    24508    -32071     
- Misses        18639    52417    +33778     
+ Partials       3807     2097     -1710     
Flag Coverage Δ
integration1 29.44% <0.85%> (-0.06%) ⬇️
integration2 27.80% <0.57%> (-0.09%) ⬇️
unittests1 ?
unittests2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ot/segment/local/io/compression/LZ4Compressor.java 0.00% <0.00%> (-100.00%) ⬇️
...nt/local/io/compression/PassThroughCompressor.java 0.00% <0.00%> (-100.00%) ⬇️
...segment/local/io/compression/SnappyCompressor.java 0.00% <0.00%> (-100.00%) ⬇️
...ment/local/io/compression/ZstandardCompressor.java 0.00% <0.00%> (-100.00%) ⬇️
.../io/writer/impl/BaseChunkSVForwardIndexWriter.java 0.00% <0.00%> (-85.72%) ⬇️
...riter/impl/FixedByteChunkSVForwardIndexWriter.java 0.00% <ø> (-100.00%) ⬇️
.../writer/impl/VarByteChunkSVForwardIndexWriter.java 0.00% <0.00%> (-100.00%) ⬇️
...ment/creator/impl/SegmentColumnarIndexCreator.java 0.00% <0.00%> (-86.67%) ⬇️
...r/impl/fwd/MultiValueFixedByteRawIndexCreator.java 0.00% <0.00%> (ø)
...tor/impl/fwd/MultiValueVarByteRawIndexCreator.java 0.00% <0.00%> (ø)
... and 1070 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6fef210...d8bd2ad. Read the comment docs.

@richardstartin richardstartin force-pushed the mv-fwd-index branch 5 times, most recently from ea9a92b to 174f00b Compare October 19, 2021 22:07
@atris
Copy link
Contributor

atris commented Oct 20, 2021

This PR has a conflict with #7604 -- we need to figure out the sequencing of these two (duplicate commits for the FWD index).

@richardstartin
Copy link
Member Author

Can the other PR wait for this and then rebase? The work done here is intended to prevent OOM, and the common commits can’t be merged without the rest of this PR.

@richardstartin richardstartin force-pushed the mv-fwd-index branch 2 times, most recently from d8e37f4 to ae5701e Compare October 20, 2021 13:48
@richardstartin richardstartin marked this pull request as ready for review October 20, 2021 13:53
@kishoreg
Copy link
Member

This PR has a conflict with #7604 -- we need to figure out the sequencing of these two (duplicate commits for the FWD index).

Hi @atris. Let's get this one in first and then rebase text index support on top of this.

@richardstartin
Copy link
Member Author

I had to force derivation of numDocs for variable length data because there's no good solution to the buffer size problem given the following constraints:

  • There is a fixed number of documents per chunk
  • We don't want to OOM if there is a very large row in a segment, and applying an arbitrary multiplier amplifies this risk
  • Compression is applied at a chunk level, not intrachunk
  • The compression libraries all require a single buffer

When there is a very large row (> 1MB) we end up with 1 doc per chunk in the segment. The only good solution is to evolve the forward index format to allow variable numbers of docs per chunk for variable length data, but we can do that later if this becomes a problem,

@richardstartin richardstartin changed the title MV fwd index + MV BYTES Add MV raw forward index and MV BYTES data type Oct 20, 2021
Comment on lines +119 to +128
byte[] bytes = new byte[Integer.BYTES
+ values.length * Integer.BYTES]; //numValues, bytes required to store the content
ByteBuffer byteBuffer = ByteBuffer.wrap(bytes);
//write the length
byteBuffer.putInt(values.length);
//write the content of each element
for (final int value : values) {
byteBuffer.putInt(value);
}
_indexWriter.putBytes(bytes);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not require allocation of a temporary buffer, this could just be implemented as an MV pattern on _indexWriter, just as was done to eliminate the much larger buffers for byte[][] and String[]

Copy link
Contributor

@mayankshriv mayankshriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, lgtm otherwise.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reader for fixed-length MV is not implemented

throws IOException {
File file = new File(baseIndexDir,
column + Indexes.RAW_MV_FORWARD_INDEX_FILE_EXTENSION);
FileUtils.deleteQuietly(file);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) unnecessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kishoreg can you explain why you included this?

int length = value.length();
_minLength = Math.min(_minLength, length);
_maxLength = Math.max(_maxLength, length);
rowLength += length;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we count the actual encoded bytes length? We need to add (1 + length) integers to this. Same for STRING type

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems wrong to me, this is the length of the data and it's not known whether it would be length prefixed (+4) or null terminated (+1) here and adding either would prevent the other.

throw new UnsupportedOperationException();
}

default int getFloatMV(int docId, float[] valueBuffer, T context, int[] parentIndices) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I hadn't noticed this from the initial commits @kishoreg made.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove it in a follow up

kriti-sc pushed a commit to kriti-sc/incubator-pinot that referenced this pull request Dec 12, 2021
* Initial code for MultiValue forward Index

* Wiring in the segment creation driver Impl

* cleanup

* finish off adding BYTES_ARRAY type

* use less memory and fewer passes during encoding

* reduce memory requirement for forwardindexwriter

* track size in bytes of largest row so chunks can be sized to accommodate it

* remove TODOs

* force derivation of number of docs for raw MV columns

* specify character encoding

* leave changes to integration tests to MV TEXT index implementation

* fix javadoc

* don't use StringUtils

* fix formatting after rebase

* fix javadoc formatting again

* use zstd's compress bound

Co-authored-by: kishoreg <g.kishore@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants