
LUCENE-8868: New storing strategy for BKD tree leaves with low cardinality #730

Merged: 21 commits, Jun 26, 2019

Conversation

@iverase (Contributor) commented Jun 19, 2019

Description

Currently, if a leaf of the BKD tree contains only a few distinct values, the leaf is stored the same way as if all values were different. In many cases it can be much more efficient to store each distinct value once together with its cardinality.

Solution

The strategy is the following:

  1. When writing a leaf block, the cardinality of the leaf is computed.
  2. A naive calculation decides whether it is better to store the leaf as a low-cardinality leaf. The storage costs are estimated as follows (a sketch of this decision follows the list):
    Low cardinality: leafCardinality * (packedBytesLength - prefixLenSum + 2), where 2 is the estimated size of storing the cardinality as a vint. This is an overestimate, as in some cases one byte is enough.
    High cardinality: count * (packedBytesLength - prefixLenSum). This does not take the run-length compression into account.
  3. If the leaf has low cardinality, the compressed dim is set to -2. Note that -1 means all values are equal.
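
As a rough illustration of the decision in steps 1-3, a minimal Java sketch; the class, method, and parameter names here are hypothetical, not the PR's exact code:

    import java.io.IOException;
    import org.apache.lucene.store.DataOutput;

    // Hypothetical sketch of the leaf-encoding decision described above.
    final class LeafEncodingDecision {
      static void writeCompressedDimMarker(DataOutput out, int count, int leafCardinality,
                                           int packedBytesLength, int prefixLenSum,
                                           int sortedDim) throws IOException {
        if (prefixLenSum == packedBytesLength) {
          out.writeByte((byte) -1);        // all values in this leaf are equal
        } else if (leafCardinality * (packedBytesLength - prefixLenSum + 2)
                   <= count * (packedBytesLength - prefixLenSum)) {
          out.writeByte((byte) -2);        // low-cardinality leaf: each distinct value once, plus a vint cardinality
        } else {
          out.writeByte((byte) sortedDim); // high-cardinality leaf: existing encoding
        }
      }
    }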

Tests

The BKD tree is already extensively tested, so there is no need to add new tests.

@jpountz (Contributor) left a comment

I like the approach.

@@ -530,7 +602,7 @@ private void visitCompressedDocValues(int[] commonPrefixLengths, byte[] scratchP

 private int readCompressedDim(IndexInput in) throws IOException {
   int compressedDim = in.readByte();
-  if (compressedDim < -1 || compressedDim >= numDataDims) {
+  if (compressedDim < -2 || compressedDim >= numDataDims) {

maybe fail if compressedDim is -2 and version is not gte VERSION_LOW_CARDINALITY_LEAVES?
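
A minimal sketch of that suggested guard, assuming the reader's version field and the VERSION_LOW_CARDINALITY_LEAVES constant added in this PR (the merged code may differ):

    // Sketch: reject a -2 marker written by an index version that cannot contain one.
    private int readCompressedDim(IndexInput in) throws IOException {
      int compressedDim = in.readByte();
      if (compressedDim < -2 || compressedDim >= numDataDims
          || (compressedDim == -2 && version < BKDWriter.VERSION_LOW_CARDINALITY_LEAVES)) {
        throw new CorruptIndexException("Got compressedDim=" + compressedDim, in);
      }
      return compressedDim;
    }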

  visitDocValuesWithCardinality(commonPrefixLengths, scratchDataPackedValue, scratchMinIndexPackedValue, scratchMaxIndexPackedValue, in, docIDs, count, visitor);
} else {
  visitDocValuesNoCardinality(commonPrefixLengths, scratchDataPackedValue, scratchMinIndexPackedValue, scratchMaxIndexPackedValue, in, docIDs, count, visitor);
}

could we always call visitDocValuesWithCardinality? It seems to include the version check already?

@iverase (Author) replied:

Actually we cannot because in the previous version of the BKD tree the compressed dim byte was written after the leaf bounds. This is the reason I have to fork the version here.

 int prefixLenSum = Arrays.stream(commonPrefixLengths).sum();
 if (prefixLenSum == packedBytesLength) {
   // all values in this block are equal
   out.writeByte((byte) -1);
-} else {
+} else if (leafCardinality * (packedBytesLength - prefixLenSum + 2) <= count * (packedBytesLength - prefixLenSum)) {

Am I reading it right that you are counting 2 for the vint? I think you could make it 1 instead, the reasoning being that if your vints are 2 bytes on average, then your runs are very long (vints start using 2 bytes when the value is greater than 127), and so the sparse encoding is an obvious win.
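
For reference, vints store 7 payload bits per byte, so a run length fits in one byte up to 127 and needs two bytes up to 16383; a small sketch of the size calculation (standard varint logic, as in Lucene's DataOutput.writeVInt):

    // Bytes needed to write a non-negative value as a vint (7 payload bits per byte).
    static int vIntSize(int value) {
      int bytes = 1;
      while ((value & ~0x7F) != 0) { // more than 7 significant bits remain
        value >>>= 7;
        bytes++;
      }
      return bytes;
    }
    // vIntSize(127) == 1, vIntSize(128) == 2: runs longer than 127 values are
    // exactly the cases where the sparse encoding is already an obvious win.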

@iverase (Author) replied:

Yes, I was too conservative.

@iverase (Author) commented Jun 21, 2019

I had a look at how good the formula for deciding to use this optimisation is, and the results are very interesting. I have only done it for 1D so far, but it seems we are underestimating the compression in one dimension, so the result is a bigger index.

The test randomly ingests 10M IntPoint values, increasing the cardinality in each iteration. The index size is measured after indexing the data single-threaded and after force-merging into one segment.

  • Considering a vint size of 1 byte, results are pretty bad for some cardinalities:
    leafCardinality * (packedBytesLength - prefixLenSum + 1) <= count * (packedBytesLength - prefixLenSum)

    [plot: index size vs. cardinality, 1-byte vint estimate]

  • Considering a vint size of 2 bytes, results improve but there is still a region where the index becomes bigger:
    leafCardinality * (packedBytesLength - prefixLenSum + 2) <= count * (packedBytesLength - prefixLenSum)

    [plot: index size vs. cardinality, 2-byte vint estimate]

  • Considering a vint size of 3 bytes, results improve but there is still a region where the index becomes bigger:
    leafCardinality * (packedBytesLength - prefixLenSum + 3) <= count * (packedBytesLength - prefixLenSum)

    [plot: index size vs. cardinality, 3-byte vint estimate]

@iverase (Author) commented Jun 21, 2019

I think the calculation was incorrect: we need to consider the run-length compression, which basically means that, if it is fully efficient, there is one less byte per point on the high-cardinality side. Therefore the formula looks like:

leafCardinality * (packedBytesLength - prefixLenSum + 1)  <= count * (packedBytesLength - prefixLenSum - 1)

With this computation, the plot looks fine:

[plot: index size vs. cardinality with the corrected formula]
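
To make the trade-off concrete with hypothetical numbers: for packedBytesLength = 4, prefixLenSum = 2 and count = 512, a leaf with leafCardinality = 100 gives 100 * 3 = 300 <= 512 * 1 = 512, so the low-cardinality encoding is chosen, while leafCardinality = 200 gives 600 > 512 and the high-cardinality encoding is kept.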

@iverase (Author) commented Jun 24, 2019

I had another iteration; the main changes are:

  1. The cost of the run-length compression is now computed exactly (see the sketch after this list).

  2. Cardinality is computed after the leaf has been sorted by the selected dimension. This computation does not copy arrays around.
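
A minimal sketch of what an exact run-length cost computation could look like on the sorted dimension (hypothetical helper, not necessarily the PR's implementation):

    // Exact cost (in bytes) of run-length encoding the suffix byte of the sorted
    // dimension, ignoring the 255-per-run cap the real encoding enforces.
    // Each run costs 1 run-length byte plus 1 value byte.
    static int runLenCost(byte[] sortedDimSuffix, int count) {
      int runs = 0;
      for (int i = 0; i < count; i++) {
        if (i == 0 || sortedDimSuffix[i] != sortedDimSuffix[i - 1]) {
          runs++;
        }
      }
      return 2 * runs;
    }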

@jpountz (Contributor) left a comment

I left some minor comments; in general it looks good to me. I wonder whether we should apply the run-length compression on the sorted dimension in the low-cardinality case as well: this could save some additional bytes, and it might make the logic to decide whether to use the low-cardinality or high-cardinality encoding a bit easier?

  visitor.grow(count);
  visitUniqueRawDocValues(scratchDataPackedValue, docIDs, count, visitor);
} else {
  if (numIndexDims != 1 && version >= BKDWriter.VERSION_LEAF_STORES_BOUNDS) {

the version check shouldn't be necessary since this method is only called when version gte VERSION_LOW_CARDINALITY_LEAVES?

@@ -80,7 +80,8 @@
 //public static final int VERSION_CURRENT = VERSION_START;
 public static final int VERSION_LEAF_STORES_BOUNDS = 5;
 public static final int VERSION_SELECTIVE_INDEXING = 6;
 public static final int VERSION_CURRENT = VERSION_SELECTIVE_INDEXING;
+public static final int VERSION_LOW_CARDINALITY_LEAVES= 7;

Can you add a space before the equal sign? There are a couple other places where spaces are missing in this PR.

System.arraycopy(scratchDataPackedValue, 0, minPackedValue, 0, packedIndexBytesLength);
byte[] maxPackedValue = scratchMaxIndexPackedValue;
//Copy common prefixes before reading adjusted
// box

move end of comment to the previous line?

byte[] minPackedValue = scratchMinIndexPackedValue;
System.arraycopy(scratchDataPackedValue, 0, minPackedValue, 0, packedIndexBytesLength);
byte[] maxPackedValue = scratchMaxIndexPackedValue;
//Copy common prefixes before reading adjusted

can you leave a space between the slashes and the text like we usually do in the rest of the codebase?

@iverase (Author) commented Jun 24, 2019

> I wonder whether we should apply the run-length compression on the sorted dimension in the
> low-cardinality case as well, this could save some additional bytes, and might make the logic to
> decide whether to use the low-cardinality or high-cardinality encoding a bit easier?

I don't think this would be so beneficial in the case of very low cardinality. Imagine that those few values only differ in the last byte of the sorted dimension: for each value you add at least two bytes for the run-length compression and save just one when writing the sorted dimension. All in all, the final size of the leaf would be bigger, and I think this approach should favour this case.
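
As a hypothetical illustration of that arithmetic: with three distinct values differing only in the last byte of the sorted dimension, run-length encoding that byte would cost at least two extra bytes per run (run length plus value byte) while saving only the one suffix byte per distinct value, a net loss.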

@jpountz (Contributor) commented Jun 24, 2019

Fair enough.

> The BKD tree is already extensively tested, so there is no need to add new tests.

Do we already have tests for the case where leaves have lots of duplicates?

@iverase (Author) commented Jun 25, 2019

The existing tests do trigger the new approach, though admittedly not as often as they should. I have added a new test that exercises the new code more often.
