
LUCENE-8868: New storing strategy for BKD tree leaves with low cardinality #730

Merged: 21 commits, Jun 26, 2019

Conversation

@iverase (Contributor) commented Jun 19, 2019

Description

Currently, if a leaf of the BKD tree contains only a few distinct values, the leaf is stored the same way as if all values were different. In many cases it can be much more efficient to store each distinct value once together with its cardinality.

Solution

The strategy is the following:

  1. When writing a leaf block, the cardinality of the leaf is computed.
  2. A naive calculation decides whether it is better to store the leaf as a low-cardinality leaf. The storage costs are estimated as follows (a sketch of this decision follows the list):
    Low cardinality: leafCardinality * (packedBytesLength - prefixLenSum + 2), where 2 is the estimated size of storing the cardinality as a vint. This is an overestimate, as in some cases one byte is enough.
    High cardinality: count * (packedBytesLength - prefixLenSum). This does not take the run-length compression into account.
  3. If the leaf has low cardinality, the compressed dim is set to -2. Note that -1 means all values are equal.
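
As a rough illustration of the decision in steps 1-3, a minimal Java sketch; the class, method, and parameter names here are hypothetical, not the PR's exact code:

    import java.io.IOException;
    import org.apache.lucene.store.DataOutput;

    // Hypothetical sketch of the leaf-encoding decision described above.
    final class LeafEncodingDecision {
      static void writeCompressedDimMarker(DataOutput out, int count, int leafCardinality,
                                           int packedBytesLength, int prefixLenSum,
                                           int sortedDim) throws IOException {
        if (prefixLenSum == packedBytesLength) {
          out.writeByte((byte) -1);        // all values in this leaf are equal
        } else if (leafCardinality * (packedBytesLength - prefixLenSum + 2)
                   <= count * (packedBytesLength - prefixLenSum)) {
          out.writeByte((byte) -2);        // low-cardinality leaf: each distinct value once, plus a vint cardinality
        } else {
          out.writeByte((byte) sortedDim); // high-cardinality leaf: existing encoding
        }
      }
    }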

Tests

The BKD tree is already extensively tested, so there is no need to add new tests.

@jpountz (Contributor) left a comment

I like the approach.

@@ -530,7 +602,7 @@ private void visitCompressedDocValues(int[] commonPrefixLengths, byte[] scratchP

 private int readCompressedDim(IndexInput in) throws IOException {
   int compressedDim = in.readByte();
-  if (compressedDim < -1 || compressedDim >= numDataDims) {
+  if (compressedDim < -2 || compressedDim >= numDataDims) {

maybe fail if compressedDim is -2 and version is not gte VERSION_LOW_CARDINALITY_LEAVES?
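
A minimal sketch of that suggested guard, assuming the reader's version field and the VERSION_LOW_CARDINALITY_LEAVES constant added in this PR (the merged code may differ):

    // Sketch: reject a -2 marker written by an index version that cannot contain one.
    private int readCompressedDim(IndexInput in) throws IOException {
      int compressedDim = in.readByte();
      if (compressedDim < -2 || compressedDim >= numDataDims
          || (compressedDim == -2 && version < BKDWriter.VERSION_LOW_CARDINALITY_LEAVES)) {
        throw new CorruptIndexException("Got compressedDim=" + compressedDim, in);
      }
      return compressedDim;
    }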

  visitDocValuesWithCardinality(commonPrefixLengths, scratchDataPackedValue, scratchMinIndexPackedValue, scratchMaxIndexPackedValue, in, docIDs, count, visitor);
} else {
  visitDocValuesNoCardinality(commonPrefixLengths, scratchDataPackedValue, scratchMinIndexPackedValue, scratchMaxIndexPackedValue, in, docIDs, count, visitor);
}

could we always call visitDocValuesWithCardinality? It seems to include the version check already?

@iverase (Author) replied:

Actually we cannot because in the previous version of the BKD tree the compressed dim byte was written after the leaf bounds. This is the reason I have to fork the version here.

 int prefixLenSum = Arrays.stream(commonPrefixLengths).sum();
 if (prefixLenSum == packedBytesLength) {
   // all values in this block are equal
   out.writeByte((byte) -1);
-} else {
+} else if (leafCardinality * (packedBytesLength - prefixLenSum + 2) <= count * (packedBytesLength - prefixLenSum)) {

Am I reading it right that you are counting 2 for the vint? I think you could make it 1 instead, the reasoning being that if your vints are 2 bytes on average, then your runs are very long (vints start using 2 bytes when the value is greater than 127), and so the sparse encoding is an obvious win.
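
For reference, vints store 7 payload bits per byte, so a run length fits in one byte up to 127 and needs two bytes up to 16383; a small sketch of the size calculation (standard varint logic, as in Lucene's DataOutput.writeVInt):

    // Bytes needed to write a non-negative value as a vint (7 payload bits per byte).
    static int vIntSize(int value) {
      int bytes = 1;
      while ((value & ~0x7F) != 0) { // more than 7 significant bits remain
        value >>>= 7;
        bytes++;
      }
      return bytes;
    }
    // vIntSize(127) == 1, vIntSize(128) == 2: runs longer than 127 values are
    // exactly the cases where the sparse encoding is already an obvious win.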

@iverase (Author) replied:

Yes, I was too conservative.

@iverase (Author) commented Jun 21, 2019

I had a look at how good the formula for deciding to use this optimisation is, and the results are very interesting. I have only done it for 1D so far, but it seems we are underestimating the compression in one dimension, so the result is a bigger index.

The test randomly ingests 10M IntPoint values, increasing the cardinality in each iteration. The index size is measured after indexing the data single-threaded and after force-merging into one segment.

  • Considering a vint size of 1 byte, results are pretty bad for some cardinalities:
    leafCardinality * (packedBytesLength - prefixLenSum + 1) <= count * (packedBytesLength - prefixLenSum)

    [plot: index size vs. cardinality, 1-byte vint estimate]

  • Considering a vint size of 2 bytes, results improve but there is still a region where the index becomes bigger:
    leafCardinality * (packedBytesLength - prefixLenSum + 2) <= count * (packedBytesLength - prefixLenSum)

    [plot: index size vs. cardinality, 2-byte vint estimate]

  • Considering a vint size of 3 bytes, results improve but there is still a region where the index becomes bigger:
    leafCardinality * (packedBytesLength - prefixLenSum + 3) <= count * (packedBytesLength - prefixLenSum)

    [plot: index size vs. cardinality, 3-byte vint estimate]

@iverase (Author) commented Jun 21, 2019

I think the calculation was incorrect: we need to consider the run-length compression, which basically means that, if it is fully efficient, there is one less byte per point on the high-cardinality side. Therefore the formula looks like:

leafCardinality * (packedBytesLength - prefixLenSum + 1)  <= count * (packedBytesLength - prefixLenSum - 1)

With this computation, the plot looks fine:

[plot: index size vs. cardinality with the corrected formula]
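
To make the trade-off concrete with hypothetical numbers: for packedBytesLength = 4, prefixLenSum = 2 and count = 512, a leaf with leafCardinality = 100 gives 100 * 3 = 300 <= 512 * 1 = 512, so the low-cardinality encoding is chosen, while leafCardinality = 200 gives 600 > 512 and the high-cardinality encoding is kept.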

@iverase (Author) commented Jun 24, 2019

I had another iteration; the main changes are:

  1. The cost of the run-length compression is now computed exactly (see the sketch after this list).

  2. Cardinality is computed after the leaf has been sorted by the selected dimension. This computation does not copy arrays around.
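
A minimal sketch of what an exact run-length cost computation could look like on the sorted dimension (hypothetical helper, not necessarily the PR's implementation):

    // Exact cost (in bytes) of run-length encoding the suffix byte of the sorted
    // dimension, ignoring the 255-per-run cap the real encoding enforces.
    // Each run costs 1 run-length byte plus 1 value byte.
    static int runLenCost(byte[] sortedDimSuffix, int count) {
      int runs = 0;
      for (int i = 0; i < count; i++) {
        if (i == 0 || sortedDimSuffix[i] != sortedDimSuffix[i - 1]) {
          runs++;
        }
      }
      return 2 * runs;
    }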

@jpountz (Contributor) left a comment

I left some minor comments; in general it looks good to me. I wonder whether we should apply the run-length compression on the sorted dimension in the low-cardinality case as well: this could save some additional bytes, and it might make the logic to decide whether to use the low-cardinality or high-cardinality encoding a bit easier?

  visitor.grow(count);
  visitUniqueRawDocValues(scratchDataPackedValue, docIDs, count, visitor);
} else {
  if (numIndexDims != 1 && version >= BKDWriter.VERSION_LEAF_STORES_BOUNDS) {

the version check shouldn't be necessary since this method is only called when version gte VERSION_LOW_CARDINALITY_LEAVES?

@@ -80,7 +80,8 @@
 //public static final int VERSION_CURRENT = VERSION_START;
 public static final int VERSION_LEAF_STORES_BOUNDS = 5;
 public static final int VERSION_SELECTIVE_INDEXING = 6;
 public static final int VERSION_CURRENT = VERSION_SELECTIVE_INDEXING;
+public static final int VERSION_LOW_CARDINALITY_LEAVES= 7;

Can you add a space before the equal sign? There are a couple other places where spaces are missing in this PR.

System.arraycopy(scratchDataPackedValue, 0, minPackedValue, 0, packedIndexBytesLength);
byte[] maxPackedValue = scratchMaxIndexPackedValue;
//Copy common prefixes before reading adjusted
// box

move end of comment to the previous line?

byte[] minPackedValue = scratchMinIndexPackedValue;
System.arraycopy(scratchDataPackedValue, 0, minPackedValue, 0, packedIndexBytesLength);
byte[] maxPackedValue = scratchMaxIndexPackedValue;
//Copy common prefixes before reading adjusted

can you leave a space between the slashes and the text like we usually do in the rest of the codebase?

@iverase (Author) commented Jun 24, 2019

> I wonder whether we should apply the run-length compression on the sorted dimension in the
> low-cardinality case as well, this could save some additional bytes, and might make the logic to
> decide whether to use the low-cardinality or high-cardinality encoding a bit easier?

I don't think this would be so beneficial in the case of very low cardinality. Imagine that those few values only differ in the last byte of the sorted dimension: for each value you add at least two bytes for the run-length compression and save just one when writing the sorted dimension. All in all, the final size of the leaf would be bigger, and I think this approach should favour this case.
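
As a hypothetical illustration of that arithmetic: with three distinct values differing only in the last byte of the sorted dimension, run-length encoding that byte would cost at least two extra bytes per run (run length plus value byte) while saving only the one suffix byte per distinct value, a net loss.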

@jpountz (Contributor) commented Jun 24, 2019

Fair enough.

> The BKD tree is already extensively tested, so there is no need to add new tests.

Do we already have tests for the case where leaves have lots of duplicates?

@iverase (Author) commented Jun 25, 2019

The existing tests do trigger the new approach, though admittedly not as often as they should. I have added a new test that exercises the new code more often.
