Supporting range queries using indexes #5240

kishoreg · 2020-04-11T06:06:05Z

Currently, range queries result in a full table scan and cannot be solved using inverted indices. This PR introduces the concept of Range index.

This will be useful for evaluating range predicates on time and metric columns.

e.g. select count(*) from T where latency > 3sec

Design doc: https://docs.google.com/document/d/1eisu7L-ERLs1OZCASOz3qSpzZfoipplKrYgmBXaFobw/edit#

siddharthteotia · 2020-05-04T16:30:37Z

pinot-spi/src/main/java/org/apache/pinot/spi/config/table/IndexingConfig.java

@@ -27,6 +27,7 @@

 public class IndexingConfig extends BaseJsonConfig {
  private List<String> _invertedIndexColumns;
+  private List<String> _rangeIndexColumns;


We should not make a change to IndexingConfig. Let's start using FieldConfig.
IndexLoadingConfig can then be derived from FieldConfig

Similar to how we do it for text index.

I agree. will have to do this as part of another PR.

Let's please add a TODO

siddharthteotia · 2020-05-04T16:31:50Z

pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/RangeIndexReader.java

+    _buffer.close();
+  }
+
+  public int findRangeId(int value) {


Can we add a TODO here to indicate that for now we are doing a linear scan to find the corresponding range for a given dictId. If the number of ranges is high, we may want to do binary search?
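
A minimal sketch of the binary-search alternative suggested above, assuming the reader keeps the start dictId of every range in a sorted `int[]`. The class name `RangeLookupSketch`, the `rangeStarts` array, and the `-1` return convention are hypothetical and are not taken from the PR. The end of the last range is ignored here for brevity.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the PR: locates the range that contains a dictId.
final class RangeLookupSketch {
  private RangeLookupSketch() {
  }

  /**
   * Assumes rangeStarts is sorted ascending without duplicates and that range i covers
   * [rangeStarts[i], rangeStarts[i + 1] - 1], with the last range left open ended.
   * Returns -1 when dictId is smaller than the first range start.
   */
  static int findRangeId(int[] rangeStarts, int dictId) {
    int pos = Arrays.binarySearch(rangeStarts, dictId);
    if (pos >= 0) {
      // dictId is exactly the start of range pos
      return pos;
    }
    // binarySearch returns -(insertionPoint) - 1 when the key is absent
    int insertionPoint = -pos - 1;
    // the containing range is the one that starts just before the insertion point
    return insertionPoint - 1;
  }
}
```

This replaces the O(R) scan with an O(log R) lookup over R ranges; for a small number of ranges the linear scan may still be fast enough.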

siddharthteotia · 2020-05-04T16:31:54Z

pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/RangeIndexReader.java

+  public ImmutableRoaringBitmap getDocIds(Object value) {
+    // This should not be called from anywhere. If it happens, there is a bug
+    // and that's why we throw illegal state exception
+    throw new IllegalStateException("bitmap inverted index reader supports lookup only on dictionary id");


Change the exception

Please change the exception

siddharthteotia · 2020-05-04T16:32:00Z

pinot-core/src/main/java/org/apache/pinot/core/segment/index/readers/RangeIndexReader.java

+import org.slf4j.LoggerFactory;
+
+
+public class RangeIndexReader implements InvertedIndexReader<ImmutableRoaringBitmap> {


Please add javadoc

siddharthteotia · 2020-05-04T16:32:20Z

...rc/main/java/org/apache/pinot/core/segment/index/loader/invertedindex/RangeIndexHandler.java

+import org.slf4j.LoggerFactory;
+
+
+public class RangeIndexHandler {


Please add javadoc

siddharthteotia · 2020-05-04T16:32:30Z

pinot-core/src/main/java/org/apache/pinot/core/segment/index/loader/IndexLoadingConfig.java

@@ -86,6 +87,11 @@ private void extractFromTableConfig(@Nonnull TableConfig tableConfig) {
      _invertedIndexColumns.addAll(invertedIndexColumns);
    }

+    List<String> rangeIndexColumns = indexingConfig.getRangeIndexColumns();


We should use FieldConfig -- this is another index type

siddharthteotia · 2020-05-04T16:32:34Z

...e/src/main/java/org/apache/pinot/core/segment/index/column/PhysicalColumnIndexContainer.java

@@ -124,14 +129,21 @@ public PhysicalColumnIndexContainer(SegmentDirectory.Reader segmentReader, Colum
        _invertedIndex =
            new BitmapInvertedIndexReader(segmentReader.getIndexFor(columnName, ColumnIndexType.INVERTED_INDEX),
                metadata.getCardinality());
+        _rangeIndex = null;
+      } else if (loadRangeIndex) {


This should be if not else-if

siddharthteotia · 2020-05-04T16:32:40Z

...e/src/main/java/org/apache/pinot/core/segment/index/column/PhysicalColumnIndexContainer.java

@@ -124,14 +129,21 @@ public PhysicalColumnIndexContainer(SegmentDirectory.Reader segmentReader, Colum
        _invertedIndex =
            new BitmapInvertedIndexReader(segmentReader.getIndexFor(columnName, ColumnIndexType.INVERTED_INDEX),
                metadata.getCardinality());
+        _rangeIndex = null;


Not sure if the if-else blocks here can be simplified

I hope so, but I didn't want to do that as part of this PR.

siddharthteotia · 2020-05-04T16:32:44Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+    @Override
+    public void put(int position, Number value) {
+      _buffer.putInt(position << 2, value.intValue());
+    }


Using constants for << 2, << 3 would be better for readability. Something like INT_SIZE_BYTES, DOUBLE_SIZE_BYTES etc
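
A small sketch of what the named-constant version could look like, assuming a plain `java.nio.ByteBuffer` stands in for the PR's buffer type. The class `IntValueBufferSketch` is hypothetical; `Integer.BYTES` is the standard JDK constant for the size of an int in bytes (`Double.BYTES` and `Long.BYTES` exist as well).

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for an int-valued buffer; not the PR's NumberValueBuffer.
final class IntValueBufferSketch {
  private final ByteBuffer _buffer;

  IntValueBufferSketch(int numValues) {
    _buffer = ByteBuffer.allocate(numValues * Integer.BYTES);
  }

  void put(int position, int value) {
    // position * Integer.BYTES is the same offset as position << 2, but states the intent
    _buffer.putInt(position * Integer.BYTES, value);
  }

  int get(int position) {
    return _buffer.getInt(position * Integer.BYTES);
  }
}
```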

siddharthteotia · 2020-05-04T16:32:49Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+    //   Bitmap for range 2
+    //    ......
+    //   Bitmap for range R - 1
+    long bytesWritten = 0;


I don't see _forwardIndexValueBuffer being used

We don't need that, right?

siddharthteotia · 2020-05-04T16:32:53Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+      return _valueBuffer.compare(val1, val2);
+    };
+    Swapper swapper = (i, j) -> {
+      Number temp = _docIdBuffer.get(i).intValue();


Looks like we are doing in-place sorting on _docIdBuffer. Why do we need _docIdValueBuffer?

Instead of sorting the actual forward index, I sort the position (docId) array, using the forward index values as the sort keys.

Got it. Thanks.
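
The indirect sort described in this thread can be sketched as follows. This is an illustration only: it uses `java.util.Arrays.sort` on a boxed docId array rather than the comparator/swapper pair shown in the diff, and the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration of sorting docIds by their values without moving the values.
final class IndirectSortSketch {
  private IndirectSortSketch() {
  }

  /** Returns the docIds ordered by values[docId]; the values array itself is left untouched. */
  static int[] sortDocIdsByValue(int[] values) {
    Integer[] docIds = new Integer[values.length];
    for (int i = 0; i < values.length; i++) {
      docIds[i] = i;
    }
    // Compare two docIds by looking up the value each one points at.
    // Arrays.sort on objects is stable, so equal values keep ascending docId order.
    Arrays.sort(docIds, (a, b) -> Integer.compare(values[a], values[b]));
    int[] sorted = new int[docIds.length];
    for (int i = 0; i < docIds.length; i++) {
      sorted[i] = docIds[i];
    }
    return sorted;
  }
}
```

For values `{30, 10, 20, 10}` the docId array goes from `[0, 1, 2, 3]` before sorting to `[1, 3, 2, 0]` after sorting, while the values stay in their original positions.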

siddharthteotia · 2020-05-04T16:32:59Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+      throws IOException {
+    //sort the forward index copy
+    //go over the sorted value to compute ranges
+    IntComparator comparator = (i, j) -> {


So this is either going to sort raw values or dictIds. Correct?

The actual sort happens on docIdBuffer but uses forward index for sorting

Let's not call it forward index. Let's call it keys?

siddharthteotia · 2020-05-04T16:33:03Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+
+  @Override
+  public void addDoc(Object document, int docIdCounter) {
+    throw new IllegalStateException("Bitmap inverted index creator does not support Object type currently");


Change the exception

Please fix this exception

siddharthteotia · 2020-05-04T16:33:08Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+  // Forward index buffers (from docId to dictId)
+  private int _nextDocId;
+  private PinotDataBuffer _forwardIndexValueBuffer;
+  private NumberValueBuffer _valueBuffer;


This buffer will either store dictIds or raw values. Correct?

Let's please rename it indexKeyBuffer?

siddharthteotia · 2020-05-04T16:33:11Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+
+  // Forward index buffers (from docId to dictId)
+  private int _nextDocId;
+  private PinotDataBuffer _forwardIndexValueBuffer;


Can we add some short comments (one line is enough) about each buffer type?

siddharthteotia · 2020-05-04T16:33:20Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+  private final int _numValues;
+  private final boolean _useMMapBuffer;
+
+  // Forward index buffers (from docId to dictId)


Why are we creating the forward index for the column as part of creating range index? The loop in SegmentColumnarIndexCreator that goes over each GenericRow takes care of that.

It's a temp buffer; would renaming it to valueBuffer help?

Let's not use the term forward index inside the range index creator. Can we please call it keyBuffer? The keys that are stored could be dictIds or raw values.

siddharthteotia · 2020-05-04T16:33:23Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+ * </ul>
+ * <p>Based on the number of values we need to store, we use direct memory or MMap file to allocate the buffer.
+ */
+public final class RangeIndexCreator implements InvertedIndexCreator {


I think the problem with treating this as another inverted index by overriding InvertedIndexCreator and InvertedIndexReader interfaces, is that we are inherently making this work only on dictionary encoded columns because these interfaces and their APIs are written to work only on dictIds

However, for text index (which also implements the same interface), we added a docId based API but that takes an object value. Maybe we can expand that API for each primitive type so that we can support raw value based inverted indexes as well. I think we should decide on this sooner because this question will pop up for every new index type that we may add.

Another alternative would be to change the inheritance hierarchy by introducing a mid layer of interfaces -- DictionaryBasedInvertedIndexCreator, RawValueBasedInvertedIndexCreator. Move all the dictId based APIs to former and raw value based to latter

Yes, Jackie and I discussed it and concluded the same. Jackie will clean this up at some point.
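
A rough sketch of the mid-layer hierarchy proposed in this thread. Only the two interface names `DictionaryBasedInvertedIndexCreator` and `RawValueBasedInvertedIndexCreator` come from the comment above; the base interface `InvertedIndexCreatorSketch` and every method signature are assumptions made for illustration and do not reflect Pinot's actual API.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical common parent: only lifecycle methods, no value-type specific APIs.
interface InvertedIndexCreatorSketch extends Closeable {
  /** Finishes index creation and flushes the index to disk. */
  void seal() throws IOException;
}

// All dictId based APIs live here (method shapes are assumed).
interface DictionaryBasedInvertedIndexCreator extends InvertedIndexCreatorSketch {
  /** Adds the dictId of the next single-valued document. */
  void add(int dictId);

  /** Adds the dictIds of the next multi-valued document. */
  void add(int[] dictIds, int length);
}

// Raw value based APIs, one overload per primitive type (method shapes are assumed).
interface RawValueBasedInvertedIndexCreator extends InvertedIndexCreatorSketch {
  void add(int value);

  void add(long value);

  void add(float value);

  void add(double value);
}
```

With a split like this, a range index creator could implement whichever interface matches the column encoding instead of inheriting dictId-only methods it cannot support.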

siddharthteotia · 2020-05-04T16:33:27Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+ *     In the second pass (processing values phase), when seal() method is called, all the dictIds should already been
+ *     added. We first reorder the values into the inverted index buffers by going over the dictIds in forward index
+ *     value buffer (for multi-valued column we also need forward index length buffer to get the docId for each dictId).
+ *     <p>Once we have the inverted index buffers, we simply go over them and create the bitmap for each dictId and


This doesn't sound right. We don't create a bitmap for each dictId. I believe there is a bitmap for a contiguous range of dictIds

Sorry, this is a copy-paste error from the bitmap index.

siddharthteotia · 2020-05-04T16:33:30Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+ *     A
+ *   </li>
+ *   <li>
+ *     In the first pass (adding values phase), when add() method is called, store the raw values into the forward index


You mean store the dictionary Ids?

siddharthteotia · 2020-05-04T16:33:35Z

...t-core/src/main/java/org/apache/pinot/core/operator/filter/predicate/PredicateEvaluator.java

@@ -22,6 +22,8 @@


 public interface PredicateEvaluator {
+


siddharthteotia · 2020-05-04T16:33:39Z

pinot-core/src/main/java/org/apache/pinot/core/operator/filter/RangeFilterOperator.java

+  @Override
+  protected FilterBlock getNextBlock() {
+
+    //only dictionary based is supported for now


Let's add a TODO for supporting this on raw columns

siddharthteotia · 2020-05-04T16:33:42Z

pinot-core/src/main/java/org/apache/pinot/core/operator/filter/RangeFilterOperator.java

+import org.roaringbitmap.buffer.MutableRoaringBitmap;
+
+
+public class RangeFilterOperator extends BaseFilterOperator {


please add javadoc

siddharthteotia · 2020-05-04T16:34:04Z

pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java

@@ -54,6 +54,12 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic

    Predicate.Type predicateType = predicateEvaluator.getPredicateType();

+    //Only for dictionary encoded columns and offline data sources
+    if (predicateType == Predicate.Type.RANGE && dataSource.getDictionary() != null


Why only dictionary encoded? For raw columns, each range could be a range of start and end raw values instead of dictIds. The data structure could be sorted on start raw values (just like it is on start dictIds)

Let's add a TODO

siddharthteotia · 2020-05-04T16:34:32Z

.../java/org/apache/pinot/core/operator/docvaliterators/DictionaryBasedSingleValueIterator.java

+
+
+@SuppressWarnings("unchecked")
+public final class DictionaryBasedSingleValueIterator extends BlockSingleValIterator {


Please add javadoc

siddharthteotia · 2020-05-04T16:34:40Z

...n/java/org/apache/pinot/core/operator/docvaliterators/DictionaryBasedMultiValueIterator.java

+
+
+@SuppressWarnings("unchecked")
+public final class DictionaryBasedMultiValueIterator extends BlockMultiValIterator {


Please add javadoc

siddharthteotia · 2020-05-04T16:35:28Z

.../java/org/apache/pinot/core/operator/docvaliterators/DictionaryBasedSingleValueIterator.java

+
+  private int _nextDocId;
+
+  public DictionaryBasedSingleValueIterator(SingleColumnSingleValueReader reader, Dictionary dictionary, int numDocs) {


Also, why do we have to implement two new iterators?

siddharthteotia · 2020-05-04T16:40:18Z

pinot-core/src/main/java/org/apache/pinot/core/operator/filter/RangeFilterOperator.java

+    int startRangeId = rangeIndexReader.findRangeId(evaluator.getStartDictId());
+    int endRangeId = rangeIndexReader.findRangeId(evaluator.getEndDictId());
+    //Handle Matching Ranges - some ranges match fully but some partially
+    //below code assumes first and last range always match partially which may not be the case always //todo: optimize it


This is not about optimization. Won't this impact correctness?
For a given startDictId and endDictId, we find the matching ranges represented by startRangeId and endRangeId.
Some/all ranges between these could be a full match and some/all could be a partial match. Right?
I understand that all full matches can be ORed but how can we assume that everything from startRangeId + 1 to endRangeId - 1 will be a full match?
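
To make the full versus partial distinction concrete, here is a sketch that classifies each range by an explicit bounds check instead of by its position. It assumes inclusive `[start, end]` dictId bounds per range and an inclusive query interval; the class `RangeMatchSketch` and its arrays are hypothetical and are not the PR's data layout.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of splitting overlapping ranges into full and partial matches.
final class RangeMatchSketch {
  private RangeMatchSketch() {
  }

  static final class Result {
    // Ranges entirely inside the query interval: their bitmaps can be ORed directly.
    final List<Integer> fullMatches = new ArrayList<>();
    // Ranges that only overlap the query interval: their docs must be checked one by one.
    final List<Integer> partialMatches = new ArrayList<>();
  }

  static Result classify(int[] rangeStarts, int[] rangeEnds, int queryStart, int queryEnd) {
    Result result = new Result();
    for (int rangeId = 0; rangeId < rangeStarts.length; rangeId++) {
      int start = rangeStarts[rangeId];
      int end = rangeEnds[rangeId];
      if (end < queryStart || start > queryEnd) {
        continue; // no overlap with the query interval
      }
      if (queryStart <= start && end <= queryEnd) {
        result.fullMatches.add(rangeId);
      } else {
        result.partialMatches.add(rangeId);
      }
    }
    return result;
  }
}
```

If the ranges are sorted, contiguous and non-overlapping, only the first and last overlapping ranges can end up in `partialMatches`, because every range strictly between them lies inside the query interval. Treating a fully matching first or last range as partial then costs extra scanning but should not change the result, provided partial matches are always verified against the actual values.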

siddharthteotia · 2020-05-11T16:23:25Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+  private NumberValueBuffer _numberValueBuffer;
+
+  private final File _tempDocIdBufferFile;
+  private PinotDataBuffer _docIdValueBuffer;


Let's rename _docIdValueBuffer to _docIdBuffer and _docIdBuffer to _docIdBufferWrapperForSorting?

siddharthteotia · 2020-05-11T16:36:15Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+
+  private final File _rangeIndexFile;
+
+  private final File _tempValueBufferFile;


Let's add comments explaining the purpose of each buffer.

siddharthteotia · 2020-05-11T16:38:03Z

pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/inv/RangeIndexCreator.java

+      dump();
+    }
+
+    //sort the value buffer, change the docId buffer to maintain the mapping


Let's add a diagram here showing the contents of _numberValueBuffer and _docIdBuffer before and after sorting to explain the process.

siddharthteotia

LGTM. Thanks for working on this and addressing the comments. There are some follow-ups that I can take up:

(1) Use FieldConfig instead of IndexingConfig for specifying range index in table config
(2) Clean up the if-else logic in PhysicalColumnIndexContainer that decides what indexes we should load.
(3) Refactor the inverted index interface to allow creating both raw and dictionary based (I already have this in progress).
(4) After (3), add support for creating raw values based range indexes (the index creator/reader need not change but there are assumptions elsewhere in the code)

1. Fix the compilation error introduced because of the merge of apache#4597 and apache#5240
2. Fix the bug of not loading the range index if both inverted index and range index exist
TODO: The range index triggers another severe issue of accessing a closed DataBuffer, which can cause a JVM crash. Will address in a separate PR

Adding range index support

8c2874b

kishoreg force-pushed the range-index branch from 7b2eacf to 8c2874b Compare April 24, 2020 00:04

xiangfu0 force-pushed the range-index branch from 8c2874b to af8ef5d Compare April 24, 2020 07:25

kishoreg changed the title ~~WIP: Supporting range queries using indexes~~ Supporting range queries using indexes Apr 24, 2020

kishoreg requested a review from siddharthteotia April 26, 2020 08:42

siddharthteotia reviewed May 4, 2020

siddharthteotia reviewed May 10, 2020

Adding comments

e8d3284

kishoreg force-pushed the range-index branch from af8ef5d to e8d3284 Compare May 10, 2020 23:49

Make range index to store start of all ranges and end of last range

34267be

siddharthteotia reviewed May 11, 2020

kishoreg added 2 commits May 11, 2020 12:57

Adding comments and fixing broken tests

5872b71

More comments

66eac58

siddharthteotia approved these changes May 11, 2020

kishoreg merged commit 602f28a into master May 14, 2020

Jackie-Jiang mentioned this pull request May 14, 2020

Fix the compilation error and bug introduced in Range Index #5389

Merged

Jackie-Jiang deleted the range-index branch May 14, 2020 19:38

mcvsubbu mentioned this pull request Aug 2, 2021

Support EXPLAIN PLAN #7210

Closed
