Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supporting range queries using indexes #5240

Merged
merged 5 commits into from
May 14, 2020
Merged

Supporting range queries using indexes #5240

merged 5 commits into from
May 14, 2020

Conversation

kishoreg
Copy link
Member

@kishoreg kishoreg commented Apr 11, 2020

Currently, range queries result in a full table scan and cannot be solved using inverted indices. This PR introduces the concept of Range index.

This will be useful in evaluating range predicates on time and metric columns

e.g. select count(*) from T where latency > 3sec

Design doc: https://docs.google.com/document/d/1eisu7L-ERLs1OZCASOz3qSpzZfoipplKrYgmBXaFobw/edit#

@kishoreg kishoreg changed the title WIP: Supporting range queries using indexes Supporting range queries using indexes Apr 24, 2020
@@ -27,6 +27,7 @@

public class IndexingConfig extends BaseJsonConfig {
private List<String> _invertedIndexColumns;
private List<String> _rangeIndexColumns;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not make a change to IndexingConfig. Let's start using FieldConfig.
IndexLoadingConfig can then be derived from FieldConfig

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to how we do it for text index.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. will have to do this as part of another PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's please add a TODO

_buffer.close();
}

public int findRangeId(int value) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a TODO here to indicate that for now we are doing a linear scan to find the corresponding range for a given dictId. If the number of ranges is high, we may want to do binary search?

public ImmutableRoaringBitmap getDocIds(Object value) {
// This should not be called from anywhere. If it happens, there is a bug
// and that's why we throw illegal state exception
throw new IllegalStateException("bitmap inverted index reader supports lookup only on dictionary id");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the exception

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change the exception

import org.slf4j.LoggerFactory;


public class RangeIndexReader implements InvertedIndexReader<ImmutableRoaringBitmap> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add javadoc

import org.slf4j.LoggerFactory;


public class RangeIndexHandler {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add javadoc

@@ -86,6 +87,11 @@ private void extractFromTableConfig(@Nonnull TableConfig tableConfig) {
_invertedIndexColumns.addAll(invertedIndexColumns);
}

List<String> rangeIndexColumns = indexingConfig.getRangeIndexColumns();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use FieldConfig -- this is another index type

@@ -124,14 +129,21 @@ public PhysicalColumnIndexContainer(SegmentDirectory.Reader segmentReader, Colum
_invertedIndex =
new BitmapInvertedIndexReader(segmentReader.getIndexFor(columnName, ColumnIndexType.INVERTED_INDEX),
metadata.getCardinality());
_rangeIndex = null;
} else if (loadRangeIndex) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be if not else-if

@@ -124,14 +129,21 @@ public PhysicalColumnIndexContainer(SegmentDirectory.Reader segmentReader, Colum
_invertedIndex =
new BitmapInvertedIndexReader(segmentReader.getIndexFor(columnName, ColumnIndexType.INVERTED_INDEX),
metadata.getCardinality());
_rangeIndex = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if the if-else blocks here can be simplified

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope so but dint want to do that as part of this PR.

@Override
public void put(int position, Number value) {
_buffer.putInt(position << 2, value.intValue());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using constants for << 2, << 3 would be better for readability. Something like INT_SIZE_BYTES, DOUBLE_SIZE_BYTES etc

// Bitmap for range 2
// ......
// Bitmap for range R - 1
long bytesWritten = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see _forwardIndexValueBuffer being used

Copy link
Member Author

@kishoreg kishoreg May 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we dont need that right?

return _valueBuffer.compare(val1, val2);
};
Swapper swapper = (i, j) -> {
Number temp = _docIdBuffer.get(i).intValue();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we are doing in-place sorting on _docIdBuffer. Why do we need _docIdValueBuffer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to sort the forward index, instead of sorting the actual forward index, I sort the position array

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thanks.

throws IOException {
//sort the forward index copy
//go over the sorted value to compute ranges
IntComparator comparator = (i, j) -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is either going to sort raw values or dictIds. Correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual sort happens on docIdBuffer but uses forward index for sorting

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not call it forward index. Let's call it keys?


@Override
public void addDoc(Object document, int docIdCounter) {
throw new IllegalStateException("Bitmap inverted index creator does not support Object type currently");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the exception

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix this exception

// Forward index buffers (from docId to dictId)
private int _nextDocId;
private PinotDataBuffer _forwardIndexValueBuffer;
private NumberValueBuffer _valueBuffer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This buffer will either store dictIds or raw values. Correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's please rename it indexKeyBuffer?


// Forward index buffers (from docId to dictId)
private int _nextDocId;
private PinotDataBuffer _forwardIndexValueBuffer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some short comments (one line is enough) about each buffer type?

private final int _numValues;
private final boolean _useMMapBuffer;

// Forward index buffers (from docId to dictId)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we creating the forward index for the column as part of creating range index? The loop in SegmentColumnarIndexCreator that goes over each GenericRow takes care of that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its a temp buffer, will renaming it to valueBuffer help?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not use the term forward index inside range index creator. Can we please call it keyBuffer. The key that are stored could be dictIds or raw values.

* </ul>
* <p>Based on the number of values we need to store, we use direct memory or MMap file to allocate the buffer.
*/
public final class RangeIndexCreator implements InvertedIndexCreator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem with treating this as another inverted index by overriding InvertedIndexCreator and InvertedIndexReader interfaces, is that we are inherently making this work only on dictionary encoded columns because these interfaces and their APIs are written to work only on dictIds

However, for text index (which also implements the same interface), we added a docId based API but that takes an object value. May be we can expand that API for each primitive type and that way we can support raw value based inverted index as well. I think we should decide on this sooner because this question will pop up for every new index type that we may add.

Another alternative would be to change the inheritance hierarchy by introducing a mid layer of interfaces -- DictionaryBasedInvertedIndexCreator, RawValueBasedInvertedIndexCreator. Move all the dictId based APIs to former and raw value based to latter

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, Jackie and I discussed about it and concluded the same. Jackie will clean this up at some point.

* In the second pass (processing values phase), when seal() method is called, all the dictIds should already been
* added. We first reorder the values into the inverted index buffers by going over the dictIds in forward index
* value buffer (for multi-valued column we also need forward index length buffer to get the docId for each dictId).
* <p>Once we have the inverted index buffers, we simply go over them and create the bitmap for each dictId and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't sound right. We don't create a bitmap for each dictId. I believe there is a bitmap for a contiguous range of dictIds

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, this is copy paste error from bitmap index

* A
* </li>
* <li>
* In the first pass (adding values phase), when add() method is called, store the raw values into the forward index
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean store the dictionary Ids?

@@ -22,6 +22,8 @@


public interface PredicateEvaluator {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

@Override
protected FilterBlock getNextBlock() {

//only dictionary based is supported for now
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a TODO for supporting this on raw columns

import org.roaringbitmap.buffer.MutableRoaringBitmap;


public class RangeFilterOperator extends BaseFilterOperator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add javadoc

@@ -54,6 +54,12 @@ public static BaseFilterOperator getLeafFilterOperator(PredicateEvaluator predic

Predicate.Type predicateType = predicateEvaluator.getPredicateType();

//Only for dictionary encoded columns and offline data sources
if (predicateType == Predicate.Type.RANGE && dataSource.getDictionary() != null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why only dictionary encoded? For raw columns, each range could be a range of start and end raw values instead of dictIds. The data structure could be sorted on start raw values (just like it is on start dictIds)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a TODO



@SuppressWarnings("unchecked")
public final class DictionaryBasedSingleValueIterator extends BlockSingleValIterator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add javadoc



@SuppressWarnings("unchecked")
public final class DictionaryBasedMultiValueIterator extends BlockMultiValIterator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add javadoc


private int _nextDocId;

public DictionaryBasedSingleValueIterator(SingleColumnSingleValueReader reader, Dictionary dictionary, int numDocs) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why do we have to implement two new iterators?

int startRangeId = rangeIndexReader.findRangeId(evaluator.getStartDictId());
int endRangeId = rangeIndexReader.findRangeId(evaluator.getEndDictId());
//Handle Matching Ranges - some ranges match fully but some partially
//below code assumes first and last range always match partially which may not be the case always //todo: optimize it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not about optimization. Won't this impact correctness?
For a given startDictId and endDictId, we find the matching ranges represented by startRangeId and endRangeId.
Some/all ranges between these could be a full match and some/all could be a partial match. Right?
I understand that all full matches can be ORed but how can we assume that everything from startRangeId + 1 to endRangeId - 1 will be a full match?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not?

private NumberValueBuffer _numberValueBuffer;

private final File _tempDocIdBufferFile;
private PinotDataBuffer _docIdValueBuffer;
Copy link
Contributor

@siddharthteotia siddharthteotia May 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's rename _docIdValueBuffer to _docIdBuffer and _docIdBuffer to _docIdBufferWrapperForSorting?


private final File _rangeIndexFile;

private final File _tempValueBufferFile;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let'a add comments explaining purpose of each buffer.

dump();
}

//sort the value buffer, change the docId buffer to maintain the mapping
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a diagram here showing the contents of _numberValueBuffer and _docIdBuffer before and after sorting to explain the process.

Copy link
Contributor

@siddharthteotia siddharthteotia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for working on this and addressing the comments. There are some follow-ups that I can take up:

(1) UseFieldConfig instead of IndexingConfig for specifying range index in table config
(2) Cleanup the if-else logic in PhysicalColumnIndexContainer that decides what indexes we should load.
(3) Refactor the inverted index interface to allow creating both raw and dictionary based (I already have this in progress).
(4) After (3), add support for creating raw values based range indexes (the index creator/reader need not change but there are assumptions elsewhere in the code)

@kishoreg kishoreg merged commit 602f28a into master May 14, 2020
@Jackie-Jiang Jackie-Jiang deleted the range-index branch May 14, 2020 19:38
Jackie-Jiang added a commit to Jackie-Jiang/pinot that referenced this pull request May 14, 2020
1. Fix the compilation error introduced because of the merge of apache#4597 and apache#5240
2. Fix the bug of not loading the range index if both inverted index and range index exist

TODO: The range index triggeres another severe issue of accessing closed DataBuffer which can cause JVM crash. Will address in a separate PR
Jackie-Jiang added a commit that referenced this pull request May 15, 2020
1. Fix the compilation error introduced because of the merge of #4597 and #5240
2. Fix the bug of not loading the range index if both inverted index and range index exist

TODO: The range index triggeres another severe issue of accessing closed DataBuffer which can cause JVM crash. Will address in a separate PR
@mcvsubbu mcvsubbu mentioned this pull request Aug 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants