Adding basic null predicate support #4943

Merged 3 commits into apache:master on Dec 21, 2019

icefury71
Contributor

Adding null predicate support in reference to Issue #4230

This PR adds limited support for "IS NULL" and "IS NOT NULL" filter predicates. Currently this only works for leaf filter predicates.
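As a rough illustration of what the two predicates compute at a leaf, here is a minimal sketch that assumes each column keeps a bitmap of the doc ids whose value is null. `java.util.BitSet` stands in for the RoaringBitmap types used in this PR, and `NullPredicateSketch`, `matchingDocIds`, and `numDocs` are hypothetical names, not Pinot APIs.

```java
import java.util.BitSet;

/** Illustrative only: BitSet is a stand-in for the segment's null value bitmap. */
final class NullPredicateSketch {
  enum Type { IS_NULL, IS_NOT_NULL }

  /**
   * Returns the doc ids matched by a leaf null predicate.
   * nullDocIds marks the documents whose column value is null;
   * numDocs is the number of documents in the segment (assumed known).
   */
  static BitSet matchingDocIds(Type type, BitSet nullDocIds, int numDocs) {
    // Work on a copy so the caller's bitmap is never modified.
    BitSet result = (BitSet) nullDocIds.clone();
    // Drop any bits that fall outside the segment.
    if (result.length() > numDocs) {
      result.clear(numDocs, result.length());
    }
    // IS NOT NULL is the complement of the null bitmap within [0, numDocs).
    if (type == Type.IS_NOT_NULL) {
      result.flip(0, numDocs);
    }
    return result;
  }
}
```

With nulls at doc ids 1 and 3 in a five-document segment, `IS_NULL` would select docs 1 and 3, and `IS_NOT_NULL` would select docs 0, 2, and 4.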

codecov-io commented Dec 20, 2019

Codecov Report

Merging #4943 into master will decrease coverage by 0.22%.
The diff coverage is 65.21%.

@@             Coverage Diff              @@
##             master    #4943      +/-   ##
============================================
- Coverage     56.57%   56.34%   -0.23%     
  Complexity       16       16              
============================================
  Files          1176     1180       +4     
  Lines         62835    63068     +233     
  Branches       9222     9251      +29     
============================================
- Hits          35549    35538      -11     
- Misses        24643    24881     +238     
- Partials       2643     2649       +6
| Impacted Files | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| ...mpl/invertedindex/RealtimeInvertedIndexReader.java | 100% <ø> (+3.03%) | 0 <0> (ø) | ⬇️ |
| ...gment/index/readers/NullValueVectorReaderImpl.java | 100% <100%> (ø) | 0 <0> (ø) | ⬇️ |
| .../apache/pinot/pql/parsers/pql2/ast/FilterKind.java | 100% <100%> (ø) | 0 <0> (ø) | ⬇️ |
| ...ava/org/apache/pinot/core/plan/FilterPlanNode.java | 92% <100%> (+1.09%) | 0 <0> (ø) | ⬇️ |
| .../org/apache/pinot/pql/parsers/Pql2AstListener.java | 91.82% <100%> (+2.51%) | 0 <0> (ø) | ⬇️ |
| ...t/pql/parsers/pql2/ast/IsNullPredicateAstNode.java | 32% <32%> (ø) | 0 <0> (?) | |
| ...e/pinot/core/common/predicate/IsNullPredicate.java | 66.66% <66.66%> (ø) | 0 <0> (?) | |
| ...inot/core/common/predicate/IsNotNullPredicate.java | 66.66% <66.66%> (ø) | 0 <0> (?) | |
| ...rg/apache/pinot/common/request/FilterOperator.java | 85.18% <66.66%> (-6.12%) | 0 <0> (ø) | |
| ...nullvalue/RealtimeNullValueVectorReaderWriter.java | 85.71% <66.66%> (-14.29%) | 0 <0> (ø) | |
| ... and 40 more | | | |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b250439...321a07c.

Member

@kishoreg kishoreg left a comment

LGTM. Minor comments

@@ -111,6 +113,15 @@ private static BaseFilterOperator constructPhysicalOperator(FilterQueryTree filt
// Leaf filter operator
Predicate predicate = Predicate.newPredicate(filterQueryTree);

// Check for null predicate
Predicate.Type type = predicate.getType();

Member

Move this code to FilterOperatorUtils.getLeafOperator? We can have an IsPredicateEvaluator that returns empty lists for the matching and non-matching dictIds.

Contributor

I think it is a little bit hard to model NULL into PredicateEvaluator though, as there is no easy way to hook the nullValueVector into it. I'm okay making this an extra check for now.

Contributor Author

Yeah - couldn't figure out how to do this in predicate evaluator. Leaving it as is for now.

Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Mostly good except for the thread-safety issue

@@ -39,4 +40,9 @@ public void setNull(int docId) {
public boolean isNull(int docId) {
return _nullBitmap.contains(docId);
}

@Override
public ImmutableRoaringBitmap getNullBitmap() {

Contributor

This is not thread-safe. You might want to use the ThreadSafeMutableRoaringBitmap wrapper in RealtimeInvertedIndexReader

Member

@Jackie-Jiang Why do we need that? There is only one thread updating the bitmap and multiple threads reading it. Cloning the entire bitmap on every access is going to be expensive. When did we make this change?

Contributor

@kishoreg MutableRoaringBitmap itself is not thread-safe. You cannot read the bitmap while updating it. One possible solution is using ReadWriteLock for this.

Contributor

Actually, ReadWriteLock might not work in this case, as there are too many read methods.

Member

I didn't know about this. We should measure the performance impact on the ingestion rate (due to synchronization) and on query throughput (due to the clone). A read-write lock will not help, since the bitmap is updated frequently.

The synchronization is not really needed, since only one thread (the consumer) is updating it. The clone is the important part.

Contributor

We need synchronization because the clone and the update cannot happen at the same time either.

Member

Missed that. Let's create another issue to measure and optimize this.
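For readers following the thread, the pattern under discussion (one writer, many readers, synchronized updates, clone on read) can be sketched roughly as below. This is an assumption-laden illustration rather than the actual ThreadSafeMutableRoaringBitmap code: `java.util.BitSet` stands in for MutableRoaringBitmap, and `ThreadSafeBitmapSketch`, `add`, `contains`, and `snapshot` are hypothetical names.

```java
import java.util.BitSet;

/** Illustrative only: a synchronized wrapper that hands readers a private copy. */
final class ThreadSafeBitmapSketch {
  // The underlying bitmap is not thread-safe, so every access goes through this object's lock.
  private final BitSet _bitmap = new BitSet();

  /** Called by the single consuming (writer) thread. */
  synchronized void add(int docId) {
    _bitmap.set(docId);
  }

  /** Point lookup; synchronized so it never overlaps with an update. */
  synchronized boolean contains(int docId) {
    return _bitmap.get(docId);
  }

  /**
   * Returns a copy taken under the lock. The clone and the update exclude each
   * other, and later updates do not affect a copy that was already handed out.
   */
  synchronized BitSet snapshot() {
    return (BitSet) _bitmap.clone();
  }
}
```

Every query thread that calls `snapshot()` pays for a full copy, and every update contends for the same lock, which is the ingestion and query cost the thread proposes to measure in a follow-up issue.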

- Refactor and use ThreadSafeMutableRoaringBitmap for thread safety
- Other minor fixes

Contributor Author

@icefury71 icefury71 left a comment

Addressed comments

Contributor

@Jackie-Jiang Jackie-Jiang left a comment

LGTM

if (type.equals(Predicate.Type.IS_NULL) || type.equals(Predicate.Type.IS_NOT_NULL)) {
DataSource dataSource = segment.getDataSource(filterQueryTree.getColumn());
ImmutableRoaringBitmap nullBitmap = dataSource.getNullValueVector().getNullBitmap();
boolean exclusive = (type == Predicate.Type.IS_NULL) ? false : true;

Contributor

I think we can either change this to boolean exclusive = type != Predicate.Type.IS_NULL or maybe move the entire thing to the next statement?

Member

I prefer avoiding ternary operators altogether as a convention :) https://agiletribe.wordpress.com/2011/11/01/21-avoid-ternary-conditional-operator/

boolean exclusive = false;
if (type != Predicate.Type.IS_NULL) {
  exclusive = true;
}

Contributor

They are different performance-wise. This if condition is redundant.
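To make the options in this thread concrete, the sketch below puts the three spellings of the flag side by side. It uses a local `Type` enum instead of `Predicate.Type`, and `ExclusiveFlagSketch` and its method names are hypothetical. All three are meant to return `true` only for IS_NOT_NULL, matching the line under review.

```java
/** Illustrative only: three ways to derive the "exclusive" flag from the predicate type. */
final class ExclusiveFlagSketch {
  enum Type { IS_NULL, IS_NOT_NULL }

  // The form in the diff under review.
  static boolean ternary(Type type) {
    return (type == Type.IS_NULL) ? false : true;
  }

  // The direct comparison suggested in the review.
  static boolean direct(Type type) {
    return type != Type.IS_NULL;
  }

  // An explicit if-block, for code bases that avoid the ternary operator.
  static boolean ifBlock(Type type) {
    boolean exclusive = false;
    if (type != Type.IS_NULL) {
      exclusive = true;
    }
    return exclusive;
  }
}
```

The direct comparison is the shortest of the three and needs no branch in the source; the other two express the same value with more ceremony.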

@Jackie-Jiang Jackie-Jiang merged commit 09db4d9 into apache:master Dec 21, 2019