Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Convert predicate to arrow filter and push down to parquet reader #295

Merged
merged 21 commits into from
May 15, 2024

Conversation

viirya
Copy link
Member

@viirya viirya commented Mar 24, 2024

This implements the feature of row filtering when reading Parquet files in Iceberg scan. It is achieved by converting predicates into Parquet Arrow filter which is used to filter rows during reading in Parquet reader.

This implements AlwaysTrue, AlwaysFalse, And, Or, Not, Binary, partial Unary predicates. Unimplemented predicates (some Unary and Set predicates) are because no existing kernels to be used in arrow. I'll implement them in following works.

close #265

@viirya viirya marked this pull request as draft March 24, 2024 17:26
@viirya viirya force-pushed the filter_to_row_filter branch 3 times, most recently from 5606bbe to 67c3f05 Compare March 25, 2024 00:43
@viirya viirya marked this pull request as ready for review March 25, 2024 00:43
crates/iceberg/src/arrow.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow.rs Outdated Show resolved Hide resolved
@liurenjie1024
Copy link
Collaborator

cc @viirya Is this ready for review or you still need to do more update?

@viirya
Copy link
Member Author

viirya commented Apr 1, 2024

@liurenjie1024 It is ready for review. I will fix the conflicts.

Copy link
Collaborator

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @viirya Thanks for this pr, the general idea of reusing arrow kernel looks great! But I found some small problems which could be improved.

crates/iceberg/src/arrow.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow.rs Outdated Show resolved Hide resolved
PredicateOperator::NotNull => Ok(Box::new(ArrowPredicateFn::new(
self.projection_mask.clone(),
move |batch| {
let column = batch.column(term_index);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This maybe incorrect for nested column, I think maybe we should either return projection_mask for each leave column, or implement a general purpose flatten method for struct array.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Parquet reader API will flatten required columns based on the projection_mask we provide. I.e., If the projection mask selects one nested column a.b, it will be the first column of the record batch when calling evaluate of ArrowPredicate API.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some test but it seems that it doesn't work like this, say the schem is like following:

message {
  struct a {
    int b
  }
}

And we pass ProjectionMask::leaves([0]), it will return struct array, so batch.column[term_index] will return StructArray

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's what I got from reading the Parquet code. Let me take further look and test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wrote a test to test it in arrow-rs. Yea, for a nested column such as struct.a, the batch passed to evaluate contains a struct array with the column a. Looks like Parquet will project the requested column indices and construct the upper nested type (i.e., struct) before passing to evaluate,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the test to arrow-rs to clarify its usage: apache/arrow-rs#5600

Copy link
Member Author

@viirya viirya Apr 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think to have a flatten method for struct array sounds more simple way. I'm looking in arrow-rs to see if there is existing one, if not, we need to implement it here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This maybe incorrect for nested column, I think maybe we should either return projection_mask for each leave column, or implement a general purpose flatten method for struct array.

I tried to change to return projection_mask for each leave column, it is pretty straightforward to implement. Please let me know if it looks good to you. Thanks.

@viirya
Copy link
Member Author

viirya commented Apr 3, 2024

I've addressed some of above reviews. I will resolve other reviews soon. Thanks.

@viirya
Copy link
Member Author

viirya commented Apr 12, 2024

@liurenjie1024 I've addressed all comments. Thank you.

Copy link
Collaborator

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @viirya for this pr, it looks promising! Given #334 has been merged, do you mind to rewrite some of the visitors with new api? This helps to make code easier to maintain.

crates/iceberg/src/arrow/reader.rs Show resolved Hide resolved
crates/iceberg/src/arrow/reader.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow/reader.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow/reader.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow/reader.rs Show resolved Hide resolved
crates/iceberg/src/arrow/reader.rs Show resolved Hide resolved
@viirya
Copy link
Member Author

viirya commented Apr 28, 2024

@liurenjie1024 Thanks for review. Sorry for late. I addressed the comments by rewriting the visitors using the new API. I replied with another questions.

Copy link
Collaborator

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @viirya for this great pr! It's really hard work, and you did it in a quite elegant way. I found some small problems to fix.

crates/iceberg/src/arrow/reader.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow/reader.rs Outdated Show resolved Hide resolved
fn always_true(&mut self) -> Result<Self::T> {
Ok(Box::new(ArrowPredicateFn::new(
self.projection_mask.clone(),
|batch| Ok(BooleanArray::from(vec![true; batch.num_rows()])),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This maybe not a blocker, but is it possible to build a const array in arrow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I know, there is no const array in arrow for now. It has dictionary array which I think is close to const array, but the ArrowPredicateFn API defines returned type to be BooleanArray.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created an issue to track this: apache/arrow-rs#5701

crates/iceberg/src/arrow/reader.rs Outdated Show resolved Hide resolved
reference: &BoundReference,
_predicate: &BoundPredicate,
) -> Result<Self::T> {
let projected_mask = self.bound_reference(reference)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems incorrect to me. Let's say the predicate is a is null AND b >1, then the batch passed to this ArrowPredicateFn is constructed by projection mask of [a, b]. I think one possible solution is to use same project mask for all predicates, and pass the column_idx to get_leaf_column.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, as RowFilter API design accepts multiple ArrowPredicates. Each ArrowPredicate has its projection and the API doc said the API will be passed batches that contains the columns specified in projection. I think it should be projected for each ArrowPredicate and projection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will go to test it in arrow-rs to verify it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, no, although I used multiple ArrowPredicates before, it was changed to one ArrowPredicates after you suggested. So now we generate only one ArrowPredicates.

Hmm, let me think how do deal with it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we use one ArrowPredicates to represent the entire predicate, we should use one projection mask which contains all leaf columns in the predicate.

But it brings another question, how do we access the correct array from the RecordBatch. For top-level column, it should be straightforward, but for nested column, I don't find a way to get it quickly.

I created one issue at arrow-rs: apache/arrow-rs#5699

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on what I searched and the kindly reply on the issue, I think there is no way to do nested projection on RecordBatch currently.

To implement the feature in arrow-rs might block this. I tend to finish top-level column only in this PR.

WDYT, @liurenjie1024 ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we can support top-level column only first to move on.

crates/iceberg/src/arrow/reader.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow/reader.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow/reader.rs Outdated Show resolved Hide resolved
crates/iceberg/src/arrow/reader.rs Outdated Show resolved Hide resolved
}

#[tokio::test]
async fn test_filter_on_arrow_is_not_null() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tests, is it possible to add serveral test cases for more complex types such as AND, OR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I added some more tests using AND and OR.

},
)))
} else {
self.build_always_true()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a column is missing, I think we should treat it as null, so this should be false?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, fixed it.

Ok(BooleanArray::from(vec![true; batch.num_rows()]))
})))
} else {
self.build_always_true()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

crates/iceberg/src/arrow/reader.rs Outdated Show resolved Hide resolved
},
)))
} else {
self.build_always_true()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

crates/iceberg/src/arrow/reader.rs Outdated Show resolved Hide resolved
@viirya
Copy link
Member Author

viirya commented May 10, 2024

@liurenjie1024 I've addressed your comments. Please take a look when you can. Thanks.

Copy link
Collaborator

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @viirya for this effort!

@liurenjie1024
Copy link
Collaborator

Oh, sorry, seems we need to resolve conflicts. Others LGTM, thanks!

@viirya
Copy link
Member Author

viirya commented May 14, 2024

Thanks @liurenjie1024. I just resolved the conflicts.

@liurenjie1024
Copy link
Collaborator

Thanks @viirya for this great effort!

@liurenjie1024 liurenjie1024 merged commit 81df940 into apache:main May 15, 2024
7 checks passed
@viirya
Copy link
Member Author

viirya commented May 15, 2024

Thanks @liurenjie1024 for your review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert row filter to arrow filter
2 participants