docs(arrow-select): document FilterSelection / FilterPredicate::selection (docs for #9755) #10056
Draft
alamb wants to merge 2 commits into
Teach BatchCoalescer to reuse a FilterPredicate when coalescing filtered batches whose non-primitive columns are inline Utf8View/BinaryView values. This avoids materializing an intermediate filtered RecordBatch for sparse filters and copies inline views and nulls directly into the in-progress arrays.

Keep materialized filtering for dense filters, batches that do not fit the coalescer buffer, and byte-view arrays with external buffers. Use a looser dense threshold for multi-column batches, where sharing the row selection across columns pays for itself.

Add shared FilterSelection iterators so primitive and byte-view coalescers can consume materialized or lazy row selections without matching per row.

Signed-off-by: cl <cailue@apache.org>
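The fallback rules in the commit message above can be read as a small decision function. The following is a minimal sketch under stated assumptions, not the code from #9755: `CoalescePath`, `choose_path`, and both threshold constants are hypothetical names and values chosen for illustration, since the commit message does not state the real thresholds.

```rust
/// Hypothetical outcome for one filtered batch (illustrative name only).
#[derive(Debug, PartialEq)]
enum CoalescePath {
    /// Copy the selected rows straight into the in-progress arrays.
    Fused,
    /// Materialize a filtered batch first, then append it.
    Materialized,
}

// Assumed values: the commit message only says the multi-column
// threshold is "looser", it does not give numbers.
const DENSE_SINGLE_COLUMN: f64 = 0.25;
const DENSE_MULTI_COLUMN: f64 = 0.5;

/// Decide how to coalesce a batch, following the three fallbacks listed
/// in the commit message: external view buffers, lack of buffer space,
/// and dense filters.
fn choose_path(
    selected_rows: usize,
    total_rows: usize,
    num_columns: usize,
    remaining_capacity: usize,
    views_all_inline: bool,
) -> CoalescePath {
    // Byte-view arrays with external buffers, or batches that do not
    // fit the coalescer buffer, keep the materialized path.
    if !views_all_inline || selected_rows > remaining_capacity || total_rows == 0 {
        return CoalescePath::Materialized;
    }
    let selectivity = selected_rows as f64 / total_rows as f64;
    // Multi-column batches tolerate a denser filter, because one row
    // selection is shared across every column.
    let dense_threshold = if num_columns > 1 {
        DENSE_MULTI_COLUMN
    } else {
        DENSE_SINGLE_COLUMN
    };
    if selectivity < dense_threshold {
        CoalescePath::Fused
    } else {
        CoalescePath::Materialized
    }
}
```

With these assumed constants, a filter keeping 400 of 1000 rows would take the materialized path for a single-column batch but the fused path for a three-column batch.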
…ection

Add rationale and usage docs for the new `pub(crate)` filtering APIs introduced alongside the fused inline-view coalescing path: explain that `FilterSelection` borrows the predicate's internal indices/slices so the same predicate can drive several arrays without cloning, document each `FilterSelection` variant, the `FilterIterator` materialized/lazy split, its `for_each`/`try_for_each` helpers, and the `strategy` field.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
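To illustrate the borrowing idea described in this commit message, here is a toy model using only the standard library. It is a sketch under assumptions, not the arrow-select implementation: `ToyPredicate`, `Selection`, and `append_selected` are hypothetical names, and only the indices and slices cases are modeled (the lazy path is left out).

```rust
use std::ops::Range;

/// Toy stand-in for a predicate that has already computed which rows
/// to keep. Hypothetical type, not `FilterPredicate`.
enum ToyPredicate {
    Indices(Vec<usize>),
    Slices(Vec<Range<usize>>),
}

/// Borrowed view of the predicate's row selection. Hypothetical type,
/// loosely modeled on the description of `FilterSelection`.
enum Selection<'a> {
    Indices(&'a [usize]),
    Slices(&'a [Range<usize>]),
}

impl ToyPredicate {
    /// Borrow the internal indices/slices without cloning them.
    fn selection(&self) -> Selection<'_> {
        match self {
            ToyPredicate::Indices(idx) => Selection::Indices(idx),
            ToyPredicate::Slices(slices) => Selection::Slices(slices),
        }
    }
}

impl Selection<'_> {
    /// Visit every selected row. The variant is matched once, outside
    /// the row loop, rather than once per row.
    fn for_each(&self, mut f: impl FnMut(usize)) {
        match self {
            Selection::Indices(idx) => idx.iter().for_each(|&i| f(i)),
            Selection::Slices(slices) => {
                for range in slices.iter() {
                    for i in range.clone() {
                        f(i);
                    }
                }
            }
        }
    }

    /// Fallible variant: stops at the first error returned by `f`.
    fn try_for_each<E>(&self, mut f: impl FnMut(usize) -> Result<(), E>) -> Result<(), E> {
        match self {
            Selection::Indices(idx) => idx.iter().try_for_each(|&i| f(i)),
            Selection::Slices(slices) => {
                for range in slices.iter() {
                    for i in range.clone() {
                        f(i)?;
                    }
                }
                Ok(())
            }
        }
    }
}

/// Append the selected rows of one column to an in-progress buffer.
fn append_selected<T: Copy>(src: &[T], sel: &Selection<'_>, out: &mut Vec<T>) {
    sel.for_each(|i| out.push(src[i]));
}
```

In this model, one `selection()` call can be passed to `append_selected` for each column of a batch, so the row selection is computed once and reused without copying the index or slice vectors.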
Which issue does this PR close?

Documentation follow-up for #9755 (arrow-select: fuse inline Utf8View/BinaryView filter coalescing).

Note

This branch is stacked on top of #9755. That PR is not yet merged to `main`, so the diff here also shows its feature commit. The contribution in this PR is the single docs commit on top (`arrow-select/src/filter.rs` only); the intent is to fold it into #9755 (or merge alongside it).

Rationale for this change

The `pub(crate)` filtering APIs added alongside the fused inline-view path (`FilterSelection`, `FilterIterator`, and `FilterPredicate::selection`) had little explanation of why they exist or how to use them. This adds that rationale.

What changes are included in this PR?

Comments only

Are there any user-facing changes?

No (the documented items are `pub(crate)`).