
feat: Support data evolution row id filter #222

Merged
JingsongLi merged 23 commits into apache:main from littlecoder04:support_row_id_filter
Apr 7, 2026

@littlecoder04 littlecoder04 commented Apr 7, 2026

Purpose

Linked issue: subtask of #173

Brief change log

Tests

API and Format

Documentation

…Selection instead of skipping IO filtering

- Replace post-read filter approach with pre-computed selected row ID sequence
- RowSelection is always applied at Parquet level for IO optimization
- Row IDs are assigned from the pre-computed sequence matching RowSelection output
- Extract insert_column_at to deduplicate column insertion logic
- Empty row_ranges treated as None (no filtering)
- Use saturating_add to prevent overflow in merge_row_ranges and build_row_ranges_selection
- Compute row_ranges before moving file_group to avoid clone
- Remove arrow-select dependency (no longer needed)
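
The range handling described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the signatures of `merge_row_ranges` and `build_row_ranges_selection` are guesses, `normalize_row_ranges` is a hypothetical helper, and the `Run` enum is a hypothetical stand-in for a Parquet-level row selector. It assumes inclusive `(start, end)` row ID ranges with `start <= end`, and a file described by its first row ID and row count.

```rust
/// Hypothetical stand-in for a Parquet row selector: a run of rows to skip or read.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Run {
    Skip(u64),
    Select(u64),
}

/// Treat an empty list of ranges the same as `None` (no filtering).
fn normalize_row_ranges(ranges: Option<Vec<(u64, u64)>>) -> Option<Vec<(u64, u64)>> {
    ranges.filter(|r| !r.is_empty())
}

/// Merge inclusive `(start, end)` ranges that overlap or are adjacent.
/// Takes the vector by value so it can be sorted in place without a clone.
fn merge_row_ranges(mut ranges: Vec<(u64, u64)>) -> Vec<(u64, u64)> {
    ranges.sort_unstable();
    let mut merged: Vec<(u64, u64)> = Vec::with_capacity(ranges.len());
    for (start, end) in ranges {
        if let Some(last) = merged.last_mut() {
            // `saturating_add` keeps `last.1 + 1` from overflowing at `u64::MAX`.
            if start <= last.1.saturating_add(1) {
                last.1 = last.1.max(end);
                continue;
            }
        }
        merged.push((start, end));
    }
    merged
}

/// Turn merged global ranges into skip/select runs for one file, together with
/// the row IDs of the selected rows in the order the reader will emit them.
fn build_row_ranges_selection(
    first_row_id: u64,
    row_count: u64,
    merged: &[(u64, u64)],
) -> (Vec<Run>, Vec<u64>) {
    // Exclusive end of the file's row ID span, clamped instead of overflowing.
    let file_end = first_row_id.saturating_add(row_count);
    let mut runs = Vec::new();
    let mut row_ids = Vec::new();
    let mut cursor = first_row_id;
    for &(start, end) in merged {
        let lo = start.max(cursor);
        let hi = end.saturating_add(1).min(file_end); // exclusive
        if lo >= hi {
            continue; // range lies entirely outside this file
        }
        if lo > cursor {
            runs.push(Run::Skip(lo - cursor));
        }
        runs.push(Run::Select(hi - lo));
        row_ids.extend(lo..hi);
        cursor = hi;
    }
    if cursor < file_end {
        runs.push(Run::Skip(file_end - cursor));
    }
    (runs, row_ids)
}
```

Because the row IDs are computed from the same ranges that drive the selection, the number of IDs always equals the number of rows the reader returns, so no post-read filter is needed in this sketch.
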
…redundant merge, avoid clone in merge_row_ranges, update limit comment

@JingsongLi JingsongLi left a comment

+1

@JingsongLi JingsongLi merged commit 96c8715 into apache:main Apr 7, 2026
8 checks passed
3 participants