feat(scan): add BucketSelectConverter for predicate-based bucket pruning#234

Merged
lxy-9602 merged 18 commits into
alibaba:mainfrom
liangjie3138:dev_bucket_selector
Apr 22, 2026
@liangjie3138
Contributor

Purpose

When reading bucketed primary key tables, if the query predicate contains equality conditions on all bucket key fields (for example, a point lookup such as WHERE pk = 'value'), the current implementation still scans data files from all buckets.

This PR introduces BucketSelectConverter, which automatically extracts literal values for bucket keys from EQUAL predicates and computes the target bucket ID using the configured bucket function (DEFAULT, MOD, or HIVE). This allows irrelevant buckets to be pruned during the scan phase, significantly reducing I/O and computation overhead for point lookup scenarios.
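The pruning idea above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `EqualPredicate`, `SelectBucket`, and the toy combine-then-modulo bucket function are hypothetical stand-ins for the real predicate tree, `BinaryRow`, and `BucketFunction`, and the keys are assumed to be 64-bit integers.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one leaf of an AND predicate: `field = literal`.
struct EqualPredicate {
    std::string field;
    int64_t literal;
};

// Returns the target bucket when every bucket key is pinned by an equality
// constraint, or std::nullopt when no pruning can be derived.
std::optional<int32_t> SelectBucket(const std::vector<EqualPredicate>& conjuncts,
                                    const std::vector<std::string>& bucket_keys,
                                    int32_t num_buckets) {
    assert(num_buckets > 0);  // Precondition of this sketch.

    // Collect one literal per field from the conjuncts.
    std::map<std::string, int64_t> literals;
    for (const auto& predicate : conjuncts) {
        auto [it, inserted] = literals.emplace(predicate.field, predicate.literal);
        if (!inserted && it->second != predicate.literal) {
            // Contradictory equalities: this sketch simply declines to prune.
            return std::nullopt;
        }
    }

    // Toy bucket function (an assumption, not DEFAULT, MOD, or HIVE):
    // combine the key literals in bucket-key order, then take the modulo.
    uint64_t combined = 0;
    for (const auto& key : bucket_keys) {
        auto it = literals.find(key);
        if (it == literals.end()) {
            return std::nullopt;  // A bucket key is unconstrained.
        }
        combined = combined * 31 + static_cast<uint64_t>(it->second);
    }
    return static_cast<int32_t>(combined % static_cast<uint64_t>(num_buckets));
}
```

A predicate that constrains only some of the bucket keys yields `std::nullopt`, in which case the scan would read all buckets as before.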

Main Changes

  • Add a new BucketSelectConverter class that extracts equality constraints on bucket key fields from AND predicates, builds a BinaryRow, and computes the target bucket using the corresponding BucketFunction.
  • Integrate the converter into KeyValueFileStoreScan::SplitAndSetKeyValueFilter(), so that when a target bucket can be derived, a bucket filter is automatically applied.
  • Add a protected method SetBucketFilterIfAbsent() in FileStoreScan to ensure that an explicitly configured bucket filter is not overridden.
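The "if absent" behavior described in the last item can be sketched as follows. This is an illustration under assumptions, not the actual `FileStoreScan` code: the class `ScanSketch`, the `BucketFilter` alias, and the `Accepts` helper are hypothetical, and the method is public here (rather than protected) only so the sketch can be exercised directly.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical, simplified stand-in for a scan object that holds a bucket filter.
class ScanSketch {
 public:
    using BucketFilter = std::function<bool(int32_t)>;

    // Explicit configuration by the caller.
    void SetBucketFilter(BucketFilter filter) { bucket_filter_ = std::move(filter); }

    // Installs a single-bucket filter derived from the predicate, but only if
    // no filter has been configured yet. Returns true if the filter was set.
    bool SetBucketFilterIfAbsent(int32_t target_bucket) {
        assert(target_bucket >= 0);  // Precondition of this sketch.
        if (bucket_filter_.has_value()) {
            return false;  // Keep the explicitly configured filter.
        }
        bucket_filter_ = [target_bucket](int32_t bucket) { return bucket == target_bucket; };
        return true;
    }

    // With no filter installed, every bucket is scanned.
    bool Accepts(int32_t bucket) const {
        return !bucket_filter_.has_value() || (*bucket_filter_)(bucket);
    }

 private:
    std::optional<BucketFilter> bucket_filter_;
};
```

In this sketch an explicit filter always wins: a derived target bucket narrows the scan only when the caller has not supplied a filter.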

Tests

  • Add BucketSelectConverterTest.

API and Format

None. This change does not affect public APIs under the include directory and does not introduce any storage format or protocol changes.

Documentation

None.

Generative AI Tooling

Generated-by: Claude Code (Claude Opus 4.6)

Comment thread src/paimon/core/bucket/bucket_select_converter.h
Comment thread src/paimon/core/bucket/bucket_select_converter.h Outdated
Comment thread src/paimon/core/bucket/bucket_select_converter.cpp
Comment thread src/paimon/core/bucket/bucket_select_converter.cpp
Comment thread src/paimon/core/bucket/bucket_select_converter.cpp Outdated
Comment thread src/paimon/core/bucket/bucket_select_converter.cpp
Comment thread src/paimon/core/bucket/bucket_select_converter_test.cpp Outdated
Comment thread src/paimon/core/bucket/bucket_select_converter_test.cpp Outdated
Comment thread src/paimon/core/bucket/bucket_select_converter_test.cpp Outdated
Comment thread src/paimon/CMakeLists.txt
@lxy-9602
Collaborator

Great work, and thanks for the high-quality submission!
Looking forward to your updates. Feel free to reach out if anything is unclear.

@liangjie3138 liangjie3138 requested a review from lxy-9602 April 20, 2026 06:11
lxy-9602
lxy-9602 previously approved these changes Apr 21, 2026
Collaborator

@lxy-9602 lxy-9602 left a comment

+1

lszskye
lszskye previously approved these changes Apr 21, 2026
Comment thread src/paimon/core/bucket/bucket_select_converter.cpp
Collaborator

@lxy-9602 lxy-9602 left a comment

+1

@lxy-9602 lxy-9602 merged commit d39b3df into alibaba:main Apr 22, 2026
9 of 10 checks passed
3 participants