
[python] Add per-partition bucket pruning for HASH_FIXED tables#7804

Merged: JingsongLi merged 2 commits into apache:master from TheR1sing3un:py-bucket-pruning-per-partition on May 11, 2026


@TheR1sing3un
Member

@TheR1sing3un TheR1sing3un commented May 10, 2026

Background

PR-5.4 (#7744) added bucket pruning for HASH_FIXED tables but only on
the bucket-key dimension. Predicates that mix a partition column and
a bucket column under a top-level OR — e.g.
(part='a' AND id=1) OR (part='b' AND id=2) — couldn't be pruned:
the OR mixes two dimensions, so the existing logic gave up and read
every bucket in both partitions. PR-5.4 left this as a TODO in the
module docstring.

Effect

The same query now reads exactly one bucket per partition (the bucket
holding id=1 in part='a' and the bucket holding id=2 in
part='b'). The selector first evaluates the predicate against each
partition value, so the OR collapses to a single AND branch inside each
partition, and bucket selection then runs on that simplified form.
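
To make the effect concrete, here is a minimal sketch of the split-count arithmetic for the example query. It is not the pypaimon implementation: `toy_bucket` (a plain modulo) stands in for the table's real bucket hash, `buckets_per_partition` is a hypothetical helper that only handles the already-normalised form of this query, and the count assumes one split per (partition, bucket) pair.

```python
from collections import defaultdict

TOTAL_BUCKETS = 4
PARTITIONS = ["a", "b"]


def toy_bucket(key: int, total_buckets: int = TOTAL_BUCKETS) -> int:
    """Assumption: modulo stands in for the table's real bucket hash."""
    return key % total_buckets


def buckets_per_partition(disjuncts):
    """Map each partition to the buckets its own disjuncts can hit."""
    selected = defaultdict(set)
    for part, key in disjuncts:
        selected[part].add(toy_bucket(key))
    return dict(selected)


# (part='a' AND id=1) OR (part='b' AND id=2), written as (part, id) pairs.
query = [("a", 1), ("b", 2)]
per_partition = buckets_per_partition(query)

# Without pruning, every bucket of every partition is read.
splits_without_pruning = len(PARTITIONS) * TOTAL_BUCKETS
# With per-partition pruning, only the selected buckets are read.
splits_with_pruning = sum(len(b) for b in per_partition.values())
print(per_partition, splits_without_pruning, splits_with_pruning)
```

Under these toy assumptions each partition keeps a single bucket, which matches the "≤ 2 splits" bound described for the e2e test.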

The soundness contract is unchanged: the selected bucket set remains a
superset of the buckets that contain matching rows. Any error fails open
to "all buckets accepted"; the selector never drops a bucket that holds
matches.
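
The fail-open rule can be illustrated with a small wrapper. This is a sketch only: `fail_open`, `broken_selector`, and `strict_selector` are hypothetical names, and the real selector may handle errors at a different layer.

```python
from typing import Callable

# A bucket filter answers: "may this bucket contain matching rows?"
BucketFilter = Callable[[int, int], bool]


def fail_open(select: BucketFilter) -> BucketFilter:
    """Wrap a filter so that any error keeps the bucket instead of dropping it."""

    def guarded(bucket: int, total_buckets: int) -> bool:
        try:
            return select(bucket, total_buckets)
        except Exception:
            # Unknown outcome: accept the bucket rather than risk losing matches.
            return True

    return guarded


def broken_selector(bucket: int, total_buckets: int) -> bool:
    raise ValueError("unsupported predicate")


def strict_selector(bucket: int, total_buckets: int) -> bool:
    return bucket == 1


print(fail_open(broken_selector)(3, 4))
print([b for b in range(4) if fail_open(strict_selector)(b, 4)])
```

Accepting a bucket on error can only cost extra reads; it can never remove rows from the result, which is why the superset guarantee survives.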

Two commits: the helper and the FileScanner wiring. Nine unit tests cover
the predicate-fold walker and the per-partition cache; one e2e test on a
2-partition × 4-bucket table shows that the mixed-OR query reads ≤ 2
splits instead of one per (partition, bucket).

The first commit adds the predicate-replace + AND/OR fold infrastructure
that lets the bucket selector specialise itself per concrete partition
value. This is the piece called out as a TODO at the bottom of the
bucket_select_converter module docstring.

Three pieces ship in this commit, all internal:

* ``replace_partition_predicate(predicate, partition_field_names,
  partition_values)``: walker that substitutes partition leaves with
  their evaluated truth value and folds AND/OR. Three-way return —
  ``None`` (cleared / always true), ``False`` (always false), or
  the simplified ``Predicate``.
* ``_Selector`` is now keyed by ``(partition_tuple, total_buckets)``
  and accepts a third positional ``partition`` arg in ``__call__``.
  Two-arg legacy callers (early manifest filter) still work — they
  get the partition-agnostic over-approximation.
* ``create_bucket_selector`` now takes an optional ``partition_fields``
  list. The selector built without it (or with a predicate that does
  not touch any partition column) keeps the existing shape and result.
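
The three-way fold described in the first bullet can be sketched as follows. This is a toy model under stated assumptions, not the code from this PR: `Eq`, `Compound`, and `fold_partition` are hypothetical stand-ins for the real predicate classes and for ``replace_partition_predicate``, only equality leaves are modelled, and the partition is passed as a plain mapping rather than as field names plus values.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Tuple, Union


@dataclass(frozen=True)
class Eq:
    field: str
    value: Any


@dataclass(frozen=True)
class Compound:
    op: str  # "and" or "or"
    children: Tuple[Union["Eq", "Compound"], ...]


def fold_partition(pred, partition: Mapping[str, Any]):
    """Return None (always true), False (always false), or a simplified predicate."""
    if isinstance(pred, Eq):
        if pred.field not in partition:
            return pred  # bucket-key leaf: keep it for bucket selection
        # Partition leaf: replace it with its truth value for this partition.
        return None if partition[pred.field] == pred.value else False

    folded = [fold_partition(child, partition) for child in pred.children]
    if pred.op == "and":
        if any(f is False for f in folded):
            return False  # one false child makes the AND false
        rest = [f for f in folded if f is not None]  # drop always-true children
        if not rest:
            return None
    else:
        if any(f is None for f in folded):
            return None  # one true child makes the OR true
        rest = [f for f in folded if f is not False]  # drop always-false children
        if not rest:
            return False
    return rest[0] if len(rest) == 1 else Compound(pred.op, tuple(rest))


# (part='a' AND id=1) OR (part='b' AND id=2)
mixed_or = Compound("or", (
    Compound("and", (Eq("part", "a"), Eq("id", 1))),
    Compound("and", (Eq("part", "b"), Eq("id", 2))),
))
print(fold_partition(mixed_or, {"part": "a"}))
print(fold_partition(mixed_or, {"part": "c"}))
```

For `part='a'` the second branch folds to false and drops out of the OR, leaving only the `id` leaf for bucket selection; for a partition that matches neither branch the whole predicate folds to false.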

This commit does not yet wire the partition into ``FileScanner``;
``_filter_manifest_entry`` still calls the selector with two args, so
all existing pushdown_bucket tests stay green.

Tests: nine new unit cases covering ``replace_partition_predicate``
folding, the per-partition cache, fall-through when partition is
unknown, and the empty-bucket-set result for an unsatisfiable
partition.

The second commit switches ``_filter_manifest_entry`` to call the bucket
selector with the entry's partition row, and passes the table's partition
fields into ``create_bucket_selector`` so the selector can specialise the
predicate per concrete partition value.

The early manifest filter (``_build_early_bucket_filter``) still uses
the two-arg form because the partition row hasn't been deserialised
at that stage; the selector internally falls back to a sound
partition-agnostic over-approximation there. Per-partition tightening
runs on the late filter once the entry is fully decoded.
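
The difference between the two-arg early call and the three-arg late call can be sketched with a toy selector. Everything here is illustrative: `ToySelector` and `compute_for_mixed_or` are hypothetical names, the modulo is a stand-in for the real bucket hash, and the bucket sets are hard-coded for the example query instead of being derived from a predicate.

```python
from typing import Callable, Dict, FrozenSet, Optional, Tuple

Partition = Optional[Tuple[str, ...]]
# Returns the allowed buckets, or None meaning "accept every bucket".
Compute = Callable[[Partition, int], Optional[FrozenSet[int]]]


class ToySelector:
    def __init__(self, compute: Compute) -> None:
        self._compute = compute
        # Cache keyed by (partition_tuple, total_buckets), as in the description.
        self._cache: Dict[Tuple[Partition, int], Optional[FrozenSet[int]]] = {}
        self.computations = 0

    def __call__(self, bucket: int, total_buckets: int,
                 partition: Partition = None) -> bool:
        key = (partition, total_buckets)
        if key not in self._cache:
            self.computations += 1
            self._cache[key] = self._compute(partition, total_buckets)
        allowed = self._cache[key]
        return True if allowed is None else bucket in allowed


def compute_for_mixed_or(partition: Partition, total_buckets: int):
    """Bucket sets for (part='a' AND id=1) OR (part='b' AND id=2)."""
    ids = {("a",): 1, ("b",): 2}
    if partition is None:
        return None  # partition unknown: sound over-approximation
    if partition not in ids:
        return frozenset()  # predicate folds to false for this partition
    return frozenset({ids[partition] % total_buckets})  # toy hash, an assumption


selector = ToySelector(compute_for_mixed_or)
early = [selector(b, 4) for b in range(4)]                # two-arg form
late_a = [b for b in range(4) if selector(b, 4, ("a",))]  # three-arg form
late_c = [b for b in range(4) if selector(b, 4, ("c",))]
print(early, late_a, late_c, selector.computations)
```

The early call accepts every bucket because nothing is known about the partition yet, while the late call narrows the result once the partition tuple is available; each distinct cache key is computed only once.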

End-to-end test: ``(part='a' AND id=1) OR (part='b' AND id=2)`` on a
two-partition four-bucket table, asserting both correctness (only the
two matching rows come back) and pruning effectiveness (≤ 2 splits
instead of one per (partition, bucket) combination).
Contributor

@leaves12138 leaves12138 left a comment


Reviewed the change against the Java BucketSelectConverter / BucketSelector / PartitionValuePredicateVisitor flow.

The important semantics line up: the selector specializes the predicate with the concrete partition value before extracting finite Equal/In bucket-key constraints, keeps the existing MAX_VALUES / fail-open behavior for unsupported bucket predicates, and keeps the early manifest filter conservative when the partition is still unknown.

One difference from Java is intentional and correctness-safe: when partition specialization folds the predicate to false, Python returns an empty bucket set, while Java's BucketSelector itself falls open and relies on the scan-level partition filter to drop the entry. Since no row in that concrete partition can satisfy the complete predicate, the stricter Python pruning does not introduce false negatives.

The added tests cover the partition predicate folding, per-partition cache keying, unknown-partition fallback, and the mixed partition/bucket OR integration case. Looks good to me.

@JingsongLi
Contributor

+1

@JingsongLi JingsongLi merged commit 5eee480 into apache:master May 11, 2026
6 checks passed