[python] Add per-partition bucket pruning for HASH_FIXED tables #7804
Adds the predicate-replace + AND/OR fold infrastructure that lets the bucket selector specialise itself per concrete partition value, the piece called out as a TODO at the bottom of the bucket_select_converter module docstring.

Three pieces ship in this commit, all internal:

* ``replace_partition_predicate(predicate, partition_field_names, partition_values)``: walker that substitutes partition leaves with their evaluated truth value and folds AND/OR. Three-way return: ``None`` (cleared / always true), ``False`` (always false), or the simplified ``Predicate``.
* ``_Selector`` is now keyed by ``(partition_tuple, total_buckets)`` and accepts a third positional ``partition`` arg in ``__call__``. Two-arg legacy callers (early manifest filter) still work; they get the partition-agnostic over-approximation.
* ``create_bucket_selector`` now takes an optional ``partition_fields`` list. The selector built without it (or with a predicate that does not touch any partition column) keeps the existing shape and result.

This commit does not yet wire the partition into ``FileScanner``; ``_filter_manifest_entry`` still calls the selector with two args, so all existing pushdown_bucket tests stay green.

Tests: nine new unit cases covering ``replace_partition_predicate`` folding, the per-partition cache, fall-through when partition is unknown, and the empty-bucket-set result for an unsatisfiable partition.
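For readers who want to see the three-way fold concretely, here is a minimal sketch of a walker in the spirit of ``replace_partition_predicate``. The ``Leaf`` and ``Compound`` classes and the ``_fold`` helper are hypothetical stand-ins invented for this illustration (only equality leaves are modelled); the real ``Predicate`` type and the actual implementation in the PR may well differ.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    """Hypothetical leaf predicate meaning `field == value`."""
    field: str
    value: object


@dataclass(frozen=True)
class Compound:
    """Hypothetical AND/OR node; `op` is "and" or "or"."""
    op: str
    children: Tuple["Predicate", ...]


Predicate = Union[Leaf, Compound]
# Three-way result: None (always true), False (always false), or a Predicate.
Folded = Union[None, bool, Leaf, Compound]


def replace_partition_predicate(
    predicate: Predicate,
    partition_field_names: Sequence[str],
    partition_values: Sequence[object],
) -> Folded:
    partition = dict(zip(partition_field_names, partition_values))
    return _fold(predicate, partition)


def _fold(p: Predicate, partition: Dict[str, object]) -> Folded:
    if isinstance(p, Leaf):
        if p.field not in partition:
            return p  # not a partition leaf: keep it for bucket selection
        # Partition leaf: replace it with its truth value for this partition.
        return None if partition[p.field] == p.value else False

    folded = [_fold(child, partition) for child in p.children]
    if p.op == "and":
        if any(c is False for c in folded):
            return False  # one false conjunct makes the AND false
        rest = [c for c in folded if c is not None]  # drop always-true parts
        if not rest:
            return None
    else:  # "or"
        if any(c is None for c in folded):
            return None  # one true disjunct makes the OR true
        rest = [c for c in folded if c is not False]  # drop always-false parts
        if not rest:
            return False
    return rest[0] if len(rest) == 1 else Compound(p.op, tuple(rest))


query = Compound("or", (
    Compound("and", (Leaf("part", "a"), Leaf("id", 1))),
    Compound("and", (Leaf("part", "b"), Leaf("id", 2))),
))
in_part_a = replace_partition_predicate(query, ["part"], ("a",))
in_part_c = replace_partition_predicate(query, ["part"], ("c",))
```

In this sketch ``in_part_a`` collapses to the single bucket-key leaf ``Leaf("id", 1)``, while ``in_part_c`` folds to ``False`` because no branch of the OR can match partition ``'c'``.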
Switches ``_filter_manifest_entry`` to call the bucket selector with the entry's partition row, and passes the table's partition fields into ``create_bucket_selector`` so the selector can specialise the predicate per concrete partition value.

The early manifest filter (``_build_early_bucket_filter``) still uses the two-arg form because the partition row hasn't been deserialised at that stage; the selector internally falls back to a sound partition-agnostic over-approximation there. Per-partition tightening runs on the late filter once the entry is fully decoded.

End-to-end test: ``(part='a' AND id=1) OR (part='b' AND id=2)`` on a two-partition four-bucket table, asserting both correctness (only the two matching rows come back) and pruning effectiveness (≤ 2 splits instead of one per (partition, bucket) combination).
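The difference between the two-arg and three-arg call forms can be sketched as follows. This is an illustration under loud assumptions, not the PR's ``_Selector``: the predicate is pre-flattened into a list of OR branches (each a dict of ``field -> required value``), and ``_toy_bucket`` is a stand-in for the table's real bucket hash, which is not reproduced here.

```python
from typing import Dict, FrozenSet, List, Optional, Sequence, Tuple

Conjunct = Dict[str, object]  # hypothetical: one OR branch, field -> value


def _toy_bucket(value: int, total_buckets: int) -> int:
    """Stand-in for the real bucket hash; NOT the table's actual function."""
    return value % total_buckets


class Selector:
    """Sketch of a bucket selector cached per (partition_tuple, total_buckets)."""

    def __init__(self, disjuncts: List[Conjunct], bucket_key: str,
                 partition_fields: Sequence[str] = ()):
        self._disjuncts = disjuncts
        self._bucket_key = bucket_key
        self._partition_fields = tuple(partition_fields)
        self._cache: Dict[Tuple[Optional[tuple], int],
                          Optional[FrozenSet[int]]] = {}

    def __call__(self, bucket: int, total_buckets: int,
                 partition: Optional[Sequence[object]] = None) -> bool:
        key = (None if partition is None else tuple(partition), total_buckets)
        if key not in self._cache:
            self._cache[key] = self._select(key[0], total_buckets)
        allowed = self._cache[key]
        return allowed is None or bucket in allowed  # None means "all buckets"

    def _select(self, partition: Optional[tuple],
                total_buckets: int) -> Optional[FrozenSet[int]]:
        # Unknown partition: nothing can be ruled out, so every branch stays.
        known = {} if partition is None else dict(
            zip(self._partition_fields, partition))
        buckets = set()
        for conj in self._disjuncts:
            if any(f in known and known[f] != v for f, v in conj.items()):
                continue  # branch is false in this concrete partition
            if self._bucket_key not in conj:
                return None  # branch does not pin the bucket key: accept all
            buckets.add(_toy_bucket(conj[self._bucket_key], total_buckets))
        return frozenset(buckets)


selector = Selector(
    [{"part": "a", "id": 1}, {"part": "b", "id": 2}],
    bucket_key="id",
    partition_fields=["part"],
)
late_a = [b for b in range(4) if selector(b, 4, ("a",))]  # late filter
early = [b for b in range(4) if selector(b, 4)]           # early, two-arg
```

With the toy hash, the late filter for partition ``'a'`` keeps only the bucket for ``id=1``, whereas the two-arg early call keeps the union of both branches' buckets, which is a superset and therefore still sound.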
leaves12138
left a comment
Reviewed the change against the Java BucketSelectConverter / BucketSelector / PartitionValuePredicateVisitor flow.
The important semantics line up: the selector specializes the predicate with the concrete partition value before extracting finite Equal/In bucket-key constraints, keeps the existing MAX_VALUES / fail-open behavior for unsupported bucket predicates, and keeps the early manifest filter conservative when the partition is still unknown.
One difference from Java is intentional and correctness-safe: when partition specialization folds the predicate to false, Python returns an empty bucket set, while Java's BucketSelector itself falls open and relies on the scan-level partition filter to drop the entry. Since no row in that concrete partition can satisfy the complete predicate, the stricter Python pruning does not introduce false negatives.
The added tests cover the partition predicate folding, per-partition cache keying, unknown-partition fallback, and the mixed partition/bucket OR integration case. Looks good to me.
+1
Background
PR-5.4 (#7744) added bucket pruning for HASH_FIXED tables but only on
the bucket-key dimension. Predicates that mix a partition column and
a bucket column under a top-level OR, e.g.
``(part='a' AND id=1) OR (part='b' AND id=2)``, couldn't be pruned:
the OR mixes two dimensions, so the existing logic gave up and read
every bucket in both partitions. PR-5.4 left this as a TODO in the
module docstring.
Effect
Same query now reads exactly one bucket per partition (the bucket
holding ``id=1`` in ``part='a'``, the bucket holding ``id=2`` in
``part='b'``). The selector evaluates the predicate per partition
value first, so the OR collapses to a single AND inside each partition,
and bucket selection runs on that simplified form.
Soundness contract is unchanged: the bucket set remains a superset
of the buckets that contain matching rows; any error falls open to
"all buckets accept", never drops a bucket with matches.
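The fail-open half of that contract can be pictured as a thin wrapper around whatever does the bucket selection. The ``fail_open`` decorator below is a hypothetical helper written for this note; the PR may implement the same guarantee differently.

```python
from typing import Callable, Optional, Sequence

# Hypothetical signature: (bucket, total_buckets, partition) -> keep this bucket?
BucketFilter = Callable[[int, int, Optional[Sequence[object]]], bool]


def fail_open(select: BucketFilter) -> Callable[..., bool]:
    """Wrap a bucket filter so any error means "accept the bucket"."""

    def wrapped(bucket: int, total_buckets: int,
                partition: Optional[Sequence[object]] = None) -> bool:
        try:
            return select(bucket, total_buckets, partition)
        except Exception:
            # Pruning is only an optimisation: reading an extra bucket is
            # safe, while dropping one that holds matching rows is not.
            return True

    return wrapped
```

The design point is the asymmetry: an over-wide bucket set only costs extra reads, so every uncertain case resolves to "accept".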
Two commits: helper + ``FileScanner`` wiring. 9 unit tests cover the
predicate-fold walker and the per-partition cache; one e2e test on a
2-partition × 4-bucket table proves the mixed-OR query reads ≤ 2
splits instead of one per (partition, bucket).