[core] Support partition predicate pushdown for PartitionsTable #7628

Open
sundapeng wants to merge 3 commits into apache:master from sundapeng:feat/partitions-table-predicate-pushdown

@sundapeng
Member

Purpose

PartitionsTable.withFilter() was a no-op (// TODO at line 189), causing full manifest scans when querying with partition filters like SELECT * FROM t$partitions WHERE partition = 'dt=20260410'. This adds partition predicate pushdown following the same pattern established by BucketsTable (#7592), FilesTable (#7376), ManifestsTable (#7310), and ConsumersTable (#7329).

The implementation uses a dual-path filtering strategy:

  • Catalog path: preserves catalog.listPartitions() call and filters results in memory, keeping metadata columns (created_at, created_by, updated_by, options) intact
  • TableScan fallback path: pushes predicate down to InnerTableScan.withPartitionFilter() for manifest-level pruning
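
The two paths above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Entry` class, the `read` method, and the `scanWithPartitionFilter` function are hypothetical stand-ins for the catalog listing and for a scan that accepts a partition filter, and the sketch assumes the fallback path has no catalog metadata to report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical sketch of dual-path filtering; none of these names come from Paimon. */
class DualPathFilterSketch {

    /** Stand-in for one partition row: the partition string plus one metadata column. */
    static final class Entry {
        final String partition;
        final String createdBy; // null when no catalog metadata is available

        Entry(String partition, String createdBy) {
            this.partition = partition;
            this.createdBy = createdBy;
        }
    }

    static List<Entry> read(
            Optional<List<Entry>> catalogListing,
            Function<Predicate<String>, List<String>> scanWithPartitionFilter,
            Predicate<String> partitionFilter) {
        List<Entry> result = new ArrayList<>();
        if (catalogListing.isPresent()) {
            // Catalog path: list everything, then filter in memory so that
            // the metadata carried by each entry is preserved.
            for (Entry e : catalogListing.get()) {
                if (partitionFilter.test(e.partition)) {
                    result.add(e);
                }
            }
            return result;
        }
        // Fallback path: hand the filter to the scan so that pruning happens
        // inside the scan rather than after it.
        for (String partition : scanWithPartitionFilter.apply(partitionFilter)) {
            result.add(new Entry(partition, null));
        }
        return result;
    }
}
```

The point of the split is that the catalog path never loses columns that only the catalog knows about, while the fallback path avoids reading data that the filter would discard anyway.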

Also refactors PartitionPredicateHelper.applyPartitionFilter() into a two-step build+apply pattern (buildPartitionPredicate() + apply), and extends parsePartitionSpec() to support PartitionsTable's key=value/key=value format in addition to the existing {value1, value2} format.
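
To make the two accepted formats concrete, here is a small standalone sketch of a parser that handles both. It is not the actual parsePartitionSpec(); the class name, the signature, and the choice to return null on a mismatch are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not Paimon's PartitionPredicateHelper. */
class PartitionSpecParserSketch {

    /**
     * Parses "k1=v1/k2=v2" or "{v1, v2}" into a map ordered by partition key.
     * Returns null when the text does not line up with the partition keys.
     */
    static Map<String, String> parse(String text, List<String> partitionKeys) {
        String s = text.trim();
        Map<String, String> spec = new LinkedHashMap<>();

        if (s.startsWith("{") && s.endsWith("}")) {
            // Positional format: values appear in partition-key order.
            String body = s.substring(1, s.length() - 1).trim();
            String[] values = body.isEmpty() ? new String[0] : body.split(",");
            if (values.length != partitionKeys.size()) {
                return null;
            }
            for (int i = 0; i < values.length; i++) {
                spec.put(partitionKeys.get(i), values[i].trim());
            }
            return spec;
        }

        // Named format: each segment is key=value, segments separated by '/'.
        Map<String, String> named = new HashMap<>();
        for (String part : s.split("/")) {
            int eq = part.indexOf('=');
            if (eq <= 0) {
                return null;
            }
            named.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
        }
        if (named.size() != partitionKeys.size()
                || !named.keySet().containsAll(partitionKeys)) {
            return null;
        }
        for (String key : partitionKeys) {
            spec.put(key, named.get(key));
        }
        return spec;
    }
}
```

With partition keys dt and hr, both "dt=20260410/hr=08" and "{20260410, 08}" yield the same map in this sketch, which is what lets a single predicate builder serve both system tables.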

Tests

  • Equal filter on single partition key
  • IN filter on multiple partition values
  • No-match filter returns empty result
  • Non-partition column filter safely ignored
  • Multi-column partition keys with Equal and IN filters
  • Existing BucketsTableTest passes (verifies refactored helper is backward-compatible)

API and Format

No API or storage format changes.

Documentation

No documentation changes needed.

sundapeng and others added 3 commits April 11, 2026 08:30
PartitionsTable.withFilter() was a no-op (TODO), causing full manifest
scans when querying with partition filters. This adds predicate pushdown
following the same pattern as BucketsTable (apache#7592) and FilesTable (apache#7376).

Key changes:
- PartitionsScan extracts partition predicate via LeafPredicateExtractor
- PartitionsSplit carries the predicate to PartitionsRead
- Catalog path: in-memory filter preserving metadata columns
- TableScan path: manifest-level pushdown via withPartitionFilter
- PartitionPredicateHelper refactored to build+apply two-step pattern
- parsePartitionSpec extended for key=value/key=value format

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…redicate

filterByPredicate used raw p.spec().get(key) which renders null as literal
"null", while toRow substitutes null with defaultPartitionName. This caused
predicate pushdown to fail matching null-valued partitions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
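
The mismatch described in this commit can be reproduced in isolation. The sketch below is illustrative only: `renderRaw` and `render` are hypothetical names, and the default partition name is passed in as a parameter (the value `__DEFAULT_PARTITION__` in the usage note is assumed, not taken from this PR).

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the null-rendering mismatch; not the PR's code. */
class PartitionNameRenderSketch {

    /** Appends the raw spec value, so a null value becomes the text "null". */
    static String renderRaw(Map<String, String> spec, List<String> keys) {
        StringBuilder sb = new StringBuilder();
        for (String key : keys) {
            if (sb.length() > 0) {
                sb.append('/');
            }
            sb.append(key).append('=').append(spec.get(key));
        }
        return sb.toString();
    }

    /** Substitutes the default partition name for null before rendering. */
    static String render(
            Map<String, String> spec, List<String> keys, String defaultPartitionName) {
        StringBuilder sb = new StringBuilder();
        for (String key : keys) {
            if (sb.length() > 0) {
                sb.append('/');
            }
            String value = spec.get(key);
            sb.append(key).append('=').append(value == null ? defaultPartitionName : value);
        }
        return sb.toString();
    }
}
```

For a spec whose dt value is null, renderRaw produces "dt=null" while render with `__DEFAULT_PARTITION__` produces "dt=__DEFAULT_PARTITION__". A filter written against the second form can never match rows compared using the first, which is why both sides need the same substitution.
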
testPartitionPredicateFilterMultiColumnKeys created MultiPartTable directly
via filesystem (SchemaUtils.forceCommit), which works for local catalog but
fails for REST catalog since it's unaware of tables created outside its API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Object value =
        TypeUtils.castFromString(
                partSpec.get(partitionKeys.get(i)), partitionType.getTypeAt(i));
predicates.add(partBuilder.equal(i, value));
Contributor

We should consider default partitions here. If the value is the default partition name, this should become isNull(...) rather than equal(...).
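
One way to read this suggestion is sketched below. It deliberately avoids Paimon's PredicateBuilder and TypeUtils and uses java.util.function.Predicate over an Object[] row instead, so the class name, the `build` signature, and the string-only comparison (no type casting) are all assumptions for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Predicate;

/** Hypothetical sketch of "isNull for the default partition, equal otherwise". */
class DefaultPartitionPredicateSketch {

    static Predicate<Object[]> build(
            Map<String, String> partSpec,
            List<String> partitionKeys,
            String defaultPartitionName) {
        Predicate<Object[]> result = row -> true;
        for (int i = 0; i < partitionKeys.size(); i++) {
            final int index = i;
            final String raw = partSpec.get(partitionKeys.get(i));
            if (defaultPartitionName.equals(raw)) {
                // The default partition name stands for a null partition value,
                // so the field must be tested for null, not compared to the name.
                result = result.and(row -> row[index] == null);
            } else {
                // A real implementation would cast raw to the field type first;
                // this sketch compares strings only.
                result = result.and(row -> Objects.equals(row[index], raw));
            }
        }
        return result;
    }
}
```

Under this sketch, a spec of dt=`__DEFAULT_PARTITION__` (an assumed default name) and hr=08 matches a row whose dt field is null and whose hr field is "08", and rejects a row whose dt field holds the default name as a literal string.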
