feat(scan): support data-level stats pruning in TableScan #196

Merged
JingsongLi merged 5 commits into apache:main from QuakeWang:feat/data-level-stats-pruning on Apr 4, 2026

@QuakeWang
Contributor

Purpose

Linked issue: close #188

Brief change log

Tests

API and Format

Documentation

/// If either min or max BinaryRow has an arity different from
/// `expected_fields`, the stats were likely written in dense mode or
/// under a different schema — making index-based access unsafe.
fn arity_matches(&self, expected_fields: usize) -> bool {
Contributor

We should consider schema evolution here; otherwise an exception may occur.
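
To make the concern concrete, here is a minimal sketch of an arity guard in front of index-based min/max pruning. `FileStats`, `may_contain_eq`, and the `Vec<Option<i64>>` layout are hypothetical stand-ins for illustration only, not the types used in this PR; the real code reads stats from `BinaryRow`. The point it illustrates is to fail open (keep the file) whenever the stats cannot be trusted, rather than index into them.

```rust
/// Hypothetical, simplified stand-in for per-file column statistics.
/// `None` in a slot means the stat is unknown for that column.
#[derive(Debug, Clone)]
pub struct FileStats {
    pub min: Vec<Option<i64>>,
    pub max: Vec<Option<i64>>,
}

impl FileStats {
    /// True only when both rows have exactly one slot per expected field.
    pub fn arity_matches(&self, expected_fields: usize) -> bool {
        self.min.len() == expected_fields && self.max.len() == expected_fields
    }
}

/// Returns true if the file may contain rows where column `idx` equals `value`.
/// Whenever the stats are unusable, the file is kept (fail open).
pub fn may_contain_eq(
    stats: &FileStats,
    expected_fields: usize,
    idx: usize,
    value: i64,
) -> bool {
    // Arity mismatch or out-of-range index: index-based access is unsafe.
    if !stats.arity_matches(expected_fields) || idx >= expected_fields {
        return true;
    }
    match (stats.min[idx], stats.max[idx]) {
        // Both bounds known: prune only if `value` lies outside [lo, hi].
        (Some(lo), Some(hi)) => lo <= value && value <= hi,
        // Missing bound: cannot prove absence, so keep the file.
        _ => true,
    }
}
```

An arity check alone does not cover schema evolution: a file written under an older schema can have the same number of fields in a different order, which is why the reviewers ask for the schema file to be consulted.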

Contributor

Maybe it is better to merge this first: #197

We can finish schema evolution here. Read schema file to check evolution here.

Contributor Author

> Maybe it is better to merge this first: #197
>
> We can finish schema evolution here. Read schema file to check evolution here.

Good idea, I will review #197 soon~
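
One way to picture "read the schema file to check evolution" is an id-based index mapping between the current table schema and the schema a data file was written with. The sketch below assumes fields carry stable ids; `Field`, `index_mapping`, and `stats_for` are hypothetical names chosen for illustration and are not taken from this PR or from #197. Type changes between schemas are not handled here.

```rust
/// Hypothetical minimal field descriptor: a stable id plus a name.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub id: i32,
    pub name: String,
}

/// For each field of the current table schema, find its position in the
/// schema the file was written with, matching by field id (not by name).
/// `None` means the field did not exist when the file was written.
pub fn index_mapping(table_fields: &[Field], file_fields: &[Field]) -> Vec<Option<usize>> {
    table_fields
        .iter()
        .map(|t| file_fields.iter().position(|f| f.id == t.id))
        .collect()
}

/// Looks up the (min, max) pair for table column `table_idx` in stats that
/// are laid out in file-schema order. Returns `None` when the column is
/// absent from the file or either bound is unknown, so the caller can
/// skip pruning for that file.
pub fn stats_for(
    mapping: &[Option<usize>],
    file_min: &[Option<i64>],
    file_max: &[Option<i64>],
    table_idx: usize,
) -> Option<(i64, i64)> {
    let file_idx = (*mapping.get(table_idx)?)?;
    let lo = (*file_min.get(file_idx)?)?;
    let hi = (*file_max.get(file_idx)?)?;
    Some((lo, hi))
}
```

With this shape, a renamed or reordered column still resolves to the right stats slot, and a column added after the file was written yields `None` instead of an out-of-bounds access.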

}
}

fn split_partition_and_data_predicates(
Contributor

@JingsongLi JingsongLi Apr 3, 2026

Can you align this implementation with Java's splitPartitionPredicatesAndDataPredicates? apache/paimon#7473 is important to many use cases.
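
For readers unfamiliar with the idea, the following is a rough sketch of splitting a filter into partition-only and data predicates. It is not a port of the Java method and makes no claim about its exact behavior; the `Predicate` enum and `split_predicates` are hypothetical and exist only for this example. The assumed rule is: flatten top-level `AND`s, send a conjunct to the partition side only if every column it references is a partition key, and conservatively treat everything else as a data predicate.

```rust
use std::collections::HashSet;

/// Hypothetical predicate tree, much simpler than a real one.
#[derive(Debug, Clone, PartialEq)]
pub enum Predicate {
    Leaf { column: String, op: &'static str, value: i64 },
    And(Vec<Predicate>),
    Or(Vec<Predicate>),
}

impl Predicate {
    /// Collects every column name referenced anywhere in this predicate.
    fn columns<'a>(&'a self, out: &mut HashSet<&'a str>) {
        match self {
            Predicate::Leaf { column, .. } => {
                out.insert(column.as_str());
            }
            Predicate::And(children) | Predicate::Or(children) => {
                for child in children {
                    child.columns(out);
                }
            }
        }
    }
}

/// Flattens nested top-level ANDs into a list of conjuncts.
fn flatten_and(p: Predicate, out: &mut Vec<Predicate>) {
    match p {
        Predicate::And(children) => {
            for child in children {
                flatten_and(child, out);
            }
        }
        other => out.push(other),
    }
}

/// Splits `filter` into (partition-only conjuncts, data conjuncts).
/// Conjuncts that mix partition and data columns stay on the data side.
pub fn split_predicates(
    filter: Predicate,
    partition_keys: &[&str],
) -> (Vec<Predicate>, Vec<Predicate>) {
    let keys: HashSet<&str> = partition_keys.iter().copied().collect();
    let mut conjuncts = Vec::new();
    flatten_and(filter, &mut conjuncts);

    let mut partition = Vec::new();
    let mut data = Vec::new();
    for conjunct in conjuncts {
        let only_partition = {
            let mut cols = HashSet::new();
            conjunct.columns(&mut cols);
            !cols.is_empty() && cols.iter().all(|col| keys.contains(col))
        };
        if only_partition {
            partition.push(conjunct);
        } else {
            data.push(conjunct);
        }
    }
    (partition, data)
}
```

Under this rule, `dt = 20260404 AND id = 7 AND (dt = 1 OR id = 2)` with partition key `dt` yields one partition predicate (`dt = 20260404`) and two data predicates; the mixed `OR` is kept on the data side because it cannot be evaluated from partition values alone.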

@QuakeWang force-pushed the feat/data-level-stats-pruning branch from a8d8669 to 75802a8 on April 4, 2026 00:58
}

// ---------------------------------------------------------------------------
// Limit pushdown integration tests
Contributor

Why remove limit tests?

Contributor

I'll try to fix it.

Contributor

@JingsongLi JingsongLi left a comment

+1 Thanks @QuakeWang

@JingsongLi JingsongLi merged commit 1f4aac6 into apache:main Apr 4, 2026
8 checks passed