Spark: Disable min/max aggregation push down for binary #16328

Merged: huaxingao merged 1 commit into apache:main from hantangwangd:disable_min_max_aggregate_pushdown_for_binary on May 14, 2026

Conversation

@hantangwangd
Contributor

For min/max aggregation push down, binary columns have the same limitation as string columns: Iceberg may truncate the min/max statistics it stores for binary columns in Parquet files, so the stored bounds are not guaranteed to be the actual minimum and maximum values. Min/max push down is already disabled for string, so this PR extends the same behavior to binary columns to avoid potentially incorrect results. Refer to the test case in this PR for details.
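To illustrate why truncated statistics are unsafe for answering MIN/MAX directly, here is a minimal Python sketch. It is not Iceberg's code: the helper names `truncate_lower` and `truncate_upper` and the truncation length of 2 bytes are assumptions chosen for illustration. The upper-bound helper follows the usual approach of truncating and then incrementing the last byte so the result still bounds the data.

```python
def truncate_lower(value: bytes, length: int) -> bytes:
    """Hypothetical helper: a prefix always sorts <= the full value."""
    return value[:length]


def truncate_upper(value: bytes, length: int):
    """Hypothetical helper: return a short bound >= value, or None if impossible."""
    if len(value) <= length:
        return value
    prefix = bytearray(value[:length])
    # Increment the last byte that is not 0xFF and drop everything after it.
    for i in range(len(prefix) - 1, -1, -1):
        if prefix[i] != 0xFF:
            prefix[i] += 1
            return bytes(prefix[: i + 1])
    return None  # every byte is 0xFF, so no shorter upper bound exists


# Values of a binary column in one data file (made-up data).
values = [b"\x01\x02\x03\x04\x05", b"\x0a\x0b\x0c\x0d\x0e"]

true_min, true_max = min(values), max(values)
stat_min = truncate_lower(true_min, 2)  # what a truncating writer might store
stat_max = truncate_upper(true_max, 2)

# The stored statistics are valid bounds, so they remain safe for file pruning...
bounds_are_valid = stat_min <= true_min and stat_max >= true_max
# ...but they are not values from the column, so MIN/MAX cannot be read from them.
stats_match_data = stat_min == true_min and stat_max == true_max
```

In this sketch `bounds_are_valid` is true while `stats_match_data` is false, which is the situation where pushing MIN/MAX down to metadata would return a value that does not exist in the column.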

Co-authored-by: Timothy Meehan <tim@timdmeehan.com>
@github-actions github-actions Bot added the spark label May 14, 2026
@hantangwangd hantangwangd marked this pull request as ready for review May 14, 2026 07:06
@hantangwangd
Contributor Author

Hi @huaxingao, thanks for reviewing #16320. This PR is another fix for min/max aggregation push down: it applies the same constraints to binary columns that are already applied to string columns, since Iceberg may truncate min/max statistics for binary columns in Parquet files.

Please take a look when you have a chance, thanks a lot!

@huaxingao huaxingao changed the title from "Spark: Also disable min/max aggregation push down for binary" to "Spark: Disable min/max aggregation push down for binary" May 14, 2026
@huaxingao huaxingao merged commit 87a7e4b into apache:main May 14, 2026
28 checks passed
@huaxingao
Contributor

Thanks @hantangwangd for the PR!

@hantangwangd hantangwangd deleted the disable_min_max_aggregate_pushdown_for_binary branch May 14, 2026 15:42

2 participants