
[feature](be) Support parquet minmax aggregate pushdown #63868

Merged
Gabriel39 merged 1 commit into apache:refact_reader_branch from Gabriel39:dev_0529
May 29, 2026
Conversation

@Gabriel39
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary: Add a metadata-backed MIN/MAX aggregate pushdown path for external Parquet readers and gate Iceberg v2 aggregate pushdown when delete files are present.
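
As a rough illustration of the idea in the summary above, the sketch below folds per-row-group footer statistics into a file-level MIN/MAX and refuses to answer when delete files are present, since deleted rows may still be reflected in the footer statistics. The `RowGroupStats` struct and `minmax_from_stats` function are hypothetical names invented for this example and are not the types or functions added by this PR; the sketch also assumes a single INT64 column.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified view of one row group's footer statistics for a
// single INT64 column. This is not a Doris or Arrow type.
struct RowGroupStats {
    std::optional<int64_t> min;  // absent when the writer recorded no min
    std::optional<int64_t> max;  // absent when the writer recorded no max
    int64_t num_rows = 0;
    int64_t null_count = 0;
};

struct MinMax {
    int64_t min;
    int64_t max;
};

// Returns the file-level min/max when it can be derived from metadata alone.
// std::nullopt means "do not push down; scan the data instead".
std::optional<MinMax> minmax_from_stats(const std::vector<RowGroupStats>& groups,
                                        bool has_delete_files) {
    // Footer statistics describe the rows as written, so they may include
    // rows that a delete file has since removed.
    if (has_delete_files) return std::nullopt;

    std::optional<MinMax> result;
    for (const auto& g : groups) {
        // A row group without non-null values contributes nothing.
        if (g.num_rows == 0 || g.null_count == g.num_rows) continue;
        // Any row group with missing statistics forces a fallback.
        if (!g.min || !g.max) return std::nullopt;
        if (!result) {
            result = MinMax{*g.min, *g.max};
        } else {
            result->min = std::min(result->min, *g.min);
            result->max = std::max(result->max, *g.max);
        }
    }
    // If every row group was skipped, this stays empty and the caller falls
    // back to a scan, which is the conservative choice for this sketch.
    return result;
}
```

Returning an empty optional for every uncertain case keeps the sketch conservative: the caller never gets a wrong answer from metadata, only a request to take the normal scan path.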

Release note

Support min/max aggregate pushdown for eligible external Parquet scans.

Check List (For Author)

  • Test: Unit Test / Manual test

    • Added AggregateReaderTest and ParquetReaderTest.minmax_pushdown_from_statistics.

    • Manual test: git diff --check and git diff --cached --check.

    • Not run: run-be-ut.sh failed because it requires JDK 17 and this environment only has JDK 11; the clang-format script failed because llvm@16 is not installed.

  • Behavior changed: Yes, eligible Parquet scans can return min/max aggregate rows from footer statistics; unsafe Iceberg delete-file scans disable aggregate pushdown.

  • Does this need documentation: No

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Gabriel39 force-pushed the dev_0529 branch 7 times, most recently from 613b346 to 9723336 (May 29, 2026 06:28)
### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary: Add aggregate pushdown support for the new table reader/file reader path so count and min/max can be served from parquet metadata without changing old vparquet reader, generic reader, or file scanner.
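
For the count half of this summary, a minimal sketch of answering COUNT from row-group metadata is shown below. `RowGroupMeta`, `count_star`, and `count_column` are hypothetical names for illustration only and do not correspond to the table reader API in this commit. The sketch assumes no delete files apply to the file.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical per-row-group metadata; not a Doris or Arrow type.
struct RowGroupMeta {
    int64_t num_rows = 0;
    // Null count of the counted column; may be missing from the statistics.
    std::optional<int64_t> null_count;
};

// COUNT(*) only needs the row count stored with each row group.
int64_t count_star(const std::vector<RowGroupMeta>& groups) {
    int64_t total = 0;
    for (const auto& g : groups) total += g.num_rows;
    return total;
}

// COUNT(col) also needs null counts. std::nullopt means the metadata is
// insufficient and the caller should fall back to reading the column.
std::optional<int64_t> count_column(const std::vector<RowGroupMeta>& groups) {
    int64_t total = 0;
    for (const auto& g : groups) {
        if (!g.null_count) return std::nullopt;
        total += g.num_rows - *g.null_count;
    }
    return total;
}
```

Splitting the two cases makes the fallback condition explicit: the row count is a required field of a Parquet row group, while the per-column null count is optional in the statistics.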

### Release note

Support count and min/max aggregate pushdown in the new table reader parquet path.

### Check List (For Author)

- Test: Unit Test

    - Added TableReaderTest coverage for count, min/max, casted min/max, and Iceberg delete fallback.

    - Not run locally: run-be-ut.sh requires JDK-17, but this machine has JDK-11; cmake/ninja are also unavailable.

- Behavior changed: Yes (new table reader can push down count and min/max aggregates when eligible)

- Does this need documentation: No
Gabriel39 merged commit 0ae7e1c into apache:refact_reader_branch on May 29, 2026
11 of 13 checks passed
2 participants