branch-4.0: [fix](iceberg) Fix NPE in COUNT(*) pushdown when snapshot summary omits total-* counters (#64648) #65060

Merged: hello-stephen merged 1 commit into apache:branch-4.0 from morningman:40_bp64648 on Jul 1, 2026

Conversation

@morningman

Contributor

Proposed changes

Backport of #64648 to branch-4.0.

IcebergScanNode.getCountFromSnapshot() read total-equality-deletes /
total-position-deletes / total-records from the Iceberg snapshot summary and
called .equals() / Long.parseLong() directly on the Map.get() results. An
Iceberg snapshot summary is not guaranteed to carry these total-* counters
(snapshots produced by compaction/replace operations, or by some writers, may
omit them), so SELECT COUNT(*) threw a NullPointerException on such tables
while SELECT * worked fine.
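
The failure mode can be reproduced without Iceberg. The sketch below is a minimal stand-alone illustration, assuming a plain `Map<String, String>` in place of the snapshot summary; the class name `SummaryNpeDemo`, the method `hasNoEqualityDeletesUnsafe`, and the sample summary entry are hypothetical and only approximate the pre-fix access pattern described above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone reproduction; this is not the actual Doris code.
final class SummaryNpeDemo {
    // Mirrors the unsafe pattern: invoke a method directly on the Map.get() result.
    static boolean hasNoEqualityDeletesUnsafe(Map<String, String> summary) {
        // get() returns null for an absent key, so equals() is called on null.
        return summary.get("total-equality-deletes").equals("0");
    }

    public static void main(String[] args) {
        // Illustrative summary that carries no total-* counters at all.
        Map<String, String> summary = new HashMap<>();
        summary.put("added-data-files", "1");
        try {
            hasNoEqualityDeletesUnsafe(summary);
            System.out.println("counter present");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException: counter missing from summary");
        }
    }
}
```

An absent key passed straight to `Long.parseLong()` fails as well, although the JDK reports that case as a `NumberFormatException` rather than a `NullPointerException`.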

Fix

  • Extract the summary parsing into a pure static
    IcebergUtils.getCountFromSummary(summary, ignoreDanglingDelete) that
    null-checks the counters and falls back to a normal scan (return -1) when any
    required counter is absent, and reuse it from both IcebergScanNode and
    IcebergUtils.getIcebergRowCount().
  • BE: IcebergTableReader::init_row_filters() accepts a table-level row count of
    0 (>= 0 instead of > 0) so a genuine pushed-down count of 0 takes the
    CountReader fast path. Applied to this branch's
    be/src/vec/exec/format/table/iceberg_reader.cpp (master's
    iceberg_reader_mixin.h does not exist on this branch).
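
The null-checking helper described in the first bullet might look roughly like the sketch below. It is not the code from this PR: the class name `CountFromSummarySketch` is hypothetical, and the handling of equality deletes, position deletes, and unparsable values is an assumption inferred from the description and from the cases the unit test is said to cover. Only the -1 fallback for absent counters is stated by the PR.

```java
import java.util.Map;

// Hypothetical sketch; the real IcebergUtils.getCountFromSummary may differ.
final class CountFromSummarySketch {
    static final String TOTAL_RECORDS = "total-records";
    static final String TOTAL_POSITION_DELETES = "total-position-deletes";
    static final String TOTAL_EQUALITY_DELETES = "total-equality-deletes";
    // Sentinel meaning "cannot answer from the summary, run a normal scan".
    static final long FALLBACK = -1L;

    static long getCountFromSummary(Map<String, String> summary, boolean ignoreDanglingDelete) {
        if (summary == null) {
            return FALLBACK;
        }
        String equalityDeletes = summary.get(TOTAL_EQUALITY_DELETES);
        String positionDeletes = summary.get(TOTAL_POSITION_DELETES);
        String totalRecords = summary.get(TOTAL_RECORDS);
        // Any absent counter means the summary cannot be trusted for a count.
        if (equalityDeletes == null || positionDeletes == null || totalRecords == null) {
            return FALLBACK;
        }
        try {
            // Assumption: equality deletes cannot be resolved from counters alone.
            if (Long.parseLong(equalityDeletes) != 0) {
                return FALLBACK;
            }
            long records = Long.parseLong(totalRecords);
            long deletes = Long.parseLong(positionDeletes);
            if (deletes == 0) {
                return records;
            }
            // Assumption: subtract position deletes only when the caller opts
            // in to ignoring dangling deletes; otherwise fall back to a scan.
            return ignoreDanglingDelete ? records - deletes : FALLBACK;
        } catch (NumberFormatException e) {
            // Assumption: a malformed counter is treated like a missing one.
            return FALLBACK;
        }
    }

    public static void main(String[] args) {
        Map<String, String> noCounters = Map.of();
        Map<String, String> emptyTable = Map.of(
                TOTAL_RECORDS, "0",
                TOTAL_POSITION_DELETES, "0",
                TOTAL_EQUALITY_DELETES, "0");
        System.out.println(getCountFromSummary(noCounters, false));
        System.out.println(getCountFromSummary(emptyTable, false));
    }
}
```

Keeping the helper a pure function of the summary map means it can be exercised with hand-built maps, with no catalog or snapshot object required.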

Unit test IcebergCountPushDownTest covers the missing-counter, no-delete,
equality-delete, position-delete and zero-count cases.

… summary omits total-* counters (apache#64648)

Backport of apache#64648 to branch-4.0.

IcebergScanNode.getCountFromSnapshot() read total-equality-deletes /
total-position-deletes / total-records from the snapshot summary and called
.equals() / Long.parseLong() directly on the Map.get() results. An Iceberg
snapshot summary is not guaranteed to carry these counters (snapshots written
by compaction/replace, or some writers, may omit them), so `SELECT COUNT(*)`
threw a NullPointerException on such tables while `SELECT *` worked fine.

Extract the summary parsing into a pure static IcebergUtils.getCountFromSummary()
that null-checks the counters and falls back to a normal scan (returns -1) when
any is absent, and reuse it from both IcebergScanNode and IcebergUtils row-count
estimation. On the BE side, IcebergTableReader::init_row_filters() now accepts a
table-level row count of 0 (>= 0 instead of > 0) so a genuine pushed-down count
of 0 takes the CountReader fast path.

The BE change is applied to branch-4.0's
be/src/vec/exec/format/table/iceberg_reader.cpp, since master's
iceberg_reader_mixin.h does not exist on this branch.
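
The effect of the `>= 0` guard can be illustrated in isolation. The actual change is in C++; the sketch below restates the comparison in Java only to stay consistent with the FE examples, and the class and method names are hypothetical. It assumes, as described above, that -1 means "no pushed-down count" and that 0 is a legitimate count.

```java
// Hypothetical Java rendering of the guard; the real change lives in the C++ BE.
final class CountGuardSketch {
    // Old guard: a pushed-down count of 0 is rejected, so an empty table
    // misses the count fast path.
    static boolean useCountFastPathOld(long tableLevelRowCount) {
        return tableLevelRowCount > 0;
    }

    // New guard: 0 is accepted as a genuine count; only negative values
    // (the -1 fallback sentinel) skip the fast path.
    static boolean useCountFastPathNew(long tableLevelRowCount) {
        return tableLevelRowCount >= 0;
    }

    public static void main(String[] args) {
        for (long count : new long[] {-1L, 0L, 42L}) {
            System.out.println(count + ": old=" + useCountFastPathOld(count)
                    + " new=" + useCountFastPathNew(count));
        }
    }
}
```

The two guards differ only for a count of exactly 0, which is the empty-table case the fix targets.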

Co-Authored-By: Raghvendra Singh <raghav@cashify.in>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@hello-stephen

Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morningman

Contributor Author

run buildall

@hello-stephen

Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉

Increment coverage report
Complete coverage report

Category           Coverage
Function Coverage  71.44% (25545/35758)
Line Coverage      54.37% (271086/498639)
Region Coverage    51.96% (224834/432718)
Branch Coverage    53.35% (96739/181321)

hello-stephen merged commit 809692d into apache:branch-4.0 on Jul 1, 2026
28 of 32 checks passed
2 participants