[improve](nereids) filter nereidsPrunedTabletIds per partition in distributionPrune #63851

Open
Larborator wants to merge 1 commit into apache:master from Larborator:optimize/distribute-prune-per-partition

@Larborator Larborator commented May 28, 2026

What problem does this PR solve?

#53403 short-circuited distributionPrune to return the entire nereidsPrunedTabletIds set when running under Nereids. However, the caller computeTabletInfo invokes distributionPrune inside a per-partition loop and then iterates the returned ids, calling MaterializedIndex.getTablet(id) on each. When nereidsPrunedTabletIds contains tablets across many partitions, every per-partition iteration walks the entire global set and does a getTablet hash lookup on ids that belong to other partitions (which are then filtered out by the null check), yielding O(partitionNum * globalPrunedSize) lookups. The short-circuit also copies the full HashSet into a new ArrayList once per partition.

Filter the global set down to the current partition's tablet ids (tabletIdsInOrder, already prepared by the caller) before returning. The result is identical to what the caller's null-check would have produced, so behavior is unchanged; only the redundant lookups and copies are eliminated. The non-Nereids path, the sampleTabletIds path and the empty-set fallback are untouched.
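
The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the class `PerPartitionPruneSketch` and the helper `filterToPartition` are hypothetical names, and iterating `tabletIdsInOrder` while probing the global set is an assumption about how the intersection is computed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PerPartitionPruneSketch {
    // Hypothetical helper: keep only the ids of the current partition that
    // also survived Nereids pruning. Cost is proportional to the partition's
    // tablet count, not to the size of the global pruned set.
    static List<Long> filterToPartition(Set<Long> nereidsPrunedTabletIds,
                                        List<Long> tabletIdsInOrder) {
        List<Long> result = new ArrayList<>();
        for (Long id : tabletIdsInOrder) {
            if (nereidsPrunedTabletIds.contains(id)) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Globally pruned ids spanning several partitions (made-up values).
        Set<Long> globalPruned = new HashSet<>(Arrays.asList(11L, 12L, 21L, 33L));
        // Tablet ids that belong to the partition currently being processed.
        List<Long> partitionTablets = Arrays.asList(20L, 21L, 22L);
        System.out.println(filterToPartition(globalPruned, partitionTablets)); // prints [21]
    }
}
```

Ids from other partitions never reach the caller, so the null check on `getTablet` has nothing left to discard.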

Issue Number: close #63854

Related PR: #53403

Problem Summary:

Plan time of OlapScan queries with many partitions and many globally pruned tablets degrades quadratically due to redundant per-partition iterations over the global pruned tablet set in OlapScanNode.distributionPrune. Restore per-partition complexity by filtering the global set down to the current partition's tablets before returning.
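
To make the cost difference concrete, the toy simulation below counts tablet lookups for both strategies on a synthetic layout. Everything here is an assumption for illustration: the class `PruneLookupCount`, the plain `HashMap` standing in for a partition's id-to-tablet index, and the chosen sizes. It is not a benchmark of Doris itself.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PruneLookupCount {
    // Returns {lookupsBefore, lookupsAfter} for a synthetic table in which
    // each partition keeps `keptPerPartition` of its tablets after pruning.
    static long[] countLookups(int partitionNum, int tabletsPerPartition, int keptPerPartition) {
        List<Map<Long, String>> partitions = new ArrayList<>();
        Set<Long> globalPruned = new HashSet<>();
        long nextId = 0;
        for (int p = 0; p < partitionNum; p++) {
            Map<Long, String> tablets = new HashMap<>();
            for (int t = 0; t < tabletsPerPartition; t++) {
                long id = nextId++;
                tablets.put(id, "tablet-" + id);
                if (t < keptPerPartition) {
                    globalPruned.add(id);
                }
            }
            partitions.add(tablets);
        }

        long before = 0;
        long after = 0;
        for (Map<Long, String> tablets : partitions) {
            // Old behavior: probe every globally pruned id in every partition.
            for (Long id : globalPruned) {
                before++;
                tablets.get(id); // null for ids owned by other partitions
            }
            // New behavior: filter to this partition first, then look up.
            for (Long id : tablets.keySet()) {
                if (globalPruned.contains(id)) {
                    after++;
                    tablets.get(id);
                }
            }
        }
        return new long[] {before, after};
    }

    public static void main(String[] args) {
        long[] r = countLookups(100, 10, 2);
        System.out.println(r[0] + " " + r[1]); // prints 20000 200
    }
}
```

With 100 partitions and 200 globally pruned tablets, the unfiltered path performs 100 * 200 lookups, while the filtered path performs one lookup per selected tablet. The filtered path still does one set-membership check per tablet in each partition, which is the per-partition cost the PR aims to restore.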

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Larborator Larborator force-pushed the optimize/distribute-prune-per-partition branch from ee08fcc to 04cc2a6 Compare May 28, 2026 12:49
@Larborator Larborator changed the title [optimize](nereids) filter nereidsPrunedTabletIds per partition in distributionPrune [improve](nereids) filter nereidsPrunedTabletIds per partition in distributionPrune May 28, 2026
…tributionPrune

apache#53403 short-circuited `distributionPrune` to return the entire
`nereidsPrunedTabletIds` set when running under Nereids. However, the
caller `computeTabletInfo` invokes `distributionPrune` inside a
per-partition loop and then iterates the returned ids, calling
`MaterializedIndex.getTablet(id)` on each. When `nereidsPrunedTabletIds`
contains tablets across many partitions, every per-partition iteration
walks the entire global set and does a `getTablet` hash lookup on ids
that belong to other partitions (which are then filtered out by the
null check), yielding O(partitionNum * globalPrunedSize) lookups. The
short-circuit also copies the full HashSet into a new ArrayList once
per partition.

Filter the global set down to the current partition's tablet ids
(`tabletIdsInOrder`, already prepared by the caller) before returning.
The result is identical to what the caller's null-check would have
produced, so behavior is unchanged; only the redundant lookups and
copies are eliminated. The non-Nereids path, the `sampleTabletIds`
path and the empty-set fallback are untouched.

Also cache the `selectedTable.getTablet(id)` result in the caller's
loop so the lookup runs once per id instead of twice.
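
A rough sketch of the caching change, under stated assumptions: `CountingIndex` is a hypothetical stand-in for the index whose `getTablet(id)` is called in the caller's loop, and `String` replaces the real tablet type. The point is only that the lookup result is stored in a local variable and reused for both the null check and the subsequent use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CachedTabletLookupSketch {
    // Hypothetical index that counts how often getTablet is called.
    static final class CountingIndex {
        private final Map<Long, String> tablets = new HashMap<>();
        int lookups = 0;

        void put(long id, String tablet) {
            tablets.put(id, tablet);
        }

        String getTablet(long id) {
            lookups++;
            return tablets.get(id);
        }
    }

    // One lookup per id: the result is cached in a local and reused.
    static List<String> collect(CountingIndex index, List<Long> ids) {
        List<String> selected = new ArrayList<>();
        for (Long id : ids) {
            String tablet = index.getTablet(id);
            if (tablet != null) {
                selected.add(tablet);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        CountingIndex index = new CountingIndex();
        index.put(1L, "t1");
        index.put(2L, "t2");
        List<String> selected = collect(index, Arrays.asList(1L, 2L, 3L));
        System.out.println(selected + " lookups=" + index.lookups); // prints [t1, t2] lookups=3
    }
}
```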
@Larborator Larborator force-pushed the optimize/distribute-prune-per-partition branch from 04cc2a6 to a7bc775 Compare May 28, 2026 13:08
@Larborator

/review

Development

Successfully merging this pull request may close these issues.

[Bug](nereids) distributionPrune doesn't slice nereidsPrunedTabletIds per partition

2 participants