Skip to content

Add batched SegmentZKMetadata iteration API and migrate controller usages#17706

Open
xiangfu0 wants to merge 1 commit intoapache:masterfrom
xiangfu0:codex/add-batch-segment-metadata-api
Open

Add batched SegmentZKMetadata iteration API and migrate controller usages#17706
xiangfu0 wants to merge 1 commit intoapache:masterfrom
xiangfu0:codex/add-batch-segment-metadata-api

Conversation

@xiangfu0
Copy link
Contributor

@xiangfu0 xiangfu0 commented Feb 15, 2026

Summary

  • Add batched/streaming SegmentZKMetadata enumeration API in PinotHelixResourceManager via ZKMetadataProvider
  • Add configurable batch size support (default 1000) for retention scanning
  • Migrate heavy controller callsites from list-based getSegmentsZKMetadata(...) to iterative processing where applicable

@codecov-commenter
Copy link

codecov-commenter commented Feb 15, 2026

Codecov Report

❌ Patch coverage is 0% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.61%. Comparing base (657e7f0) to head (2005659).

Files with missing lines Patch % Lines
...ache/pinot/common/metadata/ZKMetadataProvider.java 0.00% 26 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (657e7f0) and HEAD (2005659). Click for more details.

HEAD has 8 uploads less than BASE
Flag BASE (657e7f0) HEAD (2005659)
java-21 5 4
unittests 4 2
temurin 10 8
java-11 5 4
unittests2 2 0
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17706      +/-   ##
============================================
- Coverage     63.23%   55.61%   -7.62%     
+ Complexity     1502      727     -775     
============================================
  Files          3179     2479     -700     
  Lines        190710   140471   -50239     
  Branches      29153    22384    -6769     
============================================
- Hits         120597    78126   -42471     
+ Misses        60746    55751    -4995     
+ Partials       9367     6594    -2773     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 55.59% <0.00%> (-7.62%) ⬇️
java-21 55.57% <0.00%> (-7.61%) ⬇️
temurin 55.61% <0.00%> (-7.62%) ⬇️
unittests 55.61% <0.00%> (-7.62%) ⬇️
unittests1 55.61% <0.00%> (-0.02%) ⬇️
unittests2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@xiangfu0 xiangfu0 force-pushed the codex/add-batch-segment-metadata-api branch from 28e86e3 to 17da5ce Compare February 16, 2026 03:37
@xiangfu0 xiangfu0 force-pushed the codex/add-batch-segment-metadata-api branch from 17da5ce to 2005659 Compare February 16, 2026 08:57
_pinotHelixResourceManager.getSegmentsZKMetadata(tableNameWithType);

for (SegmentZKMetadata segmentZKMetadata : segmentZKMetadataList) {
_pinotHelixResourceManager.forEachSegmentsZKMetadata(tableNameWithType, segmentZKMetadata -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this actually be slower due to more ZK point calls rather than earlier getChildren() calls ?

I think if use cases require all data, then earlier getChildren() calls is probably a better choice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For propertyStore, underlying it's also wrapping a for loop on top of zkclient to call each path to fetch the znRecord then construct the segment metadata.

So there is no difference in terms of the zk overhead.

GetChildren for all segment name is always just one zk call to fetch all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants