@yashmayya
Contributor

@yashmayya yashmayya commented Dec 10, 2025

  • Currently, ORDER BY / LIMIT / OFFSET queries have two sort stages: a local (initial) sort, followed by an exchange and a global (final) sort.
  • The final sort runs on a single worker / server. Since that server is a bottleneck, the expectation is that a random or different server is chosen for each such query (see https://github.com/apache/pinot/blob/d1497bf77667c165e83c357d3c3fdab5dc957216/pino[…]src/main/java/org/apache/pinot/query/routing/WorkerManager.java and https://github.com/apache/pinot/blame/d1497bf77667c165e83c357d3c3fdab5dc957216/pin[…]pache/pinot/query/planner/physical/DispatchablePlanVisitor.java).
  • This works as expected for queries that have both ORDER BY and LIMIT / OFFSET. However, for queries that have only ORDER BY, or only LIMIT / OFFSET, the hotspot server is always the same. The reason is that the condition for setting the singleton worker constraint on a sort node is !node.getCollations().isEmpty() && node.getOffset() != -1. node.getCollations() is empty for queries without ORDER BY, and node.getOffset() is -1 on both the local and the global sort node for ORDER BY queries without LIMIT and OFFSET.
  • In such queries, multiple workers / servers are assigned to the global sort intermediate stage, but only one worker actually receives and sends all the data, because the hash exchange uses an empty key selector that routes all the data to the first worker / mailbox. Since the worker / mailbox list is usually ordered the same way, the hotspot server is always the one assigned to the first worker in the list.
  • This patch fixes the issue by always using a singleton worker for the global / final sort stage, which is identified by checking that the sort's input is a mailbox node.
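
The behavior described in the bullets above can be illustrated with a small, self-contained sketch. It is not Pinot code: SortInfo, oldRule, newRule, and targetWorker are hypothetical names made up for illustration, and the hash function is an assumption chosen so that an empty key always hashes to 0. Only the old condition itself is taken from the description.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

public class SingletonSortSketch {

  /** Hypothetical stand-in for a sort node; it only models the fields the description mentions. */
  static final class SortInfo {
    final List<String> collations;   // ORDER BY keys; empty when the query has no ORDER BY
    final int offset;                // -1 when not set, as stated in the description
    final boolean inputIsMailbox;    // true when the sort reads from an exchange (global / final sort)

    SortInfo(List<String> collations, int offset, boolean inputIsMailbox) {
      this.collations = collations;
      this.offset = offset;
      this.inputIsMailbox = inputIsMailbox;
    }
  }

  /** The pre-patch condition quoted in the description. */
  static boolean oldRule(SortInfo node) {
    return !node.collations.isEmpty() && node.offset != -1;
  }

  /** The post-patch idea: any sort fed by a mailbox is treated as the final sort. */
  static boolean newRule(SortInfo node) {
    return node.inputIsMailbox;
  }

  /**
   * Assumed model of a hash exchange: hash the key columns, then pick a worker.
   * With no key columns the loop never runs, so the hash stays 0 and every row
   * goes to worker index 0 (the first worker in the list).
   */
  static int targetWorker(Object[] row, int[] keyColumns, int numWorkers) {
    int hash = 0;
    for (int column : keyColumns) {
      hash = 31 * hash + Objects.hashCode(row[column]);
    }
    return Math.floorMod(hash, numWorkers);
  }

  public static void main(String[] args) {
    // Offsets other than -1 are illustrative values, not taken from Pinot.
    SortInfo orderByLimitOffset = new SortInfo(Arrays.asList("col1"), 10, true);
    SortInfo orderByOnly = new SortInfo(Arrays.asList("col1"), -1, true);
    SortInfo limitOnly = new SortInfo(Collections.<String>emptyList(), 0, true);

    System.out.println("ORDER BY + LIMIT/OFFSET: old=" + oldRule(orderByLimitOffset)
        + " new=" + newRule(orderByLimitOffset));
    System.out.println("ORDER BY only:           old=" + oldRule(orderByOnly)
        + " new=" + newRule(orderByOnly));
    System.out.println("LIMIT only:              old=" + oldRule(limitOnly)
        + " new=" + newRule(limitOnly));

    // Empty key selector: different rows still land on the same worker.
    int[] emptyKey = new int[0];
    System.out.println("row A -> worker " + targetWorker(new Object[] {"a", 1}, emptyKey, 4));
    System.out.println("row B -> worker " + targetWorker(new Object[] {"b", 2}, emptyKey, 4));
  }
}
```

Under these assumptions, the old rule misses the ORDER BY-only and LIMIT-only cases, so the stage keeps several workers while the empty key sends every row to the first one. Keying the decision on the mailbox input covers all three query shapes.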

@yashmayya yashmayya added the multi-stage (Related to the multi-stage query engine) and enhancement labels Dec 10, 2025
@codecov-commenter

codecov-commenter commented Dec 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.22%. Comparing base (4a33926) to head (10742a1).
⚠️ Report is 5 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17347      +/-   ##
============================================
- Coverage     63.27%   63.22%   -0.05%     
+ Complexity     1475     1474       -1     
============================================
  Files          3135     3141       +6     
  Lines        186600   187398     +798     
  Branches      28510    28700     +190     
============================================
+ Hits         118075   118488     +413     
- Misses        59398    59722     +324     
- Partials       9127     9188      +61     
Flag                  Coverage             Δ
custom-integration1   100.00% <ø>          (ø)
integration           100.00% <ø>          (ø)
integration1          100.00% <ø>          (ø)
integration2          0.00% <ø>            (ø)
java-11               63.19% <100.00%>     (-0.05%) ⬇️
java-21               63.18% <100.00%>     (-0.06%) ⬇️
temurin               63.22% <100.00%>     (-0.05%) ⬇️
unittests             63.22% <100.00%>     (-0.05%) ⬇️
unittests1            55.64% <100.00%>     (-0.01%) ⬇️
unittests2            33.85% <0.00%>       (-0.11%) ⬇️


@yashmayya yashmayya force-pushed the hotspot-server-non-order-by-limit-query-mse branch from c09cd5b to ac5ea0b Compare December 10, 2025 21:27
@yashmayya yashmayya changed the title from "Use singleton worker for LIMIT queries without ORDER BY to rotate hotspot server" to "Always use singleton worker for final sort stage to ensure hotspot server rotation" Dec 10, 2025
@yashmayya yashmayya force-pushed the hotspot-server-non-order-by-limit-query-mse branch from ac5ea0b to 10742a1 Compare December 10, 2025 21:31
@yashmayya yashmayya marked this pull request as ready for review December 10, 2025 22:02
@yashmayya yashmayya merged commit cc77607 into apache:master Dec 11, 2025
18 checks passed
