Fix intermediate stage routing for multi-stage engine by Jackie-Jiang · Pull Request #11393 · apache/pinot

Jackie-Jiang · 2023-08-20T02:50:03Z

Currently we use the server tag to determine the workers for the intermediate stage, which has the following issues:

- Maintaining the table-to-tag mapping in BrokerRoutingManager is very expensive. It reads all table configs on each instance config change (this can result in millions of reads during a rolling restart of all servers in a large cluster).
- It has a bug: only changes from newly enabled servers are picked up. Changes to existing servers are not.
- When instance assignment config is used, the server tag is ignored, so the mapping will be wrong.

To address these issues, this PR picks the serving servers directly from the InstanceSelector instead of using the tag to determine servers.
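As a rough illustration of the idea (not the code in this PR), the serving servers for a table can be derived from the segment-to-candidate mapping the selector already maintains, with no table config reads. The class name `ServingServersSketch`, the method `servingServers`, and the string-keyed maps below are all hypothetical stand-ins, not Pinot's actual `InstanceSelector` API.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: these names are illustrative and are not Pinot classes.
final class ServingServersSketch {
  private ServingServersSketch() {
  }

  /**
   * Collects every enabled server that hosts at least one segment of the table.
   *
   * @param segmentToCandidates segment name -> servers that can serve that segment
   * @param enabledServers servers currently enabled for routing
   * @return the sorted set of servers that are both candidates and enabled
   */
  static Set<String> servingServers(Map<String, List<String>> segmentToCandidates,
      Set<String> enabledServers) {
    Set<String> serving = new TreeSet<>();
    for (List<String> candidates : segmentToCandidates.values()) {
      for (String server : candidates) {
        if (enabledServers.contains(server)) {
          serving.add(server);
        }
      }
    }
    return serving;
  }

  public static void main(String[] args) {
    Map<String, List<String>> segmentToCandidates = Map.of(
        "seg_0", List.of("server_1", "server_2"),
        "seg_1", List.of("server_2", "server_3"));
    // server_3 hosts a segment but is not enabled, so it is excluded.
    Set<String> enabled = Set.of("server_1", "server_2");
    System.out.println(servingServers(segmentToCandidates, enabled)); // prints [server_1, server_2]
  }
}
```

Because the result depends only on which servers actually host the table's segments, it stays correct regardless of how the segments were assigned, which is the property a tag-based lookup lacks.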

kishoreg · 2023-08-20T03:01:38Z

Are we losing any functionality or features? What about brokers as intermediate nodes? Also, how can we mark some nodes as shuffle servers in the future for expensive queries?

codecov-commenter · 2023-08-20T03:28:00Z

Codecov Report

Merging #11393 (373b649) into master (1783f2a) will increase coverage by 0.00%.
Report is 2 commits behind head on master.
The diff coverage is 71.73%.

@@            Coverage Diff            @@
##             master   #11393   +/-   ##
=========================================
  Coverage     61.44%   61.44%           
+ Complexity     6515     6514    -1     
=========================================
  Files          2234     2234           
  Lines        120174   120154   -20     
  Branches      18240    18238    -2     
=========================================
- Hits          73838    73829    -9     
+ Misses        40911    40894   -17     
- Partials       5425     5431    +6

Flag	Coverage Δ
integration1	`0.00% <0.00%> (ø)`
integration2	`0.00% <0.00%> (ø)`
java-11	`61.41% <71.73%> (+0.02%)`	⬆️
java-17	`61.29% <71.73%> (+0.01%)`	⬆️
java-20	`61.29% <71.73%> (-0.02%)`	⬇️
temurin	`61.44% <71.73%> (+<0.01%)`	⬆️
unittests1	`66.95% <72.00%> (-0.02%)`	⬇️
unittests2	`14.57% <32.60%> (+<0.01%)`	⬆️

Files Changed	Coverage Δ
...ker/routing/instanceselector/InstanceSelector.java	`100.00% <ø> (ø)`
...che/pinot/broker/routing/BrokerRoutingManager.java	`58.74% <20.00%> (-0.45%)`	⬇️
...broker/routing/instanceselector/SegmentStates.java	`87.50% <50.00%> (-12.50%)`	⬇️
.../org/apache/pinot/query/routing/WorkerManager.java	`68.82% <72.00%> (-0.63%)`	⬇️
...routing/instanceselector/BaseInstanceSelector.java	`92.97% <92.85%> (-0.47%)`	⬇️

... and 16 files with indirect coverage changes

Jackie-Jiang · 2023-08-20T04:50:21Z

> Are we losing any functionality or features? What about brokers as intermediate nodes? Also, how can we mark some nodes as shuffle servers in the future for expensive queries?

@kishoreg We are not losing any functionality, and this should be the correct way to handle it. Without this fix, the multi-stage engine can only support tables with the basic balanced assignment.
In the future, if we want to introduce dedicated shuffle servers, we can add a new tag for them. Initially I tried to solve the problem by maintaining a tag-to-servers map (no table config reads), but realized it won't work for more advanced assignments, e.g. one table referencing the instance partitions of another table (a common setup for colocating tables). Adding that map won't be a big change, so we can do it that way when adding the shuffle server feature.

xiangfu0 · 2023-08-20T12:27:37Z

pinot-query-planner/src/main/java/org/apache/pinot/query/routing/WorkerManager.java

+          }
+        }
+      }
+      serverInstances = new ArrayList<>(servers.size());


The actual serverInstances size might be smaller than servers.size().

Correct, but this should give a very good estimate. It doesn't need to be exact.
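To illustrate the point of this exchange: the argument to `new ArrayList<>(int)` is an initial capacity, not a size, so over-estimating it only reserves a little extra space. The sketch below is a hypothetical stand-in for the reviewed code (the class `CapacityHintSketch` and method `keepEnabled` are invented for illustration), assuming some servers are filtered out before being added.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: not the WorkerManager code, only the capacity-hint pattern.
final class CapacityHintSketch {
  private CapacityHintSketch() {
  }

  /** Returns the servers that are enabled, preserving input order. */
  static List<String> keepEnabled(List<String> servers, Set<String> enabled) {
    // servers.size() is an upper bound on the result size; it is used only as
    // an initial capacity so the list never needs to grow while filling.
    List<String> result = new ArrayList<>(servers.size());
    for (String server : servers) {
      if (enabled.contains(server)) {
        result.add(server);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> kept = keepEnabled(List.of("server_1", "server_2", "server_3"), Set.of("server_2"));
    System.out.println(kept.size()); // prints 1, even though the capacity hint was 3
  }
}
```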

xiangfu0 · 2023-08-20T12:31:19Z

...ker/src/main/java/org/apache/pinot/broker/routing/instanceselector/BaseInstanceSelector.java

+    _segmentStates = new SegmentStates(instanceCandidatesMap, servingInstances, unavailableSegments);
+  }
+
+  private List<SegmentInstanceCandidate> getEnabledCandidates(List<SegmentInstanceCandidate> candidates,


Could you add some comments, since this method does more than its name suggests, or rename the method?

walterddr · 2023-09-15T00:40:00Z

this is a thorough fix over #11386

Fix intermediate stage routing for multi-stage engine

373b649

Jackie-Jiang added enhancement bugfix labels Aug 20, 2023

Jackie-Jiang requested a review from xiangfu0 August 20, 2023 02:50

Jackie-Jiang added the multi-stage label (Related to the multi-stage query engine) Aug 20, 2023

xiangfu0 approved these changes Aug 20, 2023

Address comment

f8a09f1

Jackie-Jiang merged commit 2c02389 into apache:master Aug 20, 2023

Jackie-Jiang deleted the fix_multi_stage_routing branch August 20, 2023 18:47

Jackie-Jiang merged 2 commits into apache:master from Jackie-Jiang:fix_multi_stage_routing

5 participants
