Fix intermediate stage routing for multi-stage engine#11393
Jackie-Jiang merged 2 commits into apache:master
Are we losing any functionality or features? What about brokers as intermediate nodes? Also, how can we mark some nodes as shuffle servers in the future for expensive queries?
Codecov Report
@@ Coverage Diff @@
## master #11393 +/- ##
=========================================
Coverage 61.44% 61.44%
+ Complexity 6515 6514 -1
=========================================
Files 2234 2234
Lines 120174 120154 -20
Branches 18240 18238 -2
=========================================
- Hits 73838 73829 -9
+ Misses 40911 40894 -17
- Partials 5425 5431 +6
... and 16 files with indirect coverage changes
@kishoreg We are not losing any functionality, and this should be the correct way to handle it. Without this fix, it can only support tables with the basic balanced assignment.
}
}
}
serverInstances = new ArrayList<>(servers.size());
The actual `serverInstances` size might be smaller than `servers.size()`.
Correct, but this should give a very good estimate. It doesn't need to be exact.
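To illustrate the point about sizing: the constructor argument of `ArrayList` is only an initial capacity, so passing an upper bound is harmless when fewer elements end up being added. The following is a minimal sketch, not the code from this PR; the class `CapacityHintSketch`, the method `collectEnabled`, and the use of plain `String` instance ids are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, for illustration only (not Pinot code).
final class CapacityHintSketch {
  private CapacityHintSketch() {
  }

  // Collects the servers that are also in the enabled set.
  static List<String> collectEnabled(Set<String> servers, Set<String> enabled) {
    // servers.size() is an upper bound on the result size. It is used only as
    // a capacity hint: the list's size() still reflects the elements added.
    List<String> result = new ArrayList<>(servers.size());
    for (String server : servers) {
      if (enabled.contains(server)) {
        result.add(server);
      }
    }
    return result;
  }
}
```

Over-estimating costs at most a few unused slots in the backing array, while under-estimating would trigger a resize, which is why an approximate bound is acceptable here.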
_segmentStates = new SegmentStates(instanceCandidatesMap, servingInstances, unavailableSegments);
}

private List<SegmentInstanceCandidate> getEnabledCandidates(List<SegmentInstanceCandidate> candidates,
Add some comments, since this method does more than its name suggests, or fix the method name.
This is a thorough fix over #11386.
Currently we use the server tag to determine the workers for the intermediate stage, which has the following issues:
- `BrokerRoutingManager` is super expensive. It reads all table configs for each instance config change (this can result in millions of reads when doing a rolling restart of all servers in a large cluster).

To address the above issues, instead of using tags to determine the servers, directly pick the serving servers from the `InstanceSelector`.
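
As a rough illustration of the idea (not the actual implementation in this PR), the set of serving servers can be derived from the per-segment candidates that an instance selector already tracks, without reading any table config or tag. Everything in this sketch is an assumption made for the example: the `ServingServersSketch` class, its simplified `Candidate` type (standing in for something like `SegmentInstanceCandidate`), and the `servingInstances` method.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, for illustration only (not Pinot code).
final class ServingServersSketch {
  private ServingServersSketch() {
  }

  // Simplified stand-in for a per-segment instance candidate.
  static final class Candidate {
    final String instance;
    final boolean online;

    Candidate(String instance, boolean online) {
      this.instance = instance;
      this.online = online;
    }
  }

  // Returns every enabled instance that is online for at least one segment.
  static Set<String> servingInstances(Map<String, List<Candidate>> candidatesBySegment,
      Set<String> enabledInstances) {
    Set<String> serving = new TreeSet<>();
    for (List<Candidate> candidates : candidatesBySegment.values()) {
      for (Candidate candidate : candidates) {
        if (candidate.online && enabledInstances.contains(candidate.instance)) {
          serving.add(candidate.instance);
        }
      }
    }
    return serving;
  }
}
```

Because the result depends only on which instances actually serve segments, a sketch like this does not rely on any particular tag layout or assignment strategy.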