Coordinator documentation is stale #7201

Open
leventov opened this issue Mar 6, 2019 · 3 comments


@leventov
Member

leventov commented Mar 6, 2019

Coordinator documentation has the following paragraph:

Before any unassigned segments are serviced by historical nodes, the available historical nodes for each tier are first sorted in terms of capacity, with least capacity servers having the highest priority. Unassigned segments are always assigned to the nodes with least capacity to maintain a level of balance between nodes. The coordinator does not directly communicate with a historical node when assigning it a new segment; instead the coordinator creates some temporary information about the new segment under load queue path of the historical node. Once this request is seen, the historical node will load the segment and begin servicing it.

As far as I can tell, both key pieces of information that are communicated in this paragraph are wrong:

  • "Unassigned segments are always assigned to the nodes with least capacity" - no, the regular balancing rules are actually in play during loading (see DruidCoordinatorRuleRunner).
  • "The coordinator does not directly communicate with a historical node when assigning it a new segment" - actually it does, if HTTP announcing (should we better call it "HTTP segment loading info communication"?) is used.

Could somebody please verify my conclusions?

Suggestions about how this paragraph should be rephrased are also welcome (or you can go ahead with a PR yourself).

@clintropolis @egor-ryashin @gianm

@clintropolis
Member

Yes, you are right: loading and dropping decisions are made with the BalancerStrategy, so that the load rules and the balancer do not make contradictory decisions. I believe DiskNormalizedCostBalancerStrategy would achieve what the current document is describing, but I think CostBalancerStrategy and CachingCostBalancerStrategy don't care much about disk usage and just prefer to spread segments out across the timeline for maximum fanout.
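To make the difference concrete, here is a rough Python sketch contrasting the two ideas. It is only a toy model: it reads "least capacity" in the docs as "lowest disk usage", and the cost function is a made-up stand-in that penalizes time-adjacent segments on the same server. It is not Druid's actual cost function, and all names (Server, pick_least_used, placement_cost, pick_lowest_cost) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    max_bytes: int
    # Each segment is a tuple: (start_hour, end_hour, size_bytes).
    segments: list = field(default_factory=list)

    def usage_ratio(self):
        used = sum(size for _, _, size in self.segments)
        return used / self.max_bytes


def pick_least_used(servers, segment):
    """Policy as described in the stale paragraph (our reading): lowest disk usage wins."""
    return min(servers, key=lambda s: s.usage_ratio())


def interval_gap(a, b):
    """Hours between two intervals; 0 if they touch or overlap."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def placement_cost(server, segment):
    """Toy cost (an assumption): nearby-in-time segments on one server are expensive."""
    return sum(1.0 / (1 + interval_gap(segment, other)) for other in server.segments)


def pick_lowest_cost(servers, segment):
    """Cost-style policy: ignore disk usage, spread the timeline across servers."""
    return min(servers, key=lambda s: placement_cost(s, segment))


# Server A is emptier but already holds the two hours right before the new segment.
# Server B is fuller but only holds a segment far away in time.
servers = [
    Server("A", 100, [(0, 1, 10), (1, 2, 10)]),
    Server("B", 100, [(1000, 1001, 50)]),
]
new_segment = (2, 3, 10)

print(pick_least_used(servers, new_segment).name)   # A
print(pick_lowest_cost(servers, new_segment).name)  # B
```

In this toy setup the two policies disagree, which is the kind of mismatch between the documentation and the actual behavior discussed above.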

You are also correct about HTTP segment loading: the coordinator maintains a list of segments to load and drop (HttpLoadQueuePeon) and talks to the historical directly over HTTP (SegmentListerResource on the historical side) to give it a batch of load or drop operations to perform.
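Roughly, the flow looks like the following Python sketch. This is a simplified illustration only: ToyLoadQueue and ToyHistorical are hypothetical names, the batch size is arbitrary, and a plain method call stands in for the HTTP request. It does not mirror the real HttpLoadQueuePeon or SegmentListerResource APIs.

```python
from collections import OrderedDict


class ToyHistorical:
    """Stand-in for the historical side; handle_batch plays the role of the HTTP endpoint."""

    def __init__(self, loaded=()):
        self.loaded = set(loaded)

    def handle_batch(self, batch):
        results = {}
        for segment_id, action in batch:
            if action == "load":
                self.loaded.add(segment_id)
            else:
                self.loaded.discard(segment_id)
            results[segment_id] = True  # report success per segment
        return results


class ToyLoadQueue:
    """Stand-in for the coordinator-side queue of pending load/drop requests."""

    def __init__(self, send_batch, batch_size=2):
        self._send_batch = send_batch  # callable standing in for the HTTP call
        self._batch_size = batch_size
        self._pending = OrderedDict()  # segment_id -> "load" or "drop"

    def load(self, segment_id):
        self._pending[segment_id] = "load"

    def drop(self, segment_id):
        self._pending[segment_id] = "drop"

    def flush_once(self):
        """Send up to batch_size pending requests; forget the ones that succeeded."""
        batch = list(self._pending.items())[: self._batch_size]
        if not batch:
            return []
        results = self._send_batch(batch)
        for segment_id, ok in results.items():
            if ok:
                self._pending.pop(segment_id, None)
        return batch

    def pending(self):
        return dict(self._pending)


historical = ToyHistorical(loaded={"seg-0"})
queue = ToyLoadQueue(historical.handle_batch, batch_size=2)
queue.load("seg-1")
queue.load("seg-2")
queue.drop("seg-0")

first = queue.flush_once()   # sends the two load requests
second = queue.flush_once()  # sends the remaining drop request
print(sorted(historical.loaded), queue.pending())
```

The point of the sketch is only that the coordinator pushes batches to the historical directly, rather than writing entries under a load queue path and waiting for the historical to notice them.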

@clintropolis
Member

... actually it does, if HTTP announcing (should we better call it "HTTP segment loading info communication"?) is used.

Additionally, I suppose there are actually two ways it communicates directly: HTTP segment loading (druid.coordinator.loadqueuepeon.type set to 'http', which is what I described in my previous comment) and HTTP announcing (druid.announcer.type). The two are more or less independent; for example, you can use ZK-based announcements together with HTTP loading.

@leventov
Member Author

This comment should also be addressed: #7306 (comment)
