Add Scheduler method to compute target cores and memory #2258
This moves logic from Adaptive onto the main Scheduler and is a small tentative step towards restructuring deployment.
@@ -957,7 +958,8 @@ def __init__(
     'heartbeat_worker': self.heartbeat_worker,
     'get_task_status': self.get_task_status,
     'get_task_stream': self.get_task_stream,
-    'register_worker_callbacks': self.register_worker_callbacks
+    'register_worker_callbacks': self.register_worker_callbacks,
+    'target_workers': self.target_workers,
Why make this a handler method that can be called remotely?
The decision of "how many workers should the cluster have" is best made by the scheduler, which has all of the information necessary to make it. However, the actual creation and destruction of workers may happen in a separate process (ClusterManager), so that process may want to ask the scheduler for the target number of workers.
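As a rough illustration of the kind of calculation the scheduler could own, here is a minimal sketch. It is not the code in this PR: the function signature, the parameter names (`total_occupancy`, `target_duration`, `cores_per_worker`) and the sizing rule are all assumptions made for the example.

```python
import math


def target_workers(total_occupancy, target_duration, cores_per_worker,
                   min_workers=0, max_workers=None):
    """Hypothetical helper: estimate how many workers a cluster should have.

    total_occupancy  -- estimated seconds of pending work (assumed input)
    target_duration  -- seconds in which we would like that work to finish
    cores_per_worker -- cores contributed by each worker
    """
    if target_duration <= 0 or cores_per_worker <= 0:
        raise ValueError("target_duration and cores_per_worker must be positive")

    # Cores needed to finish the pending work within the target duration.
    cores = math.ceil(total_occupancy / target_duration)

    # Convert cores to whole workers, then clamp to the allowed range.
    workers = math.ceil(cores / cores_per_worker)
    workers = max(workers, min_workers)
    if max_workers is not None:
        workers = min(workers, max_workers)
    return workers
```

Because only the scheduler knows how much work is pending, keeping this calculation on the scheduler means a separate process only has to ask for the resulting number.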
There is still plenty of work to do, but I don't anticipate working on this topic personally in the next month.
I'm not sure I understand what you mean here by "non-python scheduler".
From previous exchanges, I had the impression that the Scheduler should ask the ClusterManager to scale to a given number of cores, rather than the ClusterManager asking how many workers to launch. The latter may be the correct solution, though; I'm not sure. It depends on where the Adaptive logic must run.
In another issue, IIRC, you spoke about the possibility of the Scheduler not being a Python piece of software in the future.
Yeah, I don't have a strong opinion here. My guess though is that it will be easier for a ClusterManager to contact the Scheduler rather than the other way. The scheduler is already running a server and we're accustomed to contacting it. If we can avoid setting up another server for the ClusterManager that sounds ideal. The cluster manager will also start the scheduler, and so will probably know the address to contact it.
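To make the direction of the call concrete, here is a small sketch of a cluster manager pulling the target from the scheduler. Everything in it is hypothetical: `FakeScheduler` stands in for the real Scheduler, a plain dictionary lookup stands in for the remote handler call, and the `ClusterManager` class and its `adapt` method are invented for the example.

```python
class FakeScheduler:
    """Stand-in for the scheduler: exposes handlers by name, as a server would."""

    def __init__(self, target):
        self._target = target
        # Mirrors the idea of registering 'target_workers' as a handler.
        self.handlers = {'target_workers': self.target_workers}

    def target_workers(self):
        return self._target


class ClusterManager:
    """Hypothetical manager that owns worker creation and destruction."""

    def __init__(self, scheduler):
        self.scheduler = scheduler  # the manager started it, so it knows how to reach it
        self.workers = []
        self._counter = 0

    def adapt(self):
        # The manager asks the scheduler; the scheduler never calls the manager.
        target = self.scheduler.handlers['target_workers']()
        while len(self.workers) < target:
            self.workers.append(f"worker-{self._counter}")
            self._counter += 1
        while len(self.workers) > target:
            self.workers.pop()
        return len(self.workers)
```

In this arrangement only the scheduler needs to run a server; the manager is a plain client that polls it whenever it wants to rescale.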
Yep, you are right, that's one important design decision for #2235:
Closing, as I think this was implemented elsewhere.
cc @jcrist