Adaptive.needs_cpu does not depend on number of tasks remaining #2329
Comments
Thanks for the excellent issue @delgadom. Your analysis seems spot-on to me. Rather than iterate through tasks, it might be enough to either:

I have a slight preference for the second choice, I think.
Thanks @mrocklin. The first option is actually the one I implemented - I first tried iterating through all tasks and it took about 10-100x as long! The second option definitely sounds reasonable - I didn't try it because I was nervous about restructuring the existing flow.

The part where I think the change should be made is the start of `recommendations`:

```python
def recommendations(self, comm=None):
    should_scale_up = self.should_scale_up()
    workers = set(self.workers_to_close(key=self.worker_key,
                                        minimum=self.minimum))
    if should_scale_up and workers:
        logger.info("Attempting to scale up and scale down simultaneously.")
        self.close_counts.clear()
        return {'status': 'error',
                'msg': 'Trying to scale up and down simultaneously'}
    ...
```

This in turn calls `should_scale_up()`, which calls `needs_cpu()`.
Could we postpone the `should_scale_up()` check until after the scale-down logic has run? Something like:

```python
def recommendations(self, comm=None):
    workers_to_close = set(self.workers_to_close(key=self.worker_key,
                                                 minimum=self.minimum))
    if workers_to_close:
        d = {}
        to_close = []
        for w, c in self.close_counts.items():
            if w in workers_to_close:
                if c >= self.wait_count:
                    to_close.append(w)
                else:
                    d[w] = c

        for w in workers_to_close:
            d[w] = d.get(w, 0) + 1

        self.close_counts = d

        if to_close:
            return {'status': 'down', 'workers': to_close}

    elif self.should_scale_up():
        self.close_counts.clear()
        return toolz.merge({'status': 'up'}, self.get_scale_up_kwargs())

    else:
        self.close_counts.clear()
        return None
```

I may be missing an edge case here, and if so, I'm happy to modify the approach. If this looks good to you I'll look into how this is tested.
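For context, the dicts returned above (`{'status': 'error' | 'up' | 'down', ...}`) are consumed by the adaptive loop. A simplified, hypothetical consumer (for illustration only; not the actual `Adaptive` code, and `act_on`/`cluster` here are stand-ins) might look like:

```python
# Hypothetical consumer of the recommendation dicts, for illustration only.
def act_on(recommendation, cluster):
    if recommendation is None:
        return  # nothing to do this cycle
    status = recommendation["status"]
    if status == "error":
        # e.g. log recommendation["msg"] and retry on the next cycle
        print(recommendation["msg"])
    elif status == "up":
        # the remaining keys come from get_scale_up_kwargs()
        kwargs = {k: v for k, v in recommendation.items() if k != "status"}
        cluster.scale_up(**kwargs)
    elif status == "down":
        cluster.scale_down(recommendation["workers"])
```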
Oh, this will work on the first iteration, but `needs_cpu` would still need something like this:

```python
def needs_cpu(self):
    """
    Check if the cluster is CPU constrained (too many tasks per core)

    Notes
    -----
    Returns ``False`` if any workers are idle. Otherwise, returns ``True``
    if the occupancy per core is some factor larger than ``startup_cost``.
    """
    if not all(ws.processing for ws in self.scheduler.workers.values()):
        return False

    total_occupancy = self.scheduler.total_occupancy
    total_cores = sum([ws.ncores for ws in self.scheduler.workers.values()])
    ...
```
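For reference, the part elided by the `...` above is presumably the existing occupancy comparison; roughly (a reconstruction based on the docstring, not the verbatim source):

```python
    # Reconstruction of the comparison elided by "..." above (based on the
    # docstring; not the verbatim distributed source): scale up when average
    # occupancy per core is well above the startup cost.
    if total_occupancy / (total_cores + 1e-9) > self.startup_cost * 2:
        return True
    return False
```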
Actually, this may make more sense in `should_scale_up`:

```python
def should_scale_up(self):
    """
    Determine whether additional workers should be added to the cluster

    Returns
    -------
    scale_up : bool

    Notes
    -----
    Additional workers are added whenever

    1. There are fewer workers than our minimum
    2. There are unrunnable tasks and no workers
    3. There are no idle workers, and
       a. The cluster is CPU constrained, or
       b. The cluster is RAM constrained

    See Also
    --------
    needs_cpu
    needs_memory
    """
    with log_errors():
        if len(self.scheduler.workers) < self.minimum:
            return True

        if self.maximum is not None and len(self.scheduler.workers) >= self.maximum:
            return False

        if self.scheduler.unrunnable and not self.scheduler.workers:
            return True

        if not all(ws.processing for ws in self.scheduler.workers.values()):
            return False

        needs_cpu = self.needs_cpu()
        needs_memory = self.needs_memory()

        if needs_cpu or needs_memory:
            return True

        return False
```
Hmm, actually it seems tough to get around the need to check the number of tasks. I should have thought of this originally, but once the idle workers have been scaled down, every remaining worker has tasks processing, so the idle-worker check alone no longer prevents scaling back up:

```python
def should_scale_up(self):
    """
    Determine whether additional workers should be added to the cluster

    Returns
    -------
    scale_up : bool

    Notes
    -----
    Additional workers are added whenever

    1. There are fewer workers than our minimum
    2. There are unrunnable tasks and no workers
    3. There are no idle workers and the number of pending tasks exceeds
       the number of workers, and
       a. The cluster is CPU constrained, or
       b. The cluster is RAM constrained

    See Also
    --------
    needs_cpu
    needs_memory
    """
    with log_errors():
        if len(self.scheduler.workers) < self.minimum:
            return True

        if self.maximum is not None and len(self.scheduler.workers) >= self.maximum:
            return False

        if self.scheduler.unrunnable and not self.scheduler.workers:
            return True

        if not all(ws.processing for ws in self.scheduler.workers.values()):
            return False

        tasks_processing = sum(len(w.processing) for w in self.scheduler.workers.values())
        num_workers = len(self.scheduler.workers)
        if tasks_processing <= num_workers:
            return False

        needs_cpu = self.needs_cpu()
        needs_memory = self.needs_memory()

        if needs_cpu or needs_memory:
            return True

        return False
```
Is this issue resolved now that the scheduler has absorbed this logic?
I personally don't know. If someone wants to look, though, I would recommend starting here: `distributed/distributed/scheduler.py`, lines 5209 to 5260 (at 2acffc3).
This one must be fixed by #2330!

Indeed, thank you for following up here @guillaumeeb!
Issue description

We're using distributed (with `KubeCluster`) and `client.map` to schedule a lot of long-running tasks (right now we're running a Fortran-based hydrological model). We noticed that clusters don't scale down when the number of remaining tasks falls below the number of workers; they only scale down once all tasks have completed.

I isolated the problem to `Adaptive.needs_cpu()`. The current method does not check whether there are any pending tasks on the scheduler. This results in `adapt.recommendations()` returning the error message `Trying to scale up and down simultaneously` whenever there are fewer pending tasks than there are workers, as long as the average task time suggests that more cores are needed (independent of the number of pending tasks).
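To see why this mis-fires with only a few long tasks left, here is a small worked example with made-up numbers; the occupancy-versus-`startup_cost` comparison is taken from the docstring quoted earlier in the thread, and the exact factor of 2 is an assumption:

```python
# Made-up numbers illustrating the mis-fire described above: only 5 tasks remain
# on a 100-core cluster, yet the occupancy-per-core heuristic still asks for
# more workers. The threshold form follows the docstring above; the factor of 2
# is assumed.
startup_cost = 1.0                   # assumed value, seconds
total_cores = 100
remaining_tasks = 5
seconds_per_task = 600.0             # long-running model runs
total_occupancy = remaining_tasks * seconds_per_task  # 3000 s of estimated work

cpu_constrained = total_occupancy / (total_cores + 1e-9) > startup_cost * 2
print(cpu_constrained)  # True, even though 5 tasks can never occupy 100 cores
```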
Proposed solution

I implemented a quick fix by finding the total number of pending tasks and only recommending a "scale up" if the number of tasks exceeds the number of existing workers, in addition to the current criteria.
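The patch itself is not reproduced above; based on the discussion in the comments, the check presumably looked something like the following sketch (the helper name and its placement are assumptions, not the author's exact diff — in the proposal the check sits directly inside `needs_cpu`/`should_scale_up`):

```python
def enough_pending_tasks(scheduler):
    """Assumed helper (not part of distributed): True only when more tasks are
    currently processing than there are workers, so extra workers could help."""
    tasks_processing = sum(len(ws.processing) for ws in scheduler.workers.values())
    return tasks_processing > len(scheduler.workers)
```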
Pros

Cons

This adds overhead to `needs_cpu`. I tested this out on limited cases with between 800 and 100,000 tasks and found the current implementation usually takes ~30-40 µs, and the proposed implementation roughly doubles this. There may be faster ways of doing this, but I imagine this may be a critical problem with this implementation, so help would be appreciated in estimating tasks remaining more quickly!
Requires some interactivity, but reliably re-produces the problem