Currently the worker_objective function uses worker managed memory as a tiebreaker if it looks like a task will start in the same amount of time on multiple workers:
distributed/distributed/scheduler.py
Line 3236 in 00bf8ed
    return (start_time, ws.nbytes)
In a heterogeneous cluster, this means we might pick a small worker with little memory to spare over a large worker that holds more total data in memory but still has plenty of memory available.
Maybe we should compare by percentage of memory used, rather than total bytes used:
diff --git a/distributed/scheduler.py b/distributed/scheduler.py
index eb5828bf..5325af4b 100644
--- a/distributed/scheduler.py
+++ b/distributed/scheduler.py
@@ -3233,7 +3233,7 @@ class SchedulerState:
         if ts.actor:
             return (len(ws.actors), start_time, ws.nbytes)
         else:
-            return (start_time, ws.nbytes)
+            return (start_time, ws.nbytes / ws.memory_limit)
 
     def add_replica(self, ts: TaskState, ws: WorkerState):
         """Note that a worker holds a replica of a task with state='memory'"""

#7248 does this for root tasks when queuing is enabled. I think it would make sense to do it in all cases, though.
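To make the difference concrete, here is a minimal sketch of the two tiebreakers on a toy two-worker cluster. `FakeWorker`, `objective_bytes`, and `objective_fraction` are hypothetical stand-ins written for this illustration, not the scheduler's real classes or functions, and the sketch assumes every worker has a positive `memory_limit`.

```python
from dataclasses import dataclass

GiB = 2**30


@dataclass
class FakeWorker:
    """Hypothetical stand-in for WorkerState with only the fields used here."""
    name: str
    nbytes: int        # bytes of task data currently held in memory
    memory_limit: int  # memory limit of the worker, in bytes (assumed > 0)


def objective_bytes(ws: FakeWorker, start_time: float) -> tuple:
    # Tiebreaker as it is today: absolute bytes held in memory.
    return (start_time, ws.nbytes)


def objective_fraction(ws: FakeWorker, start_time: float) -> tuple:
    # Proposed tiebreaker: fraction of the worker's memory limit in use.
    return (start_time, ws.nbytes / ws.memory_limit)


small = FakeWorker("small", nbytes=3 * GiB, memory_limit=4 * GiB)    # 75% full
large = FakeWorker("large", nbytes=8 * GiB, memory_limit=64 * GiB)   # 12.5% full
workers = [small, large]

# Same estimated start time on both workers, so only the tiebreaker decides.
start_time = 0.5

by_bytes = min(workers, key=lambda ws: objective_bytes(ws, start_time)).name
by_fraction = min(workers, key=lambda ws: objective_fraction(ws, start_time)).name

print(by_bytes)     # small: fewer absolute bytes, but nearly full
print(by_fraction)  # large: more bytes, but far more headroom
```

A real change would also need to decide what to do for workers without a memory limit, which this sketch does not handle.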