I would like to have some fairness between clients.
Use case:
- A cluster is running indefinitely
- Client A connects and submits a job which will take 2 hours to execute
- 10 minutes later Client B connects and submits a job which would take 1 minute to finish
Client B must now wait almost 2 hours, until Client A's job finishes, before it gets access to a worker to execute its task.
My suggestion is that the worker selects tasks in a round-robin fashion across clients. This would give a very simple form of fairness between clients, so that small jobs can jump ahead in the queue.
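To make the idea concrete, here is a minimal sketch of the round-robin selection I have in mind. It is plain Python and does not use any Dask internals; the class `RoundRobinQueue` and its methods `submit` and `next_task` are hypothetical names made up for illustration only.

```python
from collections import OrderedDict, deque


class RoundRobinQueue:
    """Hypothetical per-client task queue (illustration only, not a Dask API)."""

    def __init__(self):
        # Maps client id -> FIFO of that client's pending tasks.
        # The dict order is the rotation order between clients.
        self._queues = OrderedDict()

    def submit(self, client, task):
        # A new client joins at the back of the rotation.
        self._queues.setdefault(client, deque()).append(task)

    def next_task(self):
        # Return (client, task) for the client at the front of the
        # rotation, or None if nothing is pending.
        if not self._queues:
            return None
        client, tasks = next(iter(self._queues.items()))
        task = tasks.popleft()
        if tasks:
            # Client still has work: send it to the back of the rotation.
            self._queues.move_to_end(client)
        else:
            # Client has no pending tasks left: drop it from the rotation.
            del self._queues[client]
        return client, task


queue = RoundRobinQueue()
for i in range(3):
    queue.submit("A", f"a{i}")   # Client A submits a large job first
queue.submit("B", "b0")          # Client B submits a small job later

order = []
while (item := queue.next_task()) is not None:
    order.append(item[1])
print(order)
```

In this sketch Client B's single task runs right after Client A's first task instead of after all of them. Note that this only helps when a job consists of many tasks; a single long-running task would still occupy its worker until it finishes.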
In practice in our environment we have jobs that take between a few minutes and several days to complete.
I have not made any modifications to the Dask source code, but I am happy to have a look if this feature would be valuable to someone else. Can you give some pointers on where it makes sense to implement this, and which approach you would recommend? Thanks!