The worker resources docs show the following scenario:
dask-worker scheduler:8786 --nprocs 3 --nthreads 2 --resources "process=1"
With the code below, there will be at most 3 tasks running concurrently and each task will run in a separate process:
from distributed import Client
client = Client('scheduler:8786')
futures = [client.submit(non_thread_safe_function, arg,
                         resources={'process': 1}) for arg in args]
If additional tasks are submitted without the process resource, can they be scheduled on the unused threads? The scenario I'm concerned about is one where a {process: 1} task occupies more than one thread, effectively starving these additional tasks.
If so, I would have to tag all my additional tasks (and the other workers) with a resource of their own to avoid this situation. The consequence is that those tasks could then never run on the process worker, even when no process tasks are running. (Is this clear?)
I'd be happy to update the documentation to help elucidate this situation, once I feel I fully grok what is going on.
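To make the question concrete, here is a toy model of the behaviour I am hoping for. It is only a sketch: ToyWorker is a hypothetical class, not part of distributed, and the rule it encodes (each task occupies exactly one thread, with resources counted separately) is my assumption, not confirmed behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWorker:
    """Toy model of one worker process; NOT how distributed is implemented."""
    nthreads: int
    resources: dict
    running: list = field(default_factory=list)  # resource needs of running tasks

    def can_run(self, needs):
        # Assumption: a task needs one free thread plus the resources it declares.
        free_threads = self.nthreads - len(self.running)
        in_use = {}
        for task_needs in self.running:
            for name, amount in task_needs.items():
                in_use[name] = in_use.get(name, 0) + amount
        enough_resources = all(
            self.resources.get(name, 0) - in_use.get(name, 0) >= amount
            for name, amount in needs.items()
        )
        return free_threads > 0 and enough_resources

    def start(self, needs=None):
        needs = needs or {}
        if not self.can_run(needs):
            return False
        self.running.append(needs)
        return True


# One of the three worker processes from the command above: 2 threads, process=1.
worker = ToyWorker(nthreads=2, resources={"process": 1})
first_tagged = worker.start({"process": 1})   # takes the resource and one thread
second_tagged = worker.start({"process": 1})  # refused: resource already in use
untagged = worker.start()                     # only needs the remaining thread
print(first_tagged, second_tagged, untagged)  # True False True
```

If the real scheduler behaves like this model, untagged tasks would not be starved and would not need tagging. If a {process: 1} task instead blocks the whole worker process, the third call would be refused, which is the case I am worried about.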