Clarification on worker resources documentation #3552

@chrisroat

The worker resources docs show the following scenario:

dask-worker scheduler:8786 --nprocs 3 --nthreads 2 --resources "process=1"

With the code below, there will be at most 3 tasks running concurrently and each task will run in a separate process:

from distributed import Client
client = Client('scheduler:8786')

futures = [client.submit(non_thread_safe_function, arg,
                         resources={'process': 1}) for arg in args]

If additional tasks are submitted without the process resource, might they be scheduled on the unused threads? The scenario I'm concerned about is a task holding {'process': 1} that actually consumes multiple threads, effectively starving any additional tasks placed on the same worker.

If so, I would have to tag all my additional tasks (and the other workers) with a different resource to avoid this situation. The consequence is that those tasks could then never be scheduled on the process workers, even when no process tasks are running. (Is this clear?)
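To make the question concrete, here is a toy model of the reading I have in mind. It assumes a resource is just a bookkeeping counter that is tracked separately from thread slots; that assumption is exactly what I am asking about, so this is not a statement of how distributed behaves. `ToyWorker` and `assign` are hypothetical names invented for this sketch and are not part of distributed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWorker:
    """Hypothetical stand-in for one worker process (not a distributed class)."""
    nthreads: int = 2
    resources: dict = field(default_factory=lambda: {"process": 1})
    running: list = field(default_factory=list)

    def can_run(self, needs):
        # Assumption: a task needs one free thread slot AND enough of each
        # requested resource; untagged tasks request no resources at all.
        free_threads = self.nthreads - len(self.running)
        enough = all(self.resources.get(k, 0) >= v for k, v in needs.items())
        return free_threads > 0 and enough

    def start(self, name, needs):
        if not self.can_run(needs):
            return False
        for k, v in needs.items():
            self.resources[k] -= v  # resource is held while the task runs
        self.running.append(name)
        return True


def assign(workers, tasks):
    """Greedily place each (name, needs) task on the first worker that accepts it."""
    waiting = []
    for name, needs in tasks:
        if not any(w.start(name, needs) for w in workers):
            waiting.append(name)
    return waiting


# Mirrors --nprocs 3 --nthreads 2 --resources "process=1"
workers = [ToyWorker() for _ in range(3)]
tagged = [(f"tagged-{i}", {"process": 1}) for i in range(4)]
untagged = [(f"plain-{i}", {}) for i in range(4)]

waiting = assign(workers, tagged + untagged)
for w in workers:
    print(w.running)
print("waiting:", waiting)
```

Under this model every worker ends up running one tagged task and one untagged task side by side, which is the situation I am worried about if the tagged task is itself multi-threaded.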

I'd be happy to update the documentation to help elucidate this situation, once I feel I fully grok what is going on.
