Node label/capability based scheduling (for CeleryExecutor) #35174

@hterik

Description

Schedule tasks based on node labels/capabilities with AND/OR matching, instead of just queue names.

Use case/motivation

Imagine you have tasks that can only run on specific nodes, for example nodes with GPUs, a particular operating system, or other external hardware peripherals connected.

Some tasks have only partial constraints:
Task A: requires gpu
Task B: requires linux
Task C: requires gpu && linux
Task D: requires gpu && windows
Task E: requires gpu || mac-m1-cpu

To serve the above you might have several different nodes, for example:
Node 1: Linux (can serve task B only)
Node 2: Linux+Gpu (can serve A+B+C+E)
Node 3: Windows+Gpu (can serve A+D+E)

Optimizing this type of planning is, from what I understand, not possible with CeleryExecutor today. Airflow Celery workers can only listen to a list of queues. In the scenario above, task A should not be assigned to a single queue, because there are two workers that could potentially execute it. Nodes 2 and 3 can't listen to a common queue though, because they support different features. Instead you would have to compromise and choose one of the nodes for task A.
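
To make the requested matching concrete, here is a minimal sketch of AND/OR label matching applied to the tasks and nodes from the example. It is only an illustration of the idea, not an existing Airflow or Celery API: the helper names (`has`, `all_of`, `any_of`, `eligible_nodes`) and the node identifiers are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List

Labels = FrozenSet[str]
# A requirement is a predicate over the set of labels a node advertises.
Requirement = Callable[[Labels], bool]


def has(label: str) -> Requirement:
    """Requirement satisfied when the node carries the given label."""
    return lambda labels: label in labels


def all_of(*reqs: Requirement) -> Requirement:
    """AND: every sub-requirement must hold."""
    return lambda labels: all(r(labels) for r in reqs)


def any_of(*reqs: Requirement) -> Requirement:
    """OR: at least one sub-requirement must hold."""
    return lambda labels: any(r(labels) for r in reqs)


# Task requirements, taken from the example in this issue.
TASKS: Dict[str, Requirement] = {
    "A": has("gpu"),
    "B": has("linux"),
    "C": all_of(has("gpu"), has("linux")),
    "D": all_of(has("gpu"), has("windows")),
    "E": any_of(has("gpu"), has("mac-m1-cpu")),
}

# Node labels, taken from the example in this issue (names are hypothetical).
NODES: Dict[str, Labels] = {
    "node1": frozenset({"linux"}),
    "node2": frozenset({"linux", "gpu"}),
    "node3": frozenset({"windows", "gpu"}),
}


def eligible_nodes(task: str) -> List[str]:
    """Return the sorted names of all nodes whose labels satisfy the task."""
    requirement = TASKS[task]
    return sorted(name for name, labels in NODES.items() if requirement(labels))


if __name__ == "__main__":
    for task in sorted(TASKS):
        print(task, eligible_nodes(task))
```

With this kind of matching, a scheduler could hand a task to whichever eligible node is free, rather than binding the task to one queue name up front.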

Is this even possible to solve with Celery? I've read the underlying Celery documentation closely, and it has lots of advanced features for routing messages into queues using exchanges and routers, but it seems like consumers only read from fixed queue names.

Am I missing something or is this correct? It feels like this should be a very common scenario.
For comparison with other systems: Kubernetes can do this with nodeSelectors, affinity and anti-affinity. With Airflow's KubernetesExecutor these can be injected via pod_override.
Jenkins has label conditions.

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
