
Change FAQ documentation for max_threads #11266

@shenoykarthikd


FAQ Documentation for max_threads currently reads as follows:

max_threads: Scheduler will spawn multiple threads in parallel to schedule dags. This is controlled by max_threads with default value of 2. User should increase this value to a larger value (e.g numbers of cpus where scheduler runs - 1) in production.

The parenthetical example confuses new developers, who read it as a hard limit: that the scheduler's thread count cannot exceed the number of CPUs minus 1. I have seen many Airflow installations where the value is set to the number of CPUs minus 1, whereas the upper limit should actually be determined by the size of the instance (CPU and memory) on which the scheduler is installed. Because of this misunderstanding, I have heard many new Airflow developers say that Airflow is very slow at scheduling DAGs. When I look into their configuration, I find max_threads limited to the number of CPUs.

Kindly consider changing it to the following:
max_threads: Scheduler will spawn multiple threads in parallel to schedule dags. This is controlled by max_threads with default value of 2. User should increase this value to a larger value that fits the size of the installed hardware in production.
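
To make the distinction concrete, here is a minimal sketch that reads max_threads from a config fragment and compares it with the "CPUs minus 1" figure from the current FAQ text. It assumes the option sits under a [scheduler] section of airflow.cfg; the fragment, the value 8, and the helper names `read_max_threads` and `cpus_minus_one` are hypothetical and only for illustration, not a recommendation.

```python
import configparser
import os

# Hypothetical airflow.cfg fragment; the value 8 is illustrative only.
SAMPLE_CFG = """
[scheduler]
max_threads = 8
"""


def read_max_threads(cfg_text, default=2):
    """Return [scheduler] max_threads from config text.

    Falls back to 2, the default value quoted in the FAQ, when the
    section or option is missing.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.getint("scheduler", "max_threads", fallback=default)


def cpus_minus_one():
    """Return the 'number of CPUs - 1' figure from the FAQ example (at least 1)."""
    return max((os.cpu_count() or 1) - 1, 1)


if __name__ == "__main__":
    configured = read_max_threads(SAMPLE_CFG)
    suggested = cpus_minus_one()
    print(f"max_threads configured: {configured}")
    print(f"FAQ example (cpus - 1): {suggested}")
    if configured > suggested:
        # Nothing in the sketch rejects this: the FAQ figure is an
        # example value, not an upper bound.
        print("Configured value exceeds cpus - 1; size it to CPU and memory instead.")
```

The point of the comparison is only that "CPUs minus 1" is one example value; whether a larger setting helps depends on the CPU and memory available to the scheduler host.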

Labels: kind:bug
