From 007bca06f7dcefe8d08543dfcedaf07f1bf793c1 Mon Sep 17 00:00:00 2001
From: Andrew Kenworthy
Date: Wed, 10 Sep 2025 11:13:27 +0200
Subject: [PATCH 1/4] added note on api workers

---
 .../airflow/pages/troubleshooting/index.adoc  | 26 +++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/docs/modules/airflow/pages/troubleshooting/index.adoc b/docs/modules/airflow/pages/troubleshooting/index.adoc
index bb8fd12c..24cecca8 100644
--- a/docs/modules/airflow/pages/troubleshooting/index.adoc
+++ b/docs/modules/airflow/pages/troubleshooting/index.adoc
@@ -25,3 +25,29 @@ webservers: # Also add to other roles!
 ----
 
 See e.g. https://github.com/minio/minio/issues/20845[this MinIO issue] for details.
+
+== Setting API Workers
+
+In Airflow the webserver (called the API Server in Airflow 3.x+) can use multiple workers.
+This is determined by the environment variable `+AIRFLOW__API__WORKERS+` and is set by default to `4` in Airflow 2.x and `1` in Airflow 3.x+.
+The reason for this difference is that Airflow uses a backend library to manage child processes and in 3.x+ this library can cause child processes to be killed if a hard-coded startup timeout is exceeded.
+For most cases a default of `1` should be sufficient, but if you run into performance issues and would like to add more workers, you can either change the environment variable using `envOverrides`:
+
+[source,yaml]
+----
+webservers:
+  envOverrides:
+    AIRFLOW__API__WORKERS: 2 # something other than the default of 1
+----
+
+or modulate multiple worker processes at the level of webserver, keeping the default of a single worker per webserver:
+
+[source,yaml]
+----
+webservers:
+  roleGroups:
+    default:
+      replicas: 2 # add a replica (with a single worker)
+----
+
+TIP: Our recommendation is to increase the webserver replicas, with each webserver running a single worker, as this removes the risk of running into timeouts or memory issues.
From ec84e6c128691eeeb612197296fdf02e7ed0b6ee Mon Sep 17 00:00:00 2001
From: Andrew Kenworthy
Date: Wed, 10 Sep 2025 11:17:36 +0200
Subject: [PATCH 2/4] minor change

---
 docs/modules/airflow/pages/troubleshooting/index.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/modules/airflow/pages/troubleshooting/index.adoc b/docs/modules/airflow/pages/troubleshooting/index.adoc
index 24cecca8..c5fffb43 100644
--- a/docs/modules/airflow/pages/troubleshooting/index.adoc
+++ b/docs/modules/airflow/pages/troubleshooting/index.adoc
@@ -31,7 +31,7 @@ See e.g. https://github.com/minio/minio/issues/20845[this MinIO issue] for detai
 In Airflow the webserver (called the API Server in Airflow 3.x+) can use multiple workers.
 This is determined by the environment variable `+AIRFLOW__API__WORKERS+` and is set by default to `4` in Airflow 2.x and `1` in Airflow 3.x+.
 The reason for this difference is that Airflow uses a backend library to manage child processes and in 3.x+ this library can cause child processes to be killed if a hard-coded startup timeout is exceeded.
-For most cases a default of `1` should be sufficient, but if you run into performance issues and would like to add more workers, you can either change the environment variable using `envOverrides`:
+For most cases with Airflow 3.x+ a default of `1` should be sufficient, but if you run into performance issues and would like to add more workers, you can either change the environment variable using `envOverrides`:
 
 [source,yaml]
 ----
From 42e0d498b6f42f5051b79b60f09a5fd683bd0bc2 Mon Sep 17 00:00:00 2001
From: Andrew Kenworthy
Date: Wed, 10 Sep 2025 11:18:40 +0200
Subject: [PATCH 3/4] changelog

---
 CHANGELOG.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 53156d1f..9a7e8e05 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,7 @@
 
 - Add a flag to determine if database initialization steps should be executed ([#669]).
 - Add new roles for dag-processor and triggerer processes ([#679]).
+- Add a note on webserver workers to the troubleshooting section ([#685]).
 
 ### Fixed
 
@@ -22,6 +23,7 @@
 [#678]: https://github.com/stackabletech/airflow-operator/pull/678
 [#679]: https://github.com/stackabletech/airflow-operator/pull/679
 [#683]: https://github.com/stackabletech/airflow-operator/pull/683
+[#685]: https://github.com/stackabletech/airflow-operator/pull/685
 
 ## [25.7.0] - 2025-07-23
 
From 9d2f625296829bd5e176c6ab8f0c4b061b97cb6b Mon Sep 17 00:00:00 2001
From: Andrew Kenworthy
Date: Wed, 10 Sep 2025 11:36:50 +0200
Subject: [PATCH 4/4] feedback review: swapped order

---
 .../airflow/pages/troubleshooting/index.adoc     | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/docs/modules/airflow/pages/troubleshooting/index.adoc b/docs/modules/airflow/pages/troubleshooting/index.adoc
index c5fffb43..aa2ef848 100644
--- a/docs/modules/airflow/pages/troubleshooting/index.adoc
+++ b/docs/modules/airflow/pages/troubleshooting/index.adoc
@@ -31,23 +31,23 @@ See e.g. https://github.com/minio/minio/issues/20845[this MinIO issue] for detai
 In Airflow the webserver (called the API Server in Airflow 3.x+) can use multiple workers.
 This is determined by the environment variable `+AIRFLOW__API__WORKERS+` and is set by default to `4` in Airflow 2.x and `1` in Airflow 3.x+.
 The reason for this difference is that Airflow uses a backend library to manage child processes and in 3.x+ this library can cause child processes to be killed if a hard-coded startup timeout is exceeded.
-For most cases with Airflow 3.x+ a default of `1` should be sufficient, but if you run into performance issues and would like to add more workers, you can either change the environment variable using `envOverrides`:
+In most cases with Airflow 3.x+, the default of `1` should be sufficient, but if you run into performance issues and would like to add more workers, you can either scale out at the webserver level by adding replicas, keeping the default of a single worker per webserver:
 
 [source,yaml]
 ----
 webservers:
-  envOverrides:
-    AIRFLOW__API__WORKERS: 2 # something other than the default of 1
+  roleGroups:
+    default:
+      replicas: 2 # add a replica (with a single worker)
 ----
 
-or modulate multiple worker processes at the level of webserver, keeping the default of a single worker per webserver:
+or change the environment variable using `envOverrides`:
 
 [source,yaml]
 ----
 webservers:
-  roleGroups:
-    default:
-      replicas: 2 # add a replica (with a single worker)
+  envOverrides:
+    AIRFLOW__API__WORKERS: 2 # something other than the default of 1
 ----
 
-TIP: Our recommendation is to increase the webserver replicas, with each webserver running a single worker, as this removes the risk of running into timeouts or memory issues.
+TIP: Our strong recommendation is to increase the webserver replicas, with each webserver running a single worker, as this removes the risk of running into timeouts or memory issues.