Commit cd33c2a

Migrate executor docs to respective providers (#37728)

RNHTTR committed Feb 27, 2024
1 parent 5d1b271 commit cd33c2a
Showing 30 changed files with 124 additions and 83 deletions.
3 changes: 0 additions & 3 deletions .github/boring-cyborg.yml
@@ -15,7 +15,6 @@
# specific language governing permissions and limitations
# under the License.
---

# Details: https://github.com/kaxil/boring-cyborg

labelPRBasedOnFilePath:
@@ -154,8 +153,6 @@ labelPRBasedOnFilePath:
- airflow/example_dags/example_kubernetes_executor.py
- airflow/providers/cncf/kubernetes/**/*
- airflow/providers/celery/executors/celery_kubernetes_executor.py
- docs/apache-airflow/core-concepts/executor/kubernetes.rst
- docs/apache-airflow/core-concepts/executor/celery_kubernetes.rst
- docs/apache-airflow-providers-cncf-kubernetes/**/*
- kubernetes_tests/**/*
- tests/providers/cncf/kubernetes/**/*
2 changes: 1 addition & 1 deletion airflow/providers/celery/executors/celery_executor.py
@@ -19,7 +19,7 @@
.. seealso::
For more information on how the CeleryExecutor works, take a look at the guide:
:ref:`executor:CeleryExecutor`
:doc:`/celery_executor`
"""
from __future__ import annotations

@@ -19,7 +19,7 @@
.. seealso::
For more information on how the KubernetesExecutor works, take a look at the guide:
:ref:`executor:KubernetesExecutor`
:doc:`/kubernetes_executor`
"""
from __future__ import annotations

@@ -15,9 +15,6 @@
specific language governing permissions and limitations
under the License.
.. _executor:CeleryExecutor:

Celery Executor
===============

@@ -36,7 +33,7 @@ change your ``airflow.cfg`` to point the executor parameter to
For more information about setting up a Celery broker, refer to the
exhaustive `Celery documentation on the topic <https://docs.celeryq.dev/en/latest/getting-started/>`_.

The configuration parameters of the Celery Executor can be found in :doc:`apache-airflow-providers-celery:configurations-ref`.
The configuration parameters of the Celery Executor can be found in the Celery provider's :doc:`configurations-ref`.

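As a hedged illustration of the wiring involved (the broker and result-backend URLs below are placeholders, not recommendations; the provider's configuration reference above is authoritative):

.. code-block:: python

    import os

    # AIRFLOW__<SECTION>__<KEY> is Airflow's standard environment-variable
    # override for airflow.cfg; these must be set before Airflow starts.
    os.environ["AIRFLOW__CORE__EXECUTOR"] = "CeleryExecutor"
    os.environ["AIRFLOW__CELERY__BROKER_URL"] = "redis://localhost:6379/0"  # placeholder
    os.environ["AIRFLOW__CELERY__RESULT_BACKEND"] = "db+postgresql://user:pass@localhost/airflow"  # placeholder
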
Here are a few imperative requirements for your workers:

@@ -97,7 +94,7 @@ Some caveats:
- Tasks can consume resources. Make sure your worker has enough resources to run ``worker_concurrency`` tasks
- Queue names are limited to 256 characters, but each broker backend might have its own restrictions

See :doc:`/administration-and-deployment/modules_management` for details on how Python and Airflow manage modules.
See :doc:`apache-airflow:administration-and-deployment/modules_management` for details on how Python and Airflow manage modules.

Architecture
------------
@@ -173,7 +170,7 @@ The components communicate with each other in many places
Task execution process
----------------------

.. figure:: ../../img/run_task_on_celery_executor.png
.. figure:: img/run_task_on_celery_executor.png
:scale: 50 %

Sequence diagram - task execution process
@@ -205,7 +202,7 @@ During this process, two processes are created:
| [11] **WorkerProcess** saves status information in **ResultBackend**.
| [13] When **SchedulerProcess** asks **ResultBackend** again about the status, it will get information about the status of the task.
.. _executor:CeleryExecutor:queue:
.. _celery_executor:queue:

Queues
------
@@ -15,9 +15,6 @@
specific language governing permissions and limitations
under the License.
.. _executor:CeleryKubernetesExecutor:

CeleryKubernetes Executor
=========================

@@ -36,7 +33,7 @@ An executor is chosen to run a task based on the task's queue.
``CeleryKubernetesExecutor`` inherits the scalability of the ``CeleryExecutor`` to
handle high load at peak times, and the runtime isolation of the ``KubernetesExecutor``.

The configuration parameters of the Celery Executor can be found in :doc:`apache-airflow-providers-celery:configurations-ref`.
The configuration parameters of the Celery Executor can be found in the Celery provider's :doc:`configurations-ref`.

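A minimal sketch of that queue-based routing, assuming the tasks are defined inside a DAG and that ``kubernetes`` is the configured ``[celery_kubernetes_executor] kubernetes_queue`` (its documented default; task names are illustrative):

.. code-block:: python

    from airflow.operators.bash import BashOperator

    # No queue set: runs on a Celery worker via the default queue.
    light = BashOperator(task_id="light", bash_command="echo celery")

    # Matching the configured kubernetes_queue routes this task to the KubernetesExecutor.
    heavy = BashOperator(
        task_id="heavy",
        bash_command="echo kubernetes",
        queue="kubernetes",
    )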

When to use CeleryKubernetesExecutor
10 changes: 10 additions & 0 deletions docs/apache-airflow-providers-celery/index.rst
@@ -29,6 +29,16 @@
Changelog <changelog>
Security <security>


.. toctree::
:hidden:
:maxdepth: 1
:caption: Executors

CeleryExecutor details <celery_executor>
CeleryKubernetesExecutor details <celery_kubernetes_executor>


.. toctree::
:hidden:
:maxdepth: 1
File renamed without changes
File renamed without changes
8 changes: 8 additions & 0 deletions docs/apache-airflow-providers-cncf-kubernetes/index.rst
@@ -29,6 +29,14 @@
Changelog <changelog>
Security <security>

.. toctree::
:hidden:
:maxdepth: 1
:caption: Executors

KubernetesExecutor details <kubernetes_executor>
LocalKubernetesExecutor details <local_kubernetes_executor>

.. toctree::
:hidden:
:maxdepth: 1
@@ -16,7 +16,7 @@
under the License.
.. _executor:KubernetesExecutor:
.. _KubernetesExecutor:

Kubernetes Executor
===================
@@ -38,12 +38,12 @@ KubernetesExecutor requires a non-sqlite database in the backend.

When a DAG submits a task, the KubernetesExecutor requests a worker pod from the Kubernetes API. The worker pod then runs the task, reports the result, and terminates.

.. image:: ../../img/arch-diag-kubernetes.png
.. image:: img/arch-diag-kubernetes.png


One example of an Airflow deployment running on a distributed set of five nodes in a Kubernetes cluster is shown below.

.. image:: ../../img/arch-diag-kubernetes2.png
.. image:: img/arch-diag-kubernetes2.png

Consistent with the regular Airflow architecture, the Workers need access to the DAG files to execute the tasks within those DAGs and interact with the Metadata repository. Also, configuration information specific to the Kubernetes Executor, such as the worker namespace and image information, needs to be specified in the Airflow Configuration file.

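Per-task pod customization is done through the ``executor_config`` argument with a ``pod_override``; a minimal sketch, assuming a standard ``cncf.kubernetes`` provider install (task name and resource request are illustrative):

.. code-block:: python

    from kubernetes.client import models as k8s

    from airflow.operators.bash import BashOperator

    memory_hungry = BashOperator(
        task_id="memory_hungry",
        bash_command="echo hello",
        executor_config={
            "pod_override": k8s.V1Pod(
                spec=k8s.V1PodSpec(
                    containers=[
                        # "base" is the conventional name of the main task container.
                        k8s.V1Container(
                            name="base",
                            resources=k8s.V1ResourceRequirements(requests={"memory": "2Gi"}),
                        )
                    ]
                )
            )
        },
    )
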
@@ -56,7 +56,7 @@ Additionally, the Kubernetes Executor enables specification of additional features
.. Airflow_Worker -> Kubernetes: Pod completes with state "Succeeded" and k8s records in ETCD
.. Kubernetes -> Airflow_Scheduler: Airflow scheduler reads "Succeeded" from k8s watcher thread
.. @enduml
.. image:: ../../img/k8s-happy-path.png
.. image:: img/k8s-happy-path.png

Configuration
-------------
@@ -272,7 +272,7 @@ In the case where a worker dies before it can report its status to the backend DB
..
.. @enduml
.. image:: ../../img/k8s-failed-pod.png
.. image:: img/k8s-failed-pod.png


A Kubernetes watcher is a thread that can subscribe to every change that occurs in Kubernetes' database. It is alerted when pods start, run, end, and fail.
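
As a hedged illustration of that watch mechanism, using the Kubernetes Python client directly (the namespace is an assumption, and the executor's real watcher thread is considerably more involved):

.. code-block:: python

    from kubernetes import client, config, watch

    config.load_kube_config()  # or config.load_incluster_config() when running in a cluster
    v1 = client.CoreV1Api()

    # Stream pod lifecycle events, much as the executor's watcher thread does.
    for event in watch.Watch().stream(v1.list_namespaced_pod, namespace="airflow"):
        pod = event["object"]
        print(event["type"], pod.metadata.name, pod.status.phase)
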
@@ -16,7 +16,7 @@
under the License.
.. _executor:LocalKubernetesExecutor:
.. _LocalKubernetesExecutor:

LocalKubernetes Executor
=========================
4 changes: 2 additions & 2 deletions docs/apache-airflow-providers-cncf-kubernetes/operators.rst
@@ -35,7 +35,7 @@ you to create and run Pods on a Kubernetes cluster.
- :ref:`EksPodOperator <howto/operator:EksPodOperator>` operator for `Amazon Elastic Kubernetes Service <https://aws.amazon.com/eks/>`__.

.. note::
The :doc:`Kubernetes executor <apache-airflow:core-concepts/executor/kubernetes>` is **not** required to use this operator.
The :doc:`Kubernetes executor <kubernetes_executor>` is **not** required to use this operator.

How does this operator work?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -602,7 +602,7 @@ you to create and run Jobs on a Kubernetes cluster.
- ``GKEStartJobOperator`` operator for `Google Kubernetes Engine <https://cloud.google.com/kubernetes-engine/>`__.

.. note::
The :doc:`Kubernetes executor <apache-airflow:core-concepts/executor/kubernetes>` is **not** required to use this operator.
The :doc:`Kubernetes executor <kubernetes_executor>` is **not** required to use this operator.

How does this operator work?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -32,7 +32,7 @@ We maintain an :doc:`official Helm chart <helm-chart:index>` for Airflow that helps
Kubernetes Executor
^^^^^^^^^^^^^^^^^^^

The :doc:`Kubernetes Executor </core-concepts/executor/kubernetes>` allows you to run all the Airflow tasks on
The :doc:`Kubernetes Executor <apache-airflow-providers-cncf-kubernetes:kubernetes_executor>` allows you to run all the Airflow tasks on
Kubernetes as separate Pods.

KubernetesPodOperator
@@ -135,7 +135,7 @@ HTTP monitoring for Celery Cluster

You can optionally use Flower to monitor the health of the Celery cluster. It also provides an HTTP API that you can use to build a health check for your environment.

For details about installation, see: :ref:`executor:CeleryExecutor`. For details about usage, see: `The Flower project documentation <https://flower.readthedocs.io/>`__.
For details about installation, see: :doc:`apache-airflow-providers-celery:celery_executor`. For details about usage, see: `The Flower project documentation <https://flower.readthedocs.io/>`__.

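A hedged sketch of such a health check (the ``/api/workers`` endpoint, the default port ``5555``, and the response shape are assumptions; verify them against your Flower version and deployment):

.. code-block:: python

    import requests

    def celery_cluster_is_healthy(base_url: str = "http://localhost:5555") -> bool:
        """Return True if Flower reports at least one responsive Celery worker."""
        resp = requests.get(f"{base_url}/api/workers", params={"status": "true"}, timeout=5)
        resp.raise_for_status()
        # Assumed shape: {"worker-name": true, ...}
        return any(resp.json().values())
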
CLI Check for Celery Workers
----------------------------
@@ -60,8 +60,8 @@ Airflow uses :class:`~airflow.executors.sequential_executor.SequentialExecutor`
nature, the user is limited to executing at most one task at a time. ``Sequential Executor`` also pauses
the scheduler when it runs a task, hence it is not recommended in a production setup. You should use the
:class:`~airflow.executors.local_executor.LocalExecutor` for a single machine.
For a multi-node setup, you should use the :doc:`Kubernetes executor <../core-concepts/executor/kubernetes>` or
the :doc:`Celery executor <../core-concepts/executor/celery>`.
For a multi-node setup, you should use the :doc:`Kubernetes executor <apache-airflow-providers-cncf-kubernetes:kubernetes_executor>` or
the :doc:`Celery executor <apache-airflow-providers-celery:celery_executor>`.


Once you have configured the executor, it is necessary to make sure that every node in the cluster contains
@@ -170,19 +170,19 @@ of the executor you use:
the tasks it runs - then you will need to clear and restart those tasks manually after the upgrade
is completed (or rely on ``retry`` being set for stopped tasks).

* For the :doc:`Celery executor <../core-concepts/executor/celery>`, you have to first put your workers in
* For the :doc:`Celery executor <apache-airflow-providers-celery:celery_executor>`, you have to first put your workers in
offline mode (usually by sending a ``TERM`` signal to the workers), wait until the workers
finish all the running tasks, and then upgrade the code (for example by replacing the image the workers run
in and restarting the workers). You can monitor your workers via the ``flower`` monitoring tool and see the number
of running tasks going down to zero. Once the workers are upgraded, they will be automatically put in online
mode and start picking up new tasks. You can then upgrade the ``Scheduler`` in a rolling restart mode.

* For the :doc:`Kubernetes executor <../core-concepts/executor/kubernetes>`, you can upgrade the scheduler,
* For the :doc:`Kubernetes executor <apache-airflow-providers-cncf-kubernetes:kubernetes_executor>`, you can upgrade the scheduler,
triggerer, and webserver in a rolling restart mode, and generally you should not worry about the workers, as they
are managed by the Kubernetes cluster and will be automatically adopted by ``Schedulers`` when they are
upgraded and restarted.

* For the :doc:`CeleryKubernetesExecutor <../core-concepts/executor/celery-kubernetes>`, you follow the
* For the :doc:`CeleryKubernetesExecutor <apache-airflow-providers-celery:celery_kubernetes_executor>`, you follow the
same procedure as for the ``CeleryExecutor`` - you put the workers in offline mode, wait for the running
tasks to complete, upgrade the workers, and then upgrade the scheduler, triggerer and webserver in a
rolling restart mode - which should also adopt tasks run via the ``KubernetesExecutor`` part of the
Expand Down
2 changes: 1 addition & 1 deletion docs/apache-airflow/best-practices.rst
@@ -82,7 +82,7 @@ it difficult to check the logs of that Task from the Webserver. If that is not d
Communication
--------------

Airflow executes tasks of a DAG on different servers in case you are using :doc:`Kubernetes executor </core-concepts/executor/kubernetes>` or :doc:`Celery executor </core-concepts/executor/celery>`.
Airflow executes tasks of a DAG on different servers in case you are using :doc:`Kubernetes executor <apache-airflow-providers-cncf-kubernetes:kubernetes_executor>` or :doc:`Celery executor <apache-airflow-providers-celery:celery_executor>`.
Therefore, you should not store any file or config in the local filesystem as the next task is likely to run on a different server without access to it — for example, a task that downloads the data file that the next task processes.
In the case of :class:`Local executor <airflow.executors.local_executor.LocalExecutor>`,
storing a file on disk can make retries harder, e.g. when your task requires a config file that is deleted by another task in the DAG.
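
A minimal sketch of passing small values between tasks through XCom instead of the local filesystem (TaskFlow API; DAG and task names are illustrative, and XCom is only suitable for small payloads since it lives in the metadata database):

.. code-block:: python

    from datetime import datetime

    from airflow.decorators import dag, task

    @dag(schedule=None, start_date=datetime(2024, 1, 1))
    def handoff():
        @task
        def download() -> dict:
            # The return value is pushed to XCom automatically.
            return {"rows": 42}

        @task
        def process(payload: dict) -> None:
            print(payload["rows"])

        process(download())

    handoff()
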
@@ -15,44 +15,10 @@
specific language governing permissions and limitations
under the License.
.. _executor:DebugExecutor:

Debug Executor (deprecated)
===========================
.. _concepts:debugging:

The :class:`~airflow.executors.debug_executor.DebugExecutor` is meant as
a debug tool and can be used from an IDE. It is a single-process executor that
queues :class:`~airflow.models.taskinstance.TaskInstance` objects and executes them by running
the ``_run_raw_task`` method.

Due to its nature, the executor can be used with a SQLite database. When used
with sensors, the executor will change the sensor mode to ``reschedule`` to avoid
blocking the execution of the DAG.

Additionally, ``DebugExecutor`` can be used in a fail-fast mode that will make
all other running or scheduled tasks fail immediately. To enable this option, set
``AIRFLOW__DEBUG__FAIL_FAST=True`` or adjust the ``fail_fast`` option in your ``airflow.cfg``.
For more information on setting the configuration, see :doc:`../../howto/set-config`.

**IDE setup steps:**

1. Add a ``main`` block at the end of your DAG file to make it runnable.

It will run a backfill job:

.. code-block:: python

    if __name__ == "__main__":
        from airflow.utils.state import State

        dag.clear()
        dag.run()

2. Set ``AIRFLOW__CORE__EXECUTOR=DebugExecutor`` in the run configuration of your IDE. In
this step you should also set all environment variables required by your DAG.

3. Run / debug the DAG file.
Debugging Airflow DAGs
======================

Testing DAGs with dag.test()
*****************************
@@ -74,7 +40,7 @@ and that's it! You can add arguments such as ``execution_date`` if you want to test
you can run or debug DAGs as needed.
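
The collapsed lines above contain the documented snippet; a minimal self-contained sketch (DAG id and operator are illustrative):

.. code-block:: python

    from datetime import datetime

    from airflow.models.dag import DAG
    from airflow.operators.empty import EmptyOperator

    with DAG(dag_id="example_dag", start_date=datetime(2024, 1, 1), schedule=None) as dag:
        EmptyOperator(task_id="noop")

    if __name__ == "__main__":
        dag.test()  # runs the DAG in-process; no scheduler, executor, or webserver needed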

Comparison with DebugExecutor
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-----------------------------

The ``dag.test`` command has the following benefits over the :class:`~airflow.executors.debug_executor.DebugExecutor`
class, which is now deprecated:
@@ -102,3 +68,42 @@ Run ``python -m pdb <path to dag file>.py`` for an interactive debugging experience
-> bash_command='echo 1',
(Pdb) run_this_last
<Task(EmptyOperator): run_this_last>
.. _executor:DebugExecutor:

Debug Executor (deprecated)
***************************

The :class:`~airflow.executors.debug_executor.DebugExecutor` is meant as
a debug tool and can be used from an IDE. It is a single-process executor that
queues :class:`~airflow.models.taskinstance.TaskInstance` objects and executes them by running
the ``_run_raw_task`` method.

Due to its nature, the executor can be used with a SQLite database. When used
with sensors, the executor will change the sensor mode to ``reschedule`` to avoid
blocking the execution of the DAG.

Additionally, ``DebugExecutor`` can be used in a fail-fast mode that will make
all other running or scheduled tasks fail immediately. To enable this option, set
``AIRFLOW__DEBUG__FAIL_FAST=True`` or adjust the ``fail_fast`` option in your ``airflow.cfg``.
For more information on setting the configuration, see :doc:`../../howto/set-config`.

**IDE setup steps:**

1. Add a ``main`` block at the end of your DAG file to make it runnable.

It will run a backfill job:

.. code-block:: python

    if __name__ == "__main__":
        from airflow.utils.state import State

        dag.clear()
        dag.run()

2. Set ``AIRFLOW__CORE__EXECUTOR=DebugExecutor`` in the run configuration of your IDE. In
this step you should also set all environment variables required by your DAG.

3. Run / debug the DAG file.
12 changes: 4 additions & 8 deletions docs/apache-airflow/core-concepts/executor/index.rst
@@ -52,19 +52,15 @@ There are two types of executors - those that run tasks *locally* (inside the ``
.. toctree::
:maxdepth: 1

debug
local
sequential

**Remote Executors**

.. toctree::
:maxdepth: 1

celery
celery_kubernetes
kubernetes
local_kubernetes
* :doc:`CeleryExecutor <apache-airflow-providers-celery:celery_executor>`
* :doc:`CeleryKubernetesExecutor <apache-airflow-providers-celery:celery_kubernetes_executor>`
* :doc:`KubernetesExecutor <apache-airflow-providers-cncf-kubernetes:kubernetes_executor>`
* :doc:`LocalKubernetesExecutor <apache-airflow-providers-cncf-kubernetes:local_kubernetes_executor>`


.. note::
7 changes: 7 additions & 0 deletions docs/apache-airflow/core-concepts/index.rst
@@ -51,3 +51,10 @@ Here you can find detailed documentation about each one of the core concepts of
xcoms
variables
params

**Debugging**

.. toctree::
:maxdepth: 1

debug
