39 changes: 39 additions & 0 deletions airflow-core/docs/core-concepts/dags.rst
@@ -870,3 +870,42 @@ Here's a simple example using the existing email Notifier:
This example will send an email notification if the Dag hasn't finished 30 minutes after it was queued.

For more information on implementing and configuring Deadline Alerts, see :doc:`/howto/deadline-alerts`.


Testing a Dag
-------------

To verify the syntax and functionality of Dag code, a handy option is available.

``dag`` objects have a ``test()`` function, which can be invoked either directly or within a test suite.


Simulated Dag run
~~~~~~~~~~~~~~~~~

The simplest way to check the validity of a ``dag`` is to use the following snippet at the end of the Dag definition module:

.. code:: python

if __name__ == "__main__":
dag.test()

and execute the module as a Python script (assuming a sufficient environment is provided, e.g. ``AIRFLOW_HOME`` is set).

The ``dag.test()`` call invokes a simulated execution flow, similar to a ``LocalExecutor`` execution session,
and runs the Dag contents as in a regular Airflow submission.
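
For context, a complete minimal Dag definition module might look like the following sketch. The Dag id, task, and
import paths are illustrative assumptions and may vary between Airflow versions:

.. code:: python

    import datetime

    from airflow.providers.standard.operators.empty import EmptyOperator
    from airflow.sdk import DAG

    with DAG(
        dag_id="my_example_dag",  # hypothetical Dag id
        start_date=datetime.datetime(2024, 1, 1),
        schedule=None,
    ) as dag:
        EmptyOperator(task_id="noop")

    if __name__ == "__main__":
        # Invokes the simulated, LocalExecutor-like execution flow described above.
        dag.test()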


Real execution flow
~~~~~~~~~~~~~~~~~~~

If you would like to test the Dag against a real Airflow Executor, the same mechanism can be used. When the call is
made with the ``use_executor`` flag, the Executor from the currently applied Airflow Configuration is invoked to run
the workloads of the Dag.

.. code:: python

dag.test(use_executor=True)

When using the call within a pytest test suite, you may benefit from the Airflow pytest plugin's ``conf_vars``
fixture, which allows easy alteration of pre-defined configuration values.
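
As a minimal sketch, a pytest test for a Dag could look like the following; the ``my_dag_module`` import is a
hypothetical placeholder, and the assertion assumes ``dag.test()`` returns the resulting ``DagRun`` (as in recent
Airflow versions):

.. code:: python

    from airflow.utils.state import DagRunState

    from my_dag_module import dag  # hypothetical module exposing the ``dag`` object


    def test_dag_run_succeeds():
        # Runs the Dag in the simulated execution flow and asserts its final state.
        dag_run = dag.test()
        assert dag_run.state == DagRunState.SUCCESS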
100 changes: 96 additions & 4 deletions contributing-docs/testing/system_tests.rst
@@ -39,6 +39,22 @@ The purpose of these tests is to:
- serve both as examples and test files.
- provide the excerpts that are used to generate documentation.


Usage options
-------------

The System tests can be executed in two ways: by invoking an Airflow Executor of the developer's choice, or by
simulating the executor's actions (based on generic executor behavior), focusing on the Dag execution only.

The former behavior can be particularly useful:

- to test custom Airflow Executors against a "real-life" environment
- to test custom Airflow Operators (Decorators, etc.) against a "real-life" environment

The latter can be useful when you want to concentrate on the Dag execution within the tests, avoiding interference
from additional layers of complexity.

The Airflow System tests can be very useful for automated pipelines as well as for manual execution.

Configuration
-------------

@@ -54,6 +70,41 @@ set it before running test command.

SYSTEM_TESTS_ENV_ID=SOME_RANDOM_VALUE

Executor
........

By default, Airflow System tests do not use an Airflow Executor; instead, they simulate the core behavior of a
simple, generic one (like ``LocalExecutor``).

If you would like to have an actual Executor involved, you need to set the following variable:

.. code-block:: bash

    export _AIRFLOW__SYSTEM_TEST_USE_EXECUTOR=1


Custom configuration file
.........................

.. note:: This section mostly applies when developing custom providers, outside of the Apache Airflow codebase.

The Airflow configuration system enforces specific test configuration file locations for test environments, to avoid
accidentally loading the configuration of a valid Airflow installation. There are pytest fixtures available for
Airflow Unit tests; however, they cannot be used for Airflow System tests.

One option to load a custom configuration file for your System Tests is to add something similar to the following to
``conftest.py``, at an early stage of pytest session initialization (for example in the ``pytest_configure`` function):

.. code-block:: python

    from airflow.configuration import conf

    with open(YOUR_TEST_CONFIG_FILE) as config_file:
        conf.read_file(config_file)

A custom configuration file provides an easy way to load numerous custom options at once (typically ones that
correspond to custom providers).
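
For example, a ``conftest.py`` hook could look like the following sketch; the configuration file path and the
environment variable name are hypothetical placeholders:

.. code-block:: python

    # conftest.py -- a minimal sketch for loading a custom Airflow configuration.
    import os

    from airflow.configuration import conf


    def pytest_configure(config):
        # Load the custom configuration early in the pytest session,
        # before tests import Dags or read configuration values.
        config_path = os.environ.get("MY_SYSTEM_TEST_CONFIG", "my_system_tests.cfg")
        with open(config_path) as config_file:
            conf.read_file(config_file)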


Running the System Tests
------------------------

@@ -72,11 +123,16 @@ your Airflow instance as Dags and they will be automatically triggered. If the s
how to set up the environment is documented in each provider's system tests directory. Make sure that all resources
required by the tests are also imported.

Running via Pytest + Breeze
...........................

Running system tests with pytest is easiest with `Breeze <https://github.com/apache/airflow/blob/main/dev/breeze/doc/README.rst>`_.
Breeze makes sure that the environment is pre-configured and that all additionally required services are started,
relieving the developer of the overhead of test environment setup.

Running manually via Pytest
...........................

You can also run the system tests manually, without Breeze, provided your environment is set up to execute them.
You can either run them using your IDE (if you have a plugin/widget supporting pytest installed) or using commands
like the following examples:

@@ -92,6 +148,42 @@ For providers:

pytest --system providers/google/tests/system/google/cloud/bigquery/example_bigquery_queries.py
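
Putting the pieces together, a manual run against a real Executor could look like this:

.. code-block:: bash

    # Use the Executor from the current Airflow Configuration instead of the simulation.
    export _AIRFLOW__SYSTEM_TEST_USE_EXECUTOR=1
    pytest --system providers/google/tests/system/google/cloud/bigquery/example_bigquery_queries.py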

NOTE: If using an actual Executor, you may need to make sure that the Airflow API server is running as well.
(To benefit from the default configuration, it should be run locally.)

.. code-block:: bash

uv run airflow api-server

Running manually via Pytest outside of the Airflow codebase
...........................................................

.. note:: This section mostly applies when developing custom providers, outside of the Apache Airflow codebase.

1. Airflow Plugin

You will need to use the Airflow pytest plugin, which is available as part of the ``apache-airflow-devel-common``
package. Installing the package is a good choice, as it makes the plugin and its related dependencies easily
available. However, the plugin is generally used from within the Airflow codebase (not as a package), so you may face
issues using it on its own, outside of the Airflow codebase. In this case you can clone the Airflow git repository
and set the following environment variable to point to that location. This should ensure a safe and stable use of
the pytest plugin.

.. code-block:: bash

export AIRFLOW_SOURCES=<CLONE_OF_THE_AIRFLOW_GIT_REPO>

2. Airflow API server

If you want to run tests against an Airflow Executor, you will need to have the Airflow API server available.

NOTE: You have to make sure that the API server shares certain configuration with the test environment. This is
particularly important so that the Airflow Task SDK can communicate with the API server.

(For example, the same JWT secret must be used on both sides for token encryption/decryption. By default, the Airflow
test environment uses temporary JWT secrets generated on the fly. If you want to keep control over these settings,
the best solution is to follow the instructions from the :ref:`Custom configuration file` section.)
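
For illustration, the shared secret could be pinned on both sides via an environment variable; the exact option name
is an assumption here and may differ between Airflow versions:

.. code-block:: bash

    # Assumption: the configuration option/environment variable name may differ
    # between Airflow versions; check your version's configuration reference.
    export AIRFLOW__API_AUTH__JWT_SECRET="a-shared-secret-for-tests-only"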


Running via Breeze
..................