2 changes: 2 additions & 0 deletions airflow-core/docs/best-practices.rst
@@ -296,6 +296,8 @@ When you execute that code you will see:

This means that the ``get_array`` is not executed as top-level code, but ``get_task_id`` is.

.. _best_practices/code_quality_and_linting:

Code Quality and Linting
------------------------

163 changes: 163 additions & 0 deletions airflow-core/docs/faq.rst
@@ -237,6 +237,169 @@ There are several reasons why Dags might disappear from the UI. Common causes in
* **Time synchronization issues** - Ensure all nodes (database, schedulers, workers) use NTP with <1s clock drift.


.. _faq:dag-version-inflation:

Why does my Dag version keep increasing?
-----------------------------------------

Every time the Dag processor parses a Dag file, it serializes the Dag and compares the result with the
version stored in the metadata database. If anything has changed, Airflow creates a new Dag version.
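
The effect can be sketched in plain Python (``serialize_dag`` below is a hypothetical stand-in for
Airflow's real serializer, not its actual API):

.. code-block:: python

    import json
    import time
    from datetime import datetime


    def serialize_dag(start_date: datetime) -> str:
        # Hypothetical stand-in: serialize a Dag's fields to a comparable string
        return json.dumps({"dag_id": "example", "start_date": start_date.isoformat()})


    # A static start_date serializes identically on every "parse": no new version.
    static = datetime(2024, 1, 1)
    assert serialize_dag(static) == serialize_dag(static)

    # A parse-time value yields a different result on each parse, so the stored
    # version never matches and a new one is recorded every time.
    first = serialize_dag(datetime.now())
    time.sleep(0.001)
    second = serialize_dag(datetime.now())
    assert first != second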

**Dag version inflation** occurs when the version number increases indefinitely without the Dag author
making any intentional changes.

What goes wrong
"""""""""""""""

When Dag versions increase without meaningful changes:

* The metadata database accumulates unnecessary Dag version records, increasing storage and query overhead.
* The UI shows a misleading history of Dag changes, making it harder to identify real modifications.
* The scheduler and API server may consume more memory as they load and cache a growing number of Dag versions.

Common causes
"""""""""""""

Version inflation is caused by using values that change at **parse time** — that is, every time the Dag
processor evaluates the Dag file — as arguments to Dag or Task constructors. The most common patterns are:

**1. Using ``datetime.now()`` or ``pendulum.now()`` as ``start_date``:**

.. code-block:: python

    from datetime import datetime

    from airflow.sdk import DAG

    with DAG(
        dag_id="bad_example",
        # BAD: datetime.now() produces a different value on every parse
        start_date=datetime.now(),
        schedule="@daily",
    ):
        ...

Every parse produces a different ``start_date``, so the serialized Dag is always different from the
stored version.

**2. Using random values in Dag or Task arguments:**

.. code-block:: python

    import random
    from datetime import datetime

    from airflow.sdk import DAG
    from airflow.providers.standard.operators.python import PythonOperator

    with DAG(dag_id="bad_random", start_date=datetime(2024, 1, 1), schedule="@daily") as dag:
        PythonOperator(
            # BAD: random value changes every parse
            task_id=f"task_{random.randint(1, 1000)}",
            python_callable=lambda: None,
        )

**3. Assigning runtime-varying values to variables used in constructors:**

.. code-block:: python

    from datetime import datetime

    from airflow.sdk import DAG
    from airflow.providers.standard.operators.python import PythonOperator

    # BAD: the variable captures a parse-time value, then is passed to the DAG
    default_args = {"start_date": datetime.now()}

    with DAG(dag_id="bad_defaults", default_args=default_args, schedule="@daily") as dag:
        PythonOperator(task_id="my_task", python_callable=lambda: None)

Even though ``datetime.now()`` is not called directly inside the Dag constructor, it flows in through
``default_args`` and still causes a different serialized Dag on every parse.
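
The fix is to make the captured value static; a minimal corrected sketch of the same Dag
(``good_defaults`` is an illustrative ``dag_id``):

.. code-block:: python

    from datetime import datetime

    from airflow.sdk import DAG
    from airflow.providers.standard.operators.python import PythonOperator

    # GOOD: a static value serializes identically on every parse
    default_args = {"start_date": datetime(2024, 1, 1)}

    with DAG(dag_id="good_defaults", default_args=default_args, schedule="@daily") as dag:
        PythonOperator(task_id="my_task", python_callable=lambda: None)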

**4. Using environment variables or file contents that change between parses:**

.. code-block:: python

    import os
    from datetime import datetime

    from airflow.sdk import DAG
    from airflow.providers.standard.operators.bash import BashOperator

    with DAG(dag_id="bad_env", start_date=datetime(2024, 1, 1), schedule="@daily") as dag:
        BashOperator(
            task_id="echo_build",
            # BAD if BUILD_NUMBER changes on every deployment or parse
            bash_command=f"echo {os.environ.get('BUILD_NUMBER', 'unknown')}",
        )

How to avoid version inflation
""""""""""""""""""""""""""""""

* **Use fixed ``start_date`` values.** Always set ``start_date`` to a static ``datetime`` literal:

  .. code-block:: python

      import datetime

      from airflow.sdk import DAG

      with DAG(
          dag_id="good_example",
          start_date=datetime.datetime(2024, 1, 1),
          schedule="@daily",
      ):
          ...

* **Keep all Dag and Task constructor arguments deterministic.** Arguments passed to Dag and Operator
  constructors must produce the same value on every parse. Move any dynamic computation into the
  operator's ``execute()`` method or the task's callable, or use Jinja templates, which are evaluated
  at task execution time rather than parse time.
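
  For example, a timestamp needed by the task can be computed inside the callable, so it is evaluated
  at execution time (``_process_with_timestamp`` is an illustrative name, not an Airflow API):

  .. code-block:: python

      from datetime import datetime

      from airflow.sdk import DAG
      from airflow.providers.standard.operators.python import PythonOperator


      def _process_with_timestamp():
          # GOOD: computed when the task runs, not when the file is parsed
          print(f"processing at {datetime.now().isoformat()}")


      with DAG(dag_id="good_runtime_value", start_date=datetime(2024, 1, 1), schedule="@daily") as dag:
          PythonOperator(task_id="process", python_callable=_process_with_timestamp)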

* **Use Jinja templates for dynamic values:**

  .. code-block:: python

      from airflow.providers.standard.operators.bash import BashOperator

      BashOperator(
          task_id="echo_date",
          # GOOD: the template is resolved at execution time, not parse time
          bash_command="echo {{ ds }}",
      )

* **Use Airflow Variables with templates instead of top-level lookups:**

  .. code-block:: python

      from airflow.providers.standard.operators.bash import BashOperator

      BashOperator(
          task_id="echo_var",
          # GOOD: Variable is resolved at execution time via template
          bash_command="echo {{ var.value.my_variable }}",
      )

Dag version inflation detection
""""""""""""""""""""""""""""""""

Starting from Airflow 3.2, the Dag processor performs **AST-based static analysis** on every Dag file
before parsing to detect runtime-varying values in Dag and Task constructors. When a potential issue is
found, it is surfaced as a **Dag warning** visible in the UI.

You can control this behavior with the
:ref:`dag_version_inflation_check_level <config:dag_processor__dag_version_inflation_check_level>`
configuration option:

* ``off`` — Disables the check entirely. No errors or warnings are generated.
* ``warning`` (default) — Dags load normally but warnings are displayed in the UI when issues are detected.
* ``error`` — Treats detected issues as Dag import errors, preventing the Dag from loading.
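
For example, to treat detected issues as import errors, the option can be set in ``airflow.cfg``
(or via the ``AIRFLOW__DAG_PROCESSOR__DAG_VERSION_INFLATION_CHECK_LEVEL`` environment variable,
following Airflow's standard configuration conventions; the section name is taken from the
configuration reference above):

.. code-block:: ini

    [dag_processor]
    dag_version_inflation_check_level = error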

Additionally, you can catch these issues earlier in your development workflow by using the
`AIR302 <https://docs.astral.sh/ruff/rules/airflow3-dag-dynamic-value/>`_ ruff rule, which detects
dynamic values in Dag and Task constructors as part of static linting. See
:ref:`best_practices/code_quality_and_linting` for how to set up ruff with Airflow-specific rules.


Dag construction
^^^^^^^^^^^^^^^^
