Serialized DAG fileloc can remain stale when DAG content hash is unchanged, causing callback/history issues with versioned DAG bundles #66301

@hkc-8010

Apache Airflow version

3.2.0

What happened?

We hit a case where a DAG was being actively parsed from the current bundle path, but the serialized DAG metadata still pointed to an older bundle path.

In our case, the DAG-level on_failure_callback failed until we manually refreshed the serialized DAG; it started working again after we deleted the serialized row and let Airflow recreate it.

We are still unsure of the exact internal callback execution path, but the stale serialized metadata was definitely present, and refreshing it correlated directly with the fix.

We also saw repeated API/UI-side errors for the same DAG version while this was happening:

2026-05-01T17:58:24.234995Z [error    ] No serialized dag found        [airflow.api_fastapi.core_api.routes.ui.grid] dag_id=fail_check loc=grid.py:113 request_id=8557f53b-6c9a-4482-ba94-bc70c3623e36 version_id=UUID('019db6f1-958c-7317-98c2-472393931510') version_number=1
2026-05-01T17:58:24.233481Z [error    ] No serialized dag found        [airflow.api_fastapi.core_api.routes.ui.grid] dag_id=fail_check loc=grid.py:113 request_id=8557f53b-6c9a-4482-ba94-bc70c3623e36 version_id=UUID('019db6f1-958c-7317-98c2-472393931510') version_number=1
2026-05-01T17:58:24.071765Z [error    ] No serialized dag found        [airflow.api_fastapi.core_api.routes.ui.grid] dag_id=fail_check loc=grid.py:113 request_id=e21b1ab4-042d-4b77-b2c6-36bd4e05e8cc version_id=UUID('019db6f1-958c-7317-98c2-472393931510') version_number=1
2026-05-01T17:58:24.069796Z [error    ] No serialized dag found        [airflow.api_fastapi.core_api.routes.ui.grid] dag_id=fail_check loc=grid.py:113 request_id=e21b1ab4-042d-4b77-b2c6-36bd4e05e8cc version_id=UUID('019db6f1-958c-7317-98c2-472393931510') version_number=1
2026-05-01T17:57:01.513703Z [error    ] No serialized dag found        [airflow.api_fastapi.core_api.routes.ui.grid] dag_id=fail_check loc=grid.py:113 request_id=ece1848c-2ea4-4355-9757-440b8eb725fd version_id=UUID('019db6f1-958c-7317-98c2-472393931510') version_number=1
2026-05-01T17:57:01.511211Z [error    ] No serialized dag found        [airflow.api_fastapi.core_api.routes.ui.grid] dag_id=fail_check loc=grid.py:113 request_id=ece1848c-2ea4-4355-9757-440b8eb725fd version_id=UUID('019db6f1-958c-7317-98c2-472393931510') version_number=1
2026-05-01T17:56:44.651543Z [error    ] No serialized dag found        [airflow.api_fastapi.core_api.routes.ui.grid] dag_id=fail_check loc=grid.py:113 request_id=6cbfde23-d947-45c9-a95c-5d071e00ff09 version_id=UUID('019db6f1-958c-7317-98c2-472393931510') version_number=1
2026-05-01T17:56:44.650352Z [error    ] No serialized dag found        [airflow.api_fastapi.core_api.routes.ui.grid] dag_id=fail_check loc=grid.py:113 request_id=6cbfde23-d947-45c9-a95c-5d071e00ff09 version_id=UUID('019db6f1-958c-7317-98c2-472393931510') version_number=1

What you think should happen instead?

If a DAG's fileloc changes, the serialized DAG row should be updated to reflect the new path even when the serialized DAG content hash is unchanged.

At minimum, fileloc should not stay stale indefinitely when the DAG is being actively reparsed from a new location.
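As a rough illustration of the expected behavior, the rewrite decision could take fileloc into account alongside the content hash. This is a minimal standalone sketch, not Airflow code: SerializedRow, needs_rewrite, and the bundle paths are hypothetical names made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SerializedRow:
    """Hypothetical stand-in for a serialized DAG row (not the Airflow model)."""

    dag_id: str
    dag_hash: str
    fileloc: str


def needs_rewrite(existing: Optional[SerializedRow], new: SerializedRow) -> bool:
    """Return True when the stored row should be rewritten.

    A row is rewritten if it does not exist yet, if the content hash changed,
    or if only the file location changed.
    """
    if existing is None:
        return True
    return existing.dag_hash != new.dag_hash or existing.fileloc != new.fileloc


# Same content hash, different (made-up) bundle path.
old_row = SerializedRow("fail_check", "abc123", "/bundles/v1/dags/fail_check.py")
new_row = SerializedRow("fail_check", "abc123", "/bundles/v2/dags/fail_check.py")
print(needs_rewrite(old_row, new_row))
```

Whether a real fix should rewrite the whole row or only update the path column is an open design question that this sketch does not settle.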

How to reproduce

We do not yet have a minimal standalone repro, but this is the observed pattern:

  1. Run Airflow with versioned or changing DAG bundle paths.
  2. Keep the DAG definition effectively unchanged so the serialized DAG hash remains the same.
  3. Move the same DAG to a new bundle path.
  4. Observe that DagModel.fileloc and current parsing reflect the new path, but SerializedDagModel may still point to the old path.
  5. DAG-level callback behavior and some UI/version-history lookups may then behave incorrectly.

Operating System

Linux

Versions of Apache Airflow Providers

Not central to the issue

Deployment

Custom / Astro Hosted runtime based on Airflow 3.2.0

Deployment details

Observed on Airflow 3.2.0 with versioned DAG bundle paths.

Anything else?

A few concrete observations from the incident:

  • DagModel.fileloc was current and pointed at the active bundle path.
  • DagModel.last_parsed_time was current.
  • SerializedDagModel.last_updated was still old.
  • SerializedDagModel still pointed at the old bundle path.
  • The old bundle path no longer existed on the current pod.
  • DAG-level failure callbacks did not appear to fire for failed runs before refresh.
  • After deleting the serialized DAG row and letting Airflow recreate it, the serialized fileloc updated to the current path and the next customer test worked.
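The mismatch described in these observations could be detected by comparing the two fileloc values per DAG. The sketch below works on plain dictionaries that would have to be filled from the metadata database by some other means; find_stale_filelocs and the example paths are hypothetical, and no Airflow API is used or assumed.

```python
from typing import Dict, List


def find_stale_filelocs(
    dag_filelocs: Dict[str, str],
    serialized_filelocs: Dict[str, str],
) -> List[str]:
    """Return the sorted dag_ids whose serialized fileloc differs from the DAG's fileloc.

    Both arguments map dag_id -> fileloc. DAGs without a serialized row are
    ignored here, because a missing row is a different problem from a stale one.
    """
    return sorted(
        dag_id
        for dag_id, fileloc in dag_filelocs.items()
        if dag_id in serialized_filelocs and serialized_filelocs[dag_id] != fileloc
    )


# Made-up example data: one DAG has drifted, one is consistent.
dag_side = {
    "fail_check": "/bundles/v2/dags/fail_check.py",
    "other_dag": "/bundles/v2/dags/other_dag.py",
}
serialized_side = {
    "fail_check": "/bundles/v1/dags/fail_check.py",
    "other_dag": "/bundles/v2/dags/other_dag.py",
}
print(find_stale_filelocs(dag_side, serialized_side))
```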

The serialization logic appears relevant here. In SerializedDagModel.write_dag, the write is skipped when dag_hash and processor_subdir are unchanged:

if (
    serialized_dag_db is not None
    and serialized_dag_db.dag_hash == new_serialized_dag.dag_hash
    and serialized_dag_db.processor_subdir == new_serialized_dag.processor_subdir
):
    log.debug("Serialized DAG (%s) is unchanged. Skipping writing to DB", dag.dag_id)
    return False

But fileloc is set separately from dag.fileloc when building the serialized row:

self.fileloc = dag.fileloc
self.fileloc_hash = DagCode.dag_fileloc_hash(self.fileloc)

So if only fileloc changes while the serialized DAG content hash stays the same, the row may never be rewritten.
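The effect can be illustrated with a toy model of the skip check. This is a simplified sketch that mirrors only the condition quoted above: Row, the in-memory store, and the paths are hypothetical, and the real write_dag does considerably more.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Row:
    """Hypothetical, simplified serialized DAG row."""

    dag_hash: str
    processor_subdir: str
    fileloc: str


def write_dag(store: Dict[str, Row], dag_id: str, new: Row) -> bool:
    """Toy version of the skip check; returns True if a write happened."""
    existing = store.get(dag_id)
    if (
        existing is not None
        and existing.dag_hash == new.dag_hash
        and existing.processor_subdir == new.processor_subdir
    ):
        # Skipped: fileloc is not part of the comparison, so the stored
        # row keeps whatever path it was first written with.
        return False
    store[dag_id] = new
    return True


store: Dict[str, Row] = {}
# First parse from the old (made-up) bundle path.
write_dag(store, "fail_check", Row("h1", "dags", "/bundles/v1/fail_check.py"))
# Same content parsed again from a new bundle path.
wrote = write_dag(store, "fail_check", Row("h1", "dags", "/bundles/v2/fail_check.py"))
print(wrote, store["fail_check"].fileloc)  # False /bundles/v1/fail_check.py
```

In this model the second call returns False and the stored fileloc still points at the old path, which matches the stale state we observed.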

Are you willing to submit PR?

Yes

Labels

area:core, area:serialization, kind:bug, priority:high
