Scheduler can't create DAG runs if some were externally triggered #10779

@baolsen

Apache Airflow version: 1.10.8

Environment:

  • Cloud provider or hardware configuration: 4 VCPU 8GB RAM VM
  • OS (e.g. from /etc/os-release): RHEL 7.7
  • Kernel (e.g. uname -a): Linux 3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
  • Others:

What happened:

The Airflow scheduler is unable to create any DAG Runs for a DAG if an externally triggered DAG Run already exists for the execution date that the scheduler would create next. (This blocks the creation of all future DAG Runs as well.)

This happens if a DAG is sometimes triggered externally and sometimes scheduled by Airflow, for example when migrating from an external scheduler to the Airflow scheduler.

What you expected to happen:

I think the Airflow scheduler should skip over any externally triggered DAG Run whose execution date matches the one it is about to create, and it should not produce an error. It should still be able to create the future DAG Runs (the ones not yet created by the external process).
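The expected behaviour described above can be sketched with a small stand-alone model. This is not Airflow code and not a proposed patch: the `dag_run` table here is a simplified stand-in with only three columns, and `create_next_run` is a hypothetical helper that assumes a fixed daily interval. It only illustrates the idea of treating any existing run, externally triggered or not, as already occupying its execution date.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# Simplified stand-in for the dag_run table (assumption: only the columns
# relevant to this issue, with the same uniqueness rule as in the error log).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dag_run (
        dag_id TEXT NOT NULL,
        execution_date TEXT NOT NULL,
        external_trigger INTEGER NOT NULL,
        UNIQUE (dag_id, execution_date)
    )
    """
)

# An externally triggered run already exists for the first schedule slot.
conn.execute(
    "INSERT INTO dag_run VALUES (?, ?, ?)",
    ("some_dag", "2020-08-06 00:00:00", 1),
)


def create_next_run(conn, dag_id, start_date, interval=timedelta(days=1)):
    """Hypothetical helper: insert a scheduled run at the first free slot.

    Every existing run counts as occupying its execution date, regardless
    of the external_trigger flag, so occupied slots are skipped.
    """
    candidate = datetime.strptime(start_date, FMT)
    while True:
        key = candidate.strftime(FMT)
        taken = conn.execute(
            "SELECT 1 FROM dag_run WHERE dag_id = ? AND execution_date = ?",
            (dag_id, key),
        ).fetchone()
        if taken is None:
            conn.execute(
                "INSERT INTO dag_run VALUES (?, ?, 0)", (dag_id, key)
            )
            return key
        candidate += interval


first = create_next_run(conn, "some_dag", "2020-08-06 00:00:00")
second = create_next_run(conn, "some_dag", "2020-08-06 00:00:00")
print(first)   # 2020-08-07 00:00:00
print(second)  # 2020-08-08 00:00:00
```

In this model the externally created run for 2020-08-06 is left untouched and the later dates are filled in without any constraint violation.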

The cause seems to be these 2 lines:

external_trigger=False,

if not dag_run.external_trigger
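To make the suspected mechanism concrete, here is a minimal stand-alone sketch of how such a filter could lead to the duplicate-key error. It is a simplified model, not the actual scheduler code: the table has only three columns, and `last_scheduled_run` and `create_scheduled_run` are hypothetical helpers that assume a daily schedule. The point is only that a lookup which ignores externally triggered runs can pick an execution date that the unique constraint then rejects.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# Simplified stand-in for dag_run with a unique key on
# (dag_id, execution_date), as named in the error message.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dag_run (
        dag_id TEXT NOT NULL,
        execution_date TEXT NOT NULL,
        external_trigger INTEGER NOT NULL,
        UNIQUE (dag_id, execution_date)
    )
    """
)

# A manually created run, flagged as externally triggered.
conn.execute(
    "INSERT INTO dag_run VALUES (?, ?, ?)",
    ("some_dag", "2020-08-06 00:00:00", 1),
)


def last_scheduled_run(conn, dag_id):
    """Latest execution date among runs NOT flagged as external."""
    row = conn.execute(
        "SELECT MAX(execution_date) FROM dag_run "
        "WHERE dag_id = ? AND external_trigger = 0",
        (dag_id,),
    ).fetchone()
    return row[0]  # None when no scheduler-created run exists


def create_scheduled_run(conn, dag_id, start_date):
    """Hypothetical helper: pick the next daily slot and insert it."""
    last = last_scheduled_run(conn, dag_id)
    if last is None:
        execution_date = start_date  # the external run is invisible here
    else:
        nxt = datetime.strptime(last, FMT) + timedelta(days=1)
        execution_date = nxt.strftime(FMT)
    conn.execute(
        "INSERT INTO dag_run VALUES (?, ?, 0)", (dag_id, execution_date)
    )
    return execution_date


try:
    create_scheduled_run(conn, "some_dag", "2020-08-06 00:00:00")
    outcome = "created"
except sqlite3.IntegrityError:
    outcome = "IntegrityError"

print(outcome)  # IntegrityError
```

Because the lookup never sees the external run, every retry in this model picks the same execution date and fails the same way, which would match the observation that no future runs are created either.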

How to reproduce it:

  1. Create any DAG with Schedule = None and a start_date a few days ago, e.g. 2020-08-06.
  2. Create an externally triggered DAG Run using Browse -> DAG Runs -> Create, for an execution date which aligns with the start date and midnight, e.g. 2020-08-06 00:00:00.
  3. Change the DAG from Schedule = None to Schedule = @daily (for example).
  4. The scheduler will be unable to create DAG Runs because of a duplicate key on dag_id + execution_date, which collides with the externally created DAG Run.

Example from the scheduler logs for the associated DAG:
sqlalchemy.exc.IntegrityError: (pyodbc.IntegrityError) ('23000', "[23000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Violation of UNIQUE KEY constraint 'UQ__dag_run__F78A98990F629538'. Cannot insert duplicate key in object 'dbo.dag_run'. The duplicate key value is (some_dag, 2020-08-06 00:00:00.000000). (2627) (SQLExecDirectW)")

  5. As a workaround, we can modify the DAG Run so that it no longer has the "externally triggered" flag, using the Browse -> DAG Runs UI.
    The Airflow scheduler is then able to detect the externally created DAG Run, skip over it, and create new DAG Runs.

Anything else we need to know:

I'd like to understand if there is a good reason for filtering out externally triggered DAG runs when counting the active DAGs in the scheduler code (linked above). It seems to be there intentionally. I looked in the git blame to try and understand, but it has been like that for 4+ years, so I couldn't find out. I hope someone familiar with the scheduler can comment.

Labels: area:Scheduler, kind:bug, pending-response, stale
