
Schedule details in default_args do not get passed to DAG model #26644

@ddeepwell

Apache Airflow version

2.4.0

What happened

I defined several DAG-related settings in the default_args dictionary and expected them to be applied to the DAG itself, in addition to the individual operators within the DAG.

My major problem was with "catchup", "schedule_interval", and "start_date". I had specified values for these in "default_args", but the DAG details did not match. This led to runs being scheduled when I did not expect them to be.

What you think should happen instead

I might be misunderstanding the use of default_args, but I had expected it to be a means to specify defaults for the DAG. I expected the DAG to have the values of "catchup", "schedule_interval", and "start_date" from "default_args" rather than the defaults.

How to reproduce

Here is my DAG file

from datetime import datetime
from airflow.operators.dummy import DummyOperator
from airflow.models.dag import DAG

# Create default settings
default_args = {
    'owner': 'airflow',
    'schedule_interval': "@hourly",
    'start_date': datetime(2022, 9, 10, 10, 0, 0),
    'catchup': False,
}

# Executing Tasks and TaskGroups
with DAG(
        dag_id = "test_pipeline",
        # schedule_interval = default_args['schedule_interval'],
        # start_date = default_args['start_date'],
        # catchup = default_args['catchup'],
        default_args = default_args,
    ) as dag:

    task1 = DummyOperator(task_id = 'task1')
    task2 = DummyOperator(task_id = 'task2')

    task1 >> task2

However, the details page in the UI shows that catchup, start_date, and schedule_interval were not changed from the defaults.

[Screenshot: DAG details page in the Airflow UI]

This is likely the expected behaviour, but it is easy to assume that items in the default_args will be applied to the DAG, especially since it is passed in as an argument when the DAG is generated.
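For anyone hitting the same thing, the commented-out lines in the DAG file above point at the workaround: pass the DAG-level settings to DAG(...) directly and keep only task-level defaults in default_args. Below is a minimal sketch of that split. The helper name split_settings and the DAG_LEVEL_KEYS tuple are hypothetical names for illustration, not part of Airflow, and the key list covers only the three settings from this report.

```python
from datetime import datetime

# Assumption: the three settings from this report that need to be passed
# to the DAG constructor. This is not an exhaustive list.
DAG_LEVEL_KEYS = ("schedule_interval", "start_date", "catchup")


def split_settings(settings):
    """Split one settings dict into (dag_kwargs, default_args).

    Hypothetical helper: keys listed in DAG_LEVEL_KEYS go to the DAG
    constructor, everything else stays in default_args for the operators.
    """
    dag_kwargs = {k: settings[k] for k in DAG_LEVEL_KEYS if k in settings}
    default_args = {k: v for k, v in settings.items() if k not in DAG_LEVEL_KEYS}
    return dag_kwargs, default_args


settings = {
    "owner": "airflow",
    "schedule_interval": "@hourly",
    "start_date": datetime(2022, 9, 10, 10, 0, 0),
    "catchup": False,
}

dag_kwargs, default_args = split_settings(settings)

# With Airflow installed, the DAG from the report would then be built as:
# with DAG(dag_id="test_pipeline", default_args=default_args, **dag_kwargs) as dag:
#     ...
```

This keeps a single place to edit the values while making it explicit which ones go to the DAG and which ones are inherited by the operators.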

Operating System

MacOS 12.6 and Rocky Linux 8.6

Versions of Apache Airflow Providers

No response

Deployment

Virtualenv installation

Deployment details

No response

Anything else

I would suggest either using the values from "default_args" to populate the DAG's settings (where appropriate), or renaming "default_args" so that this misunderstanding is harder to make.
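A lighter-weight variant of this suggestion would be a check that warns when DAG-level keys show up in default_args. The sketch below only illustrates the idea in plain Python and is not existing Airflow behaviour. The function name warn_on_dag_keys and the DAG_ONLY_KEYS set are hypothetical, and "start_date" is left out of the set on purpose because it is also a valid operator argument.

```python
import warnings

# Assumption: settings from this report that only make sense on the DAG.
# "start_date" is excluded because operators accept it as well.
DAG_ONLY_KEYS = {"schedule_interval", "catchup"}


def warn_on_dag_keys(default_args):
    """Warn about DAG-only keys found in default_args and return them sorted."""
    misplaced = sorted(DAG_ONLY_KEYS & set(default_args))
    if misplaced:
        warnings.warn(
            "default_args contains DAG-level keys that will not be "
            f"applied to the DAG: {misplaced}",
            UserWarning,
            stacklevel=2,
        )
    return misplaced


misplaced = warn_on_dag_keys(
    {"owner": "airflow", "schedule_interval": "@hourly", "catchup": False}
)
```

Calling this on the default_args from the report would flag "catchup" and "schedule_interval" before the DAG is ever scheduled.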

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
