Add scheduler by Dataset and/or interval #43518

@FelipeArisi

Description

Recent Airflow releases support dataset scheduling with conditional expressions (https://airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/datasets.html#advanced-dataset-scheduling-with-conditional-expressions).

The idea is to allow a time-based (cron) schedule in these expressions as well. For example: schedule=['0 0 1 * *' | Dataset(path)]

Use case/motivation

Some DAGs are triggered by many Datasets. If one Dataset is late or its producer fails, the entire pipeline is blocked waiting for it. The idea is to add a time condition so the DAG starts anyway.

Another case: even when all the Datasets are ready, the pipeline should only run after a certain time.
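To make the two cases concrete, here is a minimal plain-Python sketch of the decision logic being asked for. It is not Airflow code and does not reflect how the scheduler is implemented; `should_run`, `datasets_ready`, `run_at`, and `mode` are hypothetical names used only for illustration.

```python
from datetime import datetime


def should_run(datasets_ready, now, run_at, mode="or"):
    """Decide whether a DAG run should start (illustrative only).

    mode="or":  start when every Dataset is ready OR the time is reached.
    mode="and": start only when every Dataset is ready AND the time is reached.
    """
    all_ready = all(datasets_ready)
    time_reached = now >= run_at
    if mode == "or":
        return all_ready or time_reached
    if mode == "and":
        return all_ready and time_reached
    raise ValueError(f"unknown mode: {mode!r}")


deadline = datetime(2024, 11, 1, 0, 0)

# First case: one Dataset is late, but the deadline has passed, so "or" starts the run.
late_start = should_run([True, False], datetime(2024, 11, 1, 0, 5), deadline, mode="or")

# Second case: all Datasets are ready, but it is too early, so "and" keeps waiting.
early_start = should_run([True, True], datetime(2024, 10, 31, 23, 0), deadline, mode="and")
```

Here `late_start` is True and `early_start` is False. The "or" mode covers the late or failed Dataset case, and the "and" mode covers the case where the run must wait for a certain time even though the data is ready.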

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct
