Closed
Labels
kind:feature (Feature Requests)
Description
catchup and backfill provide good flexibility for running DAGs over past intervals. The one concern is that they can produce a large number of DagRuns, which can lead to performance problems such as exhausted slots and long completion times.
Assume there is a DAG with a SqlOperator that runs every 15 minutes and uses this SQL template:
SELECT * FROM table WHERE created_at BETWEEN {{ prev_data_interval_start_success }} AND {{ ts }}
If the scheduler is interrupted for, say, 1 day, the catchup process would create 96 DagRuns.
Or assume the DAG is backfilled like this:
airflow backfill DAG --start-date=today --end-date=prev_week
That would create 672 DagRuns!
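The two counts above follow from dividing the length of the gap by the schedule interval. A minimal sketch of that arithmetic, using only the standard library; `expected_dag_runs` is a hypothetical helper name, not an Airflow API:

```python
from datetime import timedelta


def expected_dag_runs(gap: timedelta, schedule_interval: timedelta) -> int:
    """Return how many whole schedule intervals fit into the gap.

    Hypothetical helper for illustration only; it is not part of Airflow.
    """
    return gap // schedule_interval


interval = timedelta(minutes=15)

# One day of scheduler downtime, caught up at a 15-minute schedule.
print(expected_dag_runs(timedelta(days=1), interval))  # 96

# One week backfilled at the same schedule.
print(expected_dag_runs(timedelta(days=7), interval))  # 672
```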
A simple feature that runs only one DagRun and fills these parameters with the start and end of the gap would solve this issue.
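To make the idea concrete, here is a rough sketch in plain Python that compares the windows produced today (one per schedule interval) with the single window this request proposes. It does not use any Airflow API; `per_interval_windows`, `collapsed_window`, and `render_query` are hypothetical names, and the dates are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def per_interval_windows(gap_start, gap_end, interval):
    """Current behaviour (simplified): one window per schedule interval."""
    windows = []
    start = gap_start
    while start + interval <= gap_end:
        windows.append((start, start + interval))
        start += interval
    return windows


def collapsed_window(gap_start, gap_end):
    """Proposed behaviour (assumption): one window spanning the whole gap."""
    return [(gap_start, gap_end)]


def render_query(start, end):
    """Fill the SQL template from the issue with explicit window bounds."""
    return (
        "SELECT * FROM table WHERE created_at "
        f"BETWEEN '{start.isoformat()}' AND '{end.isoformat()}'"
    )


# Illustrative one-day gap at a 15-minute schedule.
gap_start = datetime(2024, 1, 1, tzinfo=timezone.utc)
gap_end = gap_start + timedelta(days=1)
interval = timedelta(minutes=15)

many = per_interval_windows(gap_start, gap_end, interval)
one = collapsed_window(gap_start, gap_end)

print(len(many), len(one))
print(render_query(*one[0]))
```

Both variants cover the same time range; the collapsed one simply issues a single query over the whole gap, which only works for tasks whose logic depends on the window bounds rather than on the number of runs.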
Use case/motivation
- reduce the number of DagRuns created by catchup and backfill
- decrease the completion time of backfill and catchup
Related issues
No response
Are you willing to submit a PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct