Adds Docs to compare SubDAGs and TaskGroups #12741
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
than the other. The SubDagOperator launches a DAG as a separate entity from the original graph. This design pattern
offers flexibility to create SubDAGs with different schedulers and executors at the cost of greater complexity and
maintenance burden. TaskGroups create a UI grouping concept on the same original DAG, which simplifies logic and
maintenance in exchange for less flexibility.
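To make the contrast in the quoted paragraph concrete, here is a minimal toy model in plain Python. It does not use Airflow at all: `ToyDag`, `build_with_groups`, and `build_with_subdag` are hypothetical names invented for illustration. The only Airflow conventions it mirrors are that a TaskGroup prefixes its tasks' ids with the group id by default, and that a SubDAG's `dag_id` takes the form `parent.child`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyDag:
    """Hypothetical stand-in for a DAG: an id plus a flat list of task ids."""

    dag_id: str
    task_ids: List[str] = field(default_factory=list)

    def add_task(self, task_id: str, group: Optional[str] = None) -> str:
        # A group only prefixes the task id; the task still belongs to this DAG.
        full_id = f"{group}.{task_id}" if group else task_id
        self.task_ids.append(full_id)
        return full_id


def build_with_groups() -> List[ToyDag]:
    """TaskGroup style: one DAG, grouped tasks live in the same graph."""
    dag = ToyDag("parent")
    dag.add_task("start")
    dag.add_task("extract", group="section_1")
    dag.add_task("load", group="section_1")
    dag.add_task("end")
    return [dag]


def build_with_subdag() -> List[ToyDag]:
    """SubDAG style: the parent holds one placeholder task, the child is its own DAG."""
    parent = ToyDag("parent")
    parent.add_task("start")
    parent.add_task("section_1")  # single task standing in for the whole child DAG
    parent.add_task("end")

    child = ToyDag("parent.section_1")
    child.add_task("extract")
    child.add_task("load")
    return [parent, child]


if __name__ == "__main__":
    for dag in build_with_groups() + build_with_subdag():
        print(dag.dag_id, dag.task_ids)
```

In this sketch the grouped version yields a single DAG to schedule and maintain, while the SubDAG version yields two, which is the source of both the extra flexibility and the extra maintenance burden described above.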
How can users schedule SubDAGs using a different scheduler or executor?
The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.
I'd recommend stronger language against subdags. I know @mistercrunch has an opinion on this that we could add.
Should we complete the review before 2.0.0rc1 tomorrow? @ryw @VeenaArv @turbaszek -> this one needs some love I think.
+----------------------+----------------------+
| Honors all pool      | Does not honor pool  |
| configurations       | configurations       |
+----------------------+----------------------+
This is very helpful, but one thing is still not clear to me from these docs: since SubDAGs can have different schedules, how does a SubDAG's execution trigger or block executions of tasks further down in the parent DAG?
For example, if a SubDAG is scheduled to execute every 1 hour, and the parent DAG is scheduled every 20 minutes, will the SubDAG be executed every 20 minutes?
Or if the schedules were reversed (SubDAG every 20 minutes and parent DAG every hour), how do the SubDAG's multiple executions in that hour factor into the parent DAG's eventual execution?
There are two parts here:
- Parent DAG "SubDAG task": This is the link between the parent and the SubDAG. It runs a "Sensor" underneath. I think you can even specify the poking interval.
- SubDAG "DAG": Can be added to the global scope to make it available in the main screen, have different schedule intervals... just like a regular DAG. The only limitation is the name (AFAIK).
SubDAG could easily be renamed as ExternalDagSensor
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
Hi @VeenaArv, are you still interested in getting this merged?
closes: 12298
link: #12298
Adds a section under TaskGroup to compare it to SubDAGs.