-
Notifications
You must be signed in to change notification settings - Fork 13.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AIRFLOW-2923][AIRFLOW-1784] Implement LatestOnlyOperator as BaseBranchOperator #5970
Conversation
4b5fe80
to
78184b8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation looks good to me but can you also update the docs accordingly? :)
Your changes on the one hand are fixing the issues but on the other hand are changing the expected behavior related to the docs
task3 is downstream of task1 and task2 and because of the default trigger_rule being all_success will receive a cascaded skip from task1. task4 is downstream of task1 and task2. It will be first skipped directly by LatestOnlyOperator, even its trigger_rule is set to all_done.
(https://airflow.apache.org/concepts.html#latest-run-only)
In both cases it should skip downstream tasks regardless of its trigger_rules.
Can you also fix its pylint issues if there are any. It is currently not being checked by pylint due to pylint_todo.txt
. Do you mind removing the related lines in there (latest_only_operator
and test_latest_only_operator
) and fix issues if there are any? :) Thanks.
I think it would also be nice if we have some documentation about the expected behavior of dagruns executed with |
@m1racoli are you working on this PR? |
Hey @OmerJog, |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
The LatestOnlyOperator is indeed inconsistent with how other operators handle the tasks downstream i'm really looking forward for it to be fixed |
78184b8
to
24fbf5d
Compare
24fbf5d
to
f224d38
Compare
f224d38
to
ceaf360
Compare
Documentation updated and pylint issues fixed. |
Codecov Report
@@ Coverage Diff @@
## master #5970 +/- ##
==========================================
+ Coverage 85.34% 85.34% +<.01%
==========================================
Files 791 791
Lines 40128 40127 -1
==========================================
+ Hits 34247 34248 +1
+ Misses 5881 5879 -2
Continue to review full report at Codecov.
|
@m1racoli do you think we should add a note to the Updating.md so that people are aware that now the trigger rules will be checked correctly/differently ? |
``task2``. It will be first skipped directly by ``LatestOnlyOperator``, | ||
even its ``trigger_rule`` is set to ``all_done``. | ||
``task2``, but it will not be skipped, since its ``trigger_rule`` is set to | ||
``all_done``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should also be added to the Updating.md (see my comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is something users should be aware of when updating Airflow. Don't you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point and thanks! Will update Updating.md.
3dfafa2
to
eba7aaf
Compare
Comments for updating have been added. |
UPDATING.md
Outdated
|
||
In previous versions the `LatestOnlyOperator` forcefully skippded all (direct and undirect) downstream tasks on it's own. From this version on the operator will **only skip direct downstream** tasks and the scheduler will handle skipping any further downstream dependencies. | ||
|
||
No change is needed, if only the default trigger rule `all_success` is beeing used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No change is needed, if only the default trigger rule `all_success` is beeing used. | |
No change is needed if only the default trigger rule `all_success` is being used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
UPDATING.md
Outdated
@@ -943,6 +943,16 @@ The following metrics are deprecated and won't be emitted in Airflow 2.0: | |||
- `dag.loading-duration.<basename>` -- use `dag_processing.last_duration.<basename>` instead | |||
- `dag_processing.last_runtime.<basename>` -- use `dag_processing.last_duration.<basename>` instead | |||
|
|||
### Changes to skipping behaviour of LatestOnlyOperator | |||
|
|||
In previous versions the `LatestOnlyOperator` forcefully skippded all (direct and undirect) downstream tasks on it's own. From this version on the operator will **only skip direct downstream** tasks and the scheduler will handle skipping any further downstream dependencies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In previous versions the `LatestOnlyOperator` forcefully skippded all (direct and undirect) downstream tasks on it's own. From this version on the operator will **only skip direct downstream** tasks and the scheduler will handle skipping any further downstream dependencies. | |
In previous versions, the `LatestOnlyOperator` forcefully skipped all (direct and undirect) downstream tasks on its own. From this version on the operator will **only skip direct downstream** tasks and the scheduler will handle skipping any further downstream dependencies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
…chOperator LatestOnlyOperator is a special case of a BranchOperator, thus it should inherit from it. This fixes an issue where the skipping behaviour of LatestOnlyOperator is inconsistent with other operators, by forcefully skipping all downstream tasks recursively ignoring trigger rules.
eba7aaf
to
d600f6e
Compare
Good work @m1racoli |
…chOperator (apache#5970) LatestOnlyOperator is a special case of a BranchOperator, thus it should inherit from it. This fixes an issue where the skipping behaviour of LatestOnlyOperator is inconsistent with other operators, by forcefully skipping all downstream tasks recursively ignoring trigger rules.
LatestOnlyOperator is a special case of a BranchOperator, thus it should inherit from it.
This fixes an issue where the skipping behaviour of LatestOnlyOperator is inconsistent with other operators,
by forcefully skipping all downstream tasks recursively ignoring trigger rules.
Make sure you have checked all steps below.
Jira
Description
LatestOnlyOperator is a special case of a BranchOperator, thus it should inherit from it.
This fixes an issue where the skipping behaviour of LatestOnlyOperator is inconsistent with other operators by forcefully skipping all downstream tasks recursively, ignoring trigger rules.
Tests
Extended
tests.operators.test_latest_only_operator.py
to cover downstream children with trigger rules. Furthermore fixed the test to not skip task in externally triggered DagRuns.Commits
Documentation
Code Quality
flake8