-
Notifications
You must be signed in to change notification settings - Fork 16.3k
[AIRFLOW-936] Add clear/mark success for DAG in the UI #2339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@allisonwang, thanks for your PR! By analyzing the history of the files in this pull request, we identified @mistercrunch, @bolkedebruin and @patrickleotardif to be potential reviewers. |
Codecov Report
@@ Coverage Diff @@
## master #2339 +/- ##
==========================================
- Coverage 69.5% 69.27% -0.23%
==========================================
Files 145 145
Lines 11066 11121 +55
==========================================
+ Hits 7691 7704 +13
- Misses 3375 3417 +42
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add else and add logging say cannot find dag_run in DB or find more than one dag_run given dag_id and execution_date in DB?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There can't be more than one dag_run
The schema for dag_run has
PRIMARY KEY (`id`), UNIQUE KEY `dag_id` (`dag_id`,`execution_date`), UNIQUE KEY `dag_id_2` (`dag_id`,`run_id`),
So it should be either 0 or 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think more checks are good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know! I will make a note in else branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like it's just code bloat this point given it's impossible to get multiple results. I think the better thing to do is just loop over the results
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unrelated. Why add this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bolkedebruin This is for appending results from this function. Without expunge sqlalchemy will through UnboundedExceptionError since the session is closed before function returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are not setting a dag state but a dagrun state please rename.
|
I haven't really dived in, but does it make sense that clear and success are separate endpoints? |
tests/api/common/mark_tasks.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above. You are testing dagrun. So can rename tests.
dc7268b to
f6d0aa6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This import appears unused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This import also appears unused
|
It seems that if clear_dagrun can be part of clear, dag_success could be part of success |
|
And maybe it should be dagrun_success |
|
@saguziel I think dagrun/task clearing should be separate functions, having them in the same function is the equivalent to having 15 different classes and then having a function that switches off of the class type, instead of just having 15 different functions to call in the first place. |
airflow/www/views.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: kill whitespace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name of this method seems to imply that commit should always be true. Maybe there should be another method that gets a list of task instances that would be modified without altering them?
airflow/www/views.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be a different endpoint for dagruns/tasks, although they could reuse most of the same logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be able to call clear_task from clear_dagrun, likewise for success
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Edit"
"Clear"
and
"Mark Success"
are fine for the button names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: replace "dag" with "dagrun"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: edit_dagrun
|
@aoen That is a pattern though. We save some code duplication, and we don't have 15 classes. Especially since we aren't actually clearing the dag_run, but rather the tasks that belong to it. |
airflow/www/views.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do a top level import instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, put this in models.py. It's okay to have the stub here but the logic of setting the dagrun to the state should be in models
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@saguziel There is a function called set_dag_run_state in models.py, but the logics are quite different. set_dag_run_state here basically called set_state above which is designed for mark success functionality and considers a lot more edge cases. But definitely the API functions need refactoring in the future to be consistent with other parts of the codebase.
|
Spoke with @saguziel offline, we are in agreement that dagrun/task clearing should be separate functions that call some common functions. |
ea4d82e to
70c0c9d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like these were accidental removals
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aoen Hmm maybe my editor (Atom) removed the white space automatically but the Apache license content should be exactly the same.
airflow/www/views.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section can be factored out with the /clear endpoint code.
also I think the recursive flag needs to be supported here too for the subdag operator (same as /clear).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aoen clear here default to include subdags https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L3149 but adding it definitely makes it more readable :) Will refactor out the logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand we should allow the option to specify recursive the same way it can be specified for task-clearing (for the subdag operator), i.e. I think it should be an option in the modal.
airflow/www/views.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's better to use keyword arguments like recursive=True here (that way it's clear what each argument is). The recursive value should not always be True but have it's value derived from the recursive button in the model, same as for individual task instance clearing. LGTM after that though!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aoen Right we can make the clear DAG run UI more consistent with clear tasks, i.e. support past / future, recursive/ non-recursive, etc. Here I feel letting recursive always be true make sense otherwise clear DAG run will clear all operator task instances except for subdag which could be confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe in version 2 :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think past/future are a bit different than non-recursive (which set of dagruns are affected vs how affected tasks are changed), but I am OK leaving the recursive option as a TODO (it's just not that great that clearing dagruns doesn't have functional parity with clearing task instances).
|
LGTM |
Dear Airflow maintainers,
Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
JIRA
Description
This PR adds a modal popup when clicking circle DAG icon in Airflow tree view UI. It adds the functionalities to clear/mark success of the entire DAG run. This behavior is equivalent to individually clear/mark each task instance in the DAG run. The original logic of editing DAG run page is moved to the button "Edit DAG Run".
Tests
Tests are added under
test/api/common/mark_tasks.pyto test marking entire DAG as success. There is no test for clear the DAG since it is directly implemented inviews.py.Commits
@aoen @saguziel