Skip to content

Conversation

@allisonwang
Copy link
Contributor

@allisonwang allisonwang commented Jun 1, 2017

Dear Airflow maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

  • My PR addresses Airflow 936 issues and references them in the PR title.

Description

  • Here are some details about my PR, including screenshots of any UI changes:

This PR adds a modal popup when clicking circle DAG icon in Airflow tree view UI. It adds the functionalities to clear/mark success of the entire DAG run. This behavior is equivalent to individually clear/mark each task instance in the DAG run. The original logic of editing DAG run page is moved to the button "Edit DAG Run".

screen shot 2017-06-07 at 10 19 59 am
screen shot 2017-06-01 at 10 42 14 am

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Tests are added under test/api/common/mark_tasks.py to test marking entire DAG as success. There is no test for clear the DAG since it is directly implemented in views.py.

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

@aoen @saguziel

@mention-bot
Copy link

@allisonwang, thanks for your PR! By analyzing the history of the files in this pull request, we identified @mistercrunch, @bolkedebruin and @patrickleotardif to be potential reviewers.

@codecov-io
Copy link

codecov-io commented Jun 1, 2017

Codecov Report

Merging #2339 into master will decrease coverage by 0.22%.
The diff coverage is 46.03%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2339      +/-   ##
==========================================
- Coverage    69.5%   69.27%   -0.23%     
==========================================
  Files         145      145              
  Lines       11066    11121      +55     
==========================================
+ Hits         7691     7704      +13     
- Misses       3375     3417      +42
Impacted Files Coverage Δ
airflow/api/common/experimental/mark_tasks.py 82.97% <13.33%> (-13.23%) ⬇️
airflow/www/views.py 67.62% <56.25%> (-1.16%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b13cd6d...61a470e. Read the comment docs.

Copy link

@amaliujia amaliujia Jun 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add else and add logging say cannot find dag_run in DB or find more than one dag_run given dag_id and execution_date in DB?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can't be more than one dag_run
The schema for dag_run has
PRIMARY KEY (`id`), UNIQUE KEY `dag_id` (`dag_id`,`execution_date`), UNIQUE KEY `dag_id_2` (`dag_id`,`run_id`),

So it should be either 0 or 1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think more checks are good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know! I will make a note in else branch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it's just code bloat this point given it's impossible to get multiple results. I think the better thing to do is just loop over the results

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated. Why add this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bolkedebruin This is for appending results from this function. Without expunge sqlalchemy will through UnboundedExceptionError since the session is closed before function returns.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not setting a dag state but a dagrun state please rename.

@saguziel
Copy link
Contributor

saguziel commented Jun 2, 2017

I haven't really dived in, but does it make sense that clear and success are separate endpoints?

Copy link

@amaliujia amaliujia Jun 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. You are testing dagrun. So can rename tests.

@allisonwang allisonwang force-pushed the master branch 3 times, most recently from dc7268b to f6d0aa6 Compare June 2, 2017 21:24
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import appears unused

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import also appears unused

@saguziel
Copy link
Contributor

saguziel commented Jun 5, 2017

It seems that if clear_dagrun can be part of clear, dag_success could be part of success

@saguziel
Copy link
Contributor

saguziel commented Jun 5, 2017

And maybe it should be dagrun_success

@aoen
Copy link
Contributor

aoen commented Jun 5, 2017

@saguziel I think dagrun/task clearing should be separate functions, having them in the same function is the equivalent to having 15 different classes and then having a function that switches off of the class type, instead of just having 15 different functions to call in the first place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: kill whitespace

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of this method seems to imply that commit should always be true. Maybe there should be another method that gets a list of task instances that would be modified without altering them?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a different endpoint for dagruns/tasks, although they could reuse most of the same logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be able to call clear_task from clear_dagrun, likewise for success

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Edit"
"Clear"
and
"Mark Success"

are fine for the button names.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: replace "dag" with "dagrun"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: edit_dagrun

@saguziel
Copy link
Contributor

saguziel commented Jun 5, 2017

@aoen That is a pattern though. We save some code duplication, and we don't have 15 classes. Especially since we aren't actually clearing the dag_run, but rather the tasks that belong to it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do a top level import instead

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, put this in models.py. It's okay to have the stub here but the logic of setting the dagrun to the state should be in models

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saguziel There is a function called set_dag_run_state in models.py, but the logics are quite different. set_dag_run_state here basically called set_state above which is designed for mark success functionality and considers a lot more edge cases. But definitely the API functions need refactoring in the future to be consistent with other parts of the codebase.

@aoen
Copy link
Contributor

aoen commented Jun 5, 2017

Spoke with @saguziel offline, we are in agreement that dagrun/task clearing should be separate functions that call some common functions.

@allisonwang allisonwang force-pushed the master branch 2 times, most recently from ea4d82e to 70c0c9d Compare June 7, 2017 17:35
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like these were accidental removals

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aoen Hmm maybe my editor (Atom) removed the white space automatically but the Apache license content should be exactly the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section can be factored out with the /clear endpoint code.

also I think the recursive flag needs to be supported here too for the subdag operator (same as /clear).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aoen clear here default to include subdags https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L3149 but adding it definitely makes it more readable :) Will refactor out the logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand we should allow the option to specify recursive the same way it can be specified for task-clearing (for the subdag operator), i.e. I think it should be an option in the modal.

Copy link
Contributor

@aoen aoen Jun 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's better to use keyword arguments like recursive=True here (that way it's clear what each argument is). The recursive value should not always be True but have it's value derived from the recursive button in the model, same as for individual task instance clearing. LGTM after that though!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aoen Right we can make the clear DAG run UI more consistent with clear tasks, i.e. support past / future, recursive/ non-recursive, etc. Here I feel letting recursive always be true make sense otherwise clear DAG run will clear all operator task instances except for subdag which could be confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe in version 2 :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think past/future are a bit different than non-recursive (which set of dagruns are affected vs how affected tasks are changed), but I am OK leaving the recursive option as a TODO (it's just not that great that clearing dagruns doesn't have functional parity with clearing task instances).

@aoen
Copy link
Contributor

aoen commented Jun 13, 2017

LGTM

@asfgit asfgit closed this in 3c450fb Jun 14, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants