[AIRFLOW-6451] self._print_stat() in dag_processing.py should be skippable #7096

Merged
merged 4 commits into from
Jan 9, 2020

Conversation

tooptoop4
Contributor

@tooptoop4 tooptoop4 commented Jan 8, 2020


Issue link: AIRFLOW-6451

  • Description above provides context of the change
  • Commit message/PR title starts with [AIRFLOW-NNNN]. AIRFLOW-NNNN = JIRA ID*
  • Unit tests coverage for changes (not needed for documentation changes)
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

* For document-only changes commit message can start with [AIRFLOW-XXXX].


In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@codecov-io

codecov-io commented Jan 8, 2020

Codecov Report

Merging #7096 into master will decrease coverage by 0.28%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #7096      +/-   ##
==========================================
- Coverage   85.15%   84.87%   -0.29%     
==========================================
  Files         680      680              
  Lines       38824    38822       -2     
==========================================
- Hits        33061    32949     -112     
- Misses       5763     5873     +110

Impacted Files                                         Coverage Δ
airflow/utils/dag_processing.py                        87.95% <100%> (-0.05%) ⬇️
airflow/kubernetes/volume_mount.py                     44.44% <0%> (-55.56%) ⬇️
airflow/kubernetes/volume.py                           52.94% <0%> (-47.06%) ⬇️
airflow/kubernetes/pod_launcher.py                     45.25% <0%> (-46.72%) ⬇️
airflow/kubernetes/refresh_config.py                   50.98% <0%> (-23.53%) ⬇️
...rflow/contrib/operators/kubernetes_pod_operator.py  78.75% <0%> (-20%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe20ef6...e6c6ce8.

Member

@potiuk potiuk left a comment

Nice! Thanks!

@potiuk potiuk merged commit 77b1bdc into apache:master Jan 9, 2020
@@ -691,6 +688,7 @@ def start(self):
"have been processed %s times", self._max_runs)
break

# TODO can this be removed?
Member

@tooptoop4 Please don't add comments like this -- it just makes things harder and more confusing for the next person who comes along.

The answer is no - async mode is still needed -- when using SQLite we run "synchronously" as it doesn't cope well (or at all) with concurrent access, so we want to "stop" as soon as we're done, not sleep. In sync mode poll_time=None, which means wait until a message is sent on L639.

Please remove this comment.

Contributor Author

@ashb async mode is not what SQLite uses, so I am questioning why non-SQLite setups need that IF block at all, i.e. can non-SQLite use the original poll time set on L626?

Member

Async mode is set to true when sqlite is in use.

Contributor Author

@tooptoop4 tooptoop4 Jan 10, 2020

@ashb I think you are confused here; see https://github.com/apache/airflow/blob/master/airflow/jobs/scheduler_job.py#L1485
# When using sqlite, we do not use async_mode
# so the scheduler job and DAG parser don't access the DB at the same time.
async_mode = not self.using_sqlite

Member

Whoops, yes, totally backwards. SQLite is sync.

Member

But no, I don't think this can be removed -- the default timeout is 0, which means this would sit in a busy, CPU-consuming loop.
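
The difference between the timeouts discussed in this thread can be sketched with a plain multiprocessing pipe. This is only an illustration under stated assumptions: count_polls is a hypothetical helper, not code from dag_processing.py, and it relies solely on the documented behaviour of Connection.poll (a timeout of 0 returns immediately, a timeout of None blocks until data arrives).

```python
import multiprocessing
import time


def count_polls(conn, timeout, duration=0.2):
    """Hypothetical helper: count how often poll() returns within `duration` seconds."""
    polls = 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        # With timeout == 0 this returns at once, so the loop spins on the CPU.
        # With a positive timeout the call sleeps until data arrives or time is up.
        conn.poll(timeout)
        polls += 1
    return polls


if __name__ == "__main__":
    sender, receiver = multiprocessing.Pipe()

    busy = count_polls(receiver, 0)         # busy loop: very many iterations
    patient = count_polls(receiver, 0.05)   # a handful of iterations
    print("polls with timeout=0:", busy)
    print("polls with timeout=0.05:", patient)

    # timeout=None blocks until a message is available, which matches the
    # "wait until a message is sent" behaviour described for sync mode.
    sender.send("done")
    print("poll(None) returned:", receiver.poll(None))
```

The iteration counts depend on the machine, so no expected numbers are given here; the point is only that a zero timeout never yields the CPU while waiting.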

@ashb
Member

ashb commented Jan 9, 2020

Instead of a new config option, what do you think of setting the existing print_stats_interval config option to 0 to disable it? WDYT @potiuk @tooptoop4 ?
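
A minimal sketch of what this suggestion could look like, assuming a non-positive interval is treated as "never print". StatsPrinter, maybe_print, and the injectable clock are hypothetical names chosen for illustration; this is not Airflow's implementation, and only the option name print_stats_interval comes from the discussion.

```python
import time


class StatsPrinter:
    """Hypothetical helper showing an interval-based guard with 0 as 'disabled'."""

    def __init__(self, print_stats_interval, clock=time.monotonic):
        self.print_stats_interval = print_stats_interval
        self._clock = clock
        self._last_print = clock()
        self.printed = 0

    def maybe_print(self, message):
        # Assumption: an interval of 0 (or less) switches stats printing off.
        if self.print_stats_interval <= 0:
            return False
        now = self._clock()
        # Skip until at least `print_stats_interval` seconds have elapsed.
        if now - self._last_print < self.print_stats_interval:
            return False
        print(message)
        self._last_print = now
        self.printed += 1
        return True


if __name__ == "__main__":
    disabled = StatsPrinter(print_stats_interval=0)
    disabled.maybe_print("never shown")

    enabled = StatsPrinter(print_stats_interval=0.1)
    time.sleep(0.15)
    enabled.maybe_print("DAG file processing stats would go here")
```

Reusing the existing option this way avoids a second setting, at the cost of giving the value 0 a special meaning that would need to be documented.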

@tooptoop4
Contributor Author

Instead of a new config option, what do you think of setting the existing print_stats_interval config option to 0 to disable it? WDYT @potiuk @tooptoop4 ?

Makes sense @ashb, can I link a new PR to the same JIRA, or should I create a new JIRA?

@ashb
Member

ashb commented Jan 10, 2020

@tooptoop4 Great. I'll revert this change and you can re-use the same Jira. (The reason for reverting is so that if we want to cherry-pick the change to 1.10 then we only have one commit to backport, otherwise we'd get a conflict)

ashb added a commit that referenced this pull request Jan 10, 2020
… be skippable by config option (#7096)"

This reverts commit 77b1bdc.
ashb added a commit that referenced this pull request Jan 10, 2020
… be skippable by config option (#7096)" (#7129)

This reverts commit 77b1bdc.

Reverts #7096 so that the change can be done in a slightly different way (without a new config option); reverting first makes the new change easier to backport to 1.10 releases.
@tooptoop4
Contributor Author

#7134 raised

galuszkak pushed a commit to FlyrInc/apache-airflow that referenced this pull request Mar 5, 2020