[AIRFLOW-1423] Add logs to the scheduler DAG run decision logic #2455
Codecov Report
@@ Coverage Diff @@
## master #2455 +/- ##
==========================================
+ Coverage 69.4% 69.44% +0.04%
==========================================
Files 146 146
Lines 11289 11298 +9
==========================================
+ Hits 7835 7846 +11
+ Misses 3454 3452 -2
|
airflow/jobs.py
Outdated
info level plz
Same comment as above, everywhere (put dag_id here)
airflow/jobs.py
Outdated
debug level plz
airflow/jobs.py
Outdated
debug level plz
airflow/jobs.py
Outdated
debug level plz
airflow/jobs.py
Outdated
debug lvl plz
airflow/jobs.py
Outdated
debug lvl please
Put dag name and runs in the message too
@saguziel I did not put the dag_id in any of the logs to remain in line with the other logs and also because they are already split by their dag name.
Do you still want me to do this? If so, would you like all logs to include the dag_id to remain consistent?
|
Thanks for your reply @bolkedebruin, but I really think this is worth an INFO log level, since it's crucial for understanding the scheduler's decision making in a production environment. |
airflow/jobs.py
Outdated
probably should be is not None
right, good point
sorry, actually it's is None instead, committing
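For readers following the `is None` versus `is not None` exchange, here is a minimal sketch of why an explicit identity check can differ from a truthiness check. The helper name `has_no_timeout` and the use of a `timedelta` value are assumptions made for illustration; the thread does not show which expression in `airflow/jobs.py` was being discussed.

```python
from datetime import timedelta


def has_no_timeout(dagrun_timeout):
    """Hypothetical helper: True only when no timeout was configured at all."""
    return dagrun_timeout is None


# A zero-length timedelta is falsy, so `not value` treats it like "unset",
# while the identity check still sees a configured (if zero) value.
zero = timedelta(0)
truthiness_says_unset = not zero
identity_says_unset = has_no_timeout(zero)

print(truthiness_says_unset)
print(identity_says_unset)
print(has_no_timeout(None))
```

The two checks only disagree for falsy values that are not `None`, which is the case an explicit `is None` guards against.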
|
@bolkedebruin @saguziel I've updated the PR with requested changes. |
|
@ultrabug I disagree with your assessment of the requirement for the high log levels. The scheduler is already quite chatty. Please lower the levels. |
|
@bolkedebruin okay, can we settle on at least keeping the last one (next scheduled execution time) at info? This one is very valuable. Thanks in advance |
|
Sure no problem. |
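To illustrate the compromise reached above, the following is a rough sketch of how the two levels behave with Python's standard `logging` module. The logger name, the message wording, and the placeholder values are assumptions for the example and are not taken from the patch itself.

```python
import io
import logging


def log_decisions(logger):
    # Hypothetical messages: most decision details at DEBUG,
    # the next scheduled execution time at INFO.
    logger.debug("Dag reached maximum of %d active runs", 16)
    logger.info("Next scheduled execution time: %s", "<execution date>")


stream = io.StringIO()
logger = logging.getLogger("scheduler_sketch")
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False

# With the logger at INFO, only the INFO message is emitted.
logger.setLevel(logging.INFO)
log_decisions(logger)
info_output = stream.getvalue()
print(info_output)
```

Operators who need the full decision trail would lower the level to `logging.DEBUG`, which keeps the default output less chatty.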
One of the most frustrating topics for users is understanding the scheduler's decisions about whether or not to run a DAG. It would be wise to add more logs to the job creation decision logic so that it is clearer whether a DAG is run and why.
|
@bolkedebruin done, thanks. |
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
|
@ultrabug is there anything else that needs to be done in this PR? |
# return if already reached maximum active runs and no timeout setting
if len(active_runs) >= dag.max_active_runs and not dag.dagrun_timeout:
    self.logger.info(
        "Dag reached maximum of {} active runs (no timeout)".
Formatting the message this way always creates a string object, even when the logging level is too low for the message to be emitted, in which case the string is never used.
The formatting parameters should instead be passed as arguments to the info method.
Example:
```python
self.log.info("The Table '%s' does not exists already.", self.table_id)
```
|
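The difference described in this review comment can be checked with a small sketch using only the standard `logging` module. The `Expensive` class and the logger name are hypothetical and exist only to count how often the argument is converted to a string.

```python
import logging


class Expensive:
    """Hypothetical value that counts how many times it is rendered."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "expensive"


logger = logging.getLogger("lazy_format_sketch")
logger.addHandler(logging.NullHandler())
logger.propagate = False
logger.setLevel(logging.WARNING)  # INFO messages are discarded

# Eager: str.format() runs before logger.info() is even called.
eager = Expensive()
logger.info("Dag reached maximum of {} active runs".format(eager))

# Lazy: the argument is only rendered if the record is actually emitted.
lazy = Expensive()
logger.info("Dag reached maximum of %s active runs", lazy)

print(eager.calls, lazy.calls)
```

With the level set above INFO, the eager call still pays for the formatting while the lazy call does not.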
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
|
@OmerJog I'll happily update this PR, should it be acknowledged first. |
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
|
@mik-laj @bolkedebruin @saguziel will be happy to correct anything if there's a chance it gets some attention, answer would be appreciated thanks. If not, please close this PR yourself so we at least know where we stand. Cheers mates. |
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Dear Airflow maintainers,
Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
JIRA
Description
One of the most frustrating topics for users is understanding the scheduler's decisions about whether or not to run a DAG.
It would be wise to add more logs to the job creation decision logic so that it is clearer whether a DAG is run and why.
This patch adds such simple and useful logs.
Tests
Logs only
Commits