Fix DAG run state not updated while DAG is paused #16343

Merged
merged 15 commits into from Jun 17, 2021

@ephraimbuddy ephraimbuddy commented Jun 9, 2021

Closes: #15439

The state of a DAG run is not updated while the DAG is paused.
If the DAG run was started before the DAG was paused, its tasks continue to run, eventually finish, and are marked correctly.
The DAG run itself, however, stays in the Running state until the DAG is unpaused.

This change fixes that by running a check at intervals and updating the state (where possible)
of DagRuns whose tasks finished running while the DAG was paused.
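
As a rough illustration of the periodic check described above, here is a minimal, self-contained sketch in plain Python. It does not use Airflow's API: `ToyDagRun`, `FINISHED`, and `update_paused_runs` are hypothetical names, and the rule for deriving the run state (any failed task fails the run) is a deliberate simplification.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set

# Hypothetical set of terminal task states for this toy model.
FINISHED = {"success", "failed", "skipped"}


@dataclass
class ToyDagRun:
    """Stand-in for a DAG run; not Airflow's DagRun model."""
    dag_id: str
    task_states: Dict[str, str]  # task_id -> task state
    state: str = "running"


def update_paused_runs(runs: Iterable[ToyDagRun], paused_dag_ids: Set[str]) -> List[ToyDagRun]:
    """One pass of the periodic check: settle running runs of paused DAGs."""
    updated = []
    for run in runs:
        # Only runs that are still "running" and belong to a paused DAG are of interest.
        if run.state != "running" or run.dag_id not in paused_dag_ids:
            continue
        states = set(run.task_states.values())
        if not states <= FINISHED:
            continue  # at least one task is still in flight; retry on the next pass
        run.state = "failed" if "failed" in states else "success"
        updated.append(run)
    return updated
```

Calling `update_paused_runs` on a timer would mirror the "at intervals" behaviour: a run is left untouched until every one of its tasks has reached a terminal state.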



@ephraimbuddy ephraimbuddy marked this pull request as ready for review Jun 9, 2021
ashb approved these changes Jun 9, 2021

@github-actions github-actions bot commented Jun 9, 2021

The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it onto the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@ephraimbuddy ephraimbuddy force-pushed the update-dagrun-state-dag-paused branch from 25af23f to cc76e09 Jun 9, 2021

@ashb ashb commented Jun 10, 2021

I wonder what is more efficient: doing this periodically (for paused dags, where the state is likely to never change) or expanding on the "mini scheduler run" to do a simpler version of dag_run.update_state() when the task that just finished was one of the leaf tasks in the dag.
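
To make the "leaf task" condition concrete, here is a small sketch in plain Python. It assumes the DAG is given as a mapping from each task id to its downstream task ids; `leaf_task_ids` and `should_check_run_state` are hypothetical helpers for illustration, not part of Airflow.

```python
from typing import Dict, List, Set


def leaf_task_ids(downstream: Dict[str, List[str]]) -> Set[str]:
    """Return the ids of tasks that have no downstream tasks."""
    return {task_id for task_id, children in downstream.items() if not children}


def should_check_run_state(finished_task_id: str, downstream: Dict[str, List[str]]) -> bool:
    """Only a finishing leaf task can be the one that completes the run."""
    return finished_task_id in leaf_task_ids(downstream)


# Toy DAG: extract -> transform -> {load, report}
toy_dag = {
    "extract": ["transform"],
    "transform": ["load", "report"],
    "load": [],
    "report": [],
}
```

With this toy DAG, a run-state update would be attempted when `load` or `report` finishes, but not when `extract` or `transform` does, which is what keeps the check cheap compared with a periodic sweep.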


@ephraimbuddy ephraimbuddy commented Jun 10, 2021

> I wonder what is more efficient: doing this periodically (for paused dags, where the state is likely to never change) or expanding on the "mini scheduler run" to do a simpler version of dag_run.update_state() when the task that just finished was one of the leaf tasks in the dag.

Nice but I think it may not work if the user disables mini scheduling?


@ashb ashb commented Jun 10, 2021

> Nice but I think it may not work if the user disables mini scheduling?

Yes, but we'll likely remove that setting in a version or two -- it was mostly an escape hatch in case it had unforeseen bugs.


@ephraimbuddy ephraimbuddy commented Jun 10, 2021

> > Nice but I think it may not work if the user disables mini scheduling?
>
> Yes, but we'll likely remove that setting in a version or two -- it was mostly an escape hatch in case it had unforeseen bugs.

Should I add it as a separate check outside the mini scheduling?

@ephraimbuddy ephraimbuddy force-pushed the update-dagrun-state-dag-paused branch 4 times, most recently from 1a941b1 to 5f386be Jun 11, 2021
@ephraimbuddy ephraimbuddy reopened this Jun 11, 2021
@ephraimbuddy ephraimbuddy force-pushed the update-dagrun-state-dag-paused branch 2 times, most recently from 0fcc3f1 to 8d420f1 Jun 14, 2021
airflow/jobs/local_task_job.py (outdated, resolved)
airflow/jobs/scheduler_job.py (outdated, resolved)
airflow/jobs/local_task_job.py (outdated, resolved)
airflow/jobs/local_task_job.py (outdated, resolved)
airflow/jobs/local_task_job.py (outdated, resolved)
tests/jobs/test_local_task_job.py (outdated, resolved)
@ephraimbuddy ephraimbuddy force-pushed the update-dagrun-state-dag-paused branch 2 times, most recently from a06e51e to 3bf9b22 Jun 15, 2021
@ephraimbuddy ephraimbuddy reopened this Jun 15, 2021
@ephraimbuddy ephraimbuddy force-pushed the update-dagrun-state-dag-paused branch from 3bf9b22 to 0c8d695 Jun 15, 2021
@ephraimbuddy ephraimbuddy reopened this Jun 16, 2021
@ephraimbuddy ephraimbuddy force-pushed the update-dagrun-state-dag-paused branch from 1ee7b0f to 17b92bd Jun 17, 2021
@ephraimbuddy ephraimbuddy reopened this Jun 17, 2021
airflow/jobs/local_task_job.py (outdated, resolved)
@ephraimbuddy ephraimbuddy force-pushed the update-dagrun-state-dag-paused branch from 17b92bd to 263767a Jun 17, 2021
ashb approved these changes Jun 17, 2021
@ephraimbuddy ephraimbuddy reopened this Jun 17, 2021
@ephraimbuddy ephraimbuddy reopened this Jun 17, 2021
@ephraimbuddy ephraimbuddy force-pushed the update-dagrun-state-dag-paused branch 2 times, most recently from 859f92b to 6e41893 Jun 17, 2021
@ephraimbuddy ephraimbuddy merged commit 3834df6 into apache:main Jun 17, 2021
46 checks passed
@ephraimbuddy ephraimbuddy deleted the update-dagrun-state-dag-paused branch Jun 17, 2021
jhtimmins added a commit to astronomer/airflow that referenced this issue Jun 22, 2021
The state of a DAG run does not update while the DAG is paused.
The tasks continue to run if the DAG run was kicked off before
the DAG was paused and eventually finish and are marked correctly.
The DAG run state does not get updated and stays in Running state until the DAG is unpaused.

This change fixes it by running a check on task exit that updates the state (if possible)
of the DagRun when the task was the one that finished the DagRun while the DAG was paused.

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
(cherry picked from commit 3834df6)
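
The on-task-exit check described in the commit message can be sketched as follows. This is a toy model under stated assumptions, not the merged code: `ToyDagRun`, its `dag_is_paused` flag, and `handle_task_exit` are hypothetical, and the state rule is simplified to "all tasks finished, any failure fails the run".

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical set of terminal task states for this toy model.
FINISHED = {"success", "failed", "skipped"}


@dataclass
class ToyDagRun:
    """Stand-in for a DAG run; not Airflow's DagRun model."""
    dag_is_paused: bool
    task_states: Dict[str, str]  # task_id -> task state
    state: str = "running"

    def update_state(self) -> None:
        """Settle the run only once every task has reached a terminal state."""
        states = set(self.task_states.values())
        if states <= FINISHED:
            self.state = "failed" if "failed" in states else "success"


def handle_task_exit(run: ToyDagRun, task_id: str, final_state: str) -> str:
    """Record the task's final state, then update the run if its DAG is paused."""
    run.task_states[task_id] = final_state
    # In this sketch the scheduler is assumed to handle unpaused DAGs,
    # so the extra check runs only for paused ones.
    if run.dag_is_paused:
        run.update_state()
    return run.state
```

Compared with a periodic sweep, this does the work exactly once per finishing task and only for paused DAGs, so nothing is polled while a paused DAG sits idle.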
@ashb ashb added this to the Airflow 2.1.1 milestone Jun 22, 2021
ashb added a commit that referenced this issue Jun 22, 2021
jhtimmins added a commit that referenced this issue Aug 13, 2021
jhtimmins added a commit that referenced this issue Aug 13, 2021