Fix DAG processor OOM || Avoid loading all TaskInstances when checking DagVersion in write_dag #60937
Pull request overview
This PR optimizes the DAG serialization process to prevent out-of-memory (OOM) issues in the DAG processor. The change replaces an inefficient joinedload operation that loads all TaskInstances into memory with a lightweight EXISTS query that uses constant memory, regardless of the number of task instances.
Changes:
- Replaced `joinedload(DagVersion.task_instances)` with an `exists()` query to check for task instance existence
- Added explicit `TaskInstance` import to support the new query pattern
- Removed unused `joinedload` import from sqlalchemy.orm
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ite-dag-exists-query
This reverts commit b82284d.
Backport failed to create: v3-1-test.
You can attempt to backport this manually by running: `cherry_picker 235595b v3-1-test`
This should apply the commit to the v3-1-test branch and leave the commit in a conflict state. After you have resolved the conflicts, you can continue the backport process by running: `cherry_picker --continue`
If you don't have cherry-picker installed, see the installation guide.
Auto backport failed. Manual backport to v3-1-test #60962
Fix DAG processor OOM || Avoid loading all TaskInstances when checking DagVersion in write_dag (apache#60937)


During DAG serialization, `write_dag` uses `joinedload` to check whether any task instances exist for a DAG version. This loads all task instances into memory
just to answer a boolean question.
For long-running deployments with many DAG runs, this can cause high memory
usage during serialization.
This PR optimizes the check by using an EXISTS query instead, which has
constant memory usage regardless of how many task instances exist.
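
To illustrate the difference, here is a minimal sketch using only the standard-library `sqlite3` module rather than Airflow's actual SQLAlchemy models. The table names, columns, and both helper functions are hypothetical stand-ins chosen for illustration; the real change uses SQLAlchemy's `exists()` inside `write_dag`.

```python
import sqlite3

# Hypothetical, simplified schema standing in for DagVersion / TaskInstance.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dag_version (id INTEGER PRIMARY KEY);
    CREATE TABLE task_instance (
        id INTEGER PRIMARY KEY,
        dag_version_id INTEGER REFERENCES dag_version (id)
    );
    """
)
conn.executemany("INSERT INTO dag_version (id) VALUES (?)", [(1,), (2,)])
# DAG version 1 has many task instances; version 2 has none.
conn.executemany(
    "INSERT INTO task_instance (dag_version_id) VALUES (?)",
    [(1,)] * 10_000,
)


def has_task_instances_eager(conn, dag_version_id):
    """Roughly what an eager load does: fetch every row, then test for emptiness."""
    rows = conn.execute(
        "SELECT * FROM task_instance WHERE dag_version_id = ?",
        (dag_version_id,),
    ).fetchall()  # memory grows with the number of task instances
    return len(rows) > 0


def has_task_instances_exists(conn, dag_version_id):
    """EXISTS returns a single 0/1 value, however many rows match."""
    (found,) = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM task_instance WHERE dag_version_id = ?)",
        (dag_version_id,),
    ).fetchone()
    return bool(found)
```

Both helpers return the same boolean, but only the first materializes all matching rows in the client process; the second transfers a single value back from the database.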
`{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in airflow-core/newsfragments.