Optimize fail-fast check to avoid loading SerializedDAG
#56694
Merged
When a task fails and `fail_fast` is enabled, the API server needs to stop the remaining tasks. Previously, this required loading the entire 5-50 MB SerializedDAG on every task failure just to check the `fail_fast` setting. The SerializedDAG usually comes from cache, but when multiple replicas are running it may not be present in the local cache.

This change adds a `fail_fast` column to the `dag` table and checks it first with a simple database lookup. The SerializedDAG is only loaded when `fail_fast=True` (affecting ~1% of DAGs), avoiding unnecessary memory and I/O overhead in the other ~99% of cases.
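The short-circuit can be sketched as follows. This is a minimal illustration, not the actual Airflow code: the table model, helper name, and in-memory SQLite setup are all assumptions; the point is that the query fetches only the boolean flag, never the serialized DAG blob.

```python
# Sketch of the fail-fast short-circuit: query only the `fail_fast`
# column of a (simplified, hypothetical) `dag` table instead of
# deserializing the multi-megabyte SerializedDAG.
from sqlalchemy import Boolean, Column, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class DagModel(Base):
    __tablename__ = "dag"
    dag_id = Column(String, primary_key=True)
    fail_fast = Column(Boolean, default=False, nullable=False)


def needs_serialized_dag(session: Session, dag_id: str) -> bool:
    # Cheap scalar lookup: fetch only the fail_fast flag. The
    # SerializedDAG would be loaded afterwards only if this is True.
    flag = session.scalar(
        select(DagModel.fail_fast).where(DagModel.dag_id == dag_id)
    )
    return bool(flag)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)
session.add_all(
    [
        DagModel(dag_id="etl", fail_fast=True),
        DagModel(dag_id="report", fail_fast=False),
    ]
)
session.commit()
```

With this in place, the ~99% of failures belonging to DAGs without `fail_fast` never touch the SerializedDAG at all.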