Fix stale upstream_failed propagation after external task state changes#65924
Fix stale upstream_failed propagation after external task state changes#65924shaealh wants to merge 5 commits into
Conversation
|
48 checks passed, can a reviewer provide sign off to merge? Thank you! |
|
@shaealh — Removing the The label's contract is that the PR is ready for maintainer review — a regression like this means the PR temporarily isn't. Rebase your branch onto the latest
No rush. Note: This comment was drafted by an AI-assisted triage tool and may contain mistakes. Once you have addressed the points above, an Apache Airflow maintainer — a real person — will take the next look at your PR. We use this two-stage triage process so that our maintainers' limited time is spent where it matters most: the conversation with you. |
Fixes #63697
When checking premature task instances, the scheduler can use a stale list of finished task instances captured earlier in the scheduling loop. If another process changes task states meanwhile, this can cause the scheduler to propagate upstream_failed based on outdated upstream state.
This changes the premature-TI dependency check to re-query finished task instances before allowing flag_upstream_failed=True dependency evaluation to write terminal states.
Added a regression test that simulates separate scheduler/API SQLAlchemy sessions.