
Fix scheduler/triggerer deadlock on deferrable task instances#65920

Open
shaealh wants to merge 3 commits into apache:main from shaealh:shaealh/65818

@shaealh shaealh commented Apr 27, 2026

Regarding issue #65818.
Related: #65836.

This PR changes the scheduler trigger-timeout path and the trigger cleanup path so that both lock candidate task_instance rows in deterministic primary-key order before updating them.

On MySQL/InnoDB, the previous bulk updates could reach overlapping task_instance rows through different indexes, allowing the scheduler and triggerer to acquire row/gap locks in different orders. With HA schedulers and deferrable tasks, that can deadlock.

The new flow is:

  1. Select matching task_instance.id rows ordered by primary key.
  2. Lock those rows with FOR UPDATE SKIP LOCKED.
  3. Update only the selected IDs.
  4. Repeat in bounded batches.

This keeps the existing predicates and update values intact while making lock acquisition consistent across the scheduler and triggerer writers.
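
The flow above can be sketched roughly as follows. This is a minimal illustration using the standard-library sqlite3 module, not the code in this PR: the function name, the simplified table, the predicate, and the batch size are all assumptions made for the example. SQLite has no row-locking clause, so `lock_clause` is left empty in the demo; on MySQL 8+ or PostgreSQL it would be `FOR UPDATE SKIP LOCKED`.

```python
import sqlite3

# Hypothetical batch size; the PR description only says "bounded batches".
BATCH_SIZE = 3


def clear_trigger_ids_in_batches(conn, lock_clause="", batch_size=BATCH_SIZE):
    """Clear trigger_id on non-deferred rows in primary-key order, batch by batch.

    The predicate is an illustrative stand-in for the cleanup path's filter.
    Transaction boundaries are left to the caller.
    Returns (rows_updated, batches_run).
    """
    updated = 0
    batches = 0
    while True:
        # Steps 1-2: select candidate IDs ordered by primary key and,
        # where the backend supports it, lock them with lock_clause.
        rows = conn.execute(
            "SELECT id FROM task_instance "
            "WHERE trigger_id IS NOT NULL AND state != 'deferred' "
            f"ORDER BY id LIMIT ? {lock_clause}",
            (batch_size,),
        ).fetchall()
        if not rows:
            return updated, batches
        ids = [row[0] for row in rows]
        # Step 3: update only the IDs that were just selected.
        placeholders = ", ".join("?" for _ in ids)
        conn.execute(
            f"UPDATE task_instance SET trigger_id = NULL WHERE id IN ({placeholders})",
            ids,
        )
        # Step 4: loop again; updated rows no longer match the predicate.
        updated += len(ids)
        batches += 1


def make_demo_db():
    """Build an in-memory table with five rows, one of them still deferred."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance "
        "(id INTEGER PRIMARY KEY, state TEXT, trigger_id INTEGER)"
    )
    states = ["success", "deferred", "failed", "success", "success"]
    conn.executemany(
        "INSERT INTO task_instance (state, trigger_id) VALUES (?, 10)",
        [(state,) for state in states],
    )
    return conn
```

With the demo table, four rows match and are cleared in two batches, while the deferred row keeps its trigger_id. Because every writer walks the rows in the same primary-key order, two writers should not end up each holding a lock the other is waiting for.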

Tests added:

  • scheduler timeout path processes more than one batch
  • trigger cleanup path clears trigger IDs across more than one batch

Tests run:

  • ruff check airflow-core/src/airflow/jobs/scheduler_job_runner.py airflow-core/src/airflow/models/trigger.py airflow-core/tests/unit/jobs/test_scheduler_job.py airflow-core/tests/unit/models/test_trigger.py
  • python -m compileall -q airflow-core/src/airflow/jobs/scheduler_job_runner.py airflow-core/src/airflow/models/trigger.py airflow-core/tests/unit/jobs/test_scheduler_job.py airflow-core/tests/unit/models/test_trigger.py
  • AIRFLOW_HOME=/tmp/airflow-65818-test-home PATH=/tmp/airflow-test-bin:$PATH .venv/bin/python -m pytest airflow-core/tests/unit/models/test_trigger.py -k 'clean_unused' --with-db-init
  • AIRFLOW_HOME=/tmp/airflow-65818-test-home PATH=/tmp/airflow-test-bin:$PATH .venv/bin/python -m pytest airflow-core/tests/unit/jobs/test_scheduler_job.py -k 'timeout_triggers' --with-db-init

shaealh commented Apr 27, 2026

Hi team, can I get an approval to merge? Thanks


Labels

  • area:Scheduler (including HA (high availability) scheduler)
  • area:Triggerer
  • backport-to-v3-2-test (Mark PR with this label to backport to v3-2-test branch)
  • ready for maintainer review (Set after triaging when all criteria pass.)

3 participants