Skip to content

fix(scheduler): add row lock to AssetPartitionDagRun fetch to prevent…#68061

Open
bujjibabukatta wants to merge 2 commits into
apache:mainfrom
bujjibabukatta:fix/#68045
Open

fix(scheduler): add row lock to AssetPartitionDagRun fetch to prevent…#68061
bujjibabukatta wants to merge 2 commits into
apache:mainfrom
bujjibabukatta:fix/#68045

Conversation

@bujjibabukatta
Copy link
Copy Markdown
Contributor

When running 2+ schedulers (HA mode), DAGs using PartitionedAssetTimetable
or CronPartitionTimetable intermittently produce duplicate DagRuns for the
same asset event.

PR #60773 fixed this race for non-partitioned asset scheduling by adding
with_row_locks to the AssetDagRunQueue fetch. However, partitioned assets
flow through a separate code path — _create_dagruns_for_partitioned_asset_dags
— which reads AssetPartitionDagRun rows with a plain SELECT, no lock.
Two schedulers can read the same unprocessed rows within ~40ms of each other
and independently create a DagRun for each.

Confirmed reproducible on MWAA 3.2.1 (which ships #60773) with 2 schedulers
and 2+ consumer DAGs on the same asset.

Fix

Add with_row_locks(..., skip_locked=True) to the AssetPartitionDagRun
query in _create_dagruns_for_partitioned_asset_dags, mirroring exactly what
#60773 did for AssetDagRunQueue.

SELECT ... FOR UPDATE SKIP LOCKED ensures that when Scheduler A locks the
APDR rows it is processing, Scheduler B's identical query returns zero rows
and exits cleanly — no duplicate DagRun is created.

with_row_locks is already imported in this file so no new imports are needed.

Testing

Related


@boring-cyborg boring-cyborg Bot added the area:Scheduler including HA (high availability) scheduler label Jun 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:Scheduler including HA (high availability) scheduler

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Duplicate DAG runs for PartitionedAssetTimetable with multiple schedulers (HA) — race not covered by #60773

2 participants