I am not sure what the expected format is here or whether there is a template I should use, so apologies if the setup is unclear. I tried to include all the relevant information.
The problem
In bigger deployments a lot of DAGs end up on the same cron time. @daily always resolves to 0 0 * * *, so every daily DAG fires at midnight at the exact same moment. On a production MWAA environment I work on we had around 26 daily DAGs all firing at 00:00 and it caused task failures from the contention at that boundary.
Right now there is no native way to spread them out. You either:
hand pick a unique minute for every DAG, which drifts and collides as the number of DAGs grows, or
hash the dag id into a literal cron string yourself, which works but throws away the @daily intent and is not reusable.
Neither feels like something every team should be reinventing on their own.
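For reference, the second workaround usually looks something like the sketch below. The function name `jittered_daily_cron` and the `window_minutes` parameter are made up for illustration and are not part of Airflow. It uses `hashlib` rather than the built-in `hash()`, because `hash()` on strings is salted per process and would not give the same slot after a restart.

```python
import hashlib


def jittered_daily_cron(dag_id: str, window_minutes: int = 60) -> str:
    """Return a literal daily cron string whose time is derived from dag_id.

    Hypothetical helper for illustration only, not an Airflow API.
    """
    if not 1 <= window_minutes <= 24 * 60:
        raise ValueError("window_minutes must be between 1 and 1440")
    # A stable digest gives the same slot in every process and on every host.
    digest = hashlib.sha256(dag_id.encode("utf-8")).digest()
    offset = int.from_bytes(digest[:8], "big") % window_minutes
    hour, minute = divmod(offset, 60)
    return f"{minute} {hour} * * *"


# The DAG then gets a plain cron string instead of "@daily".
schedule = jittered_daily_cron("sales_daily_export")
```

This spreads the DAGs out, but the DAG file no longer says `@daily` anywhere, and every team ends up copying the helper around.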
What I would like to propose
A way to add deterministic jitter to a schedule. Something like a JitteredCronTimetable, or a jitter option on the existing cron timetables, that offsets each DAG by a stable function of its dag id inside a configurable window (say up to 60 minutes), while keeping the data_interval and logical_date semantics exactly the same.
The important part is that it stays deterministic rather than random, so a given DAG always lands in the same slot and runs stay stable and predictable across scheduler restarts and timetable serialization.
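To make the idea concrete, here is a rough sketch in plain Python, without any Airflow imports. `stable_offset`, `PlannedRun` and `plan_daily_run` are hypothetical names, and the shape of the real timetable is exactly what I am asking about. The point is only that the data interval stays on the midnight boundary and just the earliest start time moves by a hash-derived offset.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


def stable_offset(seed: str, window: timedelta) -> timedelta:
    """Map a seed (for example the dag id) to a fixed offset in [0, window)."""
    window_seconds = int(window.total_seconds())
    if window_seconds <= 0:
        return timedelta(0)
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    return timedelta(seconds=int.from_bytes(digest[:8], "big") % window_seconds)


@dataclass(frozen=True)
class PlannedRun:
    # Hypothetical stand-in for what a timetable would hand to the scheduler.
    interval_start: datetime
    interval_end: datetime
    run_after: datetime


def plan_daily_run(
    seed: str,
    interval_start: datetime,
    window: timedelta = timedelta(minutes=60),
) -> PlannedRun:
    """Keep the daily interval unchanged and delay only the start of the run."""
    interval_end = interval_start + timedelta(days=1)
    return PlannedRun(
        interval_start=interval_start,
        interval_end=interval_end,
        run_after=interval_end + stable_offset(seed, window),
    )


run = plan_daily_run("sales_daily_export", datetime(2024, 1, 1, tzinfo=timezone.utc))
```

Because the offset depends only on the seed and the window, it would survive scheduler restarts and serialization as long as both values are stored with the timetable.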
Prior art
I looked and could not find anything that already does this for Airflow: no native option, and no community plugin. Other schedulers do offer something in this space, for example the H (hash) token in Jenkins cron syntax, which spreads jobs out by a hash of the job name.
Questions before I build anything
Is this something you would want in core, or is it better as a community plugin?
If core, what shape do you prefer, a new Timetable or a jitter option on the existing cron timetables?
Does this need an AIP, or is it small enough to go straight to a PR?
I already have a working implementation running in production at my current workplace, and I am happy to contribute tests and docs.