Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the timezone aware triggers resilient to delays #86

Merged
merged 8 commits into from
Sep 3, 2019

Conversation

PeterJCLaw
Copy link
Contributor

The cron processing includes doing the work of each scheduled job inline at the point they detect they need to run. This works well for the jobs where the scheduling is entirely handled by the scheduler, because they set a next time to run and are then run at some point after that time occurs.

However for the timezone aware triggers we're somewhat abusing that mechanism to do our own timezone awareness filtering. As a result, our reliance on being run at exactly (within a minute of)
the right time meant that the triggers were missing a significant proportion of the actual times at which they should have run their processing.

This change makes the triggers stateful -- aware of the last real time at which they ran and thus able to account for those delays. They now check all the minutes between when they last ran and the current time, thus ensuring that delays still result in the trigger running.

There is a side effect of this approach -- that delays may cause a trigger to only run once where it might ideally run several times. This is a worthwhile trade-off for now and is certainly better than not running at all.

The cron processing includes doing the work of each scheduled job
inline at the point they detect they need to run. This works well
for the jobs where the scheduling is entirely handled by the
scheduler, because they set a next time to run and are then run
at some point after that time occurs.

However for the timezone aware triggers we're somewhat abusing
that mechanism to do our own timezone awareness filtering. As a
result, our reliance on being run at exactly (within a minute of)
the right time meant that the triggers were missing a significant
proportion of the actual times at which they should have run
their processing.

This change makes the triggers stateful -- aware of the last
real time at which they ran and thus able to account for those
delays. They now check all the minutes between when they last
ran and the current time, thus ensuring that delays still
result in the trigger running.

There is a side effect of this approach -- that delays may cause
a trigger to only run once where it might ideally run several
times. This is a worthwhile trade-off for now and is certainly
better than not running at all.
@coveralls
Copy link

coveralls commented Aug 30, 2019

Coverage Status

Coverage increased (+0.003%) to 99.457% when pulling 2492f58 on fix-timezone-aware-trigger-delay-handling into 490a9b6 on master.

routemaster/cron_processors.py Outdated Show resolved Hide resolved
Use a timezone aware range comparison rather than evaluating every
minute within the range. This should be faster while still being
correct.
This copies the same time-range approach used to optimise the
TimezoneAwareProcessor into the MetadataTimezoneAwareProcessor.
The performance increase here is less stark (I suspect because
iterating over all the timezones dominates), however this is likely
to still be an improvement. In any case it's beneficial to have
both processors working in the same way.
It's not really possible for the processor to run twice at the
same instant, so change the tests not to do that.
Internal errors would break the overall cron processing, so we want
to avoid that.
@PeterJCLaw PeterJCLaw merged commit 2492f58 into master Sep 3, 2019
@PeterJCLaw PeterJCLaw deleted the fix-timezone-aware-trigger-delay-handling branch September 3, 2019 09:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants