Bursts of scheduled tasks due to scheduler drift compensation #26910
Comments
That behavior is at least partly specified in
I wonder, though, if that behavior is really needed by most callers. It would be good if we could offer another API that simply discards the missed runs and carries on scheduling the next task with the original interval. It would be even better to make this the default behavior (but how to do so compatibly?). Assuming that we need to keep the current behavior as well for compatibility reasons, could we improve it to lower the impact of the burst? Delaying the missed runs might just shift the problem to a later point if the system currently cannot keep up anyway.
With the config I added in the PR it's possible to disable the drift compensation completely by setting it to |
How safe is that? E.g. if you implement a clock with scheduled ticks, that will go wrong if there are GC pauses which are too long. The existing code explicitly tries to keep an invariant that if you schedule something you will (in steady state) get Users might not have implemented a simple clock but something similar based on that invariant, so I'm not so sure we can just change the behavior in that regard.
I think that |
ok, if that shall continue to be a strong invariant of Then we would also need to add another Having both alternatives in the end-user API feels like too many alternatives for something that very few would care about. Maybe we could change the behavior in 2.6 but with a config that switches back to the old behavior?
I did some more research of what we have in Akka and how it's named and documented in the JDK. In Akka:
In JDK:
Given these new insights I have changed my mind. My first approach is not good since it's a mix of the two. We can't change the existing semantics, at least not for In For the other APIs we can add new methods, and maybe it's best to deprecate the old one since its semantics were not well defined in this regard. I think most users don't want the current fixed-rate semantics, and the deprecation warning can be a good signal that there is a choice to be made. Adding two new methods such as
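For reference, the JDK exposes both semantics on `java.util.concurrent.ScheduledExecutorService` as `scheduleAtFixedRate` and `scheduleWithFixedDelay`. Below is a minimal sketch of how the two differ when the task itself takes time. The class name `JdkPeriodicSketch` and the helper `countRuns` are made up for illustration; only the JDK calls are real.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustration only: compares the two JDK periodic scheduling methods. */
final class JdkPeriodicSketch {

    /** Runs a task that takes taskMs for roughly durationMs and returns how often it started. */
    static int countRuns(boolean fixedRate, long periodMs, long taskMs, long durationMs) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        Runnable task = () -> {
            runs.incrementAndGet();
            sleepQuietly(taskMs); // simulate work
        };
        if (fixedRate) {
            // Start times are anchored to the initial schedule: 0, period, 2 * period, ...
            pool.scheduleAtFixedRate(task, 0, periodMs, TimeUnit.MILLISECONDS);
        } else {
            // The delay is measured from the end of one run to the start of the next.
            pool.scheduleWithFixedDelay(task, 0, periodMs, TimeUnit.MILLISECONDS);
        }
        sleepQuietly(durationMs);
        pool.shutdownNow(); // interrupts a run in progress and cancels future runs
        return runs.get();
    }

    private static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With a 100 ms period and a 50 ms task, fixed rate starts a run about every 100 ms, while fixed delay starts one about every 150 ms, so over the same duration the fixed-rate variant usually runs more often.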
Great overview!
Shouldn't it be the end goal also for Is the concern about Scheduler that we might introduce a new method whose name is already taken by an implementing class? Otherwise, it seems we could also think about deprecating the current schedule method (somewhat drastically) and then provide default implementations of the new
I agree, but
That is one concern, but another, more important one is that adding a new abstract method will break existing implementations if we can't provide a default implementation of it. Looking closer... I thought However, that would not be binary backwards compatible given how traits are mixed into the concrete classes, or did this change in the 2.12 compiler? (Akka calling that method on a concrete class that was compiled with an older Akka version). That might be acceptable breakage (I don't expect many external implementations)? Another observation is that implementations of
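To make the default-implementation idea concrete, here is a rough sketch in Java terms. `MiniScheduler` and `VirtualTimeScheduler` are hypothetical stand-ins, not Akka's `Scheduler`, and cancellation is left out. The sketch only illustrates the source-compatibility point (a new method with a body built on an existing one); it says nothing about the binary-compatibility question for Scala traits raised above.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical, simplified scheduler interface; names are illustrative only. */
interface MiniScheduler {
    /** The one method that existing implementations already provide. */
    void scheduleOnce(long delayMs, Runnable task);

    /** New method with a default body, so existing implementations keep compiling. */
    default void scheduleWithFixedDelay(long initialDelayMs, long delayMs, Runnable task) {
        scheduleOnce(initialDelayMs, new Runnable() {
            @Override
            public void run() {
                task.run();
                // Re-arm only after the run has finished: fixed delay, no catch-up.
                scheduleOnce(delayMs, this);
            }
        });
    }
}

/** A pre-existing implementation with virtual time; it knows nothing about the new method. */
final class VirtualTimeScheduler implements MiniScheduler {
    private static final class Entry {
        final long atMs;
        final Runnable task;

        Entry(long atMs, Runnable task) {
            this.atMs = atMs;
            this.task = task;
        }
    }

    private final PriorityQueue<Entry> queue =
        new PriorityQueue<>(Comparator.comparingLong((Entry e) -> e.atMs));
    long nowMs = 0;

    @Override
    public void scheduleOnce(long delayMs, Runnable task) {
        queue.add(new Entry(nowMs + delayMs, task));
    }

    /** Advances virtual time, running every task that becomes due on the way. */
    void advanceTo(long targetMs) {
        while (!queue.isEmpty() && queue.peek().atMs <= targetMs) {
            Entry e = queue.poll();
            nowMs = Math.max(nowMs, e.atMs);
            e.task.run();
        }
        nowMs = targetMs;
    }
}
```

`VirtualTimeScheduler` only implements `scheduleOnce`, yet callers can already use `scheduleWithFixedDelay` on it. A fixed-rate variant could be layered on `scheduleOnce` in a similar way by tracking the intended start time instead of the actual one.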
I think that probably has changed with 2.12 but we would have to test.
That might actually also work with 2.12 but it seems we also offer |
ok, seems promising and we have a few things to try out |
* previous `schedule` method is trying to maintain a fixed average frequency over time, but that can result in undesired bursts of scheduled tasks after a long GC or if the JVM process has been suspended, same with all other periodic scheduled message sending via various Timer APIs
* most of the time "fixed delay" is more desirable
* we can't just change it, because it's too big a behavioral change and some might depend on the previous behavior
* deprecate the old `schedule` and introduce new `scheduleWithFixedDelay` and `scheduleAtFixedRate`; when fixing the deprecation warning users should make a conscious decision about which behavior to use (`scheduleWithFixedDelay` in most cases)
* Streams
* SchedulerSpec
* test both fixed delay and fixed rate
* TimerSpec
* FSM and PersistentFSM
* mima
* runnable as second parameter list, also in typed.Scheduler
* IllegalStateException vs SchedulerException
* deprecated annotations
* api and reference docs, all places
* migration guide
scheduleWithFixedDelay vs scheduleAtFixedRate, #26910
Jackson whitelist for deserialization of unbound class, #26910
When scheduling periodic tasks/messages the scheduler compensates for drift:
akka/akka-actor/src/main/scala/akka/actor/LightArrayRevolverScheduler.scala
Lines 105 to 107 in e071077
That is probably good for small variations, but it results in bursts when the process is suspended for a long time (because of GC, or simulations with `kill -STOP`, and I can imagine that processes are suspended in virtualized or container environments).
Example to reproduce:
Use `kill -STOP pid`, wait for 10 seconds and then `kill -CONT pid`.
Example output (with some extra logging in the scheduler)
I think this is what is causing the problem reported in #26786,
but I can imagine all kinds of things that will add massive load on the system when it wakes up and runs all scheduled tasks repeatedly to catch up with missed ticks.
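The catch-up effect can be shown with a small deterministic model. This is a toy sketch, not the `LightArrayRevolverScheduler` code: it assumes a single periodic task that takes no time to run and a single pause during which nothing executes. `DriftBurstSketch` and `runTimes` are names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one periodic task; not Akka's actual implementation. */
final class DriftBurstSketch {

    /**
     * Returns the times (ms) at which the task runs, up to and including untilMs.
     * The process is frozen during [pauseStartMs, pauseEndMs), so runs that become
     * due in that window execute at pauseEndMs instead.
     *
     * compensateDrift = true:  next due time = previous due time + interval
     * compensateDrift = false: next due time = actual run time + interval
     */
    static List<Long> runTimes(long intervalMs, long pauseStartMs, long pauseEndMs,
                               long untilMs, boolean compensateDrift) {
        List<Long> runs = new ArrayList<>();
        long due = intervalMs;
        while (true) {
            boolean dueWhileFrozen = due >= pauseStartMs && due < pauseEndMs;
            long actual = dueWhileFrozen ? pauseEndMs : due;
            if (actual > untilMs) {
                break;
            }
            runs.add(actual);
            // With compensation the schedule stays anchored to the original timeline,
            // so every tick missed during the pause is still owed afterwards.
            due = (compensateDrift ? due : actual) + intervalMs;
        }
        return runs;
    }
}
```

For a 1 second interval and a 10 second pause starting at 2.5 s, `runTimes(1000, 2500, 12500, 15000, true)` contains ten runs at 12500, all executed back to back when the process resumes. With `compensateDrift = false` there is a single run at 12500 and the task then continues at 13500 and 14500.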