-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Typed child actors are not stopped when parent actor is restarted by supervision strategy #25556
Comments
Thanks for reporting |
@svroonland I think this is actually expected behavior. I found this comment in a test: akka/akka-actor-typed-tests/src/test/scala/akka/actor/typed/SupervisionSpec.scala Lines 395 to 396 in f12f686
Unless there are objections I will continue this by adding that documentation, including an example. |
Not sure I find it intuitive behavior to not restart the children, but I trust you have broader considerations of course. |
One drawback with always stopping the children is that any direct communication with the children from the outside is stopped. The actor references are obsolete and anyone using them must watch and re-establish new ones, or all such interaction must go via the parent. The ActorRef of the parent is still valid after a restart, but not the ActorRefs of the children. Might be a surprise for many? |
I see another problem: if the child actor is stopped in the PreStart message handler, the parent actor apparently does not wait for the stop to finish. If upon restart the parent tries to recreate the child actor with the same name fast enough, you still get the exception. |
Have you verified that or is it a theory? |
I have this problem in my code currently and I believe I can come up with a code sample shortly. |
The supervisor is supposed to do 10 retries before stopping, but I'm getting an exception at the first try.
|
Ah, I mixed up things. It's when Hmm, stopping children and then starting new with same name for a restart might still be something we want to support. I have to think about this. |
Thanks! I would really like to see it implemented as in untyped actors, perhaps with an option not to stop children on parent restart. Otherwise while migrating code to typed actors, one would have an upleasant surprise. |
I have analyzed how it's implemented in untyped. It's documented here: https://doc.akka.io/docs/akka/current/general/supervision.html#what-restarting-means Here is a summary of the interaction. Given an actor hierarchy of grandparent, parent, child where an exception is thrown in parent and parent is restarted.
In Typed the grandparent isn't involved since the supervision strategy is taking care of exception in the failing actor itself. We must probably hook into some of these steps from the supervision strategy ( We also have to override preRestart to not stop children by default , since that is supposed to be done explicitly from PreRestart signal if desired. One question is if we want to keep the restart of surviving children or if we should keep them untouched? I'd like @rkuhn 's advice on this issue so that we don't have an accidental misunderstanding of desired semantics. |
Why change the already expected behavior (from untyped)? If the user does nothing, the children should be stopped by default. I presume, in most cases the parent fully controls the life-cycle of its children, the explicit stop would be semantically redundant. Optionally the user could signal that the children should not be stopped, but in my opinion it is really optional. |
I tried a few things. It's not that easy to hook into untyped stuff since we implement supervision and restart quite differently. Then I started to think about implementing the "wait for children to stop" with ordinary typed things in the Restarter. Would have to use StashBuffer instead of An annoying detail is that I can't know which children were stopped in the PreRestart and which were kept so can't know when all expected @chbatey wdyt? |
Instead of implementing the same semantics, we could also think about whether something else would be more appropriate, also given the theme of simplifying the actor lifecycle by switching to Behavior decorators and making the local decision without inspecting the exception itself. One thing to keep in mind is that this is a library facility that could as well live in user code, no special magic shall be included—otherwise we break the premise of Behaviors being pure functions of the incoming message and the context (declaring method invocations on the context as permitted effects). So we could add a second Restarter (or differentiate by a construction parameter) that terminates all child actors, switches to stashing behavior, awaits all Terminated messages, then injects the stash into the initial behavior. The simplification here being that either all or none of the child actors are kept. Users who want something else can copy the code and modify it according to their needs. |
I agree that it's probably better to do this in Typed land, and starting with an option to stop all children is probably covering what most will ever need. |
Closed the docs PR. When implementing this for real some of the samples from #25570 might be used and updated. |
Sorry this notification got lost in the PR rebuild storm. |
For the record: also +1 for keeping this in "typed land" :) |
I know. I was only trying things out. We'll do the real thing after your refactoring |
* stash messages and signals while waiting for children to be stopped * handle watch of other actors * exception from unstash * exception from first setup * merge RestartSupervisor and BackoffSupervisor * remove unused PreStart * docs
* stash messages and signals while waiting for children to be stopped * handle watch of other actors * exception from unstash * exception from first setup * merge RestartSupervisor and BackoffSupervisor * API change: restartWithLimit => restart.withLimit * remove unused PreStart * docs
* stash messages and signals while waiting for children to be stopped * handle watch of other actors * exception from unstash * exception from first setup * merge RestartSupervisor and BackoffSupervisor * API change: restartWithLimit => restart.withLimit * remove unused PreStart * docs
* stash messages and signals while waiting for children to be stopped * handle watch of other actors * exception from unstash * exception from first setup * merge RestartSupervisor and BackoffSupervisor * API change: restartWithLimit => restart.withLimit * remove unused PreStart * docs
* stash messages and signals while waiting for children to be stopped * handle watch of other actors * exception from unstash * exception from first setup * merge RestartSupervisor and BackoffSupervisor * API change: restartWithLimit => restart.withLimit * remove unused PreStart * docs * move BubblingSample to separate class * fix: fail after more than limit in restart.withLimit when deferred factory throws * match case RestartOrBackoff instead
I'm a newbie to AKKA and I just investigated a problem that is related to this issue. I finally solved it by using SupervisorStrategy.restart().withStopChildren(true) I think you really should update the docs: The "What Restarting Means" section (https://doc.akka.io/docs/akka/current/general/supervision.html) is in the Chapter "General Concepts". People starting with Akka will probably read this (regardless of whether they use "untyped" or "typed"). The documentation says: "call the old instance’s preRestart hook (defaults to sending termination requests to all children and calling postStop)" Regards |
Do you see the same problem in the latest documentation: https://doc.akka.io/docs/akka/2.6/general/supervision.html Since it seems that you are developing something new I'd recommend that you use version 2.6.0-RC1. |
I won't switch to RC1 yet because I'm not sure if we will go into production before akka 2.6 :-) But I will at least check the Doc for 2.6 from now on. The respective chapter is not misleading anymore. Thanks! |
I'm spawning a child in a typed actor and using a restart supervision strategy on the parent. When the parent fails and is restarted, an exception "not unique actor name [child actor name here]" is thrown.
Apparently the child is not stopped as part of the restart procedure. This is unlike with untyped actors where, when an actor is restarted, its children are stopped and recreated. For typed actors I would also expect behavior as documented here for untyped actors: https://doc.akka.io/docs/akka/2.5/general/supervision.html#what-restarting-means
(I can provide some test cases for untyped and typed on request)
The text was updated successfully, but these errors were encountered: