Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add TransparentExponentialBackoffSupervisor #18776

Merged
merged 1 commit into from
Nov 7, 2015
Merged

Conversation

henrymai
Copy link
Contributor

See this link for details:
#18766

@akka-ci
Copy link

akka-ci commented Oct 25, 2015

Can one of the repo owners verify this patch?

@patriknw
Copy link
Member

Can this be worked in to the BackoffSupervisor in a backwards compatible way? I don't think we should provide two classes for almost the same thing.

@patriknw
Copy link
Member

Refs #18487

@henrymai
Copy link
Contributor Author

I don't think there's a straight forward way to work my code into the BackoffSupervisor because it works in a fundamentally different way than BackoffSupervisor. Mine relies on the normal implicit supervision behavior where it uses a SupervisorStrategy.Decider to decide when to restart a child actor, whereas the existing BackoffSupervisor relies on the child being terminated in order to know to restart the child. This means that I actually support normal child termination without overloading the meaning as a signal to the supervisor to restart (see the Spec.scala for details).

There might be a way for me to do it the other way around, where I can provide a different props method that will pass a flag into the constructor, so that the actor can mimic old behavior as part of my supervisor. However, I don't see much value in doing so, because the old behavior is strictly less desirable than the new behavior. It would entail adding extra cruft into my implementation to support a behavior that is unwanted. Additionally the message interface that I expose is also different, in that I do not want to allow explicit retrieval of the child ActorRef.

I would prefer to just to keep these two separate and deprecate the existing BackoffSupervisor over time instead of attempting to merge the two together. It is cleaner this way and won't break any existing users of BackoffSupervisor.

maybeDirective.getOrElse(defaultDirective) match {
case Restart ⇒
self ! RestartChild
Stop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Stop may overtake the RestartChild in the mailbox in the presence of other messages, leading to the termination of this supervisor actor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I was afraid that might be the case.
But thankfully, I actually had an alternate implementation before that did a Resume here instead of a Stop (still sending a RestartChild message) and then in the RestartChild handler, it would terminate the child before the become(waitingToRestart(childRef, numRestarts)).
I'll switch it to this implementation later tonight, but I was wondering if you could elaborate a little more on why the Stop can overtake the RestartChild message in the presence of other messages (just for my own edification).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, thinking about it, we might accidentally ensure that this cannot happen: Stop means a Terminate sysmsg to the child actor, which triggers a ChildTerminated sysmsg back to us, but processing that will just enqueue Terminated as a normal message at the back of the mailbox behind the RestartChild message—this is done so that any message from the child actor that was emitted before termination is also processed before the Terminated message and it saves the day here.

Still, it would be more obvious if you just switched the behavior to waitingToRestart right here and got rid of the RestartChild message altogether—initial creation can just happen in preStart.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but processing that will just enqueue Terminated as a normal message at the back of the mailbox behind the RestartChild message

Right, that's what I thought from reading the akka code.

Still, it would be more obvious if you just switched the behavior to waitingToRestart right here and got rid of the RestartChild message altogether—initial creation can just happen in preStart.

The problem with switching the behavior at that location is that I don't have access to numRestarts at that point (unless I make it a var). It is my preference to not use vars whenever I can avoid it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This—in a nutshell—is the reason for making all lifecycle events (like child failure) normal messages in Akka Typed. I’d probably use a var here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just updated the patch to address this concern. Still managed to avoid using a var with the same lines of code :)

@rkuhn
Copy link
Contributor

rkuhn commented Oct 26, 2015

I agree with @henrymai, this implementation approach is superior to the existing one.

val childRef = actorOf(props)
watch(childRef)
unstashAll()
become(watching(childRef, 0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like this impl.

@henrymai henrymai force-pushed the master branch 7 times, most recently from b006216 to 0759450 Compare October 27, 2015 05:34
@@ -61,6 +61,21 @@ object BackoffSupervisor {

private case object StartChild extends DeadLetterSuppression
private case class ResetRestartCount(current: Int) extends DeadLetterSuppression

def calculateDelay(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be private[akka] and marked as INTERNAL API

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in the latest patch

@patriknw
Copy link
Member

Can't we try harder to align the two backoff supervisors? I think it will be confusing for the users with two slightly different implementations. This implementation is not compatible with Akka Persistence (which must unconditionally stops the actor when there is a journal failure).

One idea is that Restart means ordinary restart and Stop means stop-backoff-start. Then we don't change semantics of restart and the needed semantics for Akka Persistence comes naturally.

@henrymai
Copy link
Contributor Author

Hi Patrik,
Thanks for the review.

I think it will be confusing for the users with two slightly different implementations.

Can we solve that through documentation?

This implementation is not compatible with Akka Persistence (which must unconditionally stops the actor when there is a journal failure).

My implementation is targeting the case of standard actor and supervision semantics. Is there a reason why Akka Persistence went the non standard route of shutting down the child rather than throw a specific Exception/Throwable; allowing a supervising actor to decide to Stop for that specific exception?

One idea is that Restart means ordinary restart and Stop means stop-backoff-start. Then we don't change semantics of restart and the needed semantics for Akka Persistence comes naturally.

Unless I'm mistaken, persistFailure() will issue a context.stop(self), so it won't actually end up going through the SupervisorStrategy.decider. Meaning even if we decided to change the semantics of this patch to perform a "stop-backoff-start" for the Stop directive, we won't actually end up going through that path for Akka Persistence.

Is it not possible to just allow existing Akka Persistence users to continue using the BackoffSupervisor and explicitly document that the TransparentExponentialBackoffSupervisor is not suitable for usage with Akka Persistence?

@rkuhn
Copy link
Contributor

rkuhn commented Oct 28, 2015

As I said, @henrymai has a valid point here, the signaling mechanism is different so I don’t see how or why we should unify these. In particular I don’t like that self-stopping is part of the usage contract for BackoffSupervisor—that might work for specific cases (in particular “immortal” actors that come back after termination) but it is not the best general solution.

@patriknw
Copy link
Member

@henrymai the reason persistent actors do context.stop(self) is that we can't enforce that users have installed a proper supervision strategy that performs the stopping (and stopping is unconditional as I mentioned earlier).

Ok, I'm in minority in thinking it will be confusing. Please work on the naming and documentation to make it clear then that there are two different utilities for this.

@henrymai henrymai force-pushed the master branch 2 times, most recently from 64b554b to 5b66b22 Compare November 1, 2015 20:25
@henrymai
Copy link
Contributor Author

henrymai commented Nov 3, 2015

@rkuhn @patriknw is there anything else that needs to be done for this to be merged?

@rkuhn
Copy link
Contributor

rkuhn commented Nov 7, 2015

Thanks for the contribution, @henrymai ! (and sorry for the long review delay)

rkuhn added a commit that referenced this pull request Nov 7, 2015
Add TransparentExponentialBackoffSupervisor
@rkuhn rkuhn merged commit 61c257b into akka:master Nov 7, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants