Failing remote deployment, see #2983 #1094
Started jenkins job akka-pr-validator at https://jenkins.akka.io/job/akka-pr-validator/414/
So isn't there a similar problem in
jenkins job akka-pr-validator: Success - https://jenkins.akka.io/job/akka-pr-validator/414/
private def removeChildWhenToAddressTerminated(child: ActorRef): Unit =
private def removeChildWhenToAddressTerminated(child: ActorRef): Unit = {
// FIXME Needs to call handleChildTerminated, and handle the return value, which is a list of SystemMessages to be processed.
// How to do that? Note that it must be done before processing of the Terminated message.
I think this should be a ticket and not a comment.
We actually solved it during the meeting today; I will update.
There will be a ticket for the better long-term solution though.
Nice :-)
…er, see #2983
* The problem is that we do remote deployment to a node that isn't alive, and with ordinary remoting that is not detected at all, as we know. With cluster this was taken care of by a later AddressTerminated and ChildTerminated generated by RemoteDeploymentWatcher. With the new RemoteDeadLetters the additional watch triggers an immediate Terminated, which is captured by RemoteDeploymentWatcher but not acted upon since it's not an addressTerminated. RemoteDeploymentWatcher then unwatches and will therefore not act on a later AddressTerminated.
* The long-term solution is to have reliable system messages and remote supervision without explicit watch, so that we know that the remote deployment fails.
* This short-term solution is to let RemoteDeploymentWatcher always generate ChildTerminated, also for non-addressTerminated.
* It's possibly racy since ChildTerminated is not idempotent.
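The difference between the old and the short-term behaviour described in the commit message can be sketched with a small plain-Scala model. This is only an illustration under stated assumptions: `DeploymentWatcherModel`, its `alwaysNotify` flag, and the string-based child and supervisor names are hypothetical stand-ins, not the actual RemoteDeploymentWatcher code or any Akka API.

```scala
// Simplified model only; these are NOT the real Akka classes.
final case class Terminated(child: String, addressTerminated: Boolean)
final case class ChildTerminated(child: String)

// Hypothetical stand-in for RemoteDeploymentWatcher: it maps each watched
// remote-deployed child to its supervisor and records what it would send.
final class DeploymentWatcherModel(alwaysNotify: Boolean) {
  private var supervisors = Map.empty[String, String]
  private var sent = Vector.empty[(String, ChildTerminated)]

  def watch(child: String, supervisor: String): Unit =
    supervisors = supervisors + (child -> supervisor)

  def receive(t: Terminated): Unit =
    supervisors.get(t.child).foreach { supervisor =>
      // alwaysNotify = false: act only when addressTerminated is set.
      // alwaysNotify = true: act on every Terminated (the short-term fix).
      if (alwaysNotify || t.addressTerminated)
        sent = sent :+ (supervisor -> ChildTerminated(t.child))
      // The child is unwatched either way, so later events are ignored.
      supervisors = supervisors - t.child
    }

  def sentMessages: Vector[(String, ChildTerminated)] = sent
}
```

With `alwaysNotify = false`, an immediate `Terminated` without `addressTerminated` followed by a later one with it produces no `ChildTerminated` in this model, which mirrors the failure described above. With `alwaysNotify = true` the supervisor is notified on the first `Terminated`.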
The replacement for removeChildWhenToAddressTerminated works fine.
.. note:: Creating a remote deployed child actor with the same name as the terminated
   actor is not fully supported. There is a race condition that potentially removes the new
   actor.
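The race mentioned in the note can be illustrated with a toy model of a parent that tracks children by name. This is a sketch under assumptions, not the real Akka child container: `ParentModel`, `ChildRef`, `incarnation`, and `RaceDemo` are hypothetical names introduced only for illustration.

```scala
// Simplified model only; not the real Akka child container.
final case class ChildRef(name: String, incarnation: Int)

final class ParentModel {
  private var children = Map.empty[String, ChildRef]
  private var nextIncarnation = 0

  def create(name: String): ChildRef = {
    nextIncarnation += 1
    val ref = ChildRef(name, nextIncarnation)
    children = children + (name -> ref)
    ref
  }

  // Removes whichever child currently holds the name, so a late or
  // duplicate notification for the old actor also removes a new one.
  def childTerminated(name: String): Unit =
    children = children - name

  def child(name: String): Option[ChildRef] = children.get(name)
}

object RaceDemo {
  def run(): Option[ChildRef] = {
    val parent = new ParentModel
    parent.create("worker")
    parent.childTerminated("worker") // first notification for the old actor
    parent.create("worker")          // new actor reuses the name
    parent.childTerminated("worker") // late notification for the old actor
    parent.child("worker")           // the new actor has been removed
  }
}
```

In this model the second notification arrives after the name has been reused, and the replacement child is dropped even though it never terminated.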
I think this wants to be a warning …
True, I'll change that and then merge.
I think cherry-picking the last two commits to release-2.1 should be no problem. Ok?
No, backporting to 2.1 should only be done once we are certain how to proceed. In 2.1, links between systems are not bullet-proof yet, with or without this patch, right?
ok
LGTM!
LGTM!
* Replaces the previous half-baked removeChildWhenToAddressTerminated
LGTM
…-patriknw Failing remote deployment, see #2983