New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extends collection of shutdown Reasons (e.g. ClusterJoinUnsuccessfulReason) #26570
Extends collection of shutdown Reasons (e.g. ClusterJoinUnsuccessfulReason) #26570
Conversation
/** | ||
* Scala API: The shutdown was initiated by a failure to join a seed node. | ||
*/ | ||
object JoiningSeedNodesUnsuccessfulReason extends Reason |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems extremely specific, should it rather be ClusterJoinUnsuccessful
or something like that, or is it important that it was specifically joining to seed nodes that failed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, could be slightly less specific
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
/** | ||
* Java API: The shutdown was initiated by a configuration clash within the existing cluster and the joining node | ||
*/ | ||
def incompatibleConfigurationDetectedReason: Reason = IncompatibleConfigurationDetectedReason |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like it makes sense to make it public
Test FAILed. |
Test FAILed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems good to add those
@@ -501,7 +501,7 @@ private[cluster] class ClusterCoreDaemon(publisher: ActorRef, joinConfigCompatCh | |||
seedNodes.mkString(", "), | |||
ShutdownAfterUnsuccessfulJoinSeedNodes) | |||
joinSeedNodesDeadline = None | |||
CoordinatedShutdown(context.system).run(CoordinatedShutdown.ClusterDowningReason) | |||
CoordinatedShutdown(context.system).run(CoordinatedShutdown.ClusterJoinUnsuccessfulReason) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we have ClusterDowningReason
configured to use exit code -1
(abnormal termination). Shouldn't ClusterJoinUnsuccessfulReason also be configured to use -1
.
I guess -1 will have the effect that an orchestration platform will start it again, which is probably desired also for unsuccessful join?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I think the exit status
was relevant in ConductR but I don't think it is in Kubernetes. IDK about other orchestrators.
I'll fix this PR but I think we could consider deprecating the feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exit code is useful. k8 isn't the only thing. A bash script can check exit code for decision of restarting process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, in the case of IncompatibleConfigurationDetectedReason
restarting won't help so I don't think a custom exit code is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, IncompatibleConfigurationDetectedReason
Test FAILed. |
Test FAILed. |
Test FAILed. |
Test PASSed. |
The Travis job failed this morning but I think the cause was a failing bintray. Can someone relaunch, please? |
PLS BUILD |
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Related to lagom/lagom#1806