Add named exception to detect when a cluster node has been quarantined by others #18758
|
👍 |
|
I faced the same problem and came up with the following solution:
Any actor can subscribe to those events and, in case of quarantine, react to it, e.g. by triggering a system restart. I found this pretty flexible in terms of usage, but I need somebody to review the approach and the proposed patch. Thanks. |
rkuhn added a commit that referenced this issue Dec 20, 2015
…opagation-18758 #18758 Send appropriate events on remote actor system shutdown and quarantine
|
Perhaps this one could be marked as fixed? |
|
Indeed, thanks! |
When a cluster node gets auto-downed/quarantined, it has to restart its own actor system before it can rejoin the cluster. To do this we need a way to reliably detect when a node has been quarantined by other nodes.
Currently, the only way to do this appears to be doing string matching on the exception message of an AssociationError, which is fragile. Ideally there should be a named exception that we can check, so that the node can restart itself as appropriate.
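To illustrate why the string match is fragile, here is a minimal, library-free sketch contrasting the two detection styles. All names in it (`RemotingEvent`, `AssociationFailed`, `QuarantinedByRemote`, `QuarantineDetection`) and the message wording are hypothetical stand-ins for illustration, not Akka API and not the patch proposed in this thread.

```scala
// Hypothetical stand-ins for remoting events; none of these names are Akka API.
sealed trait RemotingEvent
final case class AssociationFailed(message: String) extends RemotingEvent
final case class QuarantinedByRemote(remoteAddress: String) extends RemotingEvent

object QuarantineDetection {
  // Fragile: relies on the exact wording of an error message
  // (the substring below is illustrative only).
  def quarantinedByMessage(event: RemotingEvent): Boolean = event match {
    case AssociationFailed(msg) => msg.contains("quarantined this system")
    case _                      => false
  }

  // Robust: relies only on the type of the event, not on its text.
  def quarantinedByType(event: RemotingEvent): Boolean = event match {
    case _: QuarantinedByRemote => true
    case _                      => false
  }
}
```

With the first style, any rewording of the message silently breaks detection; with the second, the check keeps working as long as the named type exists, which is what this issue asks for.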