Cluster LWRP :join action leaves Rabbit stopped on join error #344

Closed
CVTJNII opened this issue Feb 19, 2016 · 0 comments

Contributor

CVTJNII commented Feb 19, 2016

If the cluster LWRP encounters an error when joining a cluster, it leaves Rabbit stopped. This behavior is due to the stop/join/start sequence in the join action:

https://github.com/jjasghar/rabbitmq/blob/master/providers/cluster.rb#L205-L207

In addition, the join_cluster method calls Chef::Application.fatal! on error, which aborts the run before start_app is reached:

https://github.com/jjasghar/rabbitmq/blob/master/providers/cluster.rb#L178

This is an issue if the join fails for something benign, like trying to cluster to a node that isn't up yet. Because Rabbit is left down, subsequent runs will fail, even if whatever caused the error was transient. I believe better and more expected behavior would be to raise an exception in join_cluster() so that action :join can ensure start_app is always called, and then call Chef::Application.fatal! in an exception handler.
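
A rough sketch of the control flow I have in mind, written as plain Ruby so it runs outside Chef. The `ClusterSketch` class, the `ClusterJoinError` exception, and the `calls` log are all hypothetical names made up for illustration; the stubbed `stop_app`, `start_app`, and `join_cluster` only record that they were called, and `fatal!` stands in for Chef::Application.fatal! without actually aborting. This is not the provider's real code, just the shape of the fix:

```ruby
# Hypothetical exception type; the real provider does not define this.
class ClusterJoinError < StandardError; end

# Hypothetical stand-in for the cluster provider. The real helpers shell out
# to rabbitmqctl; these stubs only record the order in which they are called.
class ClusterSketch
  attr_reader :calls

  def initialize(join_succeeds)
    @join_succeeds = join_succeeds
    @calls = []
  end

  def stop_app
    @calls << :stop_app
  end

  def start_app
    @calls << :start_app
  end

  # Raises on failure instead of calling Chef::Application.fatal! directly.
  def join_cluster
    @calls << :join_cluster
    raise ClusterJoinError, 'join_cluster failed' unless @join_succeeds
  end

  # Stand-in for Chef::Application.fatal!, which would abort the Chef run.
  def fatal!(_message)
    @calls << :fatal
  end

  def action_join
    stop_app
    begin
      join_cluster
    ensure
      # Runs whether or not join_cluster raised, so Rabbit is never left down.
      start_app
    end
  rescue ClusterJoinError => e
    # Only reached after start_app has already run.
    fatal!(e.message)
  end
end
```

With this shape a failed join still ends with start_app being called before the run is aborted, so a later run can retry against a running Rabbit.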

CVTJNII added a commit to CVTJNII/rabbitmq that referenced this issue Feb 19, 2016
jjasghar pushed a commit that referenced this issue Jun 2, 2016
Restart Rabbit on cluster join error, resolves #344