Cluster LWRP :join action leaves Rabbit stopped on join error #344

Closed
CVTJNII opened this issue Feb 19, 2016 · 0 comments

CVTJNII commented Feb 19, 2016

If the cluster LWRP encounters an error when joining a cluster, it leaves Rabbit stopped. This behavior is due to the stop/join/start sequence in the join action:

https://github.com/jjasghar/rabbitmq/blob/master/providers/cluster.rb#L205-L207

In addition, the join_cluster method calls Chef::Application.fatal! on error:

https://github.com/jjasghar/rabbitmq/blob/master/providers/cluster.rb#L178

This is a problem when the join fails for a benign reason, such as trying to cluster with a node that isn't up yet. Because Rabbit is left down, subsequent runs will fail even if whatever caused the error was transient. I believe a better and more expected behavior would be to raise an exception in join_cluster() so that action :join can ensure start_app is always called, and then call Chef::Application.fatal! in an exception handler.
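A rough sketch of the control flow proposed above, written as plain Ruby so it runs without Chef. Everything here is hypothetical and for illustration only: `FakeRabbit`, `ClusterJoinError`, and `join_action` are not names from the cookbook, and the real helpers shell out to rabbitmqctl. The handler returns a message where the provider would presumably call Chef::Application.fatal!.

```ruby
# Hypothetical error type raised by join_cluster instead of aborting the run.
class ClusterJoinError < StandardError; end

# Hypothetical stand-in for the provider's stop_app/join_cluster/start_app
# helpers. It only records state; the real helpers run rabbitmqctl.
class FakeRabbit
  attr_reader :running, :log

  def initialize(join_succeeds:)
    @join_succeeds = join_succeeds
    @running = true
    @log = []
  end

  def stop_app
    @running = false
    @log << :stop_app
  end

  def start_app
    @running = true
    @log << :start_app
  end

  # Raise on failure so the caller decides how to clean up.
  def join_cluster
    @log << :join_cluster
    raise ClusterJoinError, 'target node is not reachable' unless @join_succeeds
  end
end

# Sketch of action :join with start_app guaranteed by ensure.
def join_action(rabbit)
  rabbit.stop_app
  begin
    rabbit.join_cluster
  ensure
    # Runs whether or not join_cluster raised, so Rabbit is never left down.
    rabbit.start_app
  end
  :joined
rescue ClusterJoinError => e
  # Assumption: the provider would call Chef::Application.fatal! here.
  # The sketch returns a message so it can be run outside of Chef.
  "join failed: #{e.message}"
end

rabbit = FakeRabbit.new(join_succeeds: false)
puts join_action(rabbit)
puts "running after failed join: #{rabbit.running}"
```

With this ordering the fatal error is still reported, but only after start_app has run, so a later Chef run finds Rabbit up and can retry the join.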

CVTJNII added a commit to CVTJNII/rabbitmq that referenced this issue Feb 19, 2016

@jjasghar jjasghar closed this in #346 Jun 2, 2016

jjasghar pushed a commit that referenced this issue Jun 2, 2016

JJ Asghar
Merge pull request #346 from CVTJNII/restart_Rabbit_on_cluster_error
Restart Rabbit on cluster join error, resolves #344