Cluster LWRP :join action leaves Rabbit stopped on join error #344
If the cluster LWRP encounters an error when joining a cluster, it will leave Rabbit stopped. This happens because the join action runs a stop/join/start sequence, and the join_cluster method calls Chef::Application.fatal! on error, so the start step is never reached.
This is a problem when the join fails for something benign, like trying to cluster with a node that isn't up yet. Because Rabbit is left down, subsequent runs will fail even if whatever caused the error was transient. I believe better, more expected behavior would be to raise an exception in join_cluster() so that action :join can ensure start_app is always called, and then call Chef::Application.fatal! from an exception handler.
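
A rough sketch of the proposed control flow, as plain Ruby that runs outside Chef. Everything here is a hypothetical stand-in rather than the cookbook's actual code: `ClusterSketch`, `JoinError`, and the `calls` log are invented for illustration, `stop_app`, `start_app`, and `join_cluster` only record that they were called, and `fatal!` stands in for `Chef::Application.fatal!` without exiting the process.

```ruby
# Hypothetical error type raised by join_cluster instead of calling fatal! directly.
class JoinError < StandardError; end

# Hypothetical stand-in for the provider; the real helpers would shell out to rabbitmqctl.
class ClusterSketch
  attr_reader :calls

  def initialize(join_succeeds)
    @join_succeeds = join_succeeds
    @calls = [] # records the order in which the steps ran
  end

  def stop_app
    @calls << :stop_app
  end

  def start_app
    @calls << :start_app
  end

  # Raises on failure so the caller decides how to clean up.
  def join_cluster
    @calls << :join_cluster
    raise JoinError, 'node unreachable' unless @join_succeeds
  end

  # Stand-in for Chef::Application.fatal!; records the message instead of exiting.
  def fatal!(message)
    @calls << [:fatal, message]
  end

  def action_join
    failure = nil
    stop_app
    begin
      join_cluster
    rescue JoinError => e
      failure = e
    ensure
      # Runs whether or not the join succeeded, so Rabbit is never left stopped.
      start_app
    end
    # Report the failure only after the app is running again.
    fatal!(failure.message) if failure
  end
end
```

With this shape, a failed join still ends with `start_app` being called before the run is aborted, so a transient error on one run does not leave Rabbit down for the next.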