Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed rabbitmq cluster_status parsing when node list takes multiple lines. #290

Merged

Conversation

@jperville
Copy link
Contributor

commented Jul 27, 2015

I was wondering why on some nodes of my rabbitmq cluster, chef would try to reconfigure the cluster_node_type at every chef-run, resulting in breaking all existing connections to the broker.

The actual reason was a subtle issue while parsing the output of rabbitmqctl cluster_status when the node list fits on multiple lines. Read on.

How to reproduce

Have a cluster with enough nodes that the list of nodes or running_nodes in the rabbitmqctl cluster_status output takes more than one line, like this:

root@staging3-failover2:~# rabbitmqctl cluster_status
Cluster status of node 'rabbit@staging3-failover2' ...
[{nodes,[{disc,['rabbit@staging3-backends','rabbit@staging3-failover1',
                'rabbit@staging3-failover2']}]},
 {running_nodes,['rabbit@staging3-backends','rabbit@staging3-failover1',
                 'rabbit@staging3-failover2']},
 {cluster_name,<<"staging3">>},
 {partitions,[]}]

In the above example, the 'rabbit@staging3-failover2' entry is listed on a different line than the other two nodes.

The following is the parsed version of the above cluster_status, taken from the chef debug log:

[2015-07-27T09:32:41+00:00] DEBUG: [rabbitmq_cluster] rabbitmqctl cluster_status : [{nodes,[{disc,['rabbit@staging3-backends','rabbit@staging3-failover1', 'rabbit@staging3-failover2']}]}, {running_nodes,['rabbit@staging3-backends','rabbit@staging3-failover1', 'rabbit@staging3-failover2']}, {cluster_name,<<"staging3">>}, {partitions,[]}]

Note the extra space left of 'rabbit@staging3-failover2', after the comma.

The same chef debug log displays the following list of disc nodes:

[2015-07-27T09:32:41+00:00] DEBUG: [rabbitmq_cluster] disc_nodes : ["rabbit@staging3-backends", "rabbit@staging3-failover1", " rabbit@staging3-failover2"]

Note the space in the " rabbit@staging3-failover2" string ; this extra space prevents the rabbit@staging3-failover2 node to be detected as a disc node (see below for the current implementation of the current_cluster_node_type method from the cluster provider).

# Get cluster_node_type of current node
def current_cluster_node_type(node_name, cluster_status)
  var_cluster_node_type = ''
  if disc_nodes(cluster_status).include?(node_name) # <--- the extra space breaks this test!
    var_cluster_node_type = 'disc'
  elsif ram_nodes(cluster_status).include?(node_name)
    var_cluster_node_type = 'ram'
  end
  Chef::Log.debug("[rabbitmq_cluster] current cluster node type : #{var_cluster_node_type}")
  var_cluster_node_type
end

With a node_name value of "rabbit@staging3-failover2" and the above parsed list of disc nodes (containing an invalid " rabbit@staging3-failover2" entry with leading space), the node_name will never be identified as being a disc node and the var_cluster_node_type returned will be an empty string. The consequence is that the change_cluster_node_type will be reconfigured at every chef-run, resulting in gratuitous restart of rabbitmq connection.

The fix

This PR fixes the issue in the cluster provider by making the cluster_status method ignore spaces that follow a newline when parsing the rabbitmqctl cluster_status output.

No automated test but I believe that the PR should be quite safe.

@kbogtob

This comment has been minimized.

Copy link

commented Jul 29, 2015

Seems okay for me. +1

jjasghar pushed a commit that referenced this pull request Aug 14, 2015
JJ Asghar
Merge pull request #290 from jperville/fix-parsing-multiline-cluster-…
…status

Fixed rabbitmq cluster_status parsing when node list takes multiple lines.

@jjasghar jjasghar merged commit c236683 into rabbitmq:master Aug 14, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@jperville jperville deleted the PerfectMemory:fix-parsing-multiline-cluster-status branch Aug 17, 2015

@jperville jperville restored the PerfectMemory:fix-parsing-multiline-cluster-status branch Aug 17, 2015

@jperville

This comment has been minimized.

Copy link
Contributor Author

commented Aug 17, 2015

Thank you for the merge @jjasghar. Waiting for new official release impatiently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.