Peer discovery with consul and UbuntuFAN overlay network. #11161
-
Okay, a little update, running the above configuration does work with
When running on 3.12.13, I discovered that my Consul ACL token was lacking some privileges in the underlying policy: I did not have session write and key write on the rabbitmq/ prefix. On 3.13, the missing privileges do not produce any error. After updating the policy, my configuration works on 3.12.13, no connection issues and no
Retrying with the correct policy on 3.13 does not give any different results. As a side note, my recommendation would be to document the required Consul policy. I am willing to create a PR for this. But my original post still stands.
-
Okay, I believe I hit what will be resolved via #11045.
-
Thank you @frederikbosch for the detailed analysis and the ping in #11045, as I hadn't seen your question yet :-) I don't know Consul well; in fact, I am discovering how it works while working on this regression. I will re-read what you wrote carefully and may come back with questions to better understand your use case.
-
Hey @frederikbosch, were you eventually able to sort this out? M.
-
In our cluster we are using UbuntuFAN as an overlay network. Given a VM with IP 172.16.50.111, all Docker containers are assigned an IP in the range 250.50.111.0/24, with the VM itself (the Docker host) as 250.50.111.0.
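To make the address mapping concrete, here is a minimal Python sketch of the arithmetic described above. It assumes the overlay is 250.0.0.0/8 and that the last two octets of the VM address select the /24, which matches the single example given; the function name `fan_subnet` is made up for illustration and is not part of any FAN tooling.

```python
import ipaddress


def fan_subnet(host_ip: str, overlay_first_octet: int = 250) -> ipaddress.IPv4Network:
    """Return the /24 overlay subnet for a VM, per the mapping in the post.

    Assumption: the last two octets of the VM's underlay address become
    the second and third octets of the overlay subnet.
    """
    octets = ipaddress.IPv4Address(host_ip).packed  # four raw bytes
    return ipaddress.IPv4Network(
        f"{overlay_first_octet}.{octets[2]}.{octets[3]}.0/24"
    )


if __name__ == "__main__":
    subnet = fan_subnet("172.16.50.111")
    print(subnet)                  # 250.50.111.0/24
    print(subnet.network_address)  # 250.50.111.0
```

With this mapping, a container address such as 250.50.111.7 can be traced back to the VM whose underlay address ends in .50.111.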
I am trying to get RabbitMQ (3.13.1) working on a 3-node cluster (worker1, worker2 and worker4) that uses peer discovery via Consul.
docker:
The above hostname is a FQDN that resolves to an actual IP address using Consul's DNS service.
rabbitmq.conf
enabled_plugins
env
Now when the three workers boot, they do not detect peers. They all end up as standalone instances. Those standalone instances do register in Consul. When I enabled debug logging, I discovered that all instances query Consul correctly and extract peers from the service registry.
It says "Peer discovery: not satisfyied with discovered peers: the list does not contain this node". But I do not see this node registering in Consul at that specific moment. The node registers in Consul only once it has started as a standalone node, not earlier.
After the nodes have booted (as standalone instances), I tried to make them join another node manually.
Whether I use the IP or the FQDN does not matter; the result is the same. If I install telnet in the container and try to connect to the EPMD port on the other node, the connection can be established. The other node (worker1 in this case) does not show any logs regarding worker2's attempt to join its cluster.
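For anyone wanting to repeat the telnet check without installing telnet, a rough Python equivalent is sketched below. It only tests whether a TCP connection can be opened. The port numbers are the usual defaults (4369 for EPMD, 25672 for RabbitMQ inter-node traffic) and are assumptions about this setup; the hostname is a placeholder, and `can_connect` is a made-up helper name.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


if __name__ == "__main__":
    peer = "worker1.example.internal"  # placeholder, not the real FQDN
    # Assumed default ports: 4369 = EPMD, 25672 = inter-node distribution.
    for name, port in (("epmd", 4369), ("distribution", 25672)):
        print(f"{name} ({port}): {can_connect(peer, port)}")
```

A successful connection to EPMD only shows that the port mapper is reachable; the distribution port that EPMD hands out still has to be reachable separately, which is why the sketch probes both.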
So even if I registered the rabbitmq service in Consul beforehand, using Nomad, and so made sure this node is in the list, connecting and joining would still fail. I have run out of options, so hopefully someone here can help out.