DNS not resolved in '-raft-adv-addr' after leader goes down #695
Found this workaround: https://github.com/techyugadi/kubestash/blob/master/stateful/rqlite/rqlitests.yml#L80 Followers remove themselves from the Raft nodes list before dying, but this will fail if the leader is down or not responding. It also feels like a hack to cover for the failure to re-resolve addresses.
So this is a Hashicorp Raft thing, not an rqlite thing. For some reason this is the way it's always worked -- it doesn't keep hostnames at the Raft layer, but keeps resolved IP addresses. What you are indicating with "Should be DNS" is coming from Hashicorp code. I'm not entirely sure why it works like this, but it always has. The code in question is what powers Hashicorp Consul, which is a well-established piece of software. Perhaps some research on how Consul handles this might be the answer? Presumably whatever is the right way to handle nodes coming back up with different IP addresses with Consul can be applied to rqlite.
Specifically: https://godoc.org/github.com/hashicorp/raft#Server
Here is the rqlite code that creates the output you reference: https://github.com/rqlite/rqlite/blob/v5.6.0/store/store.go#L394 Note the call to …
I'll double-check my work, just to be sure. It's been a while since I looked at the networking layer of rqlite.
Well, well, I forgot how my own code works: https://github.com/rqlite/rqlite/blob/master/cluster/join.go#L58 The rqlite layer does resolve addresses before sending the details to the node it is joining. However I still believe Hashicorp Raft takes this address, resolves it, and stores that in its internal config.
Specific source code in v5.6.0: https://github.com/rqlite/rqlite/blob/v5.6.0/cluster/join.go#L58
Trying out removing the resolution in this PR: #697
I have removed the resolve operation. However, the original statement I made about the Hashicorp Raft layer still applies. rqlite now calls this function: https://godoc.org/github.com/hashicorp/raft#Raft.AddVoter with whatever the advertised address for the joining node is (hostname or IP -- hostname in your case). See line 666 in d3d8bea.
I might be misinterpreting what I am seeing, I'll continue looking into this.
Wow that was quick, thanks so much for looking into it! I had a suspicion it was the Raft lib usage, and was about to dig deeper, so I'm happy to see it's sorted. I assume this would go into v5.6.1? I'll test it out tomorrow morning if you can publish the Docker image.
I can release a new version, but I'm not convinced you'll see anything different. When I test with this change in place, it makes no difference. The Raft layer is still using IP addresses. That is why I need to look into it more, and see if there is still something I'm doing wrong. In the meantime you might like to do some research and see how the community works with Hashicorp Consul, since it is built on the same Raft library.
I have Consul running in the same cluster so I'll check its configuration and report back if I find any resolution. Would still be worth testing with the new version.
Thanks @adrianchifor -- that would be great. You're working in an area I don't know a huge amount about (rqlite and k8s) so any guidance you can provide to make rqlite work better in this area would be much appreciated.
I looked into the Hashicorp Raft code, and when it comes to the leader it deals in network addresses, not hostnames. You can see this at the creation of the Raft node: https://github.com/hashicorp/raft/blob/v1.2.0/api.go#L445 The second line there is a call to the Go networking library, specifically Addr() at https://golang.org/pkg/net/#Listener which returns a network address, not a hostname. So the latest change I put in place will "fix" it for followers, but not for the leader. I'm not sure why the Raft library works like this.
Here is the status output with my latest changes in place (on master):

```json
"nodes": [
  {
    "id": "localhost:4002",
    "addr": "127.0.0.1:4002"
  },
  {
    "id": "localhost:4004",
    "addr": "localhost:4004"
  }
],
```

As you can see, node 2's addr is now the hostname, while the leader's is still a resolved IP address.
One clean way to deal with this, assuming the nodes know where the leader is when they come up, is to attempt to explicitly rejoin using the same node ID but the new IP address. When the leader sees a node come up with a known ID but a new IP address, it will first remove that node, and then re-add it -- all as part of the Join operation. You can see that logic here: https://github.com/rqlite/rqlite/blob/v5.6.0/store/store.go#L647 This means you don't need to worry about the leader being up when the nodes die -- you can clean up when you re-join (and if the leader isn't up when you attempt to re-join, your join attempt is moot anyway). FWIW, it's safe to tell a node to join a cluster it's already a member of. Might that help?
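The remove-then-re-add behaviour described above can be sketched as follows. The types here are hypothetical stand-ins, not the actual rqlite code (the real logic is at the store.go line linked above), but they capture the idempotent join semantics:

```go
package main

import "fmt"

// cluster is a hypothetical stand-in for the leader's view of
// cluster membership: node ID -> advertised address.
type cluster struct {
	nodes map[string]string
}

// join implements the described semantics: a known ID with a new
// address causes a remove of the stale entry before the re-add.
func (c *cluster) join(id, addr string) {
	if old, ok := c.nodes[id]; ok && old != addr {
		// Known ID, new address: drop the stale entry first.
		delete(c.nodes, id)
	}
	// Re-adding an existing member with the same address is a
	// no-op, so rejoining a cluster you already belong to is safe.
	c.nodes[id] = addr
}

func main() {
	c := &cluster{nodes: map[string]string{"node1": "10.0.0.5:4002"}}
	// The pod for node1 was rescheduled and came back with a new IP.
	c.join("node1", "10.0.0.9:4002")
	fmt.Println(c.nodes["node1"]) // "10.0.0.9:4002"
}
```

Because the join is keyed on the stable node ID, the stale IP is cleaned up lazily at rejoin time rather than eagerly at shutdown, which is why the leader doesn't need to be up when the nodes die.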
I'm going to revert the previous changes for now, as folks may be relying on the current behaviour, and the change isn't doing what we hoped. I feel like I'm missing something here, and there is a different way to address the issue you are seeing.
Fixed in 7.3.0.
I've managed to create an rqlite (5.5.0) cluster on Kubernetes (as a StatefulSet) and it comes up perfectly fine. It can even handle followers going down and it re-registers them with the new pod IPs, which is great.
However, when killing all nodes, the leader comes back and tries to ping the old nodes which no longer exist as the IPs changed, and everything gets stuck. I was expecting the leader to resolve DNS again for the other raft nodes and try to re-establish the cluster, but it looks like it only resolves DNS the first time or when followers re-join.
`-raft-adv-addr` should probably be resolved on-demand, and the IP not saved to the Raft DB.

Leader errors after killing all nodes and getting new IPs: [leader error logs not captured in this extract]

`node0`, `node1` and `node2` `/status?pretty` output: [not captured in this extract]
Is this expected behavior and maybe I'm just using the flags wrong? Any advice would be much appreciated!