DNS not resolved in '-raft-adv-addr' after leader goes down #695

Closed

adrianchifor opened this issue Nov 26, 2020 · 18 comments · Fixed by #993

Comments

adrianchifor commented Nov 26, 2020

I've managed to create an rqlite (5.5.0) cluster on Kubernetes (as a StatefulSet) and it comes up perfectly fine. It can even handle followers going down and it re-registers them with the new pod IPs, which is great.

However, when all the nodes are killed at once, the leader comes back and tries to contact the old nodes at IPs that no longer exist (pod IPs change on restart), and the cluster gets stuck. I was expecting the leader to resolve DNS again for the other Raft nodes and try to re-establish the cluster, but it looks like DNS is only resolved the first time, or when followers re-join. -raft-adv-addr should probably be resolved on demand, and the IP not saved to the Raft DB.

Leader errors after killing all nodes and getting new IPs:

2020-11-26T09:05:49.321Z [INFO]  raft: Node at [::]:4002 [Candidate] entering Candidate state in term 29043
2020-11-26T09:05:49.322Z [ERROR] raft: Failed to make RequestVote RPC to {Voter node1 100.72.3.247:4002}: dial tcp 100.72.3.247:4002: connect: no route to host
2020-11-26T09:05:49.322Z [ERROR] raft: Failed to make RequestVote RPC to {Voter node1 100.72.3.247:4002}: dial tcp 100.72.3.247:4002: connect: no route to host
2020-11-26T09:05:49.323Z [ERROR] raft: Failed to make RequestVote RPC to {Voter node2 100.74.1.150:4002}: dial tcp 100.74.1.150:4002: connect: no route to host
2020-11-26T09:05:49.323Z [ERROR] raft: Failed to make RequestVote RPC to {Voter node2 100.74.1.150:4002}: dial tcp 100.74.1.150:4002: connect: no route to host
2020-11-26T09:05:50.589Z [WARN]  raft: Election timeout reached, restarting election

node0

rqlited \
  -node-id=node0 \
  -http-addr=0.0.0.0:4001 \
  -raft-addr=0.0.0.0:4002 \
  -http-adv-addr=rqlite-0.rqlite:4001 \
  -raft-adv-addr=rqlite-0.rqlite:4002 \
  /data

node1 and node2

rqlited \
  -node-id=node1 \
  -http-addr=0.0.0.0:4001 \
  -raft-addr=0.0.0.0:4002 \
  -http-adv-addr=rqlite-1.rqlite:4001 \
  -raft-adv-addr=rqlite-1.rqlite:4002 \
  -join=http://rqlite-0.rqlite:4001 \
  /data

rqlited \
  -node-id=node2 \
  -http-addr=0.0.0.0:4001 \
  -raft-addr=0.0.0.0:4002 \
  -http-adv-addr=rqlite-2.rqlite:4001 \
  -raft-adv-addr=rqlite-2.rqlite:4002 \
  -join=http://rqlite-0.rqlite:4001 \
  /data

/status?pretty

...
"leader": {
    "addr": "[::]:4002",
    "node_id": "node0"
},
"metadata": {
    "node0": {
        "api_addr": "rqlite-0.rqlite:4001",
        "api_proto": "http"
    },
    "node1": {
        "api_addr": "rqlite-1.rqlite:4001",
        "api_proto": "http"
    },
    "node2": {
        "api_addr": "rqlite-2.rqlite:4001",
        "api_proto": "http"
    }
},
"node_id": "node0",
"nodes": [
    {
        "id": "node0",
        "addr": "[::]:4002"
    },
    {
        "id": "node1",
        "addr": "100.72.3.247:4002"  <--- Should be DNS
    },
    {
        "id": "node2",
        "addr": "100.74.1.150:4002"  <--- Should be DNS
    }
],
...

Is this expected behavior, or am I just using the flags wrong? Any advice would be much appreciated!

adrianchifor (Author) commented:

Found this workaround https://github.com/techyugadi/kubestash/blob/master/stateful/rqlite/rqlitests.yml#L80

Followers remove themselves from the Raft nodes list before dying, but this fails if the leader is down or not responding.

It also feels like a hack to paper over the underlying problem: the -raft-adv-addr DNS name is never re-resolved.
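For illustration, here is a Go sketch of the idea behind that preStop hook; the /remove endpoint, node ID, and leader URL are assumptions for illustration, not taken from the manifest:

package main

import (
    "bytes"
    "fmt"
    "net/http"
)

func main() {
    // Before the pod dies, ask the cluster to drop this node's ID from
    // the Raft configuration. "node1" and the URL are placeholders.
    body := bytes.NewBufferString(`{"id": "node1"}`)
    req, err := http.NewRequest(http.MethodDelete, "http://rqlite-0.rqlite:4001/remove", body)
    if err != nil {
        panic(err)
    }
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        // The weakness noted above: if the leader is down or not
        // responding, the removal never happens.
        panic(err)
    }
    defer resp.Body.Close()
    fmt.Println(resp.Status)
}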

otoolep commented Nov 26, 2020

So this is a Hashicorp Raft thing, not an rqlite thing. For some reason this is the way it's always worked -- it doesn't keep hostnames at the Raft layer, but keeps resolved IP addresses.

What you are indicating with "Should be DNS" is coming from Hashicorp code.

I'm not entirely sure why it works like this, but it always has. The code in question is what powers Hashicorp Consul, which is a well-established piece of software. Perhaps some research on how Consul handles this might be the answer? Presumably whatever is the right way to handle nodes coming back up with different IP addresses in Consul can be applied to rqlite.

otoolep commented Nov 26, 2020

Here is the rqlite code that creates that output you reference:

https://github.com/rqlite/rqlite/blob/v5.6.0/store/store.go#L394

Note the call to GetConfiguration(). At no point does rqlite resolve hostnames and pass the resultant IP addresses to the Raft layer. It is the Hashicorp layer doing that.
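For illustration, a minimal sketch (not rqlite's actual code) of reading that configuration back through the hashicorp/raft API:

package example

import (
    "fmt"

    "github.com/hashicorp/raft"
)

// printMembership dumps cluster membership exactly as the Raft layer
// stores it; r is an already-running *raft.Raft instance.
func printMembership(r *raft.Raft) error {
    future := r.GetConfiguration()
    if err := future.Error(); err != nil {
        return err
    }
    for _, srv := range future.Configuration().Servers {
        // srv.Address is whatever ServerAddress the Raft layer recorded;
        // rqlite reports it verbatim in /status, so an IP here means the
        // Raft layer stored an IP, not a hostname.
        fmt.Println(srv.ID, srv.Address)
    }
    return nil
}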

otoolep commented Nov 26, 2020

I'll double-check my work, just to be sure. It's been a while since I looked at the networking layer of rqlite.

otoolep commented Nov 26, 2020

Well, well, I forgot how my own code works:

https://github.com/rqlite/rqlite/blob/master/cluster/join.go#L58

The rqlite layer does resolve addresses before sending the details to the node it is joining. However, I still believe Hashicorp Raft takes this address, resolves it, and stores the result in its internal config.
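Roughly the kind of up-front resolution that join code performs, as a standalone sketch (a paraphrase, not the actual rqlite source):

package main

import (
    "fmt"
    "net"
)

func main() {
    // The advertised Raft address, as passed via -raft-adv-addr
    // (the hostname here is illustrative).
    advAddr := "rqlite-1.rqlite:4002"

    // Resolve the hostname once, up front...
    tcpAddr, err := net.ResolveTCPAddr("tcp", advAddr)
    if err != nil {
        panic(err)
    }

    // ...and it is this IP:port form (e.g. "100.72.3.247:4002") that
    // is sent on in the join request -- the hostname is lost.
    fmt.Println(tcpAddr.String())
}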

otoolep commented Nov 26, 2020

Specific source code in v5.6.0: https://github.com/rqlite/rqlite/blob/v5.6.0/cluster/join.go#L58

otoolep commented Nov 26, 2020

Trying out removing the resolution in this PR: #697

otoolep commented Nov 26, 2020

I have removed the resolve operation. However, the original statement I made about the Hashicorp Raft layer still applies. rqlite now calls this function:

https://godoc.org/github.com/hashicorp/raft#Raft.AddVoter

with whatever is the advertised address for the joining node (hostname or IP -- hostname in your case). See:

f = s.raft.AddVoter(raft.ServerID(id), raft.ServerAddress(addr), 0, 0)

I might be misinterpreting what I'm seeing; I'll continue looking into this.

adrianchifor (Author) commented:

Wow, that was quick -- thanks so much for looking into it! I had a suspicion it was the Raft lib usage and was about to dig deeper, so I'm happy to see it's sorted.

I assume this will go into v5.6.1? I'll test it tomorrow morning if you can publish the Docker image.

otoolep commented Nov 27, 2020

I can release a new version, but I'm not convinced you'll see anything different. When I test with this change in place, it makes no difference. The Raft layer is still using IP addresses. That is why I need to look into it more, and see if there is still something I'm doing wrong.

In the meantime you might like to do some research and see how the community works with Hashicorp Consul, since it is built on the same Raft library.

https://www.consul.io/docs/agent/options.html

adrianchifor (Author) commented:

I have Consul running in the same cluster, so I'll check its configuration and report back if I find a resolution. It would still be worth testing with the new version.

otoolep commented Nov 28, 2020

Thanks @adrianchifor -- that would be great. You're working in an area I don't know a huge amount about (rqlite and k8s), so any guidance you can provide to make rqlite work better there would be much appreciated.

otoolep commented Nov 29, 2020

I looked into the Hashicorp Raft code, and when it comes to the leader it deals in network addresses, not hostnames. You can see this here:

Creation of Raft node: https://github.com/hashicorp/raft/blob/v1.2.0/api.go#L445
Where a node sets its own "Server Address": https://github.com/hashicorp/raft/blob/v1.2.0/api.go#L489

This second link is a call into the Go networking library, specifically the Addr() method of:

https://golang.org/pkg/net/#Listener

This returns a network address, not a hostname.

So the latest change I put in place will "fix" it for followers, but not for the leader. I'm not sure why the Raft library works like this.
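A tiny example of the Listener behaviour in question:

package main

import (
    "fmt"
    "net"
)

func main() {
    // Listen on a hostname, then ask the listener for its own address.
    ln, err := net.Listen("tcp", "localhost:4002")
    if err != nil {
        panic(err)
    }
    defer ln.Close()

    // Prints a resolved address such as "127.0.0.1:4002", never
    // "localhost:4002" -- Addr() returns the network address, which is
    // what Raft ends up storing for the local node.
    fmt.Println(ln.Addr().String())
}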

otoolep commented Nov 29, 2020

Here is the status output with my latest changes in place (on master):

        "nodes": [
            {
                "id": "localhost:4002",
                "addr": "127.0.0.1:4002"
            },
            {
                "id": "localhost:4004",
                "addr": "localhost:4004"
            }
        ],

As you can see, node 2's addr is set to the hostname, but the addr for the first node (the leader, which was brought up first) is a network address.

otoolep commented Nov 29, 2020

One clean way to deal with this, assuming the nodes know where the leader is when they come up, is to attempt an explicit re-join using the same node ID but the new IP address. When the leader sees a join request with a known ID but a new IP address, it removes that node first and then re-adds it -- all as part of the Join operation. You can see that logic here:

https://github.com/rqlite/rqlite/blob/v5.6.0/store/store.go#L647

This means you don't need to worry about whether the leader is up when the nodes die -- you can clean up when you re-join (and if the leader isn't up when you attempt to re-join, your join attempt is moot anyway).
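Sketched against the hashicorp/raft API (a paraphrase of the linked store.go logic, not a quote):

package example

import (
    "github.com/hashicorp/raft"
)

// rejoin sketches the leader-side handling of a join request from a
// node whose ID is already known but whose address has changed:
// remove the stale entry, then re-add the node at its new address.
func rejoin(r *raft.Raft, id, newAddr string) error {
    // In the linked store.go this removal only happens when the stored
    // address differs from the one in the incoming join request.
    if err := r.RemoveServer(raft.ServerID(id), 0, 0).Error(); err != nil {
        return err
    }
    return r.AddVoter(raft.ServerID(id), raft.ServerAddress(newAddr), 0, 0).Error()
}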

FWIW, it's safe to tell a node to join a cluster it's already a member of.

Might that help?

otoolep commented Nov 29, 2020

I'm going to revert the previous changes for now, as folks may be relying on the current behaviour, and the change isn't doing what we hope. I feel like I'm missing something here, and there is a different way to address the issue you are seeing.

otoolep commented Feb 5, 2022

Fixed in 7.3.0.
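For reference, hashicorp/raft does expose a hook for exactly this kind of on-demand lookup: a transport built with a ServerAddressProvider asks for a node's address each time it dials, instead of trusting the address stored in the Raft configuration. A minimal sketch of that hook (an illustration of the mechanism, not necessarily how #993 implemented the fix; the ID-to-DNS mapping is hypothetical):

package example

import (
    "net"
    "time"

    "github.com/hashicorp/raft"
)

// dnsProvider maps Raft node IDs to stable DNS names, so the transport
// re-resolves a node's hostname on every dial.
type dnsProvider struct {
    names map[raft.ServerID]raft.ServerAddress
}

func (p dnsProvider) ServerAddr(id raft.ServerID) (raft.ServerAddress, error) {
    // Returning a hostname here means a pod that comes back with a new
    // IP is found again via DNS, regardless of what Raft has stored.
    return p.names[id], nil
}

func newTransport(bind string) (*raft.NetworkTransport, error) {
    advertise, err := net.ResolveTCPAddr("tcp", bind)
    if err != nil {
        return nil, err
    }
    provider := dnsProvider{names: map[raft.ServerID]raft.ServerAddress{
        "node0": "rqlite-0.rqlite:4002", // hypothetical StatefulSet DNS names
        "node1": "rqlite-1.rqlite:4002",
        "node2": "rqlite-2.rqlite:4002",
    }}
    return raft.NewTCPTransportWithConfig(bind, advertise, &raft.NetworkTransportConfig{
        ServerAddressProvider: provider,
        MaxPool:               3,
        Timeout:               10 * time.Second,
    })
}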
