First seed node not rejoining cluster #25361
Comments
|
For reference, same issue in forum: https://discuss.lightbend.com/t/first-seed-node-not-rejoining-cluster/1637 |
|
We reproduced it with akka.remote and akka.cluster debug enabled. I've attached the full logs for each node during the time node1 was restarted. The IPs are: node1: 172.28.5.1 According to the output, the bottom line is that node1 was able to send messages (InitJoin) to nodes 2 and 3 but no messages from nodes 2 and 3 were received by node1 (InitJoinAck, Heartbeat). This differs slightly from the failure case described above, in which node1 did actually receive one InitJoinAck message. |
|
@tpantelis Thanks for the logs. I can confirm what you said: node1 doesn't receive the messages from node2 and node3. Since it happens for both node2 and node3, I don't think it's one of those nasty race condition bugs that we have been hunting over the years in classic remoting. Perhaps you can try this config: That will open a new TCP connection for the replies from node2 and node3 instead of reusing the inbound TCP connection. It might be a workaround, or we might learn something from other error messages (if any). Can we completely rule out a bug or misconfiguration in the OpenDaylight network stack? Could packets be dropped in one direction of a TCP connection for some reason? Could you use tcpdump/wireshark or similar to investigate what is sent/received at a lower level? |
|
Thanks for the reply @patriknw. I am reproducing this with a 3-node OpenDaylight cluster in 3 docker containers on a single laptop. I also learned today (from @tpantelis) that Artery is using UDP, so I can also try with that configuration. We can see if there is something at a lower level in ODL with regard to what is sent/received. This is definitely a fun one :) |
|
FYI Artery can use UDP or TCP. |
|
We were able to get a packet capture. It basically shows what we've seen in the logs, i.e. node1 is sending out InitJoin messages to 2 & 3 and InitJoinAck messages are coming back. Akka Heartbeat messages are also incoming from 2 & 3, along with our (OpenDaylight) AppendEntries heartbeats. So node1 is sending out messages, and messages from 2 & 3 are getting to node1 (so there are no network issues) but are not getting processed for some reason. I attached the zipped pcap file. |
|
Hey guys, is there some module we can enable debugs on to confirm at the akka level that it |
|
We've already enabled debug which logs all incoming and outgoing messages. I assume we'd have to get down to the lowest layer which actually reads from the socket. We also have a packet capture for a successful run where it did re-join (I forgot to include that before). In this case, node1 sent an InternalClusterAction$Join message to node2 after receiving the InitJoinAck. |
|
Thanks for those traces; they let us narrow in on this. The next step would probably be to add some additional logging to Akka and run with a snapshot of that, for example in https://github.com/akka/akka/blob/master/akka-remote/src/main/scala/akka/remote/transport/netty/TcpSupport.scala#L50 and https://github.com/akka/akka/blob/master/akka-remote/src/main/scala/akka/remote/Endpoint.scala#L938 |
|
@tpantelis I see that we never got further here. Is this still a problem or should we close? Have you tried with Artery TCP? |
|
The problem has not been seen with Artery UDP (was not tried with Artery TCP). It's still a problem with netty TCP but it doesn't seem there's much point in pursuing that code base. |
|
Thanks for the update. Closing. |
... maybe it could be because migrating from netty to Artery seems to have a code impact in some cases.
akka version: 2.5.11
In a 3-node cluster setup, upon killing and restarting the first seed node (as listed in the akka.conf), we sometimes see that the node does not or is not allowed to rejoin the cluster. After the seed node timeout, which we have set to 12s, it joins itself and forms a cluster island. This is automated in a robot test environment.
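The join decision described above can be sketched as a small Java model. This is only an illustration under simplifying assumptions, not Akka's implementation: the class `FirstSeedJoinSketch`, the `Ack` type, and `chooseJoinTarget` are hypothetical names, and the model covers only the InitJoinAck/timeout decision. It deliberately leaves out the subsequent Join handshake, so it does not by itself explain a run in which an InitJoinAck is logged and the node still ends up joining itself.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Simplified, hypothetical model of the first seed node's join decision. */
final class FirstSeedJoinSketch {

    /** An InitJoinAck that the joining node actually processed (hypothetical type). */
    static final class Ack {
        final String from;             // address of the seed node that replied
        final Duration processedAfter; // time elapsed since the join attempt began

        Ack(String from, Duration processedAfter) {
            this.from = from;
            this.processedAfter = processedAfter;
        }
    }

    /**
     * Returns the sender of the earliest ack processed before the timeout.
     * If no ack was processed in time, the node joins itself (a cluster island).
     */
    static String chooseJoinTarget(String self, List<Ack> processedAcks, Duration seedNodeTimeout) {
        return processedAcks.stream()
            .filter(ack -> ack.processedAfter.compareTo(seedNodeTimeout) < 0)
            .min(Comparator.comparing((Ack ack) -> ack.processedAfter))
            .map(ack -> ack.from)
            .orElse(self);
    }

    public static void main(String[] args) {
        Duration timeout = Duration.ofSeconds(12); // the 12s seed node timeout from this report
        String node1 = "10.30.170.94";
        String node2 = "10.30.170.26";

        // An ack from node2 is processed in time, so node1 joins node2.
        List<Ack> acks = Arrays.asList(new Ack(node2, Duration.ofSeconds(1)));
        System.out.println(chooseJoinTarget(node1, acks, timeout)); // prints 10.30.170.26

        // No ack is processed before the timeout, so node1 joins itself.
        System.out.println(chooseJoinTarget(node1, Collections.<Ack>emptyList(), timeout)); // prints 10.30.170.94
    }
}
```

The point of the model is that replies which reach the host but are never processed by the joining node look the same to it as no replies at all.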
As an example, we have node1 (first seed node - address 10.30.170.94), node2 (10.30.170.26), and node3 (10.30.170.84). Initially all the nodes are started and connected and the cluster is working correctly. Then a kill -9 is issued to node1. Both node2 and node3 lose connection and mark node1 as unreachable as expected. On node2, we see:
On node3:
After restarting node1, we see that it sends InitJoin messages to node2 and node3 and actually reports an InitJoinAck from node2 but eventually joins itself after 22s:
On node2, we see InitJoin messages incoming and InitJoinAcks outgoing and that it noticed that node1 was restarted with a new incarnation, marked it as Down, and stated it was removing the previous unreachable incarnation:
followed by several more InitJoin/InitJoinAck sequences...
node3 shows similar InitJoin/InitJoinAck sequences.
I can't see any reason from this output why node1 was not allowed to rejoin. Clearly there was network communication between node1 and node2/node3 after the restart, and node2 and node3 remained connected (i.e. there were no disconnect/disassociation log messages reported). So the old incarnation of node1 should be removed (as was reported in the logs) and the new incarnation should be allowed to join.
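The expected sequence (old incarnation marked Down, removed, then the new incarnation admitted) can be sketched as follows. This is a toy model based only on the behaviour described in this report; `IncarnationSketch`, `handleJoin`, and `removeDownedMembers` are hypothetical names and the numeric uids are made up for illustration. It is not how Akka's membership code is structured.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model: a restarted node keeps its address but gets a new uid (incarnation). */
final class IncarnationSketch {

    enum Status { UP, DOWN }

    static final class Member {
        final long uid;
        Status status = Status.UP;

        Member(long uid) {
            this.uid = uid;
        }
    }

    private final Map<String, Member> membersByAddress = new HashMap<>();

    /** Returns true if the join is accepted now, false if the joiner has to retry later. */
    boolean handleJoin(String address, long uid) {
        Member existing = membersByAddress.get(address);
        if (existing == null) {
            membersByAddress.put(address, new Member(uid));
            return true;
        }
        if (existing.uid == uid) {
            return true; // repeated join from the same incarnation
        }
        // Same address, different uid: mark the old incarnation Down and make the joiner retry.
        existing.status = Status.DOWN;
        return false;
    }

    /** Removes members that were marked Down, freeing their addresses for new incarnations. */
    void removeDownedMembers() {
        membersByAddress.values().removeIf(member -> member.status == Status.DOWN);
    }

    public static void main(String[] args) {
        IncarnationSketch cluster = new IncarnationSketch();
        String node1 = "10.30.170.94";

        System.out.println(cluster.handleJoin(node1, 1L)); // true: first incarnation joins
        System.out.println(cluster.handleJoin(node1, 2L)); // false: old incarnation is only marked Down
        cluster.removeDownedMembers();
        System.out.println(cluster.handleJoin(node1, 2L)); // true: new incarnation is admitted
    }
}
```

In this model the rejoin only succeeds if the restarted node keeps retrying and processes the replies, which is the step that appears to fail in the logs described here.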
Note this happens sporadically - most of the time it works as expected.
Also note that this only seems to happen when the node is killed - we haven’t seen it when the node is shut down gracefully.