IPv6 TCP connections broken / ACK packet not forwarded in case of service cluster-ip node/remote-pod #17941
Thanks for the report, @mdaur !
Do you happen to know if they actually leave the node, meaning dropped somewhere on the network before reaching node2?
Is it correct to assume that you did a similar packet-by-packet analysis with tcpdump as above, tracing a few subsequent packets, and that those do make it to node2's backend? Meaning, with exactly the same SNAT config, service VIP, etc., they pass through just fine? Also, could you attach the pcap from the node1 tcpdump session for further analysis? One other question on 'cluster-ip node/remote-pod': I presume you run the agent with
Hello borkmann,
Yes, I did the captures for multiple services and multiple times, with the same result every time: the ACK packet was always lost (across multiple runs and different services). In the scenario described, node1 and node2 were connected by an unmanaged switch without any filters, so it is very unlikely that the packets were dropped on the L2 switch. I did miss the opportunity to verify with a mirror port on another switch that the packets really never leave the node. Still, I am fairly sure they did not, because in the meantime I was able to make it work by switching to kernel 5.10.80 (Ubuntu mainline).
Yes, see the attached pcap ipv6.zip. Packets 9 and 10 (stream 0 inbound, stream 1 NATted) show the lost ACK packet, or at least the packet that did not make it to node2.
Yes, btw I ran the agent with the option bpf-lb-external-clusterip, via the Helm value bpf.lbExternalClusterIP: true.
That helped, I think I can see the issue. See the ICMPv6 one at pkt 6: it looks like the router is doing the forwarding for us, but not for subsequent packets. Could you try with the latest image or with v1.11.0-rc2 to see if it is fixed there? We reworked the neighbor cache there, and it should address this.
@mdaur any progress wrt the above? Thx
I will share my outcome by the end of tomorrow. Sorry.
@borkmann I can now confirm that the ICMPv6 redirects are no longer there and that TCP sessions work well for IPv6 when the traffic needs to be forwarded to a pod on another node. E.g. origin 2003:a:611:9600:c9fc:8043:ad2:4c5d, forwarding node 2003:a:611:9600::21, pod on remote node 2003:a:611:9612::11, service IP 2003:a:611:9605::9abf tcp/2000. Tested with quay.io/cilium/cilium:v1.11.0-rc3 (kernel 5.4.0-90-generic, default Ubuntu focal amd64):

22:31:06.064025 IP6 2003:a:611:9600:c9fc:8043:ad2:4c5d.33148 > 2003:a:611:9605::9abf.2000: Flags [S], seq 339812777, win 28800, options [mss 1440,sackOK,TS val 3699472835 ecr 0,nop,wscale 6], length 0
22:31:06.064093 IP6 2003:a:611:9600::21.33148 > 2003:a:611:9612::11.2000: Flags [S], seq 339812777, win 28800, options [mss 1440,sackOK,TS val 3699472835 ecr 0,nop,wscale 6], length 0
22:31:06.064457 IP6 2003:a:611:9612::11.2000 > 2003:a:611:9600::21.33148: Flags [S.], seq 2484932767, ack 339812778, win 64260, options [mss 1440,sackOK,TS val 1775479634 ecr 3699472835,nop,wscale 7], length 0
22:31:06.064497 IP6 2003:a:611:9605::9abf.2000 > 2003:a:611:9600:c9fc:8043:ad2:4c5d.33148: Flags [S.], seq 2484932767, ack 339812778, win 64260, options [mss 1440,sackOK,TS val 1775479634 ecr 3699472835,nop,wscale 7], length 0
22:31:06.065251 IP6 2003:a:611:9600:c9fc:8043:ad2:4c5d.33148 > 2003:a:611:9605::9abf.2000: Flags [.], ack 1, win 450, options [nop,nop,TS val 3699472837 ecr 1775479634], length 0
22:31:06.065310 IP6 2003:a:611:9600::21.33148 > 2003:a:611:9612::11.2000: Flags [.], ack 1, win 450, options [nop,nop,TS val 3699472837 ecr 1775479634], length 0
22:31:06.066044 IP6 2003:a:611:9600:c9fc:8043:ad2:4c5d.33148 > 2003:a:611:9605::9abf.2000: Flags [P.], seq 1:93, ack 1, win 450, options [nop,nop,TS val 3699472837 ecr 1775479634], length 92
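A quick way to sanity-check such a trace mechanically is to compare the packet count of the pre-NAT flow (client <-> service VIP) against the post-NAT flow (node <-> pod): if the datapath forwarded every packet, the counts match, and a deficit on the post-NAT side points at the dropped packet (the ACK in the broken case). A minimal sketch over the six handshake packets from the trace above, abbreviated to their address/flag fields (`trace.txt` is a hypothetical file name, not anything Cilium produces):

```shell
# Save the handshake lines from the trace above (pre- and post-NAT
# interleaved, as tcpdump printed them on the forwarding node).
cat > trace.txt <<'EOF'
22:31:06.064025 IP6 2003:a:611:9600:c9fc:8043:ad2:4c5d.33148 > 2003:a:611:9605::9abf.2000: Flags [S], seq 339812777, length 0
22:31:06.064093 IP6 2003:a:611:9600::21.33148 > 2003:a:611:9612::11.2000: Flags [S], seq 339812777, length 0
22:31:06.064457 IP6 2003:a:611:9612::11.2000 > 2003:a:611:9600::21.33148: Flags [S.], seq 2484932767, length 0
22:31:06.064497 IP6 2003:a:611:9605::9abf.2000 > 2003:a:611:9600:c9fc:8043:ad2:4c5d.33148: Flags [S.], seq 2484932767, length 0
22:31:06.065251 IP6 2003:a:611:9600:c9fc:8043:ad2:4c5d.33148 > 2003:a:611:9605::9abf.2000: Flags [.], ack 1, length 0
22:31:06.065310 IP6 2003:a:611:9600::21.33148 > 2003:a:611:9612::11.2000: Flags [.], ack 1, length 0
EOF

# The pre-NAT leg always touches the service VIP; the post-NAT leg
# always touches the backend pod address.
pre=$(grep -c '2003:a:611:9605::9abf' trace.txt)
post=$(grep -c '2003:a:611:9612::11' trace.txt)
echo "pre=$pre post=$post"   # equal counts: every packet was forwarded
```

In the broken v1.10.5 capture the same check would show one more pre-NAT packet (the client's ACK) than post-NAT packets.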
Is there an existing issue for this?
What happened?
I am facing an issue in a dual-stack Cilium setup (v1.10.5-b0836e8) with full kube-proxy replacement (strict). The nodes use a BGP setup and all routes are exchanged (cluster/node CIDR); the service CIDR is advertised for ECMP.
Issue: all packets are dropped after SYN, ACK (IPv6 only); the TCP session cannot be established:
IPv4 (same service, same pod) works well:
tcpdump node1 (ingress; eBPF SNAT works well): 2003:a:611:9600::76 (ingress client) -> 2003:a:611:9600::21 (node1) -> 2003:a:611:9611::2 (remote pod)
<- subsequent packets do not arrive at node2, where pod 2003:a:611:9611::2.2000 resides ->
tcpdump node2 (remote pod 2003:a:611:9611::2 retransmits the SYN/ACK due to the missing ACK response)
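For reference, captures like the ones summarized above can be taken on either node with something along these lines (a sketch, not the exact command used; `-i any` sidesteps guessing the uplink device, and the addresses/port are the ones from this failing flow):

```shell
# Run as root on node1 or node2; matches both the pre-NAT (client) and
# post-NAT (pod) legs of the failing IPv6 TCP flow on port 2000.
tcpdump -ni any 'ip6 and tcp port 2000 and (host 2003:a:611:9600::76 or host 2003:a:611:9611::2)'
```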
Summary:
Cilium Version
v1.10.5-b0836e8
Kernel Version
Linux node11 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
Rancher K3s v1.21.5+k3s2
Sysdump
No response
Relevant log output
Anything else?
For sure I can share all the debug logs/traces from cilium-bugtool, but as a first touchpoint I think that would be a bit too much. How can I figure out where the ACK packets, e.g. in the trace above, get lost?
On the other hand, IPv6 UDP services work well (service cluster-ip / host -> remote pod).
Any thoughts on how to debug further are highly appreciated.
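A few things that usually help to localize where such packets get lost (suggestions under the assumption of a standard Cilium DaemonSet install in kube-system and a kubectl new enough to exec into `ds/cilium`; adjust to your deployment):

```shell
# Watch datapath drops with their reason codes while reproducing:
kubectl -n kube-system exec ds/cilium -- cilium monitor --type drop

# Inspect the NAT and connection-tracking entries for the flow:
kubectl -n kube-system exec ds/cilium -- cilium bpf nat list
kubectl -n kube-system exec ds/cilium -- cilium bpf ct list global

# Since ICMPv6 redirects turned out to be involved here, also check the
# kernel neighbor cache on the forwarding node for the next hop:
ip -6 neigh show
```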
/martin