
BPF host routing bypasses DNS resolution of dockerd on Kind clusters #23283

Closed
brb opened this issue Jan 24, 2023 · 4 comments · Fixed by #24713
Assignees
brb
Labels
  • kind/bug/CI: This is a bug in the testing code.
  • kind/question: Frequently asked questions & answers. This issue will be linked from the documentation's FAQ.
  • sig/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.

Comments


brb commented Jan 24, 2023

While debugging #23171, I've discovered that BPF masquerade is not working properly on kernels newer than 5.4 when running the ci-datapath workflow. The main symptom was failing requests to https://one.one.one.one. A closer look revealed that DNS resolution for external FQDNs was failing.

With the help of tcpdump running on the node hosting the CoreDNS pod (kind-control-plane), I captured the following trace with iptables masquerading:

eth0  In  IP 10.244.1.217.39869 > 10.244.0.74.53: 33504+ [1au] A? googlel.com. (52)
lxc61ce9e2c0db2 Out IP 10.244.1.217.39869 > 10.244.0.74.53: 33504+ [1au] A? googlel.com. (52)
lxc61ce9e2c0db2 In  IP 10.244.0.74.40127 > 172.12.1.1.53: 57830+ [1au] A? googlel.com. (52)
eth0  Out IP 172.12.1.4.40235 > 8.8.8.8.53: 57830+ [1au] A? googlel.com. (52)
eth0  In  IP 8.8.8.8.53 > 172.12.1.4.40235: 57830 0/1/1 (122)

With BPF masquerading, I got the following instead:

eth0  In  IP 10.244.1.217.42393 > 10.244.0.74.53: 6280+ [1au] A? googlel.com. (52)
lxc61ce9e2c0db2 In  IP 10.244.0.74.57566 > 172.12.1.1.53: 46622+ [1au] A? googlel.com. (52)
eth0  Out IP 172.12.1.4.57566 > 172.12.1.1.53: 46622+ [1au] A? googlel.com. (52)
  • 10.244.1.217 - the client pod.
  • 10.244.0.74 - the CoreDNS pod (lxc61ce9e2c0db2).
  • 172.12.1.4 - the CoreDNS node IP (eth0).
  • 172.12.1.1 - the Docker bridge IP address which connects the Kind nodes.
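
For reference, a capture like the ones above can be reproduced with something along these lines (a sketch; it assumes tcpdump is installed in the Kind node image and is recent enough to print the interface and direction columns when capturing on the any pseudo-device):

docker exec -it kind-control-plane tcpdump -nni any udp port 53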

From the diff of the traces it quickly became clear that this is not a BPF masquerade matter after all, but BPF host routing (which is automatically disabled when iptables masquerading is used), and that the following DNS translation done by Docker is bypassed with BPF host routing:

lxc61ce9e2c0db2 In  IP 10.244.0.74.40127 > 172.12.1.1.53: 57830+ [1au] A? googlel.com. (52)
eth0  Out IP 172.12.1.4.40235 > 8.8.8.8.53: 57830+ [1au] A? googlel.com. (52)

I still don't have a clue how the translation above happens.

@brb brb added the kind/bug/CI and sig/datapath labels Jan 24, 2023
@brb brb self-assigned this Jan 24, 2023
@brb brb closed this as completed Jan 24, 2023
@brb brb reopened this Jan 24, 2023

brb commented Jan 24, 2023

Looking into the filtered strace output of one of the dockerd threads that handle the DNS requests:

1154  recvmsg(69<UDP:[691492]>,  <unfinished ...>
1154  <... recvmsg resumed>{msg_name={sa_family=AF_INET, sin_port=htons(53969), sin_addr=inet_addr("10.244.0.74")}, msg_namelen=112 => 16, msg_iov=[{iov_base="\305\377\1\0\0\1\0\0\0\0\0\1\7googlel\3com\0\0\1\0\1\0\0)"..., iov_len=512}], msg_iovlen=1, msg_control=[{cmsg_len=28, cmsg_level=SOL_IP, cmsg_type=IP_PKTINFO, cmsg_data={ipi_ifindex=9, ipi_spec_dst=inet_addr("127.0.0.11"), ipi_addr=inet_addr("127.0.0.11")}}], msg_controllen=32, msg_flags=0}, 0) = 40
1154  futex(0xc0001a8948, FUTEX_WAKE_PRIVATE, 1) = 1
1154  recvmsg(69<UDP:[691492]>,  <unfinished ...>
1154  <... recvmsg resumed>{msg_namelen=112}, 0) = -1 EAGAIN (Resource temporarily unavailable)
1154  setns(14<net:[4026531840]>, CLONE_NEWNET <unfinished ...>
1154  <... setns resumed>)              = 0
1154  openat(AT_FDCWD</>, "/var/run/docker/netns/acf7be30eece", O_RDONLY|O_CLOEXEC <unfinished ...>
1154  <... openat resumed>)             = 26</run/docker/netns/acf7be30eece>
1154  setns(26</run/docker/netns/acf7be30eece>, CLONE_NEWNET) = 0
1154  socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 28<UDP:[3313658]>
1154  setsockopt(28<UDP:[3313658]>, SOL_SOCKET, SO_BROADCAST, [1], 4 <unfinished ...>
1154  <... setsockopt resumed>)         = 0
1154  connect(28<UDP:[3313658]>, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("8.8.8.8")}, 16 <unfinished ...>
1154  <... connect resumed>)            = 0
1154  epoll_ctl(5<anon_inode:[eventpoll]>, EPOLL_CTL_ADD, 28<UDP:[3313658]>, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=3036353360, u64=139898711118672}} <unfinished ...>
1154  <... epoll_ctl resumed>)          = 0
1154  getsockname(28<UDP:[3313658]>,  <unfinished ...>
1154  <... getsockname resumed>{sa_family=AF_INET, sin_port=htons(39684), sin_addr=inet_addr("172.12.1.4")}, [112 => 16]) = 0
1154  getpeername(28<UDP:[3313658]>,  <unfinished ...>
1154  <... getpeername resumed>{sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("8.8.8.8")}, [112 => 16]) = 0
1154  setns(14<net:[4026531840]>, CLONE_NEWNET) = 0
1154  close(26</run/docker/netns/acf7be30eece> <unfinished ...>
1154  <... close resumed>)              = 0
1154  write(28<UDP:[3313658]>, "\305\377\1\0\0\1\0\0\0\0\0\1\7googlel\3com\0\0\1\0\1\0\0)"..., 40 <unfinished ...>
1154  <... write resumed>)

It looks like the thread entered the netns of kind-control-plane and then performed the DNS resolution there.
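
For reference, output along these lines could be collected with an strace invocation like the following (a sketch; the exact syscall filter is my assumption):

strace -f -yy -p "$(pidof dockerd)" \
  -e trace=recvmsg,setns,openat,socket,setsockopt,connect,epoll_ctl,getsockname,getpeername,close,write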

The missing bit in the puzzle is how the following got redirected to the dockerd socket without leaving the netns (dockerd runs in the host netns, while kind-control-plane runs in its own):

lxc61ce9e2c0db2 In  IP 10.244.0.74.40127 > 172.12.1.1.53: 57830+ [1au] A? googlel.com. (52)


brb commented Jan 24, 2023

There we go, the last missing piece of the puzzle, found with the help of pwru:

0xffff92c424982a00      2        [coredns]  nf_csum_update	[nf_nat] sec-arg=0 netns=4026533199 mark=0x10970f00 ifindex=7 proto=8 mtu=1500 len=68 10.244.0.195:60074->172.12.1.1:53(udp)
0xffff92c424982a00      2        [coredns] inet_proto_csum_replace4 sec-arg=18446623969858038272 netns=4026533199 mark=0x10970f00 ifindex=7 proto=8 mtu=1500 len=68 10.244.0.195:60074->172.12.1.1:53(udp)
0xffff92c424982a00      2        [coredns] inet_proto_csum_replace4 sec-arg=18446623969858038272 netns=4026533199 mark=0x10970f00 ifindex=7 proto=8 mtu=1500 len=68 10.244.0.195:60074->172.12.1.1:53(udp)
0xffff92c424982a00      2        [coredns]       udp_v4_early_demux sec-arg=18446623969858038272 netns=4026533199 mark=0x10970f00 ifindex=7 proto=8 mtu=1500 len=68 10.244.0.195:60074->127.0.0.11:38359(udp)

This happens because of the following iptables rule:

[222:12858] -A DOCKER_OUTPUT -d 172.12.1.1/32 -p udp -j DNAT --to-destination 127.0.0.11:38359
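
For reference, the DNAT rule and its target socket can be inspected from inside the Kind node, and a trace like the one above can be reproduced with pwru (a sketch: the pwru flags depend on its version, ss availability in the node image is an assumption, and the owning process may not be visible due to PID namespaces):

docker exec kind-control-plane iptables-save -c -t nat | grep DOCKER_OUTPUT
docker exec kind-control-plane ss -ulpn 'sport = :38359'
pwru --output-tuple --output-meta 'udp and dst port 53'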


brb commented Jan 24, 2023

TL;DR: because BPF host routing makes the packet bypass that iptables rule, the packet is actually sent to the host netns. As there is no UDP socket listening on :53 there, the packet gets dropped.
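
To check whether a given agent actually runs with BPF host routing and BPF masquerading, something like the following should work (a sketch assuming the usual agent status output; the exact wording of the Routing and Masquerading lines varies between versions):

kubectl -n kube-system exec ds/cilium -- cilium status | grep -E 'Routing|Masquerading'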

brb added a commit that referenced this issue Jan 25, 2023
Instead of disabling the BPF masquerade. We can reenable it after [1]
has been fixed.

[1]: #23283

Signed-off-by: Martynas Pumputis <m@lambda.lt>
lmb added a commit that referenced this issue Mar 7, 2023
Host routing breaks testing in KIND clusters, see #23283. Instead
of disabling BPF masquerading, force legacy routing.

This will allow running egress gateway tests in CI, which require
BPF masquerading.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
lmb added a commit that referenced this issue Mar 8, 2023
Enable the egress gateway in some datapath workflows. Doing this is
a bit tricky, since EGW relies on BPF masquerading to function. The
latter has been disabled to work around #23283. Instead, we can
force legacy host routing which has a similar effect.

Unfortunately, BPF masquerading doesn't work for IPv6 so enabling
it breaks a bunch of testcases! The solution is to run half of
the tests with BPF masq and EGW disabled, and the other half with
EGW on, BPF masq on and IPv6 masq disabled.

Updates #24151.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
aspsk added a commit to aspsk/cilium that referenced this issue Apr 3, 2023
We are using our Kind provisioning script to create K8s clusters when testing
in the CI. Recently, we discovered that on some kernels a default DNS resolver,
which is dockerd, is troublesome for the BPF host routing, which we want to
test in the CI (cilium#23283).

Fix this by patching the coredns configmap after creating a kind cluster to
point to the 8.8.8.8 resolver. Alternative fixes (may still be applied later):

  * Pass a custom /etc/resolv.conf to kubelet via --resolv-conf in the Kind /
    kubeadm config.

  * Override /etc/resolv.conf of Kind nodes after creating a cluster (no race
    condition, as CoreDNS pods won't be started, as a CNI is not ready).

  * Patch Kind to allow users to specify custom DNS entries (i.e., docker run
    --dns="1.1.1.1,8.8.8.8").

Fixes: cilium#23283
Fixes: cilium#23330

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
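
The CoreDNS ConfigMap patch described in the commit above could look roughly like this (a sketch, assuming the stock kubeadm Corefile that forwards to /etc/resolv.conf):

kubectl -n kube-system get configmap coredns -o yaml \
  | sed 's#forward . /etc/resolv.conf#forward . 8.8.8.8#' \
  | kubectl apply -f -
kubectl -n kube-system rollout restart deployment coredns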
aspsk added a commit to aspsk/cilium that referenced this issue Apr 3, 2023
The cilium#23283 should have been fixed by the 3e8f697 ("contrib/kind: set custom
DNS resolver for Kind nodes") commit, so we can re-enable masquerading by
default and re-enable fast routing.

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>

gandro commented Jun 19, 2023

A note for people stumbling upon this issue like I have:

This has only been "fixed" when using our kind.sh script. If one uses plain kind or the helm/kind-action GitHub Action (as we do in some, but not all, GitHub workflows), then one still has to disable BPF host routing for the connectivity test to pass.
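
For anyone needing that workaround, a minimal sketch with Helm (the value names are assumptions based on the cilium/cilium chart; falling back to iptables masquerading also disables BPF host routing, as noted earlier in the thread):

helm upgrade cilium cilium/cilium -n kube-system --reuse-values \
  --set bpf.hostLegacyRouting=true
# or, alternatively:
helm upgrade cilium cilium/cilium -n kube-system --reuse-values \
  --set bpf.masquerade=false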

@gandro gandro added the kind/question label Nov 7, 2023
giorio94 added a commit that referenced this issue Jan 18, 2024
Currently, BPF masquerade was always disabled in the clustermesh
E2E tests due to unintended interactions with Docker iptables
rules breaking DNS resolution [1]. Instead, let's explicitly
configure external upstream DNS servers for coredns, so that we
can also enable this feature when KPR is enabled.

While being there, let's also make the KPR setting explicit,
instead of relying on the Cilium CLI configuration (which is based
on whether the kube-proxy daemonset is present or not).

[1]: #23283

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
nbusseneau pushed a commit that referenced this issue Feb 8, 2024
[ upstream commit a1089a7 ]

[ backporter's notes: we keep masquerade set to false on upgrade tests
  for 1.14 due to limitations outlined in
  #14350. However we still
  backport the rest of the changes as regular non-upgrade tests still
  benefit from it. ]

Currently, BPF masquerade was always disabled in the clustermesh
E2E tests due to unintended interactions with Docker iptables
rules breaking DNS resolution [1]. Instead, let's explicitly
configure external upstream DNS servers for coredns, so that we
can also enable this feature when KPR is enabled.

While being there, let's also make the KPR setting explicit,
instead of relying on the Cilium CLI configuration (which is based
on whether the kube-proxy daemonset is present or not).

[1]: #23283

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
julianwiedmann added a commit to julianwiedmann/cilium that referenced this issue Mar 4, 2024
cilium#23283 has been resolved, no need
to workaround it any longer.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
julianwiedmann added a commit to julianwiedmann/cilium that referenced this issue Mar 4, 2024
cilium#23283 is resolved by now.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
julianwiedmann added a commit to julianwiedmann/cilium that referenced this issue Mar 4, 2024
Clarify that while the issue was closed as resolved, this actually only
applies to scenarios where the kind.sh script is used.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>