v1.15 Backports 2024-02-27 #30997
Conversation
Force-pushed from d1fb98a to 1cb3d6d
[ upstream commit 32543a4 ] In Go 1.22, slices.CompactFunc clears the slice elements that get discarded. This makes TestSortedUniqueFunc fail when it runs after other tests that modify the input slice. Avoid this by having the test operate on a copy of the input slice instead of modifying it. Signed-off-by: Tobias Klauser <tobias@cilium.io>
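A minimal sketch of the fix pattern from this commit (the helper name and inputs are illustrative, not the actual test code): clone the slice before passing it to slices.CompactFunc, so the caller's backing array is never zeroed.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// compactCaseInsensitive removes consecutive case-insensitive duplicates
// without mutating its input: clone first, then compact. In Go 1.22,
// slices.CompactFunc zeroes the discarded tail of the slice it is given,
// which would be visible to anyone still holding the original slice.
func compactCaseInsensitive(in []string) []string {
	return slices.CompactFunc(slices.Clone(in), strings.EqualFold)
}

func main() {
	in := []string{"a", "A", "b"}
	out := compactCaseInsensitive(in)
	fmt.Println(in, out) // input preserved: [a A b] [a b]
}
```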
[ upstream commit 3441800 ] In Go 1.22, slices.Delete clears the slice elements that get discarded. This causes the slice containing the existing ranges in (*LBIPAM).handlePoolModified to be cleared while being looped over, leading to the following nil dereference in TestConflictResolution:

```
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ PANIC package: github.com/cilium/cilium/operator/pkg/lbipam • TestConflictResolution      ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1a8c814]

goroutine 22 [running]:
testing.tRunner.func1.2({0x1d5e400, 0x39e3fe0})
	/home/travis/.gimme/versions/go1.22.0.linux.arm64/src/testing/testing.go:1631 +0x1c4
testing.tRunner.func1()
	/home/travis/.gimme/versions/go1.22.0.linux.arm64/src/testing/testing.go:1634 +0x33c
panic({0x1d5e400?, 0x39e3fe0?})
	/home/travis/.gimme/versions/go1.22.0.linux.arm64/src/runtime/panic.go:770 +0x124
github.com/cilium/cilium/operator/pkg/lbipam.(*LBRange).EqualCIDR(0x400021d260?, {{0x24f5388?, 0x3fce4e0?}, 0x400012c018?}, {{0x1ea5e20?, 0x0?}, 0x400012c018?})
	/home/travis/gopath/src/github.com/cilium/cilium/operator/pkg/lbipam/range_store.go:151 +0x74
github.com/cilium/cilium/operator/pkg/lbipam.(*LBIPAM).handlePoolModified(0x400021d260, {0x24f5388, 0x3fce4e0}, 0x40000ed200)
	/home/travis/gopath/src/github.com/cilium/cilium/operator/pkg/lbipam/lbipam.go:1392 +0xfa0
github.com/cilium/cilium/operator/pkg/lbipam.(*LBIPAM).poolOnUpsert(0x400021d260, {0x24f5388, 0x3fce4e0}, {{0xffff88e06108?, 0x10?}, {0x4000088808?, 0x40003ea910?}}, 0x40000ed080?)
	/home/travis/gopath/src/github.com/cilium/cilium/operator/pkg/lbipam/lbipam.go:279 +0xe0
github.com/cilium/cilium/operator/pkg/lbipam.(*LBIPAM).handlePoolEvent(0x400021d260, {0x24f5388?, 0x3fce4e0?}, {{0x214e78e, 0x6}, {{0x400034d1d8, 0x6}, {0x0, 0x0}}, 0x40000ed080, ...})
	/home/travis/gopath/src/github.com/cilium/cilium/operator/pkg/lbipam/lbipam.go:233 +0x1d8
github.com/cilium/cilium/operator/pkg/lbipam.(*newFixture).UpsertPool(0x40008bfe18, 0x40002a4b60, 0x40000ed080)
	/home/travis/gopath/src/github.com/cilium/cilium/operator/pkg/lbipam/lbipam_fixture_test.go:177 +0x148
github.com/cilium/cilium/operator/pkg/lbipam.TestConflictResolution(0x40002a4b60)
	/home/travis/gopath/src/github.com/cilium/cilium/operator/pkg/lbipam/lbipam_test.go:56 +0x3fc
testing.tRunner(0x40002a4b60, 0x22a2558)
	/home/travis/.gimme/versions/go1.22.0.linux.arm64/src/testing/testing.go:1689 +0xec
created by testing.(*T).Run in goroutine 1
	/home/travis/.gimme/versions/go1.22.0.linux.arm64/src/testing/testing.go:1742 +0x318
FAIL	github.com/cilium/cilium/operator/pkg/lbipam	0.043s
```

Fix this by cloning the slice before iterating over it. Signed-off-by: Tobias Klauser <tobias@cilium.io>
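A simplified sketch of the clone-before-iterate pattern this commit applies (types and helper are illustrative, not the actual LBIPAM code): range over a clone, so that slices.Delete zeroing the tail of the original backing array cannot produce nil pointers mid-loop.

```go
package main

import (
	"fmt"
	"slices"
)

// lbRange stands in for the operator's range type.
type lbRange struct{ cidr string }

// dropMatching removes ranges selected by match. Iterating over a clone
// is the key point: in Go 1.22, slices.DeleteFunc zeroes the discarded
// tail of `ranges`, so ranging over the same backing array while deleting
// could dereference a nil *lbRange on a later iteration.
func dropMatching(ranges []*lbRange, match func(*lbRange) bool) []*lbRange {
	for _, r := range slices.Clone(ranges) {
		if match(r) {
			r := r
			ranges = slices.DeleteFunc(ranges, func(x *lbRange) bool { return x == r })
		}
	}
	return ranges
}

func main() {
	ranges := []*lbRange{{"10.0.0.0/24"}, {"10.0.1.0/24"}, {"10.0.2.0/24"}}
	kept := dropMatching(ranges, func(r *lbRange) bool { return r.cidr == "10.0.1.0/24" })
	for _, r := range kept {
		fmt.Println(r.cidr)
	}
}
```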
[ upstream commit cb15333 ] When an endpoint is created and the `EndpointChangeRequest` contains labels, endpoint regeneration might not be triggered, since regeneration is only triggered when labels change. That change is never detected when epTemplate.Labels is pre-populated with the same labels as the `EndpointChangeRequest`. This commit fixes the issue by not setting epTemplate.Labels. Fixes: #29776 Signed-off-by: Ondrej Blazek <ondrej.blazek@firma.seznam.cz>
[ upstream commit 7b4c0b0 ] Signed-off-by: Donnie McMahan <jmcmaha1@gmail.com>
[ upstream commit 329fefb ] The controller generates a log for every single reconciliation. This is noisy and not very useful: users don't care that a reconciliation happened, only about its outcome. Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 4c5f79d ] When users stop selecting a node with a CiliumBGPPeeringPolicy, the BGP Control Plane removes all running virtual router instances, but this is only reported at Debug level. Upgrade it to Info level, since this is important information that helps users investigate session disruptions caused by configuration mistakes. Also, the log was generated and a full reconciliation performed even when no previous policy was applied. This means that with no policy applied, any update to a relevant resource (e.g. a Service) would generate the log and perform a full withdrawal pointlessly. Introduce a flag indicating whether a previous policy exists, and trigger the log generation and full withdrawal only conditionally. Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 66e5de6 ] Remove noisy logs generated for every single reconciliation. Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit c00330c ] We don't need to show create/update/delete counts, because we already log every create/update/delete operation anyway. Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 4baab3d ] Remove a noisy log which will be generated for every single reconciliation from route policy reconciler. Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 148f81f ] Users can now easily check the current peering state with the `cilium bgp peers` command, so state transition logs have become relatively unimportant to them. Downgrade these logs to Debug level. Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 8fcfad9 ] When the host firewall is enabled in tunneling mode, pod to node traffic needs to be forwarded through the tunnel in order to preserve the security identity (as otherwise the source IP address would be SNATted), which is required to enforce ingress host policies. One tricky case is represented by node (or hostns pod) to pod traffic via services with local ExternalTrafficPolicy, when KPR is disabled. Indeed, in this case, the SYN packet is routed natively (as both the source and the destination are node IPs) to the destination node, and then DNATted to one of the backend IPs, without being SNATted at the same time. Yet, the SYN+ACK packet would then be incorrectly redirected through the tunnel (as the destination is a node IP, associated with a tunnel endpoint in the ipcache), hence breaking the connection, while it should be passed to the stack to be rev DNATted and then forwarded accordingly.

In detail, reporting the description from c8052a1, the broken packet path is node1 --VIP--> pod@node2 (VIP is node2IP):

- SYN leaves node1 via native device with node1IP -> VIP
- SYN is DNATed on node2 to node1IP -> podIP
- SYN is delivered to lxc device with node1IP -> podIP
- SYN+ACK is sent from lxc device with podIP -> node1IP
- SYN+ACK is redirected in BPF directly to cilium_vxlan
- SYN+ACK arrives on node1 via tunnel with podIP -> node1IP
- RST is sent because podIP doesn't match VIP

c8052a1 attempted to fix this issue for the kube-proxy+hostfw (and IPSec) scenarios by always passing the packets to the stack, so that it doesn't bypass conntrack. The IPSec specific workaround got then removed in 0a8f2c4, as that path asymmetry is no longer present. However, always passing packets to the stack breaks the host firewall policy enforcement for pod to node traffic, as at that point there's no route which redirects these packets back to the tunnel to preserve the security identity, and they get simply masqueraded and routed natively.
To prevent this issue, let's pass packets to the stack only if they are a reply with a destination identity matching a remote node, as in that case they may need to be rev DNATted. There are two possibilities at that point: (a) the destination is a CiliumInternalIP address, and the reply needs to go through the tunnel -- node routes ensure that the packet is first forwarded to cilium_host, before being redirected through the tunnel; (b) the destination is one of the other node addresses, and the reply needs to be forwarded natively according to the local routing table (as node to pod/node traffic never goes through the tunnel unless the source is a CiliumInternalIP address). Overall, this change addresses the externalTrafficPolicy=local service case, while still preserving encapsulation in all other cases. As a side effect, it also improves performance in the kube-proxy + hostfw case, as pod to pod traffic is now also redirected immediately through the tunnel, instead of being sent via the stack. Fixes: c8052a1 ("bpf: Do not bypass conntrack if running kube-proxy+hostfw or IPSec") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
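The decision described above can be sketched as a tiny predicate. This is only an illustration of the logic, in Go rather than the actual BPF C datapath, and the identity names are made up for the example:

```go
package main

import "fmt"

// identity is a stand-in for Cilium security identities; these constants
// are illustrative only, not the project's actual identity values.
type identity int

const (
	identityWorld identity = iota
	identityRemoteNode
	identityPod
)

// passToStack mirrors the commit's rule: hand the packet to the kernel
// stack (for potential reverse DNAT) only when it is a reply whose
// destination identity is a remote node. Everything else keeps the
// direct tunnel redirect that preserves the security identity.
func passToStack(isReply bool, dstID identity) bool {
	return isReply && dstID == identityRemoteNode
}

func main() {
	fmt.Println(passToStack(true, identityRemoteNode))  // reply to node: stack
	fmt.Println(passToStack(false, identityRemoteNode)) // SYN to node: tunnel
	fmt.Println(passToStack(true, identityPod))         // pod-to-pod reply: tunnel
}
```

The third case is where the side-effect performance win comes from: pod-to-pod replies no longer detour through the stack.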
[ upstream commit 29a7918 ] On IPv6-only clusters, querying localhost for the health check could attempt to check 127.0.0.1, presumably depending on host DNS configuration. As the health check does not listen on IPv4 when .Values.ipv4.enabled is false, this health check could fail. This patch uses the same logic as the bootstrap-config.json file to ensure a valid IP is always used for the health check. Fixes: #30968 Fixes: 859d2a9 ("helm: use /ready from Envoy admin iface for healthprobes on daemonset") Signed-off-by: Andrew Titmuss <iandrewt@icloud.com>
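A minimal sketch of the address-selection idea, assuming the chart simply picks a loopback address for an enabled IP family (the function name and shape are illustrative, not the Helm template's actual helpers):

```go
package main

import "fmt"

// healthProbeAddress picks a loopback address matching an enabled IP
// family, so an IPv6-only cluster probes ::1 instead of 127.0.0.1, on
// which nothing is listening there. Returns "" if neither family is
// enabled (an invalid configuration).
func healthProbeAddress(ipv4Enabled, ipv6Enabled bool) string {
	if ipv4Enabled {
		return "127.0.0.1"
	}
	if ipv6Enabled {
		return "::1"
	}
	return ""
}

func main() {
	fmt.Println(healthProbeAddress(false, true)) // IPv6-only cluster
}
```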
Force-pushed from 1cb3d6d to 169de9a
My commit looks good, thanks!
My part looks good :) Thanks.
Thanks and LGTM for #30970
/lgtm Thanks.
/test-backport-1.15
Once this PR is merged, a GitHub action will update the labels of these PRs:
#30635 — divisor for GOMEMLIMIT (@jdmcmahan)