v1.13 Backports 2023-11-27 #29391
/test-1.19-4.19 |
/test-1.21-4.19 |
/test-runtime |
[ upstream commit db33679 ]

This commit extends the `Configure` method of `RoutingInfo` with a flag to skip the creation of the ingress rule. The ingress rule is needed for endpoints such that those are forwarded via the `main` routing table. But for the `cilium_host` (aka. router) IP, we want to route it via the `local` table (which would be skipped by the ingress rule). Without a lookup in the `local` routing table, Linux will not consider `cilium_host` to be an address of the local host, and for example not respond to ICMP requests.

Note that this commit does not yet use `RoutingInfo.Configure` to set up the `cilium_host` IP, this will be done in the next commit. This commit here merely prepares the method for that and does not contain any functional changes by itself (which can be observed by the fact that all callers pass in `host=false`).

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
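The effect of the new flag can be illustrated with a minimal sketch. This is not the Cilium implementation: `Rule`, `planRules`, the priorities, and the table name `eni-table` are hypothetical and chosen only for illustration. The one behaviour taken from the commit message is that `host=true` skips the ingress rule while the egress rule is still produced.

```go
package main

import "fmt"

// Rule is a hypothetical, simplified stand-in for one policy routing rule.
type Rule struct {
	Priority int    // illustrative value, not taken from Cilium
	Match    string // e.g. "to 10.0.0.5/32" or "from 10.0.0.5/32"
	Table    string // routing table the rule points to
}

// planRules returns the rules a Configure-like function might install for ip.
// When host is true, the ingress rule is skipped so that lookups for the IP
// still reach the `local` table.
func planRules(ip, egressTable string, host bool) []Rule {
	var rules []Rule
	if !host {
		// Ingress rule: traffic to the IP is forwarded via the `main` table.
		rules = append(rules, Rule{Priority: 20, Match: "to " + ip + "/32", Table: "main"})
	}
	// Egress rule: traffic from the IP is steered to the per-ENI table.
	rules = append(rules, Rule{Priority: 110, Match: "from " + ip + "/32", Table: egressTable})
	return rules
}

func main() {
	for _, host := range []bool{false, true} {
		fmt.Printf("host=%v: %+v\n", host, planRules("10.0.0.5", "eni-table", host))
	}
}
```

With `host=false` the sketch yields both an ingress and an egress rule, matching what the commit describes for existing callers; with `host=true` only the egress rule remains.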
[ upstream commit 0fcd1c8 ]

On ENI, we install source-based egress routing rules that steer traffic from Cilium-managed IPs (i.e. pods, but also the health IP, ingress IP and router IP) to the correct egress interface.

For pod IPs, this is done in the CNI plugin:
https://github.com/cilium/cilium/blob/7875f6acb5a2fd2b0e3e6c993c9995c0d322e55d/plugins/cilium-cni/interface.go#L59-L63

For the ingress and health IPs, this is done from cilium-agent:
https://github.com/cilium/cilium/blob/ed20c8acde8c76d405d6c9fac3c9de44aa3bb403/daemon/cmd/ipam.go#L401-L405
https://github.com/cilium/cilium/blob/e49430286b5d63b00062758a10a2b37458f94525/cilium-health/launch/endpoint.go#L329-L333

For the `cilium_host` (aka router) IP however, this was done differently. Commit f34371c added a new `routing.SetupRules` function that duplicated parts of the `routing.RoutingInfo.Configure` logic, but missed a crucial part: the creation of the per-ENI routing table that the source-based egress rule points towards. This means that if the `cilium_host` IP address was allocated from a different ENI than the pod, health and ingress IP addresses, the routing table for that ENI was never created. This led to connectivity issues, in particular in combination with IPSec.

This commit addresses that issue by having the `cilium_host` IP use the same code path as the other IP users: `RoutingInfo.Configure`. This not only fixes the bug, but also removes some code that was otherwise only used for the router IP.

There is one major difference between other users of `RoutingInfo.Configure` and the newly introduced use for the `cilium_host` IP: for the `cilium_host` IP, we skip the creation of the ingress rule (by passing in `host=true`), as otherwise the `cilium_host` IP would not be considered a local address of the host network namespace. This is consistent with the old `SetupRules` function, which also did not create such an ingress rule.

Long-term, it remains questionable whether the setup of egress rules in ENI mode should be left to IPAM clients, as every client seems to do it slightly differently. Maybe this is better done by either the IPAM subsystem or a separate device manager.

Fixes: f34371c ("ipam: Add routes for cilium_host ENI address")

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
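The failure mode described in this commit message can be modelled with a small, self-contained sketch. Everything here is hypothetical (`state`, `legacySetupRules`, `configure`, `dangling`, the ENI names and IPs); it only mirrors the described behaviour, namely that the old path installed an egress rule without creating the per-ENI table it points to, while the `Configure`-style path creates both.

```go
package main

import (
	"fmt"
	"sort"
)

// state is a toy model of a node's policy routing setup.
type state struct {
	tables map[string]bool   // per-ENI routing tables that exist
	rules  map[string]string // source IP -> table its egress rule points to
}

func newState() *state {
	return &state{tables: map[string]bool{}, rules: map[string]string{}}
}

// legacySetupRules models the old router IP path as described in the commit:
// it installs the egress rule but never creates the per-ENI table.
func (s *state) legacySetupRules(ip, eni string) {
	s.rules[ip] = "table-" + eni
}

// configure models the shared code path: it creates the per-ENI table and
// then installs the egress rule pointing at it.
func (s *state) configure(ip, eni string) {
	s.tables["table-"+eni] = true
	s.rules[ip] = "table-" + eni
}

// dangling returns the source IPs whose egress rule points at a missing table.
func (s *state) dangling() []string {
	var ips []string
	for ip, table := range s.rules {
		if !s.tables[table] {
			ips = append(ips, ip)
		}
	}
	sort.Strings(ips)
	return ips
}

func main() {
	before := newState()
	before.configure("10.0.1.10", "eni-1")        // pod IP on one ENI
	before.legacySetupRules("10.0.2.20", "eni-2") // router IP on another ENI
	fmt.Println("before fix, dangling rules for:", before.dangling())

	after := newState()
	after.configure("10.0.1.10", "eni-1")
	after.configure("10.0.2.20", "eni-2") // router IP now uses the same path
	fmt.Println("after fix, dangling rules for:", after.dangling())
}
```

In this model the problem only shows up when the router IP sits on an ENI that no other IP uses, which matches the condition stated in the commit message.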
eb78ce5 to caa6974
/test-backport-1.13

Job 'Cilium-PR-K8s-1.22-kernel-4.19' failed.
Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.22-kernel-4.19/335/
If it is a flake and a GitHub issue doesn't already exist to track it, comment Then please upload the Jenkins artifacts to that issue.

Job 'Cilium-PR-K8s-1.24-kernel-4.19' failed.
Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.24-kernel-4.19/276/
If it is a flake and a GitHub issue doesn't already exist to track it, comment Then please upload the Jenkins artifacts to that issue.
Looks like it's still failing, which, taking a closer look, seems expected as it's not fixing anything, just adding more tests https://github.com/cilium/cilium/actions/runs/7016632415 😞 |
/test-1.22-4.19 |
/test-1.24-4.19 |
Once this PR is merged, a GitHub action will update the labels of these PRs: