k8s, node: Restore router IPs (cilium_host) from K8s resource #16307
Conversation
I can't think of anything important offhand, but I guess there is the possibility that some kind of fast restart with a swapped NIC in a cloud provider could turn this into a race condition. However, that's very much an edge case.
What's going to happen if the podCIDR changes? Will Cilium still try to restore a potentially obsolete router IP?
@brb Do we support pod CIDRs changing anyway? Regardless of that, I think we need to determine the priority of the router IPs that we restore. I'm assuming that the local filesystem is the most up-to-date source of truth. One approach would be to load the router IPs from the K8s resource as this PR already does, then read the filesystem state. If the filesystem state also contains a router IP, compare it against the one from the K8s resource. If there's a mismatch, we should warn loudly, and we probably want to switch over to the filesystem IP anyway. With a system like the above in place, we are not actually changing Cilium's behavior by much; the only new case is where the router IP doesn't exist on the filesystem but does exist in the K8s resource, which occurs on a node reboot.

Edit: Implemented the above
User confirmed that this fixes their issue: #16279 (comment)
test-me-please

Edit: Legit failures, fixed in the push below.
test-me-please
Previously, after a node reboot, Cilium would allocate a new router IP and append it to the slice of node IPs. Since the node IPs had already been synced to the K8s resource, meaning IPs from the previous Cilium instance were already present, the new router IP ended up appended to the slice. Other parts of Cilium assume that the router IP is the first node IP (the first element of the slice). Since the new router IP was appended to the end, it was no longer where it was expected, i.e. no longer the first element. This caused a mismatch over which router IP was to be used. There should only ever be one router IP (one IPv4 or one IPv6).

On a node reboot, the router IPs cannot be restored because they are wiped away by the Cilium state dir being mounted as a tmpfs [1]. This commit fixes this by restoring the router IPs from the K8s resource (Node or CiliumNode) if they are present in the annotations. This prevents the possibility of having more than one router IP, as described above.

Note that router IPs from the K8s resource are only restored if no router IP was found on the filesystem, which is considered the source of truth; in other words, the filesystem takes precedence over the K8s resource. The user is warned in case of a mismatch between the two sources. We also check that the IP to be restored is within the pod / node CIDR range, otherwise we exclude it from restoration.

[1]: Linux distributions mount /run as tmpfs, and Cilium's default state directory is created under /run. (It's also common for /var/run to be symlinked to /run.)

Fixes: cilium#16279

Signed-off-by: Chris Tarazi <chris@isovalent.com>
Pushed a small change that shouldn't require a re-run. CI has passed already.