Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
pkg/datapath/linux: Fix asymmetric IPsec logic on delete
With ENI IPAM mode and IPsec enabled, users were reporting cases where connectivity to particular pods breaks, and correlated with those drops, the following error msg: ``` Unable to delete the IPsec route OUT from the host routing table ``` In addition, it was also reported that the connectivity outage would only last for a few minutes before resolving itself. The issue turned out to be that upon node deletion, the logic to handle the IPsec cleanup is asymmetric with the IPsec logic to handle a node create / update. Here's how: * With ENI mode and IPsec, subnet encryption mode is enabled implicitly. * Background: Users can explicitly enable subnet encryption mode by configuring `--ipv4-pod-subnets=[cidr1,cidr2,...]`. * Background: ENIs are part of subnet(s). * Cilium with ENI mode automatically appends the node's ENIs' subnets' CIDRs to this slice. * For example, node A has ENI E which is a part of subnet S with CIDR C. Therefore, `--ipv4-pod-subnets=[C]`. * This means that each node should have an IPsec OUT routes for each pod subnet, i.e. each ENI's subnet, as shown by (*linuxNodeHandler).nodeUpdate() which contains the IPsec logic on a node create / update. * Upon a node delete [(*linuxNodeHandler).nodeDelete()], we clean up the "old" node. When it gets to the IPsec logic, it removes the routes for the pod subnets as well, i.e. removes the route to the ENI's subnet from the local node. From the example above, it'd remove the route for CIDR C. * This is problematic because in ENI mode, different nodes can share the same ENI's subnet, meaning subnets are NOT exclusive to a node. For example, a node B can also have ENI E with a subnet C attached to it. * As for how the nodes were fixing themselves, it turns out that (*Manager).backgroundSync() runs on an interval which calls NodeValidateImplementation() which calls down to (*linuxNodeHandler).nodeUpdate() thereby running the IPsec logic of a node create / update which reinstates the missing routes. Therefore, we shouldn't be deleting these routes because pods might still be relying on them. By comparing the IPsec delete logic with [1], we see that they're asymmetric. This commit fixes this asymmetry. [1]: Given subnetEncryption=true, notice how we only call enableSubnetIPsec() if the node is local. That is not the case on node delete. ``` func (n *linuxNodeHandler) nodeUpdate(oldNode, newNode *nodeTypes.Node, firstAddition bool) error { ... if n.nodeConfig.EnableIPSec && !n.subnetEncryption() && !n.nodeConfig.EncryptNode { n.enableIPsec(newNode) newKey = newNode.EncryptionKey } ... if n.nodeConfig.EnableIPSec && !n.subnetEncryption() { n.encryptNode(newNode) } if newNode.IsLocal() { isLocalNode = true ... if n.subnetEncryption() { n.enableSubnetIPsec(n.nodeConfig.IPv4PodSubnets, n.nodeConfig.IPv6PodSubnets) } ... return nil } ``` Fixes: 645de9d ("cilium: remove encryption route and rules if crypto is disabled") Co-authored-by: John Fastabend <john@isovalent.com> Signed-off-by: Chris Tarazi <chris@isovalent.com>
- Loading branch information