Skip to content

Commit

Permalink
pkg/datapath/linux: Fix asymmetric IPsec logic on delete
Browse files Browse the repository at this point in the history
With ENI IPAM mode and IPsec enabled, users were reporting cases where
connectivity to particular pods breaks, and correlated with those drops,
the following error msg:

```
Unable to delete the IPsec route OUT from the host routing table
```

In addition, it was also reported that the connectivity outage would
only last for a few minutes before resolving itself.

The issue turned out to be that upon node deletion, the logic to handle
the IPsec cleanup is asymmetric with the IPsec logic to handle a node
create / update.

Here's how:

  * With ENI mode and IPsec, subnet encryption mode is enabled
    implicitly.
  * Background: Users can explicitly enable subnet encryption mode by
    configuring `--ipv4-pod-subnets=[cidr1,cidr2,...]`.
    * Background: ENIs are part of subnet(s).
    * Cilium with ENI mode automatically appends the node's ENIs'
      subnets' CIDRs to this slice.
    * For example, node A has ENI E which is a part of subnet S with
      CIDR C. Therefore, `--ipv4-pod-subnets=[C]`.
  * This means that each node should have an IPsec OUT routes for each
    pod subnet, i.e. each ENI's subnet, as shown by
    (*linuxNodeHandler).nodeUpdate() which contains the IPsec logic on a
    node create / update.
  * Upon a node delete [(*linuxNodeHandler).nodeDelete()], we clean up
    the "old" node. When it gets to the IPsec logic, it removes the
    routes for the pod subnets as well, i.e. removes the route to the
    ENI's subnet from the local node. From the example above, it'd
    remove the route for CIDR C.
  * This is problematic because in ENI mode, different nodes can share
    the same ENI's subnet, meaning subnets are NOT exclusive to a node.
    For example, a node B can also have ENI E with a subnet C attached
    to it.
  * As for how the nodes were fixing themselves, it turns out that
    (*Manager).backgroundSync() runs on an interval which calls
    NodeValidateImplementation() which calls down to
    (*linuxNodeHandler).nodeUpdate() thereby running the IPsec logic of
    a node create / update which reinstates the missing routes.

Therefore, we shouldn't be deleting these routes because pods might
still be relying on them. By comparing the IPsec delete logic with [1],
we see that they're asymmetric. This commit fixes this asymmetry.

[1]: Given subnetEncryption=true, notice how we only call
     enableSubnetIPsec() if the node is local. That is not the case on
     node delete.

     ```
     func (n *linuxNodeHandler) nodeUpdate(oldNode, newNode *nodeTypes.Node, firstAddition bool) error {
     	...

     	if n.nodeConfig.EnableIPSec && !n.subnetEncryption() && !n.nodeConfig.EncryptNode {
     		n.enableIPsec(newNode)
     		newKey = newNode.EncryptionKey
     	}

     	...

     	if n.nodeConfig.EnableIPSec && !n.subnetEncryption() {
     		n.encryptNode(newNode)
     	}

     	if newNode.IsLocal() {
     		isLocalNode = true
     		...
     		if n.subnetEncryption() {
     			n.enableSubnetIPsec(n.nodeConfig.IPv4PodSubnets, n.nodeConfig.IPv6PodSubnets)
     		}
     		...
     		return nil
     	}
     ```

Fixes: 645de9d ("cilium: remove encryption route and rules if crypto
is disabled")

Co-authored-by: John Fastabend <john@isovalent.com>
Signed-off-by: Chris Tarazi <chris@isovalent.com>
  • Loading branch information
2 people authored and nebril committed Feb 18, 2022
1 parent e41aea0 commit 0bd4e04
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion pkg/datapath/linux/node.go
Expand Up @@ -1171,7 +1171,9 @@ func (n *linuxNodeHandler) nodeDelete(oldNode *nodeTypes.Node) error {
}

if n.nodeConfig.EnableIPSec {
n.deleteIPsec(oldNode)
if oldNode.IsLocal() || !n.subnetEncryption() {
n.deleteIPsec(oldNode)
}
}

if option.Config.EnableWireguard {
Expand Down

0 comments on commit 0bd4e04

Please sign in to comment.