Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug where Cilium drops traffic from remote nodes in etcd mode, despite policy that allows the traffic #18777

Merged

Conversation

joestringer
Copy link
Member

@joestringer joestringer commented Feb 11, 2022

This patch updates the Cilium logic for handling remote node identity
updates to ensure that when Cilium's --enable-remote-node-identity
flag is configured, each Cilium node will consistently consider all
other nodes as having the remote-node identity. This fixes an issue
where users reported policy drops from remote nodes -> pods, even though
the policy appeared to allow this. The issue was limited to kvstore
configurations of Cilium, and does not affect configurations where CRDs
are used for sharing information within the cluster.

For background:

When Cilium starts up, it locally scans for IP addresses associated with
the node, and updates its own IPcache to associate those IPs with the
host identity. Additionally, it will also publish this information to
other nodes so that they can make policy decisions regarding traffic
coming from IPs attached to nodes in the cluster.

Before commit 7bf60a5 ("nodediscovery: Fix local host identity
propagation"), Cilium would propagate the identity remote-node as part
of these updates to other nodes. After that commit, it would propagate
the identity host as part of these updates to other nodes.

When receiving these updates, Cilium would trust the identity directly
and push IP->Identity mappings like this into the datapath, regardless
of whether the --enable-remote-node-identity setting was configured or
not. As such, when the above commit changed the behaviour, it triggered
a change in policy handling behaviour.

The --enable-remote-node-identity flag was initially introduced to
allow the security domain of remote nodes in the cluster to be
considered differently vs. the local host. This can be important as
Kubernetes defines that the host should always have access to pods on
the node, so if all nodes are considered the same as the host, this
can represent a larger open policy surface for pods than necessary in a
zero trust environment.

Given the potential security implications of this setting, at the time
that it was introduced, we introduced mitigations both in the control
plane and in the data plane. Whenever the datapath is configured with
--enable-remote-node-identity=true, it will also distrust any reports
that peer node identities are host, even if the ipcache itself reports
this. In this situation, the datapath does not accept that the traffic
is from the host. Rather, it demotes the identity of the traffic to
considering it as part of the world. The motivation behind this is
that allowing world is a very permissive policy, so if the user is OK
with allowing world traffic then it is likely that they will be OK
with accepting any traffic like this which purports to be coming from a
host in the cluster.

As a result of the above conditions, users running in kvstore mode who
upgraded from earlier Cilium versions to 1.9.12, 1.10.6 or 1.11.0 (and
other releases up until this patch is released as part of an official
version) could observe traffic drops for traffic from nodes in the
cluster towards pods on other nodes in the cluster. Hubble would report
that the traffic is coming "from the world" (identity=2), despite having
a source address of another node in the cluster.

We considered multiple approaches to solving this issue:

  1. Revert the commit that introduced the issue (see Revert "nodediscovery: Fix local host identity propagation" #18763).
    • Evidently, by this point there are multiple other codepaths relying
      on the internal storage of the local node's identity as Host, which
      would make this more difficult.
  2. Ensure that the kvstore propagation code propagates the current node's
    identity as remote-node, as other nodes may expect.
    • In cases of versions with mixed knowledge of remote-node-identity
      (for instance during upgrade), then newer nodes could end up
      propagating the new identity, but old nodes would not understand
      how to calculate policy with this identity in consideration, so
      this could result in similar sorts of policy drops during upgrade.
  3. In the case when --enable-remote-node-identity=true, ensure that when
    Cilium receives updates from peer nodes, it demotes the host
    identity reported by peer nodes down to remote-node for the
    associated IP addresses. This way, the impact of the flag is limited
    to the way that the current node configures itself only. If the
    datapath is then informed (via ipcache) that thes IPs correspond to
    remote-node, then the policy will be correctly assessed.

This commit takes approach (C).

Fixes: 7bf60a5 ("nodediscovery: Fix local host identity propagation")
Co-authored-by: André Martins andre@cilium.io
Signed-off-by: Joe Stringer joe@cilium.io

@joestringer joestringer requested a review from a team February 11, 2022 04:38
@joestringer joestringer requested a review from a team as a code owner February 11, 2022 04:38
@joestringer joestringer added needs-backport/1.10 release-note/bug This PR fixes an issue in a previous release of Cilium. labels Feb 11, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. and removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Feb 11, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.9.13 Feb 11, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.10.8 Feb 11, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.11.2 Feb 11, 2022
@joestringer
Copy link
Member Author

CC @jrajahalme this is partly triggrered by an interaction with 28d9889#diff-e9cec4c2460e089f0f518c07f3d4c2fb40273486a8346d499ec5188395417772R337 , used from the external workloads logic - because the external workloads refactor introduced the field into the Node, which caused all Cilium nodes to report to each other about their own Identity, through the kvstore. This appears to be a bug specific to the kvstore side of things for this reason. I don't think there's any adverse interactions with the external workloads work, since in that case the VMManager will report other identities (not host) for external workload peers.

This patch updates the Cilium logic for handling remote node identity
updates to ensure that when Cilium's '--enable-remote-node-identity'
flag is configured, each Cilium node will consistently consider all
other nodes as having the "remote-node" identity. This fixes an issue
where users reported policy drops from remote nodes -> pods, even though
the policy appeared to allow this. The issue was limited to kvstore
configurations of Cilium, and does not affect configurations where CRDs
are used for sharing information within the cluster.

For background:

When Cilium starts up, it locally scans for IP addresses associated with
the node, and updates its own IPcache to associate those IPs with the
"host" identity. Additionally, it will also publish this information to
other nodes so that they can make policy decisions regarding traffic
coming from IPs attached to nodes in the cluster.

Before commit 7bf60a5 ("nodediscovery: Fix local host identity
propagation"), Cilium would propagate the identity "remote-node" as part
of these updates to other nodes. After that commit, it would propagate
the identity "host" as part of these updates to other nodes.

When receiving these updates, Cilium would trust the identity directly
and push IP->Identity mappings like this into the datapath, regardless
of whether the '--enable-remote-node-identity' setting was configured or
not. As such, when the above commit changed the behaviour, it triggered
a change in policy handling behaviour.

The '--enable-remote-node-identity' flag was initially introduced to
allow the security domain of remote nodes in the cluster to be
considered differently vs. the local host. This can be important as
Kubernetes defines that the host should always have access to pods on
the node, so if all nodes are considered the same as the "host", this
can represent a larger open policy surface for pods than necessary in a
zero trust environment.

Given the potential security implications of this setting, at the time
that it was introduced, we introduced mitigations both in the control
plane and in the data plane. Whenever the datapath is configured with
--enable-remote-node-identity=true, it will also distrust any reports
that peer node identities are "host", even if the ipcache itself reports
this. In this situation, the datapath does not accept that the traffic
is from the "host". Rather, it demotes the identity of the traffic to
considering it as part of the "world". The motivation behind this is
that allowing "world" is a very permissive policy, so if the user is OK
with allowing "world" traffic then it is likely that they will be OK
with accepting any traffic like this which purports to be coming from a
"host" in the cluster.

As a result of the above conditions, users running in kvstore mode who
upgraded from earlier Cilium versions to 1.9.12, 1.10.6 or 1.11.0 (and
other releases up until this patch is released as part of an official
version) could observe traffic drops for traffic from nodes in the
cluster towards pods on other nodes in the cluster. Hubble would report
that the traffic is coming "from the world" (identity=2), despite having
a source address of another node in the cluster.

We considered multiple approaches to solving this issue:
A) Revert the commit that introduced the issue (see ciliumGH-18763).
   * Evidently, by this point there are multiple other codepaths relying
     on the internal storage of the local node's identity as Host, which
     would make this more difficult.
B) Ensure that the kvstore propagation code propagates the current node's
   identity as "remote-node", as other nodes may expect.
   * In cases of versions with mixed knowledge of remote-node-identity
     (for instance during upgrade), then newer nodes could end up
     propagating the new identity, but old nodes would not understand
     how to calculate policy with this identity in consideration, so
     this could result in similar sorts of policy drops during upgrade.
C) In the case when --enable-remote-node-identity=true, ensure that when
   Cilium receives updates from peer nodes, it demotes the "host"
   identity reported by peer nodes down to "remote-node" for the
   associated IP addresses. This way, the impact of the flag is limited
   to the way that the current node configures itself only. If the
   datapath is then informed (via ipcache) that thes IPs correspond to
   "remote-node", then the policy will be correctly assessed.

This commit takes approach (C).

Fixes: 7bf60a5 ("nodediscovery: Fix local host identity propagation")
Co-authored-by: André Martins <andre@cilium.io>
Signed-off-by: Joe Stringer <joe@cilium.io>
@joestringer joestringer force-pushed the submit/fix-identity-propagation-issue branch from 8959ec6 to 967d131 Compare February 11, 2022 04:45
@joestringer
Copy link
Member Author

I manually tested this out by deploying Cilium as per the instructions in #17369, first setting the image to the current master version to observe that the ipcache in nodes would report other nodes' IPs as identity=1, then by building & applying an image based on this PR, and observing that the ipcache is then correctly set up with the remote-node identity (or kube-apiserver identity for remote apiserver within the cluster).

@joestringer
Copy link
Member Author

joestringer commented Feb 11, 2022

/test

Job 'Cilium-PR-K8s-1.16-kernel-4.9' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

K8sConformance Portmap Chaining Check connectivity-check compliance with portmap chaining

Failure Output

FAIL: Found 1 k8s-app=cilium logs matching list of errors that must be investigated:

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-1.16-kernel-4.9 so I can create a new GitHub issue to track it.

@joestringer joestringer changed the title Fix bug where Cilium drops traffic from remote nodes despite policy to allow the traffic Fix bug where Cilium drops traffic from remote nodes despite policy to allow the traffic in etcd mode Feb 11, 2022
@joestringer joestringer changed the title Fix bug where Cilium drops traffic from remote nodes despite policy to allow the traffic in etcd mode Fix bug where Cilium drops traffic from remote nodes in etcd mode, despite policy that allows the traffic Feb 11, 2022
Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough commit msg! Changes LGTM and explanation makes sense. One thing that I didn't quite follow is why this was limited to kvstore configurations and did not occur in CRD mode.

@joestringer
Copy link
Member Author

joestringer commented Feb 11, 2022

@christarazi As far as I can tell, the NodeManager.LocalNode.NodeIdentity is only used from kvstore code to propagate the node's identity to other peers. My guess is that we share this information differently when kvstore is disabled. As far as I could tell, this was related to the commit which I CC'd Jarno on above. I suspect it's shared to etcd through some sort of implicit conversion of the Node object into a structure that can be stored into etcd, then the generic etcd code ends up handling that update event on peer nodes via the pkg/ipcache/kvstore.go logic which this PR changes.

@aanm aanm added the kind/regression This functionality worked fine before, but was broken in a newer release of Cilium. label Feb 11, 2022
@joestringer joestringer merged commit f6a4104 into cilium:master Feb 11, 2022
@joestringer joestringer deleted the submit/fix-identity-propagation-issue branch February 11, 2022 18:02
@maintainer-s-little-helper maintainer-s-little-helper bot removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Feb 14, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.11.2 Feb 14, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.10.8 Feb 14, 2022
@jibi jibi mentioned this pull request Feb 14, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.9 in 1.9.13 Feb 14, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.9 to Backport done to v1.9 in 1.9.13 Feb 15, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.9 to Backport done to v1.9 in 1.9.13 Feb 15, 2022
@jibi jibi added backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. and removed backport-pending/1.11 labels Feb 16, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Backport done to v1.11 in 1.11.2 Feb 16, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Backport done to v1.10 in 1.10.8 Feb 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. kind/regression This functionality worked fine before, but was broken in a newer release of Cilium. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
1.10.8
Backport done to v1.10
1.11.2
Backport done to v1.11
1.9.13
Backport done to v1.9
Development

Successfully merging this pull request may close these issues.

None yet

7 participants