-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support policy matching against kube-apiserver #17714
Support policy matching against kube-apiserver #17714
Conversation
f01da3e
to
5928021
Compare
Create dedicated functions for checking whether an identity represents any node in the cluster or a remote node in the cluster. This will be useful for an upcoming commit where a remote node may have the REMOTE_NODE_ID identity or alternatively another hardcoded identity. Signed-off-by: Joe Stringer <joe@cilium.io>
This identity will be used to identify remote nodes which also have the kube-apiserver colocated, and allows policy at the higher layer to differentiate nodes with this identity vs. other nodes in the cluster. Signed-off-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Joe Stringer <joe@cilium.io>
Now that cilium#12544 has been fixed, remove the code that the comment indicated to. Signed-off-by: Chris Tarazi <chris@isovalent.com>
Fix it for (*EgressRule).GetDestinationEndpointSelectorsWithRequirements(). Seems like it was case of copy-pasta from the corresponding ingress function. Signed-off-by: Chris Tarazi <chris@isovalent.com>
This commit expands on the mock identity allocator implementation to actually perform some real work, rather than a complete no-op. It also renames the identity allocator from "Fake" to "Mock" to better represent its behavior. Signed-off-by: Chris Tarazi <chris@isovalent.com>
Has() is a new helper that will be used by subsequent commits where we need to search if a Labels object contains a specific Label. Additionally, this commit implements the ability to remove Labels, providing the inverse logic to MergeLabels(). Signed-off-by: Chris Tarazi <chris@isovalent.com>
These will useful in future commits where we need to reference these labels individual via Has() or similar. Signed-off-by: Chris Tarazi <chris@isovalent.com>
In a subsequent commit, the reserved ID cache will be updated from concurrent users. This commit prepares the code for that. Signed-off-by: Chris Tarazi <chris@isovalent.com>
With the advent of the kube-apiserver reserved identity, it is now possible for the host to have the kube-apiserver label. This means that we need to provide a way to update the reserved ID cache within Cilium for IDs with multiple labels. This breaks the long-held assumption that reserved IDs can only have one label. Signed-off-by: Chris Tarazi <chris@isovalent.com>
Previously, the function prototype that IterateReservedIdentities() took in only considered reserved identities to have a single label. Now with the introduction of the kube-apiserver reserved identity, that is no longer the case, as it contains two labels. This commits changes the function prototype to match the newly introduced AddReservedIdentityWithLabels(), which allow a full set of labels to be passed. In addition, we introduce a new map which holds the numeric identity to the full set of labels, which is the map that will be iterated over in IterateReservedIdentities(). As the previous commit mentioned, this code transitions from the assumption of one label per reserved identity, to potentially multiple labels per reserved identity. Signed-off-by: Chris Tarazi <chris@isovalent.com>
This commit adds the new kube-apiserver label which enables traffic to be identified as kube-apiserver traffic. Along with the new label, we create a new reserved ID which will be associated with traffic going to the kube-apiserver. There are a couple nuances to note here. The kube-apiserver label will used in three scenarios: * Attached along with the local host label if the kube-apiserver is running on the local host. The identity of the local host will now be used for traffic to the local host + to the local kube-apiserver. The reserved ID of the local host *does not* change however. It is still 1, to prevent breaking changes with connectivity. * Attached along with the remote-node label as part of the new kube-apiserver reserved identity. This identity is only used if the kube-apiserver is running on remote nodes. * Attached along with the CIDR labels if the kube-apiserver is running outside of the cluster. This will result in a regular CIDR identity. In subsequent commits, the above three cases are what will be implemented to allow policy matching on the kube-apiserver. This commit setting the groundwork for that. Signed-off-by: Chris Tarazi <chris@isovalent.com>
This will be useful in subsequent commits where it will be used to Upsert() into the ipcache. Signed-off-by: Chris Tarazi <chris@isovalent.com>
This commit introduces a userspace map in the ipcache package called IdentityMetadata (IDMD) that assoicates CIDRs to specific labels. This is useful for figuring out the corresponding identity of the IP as the CIDR -> label mappings will be used to generate identities. Signed-off-by: Chris Tarazi <chris@isovalent.com>
This commit refactors the function for allocating CIDR identities for prefixes to accept labels from the caller, as well as making it generic for any identity allocation (not exclusive to CIDR). The lower-level function now operates on a single prefix and its assoicated labels, while the higher-level functions perform the iteration over a slice of prefixes, passing in the default CIDR labels for each prefix. Now that the lower-level function is no longer specific to allocating CIDR identities, rename it to reflect its generic nature. The CIDR label is only attached to the identity if the prefix is associated with world. This refactor is useful in subsequent commits where the caller of the lower-level function will want to control the labels associated with a prefix. Signed-off-by: Chris Tarazi <chris@isovalent.com>
This commit reduces the regeneration.Owner interface by removing the GetPolicyRepository() method from it. This is to remove the import of package policy from the regeneration package. This will ease the subsequent commit by avoiding an import cycle within the regeneration package. Signed-off-by: Chris Tarazi <chris@isovalent.com>
This commit moves the policy updater trigger from the Daemon to the specific policy package. This allows us to reuse the new policy.Updater object in the future, without having to abstract away the entire Daemon object. In other words, triggering policy updates was only possible from the Daemon object through (*Daemon).TriggerPolicyUpdates(). Now, we can trigger policy updates by using the policy.Updater object instead. The Daemon object also contains too much logic for its own good, so this is a good start to start moving code and delegating to other packages. Signed-off-by: Chris Tarazi <chris@isovalent.com>
The KubeAPIServer source will be used only when an IP is associated with the kube-apiserver. The primary user of this source will be ipcache.InjectLabels() (introduced in a subsequent commit), which will perform this check and overwrite the already passed-in source with the KubeAPIServer source if there's a kube-apiserver label. The aforementioned overwrite can occur as follows: * CN watcher associates the local node IPs with its corresponding reservered:host label and upserts into the ipcache with Local source * EP watcher associates a kube-apiserver backend IP with its corresponding reserved:kube-apiserver label and upserts into the ipcache with CustomResource source (as the default source) * InjectLabels() sees that there has been an upsert with the reservered:kube-apiserver label so it upgrades the source of that request to KubeAPIServer source This source is now the strongest source throughout Cilium (greater than Local) because of the above. Signed-off-by: Chris Tarazi <chris@isovalent.com>
This will be used in the subsequent commit to ensure ipcache.InjectLabels() can retry upon failure to allocate CIDR identities if the allocator is not yet initialized. Signed-off-by: Chris Tarazi <chris@isovalent.com>
5928021
to
c6b2ee6
Compare
Any update to the IdentityMetadata (IDMD) map must trigger an update to the ipcache. Since the IdentityMetadata maps IPS / CIDRs to labels, if an update was made, we must to check if that update was related to the labels. If so, that means it might affect identities. Hence, we must to allocate a new identity, trigger policy updating, and push it into the ipcache. This commit provides the mechanism to do so (InjectLabels()). In future commits, this will be triggered by other subsystems as they encounter IPs / CIDRs for which we need to assoicate labels with. Corresponding functions to find and delete IPs / CIDRs with specific labels is also provided, which include releasing any allocated identities and deletion from the ipcache. Here's a diagram of how this works for the kube-apiserver policy matching: +------------+ (1) (1) +------------+ | EP Watcher +-----+ +-----+ CN Watcher | +-----+------+ W | | W +------+-----+ | | | | | v v | | +------+ | | | IDMD | | | +------+ | | ^ | | | | | (3) |R | | (2) +------+--------+ (2) | +------> |Label Injector |<------+ Trigger +-------+-------+ Trigger (4) |W | v +---+ |IPC| +---+ legend: * W means write * R means read Signed-off-by: Chris Tarazi <chris@isovalent.com>
By leveraging the special "kubernetes" Service / Endpoint object, we can learn about which IPs the kube-apiserver is behind. These IPs will be upserted into the ipcache with an identity that is to-be-allocated. In order to allocate an identity for these IPs, the kube-apiserver-associated IPs are inserted into the IDMD map with the "reserved:kube-apiserver" label. Then the watcher will trigger the label injection via ipcache.TriggerLabelInjection(). We do not make a distinction on where the kube-apiserver is running, whether it's on the local node, on a remote node in the cluster, or if it's running outside of the cluster. All kube-apiserver-associated IPs will have the "reserved:kube-apiserver" label. Note that we are not hooking into the delete event. We only subscribe to the add / update events. This is because we perform a diff on the currently associated kube-apiserver IPs inside the IDMD and what the add / update event contains (desired IPs), then remove the current IPs that are not in the desired set. This operation is equivalent to subscribing to the delete event and removing the "desired" IPs from the IDMD. However, we don't need to subscribe to the delete event because this special Service is automatically recreated / reconciled by the apiserver itself if the object is removed or edited. Signed-off-by: Chris Tarazi <chris@isovalent.com>
It is common for the kube-apiserver to be running as host-network on a node in the cluster. In order to correctly classify traffic to the kube-apiserver with the kube-apiserver identity, we need to know which node IPs are for local node or remote nodes. For these node IPs, they will be associated with the "reserved:host" or "reserved:remote-node" label, respectively, in the IDMD map. Once the node manager has made these associations, we need to trigger label injection via ipcache.TriggerLabelInjection() in order to figure out to if a new identity must be allocated and if we need to trigger policy calculation. After this, if * the local host has the kube-apiserver running, it will have an IP with the following labels: "reserved:host,reserved:kube-apiserver". This traffic will be classified as "host" traffic with ID 1. This does not trigger a new identity allocation, but the ID 1 will now contan the "reserved:kube-apiserver" label. This requires updating the selector cache, as ID 1 itself has changed, which is handled in ipcache.TriggerLabelInjection(). * a remote node(s) has the kube-apiserver running, it will have an IP with the following labels "reserved:remote-node,reserved:kube-apiserver". This traffic will be classified as "kube-apiserver" traffic with the newly created ID of 7. No further policy recalculation is needed in this case, as ID 7 already exists. Signed-off-by: Chris Tarazi <chris@isovalent.com>
Now that the kube-apiserver policy matching has been implemented, expose the user-facing policy API by allowing the user to express: ``` egress: toEntities: - kube-apiserver ``` While we're at it, bump the K8s CRD schema version because this is a user-facing change to the CNP / CCNP CRDs. Signed-off-by: Chris Tarazi <chris@isovalent.com>
c6b2ee6
to
6cd39d5
Compare
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found these issues during an initial pass. I'll push a fresh PR with some of these fixes + rebase against master, then start iterating further on these items & finish up the review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reviewed this completely now. For anything I thought was crucial to fix, I prepared a fix as a squash/fixup commit at https://github.com/joestringer/cilium/tree/pr/christarazi/apiserver-policy. I will fold those changes into #17823. I can't modify this PR anyway so will need to follow up on that PR.
I'll go back through my own comments and mark "resolved" the ones that I addressed.
if !ok { | ||
id = f.localID | ||
f.ipToIdentity[ip.String()] = id | ||
f.idRefCount[id]++ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This testutils refcount increase should happen unconditionally rather than only if the IP is getting a new identity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^^ Using pkg/counter
could also help avoid issues like this.
n := labels.NewLabelsFromModel(nil) | ||
n.MergeLabels(l) | ||
return n // copy of leftover |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The identity handling here is a bit fishy - if the IdentityMetadata labels set has changed and identities are generally tied to the lifetime of the entry in the IdentityMetadata map, then this case should deallocate the identity for the previous set of labels and allocate the identity for the new set of labels.
Superseded by #17823, which is now merged. Closing. @christarazi I've left some comments open here and will be happy to discuss with you about any remaining items from here or the other PR when you're back. |
Superseded by #17823
See commit msgs.
This PR implements support for automated policy matching against the
kube-apiserver. It does so by detecting where the kube-apiserver is running,
whether it's inside the cluster or not.
TODO:
IdentityMetadata
mapToEntities
k8sT/Policies.go
(Add e2e coverage of apiserver policy functionality #17829)Fixes: #14724