v1.11 backports 2022-03-15 #19142
Conversation
[ upstream commit 4c3bd27 ] We are hitting this timeout sometimes, and it seems it was previously updated on the regular pipelines (see 31a622e) but not on the runtime pipeline. We remove the inner timeout, as the outer one is practically redundant here: the steps outside of the inner loop are almost instantaneous. Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 4576841 ] [ Backporter's notes: fixed trailing whitespace in yaml comments ] In order to connect Clustermesh clusters without the cilium-cli tool, we would need to manually patch the cilium agent with hostAliases and configure the cilium-clustermesh secret with mTLS material from the connected clusters. This commit adds support for connecting multiple Clustermesh clusters using the Helm Chart. Fixes: cilium#17811 Signed-off-by: Samuel Torres <samuelpirestorres@gmail.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
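A Helm-only Clustermesh connection along these lines might look roughly like the values fragment below. This is an illustrative sketch only: the key names and value shapes should be verified against the cilium chart's values schema for the release in use.

```yaml
# Illustrative sketch, not authoritative: check the chart's values.yaml.
clustermesh:
  config:
    enabled: true
    clusters:
    - name: cluster2
      ips:
      - 172.20.0.100   # clustermesh-apiserver address of cluster2 (example)
      port: 2379
      tls:
        cert: "<base64-encoded client certificate>"
        key: "<base64-encoded client key>"
```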
[ upstream commit d9d23ba ] When installing Cilium in an AKS cluster, the Cilium Operator requires an Azure Service Principal with sufficient privileges to the Azure API for the IPAM allocator to be able to work. Previously, the `az ad sp create-for-rbac` was assigning by default the `Contributor` role to new Service Principals when none was provided via the optional `--role` flag, whereas it now does not assign any role at all. This of course breaks IPAM allocation due to insufficient permissions, resulting in operator failures of this kind: ``` level=warning msg="Unable to synchronize Azure virtualnetworks list" error="network.VirtualNetworksClient#ListAll: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code=\"AuthorizationFailed\" Message=\"The client 'd09fb531-793a-40fc-b934-7af73ca60e32' with object id 'd09fb531-793a-40fc-b934-7af73ca60e32' does not have authorization to perform action 'Microsoft.Network/virtualNetworks/read' over scope '/subscriptions/22716d91-fb67-4a07-ac5f-d36ea49d6167' or the scope is invalid. If access was recently granted, please refresh your credentials.\"" subsys=azure level=fatal msg="Unable to start azure allocator" error="Initial synchronization with instances API failed" subsys=cilium-operator-azure ``` We update the documentation guidelines for new installations to assign the `Contributor` role to new Service Principals used for Cilium. We also take the opportunity to: - Update Azure IPAM required privileges documentation. - Make it so users can now set up all AKS-specific required variables for a Helm install in a single command block, rather than have it spread over several command blocks with intermediate steps and temporary files. - Have the documentation recommend creating Service Principals with privileges over a restricted scope (AKS node resource group) for increased security. 
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
lgtm for #19108
Force-pushed from 9182d1d to 8dc58db
/test-backport-1.11
My changes LGTM.
My PR looks good. Thanks Timo!
My changes LGTM, thanks.
1c89a98 lgtm. thank you! 🚀
/ci-gke-1.11
[ upstream commit 9b1d3a3 ] [ Backporter's notes: regenerated docs ] In order to enable offline deployment for certain platforms (like OpenShift) we need to be able to have a universal override for all images so that the OpenShift certified operator can list its "related images"[1][2]. [1]https://docs.openshift.com/container-platform/4.9/operators/operator_sdk/osdk-generating-csvs.html#olm-enabling-operator-for-restricted-network_osdk-generating-csvs [2]https://redhat-connect.gitbook.io/certified-operator-guide/appendix/offline-enabled-operators Signed-off-by: Nate Sweet <nathanjsweet@pm.me> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 4b49c91 ] Signed-off-by: Michael Fischer <fiscmi@amazon.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit ea9fd6f ] Turns out that in GKE's 'cos' images the 'containerd' binary is still present even though '/etc/containerd/config.toml' is not. Hence, the kubelet wrapper would still be installed for these images according to the current check, even though it's not necessary. What's worse, starting the kubelet would fail because the 'sed' command targeting the aforementioned file would fail. This PR changes the check to rely on the presence of the '--container-runtime-endpoint' flag in the kubelet, which is probably a more reliable way of detecting '*_containerd' flavours and only applying the fix in these cases. Fixes cilium#19015. Signed-off-by: Bruno M. Custódio <brunomcustodio@gmail.com> Co-authored-by: Alexandre Perrin <alex@kaworu.ch> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit e396b5e ] The documentation page about setting up Hubble observability wrongly states that TCP port 4245 needs to be open on all nodes running Cilium to allow Hubble Relay to operate correctly. This is incorrect. Port 4245 is actually the default port used by Hubble Relay, which is a regular deployment and doesn't require any particular action from the user. However, Hubble server uses port 4244 by default and, given that it is embedded in the Cilium agent and uses a host port, port 4244 must be open on all nodes to allow Hubble Relay to connect to each Hubble server instance. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 0f4d3a7 ] In October 2020, we made changes[1] to the cilium-agent's ClusterRole to be more permissive. We did this because Openshift enables[2] the OwnerReferencesPermissionEnforcement[3] admission controller. This admission controller prevents changes to the metadata.ownerReferences of any object unless the entity (the cilium-agent in this case) has permission to delete the object as well. Furthermore, the controller protects metadata.ownerReferences[x].blockOwnerDeletion of a resource unless the entity (again, the cilium-agent) has "update" access to the finalizer of the object having its deletion blocked. The original PR mistakenly assumed we set ownerReferences on pods and expanded cilium-agent's permissions beyond what was necessary. Cilium-agent only sets ownerReferences on a CiliumEndpoint, and the blockOwnerDeletion field propagates up to the "owning" pod of the endpoint. Cilium-agent only needs to be able to delete CiliumEndpoints (which it has always been able to) and "update" pod/finalizers (to set the blockOwnerDeletion field on CiliumEndpoints). All other changes contained in cilium#13369 were unnecessary. 1 cilium#13369 2 https://docs.openshift.com/container-platform/4.6/architecture/admission-plug-ins.html 3 https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#ownerreferencespermissionenforcement Signed-off-by: Nate Sweet <nathanjsweet@pm.me> Signed-off-by: Timo Beckers <timo@isovalent.com>
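Under those admission-controller semantics, the narrow set of permissions the commit describes as actually necessary could be sketched as the RBAC fragment below. The object name is illustrative and this is not the full Cilium ClusterRole, just the two rules the commit message calls out.

```yaml
# Illustrative minimal fragment, not the complete cilium-agent ClusterRole.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cilium-agent-ownerrefs-sketch
rules:
# cilium-agent must be able to delete the CEPs it sets ownerReferences on.
- apiGroups: ["cilium.io"]
  resources: ["ciliumendpoints"]
  verbs: ["delete"]
# "update" on pod finalizers is required to set blockOwnerDeletion.
- apiGroups: [""]
  resources: ["pods/finalizers"]
  verbs: ["update"]
```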
[ upstream commit 0211c92 ] syncDesiredPolicyMapWith() syncs the diffs between the endpoint's desired policy mapstate and the given 'realized' mapstate to the datapath bpf map, applying all differences also to the endpoint's realized mapstate. The current name of this function highlights the desired mapstate, which might be understood as the endpoint's realized mapstate not being updated. Clarify this by renaming syncDesiredPolicyMapWith() to syncPolicyMapsWith(). Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 4d3f829 ] When an identity is deleted, the agent processes the incremental policy map changes against previously saved policy state with owners and dependents of the identity key to decide if an identity can be marked for deletion. In some cases, the `owners` field for the identity entry was nil (due to them being redacted when returned by ConsumeMapChanges() previously), and the deleteKeyWithChanges would return early without tracking the identity for deletion. As a result, when this information is consumed later to delete the identity from the BPF maps, the identity was being leaked in those cases. Callers of ConsumeMapChanges() and AddVisibilityKeys() depend on the returned added and deleted keys to perform datapath policy map updates without iterating whole policy maps. Moreover, the "realized" version of endpoint's policy map used to be updated solely based on these returned entries. Make this less fragile and error prone by only returning the added and deleted map keys (no values). AddVisibilityKeys() supports reverting the changes, so it still needs to return the full MapState (keys and values) for any changed entries. Now that ConsumeMapChanges() and AddVisibilityKeys() return the keys of added entries, the caller needs to find the value to be inserted from the desired policy map using the given key. Fixes: cilium#18581 Fixes: 0d585f1 Co-authored-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 295e857 ] [ Backporter's notes: did not apply cleanly, but end result is identical ] Michi and Maciej report that they can trigger a deadlock as part of FQDN selectorcache preparation: goroutine 873 [chan send, 969 minutes]: github.com/cilium/cilium/pkg/identity/cache.(*localIdentityCache).lookupOrCreate(0xc00096b440, 0xc0011baf00, 0x0, 0xc000f4c800, 0x0, 0x0) /go/src/github.com/cilium/cilium/pkg/identity/cache/local.go:106 +0x32a github.com/cilium/cilium/pkg/identity/cache.(*CachingIdentityAllocator).AllocateIdentity(0xc0001decc0, 0x2f6c4a0, 0xc00263e660, 0xc0011baf00, 0xc000a18400, 0x0, 0x2ed1e00, 0x0, 0x0) /go/src/github.com/cilium/cilium/pkg/identity/cache/allocator.go:377 +0xbfa github.com/cilium/cilium/pkg/ipcache.allocateCIDRs(0xc001031000, 0x816, 0x816, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0) /go/src/github.com/cilium/cilium/pkg/ipcache/cidr.go:105 +0x27d github.com/cilium/cilium/pkg/ipcache.AllocateCIDRsForIPs(0xc0010ce000, 0x816, 0x955, 0x0, 0x0, 0x0, 0x8000101, 0x0, 0xffffffffffffffff) /go/src/github.com/cilium/cilium/pkg/ipcache/cidr.go:66 +0x71 github.com/cilium/cilium/daemon/cmd.cachingIdentityAllocator.AllocateCIDRsForIPs(0xc0001decc0, 0xc0010ce000, 0x816, 0x955, 0x0, 0x252fbe0, 0xc00076ca38, 0x2598940, 0xc00148a610, 0x0) /go/src/github.com/cilium/cilium/daemon/cmd/identity.go:124 +0x4d github.com/cilium/cilium/pkg/policy.(*fqdnSelector).allocateIdentityMappings(0xc0012cfa40, 0x7fe4824e20f8, 0xc0001decc0, 0xc0013b9050) /go/src/github.com/cilium/cilium/pkg/policy/selectorcache.go:502 +0x2dd github.com/cilium/cilium/pkg/policy.(*SelectorCache).AddFQDNSelector(0xc000483420, 0x2f0dc80, 0xc001848f00, 0xc000fcef40, 0x32, 0x0, 0x0, 0x0, 0x0, 0x0) /go/src/github.com/cilium/cilium/pkg/policy/selectorcache.go:831 +0x3db ... 
goroutine 440 [semacquire, 969 minutes]: sync.runtime_SemacquireMutex(0xc000483424, 0xc001163500, 0x1) /usr/local/go/src/runtime/sema.go:71 +0x47 sync.(*Mutex).lockSlow(0xc000483420) /usr/local/go/src/sync/mutex.go:138 +0x105 sync.(*Mutex).Lock(...) /usr/local/go/src/sync/mutex.go:81 sync.(*RWMutex).Lock(0xc000483420) /usr/local/go/src/sync/rwmutex.go:111 +0x90 github.com/cilium/cilium/pkg/policy.(*SelectorCache).UpdateIdentities(0xc000483420, 0xc0009685a0, 0xc0009685d0, 0xc000699e40) /go/src/github.com/cilium/cilium/pkg/policy/selectorcache.go:960 +0x55 github.com/cilium/cilium/daemon/cmd.(*Daemon).UpdateIdentities(0xc0002ed8c0, 0xc0009685a0, 0xc0009685d0) /go/src/github.com/cilium/cilium/daemon/cmd/policy.go:99 +0x6a github.com/cilium/cilium/pkg/identity/cache.(*identityWatcher).watch.func1(0xc000655140, 0xc0001decf8) /go/src/github.com/cilium/cilium/pkg/identity/cache/cache.go:205 +0x20c created by github.com/cilium/cilium/pkg/identity/cache.(*identityWatcher).watch /go/src/github.com/cilium/cilium/pkg/identity/cache/cache.go:154 +0x49 Commit b10797c ("fqdn: Move identity allocation to FQDN selector") contains the following fateful quote in its commit message: > This commit should have no functional changes. Technically the identity > allocation is moved inside the SelectorCache critical section while > holding the mutex, but CIDR identities are only ever locally allocated > within memory anyway so this is not expected to block for a long time. Evidently, the author (yours truly) was unaware that deep in the local identity allocator, it holds a notification mechanism to notify the rest of the selectorcache about the newly allocated identity; or indeed that this in itself would also attempt to grab the SelectorCache mutex in order to propagate this identity into the rest of the policy subsystem. On average, this issue would not trigger a deadlock as there is a channel between the two goroutines at the beginning of this commit message. 
As long as the channel does not become full, and the goroutines are handled by different OS threads, the first goroutine above would not stall at (*localIdentityCache).lookupOrCreate and would instead proceed to release the SelectorCache mutex, at which point the identityWatcher from the second goroutine could acquire the mutex and make progress. However, in certain cases, particularly if many IPs were associated with a single FQDN, and policy recalculation was triggered (perhaps due to creation of a new pod or new FQDN policies), the deadlock could be triggered by filling up the channel. This patch fixes the issue by pushing the identity allocation outside of the critical section of the SelectorCache, so that even if there are many queued identity updates to handle and they fill up the channel, first goroutine will block at (*localIdentityCache).lookupOrCreate() without holding the SelectorCache mutex. This should allow the second goroutine to make progress regardless of the channel filling up or the first goroutine blocking. The following patch will apply similar logic to the identity release path. Fixes: b10797c ("fqdn: Move identity allocation to FQDN selector") Reported-by: Michi Mutsuzaki <michi@isovalent.com> Reported-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Timo Beckers <timo@isovalent.com>
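The shape of the fix can be illustrated with a small, self-contained Go sketch. All names here are hypothetical stand-ins, not Cilium's actual types: the point is only that the allocation step, which may block while a notification channel is drained by another goroutine, now runs before the cache mutex is taken rather than inside the critical section.

```go
package main

import (
	"fmt"
	"sync"
)

// selectorCache is a toy stand-in for the real SelectorCache.
type selectorCache struct {
	mu        sync.Mutex
	selectors map[string][]int
}

// allocateIdentities stands in for the local identity allocator; in the
// real code this call can block on a full notification channel whose
// consumer also needs the SelectorCache mutex.
func allocateIdentities(ips []string) []int {
	ids := make([]int, len(ips))
	for i := range ips {
		ids[i] = 1000 + i
	}
	return ids
}

func (sc *selectorCache) addFQDNSelector(name string, ips []string) {
	// Fixed ordering: allocation happens outside the critical section,
	// so the identity watcher can always acquire the mutex and progress.
	ids := allocateIdentities(ips)

	sc.mu.Lock() // critical section now only covers the map update
	defer sc.mu.Unlock()
	sc.selectors[name] = ids
}

func main() {
	sc := &selectorCache{selectors: map[string][]int{}}
	sc.addFQDNSelector("*.example.com", []string{"10.0.0.1", "10.0.0.2"})
	fmt.Println(sc.selectors["*.example.com"])
}
```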
[ upstream commit 7f9cfd1 ] Similar to the prior commit, when releasing identities, it is possible for the identity allocator to attempt to hold SelectorCache.mutex while handling the release of identities. Therefore, this commit moves the release of identities outside of the critical section for the SelectorCache during AddFQDNSelector() (during selector creation), as well as during the removal of the selector from the SelectorCache. Found by code inspection. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit e9bc13d ] Previously, the config code was assuming that a netdev is L3 (i.e., NOARP) if its L2 addr was nil or 0. Recently, we saw a bug report in which a user reported that an ipip tunnel device (L3) had an L2 addr of 4 bytes. This was confirmed with the following:

```
ip link add name ipip1 type ipip local 10.0.0.3 remote 10.0.0.4
cat > /tmp/main.go <<EOF
package main

import "fmt"
import "github.com/vishvananda/netlink"

func main() {
	link, _ := netlink.LinkByName("ipip1")
	fmt.Println(link.Attrs().HardwareAddr)
}
EOF
go run /tmp/main.go
0a:00:00:03
```

Fix this by making sure that the L2 addr is 6 bytes long. If it's not, then assume that the device is of the L3 type. Reported-by: @sepich Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Timo Beckers <timo@isovalent.com>
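The corrected condition can be sketched as a tiny Go helper. The function name is illustrative, not Cilium's actual identifier: a device is treated as L3 unless its hardware address is exactly 6 bytes (a MAC address), so the 4-byte address reported for an ipip device no longer misclassifies it as L2.

```go
package main

import "fmt"

// isL3Device reports whether a netdev should be treated as L3 (NOARP).
// Only a 6-byte hardware address is considered a proper L2 MAC address.
func isL3Device(hwAddr []byte) bool {
	return len(hwAddr) != 6
}

func main() {
	mac := []byte{0x02, 0x00, 0x00, 0x00, 0x00, 0x01}
	ipip := []byte{0x0a, 0x00, 0x00, 0x03} // 4-byte addr seen on ipip devices
	fmt.Println(isL3Device(mac), isL3Device(ipip)) // false true
}
```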
[ upstream commit 42b7f12 ] Added the `endpointRoute.enabled=true` value to the helm install when running Azure CNI in chaining mode. Signed-off-by: Nitish Malhotra <nitishm@microsoft.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 79645bd ] Commit 51647bb ("Makefile: Simplify to run faster") mistakenly added a '/' to the end of the '$(ROOT_DIR)' in the substitution, which invalidated the greps that attempted to exclude directories that do not need to trigger unit tests, for instance the `test/` directory. As a result, when running commands like `make integration-tests`, it was easy to inadvertently attempt to run the end-to-end tests in an environment that cannot run the end-to-end-tests. Fix this issue by injecting a leading '/' before each package, so that the package grep can match on that leading '/' in order to properly select directories based on their path from the root directory. The extra sed command will then filter the leading '/' back out before passing that variable on. While we're at it, fold the '/test' match into the main grep expression and avoid updating the TESTPKGS_EVAL during initial Makefile evaluation, to further speed up the calculation. Fixes: 51647bb ("Makefile: Simplify to run faster") Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit dd54cd8 ] Identities were being calculated within an improper range. It is necessary to call identity.InitWellKnownIdentities() in order to initialize the proper identity.MinimumAllocationIdentity and identity.MaximumAllocationIdentity limits. The missing EnableWellKnownIdentities option has been added to ensure the responsible code is executed. Signed-off-by: Adam Bocim <adam.bocim@seznam.cz> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 321ec09 ] This was a left-over quirk for DNS visibility a long time ago, and these days we rely on DNS rules for that visibility instead. Remove this legacy work-around so that monitor aggregation can do its job as expected also with other ports. Effectively, this reverts commit 501c2a0 ("bpf: Monitor DNS with MTU-sized payload len"). Fixes: 501c2a0 ("bpf: Monitor DNS with MTU-sized payload len") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 6f7bf6c ] [ Backporter's notes: imports were moved around, so had to re-order ] CEPs are created as well as updated based on informer store data local to an agent's node, but (necessarily) deleted globally from the API server. This can currently lead to situations where an agent that does not own a CEP deletes an unrelated CEP. Avoid this problem by having agents maintain the CEP UID and using it as a precondition when deleting CEPs. This guarantees that only the owning agents can delete "their" CEPs. Signed-off-by: Timo Reimann <ttr314@googlemail.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
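The precondition mechanism can be modeled with a small Go sketch. The types here are hypothetical stand-ins (the real implementation uses a Kubernetes delete precondition on the CEP's UID): deletion only succeeds when the caller's remembered UID still matches the stored one, so a stale owner cannot delete a CEP that was since recreated by another agent.

```go
package main

import (
	"errors"
	"fmt"
)

// cep is a toy stand-in for a CiliumEndpoint object.
type cep struct{ uid string }

// apiServer is a toy stand-in for the API server's object store.
type apiServer struct{ ceps map[string]cep }

// delete removes a CEP only if the caller's expected UID matches the
// currently stored UID, mirroring a UID delete precondition.
func (s *apiServer) delete(name, expectedUID string) error {
	cur, ok := s.ceps[name]
	if !ok {
		return errors.New("not found")
	}
	if cur.uid != expectedUID {
		return errors.New("uid precondition failed")
	}
	delete(s.ceps, name)
	return nil
}

func main() {
	// The CEP was recreated elsewhere and now carries uid-2.
	s := &apiServer{ceps: map[string]cep{"pod-a": {uid: "uid-2"}}}
	// A stale owner deleting with its remembered uid-1 is rejected.
	fmt.Println(s.delete("pod-a", "uid-1"))
	// The current owner deleting with the matching UID succeeds.
	fmt.Println(s.delete("pod-a", "uid-2"))
}
```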
[ upstream commit 1b7bce3 ] [ Backporter's notes: same as previous commit, had to reorder imports ] Signed-off-by: Timo Reimann <ttr314@googlemail.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 4240222 ] Needed for a follow-up change to address an endpoint restoration issue (cilium#18923). Signed-off-by: Timo Reimann <ttr314@googlemail.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 741d4bb ] Pass destination security identity, as used at the time of policy enforcement, from L7 proxies to access logs. This can be 0 if not resolved by the proxy (e.g., by an ingress proxy), but in those cases the logger now resolves the destination IP to the corresponding security identity even when the destination is not a local endpoint. Unify EndpointInfoRegistry interface calls FillEndpointIdentityByID() and FillEndpointIdentityByIP() into a new FillEndpointInfo() that takes both IP and numeric identity as parameters. This is used for both source and destination endpoint info in logger.Addressing() so that EndpointInfo no longer needs to be overwritten in notifyOnDNSMsg() after the Addressing() tag. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit d4a546f ] Fill source and destination endpoint info in LogRecord using the same FillEndpointInfo() function, making LogRecord.localEndpointInfo related functions obsolete. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 3621a22 ] [ Backporter's notes: reordering imports ] endpointInfoRegistry is essentially a package level variable, as it is always the same. Add a new setter to set it once in the beginning. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 97908ba ] Pass source vs. destination addressing info in access logs correctly instead of relying on Hubble to swap them around for responses. This has the benefit that cilium monitor output also has the correct address and security identity for L7 responses. Note that this changes cilium monitor output, which used to incorrectly show security identities as "A->B" for both requests and responses. Now these are formatted as "A->B" and "B->A" for requests and responses, respectively. Fixes: cilium#9558 Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 283937f ] [ Backporter's notes: reordering imports ] Do not discard access logs if the local endpoint cannot be found. This can happen if an endpoint is removed before all access logs for it are received from Envoy. The endpoint's proxy stats still cannot be updated in that case. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Timo Beckers <timo@isovalent.com>
Force-pushed from 8dc58db to d602b86
/test-backport-1.11
Looks good for my changes. Thanks!
/test-backport-1.11
@jrajahalme latest
Seems legit. EDIT - just noticed there is a thread covering this already. Disregard.
/ci-aks-1.11
/test-ci-aks-1.11
Once this PR is merged, you can update the PR labels via: