New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.11 backports 2022-03-30 #19277
v1.11 backports 2022-03-30 #19277
Conversation
[ upstream commit 04c29ba ] Signed-off-by: kerthcet <kerthcet@gmail.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks and looks good for my changes 💯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My changes look good. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My changes and db3a9af looks good to me. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My PR looks good, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My changes look good. Thanks!
d80b2bb
to
8fe003a
Compare
@jibi: I pushed again one of your commits, after renaming |
/test-backport-1.11 Job 'Cilium-PR-K8s-1.23-kernel-4.9' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/cilium/cilium/compare/d80b2bbb2c6e10c00cc5ad670c2c1810a53d5921..8fe003a8ab804e7d1d0909059e32524164dce0d5 looks good, approving again ✔️ ✔️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My changes LGTM, thanks.
|
[ upstream commit a63e02e ] While debugging a CI flake in which pod2pod@same host via ClusterIP done by bpf_lxc TCP SYN was not making into the dst pod, I've noticed that the corresponding CT entry had ProxyRedirect flag set. The test case in which the flag happened didn't have any L7 CNP, but some previous test case had it. So it's very likely that the CT entry was reused for by the connection which failed. Paul Chaignon pointed to a very similar issue [1]. For now, to avoid such flakes, clear the CT tables to avoid stale CT entries reuse. [1]: #17459 Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit ab7ff52 ] There is no need to redeploy Cilium in AfterAll of the "With ClusterIP external access" test suite, as the next test case will deploy it anyway. It should improve the K8sServices test suite run time by a bit. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit 6e2ce4a ] Use the same condition in defining the label 'skip_policy_enforcement' as for jumping to it. Otherwise we could jump to a nonexisting label given a suitable datapath configuration. This does not seem possible now, as ENABLE_HOST_SERVICES_FULL should not be defined if ENABLE_SOCKET_LB_HOST_ONLY is defined. However, as the conditions for defining ENABLE_PER_PACKET_LB evolve, this would become a compilation bug. Fixes: #17154 Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit d53795e ] For this condition: !defined(ENABLE_ROUTING) && defined(TUNNEL_MODE) && !defined(ENABLE_NODEPORT) the endif comment should be: /* !ENABLE_ROUTING && TUNNEL_MODE && !ENABLE_NODEPORT */ Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit 93bc479 ] Switch from gopkg.in/natefinch/lumberjack.v2 to github.com/cilium/lumberjack/v2 This fork fixes log compression by using a temp file and then atomically renaming to the final destination. [ Backport notes: Some minor conflicts in pkg/ files on the includes, as well as in go.mod and go.sum. File pkg/logging/hooks/file_rotation.go is not present in this branch. ] Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit 75f597b ] The Clustermesh-APIServer creates a CiliumEndPoint and sets a node as its ownerReference, also setting blockOwnerDeletion to "true". If the OwnerReferencesPermissionEnforcement admission controller is enabled (such as in environments like Openshift) then the Clustermesh-APIServer will fail to create the CiliumEndPoint as it has insufficient privileges to set blockOwnerDeletion of a node. It needs to be able to "update" "nodes/finalizers" in order to do this. See #19053 for more details and references. Signed-off-by: Nate Sweet <nathanjsweet@pm.me> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit eb3662e ] Signed-off-by: Sigurd Spieckermann <sigurd.spieckermann@gmail.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit 8b67e74 ] Quay changed its API on the 22nd. This commit updates the step that checks for the Docker images in all workflows to use the new API. [ Backport notes: Only update the few files that are present on this branch. Original commit would touch more workflows, but they're on the current branch. ] Suggested-by: André Martins <andre@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit 96e7fa6 ] Because string map options are configured `flags.Var`, they are assigned twice: Once in `flags.Var`, and then again in `option.Config.Populate()` with `GetStringMapString`. The latter is needed because it works around a viper bug where ConfigMap options were not properly parsed (#18328). This commit ensures we're always using the result of the fallback parser, which fixes a bug where a ConfigMap value of `{}` was parsed as `map["{}":""]`. Fixes: 768659f ("cmd: Fix issue reading string map type via config map") Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit 1a34bc5 ] This test already passes, but it's a good corner case to test to avoid running into regressions. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit f7bb659 ] One of the main callers of the K8s cache sync checker is the logic of the kube-apiserver policy feature. However, if K8s is disabled, this is all moot and the kube-apiserver policy feature, well, isn't relevant. Bypass the check on the cache channel if K8s is disabled. The reason why this commit chooses to return true here instead of at the ipcache.InjectLabels() call site is because the ipcache package would have an import cycle with the k8s package. The logic implemented in this commit isn't in the ideal location, but until the k8s & ipcache packages are reasonably well isolated from each other, then this is the only approach to workaround the problem. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit 2efbdd6 ] Local Redirect Policy (LRP) namespace needs to match with the backend pods selected by the LRP. This check was missing in the case where backend pods are deployed after an LRP that selects them was applied. Added unit tests. Reported-by: Joe Stringer <joe@covalent.io> Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit 01cbc59 ] This commit ensures that any update to the labels of a k8s Node object is reflected in the corresponding CiliumNode one. [ Backport notes: - Minor conflicts in pkg/k8s/watchers/node.go. - test/k8s/Node.go renamed as test/k8sT/Node.go. ] Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit f5a8eea ] [ Backport notes: Minor conflicts in pkg/k8s/watchers/cilium_node.go. ] Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit ffa5d48 ] Signed-off-by: ivan <i.makarychev@tinkoff.ru> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit 4e422e3 ] The IP rules for ENI can have two forms, with or without the to x.x.x.x/x part. There is a bug between the creation and deletion of rules, where the conditions to decide which form to use aren't the same. As the result, we would attempt to remove the wrong IP rules on pod deletion, causing a warning message in logs and leaving a stale IP rule behind: level=warning msg="No rule matching found" rule="111: from 100.69.11.94/32 to 10.244.0.0/16 lookup 0" subsys=linux-routing In case a new rule is later added for a new pod with the same IP, we can end up with a conflict if the new pod is on a different ENI. We will then have two rules for the same IP with lookups in different rouing tables. 111: from 10.19.192.168 lookup 10 111: from 10.19.192.168 lookup 11 This conflict can then cause connectivity issues because of pod traffic leaving through the wrong interface. Flag --aws-release-excess-ips needs to be enabled for the bug to occur. Otherwise the IP address stays attached to the same ENI and we don't end up with an IP rule conflict. Reported-by: Sebastian Wicki <gandro@gmx.net> Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit 298ece3 ] In addition to the inconsistency fixed in the previous commit, there is another we need to fix. On the deletion of ip rules with the form "to x.x.x.x/x", we check that we are running in ENI mode. We don't perform the same check on the creation of those rules. Thus, on AKS and AlibabaCloud, we could theoretically create rules of one form and attempt to delete the other form. In practice, this isn't causing a bug because: - The CIDR used in the "to x.x.x.x/x" form appears to always be 0.0.0.0/0 and is therefore equivalent to not having a to CIDR specified. - We don't support equivalent flags to --aws-release-excess-ips on AKS and AlibabaCloud and, as detailed in the previous commit, that flag is necessary to trigger the bug. Thus, this commit is fixing an inconsistency in the code that doesn't have any consequence today. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[ upstream commit f046521 ] The changes in previous PR #18766 was assumed shared service annotation is for local cluster, rather than for remote cluster. This commit is to perform the adjustment. It's easier to explain with concrete example. Let's say we have one clusters A and B, if the service in cluster B is having shared-service: false annotation added, below behavior will happen: - requests to cluster A service will LB to local endpoint only, as service B is no longer shared. - requests to cluster B service will LB to local and remote endpoints as usual. Relates #18766 Signed-off-by: Tam Mach <tam.mach@isovalent.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com>
8fe003a
to
885f97f
Compare
/test-only --focus="K8sServicesTest.*With ClusterIP external access" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Backport of #18336 looks good to me.
/test-backport-1.11 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good on my part :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All backports with conflcits have been reviewed. All required tests are passing except for Runtime which is broken on all stable branches (cf. #18919). A bunch of other tests are failing but that's because I mistakenly triggered the master tests with |
{}
was parsed asmap["{}":""]
. #19172 -- cmd: Fix issue where a ConfigMap value of{}
was parsed asmap["{}":""]
. (@gandro)k8sT
instead ofk8s
Once this PR is merged, you can update the PR labels via:
or with