-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.12 backports 2022-10-06 #21610
v1.12 backports 2022-10-06 #21610
Conversation
[ upstream commit 22cf89a ] The graceful termination feature relies on receiving terminating endpoint events from Kubernetes, which are gated by a flag. Reference the terminating endpoint metric in the feature section so that users can track the metric to check if the feature is working correctly, as a first step. Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 18ef607 ] The troubleshooting guide explains packet drops seen due to conntrack map fill-up issue. Reference some of the conntrack garbage collection related metrics so that users get visibility into conntrack gc runs - how often it's being triggered, how many entries were gc'd, etc. Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 5b41102 ] The mal-formed instance-type looks like below in CiliumNode spec: ```yaml spec: ... alibaba-cloud: instance-type: | <?xml version="1.0" encoding="iso-8859-1"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title>404 - Not Found</title> </head> <body> <h1>404 - Not Found</h1> </body> </html> ``` which prohibits cilium-operator from allocating PodIPs for this node. Signed-off-by: ArthurChiao <arthurchiao@hotmail.com> Reported-by: Jaff Cheng <jaff.cheng.sh@gmail.com> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 5b40396 ] These are the only 2 image fields that are currently not quoted. Some users templatize image name / tag values outside of Helm. Not quoting the image fields can cause issues if these values contain special characters. Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 8ae1562 ] Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 3fe7919 ] This commit has no functional changes. It simply renames a few variables in enableIPsec to make their relationships clearer. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 91fdc20 ] This commit adds two new arguments to UpsertIPsecEndpoint to specify the outer source and destination IP address for IPsec. It doesn't include any functional changes. - UpsertIPsecEndpoint(local, remote, fwd *net.IPNet, ... + UpsertIPsecEndpoint(local, remote, fwd *net.IPNet, outerLocal, outerRemote net.IP, ... Until now, those two outer IP addresses were carried as part of the first two CIDR arguments, `local` and `remote`. For example, `local` would be equal to 192.168.56.11/24 where 192.168.56.0/24 would be used to match packets in XFRM policies and 192.168.56.11 as the outer IP address in XFRM states. The outer IPs are now in separate arguments and the next commit will change the local and remote arguments to not carry the IPs. Why this change? Because in a subsequent commit, I will need the CIDR and IP arguments to diverge. For example, we will have UpsertIPsecEndpoint calls with `local=0.0.0.0/0` and `outerLocal=192.168.56.11`. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 645da80 ] The previous commit changed the UpsertIPsecEndpoint function as follows: - UpsertIPsecEndpoint(local, remote, fwd *net.IPNet, ... + UpsertIPsecEndpoint(local, remote, fwd *net.IPNet, outerLocal, outerRemote net.IP, ... The first two CIDR arguments, `local` and `remote`, now don't need to carry the outer IP addresses (moved to `outerLocal` and `outerRemote`). We can therefore change calls to this function so that those two first arguments carry only the CIDR (i.e., changed from e.g. 192.168.56.11/24 to 192.168.56.0/24). As a result, we also don't need to mask those two arguments when we want only the CIDR part. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit b33adeb ] gops and pprof do not use the same protocol to collect profile data. Thus, the default port for pprof debug endpoints in `cilium-bugtool` should not be the one used for gops, but the default one for pprof itself. Besides, clustermesh-apiserver does not support pprof yet, but only gops. Thus, the help message for the pprof port option in cilium-bugtool is fixed accordingly. Fixes: #416319b1cd (bugtool: Default to the agent's gops port) Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 5520be1 ] submit-backport tried to create a backport PR with reviews from all contributors whose fixes are being backported, including people who do not have collaborator status in the repository. GitHub only allows reviews to be assigned to collaborators, and thus rejected the review assignments. This commit changes submit-backport to filter the review assignments to only include collaborators. Fixes: cilium#21548 Signed-off-by: David Bimmler <david.bimmler@isovalent.com> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 1019e70 ] cilium#21120 (comment) uncovered that WaitUntilReady() is racy when using the DemoDaemonSet. After adding an additional DaemonSet to this manifest, the number of expected pods (as calculated per numExpectedPods()) no longer matches the actually deployed pods. So WaitUntilReady() exits prematurely, and we potentially access the pods' state before they are fully set up. This explains the CI flakes we've observed in eg. cilium#21120. Fix this by updating the internal manifest description of the demo_ds.yaml manifest. This allows WaitUntilReady() to wait for the correct number of pods to become ready when using demo_ds. Also deleteInAnyNamespace() can clean up _all_ daemon sets of this manifest. Fixes: 74d9f56 ("egressgw: manager: add support for new egressGateway policy's property") Fixes cilium#21120 Reported-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit d45999d ] The script originally just exited and assumed the kubelet would restart it, but this is quite disruptive, since that restarts the agent as well. Additionally, it was discovered that some AWS installations need the service network to be functioning before the AWS CNI file is written, leaving clusters in a broken state. So, change the script to retry looking for the AWS CNI file itself. Fixes: cilium#21243 Signed-off-by: Casey Callendrello <cdc@isovalent.com> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 4532996 ] In order to reclaim orphan entries in the NAT table the agent uses the connection tracking state: if a NAT entry is not backed by a connection, it should be considered orphaned and can be reclaimed. Currently the SNAT logic assumes that all connections that needs to be translated are already tracked. The only exception for this is connections originating from the local host, which are tracked by snat_v4_track_local(). In practice egress gateway is another case that requires special attention: in fact, all connections tunneled from a different node to a gateway node are currently not tracked. This means that whenever the CT garbage collector kicks in, all NAT entries for egress gateway will be deleted, potentially leading to connections breakage. This commit fix this by making sure we always track these connections. While at it, also rename snat_v4_track_local() to snat_v4_track_connection(). Towards: cilium#21346 Signed-off-by: Gilberto Bertin <jibi@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 19c2d59 ] For consistency with its v4 counterpart. Signed-off-by: Gilberto Bertin <jibi@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit c62b209 ] No functional changes introduced. Signed-off-by: Gilberto Bertin <jibi@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 658045e ] In snat_v4_needed(), whenever NAT is required because of an egress gateway policy, we unconditionally set the from_endpoint parameter to true, as egress gateway traffic is always originating from an endpoint. This is wrong as that parameter is supposed to track whether traffic is coming from a _local_ endpoint, and so that variable should be set to true only if the __lookup_ip4_endpoint() lookup is successful. Signed-off-by: Gilberto Bertin <jibi@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
@jibi looks like I left a git diff marker in the commit, will fix momentarily. |
[ upstream commit 59ac84f ] Move the from_endpoint parameter of snat_v4_needed() into ipv4_nat_target, and rename the function to snat_v4_prepare_state(). Signed-off-by: Gilberto Bertin <jibi@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit f9697d2 ] Use the information stored in the SNAT target state to determine if the packet is coming from a local endpoint, saving one map lookup. Signed-off-by: Gilberto Bertin <jibi@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
945019a
to
80a76e5
Compare
@jibi ^ force push removed git diff marker. |
The PR description for each backport needs a section like the following so that we can properly generate release notes for the release:
The standard procedure for backports in the backports guide should do this automatically for you, is there a reason this PR doesn't follow that process? |
@joestringer ah, i think i may have deleted the original generated file the script generates. I kept my own copy and used that as input to the branch creation script. |
@ldelossa I see. The general format of the description doesn't matter to me, but either way please ensure that the PR description has that |
Yup will do. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4c1a0e2 LGTM, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
788f7e8 looks fine, thanks! 💯
/ci-awscni-1.12 |
/ci-aks-1.12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My PR looks good. Thanks!
ci-awscni and ci-aks passed: all good for my commit! edit: I'm a dummy, I need to test EKS ☕ |
/ci-eks-1.12 |
@squeed These will all get tested as part of CI once Louis triggers it. |
@joestringer fixed |
dropped #21499 as per offline discussion. |
Due to permission to forked repo, replace this by #21631 |
PR: 20520 -- Reference datapath metrics in feature and troubleshooting guides (@aditighag) -- Reference datapath metrics in feature and troubleshooting guides #20520
Merge with 2 commit(s) merged at: Wed, 28 Sep 2022 13:53:28 +0000!
Branch: master (!) refs/pull/20520/head
---------- -------------------
-- Clean Merge
v (start)
| 22cf89a via 78f9615 ("docs: Reference terminating endpoints metric in kube-proxy free guide")
| 18ef607 via 167c768 ("docs: Reference conntrack metrics in troubleshooting guide")
v (end)
PR: 21495 -- alibabacloud: fix incorrect instance-type reported by cilium-agent (@ArthurChiao) -- alibabacloud: fix incorrect instance-type reported by cilium-agent #21495
Merge with 1 commit(s) merged at: Thu, 29 Sep 2022 21:23:18 +0000!
Branch: master (!) refs/pull/21495/head
---------- -------------------
-- Clean Merge
v (start)
| 5b41102 via 50ea506 ("alibabacloud: fix incorrect instance-type reported by cilium-agent")
v (end)
PR: 21463 -- helm: Quote operator and etcd-operator image fields (@michi-covalent) -- helm: Quote operator and etcd-operator image fields #21463
Merge with 1 commit(s) merged at: Fri, 30 Sep 2022 08:10:28 +0000!
Branch: master (!) refs/pull/21463/head
---------- -------------------
-- Clean Merge
v (start)
| 5b40396 via 5be1df7 ("helm: Quote operator and etcd-operator image fields")
v (end)
PR: 21461 -- ipsec: Refactoring around
UpsertIPsecEndpoint
(@pchaigno) -- ipsec: Refactoring aroundUpsertIPsecEndpoint
#21461Merge with 4 commit(s) merged at: Tue, 04 Oct 2022 10:19:54 +0000!
Branch: master (!) refs/pull/21461/head
---------- -------------------
-- Clean Merge
v (start)
| 8ae1562 via c15aee5 ("ipsec: Simplify DeleteIPsecEndpoint parameter")
| 3fe7919 via 8613c66 ("ipsec: Rename variables in enableIPsec")
| 91fdc20 via 07c03cb ("ipsec: New arguments for UpsertIPsecEndpoint")
| 645da80 via 23051ac ("ipsec: Simplify UpsertIPsecEndpoint CIDR arguments")
v (end)
PR: 21497 -- bugtool: Fix pprof default ports (@pippolo84) -- bugtool: Fix pprof default ports #21497
Merge with 1 commit(s) merged at: Tue, 04 Oct 2022 10:25:59 +0000!
Branch: master (!) refs/pull/21497/head
---------- -------------------
-- Merge Conflict - daemon/cmd/daemon_main.go, operator/flags.go
-- @pippolo84 - Please confirm changes look good.
v (start)
| b33adeb via 9e1917d ("bugtool: Fix pprof default ports")
v (end)
PR: 21565 -- ipcache: Fix metadata access from CIDR allocation (@joestringer) -- ipcache: Fix metadata access from CIDR allocation #21565
Merge with 1 commit(s) merged at: Tue, 04 Oct 2022 22:40:54 +0000!
Branch: master (!) refs/pull/21565/head
---------- -------------------
-- Merge Conflict - skipped due to complexity
-- @joestringer - does not merge cleanly and not sure context for change.
v (start)
| 021ea42 via 3f27143 ("ipcache: Fix metadata access from CIDR allocation")
v (end)
PR: 21577 -- contrib: avoid reviews from non-collaborators (@bimmlerd) -- contrib: avoid reviews from non-collaborators #21577
Merge with 1 commit(s) merged at: Wed, 05 Oct 2022 09:31:54 +0000!
Branch: master (!) refs/pull/21577/head
---------- -------------------
-- Clean Merge
v (start)
| 5520be1 via 7a7137c ("contrib: avoid reviews from non-collaborators")
v (end)
PR: 21588 -- test: fix up the number of pods in DemoDaemonSet (@julianwiedmann) -- test: fix up the number of pods in DemoDaemonSet #21588
Merge with 1 commit(s) merged at: Wed, 05 Oct 2022 17:19:48 +0000!
Branch: master (!) refs/pull/21588/head
---------- -------------------
-- Clean Merge
v (start)
| 1019e70 via ab5d964 ("test: fix up the number of pods in DemoDaemonSet")
v (end)
PR: 21444 -- plugins/cni: wait for AWS to drop its CNI configuration file (@squeed) -- plugins/cni: wait for AWS to drop its CNI configuration file #21444
Merge with 1 commit(s) merged at: Thu, 06 Oct 2022 00:16:55 +0000!
Branch: master (!) refs/pull/21444/head
---------- -------------------
-- Clean Merge
v (start)
| d45999d via 8b55f3f ("plugins/cni: wait for AWS to drop its CNI configuration file")
v (end)