Bug 2044347: bump to k8s 1.23.3 #1145

Merged
merged 126 commits into openshift:master from rebase-1.23.3 on Jan 28, 2022

Conversation

soltysh

@soltysh soltysh commented Jan 26, 2022

This PR has the following changes:

  • bump to 1.23.3
  • update rebase doc (removing the step about pushing tags)

Replaces #1139

cpanato and others added 30 commits December 6, 2021 10:56
Signed-off-by: Carlos Panato <ctadeu@gmail.com>
In the translation lib's rbd plugin, the monitors slice is nil and
has to be initialized with an empty value. This commit addresses that.

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
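
As a rough illustration of the commit above (a minimal sketch; `rbdSource` and `translate` are hypothetical stand-ins, not the actual translation-lib code), the fix amounts to initializing a nil monitors slice to an empty value before it is used:

```go
package main

import "fmt"

// rbdSource stands in for the translated RBD volume source; it is used here
// only for illustration, the real translation-lib types differ.
type rbdSource struct {
	Monitors []string
}

// translate initializes a nil monitors slice to an empty, non-nil value so
// that consumers can range over or append to it safely, and so that it
// serializes as [] rather than null.
func translate(monitors []string) *rbdSource {
	if monitors == nil {
		monitors = []string{}
	}
	return &rbdSource{Monitors: monitors}
}

func main() {
	src := translate(nil)
	fmt.Println(src.Monitors == nil, len(src.Monitors)) // false 0
}
```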
…19 from the rules

Signed-off-by: Carlos Panato <ctadeu@gmail.com>
Signed-off-by: Carlos Panato <ctadeu@gmail.com>
When stacked etcd is used, the code that performs expiration checks
does not validate whether the etcd CA is "external" (missing key)
or whether the certificates signed by the etcd CA are valid.

Add a new function, UsingExternalEtcdCA(), similar to the existing
functions for the cluster CA and front-proxy CA, that checks for a
missing etcd CA key and validates the certificates signed by the etcd CA.

This function only runs for stacked etcd, since if etcd is external
kubeadm does not track any certs signed by that etcd CA.

This fixes a bug where the etcd CA would be reported as local even
if the etcd/ca.key is missing during "certs check-expiration".
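
A hedged sketch of the check described above (this is not kubeadm's actual UsingExternalEtcdCA() implementation; the paths, helper names, and error handling are illustrative only):

```go
package main

import (
	"crypto/x509"
	"encoding/pem"
	"fmt"
	"os"
	"path/filepath"
)

// loadCert is a hypothetical helper: read a PEM-encoded certificate from disk.
func loadCert(path string) (*x509.Certificate, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	block, _ := pem.Decode(data)
	if block == nil {
		return nil, fmt.Errorf("no PEM data in %s", path)
	}
	return x509.ParseCertificate(block.Bytes)
}

// usingExternalEtcdCA sketches the idea: validate that the etcd server
// certificate really was signed by the etcd CA, and report the CA as
// "external" when its private key is not present on disk.
func usingExternalEtcdCA(certsDir string) (bool, error) {
	caCert, err := loadCert(filepath.Join(certsDir, "etcd", "ca.crt"))
	if err != nil {
		return false, err
	}
	serverCert, err := loadCert(filepath.Join(certsDir, "etcd", "server.crt"))
	if err != nil {
		return false, err
	}
	if err := serverCert.CheckSignatureFrom(caCert); err != nil {
		return false, fmt.Errorf("etcd server certificate not signed by etcd CA: %w", err)
	}
	if _, err := os.Stat(filepath.Join(certsDir, "etcd", "ca.key")); os.IsNotExist(err) {
		return true, nil // key missing: CA is treated as external
	}
	return false, nil
}

func main() {
	external, err := usingExternalEtcdCA("/etc/kubernetes/pki")
	fmt.Println(external, err)
}
```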
Apply a small fix to ensure the kubeconfig files
that kubeadm manages have a CA when printed in the table
of the "check expiration" command. "CAName" is the field used for that.

In practice kubeconfig files can contain multiple credentials
from different CAs, but this is not supported by kubeadm and there
is a single cluster CA that signs the single client cert/key
in kubeadm managed kubeconfigs.
When the "kubeadm certs check-expiration" command is used and
if the ca.key is not present, regular on disk certificate reads
pass fine, but fail for kubeconfig files. The reason for the
failure is that reading of kubeconfig files currently
requires reading both the CA key and cert from disk. Reading the CA
is done to ensure that the CA cert in the kubeconfig is not out of date
during renewal.

Instead of requiring both a CA key and cert to be read, only read
the CA cert from disk, as only the cert is needed for kubeconfig files.

This fixes printing the cert expiration table even if the ca.key
is missing on a host (i.e. the CA is considered external).
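
A minimal sketch of the idea (assuming client-go's clientcmd for loading the kubeconfig; the function name and overall flow are illustrative, not kubeadm's actual code). Only the CA certificate is read from disk, so the check still works when ca.key is absent:

```go
package main

import (
	"bytes"
	"fmt"
	"os"

	"k8s.io/client-go/tools/clientcmd"
)

// kubeconfigCAUpToDate compares the CA certificate embedded in a kubeconfig
// against the CA cert on disk. Only ca.crt is read; ca.key is never required,
// so the check also works when the CA is external.
func kubeconfigCAUpToDate(kubeconfigPath, caCertPath string) (bool, error) {
	caOnDisk, err := os.ReadFile(caCertPath)
	if err != nil {
		return false, err
	}
	cfg, err := clientcmd.LoadFromFile(kubeconfigPath)
	if err != nil {
		return false, err
	}
	for _, cluster := range cfg.Clusters {
		if !bytes.Equal(bytes.TrimSpace(cluster.CertificateAuthorityData), bytes.TrimSpace(caOnDisk)) {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	ok, err := kubeconfigCAUpToDate("/etc/kubernetes/admin.conf", "/etc/kubernetes/pki/ca.crt")
	fmt.Println(ok, err)
}
```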
Revert to previous behavior in 1.21/1.20 of setting pod phase to failed
during graceful node shutdown.

Setting pods to the Failed phase ensures that external controllers that
manage pods, like Deployments, will create new pods to replace those that
are shut down. Many customers have taken a dependency on this behavior,
and it was a breaking change in 1.22, so this change reverts to the
previous behavior.

Signed-off-by: David Porter <david@porter.me>
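
A hedged sketch of the reverted behavior (the helper function and the exact reason/message strings are illustrative, not necessarily the kubelet's literal values):

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// markShutdownPod sketches the 1.20/1.21 behavior being restored: during
// graceful node shutdown the pod is moved to the Failed phase, so that
// controllers such as Deployments replace it on another node.
func markShutdownPod(status *v1.PodStatus) {
	status.Phase = v1.PodFailed
	status.Reason = "Terminated"
	status.Message = "Pod was terminated in response to imminent node shutdown."
}

func main() {
	var st v1.PodStatus
	markShutdownPod(&st)
	fmt.Println(st.Phase, st.Reason)
}
```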
…-fix-123

kubelet: set failed phase during graceful shutdown
…k-of-#106641-upstream-release-1.23

Automated cherry pick of kubernetes#106641: DelegateFSGroupToCSIDriver e2e: skip tests with chgrp
…pick-of-#106891-origin-release-1.23

Automated cherry pick of kubernetes#106891: kubeadm: validate local etcd certificates during
…pick-of-#106927-origin-release-1.23

Automated cherry pick of kubernetes#106854: kubeadm: avoid requiring a CA key during kubeconfig
…ck-of-#106878-upstream-release-1.23

Automated cherry pick of kubernetes#106878: rbd: initialize ceph monitors slice with an empty value.
Signed-off-by: Kevin Klues <kklues@nvidia.com>
Signed-off-by: Kevin Klues <kklues@nvidia.com>
Signed-off-by: Kevin Klues <kklues@nvidia.com>
Previously these would return lists that were too long because we appended to
pre-initialized lists with a specific size.

Since the primary place these functions are used is in the mean and standard
deviation calculations for the NUMA distribution algorithm, it meant that the
results of these calculations were often incorrect.

As a result, some of the unit tests we have are actually incorrect (because the
results we expect do not actually produce the best balanced
distribution of CPUs across all NUMA nodes for the input provided).

These tests will be patched up in subsequent commits.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
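
The bug pattern being described is the classic Go mistake of appending to a slice that was pre-initialized with a non-zero length; a self-contained sketch (illustrative values, not the actual cpumanager code):

```go
package main

import "fmt"

func main() {
	numas := []int{0, 1, 2, 3}

	// Buggy pattern: the slice already contains len(numas) zero elements,
	// and append grows it further, so the result is twice as long as intended.
	buggy := make([]int, len(numas))
	for _, n := range numas {
		buggy = append(buggy, n*10)
	}

	// Fixed pattern: pre-allocate capacity only, then append.
	fixed := make([]int, 0, len(numas))
	for _, n := range numas {
		fixed = append(fixed, n*10)
	}

	fmt.Println(len(buggy), buggy) // 8 [0 0 0 0 0 10 20 30]
	fmt.Println(len(fixed), fixed) // 4 [0 10 20 30]
}
```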
This fixes two related tests to better test our "balanced" distribution algorithm.

The first test originally provided an input with the following number of CPUs
available on each NUMA node:

Node 0: 16
Node 1: 20
Node 2: 20
Node 3: 20

It then attempted to distribute 48 CPUs across them with an expectation that
each of the first 3 NUMA nodes would have 16 CPUs taken from them (leaving Node
0 with no more CPUs in the end).

This would have resulted in the following amount of CPUs on each node:

Node 0: 0
Node 1: 4
Node 2: 4
Node 3: 20

Which results in a standard deviation of 7.6811

However, a more balanced solution would actually be to pull 16 CPUs from NUMA
nodes 1, 2, and 3, and leave Node 0 untouched, i.e.:

Node 0: 16
Node 1: 4
Node 2: 4
Node 3: 4

Which results in a standard deviation of 5.1961524227066

To fix this test we changed the original number of available CPUs to start with
4 fewer CPUs on NUMA node 3 and 2 more CPUs on NUMA node 0, i.e.:

Node 0: 18
Node 1: 20
Node 2: 20
Node 3: 16

So that we end up with a result of:

Node 0: 2
Node 1: 4
Node 2: 4
Node 3: 16

Which pulls the CPUs from where we want and results in a standard deviation of 5.5452

For the second test, we simply swap the number of CPUs available on Nodes 0
and 3:

Node 0: 16
Node 1: 20
Node 2: 20
Node 3: 18

This forces the allocation to happen just as it did for the first test, except
now on NUMA nodes 1, 2, and 3 instead of NUMA nodes 0, 1, and 2.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
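
The standard deviation figures quoted above can be reproduced by computing the population standard deviation of the leftover CPU counts per NUMA node; a small sketch (not the actual test code):

```go
package main

import (
	"fmt"
	"math"
)

// stddev computes the population standard deviation of the remaining CPU
// counts per NUMA node, the metric quoted in the commit message.
func stddev(xs []float64) float64 {
	var mean float64
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))
	var variance float64
	for _, x := range xs {
		variance += (x - mean) * (x - mean)
	}
	return math.Sqrt(variance / float64(len(xs)))
}

func main() {
	fmt.Println(stddev([]float64{0, 4, 4, 20})) // ~7.6811 (original expectation)
	fmt.Println(stddev([]float64{16, 4, 4, 4})) // ~5.1962 (more balanced option)
	fmt.Println(stddev([]float64{2, 4, 4, 16})) // ~5.5453 (updated first test)
}
```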
Now that the algorithm for balancing CPU distributions across NUMA nodes is
correct, this test actually behaves differently for the "packed" vs.
"distributed" allocation algorithms (as it should).

In the "packed" case we need to ensure that CPUs are allocated such that they
are packed onto cores. Since one CPU is already allocated from a core on NUMA
node 0, we want the next CPU to be its hyperthreaded pair (even though the
first available CPU id is on Socket 1).

In the "distributed" case, however, we want to ensure CPUs are allocated such
that we have an balanced distribution of CPUs across all NUMA nodes. This
points to allocating from Socket 1 if the only other CPU allocated has been
done on Socket 0.

To allow CPU allocations to be packed onto full cores, one can allocate them
from the "distributed" algorithm with a 'cpuGroupSize' equal to the number of
hyperthreads per core (in this case 2). We added an explicit test case for this,
demonstrating that we get the same result as the "packed" algorithm does, even
though the "distributed" algorithm is in use.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
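
As a toy illustration of the 'cpuGroupSize' idea only (not the real static CPU manager allocator): when CPUs are handed out in groups equal to the number of hyperthreads per core, sibling threads stay together, so the "distributed" algorithm ends up filling whole cores just like the "packed" one:

```go
package main

import "fmt"

// takeByGroup hands out CPUs in chunks of groupSize so that sibling
// hyperthreads on a core stay together. cpusByCore maps core ID -> its CPU
// IDs (2 per core with hyperthreading). Purely illustrative.
func takeByGroup(cpusByCore map[int][]int, cores []int, want, groupSize int) []int {
	var out []int
	for _, core := range cores {
		if len(out) >= want {
			break
		}
		// Take a whole group (e.g. both hyperthreads of the core) at once.
		out = append(out, cpusByCore[core][:groupSize]...)
	}
	return out
}

func main() {
	cpusByCore := map[int][]int{0: {0, 4}, 1: {1, 5}, 2: {2, 6}, 3: {3, 7}}
	// With groupSize=2, asking for 4 CPUs yields two full cores: CPUs 0,4 and 1,5.
	fmt.Println(takeByGroup(cpusByCore, []int{0, 1, 2, 3}, 4, 2))
}
```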
Previously the algorithm was too restrictive because it tried to calculate the
minimum based on the number of *available* NUMA nodes and the number of
*available* CPUs on those NUMA nodes. Since there was no (easy) way to tell how
many CPUs an individual NUMA node happened to have, the average across them was
used. Using this value, however, could result in thinking you need more NUMA
nodes to satisfy a request than you actually do.

By using the *total* number of NUMA nodes and CPUs per NUMA node, we can get
the true minimum number of nodes required to satisfy a request. For a given
"current" allocation this may not be the true minimum, but its better to start
with fewer and move up than to start with too many and miss out on a better
option.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
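
The change boils down to a ceiling division over the machine's total CPUs per NUMA node; a tiny sketch (hypothetical function, not the actual cpumanager helper):

```go
package main

import "fmt"

// minNUMANodes computes the minimum number of NUMA nodes needed for a request
// from the machine's *total* CPUs per NUMA node, rather than from the average
// of what happens to be free on the available nodes.
func minNUMANodes(requested, cpusPerNUMANode int) int {
	// Ceiling division: e.g. 48 CPUs requested with 20 CPUs per node -> 3 nodes.
	return (requested + cpusPerNUMANode - 1) / cpusPerNUMANode
}

func main() {
	fmt.Println(minNUMANodes(48, 20)) // 3
}
```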
Signed-off-by: Kevin Klues <kklues@nvidia.com>
Signed-off-by: Kevin Klues <kklues@nvidia.com>
Signed-off-by: Kevin Klues <kklues@nvidia.com>
Without this fix, the algorithm may decide to allocate "remainder" CPUs from a
NUMA node that has no more CPUs to allocate. Moreover, it was only considering
allocation of remainder CPUs from NUMA nodes such that each NUMA node in the
remainderSet could only allocate 1 (i.e. 'cpuGroupSize') more CPUs. With these
two issues in play, one could end up with an accounting error where not enough
CPUs were allocated by the time the algorithm runs to completion.

The updated algorithm will now omit any NUMA nodes that have 0 CPUs left from
the set of NUMA nodes considered for allocating remainder CPUs. Additionally,
we now consider *all* combinations of nodes from the remainder set of size
1..len(remainderSet). This allows us to find a better solution if allocating
CPUs from a smaller set leads to a more balanced allocation. Finally, we loop
through all NUMA nodes 1-by-1 in the remainderSet until all remainder CPUs have
been accounted for and allocated. This ensures that we will not hit an
accounting error later on because we explicitly remove CPUs from the remainder
set until there are none left.

A follow-on commit adds a set of unit tests that will fail before these
changes, but succeed after them.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
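
A hedged sketch of the remainder-set handling described above (illustrative only; the real cpumanager code differs): nodes with no CPUs left are dropped, and candidate subsets of every size from 1 up to the full remainder set are considered:

```go
package main

import "fmt"

// combinations returns all k-element subsets of nodes (order preserved).
func combinations(nodes []int, k int) [][]int {
	if k == 0 {
		return [][]int{{}}
	}
	if len(nodes) < k {
		return nil
	}
	var out [][]int
	// Either include nodes[0] in the subset or skip it.
	for _, rest := range combinations(nodes[1:], k-1) {
		out = append(out, append([]int{nodes[0]}, rest...))
	}
	return append(out, combinations(nodes[1:], k)...)
}

func main() {
	// Hypothetical leftover CPUs per NUMA node in the remainder set.
	free := map[int]int{0: 0, 1: 3, 2: 1, 3: 2}

	// Drop nodes with nothing left, as the fixed algorithm does.
	var remainderSet []int
	for _, n := range []int{0, 1, 2, 3} {
		if free[n] > 0 {
			remainderSet = append(remainderSet, n)
		}
	}

	// Consider *all* subset sizes 1..len(remainderSet), not just the full set.
	// The full algorithm would then allocate remainder CPUs node by node from
	// each candidate subset and keep the most balanced result.
	for k := 1; k <= len(remainderSet); k++ {
		for _, subset := range combinations(remainderSet, k) {
			fmt.Println(subset)
		}
	}
}
```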
@openshift-ci

openshift-ci bot commented Jan 26, 2022

@soltysh: Overrode contexts on behalf of soltysh: ci/prow/verify-commits

In response to this:

/override ci/prow/verify-commits
that can't pass
/hold
for payload tests verification

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@soltysh
Author

soltysh commented Jan 27, 2022

/payload 4.10 ci blocking
/payload 4.10 ci informing
/payload 4.10 nightly blocking
/payload 4.10 nightly informing

@openshift-ci

openshift-ci bot commented Jan 27, 2022

@soltysh: trigger 5 jobs of type blocking for the ci release of OCP 4.10

  • periodic-ci-openshift-release-master-ci-4.10-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-azure-upgrade
  • periodic-ci-openshift-release-master-ci-4.10-e2e-gcp-upgrade
  • periodic-ci-openshift-release-master-ci-4.10-e2e-aws-serial

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c4b7bcd0-7f5b-11ec-8d7c-d99303b7cdb0-0

trigger 5 jobs of type informing for the ci release of OCP 4.10

  • periodic-ci-openshift-release-master-nightly-4.10-credentials-request-freeze
  • periodic-ci-openshift-release-master-ci-4.10-e2e-gcp
  • periodic-ci-openshift-release-master-ci-4.10-e2e-gcp-upgrade
  • periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-aws-upgrade
  • periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-aws-ovn-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c4b7bcd0-7f5b-11ec-8d7c-d99303b7cdb0-1

trigger 7 jobs of type blocking for the nightly release of OCP 4.10

  • periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-upgrade
  • periodic-ci-openshift-release-master-ci-4.10-e2e-azure-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-serial
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi-ovn-ipv6
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi-serial-ipv4

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c4b7bcd0-7f5b-11ec-8d7c-d99303b7cdb0-2

trigger 44 jobs of type informing for the nightly release of OCP 4.10

  • periodic-ci-openshift-release-master-nightly-4.10-e2e-aws
  • periodic-ci-openshift-release-master-nightly-4.10-console-aws
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-csi
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-fips
  • periodic-ci-openshift-release-master-ci-4.10-e2e-aws-ovn
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-single-node
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-single-node-serial
  • periodic-ci-openshift-release-master-ci-4.10-e2e-aws-techpreview
  • periodic-ci-openshift-release-master-ci-4.10-e2e-aws-techpreview-serial
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-azure
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-azure-csi
  • periodic-ci-openshift-release-master-ci-4.10-e2e-azure-ovn
  • periodic-ci-openshift-release-master-ci-4.10-e2e-azure-techpreview
  • periodic-ci-openshift-release-master-ci-4.10-e2e-azure-techpreview-serial
  • periodic-ci-openshift-release-master-ci-4.10-e2e-azure-upgrade-single-node
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-azure-deploy-cnv
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-azure-upgrade-cnv
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-gcp
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-gcp-csi
  • periodic-ci-openshift-release-master-ci-4.10-e2e-gcp-ovn
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-gcp-rt
  • periodic-ci-openshift-release-master-ci-4.10-e2e-gcp-techpreview
  • periodic-ci-openshift-release-master-ci-4.10-e2e-gcp-techpreview-serial
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-assisted
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-assisted-ipv6
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi-ovn-dualstack
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi-upgrade
  • periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.9-e2e-metal-ipi-upgrade
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi-upgrade-ovn-ipv6
  • periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.9-e2e-metal-ipi-upgrade-ovn-ipv6
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi-virtualmedia
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-ovirt
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-ovirt-csi
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-proxy
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-single-node-live-iso
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-upgrade
  • periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.9-e2e-aws-upgrade
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-vsphere
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-vsphere-csi
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-vsphere-serial
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-vsphere-techpreview
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-vsphere-techpreview-serial
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-vsphere-upi
  • periodic-ci-openshift-release-master-nightly-4.10-e2e-vsphere-upi-serial

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c4b7bcd0-7f5b-11ec-8d7c-d99303b7cdb0-3

@soltysh
Author

soltysh commented Jan 27, 2022

/hold cancel
/remove-label backports/unvalidated-commits
/label backports/validated-commits

@openshift-ci openshift-ci bot added backports/validated-commits Indicates that all commits come to merged upstream PRs. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. backports/unvalidated-commits Indicates that not all commits come to merged upstream PRs. labels Jan 27, 2022
@wallylewis

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 27, 2022
@openshift-ci

openshift-ci bot commented Jan 27, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: soltysh, wallylewis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@soltysh
Author

soltysh commented Jan 27, 2022

/test e2e-aws-downgrade

1 similar comment
@wallylewis

/test e2e-aws-downgrade

@soltysh
Author

soltysh commented Jan 27, 2022

/override ci/prow/e2e-aws-downgrade

@openshift-ci

openshift-ci bot commented Jan 27, 2022

@soltysh: Overrode contexts on behalf of soltysh: ci/prow/e2e-aws-downgrade

In response to this:

/override ci/prow/e2e-aws-downgrade

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@soltysh
Author

soltysh commented Jan 27, 2022

/override ci/prow/verify-commits

@openshift-ci

openshift-ci bot commented Jan 27, 2022

@soltysh: Overrode contexts on behalf of soltysh: ci/prow/verify-commits

In response to this:

/override ci/prow/verify-commits

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-bot

/retest-required

Please review the full test history for this PR and help us cut down flakes.

6 similar comments
@soltysh
Author

soltysh commented Jan 28, 2022

/override ci/prow/e2e-aws-fips

@openshift-ci

openshift-ci bot commented Jan 28, 2022

@soltysh: Overrode contexts on behalf of soltysh: ci/prow/e2e-aws-fips

In response to this:

/override ci/prow/e2e-aws-fips

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci

openshift-ci bot commented Jan 28, 2022

@soltysh: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit b63be7f into openshift:master Jan 28, 2022
@openshift-ci

openshift-ci bot commented Jan 28, 2022

@soltysh: Some pull requests linked via external trackers have merged:

The following pull requests linked via external trackers have not merged:

These pull requests must merge or be unlinked from the Bugzilla bug in order for it to move to the next state. Once unlinked, request a bug refresh with /bugzilla refresh.

Bugzilla bug 2044347 has not been moved to the MODIFIED state.

In response to this:

Bug 2044347: bump to k8s 1.23.3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@soltysh soltysh deleted the rebase-1.23.3 branch January 28, 2022 11:22
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. backports/validated-commits Indicates that all commits come to merged upstream PRs. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged. vendor-update Touching vendor dir or related files