OCPBUGS-23566 - Update to kubernetes 1.27.8 #1808
When some plugin was registered as "unschedulable" in some previous scheduling attempt, it kept that attribute for a pod forever. When that plugin then later failed with an error that requires backoff, the pod was incorrectly moved to the "unschedulable" queue where it got stuck until the periodic flushing because there was no event that the plugin was waiting for.

Here's an example where that happened:

    framework.go:1280: E0831 20:03:47.184243] Reserve/DynamicResources: Plugin failed err="Operation cannot be fulfilled on podschedulingcontexts.resource.k8s.io \"test-dragxd5c\": the object has been modified; please apply your changes to the latest version and try again" node="scheduler-perf-dra-7l2v2" plugin="DynamicResources" pod="test/test-dragxd5c"
    schedule_one.go:1001: E0831 20:03:47.184345] Error scheduling pod; retrying err="running Reserve plugin \"DynamicResources\": Operation cannot be fulfilled on podschedulingcontexts.resource.k8s.io \"test-dragxd5c\": the object has been modified; please apply your changes to the latest version and try again" pod="test/test-dragxd5c"
    ...
    scheduling_queue.go:745: I0831 20:03:47.198968] Pod moved to an internal scheduling queue pod="test/test-dragxd5c" event="ScheduleAttemptFailure" queue="Unschedulable" schedulingCycle=9576 hint="QueueSkip"

Pop still needs the information about unschedulable plugins to update the UnschedulableReason metric. It can reset that information before returning the PodInfo for the next scheduling attempt.
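The reset-on-Pop idea described above can be sketched roughly as follows. This is a simplified model, not the scheduler's code: `queuedPodInfo`, `schedulingQueue` and the `unschedulableReason` map are hypothetical stand-ins, and the metric is assumed here to behave like a gauge that is decremented when the pod leaves the queue.

```go
package main

import "fmt"

// queuedPodInfo is a hypothetical, stripped-down stand-in for the
// scheduler's per-pod queue entry.
type queuedPodInfo struct {
	Name                 string
	UnschedulablePlugins map[string]struct{}
}

// schedulingQueue models only the active queue and a gauge-like metric
// keyed by plugin name (an assumption made for this sketch).
type schedulingQueue struct {
	active              []*queuedPodInfo
	unschedulableReason map[string]int
}

// Pop returns the next pod. It first uses the recorded plugins to update
// the metric, then clears them so the next scheduling attempt starts clean.
func (q *schedulingQueue) Pop() *queuedPodInfo {
	if len(q.active) == 0 {
		return nil
	}
	p := q.active[0]
	q.active = q.active[1:]
	for plugin := range p.UnschedulablePlugins {
		q.unschedulableReason[plugin]--
	}
	// Without this reset, a stale entry would survive into later attempts.
	p.UnschedulablePlugins = map[string]struct{}{}
	return p
}

func main() {
	q := &schedulingQueue{
		active: []*queuedPodInfo{{
			Name:                 "test/test-dragxd5c",
			UnschedulablePlugins: map[string]struct{}{"DynamicResources": {}},
		}},
		unschedulableReason: map[string]int{"DynamicResources": 1},
	}
	p := q.Pop()
	fmt.Println(len(p.UnschedulablePlugins), q.unschedulableReason["DynamicResources"])
}
```

With the reset in place, a later failure that requires backoff is no longer attributed to a plugin from an earlier attempt.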
The status error was embedded inside the new error constructed by WaitForPodsResponding's get function, but not wrapped. Therefore `apierrors.IsServiceUnavailable(err)` didn't find it and returned false -> no retries.

Wrapping fixes this and Gomega formatting of the error remains useful:

    err := &errors.StatusError{}
    err.ErrStatus.Code = 503
    err.ErrStatus.Message = "temporary failure"
    err2 := fmt.Errorf("Controller %s: failed to Get from replica pod %s:\n%w\nPod status:\n%s", "foo", "bar", err, "some status")
    fmt.Println(format.Object(err2, 1))
    fmt.Println(errors.IsServiceUnavailable(err2))

=>

    <*fmt.wrapError | 0xc000139340>:
        Controller foo: failed to Get from replica pod bar:
        temporary failure
        Pod status:
        some status
        {
            msg: "Controller foo: failed to Get from replica pod bar:\ntemporary failure\nPod status:\nsome status",
            err: <*errors.StatusError | 0xc0001a01e0>{
                ErrStatus: {
                    TypeMeta: {Kind: "", APIVersion: ""},
                    ListMeta: {
                        SelfLink: "",
                        ResourceVersion: "",
                        Continue: "",
                        RemainingItemCount: nil,
                    },
                    Status: "",
                    Message: "temporary failure",
                    Reason: "",
                    Details: nil,
                    Code: 503,
                },
            },
        }
    true
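The difference between embedding and wrapping can be reproduced with the standard library alone. In this sketch, `statusError` and `isServiceUnavailable` are hypothetical stand-ins for the API machinery's `StatusError` and `apierrors.IsServiceUnavailable`; only the error-chain behavior of `%v` versus `%w` is what the example is meant to show.

```go
package main

import (
	"errors"
	"fmt"
)

// statusError is a hypothetical stand-in for an API status error;
// only the HTTP code and message are modeled.
type statusError struct {
	Code    int
	Message string
}

func (e *statusError) Error() string { return e.Message }

// isServiceUnavailable walks the error chain looking for a status error
// with code 503 (an assumption about how such a check can be written).
func isServiceUnavailable(err error) bool {
	var se *statusError
	return errors.As(err, &se) && se.Code == 503
}

// embed formats the cause with %v, flattening it into plain text.
func embed(cause error) error {
	return fmt.Errorf("failed to Get from replica pod:\n%v", cause)
}

// wrap formats the cause with %w, keeping it reachable via errors.As.
func wrap(cause error) error {
	return fmt.Errorf("failed to Get from replica pod:\n%w", cause)
}

func main() {
	cause := &statusError{Code: 503, Message: "temporary failure"}
	fmt.Println(isServiceUnavailable(embed(cause))) // false: cause is only text
	fmt.Println(isServiceUnavailable(wrap(cause)))  // true: cause is in the chain
}
```

Both variants produce the same message text, so the human-readable output is unchanged; only the wrapped one lets a retry check recognize the underlying error.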
It is required for Server-Side Apply to function correctly (SSA is based on OpenAPI schemas).
k8s_tag_files_matching looks for a slash after its argument, so the current value doesn't match anything. Also update codegen; this is required for the apiextensions-apiserver tests. After fixing the apiextensions server tests to use type-aware SSA (instead of erroneously using untyped SSA), there were errors since none of the apiextensions types were actually used in the OpenAPI given to the tests.
it should never be nil
Signed-off-by: Humble Chirammal <humble.devassy@gmail.com>
The installation script uses the coreos/etcd path, which redirects to etcd-io/etcd. This commit replaces it with etcd-io/etcd. Signed-off-by: Humble Chirammal <humble.devassy@gmail.com>
Signed-off-by: Humble Chirammal <humble.devassy@gmail.com>
…-of-#120559-origin-release-1.27 Automated cherry pick of kubernetes#120559: e2e pods: fix WaitForPodsResponding retry
…ck-of-#118027-upstream-release-1.27 [1.27] Automated cherry pick of kubernetes#118027: etcd: Update version to 3.5.9
The Service API REST implementation is complex and has to use different hooks on the REST storage. The status store was making a shallow copy of the storage before adding the hooks, so it was not inheriting the hooks. The status store must have the same hooks as the REST store to correctly handle the allocation and deallocation of ClusterIPs and nodePorts. Change-Id: I44be21468d36017f0ec41a8f912b8490f8f13f55 Signed-off-by: Antonio Ojea <aojea@google.com>
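The copy-ordering problem can be illustrated with a plain struct copy. The `store` type and its `beginUpdate` hook below are hypothetical and far simpler than the real REST storage; the sketch shows one way the ordering matters, not how the upstream change was implemented.

```go
package main

import "fmt"

// store is a hypothetical stand-in for a REST storage with an optional hook.
type store struct {
	name        string
	beginUpdate func() string
}

// copyThenHook reproduces the problem: the status store is copied from the
// main store before the hook is installed, so it never sees the hook.
func copyThenHook() (rest, status store) {
	rest = store{name: "services"}
	status = rest // shallow copy taken too early
	status.name = "services/status"
	rest.beginUpdate = func() string { return "allocate ClusterIP" }
	return rest, status
}

// hookThenCopy installs the hook first, so the copy inherits it.
func hookThenCopy() (rest, status store) {
	rest = store{name: "services"}
	rest.beginUpdate = func() string { return "allocate ClusterIP" }
	status = rest
	status.name = "services/status"
	return rest, status
}

func main() {
	_, broken := copyThenHook()
	_, fixed := hookThenCopy()
	fmt.Println(broken.beginUpdate == nil) // true: hook missing on status store
	fmt.Println(fixed.beginUpdate != nil)  // true: hook inherited
}
```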
Change-Id: I7ed4b006faecf0a7e6e583c42b4d6bc4b786a164
Signed-off-by: TommyStarK <thomasmilox@gmail.com>
This check needs to go after any mutations. After the mutating admission chain, rest.BeforeUpdate (which is responsible for reverting updates to immutable timestamp fields, among other things) is called in the store.Update function. Without moving this check, it will be possible for an object to be written to etcd with only a change to its managed fields timestamp.
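Why the order matters can be modeled in a few lines. Everything here is a hypothetical simplification: `object`, `beforeUpdate` and `dropNoopTimestamp` only imitate the roles described above (reverting an immutable field, and discarding an update whose only change is the managed fields timestamp).

```go
package main

import "fmt"

// object is a hypothetical, comparable stand-in for a stored API object.
type object struct {
	Spec              string
	CreationTime      string // immutable once set
	ManagedFieldsTime int
}

// beforeUpdate reverts changes to the immutable field.
func beforeUpdate(old object, obj *object) { obj.CreationTime = old.CreationTime }

// dropNoopTimestamp restores the old timestamp when nothing else changed.
func dropNoopTimestamp(old object, obj *object) {
	probe := *obj
	probe.ManagedFieldsTime = old.ManagedFieldsTime
	if probe == old {
		obj.ManagedFieldsTime = old.ManagedFieldsTime
	}
}

// updateCheckFirst runs the no-op check before the later mutation.
func updateCheckFirst(old, req object) (stored object, written bool) {
	obj := req
	dropNoopTimestamp(old, &obj)
	beforeUpdate(old, &obj)
	return obj, obj != old
}

// updateCheckLast runs the no-op check after all mutations.
func updateCheckLast(old, req object) (stored object, written bool) {
	obj := req
	beforeUpdate(old, &obj)
	dropNoopTimestamp(old, &obj)
	return obj, obj != old
}

func main() {
	old := object{Spec: "a", CreationTime: "t0", ManagedFieldsTime: 1}
	// The request touches only the immutable field and the timestamp.
	req := object{Spec: "a", CreationTime: "t1", ManagedFieldsTime: 2}
	_, w1 := updateCheckFirst(old, req)
	_, w2 := updateCheckLast(old, req)
	fmt.Println(w1, w2) // true false
}
```

In the first ordering the check still sees the soon-to-be-reverted change, so the object ends up written with only a new timestamp; in the second ordering the update is recognized as a no-op.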
…rry-pick-of-#116865-upstream-release-1.27 Automated cherry pick of kubernetes#116865: move check for noop managed field timestamp updates
Bumping govmomi to include an error check fix needed to work with go1.20. We made this fix in the CI, but were reliant on the text matching of error strings, which is why it didn't catch the actual issue. The fix is in vmware/govmomi@b4eac19. PR to bump govmomi in cloud-provider-vsphere: kubernetes/cloud-provider-vsphere#738 Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
…p-1.27 [1.27][go1.20] .: bump govmomi to v0.30.6
…-of-#120334-origin-release-1.27 Automated cherry pick of kubernetes#120334: scheduler: start scheduling attempt with clean
…-of-#120623-upstream-release-1.27 Automated cherry pick of kubernetes#120623: sync Service API status rest storage
…k-of-#120577-upstream-release-1.27 Automated cherry pick of kubernetes#120577: Increase range of job_sync_duration_seconds
…-pick-of-#117539-upstream-release-1.27 Automated cherry pick of kubernetes#117539: mount-utils: fix flaky test 'TestFormat'
@soltysh: Overrode contexts on behalf of soltysh: ci/prow/verify, ci/prow/verify-commits In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
verified the unit failure manually
@soltysh: Overrode contexts on behalf of soltysh: ci/prow/unit In response to this:
@jerpeter1: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/lgtm
/approve
/remove-label backports/unvalidated-commits
/label backports/validated-commits
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jerpeter1, soltysh The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/label cherry-pick-approved
/label backport-risk-assessed
/label jira/valid-reference
@soltysh: The label(s) In response to this:
1925489 into openshift:release-4.14
[ART PR BUILD NOTIFIER] This PR has been included in build openshift-enterprise-pod-container-v4.14.0-202312010833.p0.g1925489.assembly.stream for distgit openshift-enterprise-pod.
OCPBUGS-23566 - followup to #1808
No description provided.