Update to Kubernetes v1.28.3 #1776
Conversation
When some plugin was registered as "unschedulable" in some previous scheduling attempt, it kept that attribute for a pod forever. When that plugin then later failed with an error that requires backoff, the pod was incorrectly moved to the "unschedulable" queue, where it got stuck until the periodic flushing because there was no event that the plugin was waiting for. Here's an example where that happened:

    framework.go:1280: E0831 20:03:47.184243] Reserve/DynamicResources: Plugin failed err="Operation cannot be fulfilled on podschedulingcontexts.resource.k8s.io \"test-dragxd5c\": the object has been modified; please apply your changes to the latest version and try again" node="scheduler-perf-dra-7l2v2" plugin="DynamicResources" pod="test/test-dragxd5c"
    schedule_one.go:1001: E0831 20:03:47.184345] Error scheduling pod; retrying err="running Reserve plugin \"DynamicResources\": Operation cannot be fulfilled on podschedulingcontexts.resource.k8s.io \"test-dragxd5c\": the object has been modified; please apply your changes to the latest version and try again" pod="test/test-dragxd5c"
    ...
    scheduling_queue.go:745: I0831 20:03:47.198968] Pod moved to an internal scheduling queue pod="test/test-dragxd5c" event="ScheduleAttemptFailure" queue="Unschedulable" schedulingCycle=9576 hint="QueueSkip"

Pop still needs the information about unschedulable plugins to update the UnschedulableReason metric. It can reset that information before returning the PodInfo for the next scheduling attempt.
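To make that last point concrete, here is a minimal, self-contained sketch (simplified stand-in types, not the actual scheduler code) of Pop recording the metric and then clearing the per-pod unschedulable-plugins set before handing the pod to the next attempt:

```go
// Hedged sketch: PodInfo, queue, and recordUnschedulableReason are simplified
// stand-ins for the real scheduler types, chosen only to illustrate the fix.
package main

import "fmt"

type PodInfo struct {
	Name                 string
	UnschedulablePlugins map[string]struct{}
}

type queue struct {
	items []*PodInfo
}

func recordUnschedulableReason(p *PodInfo) {
	for plugin := range p.UnschedulablePlugins {
		fmt.Printf("metric: pod %s unschedulable due to %s\n", p.Name, plugin)
	}
}

// Pop still reads the unschedulable plugins to update the metric, but resets
// the set before returning, so a stale "unschedulable" attribution from a
// previous attempt cannot leak into the next attempt's queueing decision.
func (q *queue) Pop() *PodInfo {
	p := q.items[0]
	q.items = q.items[1:]
	recordUnschedulableReason(p)
	p.UnschedulablePlugins = map[string]struct{}{} // start the new attempt clean
	return p
}

func main() {
	q := &queue{items: []*PodInfo{{
		Name:                 "test-dragxd5c",
		UnschedulablePlugins: map[string]struct{}{"DynamicResources": {}},
	}}}
	p := q.Pop()
	fmt.Println("unschedulable plugins after Pop:", len(p.UnschedulablePlugins)) // 0
}
```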
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
The status error was embedded inside the new error constructed by WaitForPodsResponding's get function, but not wrapped. Therefore `apierrors.IsServiceUnavailable(err)` didn't find it and returned false -> no retries. Wrapping fixes this and Gomega formatting of the error remains useful:

    err := &errors.StatusError{}
    err.ErrStatus.Code = 503
    err.ErrStatus.Message = "temporary failure"
    err2 := fmt.Errorf("Controller %s: failed to Get from replica pod %s:\n%w\nPod status:\n%s", "foo", "bar", err, "some status")
    fmt.Println(format.Object(err2, 1))
    fmt.Println(errors.IsServiceUnavailable(err2))

=>

    <*fmt.wrapError | 0xc000139340>:
    Controller foo: failed to Get from replica pod bar:
    temporary failure
    Pod status:
    some status
    {
        msg: "Controller foo: failed to Get from replica pod bar:\ntemporary failure\nPod status:\nsome status",
        err: <*errors.StatusError | 0xc0001a01e0>{
            ErrStatus: {
                TypeMeta: {Kind: "", APIVersion: ""},
                ListMeta: {
                    SelfLink: "",
                    ResourceVersion: "",
                    Continue: "",
                    RemainingItemCount: nil,
                },
                Status: "",
                Message: "temporary failure",
                Reason: "",
                Details: nil,
                Code: 503,
            },
        },
    }
    true
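As a standalone illustration of why the wrapping matters (a sketch using a plain custom error type instead of the real apimachinery StatusError): only the %w form lets errors.As unwrap and find the embedded error, which is what retry helpers that inspect the status code rely on.

```go
// Assumed, simplified example: statusError stands in for a 503 API status error.
package main

import (
	"errors"
	"fmt"
)

type statusError struct{ code int }

func (e *statusError) Error() string { return fmt.Sprintf("status %d", e.code) }

func main() {
	inner := &statusError{code: 503}

	embedded := fmt.Errorf("failed to Get from replica pod: %v", inner) // message only, not wrapped
	wrapped := fmt.Errorf("failed to Get from replica pod: %w", inner)  // wrapped, unwrappable

	var target *statusError
	fmt.Println(errors.As(embedded, &target)) // false -> a 503 check fails, no retries
	fmt.Println(errors.As(wrapped, &target))  // true  -> the 503 is visible, retries happen
}
```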
Change-Id: Id29b5b377989dcb5377316cfcdea367071a47365
Co-authored-by: Dave Chen <dave.chen@arm.com>
The Service API REST implementation is complex and has to use different hooks on the REST storage. The status store was making a shallow copy of the storage before adding the hooks, so it was not inheriting them. The status store must have the same hooks as the rest store to correctly handle the allocation and deallocation of ClusterIPs and nodePorts.

Change-Id: I44be21468d36017f0ec41a8f912b8490f8f13f55
Signed-off-by: Antonio Ojea <aojea@google.com>
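A minimal sketch of the bug class described above (generic Go, not the actual Service REST storage code): a shallow copy taken before the hooks are attached ends up with nil hooks and silently skips the allocation logic.

```go
// Hedged illustration: "store" and "beginCreate" are invented names standing
// in for the REST storage and its allocation hooks.
package main

import "fmt"

type store struct {
	beginCreate func(name string) // hook that allocates ClusterIPs/nodePorts
}

func main() {
	rest := &store{}

	// Buggy order: copy first, then attach the hook -> the status copy misses it.
	statusBuggy := *rest
	rest.beginCreate = func(name string) { fmt.Println("allocate ClusterIP for", name) }

	// Fixed order: attach the hooks first, then derive the status store from
	// the fully configured rest store so both share the same behavior.
	statusFixed := *rest

	fmt.Println("buggy status store has hook:", statusBuggy.beginCreate != nil) // false
	fmt.Println("fixed status store has hook:", statusFixed.beginCreate != nil) // true
}
```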
…-of-#120559-origin-release-1.28 Automated cherry pick of kubernetes#120559: e2e pods: fix WaitForPodsResponding retry
…k-of-#119824-upstream-release-1.28 Automated cherry pick of kubernetes#119824: fix race on etcd client constructor for healthchecks
…ick-of-#120561-upstream-release-1.28 Automated cherry pick of kubernetes#120561: kubeadm: remove reference of
Change-Id: I7ed4b006faecf0a7e6e583c42b4d6bc4b786a164
Bumping govmomi to include an error check fix needed to work with go1.20. We made this fix in CI, but it relied on text matching of error strings, which is why it didn't catch the actual issue. The fix is in vmware/govmomi@b4eac19. PR to bump govmomi in cloud-provider-vsphere: kubernetes/cloud-provider-vsphere#738

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
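A generic sketch of why matching on error text is fragile (assumed example, not the govmomi code): the string check silently stops matching when the message wording changes, while a sentinel check via errors.Is keeps working.

```go
// Assumed names: errNotFound and lookup are illustrative only.
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errNotFound = errors.New("object not found")

func lookup() error {
	// Wrapping (or a reworded message) changes the rendered text,
	// but errors.Is still matches the sentinel.
	return fmt.Errorf("lookup failed: %w", errNotFound)
}

func main() {
	err := lookup()
	fmt.Println(strings.Contains(err.Error(), "no such object")) // false: the text check drifted
	fmt.Println(errors.Is(err, errNotFound))                     // true: robust check
}
```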
Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
…p-1.28 [1.28][go1.20] .: bump govmomi to v0.30.6
- this function is used by other packages and was mistakenly removed in 397cc73
- let the resource quota controller use this constructor instead of an object instantiation
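A hedged sketch of the constructor-vs-literal distinction (simplified names, not the real resourcequota package): an exported constructor keeps internal fields initialized and remains a stable entry point for other packages.

```go
// Illustrative only: Monitor and NewMonitor here are stand-ins.
package main

import "fmt"

type Monitor struct {
	resources map[string]int // internal state a bare literal would leave nil
}

// NewMonitor is the exported constructor other packages depend on.
func NewMonitor() *Monitor {
	return &Monitor{resources: make(map[string]int)}
}

func (m *Monitor) Track(resource string) {
	m.resources[resource]++ // safe: the map was initialized by the constructor
}

func main() {
	m := NewMonitor() // instead of &Monitor{}, which would panic inside Track
	m.Track("pods")
	fmt.Println(m.resources["pods"])
}
```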
…-of-#120334-origin-release-1.28 Automated cherry pick of kubernetes#120334: scheduler: start scheduling attempt with clean
…-of-#120623-upstream-release-1.28 Automated cherry pick of kubernetes#120623: sync Service API status rest storage
…k-of-#120577-upstream-release-1.28 Automated cherry pick of kubernetes#120577: Increase range of job_sync_duration_seconds
…pick-of-#120777-upstream-release-1.28 Automated cherry pick of kubernetes#120777: reintroduce resourcequota.NewMonitor
…list of cronjobs
Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
…ry-pick-of-#119317-upstream-release-1.28 Automated cherry pick of kubernetes#119317: change rolling update logic to exclude sunsetting nodes
…port kmsv2: reload metrics bug fix backport
…y-pick-of-#120649-origin-release-1.28 Automated cherry pick of kubernetes#120649: cronjob controller: ensure already existing jobs are added to
This change switches to using isCgroup2UnifiedMode locally to ensure that any mocked function is also used when checking the swap controller availability.
Signed-off-by: Evan Lezar <elezar@nvidia.com>

This change bypasses all logic to set swap in the linux container resources if a swap controller is not available on the node. Failing to do so may cause errors in runc when starting a container with a swap configuration -- even if this is set to 0.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
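A minimal sketch of the guard described in the two commits above (simplified, assumed names, not the kubelet code): swap fields are only set when the node's swap controller is available, so runc never sees a swap configuration it cannot apply.

```go
// Hedged illustration: linuxResources, swapControllerAvailable, and
// configureSwap are invented stand-ins for the kubelet/CRI types and checks.
package main

import "fmt"

type linuxResources struct {
	MemorySwapLimitInBytes *int64
}

// swapControllerAvailable stands in for the kubelet's (mockable) probe,
// i.e. the isCgroup2UnifiedMode / swap controller checks mentioned above.
func swapControllerAvailable() bool { return false }

func configureSwap(res *linuxResources, swapBytes int64) {
	if !swapControllerAvailable() {
		return // bypass all swap settings, even a limit of 0
	}
	res.MemorySwapLimitInBytes = &swapBytes
}

func main() {
	res := &linuxResources{}
	configureSwap(res, 0)
	fmt.Println("swap limit set:", res.MemorySwapLimitInBytes != nil) // false on this node
}
```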
…ck-of-#120784-upstream-release-1.28 Automated cherry pick of kubernetes#120784: Use local isCgroup2UnifiedMode consistently
Signed-off-by: cpanato <ctadeu@gmail.com>
…-pick-of-#119732-upstream-release-1.28 Automated cherry pick of kubernetes#119732: Fix to honor PDB with an empty selector `{}`
…ated-cherry-pick-of-#121142-upstream-release-1.28 Automated cherry pick of kubernetes#121142: Modify test PVC to detect concurrent map write bug
Kubernetes official release v1.28.3
UnauthenticatedHTTP2DOSMitigation: {Default: true, PreRelease: featuregate.Beta},
In upstream, this was disabled by kubernetes@0f33a62.
Also, switching the gate here wouldn't be enough. If this gate is to be enabled, it also needs to be switched in /pkg/features/kube_features.go
Update REBASE.openshift.md file with new RHEL 9 images.
This is a follow-up commit to enable the UnauthenticatedHTTP2DOSMitigation feature gate in pkg/features/kube_features.go, which hadn't been done in the previous commit because the gate didn't exist in that file yet.
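For context, enabling the gate in pkg/features/kube_features.go follows the usual feature-gate declaration pattern; the fragment below is an illustrative sketch of that shape (the real constant and registration map already live in that file), not a drop-in patch.

```go
// Illustrative fragment only; the actual file defines many gates and may
// reference the constant from the apiserver's generic features package.
package features

import "k8s.io/component-base/featuregate"

const UnauthenticatedHTTP2DOSMitigation featuregate.Feature = "UnauthenticatedHTTP2DOSMitigation"

var defaultKubernetesFeatureGates = map[featuregate.Feature]featuregate.FeatureSpec{
	// Must agree with the apiserver-side default shown in the diff above.
	UnauthenticatedHTTP2DOSMitigation: {Default: true, PreRelease: featuregate.Beta},
}
```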
51ab05a to aaf5600
/test unit
/test unit
/lgtm
/approve
/override ci/prow/verify-commits
/remove-label backports/unvalidated-commits
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bertinatto, soltysh
The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
@soltysh: Overrode contexts on behalf of soltysh: ci/prow/verify-commits In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@bertinatto: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
CC @soltysh