OCPBUGS-23566 - Update to kubernetes 1.27.8 #1808

Merged
merged 89 commits into openshift:release-4.14 on Dec 1, 2023

Conversation

jerpeter1
Member

No description provided.

pohly and others added 30 commits September 11, 2023 09:48
When some plugin was registered as "unschedulable" in some previous scheduling
attempt, it kept that attribute for a pod forever. When that plugin then later
failed with an error that requires backoff, the pod was incorrectly moved to the
"unschedulable" queue where it got stuck until the periodic flushing because
there was no event that the plugin was waiting for.

Here's an example where that happened:

    framework.go:1280: E0831 20:03:47.184243] Reserve/DynamicResources: Plugin failed err="Operation cannot be fulfilled on podschedulingcontexts.resource.k8s.io \"test-dragxd5c\": the object has been modified; please apply your changes to the latest version and try again" node="scheduler-perf-dra-7l2v2" plugin="DynamicResources" pod="test/test-dragxd5c"
    schedule_one.go:1001: E0831 20:03:47.184345] Error scheduling pod; retrying err="running Reserve plugin \"DynamicResources\": Operation cannot be fulfilled on podschedulingcontexts.resource.k8s.io \"test-dragxd5c\": the object has been modified; please apply your changes to the latest version and try again" pod="test/test-dragxd5c"
    ...
    scheduling_queue.go:745: I0831 20:03:47.198968] Pod moved to an internal scheduling queue pod="test/test-dragxd5c" event="ScheduleAttemptFailure" queue="Unschedulable" schedulingCycle=9576 hint="QueueSkip"

Pop still needs the information about unschedulable plugins to update the
UnschedulableReason metric. It can reset that information before returning the
PodInfo for the next scheduling attempt.
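
A minimal sketch of the idea described above, not the actual scheduler code: `podInfo`, `queue`, `pop` and `requeue` are hypothetical, simplified stand-ins. Popping a pod first uses the recorded plugins for the metric and then clears them, so a later plain error sends the pod to backoff instead of the unschedulable queue.

```go
package main

import "fmt"

// podInfo is a hypothetical, simplified stand-in for the scheduler's queued pod record.
type podInfo struct {
	name                 string
	unschedulablePlugins map[string]struct{} // plugins that rejected the pod in the last attempt
}

// queue is a hypothetical, simplified scheduling queue.
type queue struct {
	active  []*podInfo
	metrics map[string]int // per-plugin gauge standing in for the UnschedulableReason metric
}

// pop returns the next pod. It uses the recorded plugins to update the
// metric, then resets them so the next scheduling attempt starts clean.
func (q *queue) pop() *podInfo {
	if len(q.active) == 0 {
		return nil
	}
	p := q.active[0]
	q.active = q.active[1:]
	for plugin := range p.unschedulablePlugins {
		q.metrics[plugin]--
	}
	p.unschedulablePlugins = map[string]struct{}{}
	return p
}

// requeue picks the target queue after a failed attempt: pods rejected by a
// plugin wait for an event, pods that hit a plain error are retried after backoff.
func requeue(p *podInfo) string {
	if len(p.unschedulablePlugins) > 0 {
		return "unschedulable"
	}
	return "backoff"
}

func main() {
	stale := &podInfo{
		name:                 "test-pod",
		unschedulablePlugins: map[string]struct{}{"DynamicResources": {}},
	}
	q := &queue{active: []*podInfo{stale}, metrics: map[string]int{"DynamicResources": 1}}

	p := q.pop()
	// The new attempt fails with a plain error; no plugin marks the pod unschedulable.
	fmt.Println(requeue(p))
}
```

Without the reset in `pop`, the stale "DynamicResources" entry would still be present and `requeue` would pick the unschedulable queue.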
The status error was embedded inside the new error constructed by
WaitForPodsResponding's get function, but not wrapped. Therefore
`apierrors.IsServiceUnavailable(err)` didn't find it and returned false -> no
retries.

Wrapping fixes this and Gomega formatting of the error remains useful:

	err := &errors.StatusError{}
	err.ErrStatus.Code = 503
	err.ErrStatus.Message = "temporary failure"

	err2 := fmt.Errorf("Controller %s: failed to Get from replica pod %s:\n%w\nPod status:\n%s",
		"foo", "bar",
		err, "some status")
	fmt.Println(format.Object(err2, 1))
	fmt.Println(errors.IsServiceUnavailable(err2))

=>

    <*fmt.wrapError | 0xc000139340>:
    Controller foo: failed to Get from replica pod bar:
    temporary failure
    Pod status:
    some status
    {
        msg: "Controller foo: failed to Get from replica pod bar:\ntemporary failure\nPod status:\nsome status",
        err: <*errors.StatusError | 0xc0001a01e0>{
            ErrStatus: {
                TypeMeta: {Kind: "", APIVersion: ""},
                ListMeta: {
                    SelfLink: "",
                    ResourceVersion: "",
                    Continue: "",
                    RemainingItemCount: nil,
                },
                Status: "",
                Message: "temporary failure",
                Reason: "",
                Details: nil,
                Code: 503,
            },
        },
    }

    true
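
The difference between embedding and wrapping can also be shown with the standard library alone. This is a sketch under assumptions: `statusError` and `isServiceUnavailable` are hypothetical stand-ins, not the apimachinery types, and the checker is assumed to walk the error chain with `errors.As`.

```go
package main

import (
	"errors"
	"fmt"
)

// statusError is a hypothetical stand-in for an API status error with an HTTP code.
type statusError struct {
	code int
	msg  string
}

func (e *statusError) Error() string { return e.msg }

// isServiceUnavailable mimics a checker that searches the error chain
// for a statusError with code 503.
func isServiceUnavailable(err error) bool {
	var se *statusError
	return errors.As(err, &se) && se.code == 503
}

// embed formats the cause with %v: the text is kept, the chain is lost.
func embed(err error) error {
	return fmt.Errorf("failed to Get from replica pod:\n%v", err)
}

// wrap formats the cause with %w: same text, and the cause stays reachable.
func wrap(err error) error {
	return fmt.Errorf("failed to Get from replica pod:\n%w", err)
}

func main() {
	base := &statusError{code: 503, msg: "temporary failure"}
	fmt.Println(isServiceUnavailable(embed(base))) // false
	fmt.Println(isServiceUnavailable(wrap(base)))  // true
}
```

Both variants produce the same message text, so log output does not change; only the wrapped variant lets a retry check find the underlying error.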
It is required for Server-Side Apply to function correctly (SSA is based on OpenAPI schemas).
k8s_tag_files_matching looks for a slash after its argument, so the current value doesn't match anything.

also update codegen

This is required for apiextensions-apiserver tests. After fixing the apiextensions server tests to use type-aware SSA (instead of erroneously using untyped SSA), there were errors because none of the apiextensions types were actually present in the OpenAPI given to the tests.
Signed-off-by: Humble Chirammal <humble.devassy@gmail.com>
In the installation script we use the coreos/etcd path, which redirects
to etcd-io/etcd. This commit replaces it with the latter.

Signed-off-by: Humble Chirammal <humble.devassy@gmail.com>
Signed-off-by: Humble Chirammal <humble.devassy@gmail.com>
…-of-#120559-origin-release-1.27

Automated cherry pick of kubernetes#120559: e2e pods: fix WaitForPodsResponding retry
…ck-of-#118027-upstream-release-1.27

[1.27] Automated cherry pick of kubernetes#118027: etcd: Update version to 3.5.9
The Service API Rest implementation is complex and has to use different
hooks on the REST storage. The status store was making a shallow copy of
the storage before adding the hooks, so it was not inheriting the hooks.

The status store must have the same hooks as the rest store to be able
to handle correctly the allocation and deallocation of ClusterIPs and
nodePorts.
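
A minimal sketch of the copy-ordering problem described above, assuming a hypothetical `store` type with a single `afterCreate` hook. It is not the Service REST code, and copying after the hooks are set is only one possible way to keep the two stores in sync.

```go
package main

import "fmt"

// store is a hypothetical stand-in for a generic REST store with an optional hook.
type store struct {
	name        string
	afterCreate func(obj string)
}

// buildBuggy copies the store before the hook is attached, so the
// status store never sees it.
func buildBuggy(log *[]string) (rest, status *store) {
	rest = &store{name: "services"}
	statusCopy := *rest // shallow copy taken BEFORE the hook is set
	rest.afterCreate = func(obj string) { *log = append(*log, "allocate "+obj) }
	return rest, &statusCopy
}

// buildFixed attaches the hook first, so the shallow copy inherits it.
func buildFixed(log *[]string) (rest, status *store) {
	rest = &store{name: "services"}
	rest.afterCreate = func(obj string) { *log = append(*log, "allocate "+obj) }
	statusCopy := *rest // shallow copy taken AFTER the hook is set
	return rest, &statusCopy
}

func main() {
	var log []string
	_, buggyStatus := buildBuggy(&log)
	_, fixedStatus := buildFixed(&log)
	fmt.Println(buggyStatus.afterCreate == nil) // true: hook missing
	fmt.Println(fixedStatus.afterCreate == nil) // false: hook inherited
}
```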

Change-Id: I44be21468d36017f0ec41a8f912b8490f8f13f55
Signed-off-by: Antonio Ojea <aojea@google.com>
Change-Id: I7ed4b006faecf0a7e6e583c42b4d6bc4b786a164
Signed-off-by: TommyStarK <thomasmilox@gmail.com>
This check needs to go after any mutations. After the mutating admission chain, rest.BeforeUpdate (which is responsible, among other things, for reverting updates to immutable timestamp fields) is called in the store.Update function. Without moving this check, an object could be written to etcd with only a change to its managed fields timestamp.
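
A rough sketch of why the ordering matters, using hypothetical types and function names (`object`, `beforeUpdate`, `noopExceptManagedTime`); the real apiserver code paths are more involved. If the no-op check runs before the immutable field is reverted, it sees a difference that later disappears, and the timestamp-only change slips through.

```go
package main

import "fmt"

// object is a hypothetical, simplified API object.
type object struct {
	creationTime int    // immutable; reverted by beforeUpdate
	spec         string // user-controlled content
	managedTime  int    // managed fields timestamp
}

// beforeUpdate stands in for the step that reverts changes to immutable fields.
func beforeUpdate(old object, updated *object) {
	updated.creationTime = old.creationTime
}

// noopExceptManagedTime reports whether the update changes nothing
// besides the managed fields timestamp.
func noopExceptManagedTime(old, updated object) bool {
	updated.managedTime = old.managedTime
	return old == updated
}

// updateCheckFirst runs the no-op check before the revert (the problematic order).
func updateCheckFirst(old, updated object) object {
	if noopExceptManagedTime(old, updated) {
		updated.managedTime = old.managedTime
	}
	beforeUpdate(old, &updated)
	return updated
}

// updateCheckLast runs the no-op check after all mutations.
func updateCheckLast(old, updated object) object {
	beforeUpdate(old, &updated)
	if noopExceptManagedTime(old, updated) {
		updated.managedTime = old.managedTime
	}
	return updated
}

func main() {
	old := object{creationTime: 1, spec: "a", managedTime: 10}
	req := object{creationTime: 2, spec: "a", managedTime: 11}
	fmt.Println(updateCheckFirst(old, req) == old) // false: only the timestamp differs
	fmt.Println(updateCheckLast(old, req) == old)  // true: nothing to write
}
```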
…rry-pick-of-#116865-upstream-release-1.27

Automated cherry pick of kubernetes#116865: move check for noop managed field timestamp updates
Bumping govmomi to include an error check fix needed
to work with go1.20. We made this fix in the CI, but
were reliant on the text matching of error strings,
which is why it didn't catch the actual issue. This

Fix in vmware/govmomi@b4eac19
PR to bump govmomi in cloud-provider-vsphere: kubernetes/cloud-provider-vsphere#738

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
…p-1.27

[1.27][go1.20] .: bump govmomi to v0.30.6
…-of-#120334-origin-release-1.27

Automated cherry pick of kubernetes#120334: scheduler: start scheduling attempt with clean
…-of-#120623-upstream-release-1.27

Automated cherry pick of kubernetes#120623: sync Service API status rest storage
…k-of-#120577-upstream-release-1.27

Automated cherry pick of kubernetes#120577: Increase range of job_sync_duration_seconds
…-pick-of-#117539-upstream-release-1.27

Automated cherry pick of kubernetes#117539: mount-utils: fix flaky test 'TestFormat'

openshift-ci bot commented Nov 30, 2023

@soltysh: Overrode contexts on behalf of soltysh: ci/prow/verify, ci/prow/verify-commits

In response to this:

/test unit
/override ci/prow/verify-commits
this never passes on k8s bump PR
/override ci/prow/verify
this will be fixed in a followup (ref TBA)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@soltysh
Member

soltysh commented Nov 30, 2023

verified the unit failure manually
/override ci/prow/unit


openshift-ci bot commented Nov 30, 2023

@soltysh: Overrode contexts on behalf of soltysh: ci/prow/unit

In response to this:

verified the unit failure manually
/override ci/prow/unit

openshift-ci bot commented Nov 30, 2023

@jerpeter1: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/unit 9904f27 link true /test unit

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Member

@soltysh soltysh left a comment

/lgtm
/approve

/remove-label backports/unvalidated-commits
/label backports/validated-commits

@openshift-ci openshift-ci bot added backports/validated-commits Indicates that all commits come to merged upstream PRs. lgtm Indicates that a PR is ready to be merged. and removed backports/unvalidated-commits Indicates that not all commits come to merged upstream PRs. labels Nov 30, 2023

openshift-ci bot commented Nov 30, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jerpeter1, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 30, 2023
@gangwgr

gangwgr commented Nov 30, 2023

/label cherry-pick-approved

@openshift-ci openshift-ci bot added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Nov 30, 2023
@soltysh
Member

soltysh commented Dec 1, 2023

/label backport-risk-assessed
/label jira/valid-bug
since #1806 merged

@openshift-ci openshift-ci bot added backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Dec 1, 2023
@soltysh
Member

soltysh commented Dec 1, 2023

/label jira/valid-reference

openshift-ci bot commented Dec 1, 2023

@soltysh: The label(s) /label jira/valid-reference cannot be applied. These labels are supported: acknowledge-critical-fixes-only, platform/aws, platform/azure, platform/baremetal, platform/google, platform/libvirt, platform/openstack, ga, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, px-approved, docs-approved, qe-approved, downstream-change-needed, rebase/manual, approved, backport-risk-assessed, backports/unvalidated-commits, backports/validated-commits, bugzilla/invalid-bug, bugzilla/valid-bug, cherry-pick-approved, jira/invalid-bug, jira/valid-bug, staff-eng-approved. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?

In response to this:

/label jira/valid-reference

@soltysh soltysh added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Dec 1, 2023
@openshift-merge-bot openshift-merge-bot bot merged commit 1925489 into openshift:release-4.14 Dec 1, 2023
23 checks passed
@openshift-bot

[ART PR BUILD NOTIFIER]

This PR has been included in build openshift-enterprise-pod-container-v4.14.0-202312010833.p0.g1925489.assembly.stream for distgit openshift-enterprise-pod.
All builds following this will include this PR.

openshift-merge-bot bot added a commit that referenced this pull request Dec 1, 2023
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. backports/validated-commits Indicates that all commits come to merged upstream PRs. cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. vendor-update Touching vendor dir or related files