kubelet: perform the admission checks that preemption will not help first to avoid meaningless pod eviction #116892

Merged
merged 1 commit into from
Oct 19, 2023


SataQiu
Member

@SataQiu SataQiu commented Mar 23, 2023

What type of PR is this?

/kind bug
/kind cleanup

What this PR does / why we need it:

When generalFilter fails, HandleAdmissionFailure attempts to free resources for the critical Pod by triggering Pod eviction. This is a high-cost operation and may be unhelpful. We should perform the resource-unrelated checks first and return early to avoid meaningless Pod eviction.

func (c *CriticalPodAdmissionHandler) HandleAdmissionFailure(admitPod *v1.Pod, failureReasons []lifecycle.PredicateFailureReason) ([]lifecycle.PredicateFailureReason, error) {
	if !kubetypes.IsCriticalPod(admitPod) {
		return failureReasons, nil
	}
	// InsufficientResourceError is not a reason to reject a critical pod.
	// Instead of rejecting, we free up resources to admit it, if no other reasons for rejection exist.
	nonResourceReasons := []lifecycle.PredicateFailureReason{}
	resourceReasons := []*admissionRequirement{}
	for _, reason := range failureReasons {
		if r, ok := reason.(*lifecycle.InsufficientResourceError); ok {
			resourceReasons = append(resourceReasons, &admissionRequirement{
				resourceName: r.ResourceName,
				quantity:     r.GetInsufficientAmount(),
			})
		} else {
			nonResourceReasons = append(nonResourceReasons, reason)
		}
	}
	if len(nonResourceReasons) > 0 {
		// Return only reasons that are not resource related, since critical pods cannot fail admission for resource reasons.
		return nonResourceReasons, nil
	}
	err := c.evictPodsToFreeRequests(admitPod, admissionRequirementList(resourceReasons))
	// if no error is returned, preemption succeeded and the pod is safe to admit.
	return nil, err
}
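
The handler above only skips eviction when it is handed a non-resource failure alongside the resource ones. A minimal, self-contained sketch of that partitioning step follows. It is not the kubelet code: `failureReason`, `insufficientResource`, `otherFailure`, `partition`, `shouldEvict`, and the `os-mismatch` reason string are hypothetical stand-ins used only for illustration.

```go
package main

import "fmt"

// failureReason is a hypothetical stand-in for lifecycle.PredicateFailureReason.
type failureReason interface{ Reason() string }

// insufficientResource is a hypothetical stand-in for a resource shortfall.
type insufficientResource struct {
	resource string
	missing  int64
}

func (e insufficientResource) Reason() string { return "Insufficient " + e.resource }

// otherFailure represents any failure that eviction cannot fix.
type otherFailure struct{ reason string }

func (e otherFailure) Reason() string { return e.reason }

// partition splits failure reasons into resource shortfalls and everything else.
func partition(reasons []failureReason) (resource []insufficientResource, other []failureReason) {
	for _, r := range reasons {
		if ir, ok := r.(insufficientResource); ok {
			resource = append(resource, ir)
		} else {
			other = append(other, r)
		}
	}
	return resource, other
}

// shouldEvict reports whether eviction could help: there must be at least one
// resource shortfall and no failure of any other kind.
func shouldEvict(reasons []failureReason) bool {
	resource, other := partition(reasons)
	return len(other) == 0 && len(resource) > 0
}

func main() {
	reasons := []failureReason{
		insufficientResource{resource: "memory", missing: 512},
		otherFailure{reason: "os-mismatch"},
	}
	fmt.Println(shouldEvict(reasons))     // false: the OS mismatch cannot be fixed by eviction
	fmt.Println(shouldEvict(reasons[:1])) // true: only a memory shortfall remains
}
```

In this sketch the decision depends entirely on which reasons reach the handler, which is why running the non-resource checks before the resource check matters.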

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added the release-note-none, kind/bug, size/M, and kind/cleanup labels on Mar 23, 2023
@k8s-ci-robot added the cncf-cla: yes, area/kubelet, sig/node, do-not-merge/needs-sig, needs-triage, and needs-priority labels and removed the do-not-merge/needs-sig label on Mar 23, 2023
@pacoxu
Member

pacoxu commented Mar 27, 2023

My concern is that it would rarely happen that the pod is rejected for OSSelector or OSField.

  • I am not sure if this is widely used or only used by a few specific customers.

From a logical standpoint, this PR looks good to me. However, it would be helpful if you could provide a specific use case for it. For instance, imagine that a Windows Pod with high resource requirements is requested and labeled as such but without any corresponding Windows node selector or affinity. In this scenario, the pod may end up being scheduled on a Linux node and potentially cause an eviction of other pods on that node - particularly if the evicted pods have lower priority than the new Windows Pod.

If a large resource is requested, the pod should not be scheduled here. Therefore, I am unable to create a complete test case to reproduce this bug.

@SataQiu
Member Author

SataQiu commented Mar 27, 2023

@pacoxu To be honest, it is not a common problem. But how about scenarios where the scheduler is not involved?
For example, a new critical static Pod with high memory requests is deployed over an HTTP endpoint. The kubelet mistakenly thinks that evicting some Pods can get it up and running, but admission finally fails because of an OS label mismatch. It is unhelpful to perform eviction prematurely.
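
The scenario can be sketched as a small simulation that contrasts the two check orders. This is an illustration under simplifying assumptions, not kubelet code: `pod`, `node`, `osMatches`, `ensureMemory`, `admitResourceFirst`, and `admitCheapChecksFirst` are hypothetical names, and eviction is modeled as a counter that always frees enough memory.

```go
package main

import "fmt"

// pod and node are hypothetical, stripped-down stand-ins for the real API types.
type pod struct {
	os        string // required OS, "" if the pod does not care
	memoryReq int64
}

type node struct {
	os         string
	freeMemory int64
	evictions  int // number of times eviction was triggered
}

// osMatches is the cheap check that eviction can never fix.
func osMatches(p pod, n *node) bool { return p.os == "" || p.os == n.os }

// ensureMemory simulates preemption: if memory is short, it "evicts" pods
// and assumes that this always frees enough memory.
func ensureMemory(p pod, n *node) {
	if p.memoryReq > n.freeMemory {
		n.evictions++
		n.freeMemory = p.memoryReq
	}
}

// admitResourceFirst handles the resource shortfall before the OS check.
func admitResourceFirst(p pod, n *node) bool {
	ensureMemory(p, n)
	return osMatches(p, n)
}

// admitCheapChecksFirst runs the OS check first and returns early on failure.
func admitCheapChecksFirst(p pod, n *node) bool {
	if !osMatches(p, n) {
		return false
	}
	ensureMemory(p, n)
	return true
}

func main() {
	p := pod{os: "windows", memoryReq: 500}
	a := &node{os: "linux", freeMemory: 100}
	b := &node{os: "linux", freeMemory: 100}
	fmt.Println(admitResourceFirst(p, a), a.evictions)    // false 1
	fmt.Println(admitCheapChecksFirst(p, b), b.evictions) // false 0
}
```

Both orders reject the Pod, but only the resource-first order pays for an eviction that could not have helped.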

@pacoxu
Member

pacoxu commented Mar 28, 2023

/lgtm
but I still think this would be caused by old-style usage and user misconfiguration (with an OS label but no node selector/node affinity).

@k8s-ci-robot added the lgtm label on Mar 28, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 3e03565994eadcdfb121a980239e2c1b68b0ac54

@SataQiu
Member Author

SataQiu commented Mar 28, 2023

/assign @klueska

@SataQiu SataQiu added this to Needs Approver in SIG Node PR Triage Mar 28, 2023
@SataQiu
Member Author

SataQiu commented Mar 28, 2023

/triage accepted

@k8s-ci-robot added the triage/accepted label and removed the needs-triage label on Mar 28, 2023
@marquiz
Contributor

marquiz commented May 31, 2023

lgtm

ping @klueska could you take a look

@SataQiu
Member Author

SataQiu commented Jul 11, 2023

Hi @klueska
Would you mind taking a look?

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mrunalp, SataQiu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label on Oct 18, 2023
@mrunalp mrunalp moved this from Needs Approver to Done in SIG Node PR Triage Oct 18, 2023
@k8s-ci-robot k8s-ci-robot merged commit 3cb3e8b into kubernetes:master Oct 19, 2023
1 check passed
@k8s-ci-robot k8s-ci-robot added this to the v1.29 milestone Oct 19, 2023