Prevent scheduler crashing in default preemption plugin #101560

Merged
merged 1 commit into from
May 5, 2021

Conversation

yuanchen8911
Member

@yuanchen8911 yuanchen8911 commented Apr 28, 2021

What type of PR is this?

/kind bug

What this PR does / why we need it:

FindCandidates or CallExtenders in the defaultpreemption plugin can return victims without pods (e.g., because of issues in Filter plugins or extenders). As a result, the scheduler panics and crashes when it accesses a non-existent pod. This PR adds checks to prevent the crash.
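
A minimal sketch of the failure mode described above, not the plugin's actual code. `Pod` and `Victims` are simplified, hypothetical stand-ins for `v1.Pod` and `extenderv1.Victims`, and `firstVictimName` is an illustrative helper. Indexing `Pods[0]` on a victims value with no pods is what panics; a guard turns that into an ordinary error.

```go
package main

import "fmt"

// Pod and Victims are hypothetical, simplified stand-ins for v1.Pod and
// extenderv1.Victims; only the fields needed for the illustration are kept.
type Pod struct{ Name string }

type Victims struct {
	Pods []*Pod
}

// firstVictimName returns the name of the first victim pod, or an error when
// the victims value is nil or carries no pods. Without the guard, v.Pods[0]
// would panic with an index-out-of-range error.
func firstVictimName(v *Victims) (string, error) {
	if v == nil || len(v.Pods) == 0 {
		return "", fmt.Errorf("expected at least one victim pod")
	}
	return v.Pods[0].Name, nil
}

func main() {
	// Empty victims: an error is returned instead of a panic.
	_, err := firstVictimName(&Victims{})
	fmt.Println(err)

	// Non-empty victims: the first pod's name is returned.
	name, _ := firstVictimName(&Victims{Pods: []*Pod{{Name: "p1"}}})
	fmt.Println(name)
}
```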

Which issue(s) this PR fixes:

Fixes #101548

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 28, 2021
@k8s-ci-robot
Contributor

@yuanchen8911: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Apr 28, 2021
@k8s-ci-robot k8s-ci-robot added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 28, 2021
@yuanchen8911
Member Author

/cc @Huang-Wei @ahg-g

@yuanchen8911
Member Author

/sig scheduling

@yuanchen8911 yuanchen8911 changed the title Add additonal checks to prevent scheduler crashing because of bugs in… Add checks to prevent scheduler crashing caused by custom plugins Apr 28, 2021
Member

@Huang-Wei Huang-Wei left a comment

As mentioned in #101548 (comment), I'd like to understand the root cause, and then come up with all defensive checking at once in this PR.

Comment on lines 563 to 566
var (
	victims         *extenderv1.Victims
	found           bool
	latestStartTime *metav1.Time
)
if victims, found = nodesToVictims[minNodes2[0]]; found {
	latestStartTime = util.GetEarliestPodStartTime(victims)
}
Member

This is neater:

var latestStartTime *metav1.Time
if victims, found := nodesToVictims[minNodes2[0]]; found {
	latestStartTime = util.GetEarliestPodStartTime(victims)
}

Member Author

fixed it.

@yuanchen8911
Member Author

/retest

@yuanchen8911
Member Author

Note that some test cases return candidates with empty victims, so the simple change if fits && len(pods) == 0 in dryRunPreemption makes those tests fail:

--- FAIL: TestDryRunPreemption (0.11s)
    --- FAIL: TestDryRunPreemption/a_pod_that_fits_with_no_preemption (0.00s)
        default_preemption_test.go:1059: cycle 0: unexpected candidates (-want, +got):   []defaultpreemption.Candidate{
            - 	&defaultpreemption.candidate{victims: &v1.Victims{}, name: "node1"},
            - 	&defaultpreemption.candidate{victims: &v1.Victims{}, name: "node2"},
              }
    --- FAIL: TestDryRunPreemption/a_pod_that_fits_on_one_node_with_no_preemption (0.00s)
        default_preemption_test.go:1059: cycle 0: unexpected candidates (-want, +got):   []defaultpreemption.Candidate{
            - 	&defaultpreemption.candidate{victims: &v1.Victims{}, name: "node1"},
              }

@yuanchen8911
Member Author

/retest

@yuanchen8911 yuanchen8911 changed the title Add checks to prevent scheduler crashing caused by custom plugins Add checks to prevent scheduler crashing in defaultpreemption plugin May 2, 2021
@yuanchen8911
Member Author

@Huang-Wei updated the description and PR. PTAL!

Comment on lines 339 to 345
var victims extenderv1.Victims
// do not create victims without pods
if len(pods) != 0 {
	victims = extenderv1.Victims{
		Pods:             pods,
		NumPDBViolations: int64(numPDBViolations),
	}
Member

We should treat len(pods) == 0 as an internal error, right? Why not just do if status.IsSuccess() && len(pods) != 0 in L338?

Member Author

@yuanchen8911 yuanchen8911 May 3, 2021

That was my original version, but the unit tests failed because some test cases return empty victims with fits == true. Please see my comment: #101560 (comment). Shall I update the unit test file instead?

Member

yes, please update those tests.

Member Author

should status.IsSuccess() && len(pods) == 0 be considered an error or just ignored?

Member

let's craft it as an internal error.

@@ -404,8 +408,14 @@ func CallExtenders(extenders []framework.Extender, pod *v1.Pod, nodeLister frame

var newCandidates []Candidate
for nodeName := range victimsMap {
// check if victims.Pods is empty
victims := victimsMap[nodeName]
if len(victims.Pods) == 0 {
Member

I'd prefer to run the check around L391: if an extender returns illegal result and extender.IsIgnorable() == false, return it as an error.

Member Author

> I'd prefer to run the check around L391: if an extender returns illegal result and extender.IsIgnorable() == false, return it as an error.

Check the validity of nodeNameToVictims returned by ProcessPreemption before L399?

for nodeName, victims := range nodeNameToVictims {
	// check if victims.Pods is empty
	if len(victims.Pods) == 0 {
		delete(nodeNameToVictims, nodeName)
	}
}

Member

before L399, like this:

		for nodeName, victims := range nodeNameToVictims {
			if victims != nil && len(victims.Pods) != 0 {
				continue
			}
			if extender.IsIgnorable() {
				delete(nodeNameToVictims, nodeName)
				// klog....
				continue
			} else {
				return nil, framework.AsStatus(/* build an internal err */)
			}
		}
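
The loop suggested above can be sketched as a self-contained function. This is only an illustration under assumptions: `Pod` and `Victims` are hypothetical stand-ins for the real API types, `validateVictims` is an invented name, and the `ignorable` flag stands in for `extender.IsIgnorable()`. Logging is omitted.

```go
package main

import "fmt"

// Pod and Victims are hypothetical stand-ins for the real API types.
type Pod struct{ Name string }

type Victims struct{ Pods []*Pod }

// validateVictims checks every entry returned by an extender. Entries that
// are nil or carry no pods are dropped when the extender is ignorable and
// reported as an error otherwise.
func validateVictims(nodeNameToVictims map[string]*Victims, ignorable bool) error {
	for nodeName, victims := range nodeNameToVictims {
		if victims != nil && len(victims.Pods) != 0 {
			continue // valid entry
		}
		if ignorable {
			// Deleting the current key while ranging over a map is safe in Go.
			delete(nodeNameToVictims, nodeName)
			continue
		}
		return fmt.Errorf("expected at least one victim pod on node %q", nodeName)
	}
	return nil
}

func main() {
	m := map[string]*Victims{
		"node1": {Pods: []*Pod{{Name: "p1"}}},
		"node2": {}, // no pods
		"node3": nil,
	}
	// Ignorable extender: invalid entries are removed, no error.
	fmt.Println(validateVictims(m, true), len(m))

	// Non-ignorable extender: the first invalid entry is an error.
	fmt.Println(validateVictims(map[string]*Victims{"node2": {}}, false))
}
```

Returning an error for non-ignorable extenders keeps a misbehaving extender visible instead of silently shrinking the candidate list.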

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 4, 2021
@yuanchen8911
Member Author

Updated the PR as suggested, please review it. @Huang-Wei Thanks!

Member

@Huang-Wei Huang-Wei left a comment

Some nits. LGTM otherwise.

return
}
if status.IsSuccess() && len(pods) == 0 {
status = framework.NewStatus(framework.Unschedulable, fmt.Sprintf("invalid victims on node %q returned by selectVictimsOnNode", nodeInfoCopy.Node().Name))
Member

Suggested change
- status = framework.NewStatus(framework.Unschedulable, fmt.Sprintf("invalid victims on node %q returned by selectVictimsOnNode", nodeInfoCopy.Node().Name))
+ status = framework.AsStatus(fmt.Errorf("invalid victims on node %q returned by selectVictimsOnNode", nodeInfoCopy.Node().Name))
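
To illustrate the distinction behind this suggestion, here is a rough sketch using simplified, hypothetical types (`Code`, `Status`, `asStatus`, `checkVictims`); they only imitate the shape of the scheduler framework's status handling and are not its real API. The point is that "the pod fits but no victims were selected" is reported as an internal error rather than as an ordinary unschedulable result.

```go
package main

import "fmt"

// Code and Status are hypothetical, simplified stand-ins for the scheduler
// framework's status types.
type Code int

const (
	Success Code = iota
	Error
	Unschedulable
)

type Status struct {
	code Code
	err  error
}

// asStatus wraps an error as an internal-error status.
func asStatus(err error) *Status { return &Status{code: Error, err: err} }

// checkVictims classifies a dry-run result for one node. A pod that does not
// fit is unschedulable; a pod that fits with zero victims is treated as an
// internal error, since preemption should only run when victims are needed.
func checkVictims(fits bool, numVictims int, node string) *Status {
	if !fits {
		return &Status{code: Unschedulable}
	}
	if numVictims == 0 {
		return asStatus(fmt.Errorf("expected at least one victim pod on node %q", node))
	}
	return &Status{code: Success}
}

func main() {
	fmt.Println(checkVictims(false, 0, "node1").code == Unschedulable)
	fmt.Println(checkVictims(true, 0, "node1").err)
	fmt.Println(checkVictims(true, 2, "node1").code == Success)
}
```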

@@ -391,6 +394,18 @@ func CallExtenders(extenders []framework.Extender, pod *v1.Pod, nodeLister frame
}
return nil, framework.AsStatus(err)
}
// Check if the returned victims are valid.
for nodeName, victims := range nodeNameToVictims {
if victims != nil && len(victims.Pods) == 0 {
Member

Prefer if victims == nil || len(victims.Pods) == 0 so that we can catch nil victims.
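
A small sketch of why the two conditions differ, using hypothetical stand-in types and helper names (`looseInvalid`, `strictInvalid`): with `!= nil && len(...) == 0`, a nil entry is not flagged as invalid and would flow on to later code, whereas `== nil || len(...) == 0` flags both cases.

```go
package main

import "fmt"

// Pod and Victims are hypothetical stand-ins for the real API types.
type Pod struct{ Name string }

type Victims struct{ Pods []*Pod }

// looseInvalid mirrors the condition under review: a nil entry is NOT flagged.
func looseInvalid(v *Victims) bool { return v != nil && len(v.Pods) == 0 }

// strictInvalid mirrors the suggested condition: nil and empty are both
// flagged. Short-circuit evaluation keeps v.Pods from being read on nil.
func strictInvalid(v *Victims) bool { return v == nil || len(v.Pods) == 0 }

func main() {
	var missing *Victims
	fmt.Println(looseInvalid(missing), strictInvalid(missing)) // false true

	empty := &Victims{}
	fmt.Println(looseInvalid(empty), strictInvalid(empty)) // true true
}
```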

@yuanchen8911 yuanchen8911 changed the title Add checks to prevent scheduler crashing in defaultpreemption plugin Prevent scheduler crashing in default preemption plugin May 4, 2021
Comment on lines +1259 to +1251
if s == nil || len(s.Name()) == 0 {
return
}
Member

You can remove the first sub-test "No node needs preemption".

Member

Bump.

Member Author

deleted the test case.

@yuanchen8911
Member Author

@ahg-g could you review it when you get a chance? Thanks.

@yuanchen8911 yuanchen8911 force-pushed the master branch 2 times, most recently from 84086b9 to 9c22e00 Compare May 4, 2021 19:10
return
}
if status.IsSuccess() && len(pods) == 0 {
status = framework.AsStatus(fmt.Errorf("invalid victims on node %q returned by selectVictimsOnNode", nodeInfoCopy.Node().Name))

Should the error message be more explicit, e.g. "should not happen: the pod fits with no victim pods"? Same below.

Member Author

Does the error status (set in L358) indicate "it should not happen"?

@yuanchen8911
Member Author

/retest

@Huang-Wei
Member

/approve
leaving /lgtm to @ahg-g

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Huang-Wei, yuanchen8911

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 4, 2021
if victims == nil || len(victims.Pods) == 0 {
if extender.IsIgnorable() {
delete(nodeNameToVictims, nodeName)
klog.Warningf("Ignoring victims without pods on node %q returned by extender", nodeName)
Member

warnings aren't recommended anymore, either error or info. In this case info probably fits better.

Member Author

> warnings aren't recommended anymore, either error or info. In this case info probably fits better.

Changed it to InfoS.

return
}
if status.IsSuccess() && len(pods) == 0 {
status = framework.AsStatus(fmt.Errorf("invalid victims on node %q returned by selectVictimsOnNode", nodeInfoCopy.Node().Name))
Member

right, be more explicit, like "expected at least one victim on node %q"

Member Author

> right, be more explicit, like "expected at least one victim on node %q"

Updated the error msg. as suggested.

@Huang-Wei
Member

/release-note-none

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels May 5, 2021
@ahg-g
Member

ahg-g commented May 5, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 5, 2021
@k8s-ci-robot k8s-ci-robot merged commit aab19d4 into kubernetes:master May 5, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.22 milestone May 5, 2021
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

kube-scheduler crashes and restarts with panics in DefaultPreemption plugin
5 participants