feature(scheduler): won't run Filter if PreFilter returned a Skip status #114125

sanposhiho · 2022-11-25T05:11:57Z

What type of PR is this?

/kind feature
/sig scheduling

What this PR does / why we need it:

Change the framework so that it doesn't run a plugin's Filter() if its PreFilter() returned a Skip status.
For example, NodeAffinity can return Skip in PreFilter if the pod doesn't specify any node selector or affinity.
This Skip status is basically regarded as "Success"; if you need to distinguish Skip from Success, use the IsSkip func, not the IsSuccess func.
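
To illustrate the behavior described above, here is a minimal, self-contained sketch. It does not use the real scheduler framework: `Plugin`, `runPreFilter`, `runFilter`, and `demoPlugins` are hypothetical names invented for this example, and the real framework tracks more state than a plain map.

```go
package main

import "fmt"

// Code is a simplified stand-in for a scheduler status code.
type Code int

const (
	Success Code = iota
	Unschedulable
	Skip
)

// Plugin is a hypothetical plugin with coupled PreFilter and Filter steps.
type Plugin struct {
	Name      string
	PreFilter func(pod string) Code
	Filter    func(pod, node string) Code
}

// runPreFilter runs every PreFilter and records which plugins returned Skip.
// Any code other than Success or Skip aborts the cycle.
func runPreFilter(plugins []Plugin, pod string) (map[string]bool, Code) {
	skipped := map[string]bool{}
	for _, pl := range plugins {
		switch c := pl.PreFilter(pod); c {
		case Success:
			// Nothing to record; Filter will run for this plugin.
		case Skip:
			skipped[pl.Name] = true
		default:
			return nil, c
		}
	}
	return skipped, Success
}

// runFilter runs Filter only for plugins that did not return Skip in
// PreFilter. It returns the first non-Success code and the plugins it ran.
func runFilter(plugins []Plugin, skipped map[string]bool, pod, node string) (Code, []string) {
	var ran []string
	for _, pl := range plugins {
		if skipped[pl.Name] {
			continue
		}
		ran = append(ran, pl.Name)
		if c := pl.Filter(pod, node); c != Success {
			return c, ran
		}
	}
	return Success, ran
}

// demoPlugins returns one plugin that skips and one that always runs.
// The skipping plugin's Filter would reject the node if it were called.
func demoPlugins() []Plugin {
	return []Plugin{
		{
			Name:      "AlwaysSkip",
			PreFilter: func(string) Code { return Skip },
			Filter:    func(string, string) Code { return Unschedulable },
		},
		{
			Name:      "AlwaysRun",
			PreFilter: func(string) Code { return Success },
			Filter:    func(string, string) Code { return Success },
		},
	}
}

func main() {
	plugins := demoPlugins()
	skipped, _ := runPreFilter(plugins, "pod-a")
	code, ran := runFilter(plugins, skipped, "pod-a", "node1")
	fmt.Println(code == Success, ran)
}
```

In this sketch a Skip from one plugin never prevents the other plugins from running, which is the property the scheduler-level test in this PR is meant to protect.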

#112637 got reverted because of the bug. 🙏
This PR also changes the NodeAffinity PreFilter to return Skip so that we can ensure the same bug doesn't exist in this PR anymore. It's just a starting point; once this PR gets merged, I'll work on changing other PreFilter plugins to return Skip appropriately.

This PR is composed of three commits:

The first one: the same commit as "feature(scheduler): won't run Filter if PreFilter returned a Skip status" #112637.
The second one: the patch for the bug + unit test.
The third one: support Skip in NodeAffinity.

Which issue(s) this PR fixes:

Part of #107556
Part of #110643

Special notes for your reviewer:

Does this PR introduce a user-facing change?

The scheduler doesn't run a plugin's Filter method when its PreFilter method returned a Skip status.
In other words, your PreFilter/Filter plugin can return a Skip status in PreFilter if the plugin does nothing in Filter for that Pod.
The scheduler skips the NodeAffinity Filter plugin when that plugin has nothing to do for a Pod.
This may affect some metric values related to the NodeAffinity Filter plugin.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

k8s-ci-robot · 2022-11-25T05:12:00Z

Please note that we're already in Test Freeze for the release-1.26 branch. This means every merged PR will be automatically fast-forwarded via the periodic ci-fast-forward job to the release branch of the upcoming v1.26.0 release.

Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Fri Nov 25 03:33:55 UTC 2022.

k8s-ci-robot · 2022-11-25T05:12:04Z

@sanposhiho: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

sanposhiho · 2022-11-25T05:37:58Z

/assign @alculquicondor
🙏

sanposhiho · 2022-11-25T05:45:25Z

/retest

alculquicondor · 2022-11-25T16:08:19Z

pkg/scheduler/schedule_one_test.go

+				st.RegisterPreFilterPlugin(
+					"FakePreFilter1",
+					st.NewFakePreFilterPlugin("FakeFilter1", nil, nil),
+				),
+				st.RegisterFilterPlugin(
+					"FakeFilter1",
+					st.NewFakeFilterPlugin(map[string]framework.Code{
+						"node1": framework.Unschedulable,
+					}),
+				),


do we need these?

Here, I want to make sure that all other PreFilter or Filter plugins are executed even if some plugins return skip in PreFilter and some plugins are skipped in Filter.

I'm fine leaving those tests just to the runtime package, but up to you.

I'd like to keep it because if this test had existed in the past PR (the reverted one), we actually wouldn't have encountered the bug.
The runtime package's tests can only confirm the behavior of a specific extension point (either Filter or PreFilter only); they cannot confirm what happens when the scheduler actually runs from PreFilter through Filter.

alculquicondor · 2022-11-25T16:16:10Z

pkg/scheduler/framework/plugins/nodeaffinity/node_affinity.go

@@ -94,8 +94,15 @@ func (pl *NodeAffinity) PreFilter(ctx context.Context, cycleState *framework.Cyc
 	affinity := pod.Spec.Affinity
 	if affinity == nil ||
 		affinity.NodeAffinity == nil ||
-		affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution == nil ||
-		len(affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms) == 0 {


shouldn't we preserve this line here?

So that if there is no RequiredDuringSchedulingIgnoredDuringExecution and no addedNodeSelector and no .Spec.NodeSelector, we also return Skip.

You mean we can add len(affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms) == 0 here, right? I think that's true. Fixed

Looking at an existing test... it looks like my assumption was wrong. Empty NodeSelectorTerms for a non-nil RequiredDuringScheduling means no match?

Ohh, that's true.
Btw, I checked the validation logic, and Pods with a nil []NodeSelectorTerm should be rejected actually.
https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/core/validation/validation.go#L4012-L4014

Anyway, even given that, we probably should keep the original behavior.

ahh, good find. It should be fine to remove the test case. Up to you.
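
The distinction discussed in this thread (a nil required term lets the plugin skip, while a non-nil one with empty NodeSelectorTerms must still be filtered because it matches nothing) can be sketched as follows. The types below are simplified, hypothetical stand-ins for the v1 API types, and `shouldSkip` and its `hasAddedSelector` parameter are names made up for this example rather than the plugin's actual code.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the v1 API types.
type NodeSelectorTerm struct{ Key, Value string }

type NodeSelector struct{ NodeSelectorTerms []NodeSelectorTerm }

// Required stands in for RequiredDuringSchedulingIgnoredDuringExecution.
type NodeAffinity struct{ Required *NodeSelector }

type Affinity struct{ NodeAffinity *NodeAffinity }

type PodSpec struct {
	NodeSelector map[string]string
	Affinity     *Affinity
}

// shouldSkip reports whether a NodeAffinity-like plugin has nothing to do in
// Filter for this pod. hasAddedSelector models a selector added through the
// plugin's configuration (an assumption of this sketch).
func shouldSkip(spec PodSpec, hasAddedSelector bool) bool {
	if hasAddedSelector || len(spec.NodeSelector) > 0 {
		return false
	}
	a := spec.Affinity
	// Only a nil required term means "no requirement". A non-nil term with
	// an empty NodeSelectorTerms list is kept, because it matches no node.
	return a == nil || a.NodeAffinity == nil || a.NodeAffinity.Required == nil
}

func main() {
	empty := PodSpec{}
	withSelector := PodSpec{NodeSelector: map[string]string{"foo": "bar"}}
	emptyTerms := PodSpec{Affinity: &Affinity{NodeAffinity: &NodeAffinity{Required: &NodeSelector{}}}}

	fmt.Println(shouldSkip(empty, false))
	fmt.Println(shouldSkip(withSelector, false))
	fmt.Println(shouldSkip(emptyTerms, false))
}
```

Keeping the empty-terms case out of the skip condition preserves the original behavior that the reviewers agreed on above.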

alculquicondor · 2022-11-25T16:18:52Z

pkg/scheduler/framework/plugins/nodeaffinity/node_affinity_test.go

 		},
 		{
 			name: "missing labels",
 			pod: st.MakePod().NodeSelector(map[string]string{
 				"foo": "bar",
 			}).Obj(),
-			wantStatus: framework.NewStatus(framework.UnschedulableAndUnresolvable, ErrReasonPod),
+			wantStatus:       framework.NewStatus(framework.UnschedulableAndUnresolvable, ErrReasonPod),
+			disablePreFilter: true,


why? Same question for the rest of the cases.

Sorry, we don't need it. Removed all disablePreFilter.

alculquicondor · 2022-11-25T16:21:03Z

pkg/scheduler/framework/plugins/nodeaffinity/node_affinity_test.go

@@ -45,15 +45,18 @@ func TestNodeAffinity(t *testing.T) {
 		disablePreFilter    bool
 	}{
 		{
-			name: "no selector",
-			pod:  &v1.Pod{},
+			name:                "no selector and affinity",


how is this different from test case "Pod with no Affinity will schedule onto a node" below?

Seems duplicated. Removed it 🙏

Remove the new case, instead of the old one. It's better for the git history.

kerthcet · 2022-12-05T11:06:53Z

pkg/scheduler/framework/interface.go

 func (s *Status) IsSuccess() bool {
-	return s.Code() == Success
+	return s.Code() == Success || s.Code() == Skip


Is this reasonable? IsSuccess generally means we'll then take the happy path, but IsSkip means we'll skip the phase.

This method is generally used like if !s.IsSuccess(), in which it makes sense. Maybe we should change the name? But I think Skip is not too different from Success.

I feel the same as @alculquicondor. Skip is also kinda happy path.
I'm not against renaming, but is there any good name for the func? No good idea from my poor English vocab. 😓

We depend not only on !s.IsSuccess() but also on s.IsSuccess(); see here (and other places):

kubernetes/pkg/scheduler/schedule_one.go

Lines 514 to 521 in 17bf864

	if status.IsSuccess() {
		length := atomic.AddInt32(&feasibleNodesLen, 1)
		if length > numNodesToFind {
			cancel()
			atomic.AddInt32(&feasibleNodesLen, -1)
		} else {
			feasibleNodes[length-1] = nodeInfo.Node()
		}

This is not for PreFilter, but we have now changed the underlying meaning of Success. If the code is Skip, I don't think we should proceed with this. Maybe we should add another method like IsPassed(), meaning that we can continue with the logic that follows on either Success or Skip.

Keeping IsSuccess as it is (= only Success) and creating another func like IsPassed (= Success and Skip) makes sense to me.

In pkg/scheduler/schedule_one.go, we currently need to use IsSkip to distinguish Skip from IsSuccess (= Success and Skip), and a new IsSuccess (= only Success) would be useful there.
Then we can just change all the other places that use IsSuccess to use IsPassed.

4fec5285ed6. The diff is much bigger now, but I believe I replaced every IsSuccess with IsPassed (other than the ones in pkg/scheduler/schedule_one.go that were used like IsSuccess() && !IsSkip()).
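
The split proposed in this thread can be sketched as below. This is a self-contained illustration, not the framework's code: the `Status` type here is a simplified stand-in, and `IsPassed` is only the name suggested in the discussion, so treat it as hypothetical rather than as a statement about what the framework exports.

```go
package main

import "fmt"

// Code is a simplified stand-in for a scheduler status code.
type Code int

const (
	Success Code = iota
	Error
	Unschedulable
	Skip
)

// Status is a simplified stand-in for the framework's status type.
type Status struct{ code Code }

// Code treats a nil Status as Success.
func (s *Status) Code() Code {
	if s == nil {
		return Success
	}
	return s.code
}

// IsSuccess is true only for Success, so callers that count feasible nodes
// are not affected by Skip.
func (s *Status) IsSuccess() bool { return s.Code() == Success }

// IsSkip is true only for Skip.
func (s *Status) IsSkip() bool { return s.Code() == Skip }

// IsPassed (hypothetical name from the thread) is true when the caller may
// continue with the logic that follows: either Success or Skip.
func (s *Status) IsPassed() bool { return s.IsSuccess() || s.IsSkip() }

func main() {
	statuses := map[string]*Status{
		"nil":           nil,
		"Skip":          {code: Skip},
		"Unschedulable": {code: Unschedulable},
	}
	for _, name := range []string{"nil", "Skip", "Unschedulable"} {
		s := statuses[name]
		fmt.Printf("%s: IsSuccess=%v IsSkip=%v IsPassed=%v\n",
			name, s.IsSuccess(), s.IsSkip(), s.IsPassed())
	}
}
```

With this shape, error-handling call sites of the form `if !s.IsPassed()` keep treating Skip as a non-failure, while call sites that must act only on a real Success can keep using `IsSuccess`.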

sanposhiho · 2022-12-06T02:34:57Z

/retest

kerthcet · 2022-12-06T02:35:20Z

pkg/scheduler/framework/runtime/framework.go

@@ -617,6 +620,9 @@ func (f *frameworkImpl) RunPreFilterPlugins(ctx context.Context, state *framewor
 			}
 			return nil, framework.AsStatus(fmt.Errorf("running PreFilter plugin %q: %w", pl.Name(), s.AsError())).WithFailedPlugin(pl.Name())
 		}
+		if s.IsSkip() {


Currently, if status is not success, how will it be skipped? I think we missed a test for this.

I may not correctly understand what you mean, but if a returned status isn't a success, then it doesn't reach here.

Please disregard this; I misread it, sorry.

kerthcet · 2022-12-06T03:05:37Z

pkg/scheduler/framework/interface.go

 func (s *Status) IsSuccess() bool {
-	return s.Code() == Success
+	return s.Code() == Success || s.Code() == Skip


We depend not only on !s.IsSuccess() but also on s.IsSuccess(); see here (and other places):

kubernetes/pkg/scheduler/schedule_one.go

Lines 514 to 521 in 17bf864

	if status.IsSuccess() {
		length := atomic.AddInt32(&feasibleNodesLen, 1)
		if length > numNodesToFind {
			cancel()
			atomic.AddInt32(&feasibleNodesLen, -1)
		} else {
			feasibleNodes[length-1] = nodeInfo.Node()
		}

This is not for PreFilter, but we have now changed the underlying meaning of Success. If the code is Skip, I don't think we should proceed with this. Maybe we should add another method like IsPassed(), meaning that we can continue with the logic that follows on either Success or Skip.

k8s-ci-robot · 2023-01-05T16:58:09Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, sanposhiho

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~pkg/scheduler/OWNERS~~ [alculquicondor]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sanposhiho · 2023-01-05T17:10:57Z

Squashed. Thanks @alculquicondor @kerthcet for the long-time effort in reviewing this.

sanposhiho · 2023-01-05T17:49:21Z

pkg/scheduler/framework/runtime/framework_test.go

+		state := framework.NewCycleState()
+
+		f.RunPreFilterPlugins(ctx, state, nil)
+		f.RunPreFilterExtensionAddPod(ctx, state, nil, nil, nil)
+		f.RunPreFilterExtensionRemovePod(ctx, state, nil, nil, nil)


Just added these changes to prevent nil-pointer panic in RunPreFilterExtensionAddPod and RunPreFilterExtensionRemovePod. 🙏

alculquicondor

/lgtm
Thanks!

k8s-ci-robot · 2023-01-05T17:57:11Z

LGTM label has been added.

Git tree hash: 0af973e0dd5d182a572893ef6ccdd1f6226dfe4c

k8s-triage-robot · 2023-01-05T20:59:34Z

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

The PR does not have any do-not-merge/* labels
The PR does not have the needs-ok-to-test label
The PR is mergeable (does not have a needs-rebase label)
The PR is approved (has cncf-cla: yes, lgtm, approved labels)
The PR is failing tests required for merge

You can:

Review the full test history for this PR
Prevent this bot from retesting with /lgtm cancel or /hold
Help make our tests less flaky by following our Flaky Tests Guide

/retest

k8s-triage-robot · 2023-01-06T00:07:34Z

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

The PR does not have any do-not-merge/* labels
The PR does not have the needs-ok-to-test label
The PR is mergeable (does not have a needs-rebase label)
The PR is approved (has cncf-cla: yes, lgtm, approved labels)
The PR is failing tests required for merge

You can:

Review the full test history for this PR
Prevent this bot from retesting with /lgtm cancel or /hold
Help make our tests less flaky by following our Flaky Tests Guide

/retest

k8s-triage-robot · 2023-01-06T03:16:34Z

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

The PR does not have any do-not-merge/* labels
The PR does not have the needs-ok-to-test label
The PR is mergeable (does not have a needs-rebase label)
The PR is approved (has cncf-cla: yes, lgtm, approved labels)
The PR is failing tests required for merge

You can:

Review the full test history for this PR
Prevent this bot from retesting with /lgtm cancel or /hold
Help make our tests less flaky by following our Flaky Tests Guide

/retest

pacoxu · 2023-01-06T03:34:37Z

pkg/scheduler/framework/runtime/framework_test.go

+		name                string
+		plugins             []*TestPlugin
+		wantPreFilterResult *framework.PreFilterResult
+		wantSkippedPlugins  sets.Set[string]
+		wantStatusCode      framework.Code


/hold
for go format CI failure
feel free to unhold once the CI is green.

sanposhiho · 2023-01-06T07:40:30Z

Oops, sorry for that. Formatted. 🙏

sanposhiho · 2023-01-06T08:15:22Z

/retest

#110697

sanposhiho · 2023-01-06T08:47:52Z

/retest

#114438

kerthcet · 2023-01-06T09:11:15Z

/lgtm

k8s-ci-robot · 2023-01-06T09:11:21Z

LGTM label has been added.

Git tree hash: fa18af83341c08a7817aff55c709c5f3378289f7

pacoxu · 2023-01-06T09:12:18Z

/unhold

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 25, 2022

k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 25, 2022

k8s-ci-robot requested review from chendave and denkensk November 25, 2022 05:12

k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Nov 25, 2022

k8s-ci-robot assigned alculquicondor Nov 25, 2022

sanposhiho force-pushed the skip-reimplementation branch from 5c0fba5 to a54e6ad Compare November 25, 2022 13:22

alculquicondor reviewed Nov 25, 2022

kerthcet mentioned this pull request Dec 5, 2022

Optimize NodeVolumeLimits plugin #114143

Closed

kerthcet reviewed Dec 5, 2022

kerthcet reviewed Dec 6, 2022

alculquicondor mentioned this pull request Dec 8, 2022

pass status to handleSchedulingFailure #114082

Merged

This was referenced Dec 10, 2022

Add NodeName PreFilter to improve the scheduling efficiency of pod with NodeName #114386

Closed

[Umbrella] Skip Filter plugins when coupled PreFilter plugin returns Skip status #114399

Closed

sivchari mentioned this pull request Dec 11, 2022

Add TaintToleration PreFilter to improve the scheduling efficiency of pod #114405

Closed

sanposhiho force-pushed the skip-reimplementation branch from dacfe90 to fa6cb72 Compare January 5, 2023 17:10

sanposhiho force-pushed the skip-reimplementation branch 2 times, most recently from dd138f5 to 6791547 Compare January 5, 2023 17:48

sanposhiho commented Jan 5, 2023

alculquicondor reviewed Jan 5, 2023

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 5, 2023

pacoxu reviewed Jan 6, 2023

k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Jan 6, 2023

k8s-ci-robot requested a review from alculquicondor January 6, 2023 07:37

feature(scheduler): won't run Filter if PreFilter returned a Skip status

e5b5367

sanposhiho force-pushed the skip-reimplementation branch from 7189322 to e5b5367 Compare January 6, 2023 07:38

k8s-ci-robot assigned kerthcet Jan 6, 2023

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 6, 2023

k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 6, 2023

k8s-ci-robot merged commit c549b59 into kubernetes:master Jan 6, 2023

k8s-ci-robot added this to the v1.27 milestone Jan 6, 2023

alculquicondor mentioned this pull request Feb 13, 2023

feature(scheduler): won't run Score if PreScore returned a Skip status #115652

Merged
