Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature(scheduler): won't run Filter if PreFilter returned a Skip status #114125

Merged
merged 1 commit into from Jan 6, 2023

Conversation

sanposhiho
Copy link
Member

What type of PR is this?

/kind feature
/sig scheduling

What this PR does / why we need it:

Change the framework so that it doesn't run plugins's Filter() if its PreFilter() returned a Skip status.
For example, nodeAffinity can return Skip in PreFilter if the pod doesn't specify any node selector or affinity.
This skip status is basically regarded as "Success" and you needs to distinguish Skip from Success by IsSkip func, not IsSuccess func if you need to refer to Skip status.

#112637 got reverted because of the bug. 🙏
This PR also changes NodeAffinity PreFilter to return Skip so that we can ensure the same bug doesn't exist in this PR anymore. It's just a starting point, once this PR gets merged, I'll work on changing other PreFilter plugins to return Skip appropriately.


This PR is composed of three commits:

Which issue(s) this PR fixes:

Part of #107556
Part of #110643

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Scheduler doesn't run plugin's Filter method when its PreFilter method returned a Skip status.
In other words, your PreFilter/Filter plugin can return a Skip status in PreFilter if the plugin does nothing in Filter for that Pod.
Scheduler skips NodeAffinity Filter plugin when NodeAffinity Filter plugin has nothing to do with a Pod.
It may affect some metrics values related to the NodeAffinity Filter plugin.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. labels Nov 25, 2022
@k8s-ci-robot
Copy link
Contributor

Please note that we're already in Test Freeze for the release-1.26 branch. This means every merged PR will be automatically fast-forwarded via the periodic ci-fast-forward job to the release branch of the upcoming v1.26.0 release.

Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Fri Nov 25 03:33:55 UTC 2022.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 25, 2022
@k8s-ci-robot
Copy link
Contributor

@sanposhiho: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 25, 2022
@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Nov 25, 2022
@sanposhiho
Copy link
Member Author

/assign @alculquicondor
🙏

@sanposhiho
Copy link
Member Author

/retest

Comment on lines +1986 to +1996
st.RegisterPreFilterPlugin(
"FakePreFilter1",
st.NewFakePreFilterPlugin("FakeFilter1", nil, nil),
),
st.RegisterFilterPlugin(
"FakeFilter1",
st.NewFakeFilterPlugin(map[string]framework.Code{
"node1": framework.Unschedulable,
}),
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need these?

Copy link
Member Author

@sanposhiho sanposhiho Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, I want to make sure that all other PreFilter or Filter plugins are executed even if some plugins return skip in PreFilter and some plugins are skipped in Filter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine leaving that tests just to the runtime package, but up to you.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep it because if this test have existed in the past PR (reverted one), actually we wouldn't have encountered the bug. 
The runtime package's test can only confirm the behavior of specific extension point (Only either Filter or PreFilter) and it cannot confirm what if the scheduler actually run from PreFilter to Filter.

@@ -94,8 +94,15 @@ func (pl *NodeAffinity) PreFilter(ctx context.Context, cycleState *framework.Cyc
affinity := pod.Spec.Affinity
if affinity == nil ||
affinity.NodeAffinity == nil ||
affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution == nil ||
len(affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms) == 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we preserve this line here?

So that if there is no RequiredDuringSchedulingIgnoredDuringExecution and no addedNodeSelector and no .Spec.NodeSelector, we also return Skip.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean we can add len(affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms) == 0 here, right? I think that's true. Fixed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at an existing test... it looks like my assumption was wrong. Empty NodeSelectorTerms for a non-nil RequiredDuringScheduling means no match?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh, that's true.
Btw, I checked the validation logic, and Pods with a nil []NodeSelectorTerm should be rejected actually.
https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/core/validation/validation.go#L4012-L4014

Anyway, even given that, we probably should keep the original behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh, good find. It should be fine to remove the test case. Up to you.

},
{
name: "missing labels",
pod: st.MakePod().NodeSelector(map[string]string{
"foo": "bar",
}).Obj(),
wantStatus: framework.NewStatus(framework.UnschedulableAndUnresolvable, ErrReasonPod),
wantStatus: framework.NewStatus(framework.UnschedulableAndUnresolvable, ErrReasonPod),
disablePreFilter: true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why? Same question for the rest of the cases.

Copy link
Member Author

@sanposhiho sanposhiho Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, we don't need it. Removed all disablePreFilter.

@@ -45,15 +45,18 @@ func TestNodeAffinity(t *testing.T) {
disablePreFilter bool
}{
{
name: "no selector",
pod: &v1.Pod{},
name: "no selector and affinity",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is this different from test case "Pod with no Affinity will schedule onto a node" below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems duplicated. Removed it 🙏

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the new case, instead of the old one. It's better for the git history.

func (s *Status) IsSuccess() bool {
return s.Code() == Success
return s.Code() == Success || s.Code() == Skip
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this reasonable? IsSuccess general means we'll perform happy path then, but IsSkip means we'll skip the phase.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is generally used like if !s.IsSuccess(), in which it makes sense. Maybe we should change the name? But I think Skip is not too different from Success.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel the same as @alculquicondor. Skip is also kinda happy path.
I'm not against renaming, but is there any good name for the func? No good idea from my poor English vocab. 😓

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We not only depend on !s.IsSuccess(), also s.IsSuccess(), see here (also other places)

if status.IsSuccess() {
length := atomic.AddInt32(&feasibleNodesLen, 1)
if length > numNodesToFind {
cancel()
atomic.AddInt32(&feasibleNodesLen, -1)
} else {
feasibleNodes[length-1] = nodeInfo.Node()
}

This is not for PreFilter, but we now changed the underlying meanings of Success. If the code is Skip, I don't think we should process with this. Maybe we should add another method like IsPassed() means we can continue with the logic that follows, both success or skip.

Copy link
Member Author

@sanposhiho sanposhiho Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping IsSuccess as it is (= only Success) and creating another func like IsPassed(= Success and Skip) makes sense to me.

In pkg/scheduler/schedule_one.go, we currently need to use IsSkip to distinguish Skip from IsSuccess(= Success and Skip), and a new IsSuccess (= only Success) would be useful there.
Then, we can just change all other places where using IsSuccess to use IsPassed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4fec5285ed6. The diff gets much bigger now, but I believe I replace all IsSuccess to IsPassed. (Other than the ones in pkg/scheduler/schedule_one.go which was used like IsSuccess() && !IsSkip().)

@sanposhiho
Copy link
Member Author

/retest

@@ -617,6 +620,9 @@ func (f *frameworkImpl) RunPreFilterPlugins(ctx context.Context, state *framewor
}
return nil, framework.AsStatus(fmt.Errorf("running PreFilter plugin %q: %w", pl.Name(), s.AsError())).WithFailedPlugin(pl.Name())
}
if s.IsSkip() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, if status is not success, how will it be skipped? I think we missed a test for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may not correctly understand what you mean, but if a returned status isn't a success, then it doesn't reach here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forget this for my misread, sorry.

func (s *Status) IsSuccess() bool {
return s.Code() == Success
return s.Code() == Success || s.Code() == Skip
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We not only depend on !s.IsSuccess(), also s.IsSuccess(), see here (also other places)

if status.IsSuccess() {
length := atomic.AddInt32(&feasibleNodesLen, 1)
if length > numNodesToFind {
cancel()
atomic.AddInt32(&feasibleNodesLen, -1)
} else {
feasibleNodes[length-1] = nodeInfo.Node()
}

This is not for PreFilter, but we now changed the underlying meanings of Success. If the code is Skip, I don't think we should process with this. Maybe we should add another method like IsPassed() means we can continue with the logic that follows, both success or skip.

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. sig/storage Categorizes an issue or PR as relevant to SIG Storage. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 10, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, sanposhiho

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sanposhiho
Copy link
Member Author

Squashed. Thanks @alculquicondor @kerthcet for the long-time effort in reviewing this.

@sanposhiho sanposhiho force-pushed the skip-reimplementation branch 2 times, most recently from dd138f5 to 6791547 Compare January 5, 2023 17:48
Comment on lines +1405 to +1409
state := framework.NewCycleState()

f.RunPreFilterPlugins(ctx, state, nil)
f.RunPreFilterExtensionAddPod(ctx, state, nil, nil, nil)
f.RunPreFilterExtensionRemovePod(ctx, state, nil, nil, nil)
Copy link
Member Author

@sanposhiho sanposhiho Jan 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just added these changes to prevent nil-pointer panic in RunPreFilterExtensionAddPod and RunPreFilterExtensionRemovePod. 🙏

Copy link
Member

@alculquicondor alculquicondor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
Thanks!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 5, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 0af973e0dd5d182a572893ef6ccdd1f6226dfe4c

@k8s-triage-robot
Copy link

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

  • The PR does have any do-not-merge/* labels
  • The PR does not have the needs-ok-to-test label
  • The PR is mergeable (does not have a needs-rebase label)
  • The PR is approved (has cncf-cla: yes, lgtm, approved labels)
  • The PR is failing tests required for merge

You can:

/retest

2 similar comments
@k8s-triage-robot
Copy link

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

  • The PR does have any do-not-merge/* labels
  • The PR does not have the needs-ok-to-test label
  • The PR is mergeable (does not have a needs-rebase label)
  • The PR is approved (has cncf-cla: yes, lgtm, approved labels)
  • The PR is failing tests required for merge

You can:

/retest

@k8s-triage-robot
Copy link

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

  • The PR does have any do-not-merge/* labels
  • The PR does not have the needs-ok-to-test label
  • The PR is mergeable (does not have a needs-rebase label)
  • The PR is approved (has cncf-cla: yes, lgtm, approved labels)
  • The PR is failing tests required for merge

You can:

/retest

Comment on lines +1428 to +1432
name string
plugins []*TestPlugin
wantPreFilterResult *framework.PreFilterResult
wantSkippedPlugins sets.Set[string]
wantStatusCode framework.Code
Copy link
Member

@pacoxu pacoxu Jan 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/hold
for go format CI failure
feel free to unhold once the CI is green.

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Jan 6, 2023
@sanposhiho
Copy link
Member Author

Oops, sorry for that. Formated. 🙏

@sanposhiho
Copy link
Member Author

/retest

#110697

@sanposhiho
Copy link
Member Author

/retest

#114438

@kerthcet
Copy link
Member

kerthcet commented Jan 6, 2023

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 6, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: fa18af83341c08a7817aff55c709c5f3378289f7

@pacoxu
Copy link
Member

pacoxu commented Jan 6, 2023

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 6, 2023
@k8s-ci-robot k8s-ci-robot merged commit c549b59 into kubernetes:master Jan 6, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.27 milestone Jan 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants