feature(scheduler): won't run Filter if PreFilter returned a Skip status #112637

Merged (1 commit) on Nov 8, 2022

Member

@sanposhiho sanposhiho commented Sep 21, 2022

What type of PR is this?

/kind feature

What this PR does / why we need it:

Change the framework so that it doesn't run a plugin's Filter() if that plugin's PreFilter() returned a Skip status.
For example, nodeAffinity can return Skip in PreFilter if the pod doesn't specify any node selector or affinity.

In this PR, I only changed the Framework logic. I'm going to change each plugin's PreFilter in another PR.
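
To make the intended behavior concrete, here is a minimal, self-contained sketch of the idea. It is not the scheduler's actual code: the `Pod`, `Plugin`, `runPreFilter`, and `runFilter` names are hypothetical stand-ins, and a plain map replaces whatever set type the framework uses.

```go
package main

import "fmt"

// Code is a simplified stand-in for the framework's status code (assumption).
type Code int

const (
	Success Code = iota
	Unschedulable
	Skip
)

// Pod is a hypothetical, pared-down pod: only a node selector is modelled.
type Pod struct {
	NodeSelector map[string]string
}

// Plugin is a hypothetical interface combining PreFilter and Filter.
type Plugin interface {
	Name() string
	PreFilter(pod Pod) Code
	Filter(pod Pod, node string) Code
}

// affinityLike returns Skip from PreFilter when the pod has no node selector.
type affinityLike struct{ filterCalls int }

func (p *affinityLike) Name() string { return "affinityLike" }

func (p *affinityLike) PreFilter(pod Pod) Code {
	if len(pod.NodeSelector) == 0 {
		return Skip
	}
	return Success
}

func (p *affinityLike) Filter(pod Pod, node string) Code {
	p.filterCalls++ // counted so the sketch can show whether Filter ran
	return Success
}

// runPreFilter collects the names of plugins whose PreFilter returned Skip.
func runPreFilter(plugins []Plugin, pod Pod) map[string]bool {
	skipped := map[string]bool{}
	for _, pl := range plugins {
		if pl.PreFilter(pod) == Skip {
			skipped[pl.Name()] = true
		}
	}
	return skipped
}

// runFilter calls Filter only on plugins that are not in the skip set.
func runFilter(plugins []Plugin, skipped map[string]bool, pod Pod, node string) Code {
	for _, pl := range plugins {
		if skipped[pl.Name()] {
			continue
		}
		if c := pl.Filter(pod, node); c != Success {
			return c
		}
	}
	return Success
}

func main() {
	pl := &affinityLike{}
	plugins := []Plugin{pl}
	pod := Pod{} // no node selector, so PreFilter returns Skip
	skipped := runPreFilter(plugins, pod)
	runFilter(plugins, skipped, pod, "node-a")
	fmt.Println(pl.filterCalls) // prints 0
}
```

Since Filter runs once per candidate node, skipping it for a plugin that has nothing to check avoids one call per node for that plugin.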

Which issue(s) this PR fixes:

Part of #107556
Part of #110643

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added the release-note, kind/feature, size/L, cncf-cla: yes, and do-not-merge/needs-sig labels Sep 21, 2022
@k8s-ci-robot
Contributor

@sanposhiho: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-triage and needs-priority labels Sep 21, 2022
@sanposhiho
Member Author

/sig scheduling

@k8s-ci-robot added the sig/scheduling label and removed the do-not-merge/needs-sig label Sep 21, 2022
@sanposhiho force-pushed the pre-filter-skip branch 2 times, most recently from 2703fdc to ec85e91 September 21, 2022 15:14
@k8s-ci-robot added the area/test and sig/testing labels Sep 21, 2022
@sanposhiho force-pushed the pre-filter-skip branch 2 times, most recently from 27d1069 to 746d8be September 22, 2022 02:01
@sanposhiho
Member Author

cc @x13n
(Just for your information: this PR was created before #112660, so no one from the autoscaling team is assigned to it.)

@x13n
Member

x13n commented Oct 6, 2022

Thanks, and apologies for the late response. The change LGTM from an autoscaling point of view. Since it predates #112660, I guess there's nothing else for me to do here (i.e. my approval won't matter to the k8s bot).

@alculquicondor
Member

/assign

  // anything but Success. If a non-success status is returned, then the scheduling
  // cycle is aborted.
- func (f *frameworkImpl) RunPreFilterPlugins(ctx context.Context, state *framework.CycleState, pod *v1.Pod) (_ *framework.PreFilterResult, status *framework.Status) {
+ func (f *frameworkImpl) RunPreFilterPlugins(ctx context.Context, state *framework.CycleState, pod *v1.Pod) (_ *framework.PreFilterResult, _ sets.String, status *framework.Status) {
Member

Maybe we should add a dedicated field in CycleState to hold the set of SkipFilterPlugins

Member Author

Ah, makes sense. That idea hadn't occurred to me.
Let me change it.

Member Author

Changed the implementation as suggested. 🙏

pkg/scheduler/framework/cycle_state.go (outdated, resolved)
@@ -48,6 +50,8 @@ type CycleState struct {
  	storage sync.Map
  	// if recordPluginMetrics is true, PluginExecutionDuration will be recorded for this cycle.
  	recordPluginMetrics bool
+ 	// skipFilterPlugins is the set of plugins that are skipped in the Filter extension point.
+ 	skipFilterPlugins sets.String
Member

I would just export this field instead of adding getters and setters. Or you could have functions like SkipFilterPlugin to add elements and ShouldSkipFilterPlugin to check if the plugin is in the set.

Are there scenarios in the codebase where c==nil?
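
The two options in this comment can be sketched side by side. This is only an illustration under assumptions: `CycleState` here is a stripped-down stand-in with a plain map instead of `sets.String`, and the method names are taken from the suggestion above rather than from merged code. The nil-receiver guard is one hypothetical way to answer the `c==nil` question.

```go
package main

import "fmt"

// CycleState is a hypothetical stand-in for the scheduler's per-cycle state;
// only the skip set is modelled.
type CycleState struct {
	// Option A: export the set so callers read and write it directly.
	SkipFilterPlugins map[string]struct{}
}

// Option B: hide the set's representation behind two small helpers.

// SkipFilterPlugin records that the named plugin's Filter should be skipped.
func (c *CycleState) SkipFilterPlugin(name string) {
	if c.SkipFilterPlugins == nil {
		c.SkipFilterPlugins = map[string]struct{}{}
	}
	c.SkipFilterPlugins[name] = struct{}{}
}

// ShouldSkipFilterPlugin reports whether the named plugin is in the skip set.
// A nil receiver is treated as "nothing skipped" (an assumption of this sketch).
func (c *CycleState) ShouldSkipFilterPlugin(name string) bool {
	if c == nil {
		return false
	}
	_, ok := c.SkipFilterPlugins[name] // reading a nil map is safe in Go
	return ok
}

func main() {
	state := &CycleState{}
	state.SkipFilterPlugin("NodeAffinity")
	fmt.Println(state.ShouldSkipFilterPlugin("NodeAffinity")) // prints true
	fmt.Println(state.ShouldSkipFilterPlugin("TaintToleration")) // prints false

	var none *CycleState
	fmt.Println(none.ShouldSkipFilterPlugin("NodeAffinity")) // prints false
}
```

Exporting the field is less code; the helpers keep the set type an implementation detail and give one place to handle a nil state.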

Member Author

> I would just export this field instead of adding getters and setters.

That seems simpler and better.
I just followed how we handle recordPluginMetrics; there was no strong reason for the getters and setters.

Member Author

> Are there scenarios in the codebase where c==nil?

I checked, and it seems there are none.

name: preFilterPluginName,
inj: injectedResult{PreFilterStatus: int(framework.Error)},
func TestRunPreFilterPlugins(t *testing.T) {
tests := []struct {
Member

So much better, thanks!

pkg/scheduler/framework/runtime/framework_test.go (outdated, resolved)
Comment on lines 1324 to 1473
if status == nil && tt.wantStatus != nil || status != nil && tt.wantStatus == nil {
t.Errorf("wrong status. got: %v, want: %v", status, tt.wantStatus)
} else if status == nil && tt.wantStatus == nil {
// do nothing
Member

doesn't cmp.Diff already handle all these cases?

Member Author

If status or tt.wantStatus is nil, dereferencing *status or *tt.wantStatus in the cmp.Diff call will cause a nil-pointer panic.
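
The concern here is about dereferencing before comparing. A small standard-library sketch of it, using a hypothetical two-field `Status` rather than `framework.Status`, and hypothetical helpers `equalByValue`, `equalNilSafe`, and `panics`:

```go
package main

import "fmt"

// Status is a hypothetical stand-in with only comparable fields.
type Status struct {
	Code   int
	Reason string
}

// equalByValue dereferences both pointers, so it panics if either is nil.
func equalByValue(a, b *Status) bool { return *a == *b }

// equalNilSafe checks for nil before dereferencing.
func equalNilSafe(a, b *Status) bool {
	if a == nil || b == nil {
		return a == b // equal only when both are nil
	}
	return *a == *b
}

// panics reports whether calling f panics.
func panics(f func()) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	var got *Status
	want := &Status{Code: 1, Reason: "injected status"}

	fmt.Println(panics(func() { equalByValue(got, want) })) // prints true
	fmt.Println(equalNilSafe(got, want))                    // prints false
	fmt.Println(equalNilSafe(nil, nil))                     // prints true
}
```

Passing the pointers themselves to the comparison, and letting the comparison decide how to treat nil, removes the need for the separate nil branches in the test.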

Member

you can use cmp.Diff(status, tt.wantStatus)

Member Author

@sanposhiho sanposhiho Nov 5, 2022

cmp.Diff cannot handle pointers.
Below is the result of using cmp.Diff(status, tt.wantStatus, cmpopts.IgnoreUnexported(framework.Status{})) here.
The two values are actually the same, but cmp.Diff treats them as different because they are pointers.

  (*framework.Status)(
- 	&⟪0xc00016d780⟫{
- 		code:    1,
- 		reasons: []string(⟪ptr:0xc0003d44a0, len:1, cap:1⟫{`running PreFilter plugin "error": injected status`}),
- 		err: &⟪0xc0003a2060⟫"fmt".wrapError{
- 			msg: `running PreFilter plugin "error": injected status`,
- 			err: &⟪0xc0003d4420⟫"errors".errorString{s: "injected status"},
- 		},
- 		failedPlugin: "error",
- 	},
+ 	&⟪0xc00016d2c0⟫{
+ 		code:    1,
+ 		reasons: []string(⟪ptr:0xc0003d52f0, len:1, cap:1⟫{`running PreFilter plugin "error": injected status`}),
+ 		err: &⟪0xc0003a2580⟫"fmt".wrapError{
+ 			msg: `running PreFilter plugin "error": injected status`,
+ 			err: &⟪0xc0003d4c80⟫"errors".errorString{s: "injected status"},
+ 		},
+ 		failedPlugin: "error",
+ 	},
  )

Member Author

It turns out my previous comment is wrong: cmp.Diff can handle pointers.
The reason we cannot use cmp.Diff directly is that the two errors are created from different base errors.

The status from TestPlugin contains the error created in framework.NewStatus.

return nil, framework.NewStatus(framework.Code(pl.inj.PreFilterStatus), injectReason)

But the status in each test case contains the error errInjectedStatus:

errInjectedStatus = errors.New(injectReason)

I fixed this by using cmp.Comparer to deal with the problem above.

pkg/scheduler/framework/runtime/framework_test.go (three outdated, resolved threads)
@k8s-ci-robot added the size/XXL label and removed the size/L label Oct 30, 2022
@sanposhiho
Member Author

Sure, I will do it this weekend.

@sanposhiho
Member Author

sanposhiho commented Nov 5, 2022

@alculquicondor @kerthcet Fixed and squashed. 🙏

@sanposhiho
Member Author

/retest

@sanposhiho
Member Author

/retest

Member

@kerthcet kerthcet left a comment

One nit.

pkg/scheduler/framework/cycle_state.go (outdated, resolved)
if d := cmp.Diff(result, tt.wantPreFilterResult); d != "" {
t.Errorf("wrong status. got: %v, want: %v, diff: %s", result, tt.wantPreFilterResult, d)
}
if d := cmp.Diff(status, tt.wantStatus, cmpopts.EquateErrors(), cmp.Comparer(func(a, b *framework.Status) bool {
Member

there should be no need for EquateErrors

Member Author

Ah 🤦, fixed.

@k8s-ci-robot
Contributor

k8s-ci-robot commented Nov 8, 2022

@sanposhiho: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-e2e-gce-csi-serial | f6ee0066ce8ff73798ab0e538a8baf4e67c096db | link | false | /test pull-kubernetes-e2e-gce-csi-serial |
| pull-kubernetes-e2e-gce-alpha-features | f6ee0066ce8ff73798ab0e538a8baf4e67c096db | link | false | /test pull-kubernetes-e2e-gce-alpha-features |
| pull-kubernetes-conformance-kind-ipv6-parallel | f6ee0066ce8ff73798ab0e538a8baf4e67c096db | link | false | /test pull-kubernetes-conformance-kind-ipv6-parallel |
| pull-kubernetes-e2e-gce-storage-slow | f6ee0066ce8ff73798ab0e538a8baf4e67c096db | link | false | /test pull-kubernetes-e2e-gce-storage-slow |
| pull-kubernetes-e2e-gce-storage-snapshot | f6ee0066ce8ff73798ab0e538a8baf4e67c096db | link | false | /test pull-kubernetes-e2e-gce-storage-snapshot |
| pull-kubernetes-e2e-capz-windows-containerd | f6ee0066ce8ff73798ab0e538a8baf4e67c096db | link | false | /test pull-kubernetes-e2e-capz-windows-containerd |
| check-dependency-stats | f6ee0066ce8ff73798ab0e538a8baf4e67c096db | link | false | /test check-dependency-stats |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@sanposhiho
Member Author

/retest

@kerthcet
Member

kerthcet commented Nov 8, 2022

/lgtm

@k8s-ci-robot added the lgtm label Nov 8, 2022
@alculquicondor
Member

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, sanposhiho

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label Nov 8, 2022
@k8s-ci-robot merged commit 2b7b524 into kubernetes:master Nov 8, 2022
@k8s-ci-robot added this to the v1.26 milestone Nov 8, 2022
@k8s-ci-robot added the release-note-none label and removed the release-note label Nov 10, 2022
@sanposhiho
Member Author

remove the release note.

@alculquicondor
Member

@sanposhiho can you work on this now?

I think this is important for performance reasons too.

@sanposhiho
Member Author

Yes, I think I can create a PR, hopefully by this weekend.
