Fix default scheduler crash if scheduler extender filter returns a not found node #79641

Merged 1 commit on Aug 14, 2019

yqwang-ms
Contributor

@yqwang-ms yqwang-ms commented Jul 2, 2019

What type of PR is this?
/kind bug

What this PR does / why we need it:
See issue #79640

Which issue(s) this PR fixes:

Fixes #79640

Special notes for your reviewer:
In the future, we may want to further improve the default scheduler's tolerance of scheduler extender errors.

Does this PR introduce a user-facing change?:

If a scheduler extender's filter returns a node that is not found, the current scheduling round for this pod is skipped.
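The behavior described in this release note can be sketched in Go as follows. This is a minimal illustration, not the scheduler's actual code: `Node`, `NodeInfo`, and `filterKnownNodes` are simplified, hypothetical stand-ins, and the error text is invented for the example. It shows the comma-ok map lookup turning an unknown node name into an error instead of silently producing a nil entry that could be dereferenced later.

```go
package main

import "fmt"

// Node and NodeInfo are simplified, hypothetical stand-ins for the
// scheduler's real types; they exist only for this illustration.
type Node struct{ Name string }

type NodeInfo struct{ node *Node }

// Node returns the wrapped node.
func (n *NodeInfo) Node() *Node { return n.node }

// filterKnownNodes resolves the node names returned by an extender.
// An unknown name yields an error rather than a nil entry in the result.
func filterKnownNodes(names []string, nodeNameToInfo map[string]*NodeInfo) ([]*Node, error) {
	var result []*Node
	for _, name := range names {
		info, ok := nodeNameToInfo[name] // comma-ok: ok is false for a missing key
		if !ok {
			return nil, fmt.Errorf("extender returned node %q, which is not found", name)
		}
		result = append(result, info.Node())
	}
	return result, nil
}

func main() {
	infos := map[string]*NodeInfo{
		"node-a": &NodeInfo{node: &Node{Name: "node-a"}},
	}

	// A known node resolves normally.
	nodes, err := filterKnownNodes([]string{"node-a"}, infos)
	fmt.Println(len(nodes), err)

	// An unknown node is reported as an error; the caller can then skip
	// this scheduling round instead of crashing.
	_, err = filterKnownNodes([]string{"node-a", "node-x"}, infos)
	fmt.Println(err)
}
```

The caller decides what to do with the error; in this sketch it is only printed.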

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 2, 2019
@k8s-ci-robot
Contributor

Hi @yqwang-ms. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jul 2, 2019
@yqwang-ms
Contributor Author

/assign @ravisantoshgudimetla

@wgliang
Contributor

wgliang commented Jul 2, 2019

/assign
/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 2, 2019
@yqwang-ms
Contributor Author

/retest

Contributor

@hex108 hex108 left a comment

/lgtm

Could you please help squash the commits?

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 5, 2019
@yqwang-ms
Contributor Author

I am not authorized to merge this pull request. @hex108, could you please click the "squash and merge" button? (A screenshot of the button was attached.)

@hex108
Contributor

hex108 commented Jul 5, 2019

> I am not authorized to merge this pull request. @hex108, could you please click the "squash and merge" button?

You can squash the commits on your local machine and force-push them to the remote.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 5, 2019
@yqwang-ms
Contributor Author

Thanks for the info, @hex108. I have force-pushed; please check :)

@yqwang-ms
Contributor Author

/retest

@hex108
Contributor

hex108 commented Jul 5, 2019

/lgtm

Thanks!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 5, 2019
@yqwang-ms
Contributor Author

Hi @Huang-Wei @k82cn could you please take a look at this small fix? Thanks!

@yqwang-ms
Contributor Author

Hi @hex108, I am not familiar with the K8S PR process. Could you please tell me what I should do to move this PR toward approval?

@hex108
Contributor

hex108 commented Jul 8, 2019

> Hi @hex108, I am not familiar with the K8S PR process. Could you please tell me what I should do to move this PR toward approval?

Wait for one of the approvers to approve. :)

@yqwang-ms
Contributor Author

Great! Thanks!

@yqwang-ms
Contributor Author

Hi @Huang-Wei @k82cn, could you please take a look at this small fix when you are free? :)
Or could you please tell me what I should do next?
Thanks again!

@yqwang-ms
Contributor Author

/assign @k82cn

@yqwang-ms
Contributor Author

/assign @Huang-Wei

Member

@Huang-Wei Huang-Wei left a comment

Apologies for the late reply. Some comments below.

- nodeResult = append(nodeResult, nodeNameToInfo[(*result.NodeNames)[i]].Node())
+ for _, nodeName := range *result.NodeNames {
+     if node, ok := nodeNameToInfo[nodeName]; ok {
+         nodeResult = append(nodeResult, node.Node())
Member

With this PR, len(nodeResult) is not necessarily the same as len(*result.NodeNames), so it'd be good to change L303 to nodeResult = make([]*v1.Node).

Contributor Author

@yqwang-ms yqwang-ms Aug 13, 2019

Thanks for your review. :)

For this comment, do you mean changing it to nodeResult = make([]*v1.Node, 0)? However, len(nodeResult) is expected to be the same as len(*result.NodeNames), and in the common case it is. Otherwise, it is a rare "exception", and we already return an error for it.

So, to optimize for the common case, it would be better to keep making the slice with the expected capacity so that the subsequent appends stay fast.

Is that ok?
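The trade-off described above can be illustrated with a small Go sketch. It is not the code under review: `collect` and the `known` map are hypothetical, and string names stand in for nodes. It shows that a slice created with `make([]T, 0, n)` keeps `len` and `cap` separate, so appending up to `n` elements never reallocates, while skipping an element simply leaves `len` below `cap`.

```go
package main

import "fmt"

// collect is a hypothetical helper: it keeps only the names present in known.
// The result is preallocated with capacity len(names), so appends within that
// capacity reuse the same backing array.
func collect(names []string, known map[string]bool) []string {
	out := make([]string, 0, len(names)) // len 0, cap len(names)
	for _, name := range names {
		if known[name] {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	known := map[string]bool{"node-a": true, "node-b": true}

	// Common case: every name is known, so len ends up equal to cap.
	all := collect([]string{"node-a", "node-b"}, known)
	fmt.Println(len(all), cap(all)) // prints "2 2"

	// Rare case: one name is unknown, so len is smaller than cap.
	some := collect([]string{"node-a", "node-x"}, known)
	fmt.Println(len(some), cap(some)) // prints "1 2"
}
```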

Member

> For this comment, do you mean change to nodeResult = make([]*v1.Node, 0)

Yes, if the length is not fixed.

Since, per your comment, the length is fixed and we return an error upon mismatch, technically we should use nodeResult = make([]*v1.Node, len(*result.NodeNames)) to ensure that the slice's len and cap are the same, and assign with nodeResult[i] = node.Node(). (I recall running a test that showed a perf improvement.)

Can you update the code, as well as L305 to L310, and squash everything into one commit?
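The suggestion above, allocating the full length up front and assigning by index, can be sketched as follows. This is an illustration under assumptions rather than the merged code: `resolve` and the integer-valued `index` map are hypothetical, and the error text is invented. The pattern is only safe because every slot is either filled or the function returns an error before the partially filled slice can escape.

```go
package main

import "fmt"

// resolve is a hypothetical helper that maps each name to a value.
// The result has len == cap == len(names); each slot is written by index.
func resolve(names []string, index map[string]int) ([]int, error) {
	out := make([]int, len(names)) // length fixed up front, no append needed
	for i, name := range names {
		v, ok := index[name]
		if !ok {
			// Return early so a partially filled slice is never used.
			return nil, fmt.Errorf("name %q not found", name)
		}
		out[i] = v
	}
	return out, nil
}

func main() {
	index := map[string]int{"node-a": 1, "node-b": 2}

	vals, err := resolve([]string{"node-b", "node-a"}, index)
	fmt.Println(vals, len(vals), cap(vals), err)

	_, err = resolve([]string{"node-a", "node-x"}, index)
	fmt.Println(err)
}
```

Compared with `make([]T, 0, n)` plus `append`, this avoids the per-element length bookkeeping that `append` performs, at the cost of requiring the all-or-nothing error handling shown here.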

Contributor Author

Thanks! Adjusted according to your suggestion, as well as L311 to L314. Please check.

pkg/scheduler/core/extender.go (two review threads, outdated and resolved)
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 13, 2019
@yqwang-ms
Contributor Author

/retest

@yqwang-ms
Contributor Author

/retest

@Huang-Wei
Member

/lgtm
/approve
/retest

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 14, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Huang-Wei, yqwang-ms

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 14, 2019
@yqwang-ms
Contributor Author

/retest

@k8s-ci-robot k8s-ci-robot merged commit 2ad2795 into kubernetes:master Aug 14, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.16 milestone Aug 14, 2019
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Default scheduler crash if scheduler extender filter returns a not found node
8 participants