Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use reservoir sampling to select one host from priority list #78009

Merged
merged 1 commit into from Aug 1, 2019
Merged

Use reservoir sampling to select one host from priority list #78009

merged 1 commit into from Aug 1, 2019

Conversation

hainesc
Copy link
Contributor

@hainesc hainesc commented May 17, 2019

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:

/kind api-change
/kind bug

/kind cleanup

/kind design
/kind documentation
/kind failing-test
/kind feature
/kind flake

What this PR does / why we need it:

The function selectHost was designed to select one host from priority list with a highest score, However, the current manner use round-robin and it has to keep a var named lastNodeIndex in the struct of genericScheduler, and it introduces a helper function findMaxScores, both are not needed if we use reservoir sampling, which is more suitable for selecting one candidate with same possibility from a set, and the code will be cleaner. For the detail of reservoir sampling, see here.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels May 17, 2019
@k8s-ci-robot
Copy link
Contributor

Hi @hainesc. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 17, 2019
Copy link
Member

@ahg-g ahg-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice!

pkg/scheduler/core/generic_scheduler.go Outdated Show resolved Hide resolved
@hainesc
Copy link
Contributor Author

hainesc commented May 24, 2019

ping @resouer @misterikkit for review and ok-to-test label

Thanks.

@hainesc
Copy link
Contributor Author

hainesc commented May 27, 2019

ping @kubernetes/sig-scheduling-pr-reviews for review

@k8s-ci-robot
Copy link
Contributor

@hainesc: Reiterating the mentions to trigger a notification:
@kubernetes/sig-scheduling-pr-reviews

In response to this:

ping @kubernetes/sig-scheduling-pr-reviews for review

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link
Member

@yastij yastij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/priority backlog
/assign @misterikkit

@k8s-ci-robot k8s-ci-robot added priority/backlog Higher priority than priority/awaiting-more-evidence. lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels May 27, 2019
@yastij
Copy link
Member

yastij commented May 27, 2019

@bsalamat - while i can buy the logic behind the change, I'm afraid that this might over-complicate our testing logic, thoughts ?

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 27, 2019
@hainesc
Copy link
Contributor Author

hainesc commented May 27, 2019

/test pull-kubernetes-bazel-test

@k8s-ci-robot
Copy link
Contributor

@hainesc: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test pull-kubernetes-bazel-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@hainesc
Copy link
Contributor Author

hainesc commented May 27, 2019

ping @yastij for ok-to-test

@yastij
Copy link
Member

yastij commented May 27, 2019

/ok-to-test

@yastij
Copy link
Member

yastij commented May 28, 2019

re-applying lgtm per my first comment

/lgtm

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels May 28, 2019
@hainesc
Copy link
Contributor Author

hainesc commented May 31, 2019

ping @misterikkit for approve, please take a look. Thanks.

@hainesc
Copy link
Contributor Author

hainesc commented Jun 6, 2019

ping @yastij Can you help me ping an approver? It is a long time since the lgtm label. I don't know who is responsible for this pr.

@ahg-g
Copy link
Member

ahg-g commented Jun 6, 2019

we are in code freeze now, even if it gets approved, it is not going to be merged until the freeze is over (probably mid June).

@hainesc
Copy link
Contributor Author

hainesc commented Jul 22, 2019

Hi @ahg-g, Can this pr get merged?

@hainesc
Copy link
Contributor Author

hainesc commented Jul 29, 2019

ping @yastij for approve.
Thanks

@gaorong
Copy link
Contributor

gaorong commented Jul 30, 2019

thanks @hainesc , I love this creative solution, but it seems this will break the previous round-robin manner to select a node and change to random selecting?

@hainesc
Copy link
Contributor Author

hainesc commented Jul 30, 2019

it seems this will break the previous round-robin manner to select a node and change to random selecting?

@gaorong Yes, as shown by the benchmark, random selecting wins a lot against the round-robin manner in performance. And also, as you see, the code is more cleaner than the previous.

I'd like to know what's your concern when we break the round-robin? In my opinion, there should be no assumption which node(with the highest score) will be selected, the only expected thing is that the candidate should be equal in probability.

@misterikkit
Copy link

@gaorong The current "round robin" behavior doesn't actually result in round robin scheduling. Rather, the lastNodeIndex is used to index into the slice of nodes which have the highest ranking. The size of this slice is likely to be different for each pod we schedule.

@bsalamat can confirm, but the intent of this mechanic is to prevent one node from getting all the pending pods when there are other equivalent nodes.

@hainesc
Copy link
Contributor Author

hainesc commented Jul 30, 2019

Thanks @misterikkit Yes, the input slice is likely to be different in scores and order as the cluster is dynamic.

Looking forward this PR getting merged (yeah, a long time wait), as it is proved to be better than the current code. It is approved by @ahg-g and @yastij

Thanks, all!

@gaorong
Copy link
Contributor

gaorong commented Jul 31, 2019

the intent of this mechanic is to prevent one node from getting all the pending pods when there are other equivalent nodes.

thanks misterikkit@, after rethinking, I guess the lastNodeIndex should keep as it is. although it can not guarantee round-robin, it can try to provide fairer selecting among equivalent nodes than random selecting.

@ahg-g
Copy link
Member

ahg-g commented Jul 31, 2019

@Huang-Wei @bsalamat I like this PR, it allows removing a member variable of genericScheduler without compromising semantics (the minimal perf improvement is a cherry on the top).

@misterikkit
Copy link

it can try to provide fairer selecting among equivalent nodes than random selecting.

@gaorong, I took a deeper look into the code, and the filter step makes no promises about stable ordering. In fact, it evaluates nodes in parallel. So in that sense, we are already doing an arbitrary, undefined distribution with lastNodeIndex. Switching to a uniform distribution ought to increase fairness. (It is unlikely to make it noticeably worse.)

@Huang-Wei
Copy link
Member

Huang-Wei commented Jul 31, 2019

Thanks @misterikkit for the above comment. Simply put, we are doing (1) nodes shuffling + (2) (to some extent) fair nodes pickup by lastIndex which still results in an eventual shuffling.
So IMO it's ok to onboard this PR (which is shuffling*2).

@gaorong
Copy link
Contributor

gaorong commented Aug 1, 2019

thank you guys for clarification.

Copy link
Member

@Huang-Wei Huang-Wei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

Thanks, @hainesc !

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hainesc, Huang-Wei

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 1, 2019
@k8s-ci-robot k8s-ci-robot merged commit ac1cde5 into kubernetes:master Aug 1, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.16 milestone Aug 1, 2019
@hainesc hainesc deleted the develop branch August 1, 2019 07:55
Copy link

@yb01 yb01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm.
this change might node improve perf but it simplify the code and randomized the node selection. so let's take it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/backlog Higher priority than priority/awaiting-more-evidence. release-note-none Denotes a PR that doesn't merit a release note. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants