Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parametrizable priority function mapping requested/capacity ratio to priority #63929

Conversation

losipiuk
Copy link
Contributor

@losipiuk losipiuk commented May 16, 2018

Release notes:

Add RequestedToCapacityRatioPriority priority function. Function is parametrized with set of points mapping node utilization (0-100) to score (0-10).
Function is linear between points. Resource utilization is defined as one minus ratio of total amount of resource requested by pods on node and node's capacity (scaled to 100). 
Final utilization used for computation is arithmetic mean of cpu utilization and memory utilization.
Function is disabled by default and can be enabled via scheduler policy config file.
If no parametrization is specified in config file it defaults to one which gives score 10 to utilization 0 and score 0 to utilization 100.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 16, 2018
@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. labels May 16, 2018
@MaciekPytel
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 16, 2018

import (
schedulerapi "k8s.io/kubernetes/pkg/scheduler/api"
"fmt"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention is to import builtin packages separately.

import (
  "fmt"

  "k8s.io/something"
  "k8s.io/somethingelse"
)

@losipiuk losipiuk force-pushed the lo/scheduler-priorities-ca-friendly branch 6 times, most recently from 12f04d4 to 3b84f5e Compare May 21, 2018 20:13
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 21, 2018
@losipiuk losipiuk force-pushed the lo/scheduler-priorities-ca-friendly branch 2 times, most recently from c231a31 to 4bf026a Compare May 21, 2018 21:55
@losipiuk losipiuk changed the title WIP - do not review - ca friendly scheduler priority function CA friendly scheduler priority function May 21, 2018
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 21, 2018
@losipiuk losipiuk changed the title CA friendly scheduler priority function Add parametrizable priority function mapping requested/capacity ratio to priority May 21, 2018
@losipiuk
Copy link
Contributor Author

/assign @bsalamat

It starts to be in reviewable state (modulo probably some missing comments)
@bsalamat please give it a pass when you have a chance. I will be OOO for 2 days now. Would be great to have some feedback when I am back.

Copy link
Member

@bsalamat bsalamat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @losipiuk! Please see my comments. I haven't checked the tests yet.


var (
// give priority to mostly utilized nodes by default
defaultFunctionShape = newFunctionShape([]float64{0.0, 1.0}, []float64{0.0, 1.0})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think you want to use this priority function as the default priority function to place pods based on node resource usage, it should choose the least utilized node by default. This is the current default behavior of the scheduler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// RequestedToCapacityRatioResourceAllocationPriorityDefault creates a requestedToCapacity based
// ResourceAllocationPriority using default resource scoring function shape.
// The default function assigns 0.0 to resources with all capacity
// not requested and 1.0 when requested amount is equal to capacity.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/with all capacity not requested/when all capacity is available/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I changed the default shape to be like least_requested the new comment now looks like

// The default function assigns 1.0 to resource when all capacity is available
// and 0.0 when requested amount is equal to capacity.

)

type functionShape struct {
x []float64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, these numbers must be between 0 and 1. If so, please replace the floating points with integers in scale 0 to 100. Floating point operations are more expensive to perform and their rounding behavior causes surprises. They should be avoided when possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can agree that operations are slightly more expensive though I would argue if that is something actually making non-neglegible difference in this use case. As for the surprises it seems to me that the code is actually more straightforward with floats, and we are not doing roundings on intermediate calculation steps, which makes result more accurate.

Also I see other priority function also use floating point numbers for internal computations (see selector_spreading for example).

I would leave it as is unless you strongly disagree. But then I would go for 0-10000 probably. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of floats in Borg (in a different context) brings a lot of horrible memories about bugs that were very hard and time-consuming to find. You are doing a lot of fine calculations on the floats as well. Those worry me and I'd still prefer integers. As far as range is concerned, 0-100 resembles 0% to 100% which is familiar to humans. Why would anyone want to set the resource utilization of a node with finer granularity than 1%? Even if someone wants to do so, it won't be so useful. Granularity is limited by the amount of resources that pods requests and the state of a cluster. So, in practice it is not possible to guarantee very high accuracy of node resource utilization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not fully agree as the rounding of intermediate computation steps if we are doing whole computation on small integers leads to unexpected results too.
I do not feel strongly about it though. I will make it 0-100 for user. And maybe extend the precision for computation - I will see how it fits in.

@@ -494,6 +503,9 @@ func validatePriorityOrDie(priority schedulerapi.PriorityPolicy) {
if priority.Argument.LabelPreference != nil {
numArgs++
}
if priority.Argument.RequestedToCapacityRatioArguments != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: It is a bit odd to put this check in this function. You may want to check a similar function that validates the RequestedToCapacityRatioArguments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Not sure I understand. Isn't this function responsibility to validate if PriorityPolicy struct is actually used as a variant type (exactly one field is set). I just extended the function we new variant option I added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is fine to leave it as is. I meant to create another function that validates RequestedToCapacityRatioArguments.


for i := 1; i < n; i++ {
if x[i-1] >= x[i] {
return functionShape{}, errors.New(fmt.Sprintf("Values in x must be increasing. x[%d]==%f >= x[%d]==%f", i-1, x[i-1], i, x[i]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/increasing/sorted/

}
}

// Creates a function which is build using linear segments
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/build/built/


// Creates a function which is build using linear segments
// shape.x slice represents points on x axis where different segments meet
// shape.y represent function values at meeting points
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/represent/represents/

// y[0] for p < x[0]
// y[i] for p == x[i]
// y[n-1] for p > x[n-1]
// and linear between points
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add (p < x[i]) after "between points".

func buildBrokenLinearFunction(shape functionShape) func(float64) float64 {
n := shape.n
x := make([]float64, shape.n)
copy(x, shape.x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you make a copy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise modifying the slice passed within shape argument, after function is built, would modify function behaviour (not to mention undefined behaviour due to lack of synchronization). To me it looks like good practice to make copy in such case. Though I did not write Golang before. Are there reasons no to do so in Golang?

type functionShape struct {
x []float64
y []float64
n int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to store the length?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't. I just found it more readable to have it on separate variable to signal that it is relevant to both x and y.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is not. It is confusing and prone to error. Please remove it.

@losipiuk losipiuk force-pushed the lo/scheduler-priorities-ca-friendly branch from 4bf026a to 728ee91 Compare May 24, 2018 22:11
Copy link
Contributor Author

@losipiuk losipiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for review!

Addressed comments or asked question when in doubt.
I also fixed a few linter errors and added generated code for api changes.

If you are interesed in fixup commit see this branch:
https://github.com/losipiuk/kubernetes/commits/lo/scheduler-priorities-ca-friendly-fixups

type functionShape struct {
x []float64
y []float64
n int
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't. I just found it more readable to have it on separate variable to signal that it is relevant to both x and y.

)

type functionShape struct {
x []float64
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can agree that operations are slightly more expensive though I would argue if that is something actually making non-neglegible difference in this use case. As for the surprises it seems to me that the code is actually more straightforward with floats, and we are not doing roundings on intermediate calculation steps, which makes result more accurate.

Also I see other priority function also use floating point numbers for internal computations (see selector_spreading for example).

I would leave it as is unless you strongly disagree. But then I would go for 0-10000 probably. WDYT?


var (
// give priority to mostly utilized nodes by default
defaultFunctionShape = newFunctionShape([]float64{0.0, 1.0}, []float64{0.0, 1.0})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

func newFunctionShape(x []float64, y []float64) functionShape {
shape, err := newFunctionShapeErr(x, y)
if err != nil {
panic(err.Error())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed after merge

// RequestedToCapacityRatioResourceAllocationPriorityDefault creates a requestedToCapacity based
// ResourceAllocationPriority using default resource scoring function shape.
// The default function assigns 0.0 to resources with all capacity
// not requested and 1.0 when requested amount is equal to capacity.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I changed the default shape to be like least_requested the new comment now looks like

// The default function assigns 1.0 to resource when all capacity is available
// and 0.0 when requested amount is equal to capacity.

func buildBrokenLinearFunction(shape functionShape) func(float64) float64 {
n := shape.n
x := make([]float64, shape.n)
copy(x, shape.x)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise modifying the slice passed within shape argument, after function is built, would modify function behaviour (not to mention undefined behaviour due to lack of synchronization). To me it looks like good practice to make copy in such case. Though I did not write Golang before. Are there reasons no to do so in Golang?

@@ -494,6 +503,9 @@ func validatePriorityOrDie(priority schedulerapi.PriorityPolicy) {
if priority.Argument.LabelPreference != nil {
numArgs++
}
if priority.Argument.RequestedToCapacityRatioArguments != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Not sure I understand. Isn't this function responsibility to validate if PriorityPolicy struct is actually used as a variant type (exactly one field is set). I just extended the function we new variant option I added.

@losipiuk losipiuk force-pushed the lo/scheduler-priorities-ca-friendly branch from 728ee91 to 0f382a7 Compare May 25, 2018 14:32
@ravisantoshgudimetla
Copy link
Contributor

@losipiuk -Thanks for the PR. Can you please provide some background on why we need this PR?

@losipiuk
Copy link
Contributor Author

@ravisantoshgudimetla the new function allows more customization of scheduler behavior via policy config. It is beneficial as different behavior may better for different cluster setups / user needs.

@losipiuk losipiuk force-pushed the lo/scheduler-priorities-ca-friendly branch from 6509eba to 2a34076 Compare May 31, 2018 06:18
@losipiuk
Copy link
Contributor Author

@losipiuk losipiuk force-pushed the lo/scheduler-priorities-ca-friendly branch from 2a34076 to 3114bb4 Compare May 31, 2018 06:21
Copy link
Member

@bsalamat bsalamat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Looks good. Just a couple of minor comments.

cc/ @misterikkit: if you have any questions about this feature and its use-cases, please ask sooner.


for i := 1; i < n; i++ {
if points[i-1].X >= points[i].X {
return nil, fmt.Errorf("values in x must be sorted. X[%d]==%d >= X[%d]==%d", i-1, points[i-1].X, i, points[i].X)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a user provides an invalid input in the policy config, one of these error messages is shown to the user. Users provide "utilization" and "score". They don't know about "x" and "y". Please change the messages and replace x and y with utilization and score.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was not a valid concern without the other comment :) As fields in structure were called X and Y. Changing to Utilization and Score, though IMO living with X and Y internally was fine and less verbose.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sure it is easy and more concise for you to call them "x" and "y", but I am not so sure if it would be easy to understand for someone else looking at the code 2 years later.

// FunctionShapePoint represents single point in scoring function shape.
type FunctionShapePoint struct {
// X is function argument.
X int64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To improve readability, please change X and Y to utilization and score.

@losipiuk losipiuk force-pushed the lo/scheduler-priorities-ca-friendly branch from 3114bb4 to 8dd6e67 Compare June 4, 2018 08:16
@losipiuk
Copy link
Contributor Author

losipiuk commented Jun 4, 2018

@losipiuk losipiuk force-pushed the lo/scheduler-priorities-ca-friendly branch from 8dd6e67 to 801caca Compare June 4, 2018 08:35
@losipiuk losipiuk force-pushed the lo/scheduler-priorities-ca-friendly branch 2 times, most recently from 20b2c45 to 447f035 Compare June 4, 2018 09:50
@losipiuk losipiuk force-pushed the lo/scheduler-priorities-ca-friendly branch from 447f035 to 98041d0 Compare June 4, 2018 10:37
@losipiuk
Copy link
Contributor Author

losipiuk commented Jun 4, 2018

/retest

Copy link
Member

@bsalamat bsalamat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 4, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, losipiuk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 4, 2018
@bsalamat bsalamat added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. kind/feature Categorizes issue or PR as related to a new feature. labels Jun 4, 2018
@bsalamat
Copy link
Member

bsalamat commented Jun 4, 2018

please add a release note

@k8s-github-robot
Copy link

[MILESTONENOTIFIER] Milestone Pull Request: Up-to-date for process

@bsalamat @losipiuk

Pull Request Labels
  • sig/scheduling: Pull Request will be escalated to these SIGs if needed.
  • priority/important-soon: Escalate to the pull request owners and SIG owner; move out of milestone after several unsuccessful escalation attempts.
  • kind/feature: New functionality.
Help

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 61610, 64591, 58143, 63929). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 1d10287 into kubernetes:master Jun 4, 2018
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Jun 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants