
Promote resource limits priority function to beta #69437

Merged
merged 2 commits into kubernetes:master on Oct 17, 2018

Conversation

@ravisantoshgudimetla
Contributor

ravisantoshgudimetla commented Oct 4, 2018

What this PR does / why we need it:
We should promote the resource limits priority function to beta, as it is useful in scenarios where nodes that satisfy a pod's resource limits should receive a higher score.

/cc @aveshagarwal @bsalamat
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Release note:

Promote resource limits priority function to beta
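
For context, a minimal illustrative sketch of the idea behind this priority function, written for this discussion rather than taken from the scheduler source: a node whose allocatable resources can satisfy the pod's CPU and memory limits receives a higher score. The function name and scoring details below are assumptions, not the actual implementation.

// Illustrative sketch only (assumed names, not the scheduler's actual code):
// score a node 1 if its allocatable CPU and memory satisfy the pod's limits.
package sketch

import (
	v1 "k8s.io/api/core/v1"
)

func resourceLimitsScore(pod *v1.Pod, allocatable v1.ResourceList) int64 {
	var cpuLimit, memLimit int64
	for _, c := range pod.Spec.Containers {
		cpuLimit += c.Resources.Limits.Cpu().MilliValue()
		memLimit += c.Resources.Limits.Memory().Value()
	}
	if allocatable.Cpu().MilliValue() >= cpuLimit && allocatable.Memory().Value() >= memLimit {
		return 1
	}
	return 0
}

The real scoring lives in the scheduler's priorities package; this sketch only captures the intent described in the PR description.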
@ravisantoshgudimetla


Contributor

ravisantoshgudimetla commented Oct 4, 2018

/sig scheduling

@bsalamat

I am fine with the promotion; I just wonder whether you have tested this feature enough in real clusters.

@@ -56,6 +58,17 @@ var podRequestedResource *v1.ResourceRequirements = &v1.ResourceRequirements{
},
}
var podwithLargeRequestedResource *v1.ResourceRequirements = &v1.ResourceRequirements{
Limits: v1.ResourceList{
v1.ResourceMemory: resource.MustParse("100Mi"),


@bsalamat

bsalamat Oct 5, 2018

Contributor

Limits is smaller than Requests?


@ravisantoshgudimetla

ravisantoshgudimetla Oct 6, 2018

Contributor

This was part of the testing I was doing; I forgot to revert it. Thanks.
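
For reference, a hypothetical corrected declaration in which the limits are at least as large as the requests; the variable name and values below are placeholders for illustration, not the ones used in the PR (assuming the usual v1 and resource imports):

var podWithConsistentResources = &v1.ResourceRequirements{
	Limits: v1.ResourceList{
		v1.ResourceMemory: resource.MustParse("200Mi"),
		v1.ResourceCPU:    resource.MustParse("200m"),
	},
	Requests: v1.ResourceList{
		v1.ResourceMemory: resource.MustParse("100Mi"),
		v1.ResourceCPU:    resource.MustParse("100m"),
	},
}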

@@ -399,3 +434,27 @@ func addRandomTaitToNode(cs clientset.Interface, nodeName string) *v1.Taint {
framework.ExpectNodeHasTaint(cs, nodeName, &testTaint)
return &testTaint
}
func UpdateMemoryOfNode(c clientset.Interface, nodeName string) error {


@bsalamat

bsalamat Oct 5, 2018

Contributor

It makes more sense for the function to receive the new amount of memory as an argument.


@ravisantoshgudimetla

ravisantoshgudimetla Oct 6, 2018

Contributor

Sure, I have added it.
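
For illustration, a rough sketch of what such a helper could look like once it takes the new memory value as an argument. The patch payload and the assumption that the value is given in Mi are this sketch's, not necessarily the PR's code.

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	clientset "k8s.io/client-go/kubernetes"
)

// UpdateMemoryOfNode patches the allocatable memory of the given node.
// Sketch only: the memory unit (Mi) and patch payload are assumptions.
func UpdateMemoryOfNode(c clientset.Interface, nodeName string, memory int64) error {
	node, err := c.CoreV1().Nodes().Get(nodeName, metav1.GetOptions{})
	if err != nil {
		return err
	}
	patchBytes := []byte(fmt.Sprintf(`{"status":{"allocatable":{"memory":"%dMi"}}}`, memory))
	_, err = c.CoreV1().Nodes().Patch(node.Name, types.StrategicMergePatchType, patchBytes)
	return err
}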

By("Pod should preferbly scheduled to nodes which satisfy its limits")
tolePod, err := cs.CoreV1().Pods(ns).Get(podWithLargeLimits, metav1.GetOptions{})
Expect(err).NotTo(HaveOccurred())
Expect(tolePod.Spec.NodeName).To(Equal(nodeName))


@bsalamat

bsalamat Oct 5, 2018

Contributor

I think you need to revert the node resources to avoid surprises for future test cases.


@ravisantoshgudimetla

ravisantoshgudimetla Oct 6, 2018

Contributor

Yes, I missed it. I have updated the PR now.

@@ -209,8 +222,6 @@ var _ = SIGDescribe("SchedulerPriorities [Serial]", func() {
// make the nodes have balanced cpu,mem usage ratio
err := createBalancedPodForNodes(f, cs, ns, nodeList.Items, podRequestedResource, 0.5)
framework.ExpectNoError(err)
//we need apply more taints on a node, because one match toleration only count 1
By("Trying to apply 10 taint on the nodes except first one.")


@bsalamat

bsalamat Oct 5, 2018

Contributor

Any particular reason for deleting this?


@ravisantoshgudimetla

ravisantoshgudimetla Oct 6, 2018

Contributor

Thanks for catching this; you are right. I thought I was changing the test I am introducing as part of this PR. I did not intend to touch this.

It("Pod should be preferably scheduled to nodes which satisfy its limits", func() {
// Update one node to have large allocatable.
nodeName := nodeList.Items[0].Name
err := UpdateMemoryOfNode(cs, nodeName, int64(10000))


@k82cn

k82cn Oct 7, 2018

Member

Hmm, it seems the node's memory is smaller than the pod's limit :) The middle node of the node list may be better.

As with the other priority tests, we should also check the length of nodeList; if there is only one node, our test cannot catch issues.


@ravisantoshgudimetla

ravisantoshgudimetla Oct 8, 2018

Contributor

Yes, I can change it to the middle node, but to be honest, none of the pull-verify-* jobs run these e2e tests, so this could have failed even after the PR merged :(.


@k82cn

k82cn Oct 10, 2018

Member

so actually this could have failed after the PR merged :(.

I meant setting nodeName to the middle of nodeList.Items :)
Also, since you are setting a smaller memory (10^4) than the pod's limits (3 * 10^4), it would be better to use a constant, e.g. make the node's memory twice that of the big pod.
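
Putting those suggestions together, the test setup could look roughly like the sketch below; the name podMemoryLimitMi, the skip threshold, and the factor of two are illustrative, not taken from the PR.

// Sketch only: sizes, names, and the skip threshold are assumptions.
if len(nodeList.Items) < 2 {
	framework.Skipf("this test requires at least 2 schedulable nodes")
}
// Use the middle node rather than the first one.
nodeName := nodeList.Items[len(nodeList.Items)/2].Name
// Make the node's memory twice the large pod's limit so the node can satisfy it.
const podMemoryLimitMi = int64(30000)
err := UpdateMemoryOfNode(cs, nodeName, 2*podMemoryLimitMi)
framework.ExpectNoError(err)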

@k82cn


Member

k82cn commented Oct 7, 2018

/kind feature

@k8s-ci-robot k8s-ci-robot added kind/feature and removed needs-kind labels Oct 7, 2018

@@ -399,3 +443,27 @@ func addRandomTaitToNode(cs clientset.Interface, nodeName string) *v1.Taint {
framework.ExpectNodeHasTaint(cs, nodeName, &testTaint)
return &testTaint
}
func UpdateMemoryOfNode(c clientset.Interface, nodeName string, value int64) error {


@bsalamat

bsalamat Oct 8, 2018

Contributor

nit: s/value/memory/

Also, please add a comment for the function and perhaps make it private (change to lower case).

}
_, err = c.CoreV1().Nodes().Patch(string(node.Name), types.StrategicMergePatchType, patchBytes)
if err != nil {


@bsalamat

bsalamat Oct 8, 2018

Contributor

nit: This if statement is not needed. At this point it returns 'err' regardless.
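
In other words, the tail of the helper can simply return the error from Patch directly (a minimal sketch of the suggested simplification):

_, err = c.CoreV1().Nodes().Patch(string(node.Name), types.StrategicMergePatchType, patchBytes)
// No branch needed: returning err covers both the success and failure cases.
return err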

err := UpdateMemoryOfNode(cs, nodeName, int64(10000))
Expect(err).NotTo(HaveOccurred())
defer func() {
nodeOriginalMemory, found := nodeList.Items[0].Status.Allocatable[v1.ResourceMemory]


@bsalamat

bsalamat Oct 8, 2018

Contributor

Shouldn't you read the original amount of memory at the beginning of the function, before you update it?
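
One hedged way to structure that: read the original allocatable memory before patching the node, then restore it in the defer. The helper call and the Mi conversion below are assumptions of this sketch.

// Sketch only: the Mi conversion assumes the helper takes its value in Mi.
originalMemory := nodeList.Items[0].Status.Allocatable[v1.ResourceMemory]
err := UpdateMemoryOfNode(cs, nodeName, int64(10000))
Expect(err).NotTo(HaveOccurred())
defer func() {
	// Restore the original allocatable memory so later tests see an unmodified node.
	Expect(UpdateMemoryOfNode(cs, nodeName, originalMemory.Value()/(1024*1024))).NotTo(HaveOccurred())
}()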


@ravisantoshgudimetla


Contributor

ravisantoshgudimetla commented Oct 15, 2018

/retest

@bsalamat

just a minor comment. Otherwise, LGTM.

@@ -56,6 +58,17 @@ var podRequestedResource *v1.ResourceRequirements = &v1.ResourceRequirements{
},
}
var podwithLargeRequestedResource *v1.ResourceRequirements = &v1.ResourceRequirements{


@bsalamat

bsalamat Oct 15, 2018

Contributor

Looks like this var is used only in one of the tests. Please move it inside the test func.

@ravisantoshgudimetla


Contributor

ravisantoshgudimetla commented Oct 16, 2018

/retest

@bsalamat

/lgtm

Thanks, @ravisantoshgudimetla!

@k8s-ci-robot k8s-ci-robot added the lgtm label Oct 16, 2018

@ravisantoshgudimetla


Contributor

ravisantoshgudimetla commented Oct 16, 2018

/retest

@ravisantoshgudimetla


Contributor

ravisantoshgudimetla commented Oct 17, 2018

/retest

@bsalamat


Contributor

bsalamat commented Oct 17, 2018

/approve

@k8s-ci-robot


Contributor

k8s-ci-robot commented Oct 17, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, ravisantoshgudimetla

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 897b3a9 into kubernetes:master Oct 17, 2018

18 checks passed

cla/linuxfoundation: ravisantoshgudimetla authorized
pull-kubernetes-bazel-build: Job succeeded.
pull-kubernetes-bazel-test: Job succeeded.
pull-kubernetes-cross: Skipped
pull-kubernetes-e2e-gce: Job succeeded.
pull-kubernetes-e2e-gce-100-performance: Job succeeded.
pull-kubernetes-e2e-gce-device-plugin-gpu: Job succeeded.
pull-kubernetes-e2e-gke: Skipped
pull-kubernetes-e2e-kops-aws: Job succeeded.
pull-kubernetes-e2e-kubeadm-gce: Skipped
pull-kubernetes-integration: Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big: Job succeeded.
pull-kubernetes-local-e2e: Skipped
pull-kubernetes-local-e2e-containerized: Skipped
pull-kubernetes-node-e2e: Job succeeded.
pull-kubernetes-typecheck: Job succeeded.
pull-kubernetes-verify: Job succeeded.
tide: In merge pool.
@AishSundar


Contributor

AishSundar commented Oct 18, 2018

@ravisantoshgudimetla I see the newly added Scheduler Priorities test failing in this test run: https://k8s-testgrid.appspot.com/sig-release-master-upgrade#gke-gci-new-gci-master-upgrade-cluster-new

Can you please take a look?

@jberkus @mortent, FYI. We need to open a test issue to track the failure if it reproduces in the next run as well.

@ravisantoshgudimetla


Contributor

ravisantoshgudimetla commented Oct 18, 2018

@AishSundar Sure, I will look into it. FWIW, the test is also failing on master (https://k8s-testgrid.appspot.com/sig-scheduling#gce-serial).

@AishSundar


Contributor

AishSundar commented Oct 18, 2018

/milestone v1.13

OK, let's open a new test issue for this. Thanks, @ravisantoshgudimetla.

@bsalamat


Contributor

bsalamat commented Oct 22, 2018

@ravisantoshgudimetla Looks like the e2e test for this PR is flaky: https://k8s-testgrid.appspot.com/sig-release-master-blocking#gce-cos-master-serial

Could you please take a look?
