Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not schedule pod to the node under PID pressure. #60007

Merged
merged 1 commit into from
Apr 26, 2018

Conversation

k82cn
Copy link
Member

@k82cn k82cn commented Feb 17, 2018

Signed-off-by: Da K. Ma klaus1982.cn@gmail.com

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
part of #54313

Release note:

Added CheckNodePIDPressurePredicate to checks if a pod can be scheduled on
a node reporting pid pressure condition.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Feb 17, 2018
@k82cn
Copy link
Member Author

k82cn commented Feb 17, 2018

This PR adds a predicate for pid pressure; and the Taint/Toleration way is handled in #60008 as TaintNodeByConditions is still alpha now.

@k82cn
Copy link
Member Author

k82cn commented Feb 25, 2018

@kubernetes/sig-scheduling-pr-reviews

@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Feb 25, 2018
// CheckNodePIDPressurePredicate checks if a pod can be scheduled on a node
// reporting pid pressure condition.
func CheckNodePIDPressurePredicate(pod *v1.Pod, meta algorithm.PredicateMetadata, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
// check if node is under pid pressure
Copy link
Contributor

@ravisantoshgudimetla ravisantoshgudimetla Feb 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it better to have a featuregating checked here? Something like utilfeature.DefaultFeatureGate.Enabled(features.TaintNodesByCondition)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point :) , we need to remove/disable this predicate if TaintNodesByCondition enabled.

xref #60398

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -160,6 +160,9 @@ func defaultPredicates() sets.String {
// Fit is determined by node disk pressure condition.
factory.RegisterFitPredicate(predicates.CheckNodeDiskPressurePred, predicates.CheckNodeDiskPressurePredicate),

// Fit is determined by node pid pressure condition.
factory.RegisterFitPredicate(predicates.CheckNodeDiskPressurePred, predicates.CheckNodePIDPressurePredicate),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/CheckNodeDiskPressurePred/CheckNodePIDPressurePred

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@k82cn k82cn force-pushed the k8s_54313_1 branch 2 times, most recently from def3513 to 2c80306 Compare February 26, 2018 03:06
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 26, 2018
@k82cn
Copy link
Member Author

k82cn commented Feb 26, 2018

/retest

@k82cn
Copy link
Member Author

k82cn commented Mar 1, 2018

@kubernetes/sig-scheduling-pr-reviews

Copy link
Member

@bsalamat bsalamat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a unit or integration test please?

@@ -1504,6 +1506,16 @@ func CheckNodeDiskPressurePredicate(pod *v1.Pod, meta algorithm.PredicateMetadat
return true, nil, nil
}

// CheckNodePIDPressurePredicate checks if a pod can be scheduled on a node
// reporting pid pressure condition.
func CheckNodePIDPressurePredicate(pod *v1.Pod, meta algorithm.PredicateMetadata, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why didn't you add this check to CheckNodeConditionPredicate instead of this function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree with @bsalamat, we might want to check it there as it is nodeCondition.

Also to me, we might want to use taint instead, and let the toleration mechanism do the work, WDYT @k82cn ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bsalamat CheckNodeMemoryPressurePred, CheckNodeDiskPressurePred is also node condition, why we seprate these to two predicates?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wanghaoran1988 @k82cn I see a benefit to having separate predicates. Users can enable or disable individual predicates in scheduler policy config. So, a user can disable CheckNodeMemoryPressurePred and/or CheckNodeDiskPressurePred in their policy config while keeping CheckNodeConditionPredicate enabled. For the same reason it may make sense to keep CheckNodePIDPressurePredicate as a separate predicate as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. separate predicates give user more options :).

Also to me, we might want to use taint instead, and let the toleration mechanism do the work,

Handled by #60008 .

@@ -86,6 +86,8 @@ const (
CheckNodeMemoryPressurePred = "CheckNodeMemoryPressure"
// CheckNodeDiskPressurePred defines the name of predicate CheckNodeDiskPressure.
CheckNodeDiskPressurePred = "CheckNodeDiskPressure"
// CheckNodePIDPressurePred defines the name of predicate CheckNodePIDPressure.
CheckNodePIDPressurePred = "CheckNodePIDPressure"
Copy link
Member

@yastij yastij Mar 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep this as a predicate, we might want to add it to the predicatesOrdering list, see predicates.go

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -1504,6 +1506,16 @@ func CheckNodeDiskPressurePredicate(pod *v1.Pod, meta algorithm.PredicateMetadat
return true, nil, nil
}

// CheckNodePIDPressurePredicate checks if a pod can be scheduled on a node
// reporting pid pressure condition.
func CheckNodePIDPressurePredicate(pod *v1.Pod, meta algorithm.PredicateMetadata, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree with @bsalamat, we might want to check it there as it is nodeCondition.

Also to me, we might want to use taint instead, and let the toleration mechanism do the work, WDYT @k82cn ?

Copy link
Contributor

@resouer resouer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just left comment per equiv class cache need. Thanks.

@@ -160,6 +160,9 @@ func defaultPredicates() sets.String {
// Fit is determined by node disk pressure condition.
factory.RegisterFitPredicate(predicates.CheckNodeDiskPressurePred, predicates.CheckNodeDiskPressurePredicate),

// Fit is determined by node pid pressure condition.
factory.RegisterFitPredicate(predicates.CheckNodePIDPressurePred, predicates.CheckNodePIDPressurePredicate),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should handle equiv cache invalidation properly when node condition changes. See: https://github.com/kubernetes/kubernetes/blob/master/pkg/scheduler/factory/factory.go#L822-L827

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 19, 2018
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Mar 19, 2018
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 19, 2018
@k82cn
Copy link
Member Author

k82cn commented Apr 19, 2018

Added integration test for the predicates, PTAL :)


// TestNodePIDPressue verifies that scheduler's CheckNodePIDPressurePredicate predicate
// functions works correctly.
func TestNodePIDPressue(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/TestNodePIDPressue/TestNodePIDPressure/
Pleaes fix the comment as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

pod.Namespace, pod.Name, targetNode)
}

err = cs.CoreV1().Pods(context.ns.Name).Delete(testPod.Name, metav1.NewDeleteOptions(0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleting the pods is not necessary in this test. You can leave it to the cleanupTest to take care of pod cleanup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanupTest only cleanup nodes :)

targetNode := nodes[1].Name

// Creats test pod.
testPod := &v1.Pod{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to ensure that the pod is not randomly assigned to the second node, I would add nodeSelector to the spec to choose the first node and would check that the pod is marked unschedulable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@k82cn k82cn force-pushed the k8s_54313_1 branch 2 times, most recently from 237c9c6 to 6bfab6e Compare April 20, 2018 15:03
@k82cn
Copy link
Member Author

k82cn commented Apr 20, 2018

/retest

t.Fatalf("Test Failed: error: %v, while creating pod", err)
}

err = wait.Poll(pollInterval, wait.ForeverTestTimeout, podUnschedulable(cs, testPod.Namespace, testPod.Name))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use waitForPodUnschedulable instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

t.Errorf("Test Failed: error, %v, while waiting for scheduled", err)
}

err = cs.CoreV1().Pods(context.ns.Name).Delete(testPod.Name, metav1.NewDeleteOptions(0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: you don't need to delete the pod. cleanupTest deletes it anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checked cleanupTest: it only deleted Nodes, DeleteTestingNamespace is empty. Did I miss anything?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, you are right. You can use cleanupPods to delete pods though.

Copy link
Member

@bsalamat bsalamat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we update the node controller (in a separate PR) to update Node taints when node is under PID pressure?

One nit and then LGTM.

t.Errorf("Test Failed: error, %v, while waiting for scheduled", err)
}

err = cs.CoreV1().Pods(context.ns.Name).Delete(testPod.Name, metav1.NewDeleteOptions(0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, you are right. You can use cleanupPods to delete pods though.

Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
@k82cn
Copy link
Member Author

k82cn commented Apr 26, 2018

Should we update the node controller (in a separate PR) to update Node taints when node is under PID pressure?

Handled at #60008 :)

Copy link
Member

@bsalamat bsalamat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Thanks, @k82cn!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 26, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, k82cn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 59367, 60007). If you want to cherry-pick this change to another branch, please follow the instructions here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants