Do not schedule pod to the node under PID pressure. #60007
Conversation
This PR adds a predicate for PID pressure; the Taint/Toleration approach is handled in #60008, as TaintNodesByCondition is still alpha now.
@kubernetes/sig-scheduling-pr-reviews
// CheckNodePIDPressurePredicate checks if a pod can be scheduled on a node
// reporting pid pressure condition.
func CheckNodePIDPressurePredicate(pod *v1.Pod, meta algorithm.PredicateMetadata, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
	// check if node is under pid pressure
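For reference, a minimal sketch of the complete predicate this excerpt is taken from, assuming NodeInfo exposes a PIDPressureCondition() accessor like the existing memory/disk ones and that ErrNodeUnderPIDPressure is defined alongside the other failure reasons:

```go
// CheckNodePIDPressurePredicate checks if a pod can be scheduled on a node
// reporting pid pressure condition.
func CheckNodePIDPressurePredicate(pod *v1.Pod, meta algorithm.PredicateMetadata, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
	// check if node is under pid pressure
	if nodeInfo.PIDPressureCondition() == v1.ConditionTrue {
		// assumed: ErrNodeUnderPIDPressure mirrors ErrNodeUnderDiskPressure
		return false, []algorithm.PredicateFailureReason{ErrNodeUnderPIDPressure}, nil
	}
	return true, nil, nil
}
```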
Would it be better to have a feature gate checked here? Something like utilfeature.DefaultFeatureGate.Enabled(features.TaintNodesByCondition).
Good point :), we need to remove/disable this predicate if TaintNodesByCondition is enabled.
xref #60398
Done
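A sketch of the gating this thread is discussing, using the helper named in the suggestion; applyPIDPressurePredicate is a hypothetical wrapper, and the actual wiring landed via the xref above:

```go
// applyPIDPressurePredicate (hypothetical) registers the predicate only while
// TaintNodesByCondition is disabled; once nodes are tainted by condition,
// the toleration mechanism takes over (see #60008).
func applyPIDPressurePredicate() {
	if !utilfeature.DefaultFeatureGate.Enabled(features.TaintNodesByCondition) {
		factory.RegisterFitPredicate(predicates.CheckNodePIDPressurePred,
			predicates.CheckNodePIDPressurePredicate)
	}
}
```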
@@ -160,6 +160,9 @@ func defaultPredicates() sets.String {
		// Fit is determined by node disk pressure condition.
		factory.RegisterFitPredicate(predicates.CheckNodeDiskPressurePred, predicates.CheckNodeDiskPressurePredicate),

		// Fit is determined by node pid pressure condition.
		factory.RegisterFitPredicate(predicates.CheckNodeDiskPressurePred, predicates.CheckNodePIDPressurePredicate),
s/CheckNodeDiskPressurePred/CheckNodePIDPressurePred
Done
/retest
@kubernetes/sig-scheduling-pr-reviews
Can you add a unit or integration test please?
@@ -1504,6 +1506,16 @@ func CheckNodeDiskPressurePredicate(pod *v1.Pod, meta algorithm.PredicateMetadat
	return true, nil, nil
}

// CheckNodePIDPressurePredicate checks if a pod can be scheduled on a node
// reporting pid pressure condition.
func CheckNodePIDPressurePredicate(pod *v1.Pod, meta algorithm.PredicateMetadata, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
Why didn't you add this check to CheckNodeConditionPredicate instead of this function?
makes sense :)
@bsalamat CheckNodeMemoryPressurePred and CheckNodeDiskPressurePred are also node conditions; why do we separate these into two predicates?
@wanghaoran1988 @k82cn I see a benefit to having separate predicates. Users can enable or disable individual predicates in scheduler policy config. So, a user can disable CheckNodeMemoryPressurePred and/or CheckNodeDiskPressurePred in their policy config while keeping CheckNodeConditionPredicate enabled. For the same reason it may make sense to keep CheckNodePIDPressurePredicate as a separate predicate as well.
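For illustration, a scheduler policy file along those lines might look like the sketch below (hypothetical config; predicate names as defined in this PR). Listing predicates explicitly lets a user keep the general node-condition check while enabling only the pressure-specific checks they want:

```json
{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "CheckNodeCondition"},
    {"name": "CheckNodePIDPressure"},
    {"name": "PodFitsResources"}
  ]
}
```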
Yes, separate predicates give users more options :).
Also, to me, we might want to use a taint instead and let the toleration mechanism do the work.
Handled by #60008.
@@ -86,6 +86,8 @@ const (
	CheckNodeMemoryPressurePred = "CheckNodeMemoryPressure"
	// CheckNodeDiskPressurePred defines the name of predicate CheckNodeDiskPressure.
	CheckNodeDiskPressurePred = "CheckNodeDiskPressure"
	// CheckNodePIDPressurePred defines the name of predicate CheckNodePIDPressure.
	CheckNodePIDPressurePred = "CheckNodePIDPressure"
If we keep this as a predicate, we might want to add it to the predicatesOrdering list; see predicates.go.
ok
done
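i.e., roughly the following shape in predicates.go; a sketch, with neighbouring entries abbreviated:

```go
// Static order in which predicates are evaluated; the new PID-pressure
// predicate sits with the other node-condition checks (entries abbreviated).
var predicatesOrdering = []string{
	CheckNodeConditionPred,
	// ... general, volume, and affinity predicates ...
	CheckNodeMemoryPressurePred,
	CheckNodeDiskPressurePred,
	CheckNodePIDPressurePred, // newly added
	MatchInterPodAffinityPred,
}
```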
@@ -1504,6 +1506,16 @@ func CheckNodeDiskPressurePredicate(pod *v1.Pod, meta algorithm.PredicateMetadat
	return true, nil, nil
}

// CheckNodePIDPressurePredicate checks if a pod can be scheduled on a node
// reporting pid pressure condition.
func CheckNodePIDPressurePredicate(pod *v1.Pod, meta algorithm.PredicateMetadata, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
Just left a comment about the equivalence class cache. Thanks.
@@ -160,6 +160,9 @@ func defaultPredicates() sets.String {
		// Fit is determined by node disk pressure condition.
		factory.RegisterFitPredicate(predicates.CheckNodeDiskPressurePred, predicates.CheckNodeDiskPressurePredicate),

		// Fit is determined by node pid pressure condition.
		factory.RegisterFitPredicate(predicates.CheckNodePIDPressurePred, predicates.CheckNodePIDPressurePredicate),
We should handle equiv cache invalidation properly when the node condition changes. See: https://github.com/kubernetes/kubernetes/blob/master/pkg/scheduler/factory/factory.go#L822-L827
got that.
done
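Concretely, the node-update path referenced above invalidates cached predicate results when conditions change; a sketch, assuming the invalidPredicates set from the linked factory.go code:

```go
// When node conditions change, cached results of every condition-based
// predicate become stale and must be dropped from the equivalence cache.
if !reflect.DeepEqual(oldNode.Status.Conditions, newNode.Status.Conditions) {
	invalidPredicates.Insert(
		predicates.CheckNodeConditionPred,
		predicates.CheckNodeMemoryPressurePred,
		predicates.CheckNodeDiskPressurePred,
		predicates.CheckNodePIDPressurePred) // added in this PR
}
```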
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Added an integration test for the predicate, PTAL :)
// TestNodePIDPressue verifies that scheduler's CheckNodePIDPressurePredicate predicate
// functions works correctly.
func TestNodePIDPressue(t *testing.T) {
s/TestNodePIDPressue/TestNodePIDPressure/
Please fix the comment as well.
done
		pod.Namespace, pod.Name, targetNode)
	}

	err = cs.CoreV1().Pods(context.ns.Name).Delete(testPod.Name, metav1.NewDeleteOptions(0))
Deleting the pods is not necessary in this test. You can leave it to the cleanupTest to take care of pod cleanup.
The cleanupTest only cleans up nodes :)
	targetNode := nodes[1].Name

	// Creates test pod.
	testPod := &v1.Pod{
In order to ensure that the pod is not randomly assigned to the second node, I would add a nodeSelector to the spec to choose the first node, and would check that the pod is marked unschedulable.
Good point :)
done
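A sketch of what the suggested change looks like in the test; the label key and image are placeholders:

```go
// Pin the pod to the PID-pressured node so the predicate, not random
// placement, determines the outcome; then assert the pod stays unschedulable.
testPod := &v1.Pod{
	ObjectMeta: metav1.ObjectMeta{Name: "pidpressure-fake-name"},
	Spec: v1.PodSpec{
		NodeSelector: map[string]string{"kubernetes.io/hostname": nodes[0].Name},
		Containers: []v1.Container{
			{Name: "container", Image: "k8s.gcr.io/pause:3.1"}, // placeholder image
		},
	},
}
```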
/retest
		t.Fatalf("Test Failed: error: %v, while creating pod", err)
	}

	err = wait.Poll(pollInterval, wait.ForeverTestTimeout, podUnschedulable(cs, testPod.Namespace, testPod.Name))
You can use waitForPodUnschedulable instead.
done
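i.e., something like the following, assuming the waitForPodUnschedulable(cs, pod) helper from the integration test utilities:

```go
// Wait for the scheduler to mark the pod unschedulable via the shared helper.
if err := waitForPodUnschedulable(cs, testPod); err != nil {
	t.Errorf("Test Failed: error %v while waiting for pod to become unschedulable", err)
}
```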
		t.Errorf("Test Failed: error, %v, while waiting for scheduled", err)
	}

	err = cs.CoreV1().Pods(context.ns.Name).Delete(testPod.Name, metav1.NewDeleteOptions(0))
Optional: you don't need to delete the pod; cleanupTest deletes it anyway.
Checked cleanupTest: it only deletes Nodes, and DeleteTestingNamespace is empty. Did I miss anything?
Ah, you are right. You can use cleanupPods to delete pods though.
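For example, assuming the cleanupPods(cs, t, pods) helper from the same test utilities:

```go
// Remove test pods at the end of the test via the shared helper.
defer cleanupPods(cs, t, []*v1.Pod{testPod})
```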
Should we update the node controller (in a separate PR) to update Node taints when the node is under PID pressure?
One nit and then LGTM.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
Handled at #60008 :)
/lgtm
Thanks, @k82cn!
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: bsalamat, k82cn. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Automatic merge from submit-queue (batch tested with PRs 59367, 60007). If you want to cherry-pick this change to another branch, please follow the instructions here.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): part of #54313

Release note: