[YUNIKORN-1526] support K8s pod overhead #520
Conversation
Codecov Report
@@ Coverage Diff @@
## master #520 +/- ##
==========================================
+ Coverage 69.41% 69.59% +0.17%
==========================================
Files 45 45
Lines 7714 7715 +1
==========================================
+ Hits 5355 5369 +14
+ Misses 2162 2151 -11
+ Partials 197 195 -2
pkg/common/resource_test.go
Outdated
@@ -191,6 +191,19 @@ func TestParsePodResource(t *testing.T) {
	assert.Equal(t, res.Resources[siCommon.CPU].GetValue(), int64(3000))
	assert.Equal(t, res.Resources["nvidia.com/gpu"].GetValue(), int64(5))

	// Add pod OverHead, only support CPU and Memory
Hi @zhuqi-lucas, the getResource function parses resources including cpu, memory, gpu and others.
Should we add a filtering function for specific resource types?
Or add a UT case where the overhead contains different types of resources?
Nothing in the KEP says that there is a limit on the type of resource that can be added. That means we need to follow the second option and extend the UT
Thank you @wilfred-s and @0yukali0 for the review; sure, addressed in the latest PR.
pkg/common/resource.go
Outdated
podOverHeadResource := getResource(pod.Spec.Overhead)
podResource = Add(podResource, podOverHeadResource)
// Logging the overall pod size and pod overhead
log.Logger().Debug("We have calculated the overall pod size which includes the pod overhead",
Text in the message says "we": who is "we"?
Make it simple and factual, and include a pod reference in the message, as it is otherwise not traceable to anything:
log.Logger().Debug("Pod overhead specified, overall pod size adjusted",
zap.String("taskID", pod.UID),
zap.String("Pod overall size", podResource.String()),
zap.String("Pod overhead size", podOverHeadResource.String()))
Good suggestion!
Addressed in latest PR.
Minor nit.
+1 LGTM.
What is this PR for?
With K8s 1.24 a pod can now provide an overhead in the pod spec: pod.Spec.Overhead. The pod overhead allows specifying an overhead based on the runtime set on the pod. For certain runtimes this overhead can be large. See this KEP for details.
The scheduler should take into account this overhead if set on a pod. We currently calculate the size of the pod based on the containers but do not take the overhead into account. That needs to change.
We need to take the overhead into account as part of scheduling and quota calculations: include the pod.Spec.Overhead resources in the size of the pod before sending it to the core. Overhead is added to the requests. The plugin framework (node-related checks) calculates the pod size each time it is called (overhead!) and includes the overhead. The callback for the predicates must take that into account and not be broken by that implementation.
Clearly logging how the overall pod size was calculated is a requirement.
Note from @wilfred-s :
The field is optional and will not be set on any pod unless the cluster has been set up with it, even if you run 1.24 or later.
It is not a user-managed field; it is set by an admission controller. You cannot set the field on a pod when you create the pod. The field has been part of the official pod spec since 1.16 and was marked as a beta field in 1.18. It only moved to GA in 1.24.
We need to handle it as an optional field. Even on a release that has the feature as GA the field can be nil.
What type of PR is it?
Todos
What is the Jira issue?
[YUNIKORN-2] Gang scheduling interface parameters
How should this be tested?
Screenshots (if appropriate)
Questions: