support node resource reservation #998
Codecov Report
Patch coverage:
Additional details and impacted files
@@ Coverage Diff @@
## main #998 +/- ##
==========================================
- Coverage 67.09% 66.91% -0.19%
==========================================
Files 262 272 +10
Lines 28960 29735 +775
==========================================
+ Hits 19430 19896 +466
- Misses 8173 8428 +255
- Partials 1357 1411 +54
Wow, a long-awaited PR. I have read part of the code and will continue reading it later. BTW, docker/koord-runtimeproxy.docker maybe removed?
// if specific cpus reserved by annotation of node, should remove those cpus.
cpusReservedByNodeAnno, _ := util.GetCPUsReservedByNodeAnno(node.Annotations)
if quantity, ok := result[corev1.ResourceCPU]; ok {
	quantity.Sub(cpusReservedByNodeAnno[corev1.ResourceCPU])
This should apply only when QOSEffected == LSE, since LSR CPUs can be shared with BE.
var lsrCpus []koordletutil.ProcessorInfo
var lsCpus []koordletutil.ProcessorInfo
// FIXME: be pods might be starved since lse pods can run out of all cpus
for _, processor := range nodeCPUInfo.ProcessorInfos {
	cpuset := cpuset.NewCPUSet(int(processor.CPUID))
	if cpuset.IsSubsetOf(reservedByNode) {
This should exclude reservedByNode only when qosEffected == LSE.
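The reviewer's point can be illustrated with a minimal sketch. It is not the koordlet implementation: the `qosEffected` type, its constants, and `candidateCPUs` are hypothetical names, and CPU IDs are plain integers rather than `cpuset.CPUSet` values. The only behavior it shows is that node-reserved CPUs are skipped when the reservation applies to LSE and kept otherwise.

```go
package main

import "fmt"

// qosEffected is a hypothetical stand-in for the QoS scope of a node reservation.
type qosEffected string

const (
	qosLSE qosEffected = "LSE"
	qosLSR qosEffected = "LSR"
)

// candidateCPUs returns the CPU IDs that stay available for sharing.
// Reserved CPUs are excluded only when the reservation applies to LSE;
// for any other scope they remain in the result.
func candidateCPUs(all []int, reserved map[int]bool, effect qosEffected) []int {
	out := make([]int, 0, len(all))
	for _, id := range all {
		if effect == qosLSE && reserved[id] {
			continue // exclusively reserved, never shared
		}
		out = append(out, id)
	}
	return out
}

func main() {
	all := []int{0, 1, 2, 3}
	reserved := map[int]bool{0: true, 1: true}
	fmt.Println(candidateCPUs(all, reserved, qosLSE))
	fmt.Println(candidateCPUs(all, reserved, qosLSR))
}
```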
	cpuReservedByNode, _ = util.GetCPUsReservedByNode(reserved)
}

// suppress(BE) := node.Total * SLOPercent - pod(LS).Used - system.Used - reservedByNodeAnno
Actually, CPUSuppress wants to ensure that nodeCPUUtilPercent <= SLOPercent.
system.Used already includes the usage of non-pod processes,
so the target should be:
for QosEffected == LSE:
suppress(BE) := node.Total * SLOPercent - pod(LS-type).Used - max(system.Used, reservedByNodeAnno)
for QosEffected == LSR:
suppress(BE) := node.Total * SLOPercent - pod(LS-type).Used - system.Used
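As a worked illustration of the two formulas above, here is a small sketch in plain Go. All quantities are milli-CPU integers, `suppressBEMilli` is a hypothetical helper rather than a koordlet function, and clamping the result at zero is an assumption added for the example.

```go
package main

import "fmt"

// suppressBEMilli computes the CPU budget left for BE pods, in milli-CPU.
// For an LSE reservation the non-pod share is max(system.Used, reserved);
// otherwise only system.Used is subtracted.
func suppressBEMilli(nodeTotal, sloPercent, lsUsed, systemUsed, reserved int64, lse bool) int64 {
	nonPod := systemUsed
	if lse && reserved > nonPod {
		nonPod = reserved
	}
	budget := nodeTotal*sloPercent/100 - lsUsed - nonPod
	if budget < 0 {
		budget = 0 // assumption: a negative budget is treated as "no CPU for BE"
	}
	return budget
}

func main() {
	// 16 cores, SLOPercent 65, LS pods use 4 cores, system uses 0.5 core, 2 cores reserved.
	fmt.Println("LSE:", suppressBEMilli(16000, 65, 4000, 500, 2000, true))
	fmt.Println("LSR:", suppressBEMilli(16000, 65, 4000, 500, 2000, false))
}
```

With these inputs the LSE case subtracts the 2000m reservation because it exceeds the measured system usage, while the LSR case subtracts only the 500m of system usage.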
pkg/scheduler/plugins/nodenumaresource/topology_eventhandler.go (outdated, resolved)
pkg/util/node.go (outdated)
}

// RemoveNodeReservedCPUs filter out cpus that reserved by annotation of node.
func RemoveNodeReservedCPUs(cpuSharePools []apiext.CPUSharedPool, cpusReservedByNodeAnno string) []apiext.CPUSharedPool {
I think the function RemoveNodeReservedCPUs may be used only by koordlet. How about moving it to koordlet/util?
There is another place that needs to be modified. The elasticQuota code treats Node.Status.Allocatable as the total amount that Quota can allocate, so it needs to be modified as well.
@@ -133,6 +133,14 @@ func (p *Plugin) Calculate(strategy *extension.ColocationStrategy, node *corev1.
	nodeUsage := getNodeMetricUsage(nodeMetric.Status.NodeMetric)
	systemUsed := quotav1.Max(quotav1.Subtract(nodeUsage, podAllUsed), util.NewZeroResourceList())

	// System.Used = max(System.Used, Node.Anno.Reserved)
	nodeReserved := util.GetNodeReservationFromAnnotation(node.Annotations)
	for name, sysUsedQ := range systemUsed {
How about using quotav1.Max()?
Some checks are missing, e.g. when the returned nodeReserved == nil or node.Annotations == nil (mainly for testing).
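To make the suggestion concrete, here is a stdlib-only sketch of a per-resource maximum that tolerates nil inputs. It is an analog for illustration, not the `quotav1.Max` source: `resourceList` is a hypothetical `map[string]int64` standing in for `corev1.ResourceList`, and the result is assumed to contain the union of both key sets.

```go
package main

import "fmt"

// resourceList is a simplified, hypothetical stand-in for corev1.ResourceList.
type resourceList map[string]int64

// maxResources returns, for every resource name present in a or b,
// the larger of the two values. A nil list behaves like an empty one,
// so callers need no separate nil check.
func maxResources(a, b resourceList) resourceList {
	out := make(resourceList, len(a)+len(b))
	for name, v := range a {
		out[name] = v
	}
	for name, v := range b {
		if cur, ok := out[name]; !ok || v > cur {
			out[name] = v
		}
	}
	return out
}

func main() {
	systemUsed := resourceList{"cpu": 500, "memory": 2048}

	var nodeReserved resourceList // nil, e.g. the node has no reservation annotation
	fmt.Println(maxResources(systemUsed, nodeReserved))

	nodeReserved = resourceList{"cpu": 2000}
	fmt.Println(maxResources(systemUsed, nodeReserved))
}
```

Ranging over a nil map is a no-op in Go, which is what makes the nil case fall out without an explicit branch.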
@@ -225,6 +233,11 @@ func getNodeMetricUsage(info *slov1alpha1.NodeMetricInfo) corev1.ResourceList {
func getNodeAllocatable(node *corev1.Node) corev1.ResourceList {
	result := node.Status.Allocatable.DeepCopy()
	result = quotav1.Mask(result, []corev1.ResourceName{corev1.ResourceCPU, corev1.ResourceMemory})

	// if something reserved by node.annotation, should remove those resource.
The node reserved resources would be subtracted in Calculate(). We should not double-subtract the node reserved resources.
This function is used in Calculate().
Yes, it is used in Calculate(). Please check the formula to see whether nodeAllocatable has already had node.anno.reserved subtracted from it.
Oh, got it.
nodeAllocatable contains node.allocatable[memory] and node.allocatable[cpu]. We haven't modified this, so node.anno.reserved is not double-subtracted.
We've defined the formula batch.alloc := node.alloc * ratio - sum(pod(Prod).usage) - max(sys.usage, node.anno.reserved), where node.anno.reserved is subtracted no more than once. In the current implementation, however, node.anno.reserved is subtracted first in getNodeAllocatable and again in Calculate.
node.alloc = node.alloc - node.anno.reserved
batch.alloc := node.alloc * ratio - sum(pod(Prod).usage) - sys.usage
So should we use this formula to calculate?
That formula is incorrect, as we discussed in #922: sys.usage also includes the usage of node.anno.reserved.
Done.
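The double-subtraction concern in this thread can be checked with a short numeric sketch. It assumes milli-CPU integers and a ratio expressed as a percentage. `batchAllocatable` follows the formula quoted by the reviewer, `doubleSubtracted` models the variant the reviewer objected to, and both function names are hypothetical.

```go
package main

import "fmt"

// maxInt64 returns the larger of two values.
func maxInt64(a, b int64) int64 {
	if a > b {
		return a
	}
	return b
}

// batchAllocatable follows the agreed formula:
// batch.alloc := node.alloc * ratio - sum(pod(Prod).usage) - max(sys.usage, node.anno.reserved)
func batchAllocatable(nodeAlloc, ratioPercent, prodUsage, sysUsage, reserved int64) int64 {
	return nodeAlloc*ratioPercent/100 - prodUsage - maxInt64(sysUsage, reserved)
}

// doubleSubtracted models the flawed variant: the reservation is removed
// from node.alloc first and then subtracted again through the max() term.
func doubleSubtracted(nodeAlloc, ratioPercent, prodUsage, sysUsage, reserved int64) int64 {
	return (nodeAlloc-reserved)*ratioPercent/100 - prodUsage - maxInt64(sysUsage, reserved)
}

func main() {
	// 32 cores allocatable, ratio 60%, Prod pods use 8 cores,
	// system uses 1 core, 2 cores reserved by annotation.
	fmt.Println("single subtraction:", batchAllocatable(32000, 60, 8000, 1000, 2000))
	fmt.Println("double subtraction:", doubleSubtracted(32000, 60, 8000, 1000, 2000))
}
```

In this example the flawed variant under-reports batch resources by reserved * ratio, which is 1200m here.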
	return
}

cpusReservedByAnno, _ := apiext.GetReservedCPUs(topo.Annotations)
How should "numReservedCPUs" be handled?
The noderesourcetopo annotation only stores the specific cores to be reserved. Whether the reservation is given by quantity or by specifying CPUs directly, the exact cores to reserve are calculated and then written to nodetopo.annotation["node.koordinator.sh/reservation"].
In the current implementation, if we set NodeReservation.Resources to a non-zero quantity and leave NodeReservation.ReservedCPUs empty, the resources reserved by quantity are unexpectedly ignored.
OK
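One way to picture the behavior discussed in this thread is a resolver that always ends up with concrete CPU IDs, whether the reservation names CPUs or only a quantity. This is a sketch under stated assumptions, not the PR's code: `resolveReservedCPUs` is hypothetical, the quantity is rounded up to whole cores, and the policy of picking the lowest-numbered available CPUs is chosen only for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// resolveReservedCPUs returns the concrete CPU IDs to reserve.
// Explicitly listed CPUs win; otherwise the quantity (milli-CPU) is
// rounded up to whole cores and taken from the lowest available IDs.
func resolveReservedCPUs(explicit []int, milliCPU int64, available []int) []int {
	if len(explicit) > 0 {
		return append([]int(nil), explicit...)
	}
	if milliCPU <= 0 {
		return nil
	}
	n := int((milliCPU + 999) / 1000) // round up to whole cores
	if n > len(available) {
		n = len(available)
	}
	sorted := append([]int(nil), available...)
	sort.Ints(sorted)
	return sorted[:n]
}

func main() {
	available := []int{3, 0, 2, 1}
	// Quantity only: must not be ignored when no CPUs are listed.
	fmt.Println(resolveReservedCPUs(nil, 1500, available))
	// Explicit CPUs take precedence over the quantity.
	fmt.Println(resolveReservedCPUs([]int{2, 3}, 1500, available))
}
```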
@@ -133,6 +133,14 @@ func (p *Plugin) Calculate(strategy *extension.ColocationStrategy, node *corev1.
	nodeUsage := getNodeMetricUsage(nodeMetric.Status.NodeMetric)
	systemUsed := quotav1.Max(quotav1.Subtract(nodeUsage, podAllUsed), util.NewZeroResourceList())

	// System.Used = max(System.Used, Node.Anno.Reserved)
	nodeReserved := util.GetNodeReservationFromAnnotation(node.Annotations)
To ensure the calculation is correct, please add a UT case for the node resource reservation later.
done
notPodUsed := systemUsedCPU
rl := util.GetNodeReservationFromAnnotation(node.Annotations)
if rl != nil {
	nodeAnnoReserved, _ := rl[corev1.ResourceCPU]
Also use quotav1.Max() here.
done
@@ -138,6 +138,10 @@ func (p *Plugin) Calculate(strategy *extension.ColocationStrategy, node *corev1.
	nodeUsage := getNodeMetricUsage(nodeMetric.Status.NodeMetric)
	systemUsed := quotav1.Max(quotav1.Subtract(nodeUsage, podAllUsed), util.NewZeroResourceList())

	// System.Used = max(System.Used, Node.Anno.Reserved)
	nodeAnnoReserved := util.GetNodeReservationFromAnnotation(node.Annotations)
	systemUsed = quotav1.Max(systemUsed, nodeAnnoReserved)
Could you please add a case to the UT TestPluginCalculate where the node has annotation-reserved resources, to verify the calculation?
done
Signed-off-by: lucming <2876757716@qq.com>
/lgtm
/lgtm
Well done. Thanks for your contributions.
/approve
Nice work!
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: eahydra, FillZpp, saintube, zwzhang0107
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Ⅰ. Describe what this PR does
Support resource reservation via node annotations.
See proposal #922 for details.
Ⅱ. Does this pull request fix one issue?
Ⅲ. Describe how to verify it
Ⅳ. Special notes for reviews
Ⅴ. Checklist
make test