
support node resource reservation #998

Merged
merged 1 commit into koordinator-sh:main on Mar 29, 2023

Conversation

lucming
Contributor

@lucming lucming commented Feb 9, 2023

Ⅰ. Describe what this PR does

Support resource reservation from the node annotation.
See this proposal for details: #922
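
For illustration, a minimal sketch of the annotation this PR consumes follows. The key node.koordinator.sh/reservation and the Resources/ReservedCPUs fields come from the review discussion below; the JSON tags and exact value layout here are assumptions, see #922 for the authoritative definition.

// Hypothetical sketch only: the key "node.koordinator.sh/reservation" and the
// Resources / ReservedCPUs fields appear in the review below; the JSON tags
// and value layout are assumed, not taken from the PR.
package main

import (
	"encoding/json"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

type NodeReservation struct {
	// Resources reserved by quantity, e.g. cpu: 2.
	Resources corev1.ResourceList `json:"resources,omitempty"`
	// ReservedCPUs reserves specific cores, e.g. "0-1".
	ReservedCPUs string `json:"reservedCPUs,omitempty"`
}

func main() {
	r := NodeReservation{
		Resources:    corev1.ResourceList{corev1.ResourceCPU: resource.MustParse("2")},
		ReservedCPUs: "0-1",
	}
	raw, _ := json.Marshal(r)
	// The resulting JSON would be stored under
	// node.Annotations["node.koordinator.sh/reservation"].
	fmt.Println(string(raw))
}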

Ⅱ. Does this pull request fix one issue?

Ⅲ. Describe how to verify it

Ⅳ. Special notes for reviews

V. Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in make test

@codecov

codecov bot commented Feb 9, 2023

Codecov Report

Patch coverage: 86.82% and project coverage change: -0.19% ⚠️

Comparison is base (f8b4e31) 67.09% compared to head (3273a42) 66.91%.

❗ Current head 3273a42 differs from pull request most recent head d0b7c11. Consider uploading reports for the commit d0b7c11 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #998      +/-   ##
==========================================
- Coverage   67.09%   66.91%   -0.19%     
==========================================
  Files         262      272      +10     
  Lines       28960    29735     +775     
==========================================
+ Hits        19430    19896     +466     
- Misses       8173     8428     +255     
- Partials     1357     1411      +54     
Flag Coverage Δ
unittests 66.91% <86.82%> (-0.19%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
pkg/koordlet/resmanager/cpu_suppress.go 73.34% <78.94%> (+3.63%) ⬆️
pkg/util/node.go 86.04% <84.61%> (-2.19%) ⬇️
...xt/sharedlisterext/node_reservation_transformer.go 86.36% <86.36%> (ø)
...dlet/statesinformer/states_noderesourcetopology.go 66.12% <86.95%> (+2.20%) ⬆️
pkg/scheduler/plugins/elasticquota/node_handler.go 71.23% <100.00%> (+4.04%) ⬆️
.../plugins/nodenumaresource/topology_eventhandler.go 57.14% <100.00%> (+1.26%) ⬆️
...oller/noderesource/plugins/batchresource/plugin.go 69.27% <100.00%> (-0.73%) ⬇️

... and 58 files with indirect coverage changes


@eahydra eahydra changed the title from "resource reservation" to "support node resource reservation" on Feb 9, 2023
Member

@eahydra eahydra left a comment

Wow, a long-awaited PR. I have read part of the code and will continue reading it later. BTW, maybe docker/koord-runtimeproxy.docker can be removed?

@lucming lucming force-pushed the static_cpu_allocation2 branch 2 times, most recently from 48316bb to c9bc526, on February 9, 2023 16:06
// if specific cpus reserved by annotation of node, should remove those cpus.
cpusReservedByNodeAnno, _ := util.GetCPUsReservedByNodeAnno(node.Annotations)
if quantity, ok := result[corev1.ResourceCPU]; ok {
quantity.Sub(cpusReservedByNodeAnno[corev1.ResourceCPU])
Contributor

Only for QOSEffected == LSE, since LSR can be shared with BE.

var lsrCpus []koordletutil.ProcessorInfo
var lsCpus []koordletutil.ProcessorInfo
// FIXME: be pods might be starved since lse pods can run out of all cpus
for _, processor := range nodeCPUInfo.ProcessorInfos {
cpuset := cpuset.NewCPUSet(int(processor.CPUID))
if cpuset.IsSubsetOf(reservedByNode) {
Contributor

Should only exclude reservedByNode (qosEffected == LSE).

cpuReservedByNode, _ = util.GetCPUsReservedByNode(reserved)
}

// suppress(BE) := node.Total * SLOPercent - pod(LS).Used - system.Used - reservedByNodeAnno
Contributor

Actually, CPUSuppress wants to ensure that nodeCPUUtilPercent <= SLOPercent.
system.Used already includes the usage of non-pod processes.
So the target should be:

for QosEffected == LSE
suppress(BE) := node.Total * SLOPercent - pod(LS-type).Used - max(system.Used, reservedByNodeAnno)

for QosEffected == LSR
suppress(BE) := node.Total * SLOPercent - pod(LS-type).Used - system.Used
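
A minimal sketch of that target in milli-CPU terms (hypothetical helper, not the PR's code; reservedIsLSE marks whether the node-reserved CPUs are treated as LSE):

// Hypothetical sketch of the suppress target described above, in milli-CPU.
// When the node-reserved CPUs are treated as LSE, the reservation already
// covers non-pod usage, so only the larger of the two values is subtracted.
func cpuSuppressTargetMilli(nodeTotal, sloPercent, lsPodsUsed, systemUsed, reservedByNodeAnno int64, reservedIsLSE bool) int64 {
	notPodUsed := systemUsed
	if reservedIsLSE && reservedByNodeAnno > notPodUsed {
		notPodUsed = reservedByNodeAnno // max(system.Used, reservedByNodeAnno)
	}
	return nodeTotal*sloPercent/100 - lsPodsUsed - notPodUsed
}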

@lucming lucming force-pushed the static_cpu_allocation2 branch 3 times, most recently from 9a62955 to 50ccbe9, on February 14, 2023 06:19
pkg/util/node.go (outdated)
}

// RemoveNodeReservedCPUs filter out cpus that reserved by annotation of node.
func RemoveNodeReservedCPUs(cpuSharePools []apiext.CPUSharedPool, cpusReservedByNodeAnno string) []apiext.CPUSharedPool {
Member

I think the function RemoveNodeReservedCPUs may only be used by koordlet; how about moving it to koordlet/util?

@eahydra
Member

eahydra commented Feb 23, 2023

There is another special place that needs to be modified: the elasticQuota-related code treats Node.Status.Allocatable as the total amount that can be allocated by the quota, so it needs to be adjusted here as well.

@@ -133,6 +133,14 @@ func (p *Plugin) Calculate(strategy *extension.ColocationStrategy, node *corev1.
nodeUsage := getNodeMetricUsage(nodeMetric.Status.NodeMetric)
systemUsed := quotav1.Max(quotav1.Subtract(nodeUsage, podAllUsed), util.NewZeroResourceList())

// System.Used = max(System.Used, Node.Anno.Reserved)
nodeReserved := util.GetNodeReservationFromAnnotation(node.Annotations)
for name, sysUsedQ := range systemUsed {
Member

How about using quotav1.Max()?
Some checks are missing, e.g. whether the returned nodeReserved == nil and whether node.Annotations == nil (mainly for testing).

@@ -225,6 +233,11 @@ func getNodeMetricUsage(info *slov1alpha1.NodeMetricInfo) corev1.ResourceList {
func getNodeAllocatable(node *corev1.Node) corev1.ResourceList {
result := node.Status.Allocatable.DeepCopy()
result = quotav1.Mask(result, []corev1.ResourceName{corev1.ResourceCPU, corev1.ResourceMemory})

// if something reserved by node.annotation, should remove those resource.
Member

The node reserved resources are already subtracted in Calculate(). We should not double-subtract the node reservation.

Contributor Author

this function is used in Calculate()

Member

Yes, it is used in Calculate. Please check whether, in the formula, nodeAllocatable has already had node.anno.reserved subtracted.

Contributor Author

Oh, got it.

Contributor Author

nodeAllocatable contains node.allocatable[memory] and node.allocatable[cpu]; we haven't modified those, so node.anno.reserved is not double-subtracted.

Member

@saintube saintube Mar 24, 2023

We've defined the formula batch.alloc := node.alloc * ratio - sum(pod(Prod).usage) - max(sys.usage, node.anno.reserved), where node.anno.reserved should be subtracted no more than once. However, in the current implementation we subtract node.anno.reserved first in getNodeAllocatable and again in Calculate.

Contributor Author

node.alloc = node.alloc - node.anno.reserved
batch.alloc := node.alloc * ratio - sum(pod(Prod).usage) - sys.usage
So should we use this formula to calculate?

Member

node.alloc = node.alloc - node.anno.reserved batch.alloc := node.alloc * ratio - sum(pod(Prod).usage) - sys.usage so use this formula to calculate?

This formula is incorrect as we discussed in #922, where sys.usage also includes the usage of node.anno.reserved.

Contributor Author

done
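
To make the agreed single-subtraction concrete, here is a worked example with illustrative milli-CPU numbers (a hedged sketch, not the PR's code):

// Worked example (illustrative milli-CPU numbers, not from the PR) showing why
// node.anno.reserved must be subtracted at most once:
//   agreed formula:  node.alloc*ratio - sum(pod(Prod).usage) - max(sys.usage, node.anno.reserved)
//   double-subtract: (node.alloc-reserved)*ratio - sum(pod(Prod).usage) - max(sys.usage, reserved)
package main

import "fmt"

func main() {
	nodeAlloc, ratioPercent := int64(16000), int64(65)
	prodPodsUsed, sysUsed, reserved := int64(4000), int64(1000), int64(2000)

	notPodUsed := sysUsed
	if reserved > notPodUsed {
		notPodUsed = reserved // max(sys.usage, node.anno.reserved)
	}
	agreed := nodeAlloc*ratioPercent/100 - prodPodsUsed - notPodUsed
	doubleSubtracted := (nodeAlloc-reserved)*ratioPercent/100 - prodPodsUsed - notPodUsed
	fmt.Println(agreed, doubleSubtracted) // 4400 3100: double subtraction is too pessimistic
}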

return
}

cpusReservedByAnno, _ := apiext.GetReservedCPUs(topo.Annotations)
Member

How is the "numReservedCPUs" handled?

Contributor Author

noderesourcetopo.annotation will only store the specific cores to be reserved. Whether the reservation is given as a quantity or by specifying the CPUs directly, the exact cores to reserve are calculated and then written to nodetopo.annotation["node.koordinator.sh/reservation"].
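
As a hedged illustration of that flow (the helper and selection policy below are assumptions, not the PR's code):

// Hypothetical sketch: whether the reservation is given as explicit cores or as
// a quantity, koordlet resolves it to concrete core IDs before writing them to
// the NodeResourceTopology annotation "node.koordinator.sh/reservation".
func resolveReservedCPUIDs(explicitCPUs []int, numReservedCPUs int, allCPUs []int) []int {
	if len(explicitCPUs) > 0 {
		return explicitCPUs // explicitly reserved cores take precedence
	}
	if numReservedCPUs > len(allCPUs) {
		numReservedCPUs = len(allCPUs)
	}
	return allCPUs[:numReservedCPUs] // which cores get picked is an assumed policy
}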

Member

In the current implementation, if we set NodeReservation.Resources to a non-zero quantity and leave NodeReservation.ReservedCPUs empty, the resources reserved by quantity are unexpectedly ignored.

Member

OK

@@ -133,6 +133,14 @@ func (p *Plugin) Calculate(strategy *extension.ColocationStrategy, node *corev1.
nodeUsage := getNodeMetricUsage(nodeMetric.Status.NodeMetric)
systemUsed := quotav1.Max(quotav1.Subtract(nodeUsage, podAllUsed), util.NewZeroResourceList())

// System.Used = max(System.Used, Node.Anno.Reserved)
nodeReserved := util.GetNodeReservationFromAnnotation(node.Annotations)
Member

To ensure the calculation is correct, please add a UT case for the node resource reservation later.

Contributor Author

done

notPodUsed := systemUsedCPU
rl := util.GetNodeReservationFromAnnotation(node.Annotations)
if rl != nil {
nodeAnnoReserved, _ := rl[corev1.ResourceCPU]
Member

also use quotav1.Max()

Contributor Author

done

@@ -138,6 +138,10 @@ func (p *Plugin) Calculate(strategy *extension.ColocationStrategy, node *corev1.
nodeUsage := getNodeMetricUsage(nodeMetric.Status.NodeMetric)
systemUsed := quotav1.Max(quotav1.Subtract(nodeUsage, podAllUsed), util.NewZeroResourceList())

// System.Used = max(System.Used, Node.Anno.Reserved)
nodeAnnoReserved := util.GetNodeReservationFromAnnotation(node.Annotations)
systemUsed = quotav1.Max(systemUsed, nodeAnnoReserved)
Member

Could you please add a case to the UT TestPluginCalculate where the node has reserved resources in its annotation, to verify the calculation?

Contributor Author

done

Signed-off-by: lucming <2876757716@qq.com>
@zwzhang0107
Contributor

/lgtm

Member

@saintube saintube left a comment

/lgtm
Well done. Thanks for your contributions.

@zwzhang0107
Contributor

/approve

Member

@FillZpp FillZpp left a comment

Nice work!

@koordinator-bot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: eahydra, FillZpp, saintube, zwzhang0107

@koordinator-bot koordinator-bot bot merged commit 76604d6 into koordinator-sh:main Mar 29, 2023
9 checks passed