fix: vgpu metrics not update when pod deleted #3614

Open
wants to merge 1 commit into
base: master

Conversation

googs1025
Member

fix: #3605

When a pod is deleted, the vGPU metrics need to be updated again so that they reflect the released resources.

// SubResource frees the GPU resources held by the pod and refreshes the vGPU metrics.
func (gs *GPUDevices) SubResource(pod *v1.Pod) {
	// Guard once up front: a nil receiver has no devices to update,
	// and calling GetStatus on it below could dereference nil.
	if gs == nil {
		return
	}
	ids, ok := pod.Annotations[AssignedIDsAnnotations]
	if !ok {
		return
	}
	podDev := decodePodDevices(ids)
	for _, val := range podDev {
		for _, deviceused := range val {
			for index, gsdevice := range gs.Device {
				if gsdevice.UUID == deviceused.UUID {
					klog.V(4).Infoln("VGPU subtracting pod", pod.Name, "device", deviceused)
					gs.Device[index].UsedMem -= uint(deviceused.Usedmem)
					gs.Device[index].UsedNum--
					gs.Device[index].UsedCore -= uint(deviceused.Usedcores)
				}
			}
		}
	}
	// Update the metrics so they reflect the released resources.
	gs.GetStatus()
}

@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign k82cn
You can assign the PR to them by writing /assign @k82cn in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jul 19, 2024
@googs1025
Member Author

/assign @archlitchi @Monokaix

@googs1025
Member Author

I haven't tested it yet; I will do so in the next few days.

@googs1025
Member Author

/kind bug

@volcano-sh-bot volcano-sh-bot added the kind/bug Categorizes issue or PR as related to a bug. label Jul 19, 2024
@Monokaix
Member

Monokaix commented Jul 20, 2024

gs.GetStatus() has get semantics but actually performs an update operation. I think renaming it to updateStatus would be better. What do you think? @archlitchi

@googs1025
Member Author

googs1025 commented Jul 20, 2024

gs.GetStatus() has get semantics but actually performs an update operation. I think renaming it to updateStatus would be better. What do you think? @archlitchi

+1
IMO, this method name is indeed very misleading. Should it be renamed?
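
One possible shape for such a rename, sketched under assumptions: the `gpuDevices` type, its fields, and the `status` accessor are hypothetical and are not taken from the Volcano code base. The sketch only shows the naming split being discussed, where the mutating method is called `updateStatus` and any read-only accessor returns a copy.

```go
package main

import "fmt"

// gpuDevices is a hypothetical, simplified stand-in for GPUDevices.
type gpuDevices struct {
	usedMem  map[string]uint // live counters, keyed by device UUID
	exported map[string]uint // last values pushed to the metrics layer
}

// updateStatus mutates state: it overwrites the exported values with the live counters.
func (g *gpuDevices) updateStatus() {
	for uuid, mem := range g.usedMem {
		g.exported[uuid] = mem
	}
}

// status only reads: it returns a copy of the exported values,
// so callers cannot change the internal map by accident.
func (g *gpuDevices) status() map[string]uint {
	out := make(map[string]uint, len(g.exported))
	for uuid, mem := range g.exported {
		out[uuid] = mem
	}
	return out
}

func main() {
	g := &gpuDevices{
		usedMem:  map[string]uint{"GPU-0": 512},
		exported: map[string]uint{},
	}
	fmt.Println("exported entries before update:", len(g.status()))
	g.updateStatus()
	fmt.Println("exported GPU-0 after update:", g.status()["GPU-0"])
}
```

With this split, a call site such as the end of SubResource would read as an explicit update rather than a getter whose result is discarded.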

Signed-off-by: googs1025 <googs1025@gmail.com>
@googs1025
Member Author

@archlitchi /PTAL

Labels
kind/bug Categorizes issue or PR as related to a bug. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

volcano vgpu metrics not update properly
4 participants