Need to assign specific gpu or vgpu? #56

Closed
jiangxiaobin96 opened this issue Sep 15, 2022 · 4 comments

@jiangxiaobin96

How do I select which GPU or vGPU gets allocated?

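// Walk the device IDs passed in the allocation request.
for _, str := range req.DevicesIDs {
	// Read the vGPU type name for this device; skip it if the read fails
	// or the type is not the one this plugin instance serves.
	vGpuID, err := readVgpuIDFromFile(vGpuBasePath, str, "mdev_type/name")
	if err != nil || vGpuID != dpi.deviceName {
		log.Println("Could not get vGPU type identifier for device ", str)
		continue
	}

	// Collect matching device IDs under one key per vGPU type.
	key := fmt.Sprintf("%s_%s", vgpuPrefix, dpi.deviceName)
	if _, exists := envList[key]; !exists {
		envList[key] = []string{}
	}
	envList[key] = append(envList[key], str)
}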

It seems that the request YAML assigns a specific vGPU ID (`str` in the code), and the device plugin then uses `str` to find that specific vGPU in its list and allocate it.
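To make the loop above easier to reason about, here is a minimal, self-contained sketch of the same filtering step. One point worth noting: in the Kubernetes device plugin model, the IDs in `req.DevicesIDs` are picked by the kubelet from the devices the plugin advertised, while the workload only names a resource and a count. The `lookupType` callback is a hypothetical stand-in for `readVgpuIDFromFile`, and the device IDs and the `VGPU` prefix are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnknownDevice is a hypothetical error returned when a device type cannot be read.
var errUnknownDevice = errors.New("unknown device")

// groupByType keeps only the device IDs whose vGPU type equals deviceName
// and collects them under a single "<prefix>_<deviceName>" key.
// lookupType stands in for readVgpuIDFromFile (assumption, not the plugin's API).
func groupByType(deviceIDs []string, deviceName, prefix string,
	lookupType func(id string) (string, error)) map[string][]string {
	envList := map[string][]string{}
	key := fmt.Sprintf("%s_%s", prefix, deviceName)
	for _, id := range deviceIDs {
		vGpuType, err := lookupType(id)
		if err != nil || vGpuType != deviceName {
			continue // unreadable, or a different vGPU type: skip it
		}
		envList[key] = append(envList[key], id)
	}
	return envList
}

func main() {
	// Hypothetical device IDs mapped to their vGPU type names.
	types := map[string]string{
		"uuid-a": "GeForce_RTX_T10-8",
		"uuid-b": "GeForce_RTX_T10x-2",
		"uuid-c": "GeForce_RTX_T10-8",
	}
	lookup := func(id string) (string, error) {
		t, ok := types[id]
		if !ok {
			return "", errUnknownDevice
		}
		return t, nil
	}

	ids := []string{"uuid-a", "uuid-b", "uuid-c", "uuid-missing"}
	fmt.Println(groupByType(ids, "GeForce_RTX_T10-8", "VGPU", lookup))
	// prints: map[VGPU_GeForce_RTX_T10-8:[uuid-a uuid-c]]
}
```

The function only filters and groups IDs it is handed. It has no say over which IDs arrive, which is why the selection question has to be answered at the scheduling level.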

@rthallisey
Collaborator

Are you trying to assign workloads to a specific physical gpu? Or is your question about how to assign a vgpu a profile and expose it on a Kubernetes Node?

@jiangxiaobin96
Author

If a Kubernetes node has multiple GPUs or vGPUs, how do I select which GPU or vGPU to use?

@rthallisey
Collaborator

Let's say I have these gpus and vgpus exposed on my node:

    nvidia.com/1e37: "2"
    nvidia.com/GeForce_RTX_T10-8: "2"
    nvidia.com/GeForce_RTX_T10x-2: "8"
    nvidia.com/GeForce_RTX_T10x-4: "4"

I can't say, "I want gpu 1e37 with PCI ID 01:00.00" or "I want vgpu GeForce_RTX_T10-8 on the gpu with PCI ID 01:00.00".
I can say, "I want a 1e37 gpu" or "I want a GeForce_RTX_T10-8" or "I want a GeForce_RTX_T10-8 and a 1e37".

There isn't an exposed interface to select gpus by PCI ID. However, you can control where workloads are scheduled in k8s using labels (https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/), which is what you should use if you need that level of control.
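As a rough illustration of that granularity, the sketch below models a node as a set of labels plus a count per resource name, and checks whether a request fits. It is a toy model, not the Kubernetes scheduler: the `node` type, the `fits` function, and the `gpu-pool` label are all hypothetical. In a real manifest the two inputs would correspond to a `nodeSelector` and to resource limits such as `nvidia.com/1e37: 1`.

```go
package main

import "fmt"

// node is a hypothetical, simplified view of a Kubernetes node:
// its labels and how many devices of each resource name it exposes.
type node struct {
	labels      map[string]string
	allocatable map[string]int
}

// fits reports whether the node carries every selector label and has at
// least the requested count of every resource name. Nothing in the request
// can refer to an individual device, only to a name and a quantity.
func fits(n node, selector map[string]string, requests map[string]int) bool {
	for k, want := range selector {
		if got, ok := n.labels[k]; !ok || got != want {
			return false
		}
	}
	for name, count := range requests {
		if n.allocatable[name] < count {
			return false
		}
	}
	return true
}

func main() {
	n := node{
		labels: map[string]string{"gpu-pool": "a"}, // hypothetical label
		allocatable: map[string]int{
			"nvidia.com/1e37":               2,
			"nvidia.com/GeForce_RTX_T10-8":  2,
			"nvidia.com/GeForce_RTX_T10x-2": 8,
			"nvidia.com/GeForce_RTX_T10x-4": 4,
		},
	}

	// "I want a GeForce_RTX_T10-8 and a 1e37."
	both := map[string]int{"nvidia.com/GeForce_RTX_T10-8": 1, "nvidia.com/1e37": 1}
	fmt.Println(fits(n, nil, both)) // true

	// Asking for more devices of a type than the node exposes does not fit.
	fmt.Println(fits(n, nil, map[string]int{"nvidia.com/1e37": 3})) // false

	// Labels narrow down which node is eligible, not which device is used.
	fmt.Println(fits(n, map[string]string{"gpu-pool": "b"}, both)) // false
}
```

If each node were labeled so that it exposes only the hardware of interest, a label selector plus a resource name would be the closest approximation to picking a particular device.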

@rthallisey
Collaborator

Closing. Please reopen if you are still having an issue.
