[feat]: enable multiple GPU UUID #361

Closed
JasonHe-WQ opened this issue Jun 19, 2024 · 1 comment

JasonHe-WQ commented Jun 19, 2024

1. Issue or feature description

When a Pod requests several GPUs and specific GPUs must be allocated to it, the current annotation may be insufficient because it accepts only a single UUID. This feature aims to allow a Pod to be scheduled with a list of GPU UUIDs supplied in its annotation.

For example:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
  annotations:
    nvidia.com/use-gpuuuid: "GPU-123456,GPU-ABCDEFG"
spec:
  containers:
    - name: ubuntu-container
      image: ubuntu:18.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          nvidia.com/gpu: 2 # requesting 2 vGPUs
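
As a rough illustration of how the comma-separated value in the annotation above could be read, here is a minimal Go sketch. The helper name parseUUIDList is hypothetical and is not taken from the project's code; only the annotation key comes from the example.

```go
package main

import (
	"fmt"
	"strings"
)

// useUUIDAnnotation is the annotation key used in the example Pod.
const useUUIDAnnotation = "nvidia.com/use-gpuuuid"

// parseUUIDList is a hypothetical helper: it splits the comma-separated
// annotation value into UUIDs, trimming spaces and dropping empty entries.
// It returns nil when the annotation is absent.
func parseUUIDList(annos map[string]string) []string {
	raw, ok := annos[useUUIDAnnotation]
	if !ok {
		return nil
	}
	var uuids []string
	for _, part := range strings.Split(raw, ",") {
		if id := strings.TrimSpace(part); id != "" {
			uuids = append(uuids, id)
		}
	}
	return uuids
}

func main() {
	annos := map[string]string{useUUIDAnnotation: "GPU-123456,GPU-ABCDEFG"}
	fmt.Println(parseUUIDList(annos)) // prints [GPU-123456 GPU-ABCDEFG]
}
```

Trimming whitespace and skipping empty entries is a design choice of this sketch, so that a value such as "GPU-1, GPU-2," still yields two UUIDs.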

2. Methodology

Several cases need to be considered:

  • Multiple GPUs requested, with a matching number of UUIDs
    • Find the node that holds exactly those GPUs, or mark the Pod as unschedulable
  • Multiple GPUs requested, with fewer UUIDs than GPUs
    • Mark the Pod as unschedulable
  • Multiple GPUs requested, with more UUIDs than GPUs
    • Choose any node that satisfies the request
  • Single GPU
    • No change from the current behavior
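
The cases above can be sketched as a comparison between the requested GPU count and the number of listed UUIDs. This is only an illustration under assumptions: the names Outcome, classify, and nodeFits are hypothetical, a Pod without the annotation is treated as unrestricted, a single-GPU request is not special-cased, and per-device memory or core capacity is ignored.

```go
package main

import "fmt"

// Outcome is a hypothetical label for the scheduling cases in the list.
type Outcome string

const (
	Unconstrained Outcome = "no UUID restriction"
	NeedAllListed Outcome = "node must hold every listed GPU"
	NeedAnySubset Outcome = "node must hold enough of the listed GPUs"
	Unschedulable Outcome = "unschedulable"
)

// classify compares the number of requested GPUs with the number of UUIDs.
func classify(requested int, uuids []string) Outcome {
	switch {
	case len(uuids) == 0:
		return Unconstrained // assumption: no annotation means no restriction
	case len(uuids) < requested:
		return Unschedulable
	case len(uuids) == requested:
		return NeedAllListed
	default:
		return NeedAnySubset
	}
}

// nodeFits reports whether a node exposes at least `requested` of the
// listed GPUs. With an empty list it only checks the device count.
func nodeFits(nodeUUIDs, uuids []string, requested int) bool {
	if len(uuids) == 0 {
		return len(nodeUUIDs) >= requested
	}
	allowed := make(map[string]bool, len(uuids))
	for _, id := range uuids {
		allowed[id] = true
	}
	matches := 0
	for _, id := range nodeUUIDs {
		if allowed[id] {
			matches++
		}
	}
	return matches >= requested
}

func main() {
	listed := []string{"GPU-123456", "GPU-ABCDEFG"}
	fmt.Println(classify(2, listed)) // node must hold every listed GPU
	fmt.Println(classify(3, listed)) // unschedulable
	fmt.Println(classify(1, listed)) // node must hold enough of the listed GPUs
	fmt.Println(nodeFits([]string{"GPU-123456", "GPU-XYZ"}, listed, 2)) // false
}
```

Note that a single rule (count the listed GPUs present on the node and compare with the request) covers all three multi-GPU cases: with fewer UUIDs than GPUs the count can never reach the request, so no node fits.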

To enable this feature, the function fitInCertainDevice will be changed:

func fitInCertainDevice(node *NodeUsage, request util.ContainerDeviceRequest, annos map[string]string, pod *corev1.Pod) (bool, map[string]util.ContainerDevices) {

  • Use map[DeviceType][]*DeviceListScore instead of the original flat DeviceUsageList
  • Move the calls to checkType and checkUUID out of the loop
  • Modify the function checkUUID
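
One possible reading of these changes is to filter the candidate devices by UUID once, group the survivors by type, and only then run the allocation loop. The sketch below illustrates that idea with simplified, hypothetical types (Device, uuidSet, uuidAllowed, groupAllowed); it does not reproduce the project's DeviceListScore, checkType, or checkUUID, whose actual signatures are not shown in this issue.

```go
package main

import "fmt"

// Device is a simplified, hypothetical stand-in for the scheduler's
// per-device usage record.
type Device struct {
	ID   string
	Type string
}

// uuidSet builds the lookup set once, before any per-device loop runs.
func uuidSet(uuids []string) map[string]struct{} {
	set := make(map[string]struct{}, len(uuids))
	for _, id := range uuids {
		set[id] = struct{}{}
	}
	return set
}

// uuidAllowed reports whether d may be used; an empty set means
// no restriction (an assumption of this sketch).
func uuidAllowed(allowed map[string]struct{}, d Device) bool {
	if len(allowed) == 0 {
		return true
	}
	_, ok := allowed[d.ID]
	return ok
}

// groupAllowed filters devices by UUID and groups the survivors by type,
// so the later allocation loop only sees eligible devices.
func groupAllowed(devs []Device, allowed map[string]struct{}) map[string][]Device {
	byType := make(map[string][]Device)
	for _, d := range devs {
		if uuidAllowed(allowed, d) {
			byType[d.Type] = append(byType[d.Type], d)
		}
	}
	return byType
}

func main() {
	devs := []Device{
		{ID: "GPU-123456", Type: "NVIDIA"},
		{ID: "GPU-ABCDEFG", Type: "NVIDIA"},
		{ID: "GPU-OTHER", Type: "NVIDIA"},
	}
	groups := groupAllowed(devs, uuidSet([]string{"GPU-123456", "GPU-ABCDEFG"}))
	fmt.Println(len(groups["NVIDIA"])) // 2
}
```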

3. Information to [attach]

Design draft in Chinese: link

@JasonHe-WQ

Already supported
