Hello, I am elihe from Zhihu. I read your article on Zhihu, and after reading the code I have a question:
Why is each slave pod bound to only one GPU in the GetAvailableGPU method of pkg/util/gpu/allocator/allocator.go?
As I see it, in a large-scale cluster this adds load on the master node (many more pod creation requests). Worse, creating multiple single-card pods lets two competing GPU mount requests both fail: with 4 available GPUs and two requests for 4 cards each, one request may create slave pods 1 and 2 while the other creates slave pods 3 and 4, and then neither can obtain any more resources. An all-or-nothing reservation would avoid this, as the sketch below illustrates.
If you agree, may I submit a merge request to optimize this?
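To make the failure mode concrete, here is a minimal Go sketch of the all-or-nothing reservation I have in mind. This is not the project's real GetAvailableGPU; the Allocator type and Reserve method are hypothetical, and real state would live in the cluster rather than in memory:

```go
// All-or-nothing GPU reservation: a request either gets every GPU it
// asked for in one atomic step, or gets none, so two competing 4-card
// requests can never each hold 2 cards and starve each other.
package main

import (
	"fmt"
	"sync"
)

// Allocator is a hypothetical in-memory stand-in for the allocator.
type Allocator struct {
	mu   sync.Mutex
	free map[int]bool // GPU index -> available
}

func NewAllocator(n int) *Allocator {
	free := make(map[int]bool, n)
	for i := 0; i < n; i++ {
		free[i] = true
	}
	return &Allocator{free: free}
}

// Reserve atomically claims count GPUs, or fails without claiming any.
func (a *Allocator) Reserve(count int) ([]int, error) {
	a.mu.Lock()
	defer a.mu.Unlock()

	var picked []int
	for id, ok := range a.free {
		if ok {
			picked = append(picked, id)
			if len(picked) == count {
				break
			}
		}
	}
	if len(picked) < count {
		// Not enough GPUs: claim nothing, so a competing request can still win.
		return nil, fmt.Errorf("need %d GPUs, only %d free", count, len(picked))
	}
	for _, id := range picked {
		a.free[id] = false
	}
	return picked, nil
}

func main() {
	a := NewAllocator(4)
	if gpus, err := a.Reserve(4); err == nil {
		fmt.Println("request A got GPUs", gpus)
	}
	if _, err := a.Reserve(4); err != nil {
		fmt.Println("request B failed cleanly:", err) // no partial hold left behind
	}
}
```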
In my opinion, binding only one GPU to each slave pod is really a trade-off: if we requested all the GPUs through a single slave pod, unmounting would become complicated.
In the current implementation, unmounting a GPU is just deleting its slave pod. But if one slave pod held all the GPUs, then during an unmount we would have to tell kubelet and kube-scheduler that the unmounted GPU is free, which would probably need some hack. The sketch below shows how simple the per-GPU path is.
So I see it as a deliberate trade-off.
Please feel free to correct me if my opinion is unreasonable.
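To illustrate the point, here is a rough sketch of the one-pod-per-GPU unmount path, assuming client-go; unmountGPU and the gpuToPod bookkeeping map are hypothetical names, not code from this repo, and the main function uses client-go's fake clientset just so the sketch runs standalone:

```go
// With one slave pod per GPU, freeing GPU i is just deleting its slave
// pod; the pod's resource request is released, and kubelet/kube-scheduler
// see the GPU come back without any extra signaling.
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/kubernetes/fake"
)

// unmountGPU frees a single GPU by deleting the slave pod that holds it.
// gpuToPod is a hypothetical bookkeeping map: GPU index -> slave pod name.
func unmountGPU(ctx context.Context, cs kubernetes.Interface, ns string,
	gpuToPod map[int]string, gpu int) error {
	pod, ok := gpuToPod[gpu]
	if !ok {
		return fmt.Errorf("no slave pod recorded for GPU %d", gpu)
	}
	// Deleting the pod is the whole unmount operation.
	if err := cs.CoreV1().Pods(ns).Delete(ctx, pod, metav1.DeleteOptions{}); err != nil {
		return err
	}
	delete(gpuToPod, gpu)
	return nil
}

func main() {
	// Fake clientset pre-loaded with one slave pod, standing in for a cluster.
	cs := fake.NewSimpleClientset(&corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "slave-gpu-3", Namespace: "default"},
	})
	gpuToPod := map[int]string{3: "slave-gpu-3"}
	if err := unmountGPU(context.Background(), cs, "default", gpuToPod, 3); err != nil {
		fmt.Println("unmount failed:", err)
		return
	}
	fmt.Println("GPU 3 freed by deleting its slave pod")
}
```

If one slave pod held all the GPUs instead, there would be nothing this cheap to delete; releasing a single card would mean mutating a running pod's resources, which is exactly the hack mentioned above.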