MPS Support #419
Short answer: it is not supported for now. However, we are looking at it for the 2.0 timeframe, but there are a lot of corner cases that need to be investigated. I'll update this issue with additional information once we are confident it can work properly. |
Hi, |
The lack of MPS support seems like it would be a blocker for creating service deployments in orchestration. I'll be following the outcome in anticipation of a pull request covering the Swarm or Kubernetes use case. |
Any progress? or is there any workaround so I can use CUDA Multi-Process Service in the container? |
Shouldn't it be the other way around? I.e., shouldn't MPS run on the host so it can allocate process time to multiple containers? Is that an already-supported architecture? |
With 2.0 it should work as long as you run the MPS server on the host and use `--ipc=host`:

```shell
# Start the MPS control daemon, pinned to the second GPU (on the host)
sudo CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 nvidia-cuda-mps-control -d

# Launch two containers on the second GPU device
docker run -ti --rm -e NVIDIA_VISIBLE_DEVICES=1 --runtime=nvidia --ipc=host nvidia/cuda
docker run -ti --rm -e NVIDIA_VISIBLE_DEVICES=1 --runtime=nvidia --ipc=host nvidia/cuda

# Stop the MPS daemon when done
echo quit | sudo nvidia-cuda-mps-control
```
|
Does it mean that we can set and limit CUDA_MPS_ACTIVE_THREAD_PERCENTAGE for each container? Any examples of usage would really help. Could you please elaborate on what you mean by "better integration"? Thank you |
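As a hedged sketch of what per-container limits might look like, assuming the host-side MPS setup shown above (the image name and percentages are illustrative, not an official recipe):

```shell
# Illustrative sketch (unverified): CUDA_MPS_ACTIVE_THREAD_PERCENTAGE is
# read by each CUDA client at startup, so passing it as a container
# environment variable should bound that container's share of SMs.
docker run -ti --rm --runtime=nvidia --ipc=host \
  -e NVIDIA_VISIBLE_DEVICES=1 \
  -e CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=30 \
  nvidia/cuda

docker run -ti --rm --runtime=nvidia --ipc=host \
  -e NVIDIA_VISIBLE_DEVICES=1 \
  -e CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=10 \
  nvidia/cuda
```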
mark |
@3XX0 How much does "-ipc=host" compromise security? Somebody asked the question on SO but no answer yet: https://stackoverflow.com/questions/38907708/docker-ipc-host-and-security |
@3XX0 Any update on when nvidia-docker will officially support MPS? |
@3XX0 I did some tests and --ipc=host does appear to work. But is there anything else we should pay attention to when running the current nvidia-docker 2 under MPS? Would you recommend using it in production? It would be super helpful if you could provide some guidance here. |
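If --ipc=host feels too broad from a security standpoint, one possible mitigation is Docker's shareable IPC namespaces. This is only a sketch under assumptions (a Docker version supporting `--ipc=shareable`, illustrative container/volume names), not a verified recipe:

```shell
# Illustrative sketch: scope IPC sharing to a dedicated MPS daemon container
# instead of exposing the host IPC namespace to every container.
docker run -d --name mps --runtime=nvidia --ipc=shareable \
  -e NVIDIA_VISIBLE_DEVICES=1 \
  -v nvidia_mps:/tmp/nvidia-mps \
  --entrypoint nvidia-cuda-mps-control \
  nvidia/cuda -f

# Clients join the daemon's IPC namespace and share the MPS pipe directory
docker run -ti --rm --runtime=nvidia --ipc=container:mps \
  -e NVIDIA_VISIBLE_DEVICES=1 \
  -v nvidia_mps:/tmp/nvidia-mps \
  nvidia/cuda
```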
I've added a wiki page on how to use MPS with Docker Compose: You can look at the |
Hi @flx42, is it possible to provide a compose file whose format version is 2.1? Lots of companies still use Docker 1.12 in their clusters and cannot upgrade to 17.06 in the short term. |
@azazhu are you running RHEL/Atomic's fork of Docker? If you do, you can just remove the If that's not what you are running, you won't be able to make it work since the |
Thanks, @flx42. Could you check whether my understanding is correct:
|
Yes, that should work. But you can also containerize the MPS daemon, like in the Docker Compose example.
IIRC you can set this value for the MPS daemon, or for all CUDA client apps. I think both work fine. |
Thanks @flx42, what do you mean by "containerize the MPS daemon"? Launch the MPS daemon (nvidia-cuda-mps-control) on both the host machine and in the container? |
Yes, you can launch it inside a container or on the host. Both ways will work. |
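As a rough illustration of the containerized option (a sketch only; the Docker Compose wiki page mentioned above is the authoritative reference, and the image name and flags here are assumptions):

```shell
# Illustrative sketch: run the MPS control daemon in its own container.
# -f keeps the daemon in the foreground so the container stays alive;
# the /tmp/nvidia-mps pipe directory must be shared with client containers.
docker run -d --name mps-daemon --runtime=nvidia --ipc=host \
  -e NVIDIA_VISIBLE_DEVICES=1 \
  -v /tmp/nvidia-mps:/tmp/nvidia-mps \
  --entrypoint nvidia-cuda-mps-control \
  nvidia/cuda -f
```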
hi @flx42 ,
|
@flx42 Does MPS support Pascal GPUs in nvidia-docker containers? |
@GoodJoey not with the approach documented above, you would need a Volta GPU. |
Seems like MPS is not supported on the newest Docker version. This example shows that the containers have some kind of problem with CUDA...
Would really love to see "usable" support of MPS with Docker |
Any update on this issue? |
Hi, have you solved this problem? |
Hi, have you solved this problem? I want to set a different CUDA_MPS_ACTIVE_THREAD_PERCENTAGE for each container, such as 3×30% and 1×10% on a specific GPU. |
any update? |
We are working on a DRA Driver for NVIDIA GPUs (https://github.com/NVIDIA/k8s-dra-driver) which will include better MPS support. If there are use cases not covered by this (e.g. outside of K8s), please create an issue describing the use case against https://github.com/NVIDIA/nvidia-container-toolkit. |
Hi,
When I use the "CUDA Multi-Process Service" (MPS) in an nvidia-docker environment, I ran into a couple of issues, so I'm wondering whether MPS is supported in nvidia-docker at all. Please help me, thanks in advance~
Here are the problems I have met:
1. When I run
```shell
nvidia-cuda-mps-control -d
```
to start the MPS daemon inside nvidia-docker, I can't see this process from `nvidia-smi`; however, I can see the process from the host machine. In comparison, when I run the same command on the host machine (physical server), I do see it from `nvidia-smi` (a GPU program needs to run first to start the MPS server).
2. Running a GPU program in the container fails with:
```
F0703 13:39:15.539633 97 common.cpp:165] Check failed: error == cudaSuccess (46 vs. 0) all CUDA-capable devices are busy or unavailable
```
In comparison, this works OK on the host (physical machine).

I'm trying this on a P100 GPU, Ubuntu 14, Docker version 17.04.0-ce, build 4845c56.
I hope this is the right place to ask, thanks again.