How can I allocate different AMD GPU devices to different processes? #841
Comments
If it is an OpenCL application, setting the GPU_DEVICE_ORDINAL environment variable could help. If it is an HC or HIP application, does GPU_DEVICE_ORDINAL also take effect when allocating a GPU device to the application?
There are a couple of methods to expose only selected GPUs to the user process on the HIP/HCC path:
@sunway513 Many thanks for your response; it is really helpful. I am new to the AMD GPU world and have no real environment for testing and experimenting. Could you tell me more about ROCR_VISIBLE_DEVICES? Does it work for HCC, HIP, and OpenCL applications alike? Are there any prerequisites for using it to expose selected devices? What is the difference between the ROCm user-bit driver and the driver installed from the rocm-dkms primary meta-package? If I install the ROCm platform through the rocm-dkms primary meta-package, can I use this variable to expose selected GPU devices to different application processes?
Hi @nanguanqi, you are welcome :-) The ROCm user-bit drivers include ROCr and THUNK. And to answer your last question: yes, that would work.
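As a rough illustration of the per-process approach discussed above, the sketch below starts two child processes, each with its own value of ROCR_VISIBLE_DEVICES. The helper names `env_for_gpu` and `launch_on_gpu` and the stand-in `WORKER` command are hypothetical, and the assumption that indices 0 and 1 correspond to the two GPUs should be checked on the actual machine.

```python
import os
import subprocess
import sys


def env_for_gpu(device_index, base_env=None):
    """Return a copy of the environment with ROCR_VISIBLE_DEVICES set.

    device_index is assumed to be the GPU index as enumerated by ROCm;
    verify the mapping on your own system.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ROCR_VISIBLE_DEVICES"] = str(device_index)
    return env


def launch_on_gpu(command, device_index):
    """Start `command` as a child process restricted to one GPU index."""
    return subprocess.Popen(
        command,
        env=env_for_gpu(device_index),
        stdout=subprocess.PIPE,
        text=True,
    )


# Stand-in for a real HIP/HCC application: it only echoes the variable
# it was started with, so the sketch runs without a GPU.
WORKER = [
    sys.executable,
    "-c",
    "import os; print(os.environ['ROCR_VISIBLE_DEVICES'])",
]

if __name__ == "__main__":
    procs = [launch_on_gpu(WORKER, index) for index in range(2)]
    for number, proc in enumerate(procs, start=1):
        out, _ = proc.communicate()
        print(f"process {number} started with ROCR_VISIBLE_DEVICES={out.strip()}")
```

Because the variable is set in each child's environment rather than globally, the parent shell and any other processes keep seeing all devices.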
@sunway513 But how do I confirm that my process is using the right device?
Hi, you can open another terminal and watch for GPU activity using the following command:
@sunway513, I have a Radeon WX4100 (node 1) and an MI100 (node 2). I've set
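The exact command is not preserved in this copy of the thread. As a stand-in, and purely as an assumption, the sketch below polls the `rocm-smi` tool that ships with ROCm and prints its default report a few times so that activity on each GPU can be compared while the application runs. The function names `gpu_status` and `watch` are hypothetical.

```python
import shutil
import subprocess
import time


def gpu_status(tool="rocm-smi"):
    """Return the tool's default report as text, or None if it is not installed."""
    path = shutil.which(tool)
    if path is None:
        return None
    result = subprocess.run([path], capture_output=True, text=True, check=False)
    return result.stdout


def watch(interval_s=1.0, iterations=5):
    """Print the report repeatedly, pausing interval_s seconds between reports."""
    for _ in range(iterations):
        status = gpu_status()
        if status is None:
            print("rocm-smi was not found on PATH")
            return
        print(status)
        time.sleep(interval_s)


if __name__ == "__main__":
    watch()
```

Run it in a second terminal while the application is busy; the GPU that was exposed to the process should be the one whose utilization rises.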
Suppose there are 2 AMD GPU devices on the machine. How can I make process 1 use device 0 and process 2 use device 1? Is there an environment variable like "CUDA_VISIBLE_DEVICES" for setting which AMD GPU devices are visible to a process?
By default, are AMD GPU devices shared by all processes?