Hiding/Disabling one GPU in a multiGPU system #711

Closed
SandboChang opened this issue Feb 17, 2019 · 4 comments
@SandboChang

Background: Ubuntu 18.04, ROCm 2.1
GPUs: Vega FE x2, Radeon VII x1

Would there be a way to hide one GPU in the system?
I am trying to run some tests with the GPUs, but many of the benchmark scripts do not allow selecting a particular GPU device, so being able to hide a GPU from the software would be very helpful.

With NVIDIA cards, it seems this can be done by setting CUDA_VISIBLE_DEVICES to hide a GPU from a test. Is there a similar feature in ROCm?

@jlgreathouse
Collaborator

We make this available as a language-level feature rather than a ROCm-wide feature. So depending on the programming language your benchmark uses, you will need to set a different environment variable:

  • OpenCL: GPU_DEVICE_ORDINAL is a comma-separated list of the devices you would like to be visible to your OpenCL-using application.
  • HIP: HIP_VISIBLE_DEVICES is a comma-separated list of devices you would like to be visible to your HIP-using application.
  • HCC: HCC_DEFAULT_GPU can be used to automatically change which GPU HCC will use for GPU offload.
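
As a rough illustration of the list above, the following Python sketch builds an environment for a child process with the matching variable set. The three variable names come from the reply; the helper `env_for_devices`, the `VISIBILITY_VARS` table, and the assumption that HCC_DEFAULT_GPU takes a single index are hypothetical and only meant to show the idea.

```python
import os

# Variable names as given in the reply above; this table is a hypothetical helper.
VISIBILITY_VARS = {
    "opencl": "GPU_DEVICE_ORDINAL",   # comma-separated list of visible devices
    "hip": "HIP_VISIBLE_DEVICES",     # comma-separated list of visible devices
    "hcc": "HCC_DEFAULT_GPU",         # assumed here to take a single device index
}


def env_for_devices(runtime, devices, base_env=None):
    """Return a copy of the environment with the runtime's device variable set.

    runtime  -- "opencl", "hip", or "hcc" (case-insensitive)
    devices  -- iterable of integer device indices
    base_env -- environment to copy; defaults to os.environ
    """
    key = runtime.lower()
    if key not in VISIBILITY_VARS:
        raise ValueError(f"unknown runtime: {runtime!r}")
    devices = list(devices)
    if key == "hcc" and len(devices) != 1:
        # Assumption: HCC_DEFAULT_GPU selects one default GPU, not a list.
        raise ValueError("HCC_DEFAULT_GPU expects exactly one device index")
    env = dict(os.environ if base_env is None else base_env)
    env[VISIBILITY_VARS[key]] = ",".join(str(d) for d in devices)
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run`, so the variable only affects the launched benchmark and not the calling shell.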

@SandboChang
Author

SandboChang commented Feb 19, 2019

Thanks a lot for your detailed reply. I should have been more specific: I am trying to run some benchmarks which were provided here:
https://github.com/tensorflow/benchmarks/tree/master/scripts/tf_cnn_benchmarks

So it is using TensorFlow with ROCm. In this case, I am not sure whether it falls into the category of OpenCL, HIP, or HCC. I would appreciate it if you could let me know.

@jlgreathouse
Collaborator

AMD's TensorFlow implementation uses the HIP language runtime, so you should probably use HIP_VISIBLE_DEVICES.

@SandboChang
Author

Thanks. By adding HIP_VISIBLE_DEVICES=2 in front of python3 ....benchmark...., I was able to run it on the desired GPU.
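
For anyone who prefers to do the same from a script rather than the shell, here is a minimal Python sketch of that approach. The function name `run_with_visible_gpu` is hypothetical, and the child command below is a stand-in that only prints the variable; the real benchmark command line is not reproduced here.

```python
import os
import subprocess
import sys


def run_with_visible_gpu(command, device_index):
    """Run `command` with HIP_VISIBLE_DEVICES set to a single device index.

    The variable is set only in the child's environment, which mirrors
    prefixing the command with HIP_VISIBLE_DEVICES=<index> in a shell.
    """
    env = dict(os.environ)
    env["HIP_VISIBLE_DEVICES"] = str(device_index)
    return subprocess.run(
        command, env=env, check=True, capture_output=True, text=True
    )


if __name__ == "__main__":
    # Stand-in child process: it just echoes the variable it received.
    child = [
        sys.executable,
        "-c",
        "import os; print(os.environ['HIP_VISIBLE_DEVICES'])",
    ]
    result = run_with_visible_gpu(child, 2)
    print(result.stdout.strip())
```

Setting the variable in the child's environment before the process starts avoids any question of whether the runtime has already enumerated the devices.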
