NVML version + H100 GPU #1297

Closed
mathrock74 opened this issue Aug 8, 2023 · 3 comments

Comments

@mathrock74

I'm using DeepOps 21.09, which compiled Slurm 21.08 against CUDA NVML 11.4 for GRES autodetect support. We are currently using A100 GPUs. I wanted to check whether this Slurm installation would be able to detect and use H100 GPUs. I couldn't find any hint: nvml.h only mentions "All Tesla products, starting with the Fermi architecture" (the same is true of the most recent version, 12.2). I tried to find the responsible libraries/binaries (using "strings"), to no avail. I would be happy if someone could explain this. Thank you.
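For anyone who wants to see what the NVML runtime on a node actually reports, here is a minimal sketch that loads the library with `ctypes` and asks it for its version, the driver version, and the device names. It is only a sketch under assumptions: the library name `libnvidia-ml.so.1` is the usual Linux name but may differ on your system, the helper name `query_nvml` is made up for illustration, and the output says nothing definitive about what Slurm's autodetection will do with the same devices.

```python
import ctypes


def query_nvml(lib_name="libnvidia-ml.so.1"):
    """Return NVML/driver versions and GPU names, or None if NVML is unusable.

    `lib_name` is an assumption (typical Linux name of the NVML runtime).
    """
    try:
        nvml = ctypes.CDLL(lib_name)
    except OSError:
        return None  # NVML runtime library not found on this node

    if nvml.nvmlInit_v2() != 0:  # 0 is NVML_SUCCESS
        return None  # library loaded, but no usable driver/GPU

    try:
        buf = ctypes.create_string_buffer(96)
        info = {"nvml_version": None, "driver_version": None, "gpus": []}

        # Version of the NVML runtime that was loaded
        if nvml.nvmlSystemGetNVMLVersion(buf, ctypes.c_uint(len(buf))) == 0:
            info["nvml_version"] = buf.value.decode()

        # Version of the installed NVIDIA driver
        if nvml.nvmlSystemGetDriverVersion(buf, ctypes.c_uint(len(buf))) == 0:
            info["driver_version"] = buf.value.decode()

        # Enumerate devices and record the product name of each one
        count = ctypes.c_uint(0)
        if nvml.nvmlDeviceGetCount_v2(ctypes.byref(count)) == 0:
            for index in range(count.value):
                handle = ctypes.c_void_p()
                rc = nvml.nvmlDeviceGetHandleByIndex_v2(
                    ctypes.c_uint(index), ctypes.byref(handle)
                )
                if rc != 0:
                    continue  # skip devices that cannot be opened
                if nvml.nvmlDeviceGetName(handle, buf, ctypes.c_uint(len(buf))) == 0:
                    info["gpus"].append(buf.value.decode())
        return info
    finally:
        nvml.nvmlShutdown()


if __name__ == "__main__":
    print(query_nvml())
```

Running this on a GPU node prints a dictionary with the versions and one name per detected device; on a machine without the NVML library or a working driver it prints `None`.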

@arnoldas500

Did you get a chance to test this with an H100 GPU yet?

@mathrock74
Author

Hi, no, I haven't tested this yet because I don't have the hardware. But my question, which is actually more about NVML capabilities, was answered here: https://forums.developer.nvidia.com/t/slurm-gpu-autodetection-with-nvml-and-h100/263866/2

This can be closed.

@mathrock74
Author

Really close...


2 participants