Does DCGM support creating groups of GPUs from different hosts? #146

Open
deferen2 opened this issue Jan 15, 2024 · 1 comment

Comments

@deferen2

I’ve read the section on groups in the documentation, but I’m still unclear about the limitations of the Groups feature.
I’m not sure if it’s restricted to creating a group composed of GPUs from a single host, or if it’s possible to group cards from different hosts.

There is this line in the DCGM documentation that makes me think that GPU groups are limited to a single host:
"Almost all DCGM operations take place on groups. Users can create, destroy and modify collections of GPUs on the local node"

But this limitation is not mentioned again, and the overview only says, generically:
"... and individual users managing groups of NVIDIA GPUs."

So I was wondering, are the groups limited to single hosts?

Thanks.

@nikkon-dev
Collaborator

@deferen2,

Yes, groups are limited to a single nv-hostengine instance. Internally, a group is just a list of entities local to that hostengine, without any special logic attached to it.

WBR,
Nik
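
Since a group is described above as a plain list of entities local to one nv-hostengine, one way to picture a multi-host "group" is as client-side bookkeeping: one ordinary group per hostengine, tracked together under a single name. The sketch below is only a conceptual model of that idea. It does not call DCGM at all, and `HostEngineGroup`, `ClusterGroup`, `add_gpu`, `per_host`, and the host names are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HostEngineGroup:
    """Hypothetical stand-in for one group held by one nv-hostengine."""
    host: str  # address of the hostengine that owns this group
    gpu_ids: List[int] = field(default_factory=list)  # GPU IDs local to that host


@dataclass
class ClusterGroup:
    """Hypothetical client-side collection of per-host groups (not a DCGM feature)."""
    name: str
    members: Dict[str, HostEngineGroup] = field(default_factory=dict)

    def add_gpu(self, host: str, gpu_id: int) -> None:
        # Each GPU is filed under the hostengine it is local to.
        group = self.members.setdefault(host, HostEngineGroup(host))
        if gpu_id not in group.gpu_ids:
            group.gpu_ids.append(gpu_id)

    def per_host(self) -> Dict[str, List[int]]:
        # The per-host lists are what would each become a separate group.
        return {host: list(g.gpu_ids) for host, g in self.members.items()}


cluster = ClusterGroup("training-job")
cluster.add_gpu("node-a", 0)
cluster.add_gpu("node-a", 1)
cluster.add_gpu("node-b", 0)
print(cluster.per_host())  # {'node-a': [0, 1], 'node-b': [0]}
```

In this model, any operation on the whole set would have to be issued once per host against that host's own group, because no single hostengine knows about the others' GPUs.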
