cuda_plot keeps GPU x open when only GPU y is selected #29

Closed
mwpastore opened this issue Feb 4, 2023 · 8 comments

Comments

@mwpastore
Contributor

mwpastore commented Feb 4, 2023

When I select a single CUDA device, e.g. device 1 with -g 1, nvtop shows that cuda_plot still keeps a process open on CUDA device 0. I'm not sure what impact this has, if any, but numatop shows some undesirable RMA (remote memory access), and this is the most likely explanation in my case. I think it would be better if cuda_plot completely disengaged the CUDA device(s) not selected with -g.

@mwpastore
Contributor Author

mwpastore commented Feb 5, 2023

More info: cuda_plot -g 0 starts only a single job, on CUDA device 0, but cuda_plot -g 1 starts jobs on both CUDA devices 0 and 1.

@madMAx43v3r
Owner

madMAx43v3r commented Feb 5, 2023

Yeah, I've seen this with many CUDA applications before; the driver always likes to use GPU 0, for no apparent reason...

@madMAx43v3r
Owner

> I think it would be better if cuda_plot completely disengaged the CUDA device(s) not selected with -g.

I'm not deliberately using device 0; the driver is doing that internally.

@madMAx43v3r
Owner

The first thing I do in the code is call cudaSetDevice(device) for the device you selected with -g.

@mwpastore
Contributor Author

Maybe it's harmless? I was having trouble pinning down a solid benchmark result that showed any penalty from having that open, unused handle to the other GPU.

I wonder if you could call cudaSetValidDevices before anything else to exclude the non-selected GPU(s)?

@madMAx43v3r
Owner

I think that function just remaps the integers, which I can use to simplify the code but that's it.

@mwpastore
Contributor Author

I was able to "solve" this "issue" by setting CUDA_VISIBLE_DEVICES in the environment before launching cuda_plot. I don't even need to use the -g flag anymore.
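For anyone who wants to script this workaround, here is a minimal sketch, assuming a small Python wrapper is acceptable. The helper names (`env_for_gpu`, `run_on_gpu`, `stand_in_command`) are hypothetical and not part of cuda_plot; the child command is a stand-in that only echoes the variable, so the sketch runs without a GPU. Because CUDA renumbers the visible devices starting from 0, the one exposed GPU appears as device 0 inside the process, which would explain why -g is no longer needed.

```python
import os
import subprocess
import sys


def env_for_gpu(physical_index, base_env=None):
    """Return a copy of the environment that exposes only one physical GPU."""
    env = dict(os.environ if base_env is None else base_env)
    # The CUDA runtime reads this variable when it initializes, so it must be
    # set before the child process starts.
    env["CUDA_VISIBLE_DEVICES"] = str(physical_index)
    return env


def run_on_gpu(physical_index, command, **kwargs):
    """Run `command` (a list of arguments) with only one GPU visible to it."""
    return subprocess.run(
        command, env=env_for_gpu(physical_index), check=True, **kwargs
    )


def stand_in_command():
    """Stand-in for the real plotter: prints the variable the child sees."""
    code = "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"
    return [sys.executable, "-c", code]
```

Calling `run_on_gpu(1, stand_in_command())` prints the value the child process sees; to use it for real, swap the stand-in for your usual cuda_plot command line. The parent's own environment is left untouched, so several plotters can be launched from one script, each pinned to a different GPU.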

@madMAx43v3r
Owner

good to know, thx
