Custom C++ and CUDA Extensions / multiple GPU issue #1133

Closed
adam-ce opened this issue Aug 25, 2020 · 3 comments

Comments

@adam-ce

adam-ce commented Aug 25, 2020

Hi,
I followed the C++ extension tutorial (https://pytorch.org/tutorials/advanced/cpp_extension.html), but I had issues when using multiple GPUs.

The issue is solved: https://discuss.pytorch.org/t/c-cuda-extension-with-multiple-gpus/91241/6

I'd like to suggest adding the corresponding code lines to the tutorial, so that other people won't have the same problem.

Cheers, Adam

@holly1238
Contributor

See pytorch/pytorch#48891 and #1196

@adam-ce
Author

adam-ce commented Aug 1, 2021

I'm sorry, but I don't see how this issue was resolved.

cheers, Adam

@AlbertZhangHIT

Hi, I also got the same issue for my custom C++ and CUDA extensions.

The RuntimeError: CUDA error: an illegal memory access was encountered occurs mainly when another process is already using one of the GPUs. For example, I have 2 GPUs and a network training process (PyTorch) is running on GPU 0. If I run my custom CUDA extension with device=torch.device('cuda:0'), it runs correctly, although it slows down the other training process. If I set device=torch.device('cuda:1') for my custom CUDA extension instead, the CUDA error occurs.

I found that the solution from https://discuss.pytorch.org/t/c-cuda-extension-with-multiple-gpus/91241/6 does not work for an extension in which the CPU and CUDA implementations are wrapped by a common wrapper, because compiling the CPU extension then fails with an error saying that CUDA is required. Does anyone have an idea how to handle this?
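
One possible Python-side workaround for the situation described above, offered as a minimal sketch and not as the fix from the linked forum thread: make the input tensor's GPU the current CUDA device before calling into the extension, so that kernels launched on the current device run on the same GPU that holds the data. Because the guard lives in Python, the shared C++ wrapper would not need to include any CUDA headers, and CPU tensors skip the guard entirely. The helper names `device_context` and `run_on_tensor_device` are hypothetical, and `torch.neg` only stands in for the extension's entry point. Whether this removes the illegal memory access for a particular extension is an assumption that has to be checked.

```python
import contextlib

import torch


def device_context(tensor):
    """Return a context manager that makes `tensor`'s GPU the current CUDA device.

    For CPU tensors this is a no-op context, so no CUDA support is required.
    """
    if tensor.is_cuda:
        return torch.cuda.device(tensor.device)
    return contextlib.nullcontext()


def run_on_tensor_device(fn, tensor, *args, **kwargs):
    """Call `fn(tensor, ...)` while the tensor's device is the current CUDA device."""
    with device_context(tensor):
        return fn(tensor, *args, **kwargs)


# Usage: `torch.neg` is a placeholder for the custom extension function.
# Falls back to the CPU when fewer than two GPUs are visible.
device = torch.device("cuda:1" if torch.cuda.device_count() > 1 else "cpu")
x = torch.ones(4, device=device)
y = run_on_tensor_device(torch.neg, x)
```

On the C++ side, `c10::cuda::OptionalCUDAGuard` from `c10/cuda/CUDAGuard.h` plays a similar role, but it would have to be compiled only into the CUDA part of the extension.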
