[1.13] Sets CUDA_MODULE_LOADING to LAZY when not set by the user #86509

Merged
merged 1 commit into from Oct 18, 2022

Conversation

syed-ahmed
Contributor

@syed-ahmed syed-ahmed commented Oct 7, 2022

This PR sets the CUDA_MODULE_LOADING environment variable to "LAZY" when the user has not already set it.

It was tested using the following commands:

python -c "import torch; tensor=torch.randn(20, 16, 50, 100).cuda(); free, total = torch.cuda.cudart().cudaMemGetInfo(0); print(total-free)"

which shows a memory usage of: 287,047,680 bytes

vs

CUDA_MODULE_LOADING="DEFAULT" python -c "import torch; tensor=torch.randn(20, 16, 50, 100).cuda(); free, total = torch.cuda.cudart().cudaMemGetInfo(0); print(total-free)"

which shows 666,632,192 bytes.
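
To put the two measurements above side by side, here is a small arithmetic sketch. It uses only the byte counts reported in this description; the variable names are illustrative, and the figures come from the single test run described here, so they should not be read as a general benchmark.

```python
# Byte counts reported above for the same tensor allocation.
lazy_bytes = 287_047_680      # CUDA_MODULE_LOADING unset (this PR defaults it to LAZY)
default_bytes = 666_632_192   # CUDA_MODULE_LOADING="DEFAULT"

MIB = 1024 ** 2

saved_bytes = default_bytes - lazy_bytes
saved_mib = saved_bytes / MIB
saved_fraction = saved_bytes / default_bytes

print(f"LAZY:    {lazy_bytes / MIB:.2f} MiB")
print(f"DEFAULT: {default_bytes / MIB:.2f} MiB")
print(f"Saved:   {saved_mib:.2f} MiB ({saved_fraction:.1%} of the DEFAULT figure)")
```

In this run, lazy loading uses 362 MiB less device memory, a bit more than half of the DEFAULT footprint.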

A C++ implementation is needed for libtorch users; otherwise this could have been implemented purely in Python.
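
For illustration, a pure-Python version of the "set only if unset" behaviour might look like the sketch below. This is not the code in this PR: the function is a hypothetical Python mirror, and it assumes it is called before CUDA is initialized in the process.

```python
import os


def maybe_set_cuda_module_loading(default: str = "LAZY") -> str:
    """Set CUDA_MODULE_LOADING to `default` only if the user has not set it.

    Hypothetical Python sketch, not the PR's C++ implementation.
    Returns the value that is in effect after the call.
    """
    # setdefault leaves an existing user-provided value untouched.
    return os.environ.setdefault("CUDA_MODULE_LOADING", default)
```

A user who exports CUDA_MODULE_LOADING="DEFAULT" before starting Python keeps that value; everyone else gets "LAZY".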

cc: @ptrblck @ngimel @malfet

@pytorch-bot

pytorch-bot bot commented Oct 7, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/86509

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 Failures, 3 Pending

As of commit 52f3d32:

The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@xsacha
Contributor

xsacha commented Oct 10, 2022

Regarding the C++ implementation, maybe_set_cuda_module_loading could be exposed or run statically.
However, the variable needs to be set before any of the CUDA libraries are loaded. That is tricky when libtorch itself is what brings in those dependencies. For this reason, unless the CUDA libraries are all delay-loaded, it is probably best for libtorch consumers to call an equivalent of this function before loading libtorch.
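
One way to sidestep the ordering problem is to set the variable in the environment of the process before it starts, so it is already in place when any library loads. The sketch below shows that idea with a small launcher; `launch_with_lazy_loading` is a hypothetical helper, not part of PyTorch or libtorch, and the command it runs stands in for whatever binary or script the consumer actually ships.

```python
import os
import subprocess
import sys


def launch_with_lazy_loading(cmd, default="LAZY"):
    """Run `cmd` with CUDA_MODULE_LOADING set in its environment if unset.

    Hypothetical helper: the child process sees the variable from its very
    first instruction, before any shared library is loaded.
    """
    env = dict(os.environ)                          # copy, do not mutate the parent
    env.setdefault("CUDA_MODULE_LOADING", default)  # respect a user-provided value
    return subprocess.run(cmd, env=env, check=True,
                          capture_output=True, text=True)
```

For example, `launch_with_lazy_loading([sys.executable, "my_script.py"])` would start a placeholder script `my_script.py` with the variable already set. A shell wrapper that exports the variable before invoking the real binary achieves the same thing.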

@atalman atalman added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 17, 2022
@malfet malfet merged commit 1442f04 into pytorch:release/1.13 Oct 18, 2022
Labels
ciflow/trunk Trigger trunk jobs on your pull request open source