[CUDA Host Allocator] Add support of CudaHostRegister #108488
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/108488
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (3 unrelated failures) As of commit 7b5470a with merge base a0cea51: UNSTABLE - the following jobs failed but were likely due to flakiness present on trunk and have been marked as unstable.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D45843715
Force-pushed from cf9a231 to 1eeddd4.
Summary: This diff adds another option to create CUDA pinned memory using cudaHostRegister. Differential Revision: D45843715
Force-pushed from 1eeddd4 to ffb60f7.
Force-pushed from ffb60f7 to 415844f.
Summary: This diff adds another option to create CUDA pinned memory using cudaHostRegister, to avoid long lock wait times with cudaHostAlloc. Differential Revision: D45843715
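Background on the approach (a sketch under assumptions, not the PR's actual C++ implementation): cudaHostAlloc allocates and pins host memory in a single call, which is where the lock wait time mentioned above arises, whereas cudaHostRegister pins memory that was already allocated by ordinary means, so the allocation itself stays cheap and the registration work can even be split across threads (which is what a setting like pinned_num_register_threads would control). A minimal Python/ctypes illustration of the cudaHostRegister path, which degrades gracefully when no CUDA runtime is installed:

```python
import ctypes
import ctypes.util

# Flag value from cuda_runtime_api.h.
cudaHostRegisterDefault = 0

def register_pinned(buf):
    """Try to page-lock (pin) an already-allocated host buffer via
    cudaHostRegister, so the GPU can DMA to/from it directly.

    Returns True if cudaHostRegister reported cudaSuccess, False if
    the CUDA runtime is unavailable or the call failed.
    Illustrative sketch only -- PyTorch does this inside its C++
    CUDA host allocator, not through ctypes.
    """
    libname = ctypes.util.find_library("cudart")
    if libname is None:
        return False  # no CUDA runtime on this machine
    cudart = ctypes.CDLL(libname)
    err = cudart.cudaHostRegister(
        ctypes.cast(buf, ctypes.c_void_p),       # host pointer
        ctypes.c_size_t(ctypes.sizeof(buf)),     # size in bytes
        ctypes.c_uint(cudaHostRegisterDefault),  # flags
    )
    return err == 0  # 0 == cudaSuccess

# Plain host memory, allocated without the CUDA runtime involved.
buf = (ctypes.c_char * 4096)()
print(register_pinned(buf))  # True only with a working CUDA runtime
```

The key point the diff exploits: because the allocation and the pinning are now two separate steps, only the (comparatively cheap) registration step needs to touch the CUDA runtime.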
Force-pushed from 415844f to 3ace388.
Force-pushed from 3ace388 to e3efe51.
Force-pushed from e3efe51 to f3e7a3d.
Force-pushed from f3e7a3d to 77048d1.
Force-pushed from 77048d1 to 20b7bb6.
Force-pushed from d20ba0c to 8e49959.
test/test_cuda.py
Outdated
@@ -76,6 +76,13 @@ def tearDown(self):
        del self.autocast_lists
        super().tearDown()

    def test_pinned_memory_with_cudaregister(self):
        os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "pinned_use_cuda_host_register:True,pinned_num_register_threads:8"
This should be torch.cuda.memory._set_allocator_settings, otherwise it won't have an effect. Make sure to set it back after the test.
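As context for the settings string used in the test above: PYTORCH_CUDA_ALLOC_CONF takes a comma-separated list of key:value pairs. A small stand-alone toy parser (illustrative only, not PyTorch's actual parsing code) shows the shape of the format:

```python
def parse_alloc_conf(conf):
    """Parse a comma-separated 'key:value' settings string into a dict,
    coercing 'True'/'False' to booleans and digit strings to ints.
    Toy sketch of the format only, not PyTorch's real parser."""
    settings = {}
    for item in conf.split(","):
        key, _, value = item.partition(":")
        if value in ("True", "False"):
            settings[key] = (value == "True")
        elif value.isdigit():
            settings[key] = int(value)
        else:
            settings[key] = value
    return settings

conf = "pinned_use_cuda_host_register:True,pinned_num_register_threads:8"
print(parse_alloc_conf(conf))
# {'pinned_use_cuda_host_register': True, 'pinned_num_register_threads': 8}
```

The first key switches pinned-memory creation from cudaHostAlloc to cudaHostRegister; the second controls how many threads perform the registration.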
Force-pushed from 8e49959 to b035fb5.
Summary: This diff adds another option to create CUDA pinned memory using cudaHostRegister, to avoid long lock wait times with cudaHostAlloc. Reviewed By: zdevito. Differential Revision: D45843715
Force-pushed from b035fb5 to 7b5470a.
@pytorchbot merge (Initiating merge automatically since Phabricator Diff has merged)
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Follow-up to #110123, removing the CUDA_VERSION check for ROCm because HIP already has hipMallocAsync() and doesn't need the version check there. Follow-up to #108488, fixing the failing unit tests by accepting either a "cuda" or "hip" attribute for the caching allocator options. This is aligned with the masquerading strategy for ROCm/HIP. Pull Request resolved: #110715. Approved by: https://github.com/ezyang