[CUDA Host Allocator] Add support of CudaHostRegister #108488

Closed
wants to merge 1 commit into from

@banitag1 banitag1 commented Sep 3, 2023

Summary: This diff adds another option to create cuda pinned memory using cudaHostRegister.

Differential Revision: D45843715

pytorch-bot bot commented Sep 3, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/108488

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (3 Unrelated Failures)

As of commit 7b5470a with merge base a0cea51:

UNSTABLE - The following jobs failed, but the failures were likely due to flakiness present on trunk and the jobs have been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D45843715

github-actions bot commented Sep 3, 2023

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

banitag1 added a commit to banitag1/pytorch that referenced this pull request Sep 3, 2023
Summary:

This diff adds another option to create cuda pinned memory using cudaHostRegister.

Differential Revision: D45843715

banitag1 added a commit to banitag1/pytorch that referenced this pull request Sep 5, 2023
Summary:

This diff adds another option to create cuda pinned memory using cudaHostRegister.

Differential Revision: D45843715

banitag1 added a commit to banitag1/pytorch that referenced this pull request Oct 4, 2023
Summary:

This diff adds another option to create cuda pinned memory using cudaHostRegister to avoid large lock wait time with cudaHostAlloc.

Differential Revision: D45843715

banitag1 added a commit to banitag1/pytorch that referenced this pull request Oct 5, 2023
Summary:

This diff adds another option to create cuda pinned memory using cudaHostRegister to avoid large lock wait time with cudaHostAlloc.

Differential Revision: D45843715
@banitag1 banitag1 force-pushed the export-D45843715 branch 2 times, most recently from d20ba0c to 8e49959 Compare October 5, 2023 02:52

@@ -76,6 +76,13 @@ def tearDown(self):
        del self.autocast_lists
        super().tearDown()

    def test_pinned_memory_with_cudaregister(self):
        os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "pinned_use_cuda_host_register:True,pinned_num_register_threads:8"

This should use torch.cuda.memory._set_allocator_settings; otherwise the setting won't have any effect. Make sure to set it back after the test.
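
A minimal sketch of what this suggestion could look like, not the code that was eventually merged. The helper names `format_alloc_conf` and `temporary_allocator_settings` are hypothetical, and the setter is passed in as a parameter so the sketch runs without a GPU; in the real test it would be `torch.cuda.memory._set_allocator_settings`. The sketch also assumes that switching `pinned_use_cuda_host_register` back to `False` is an acceptable way to "set it back".

```python
from contextlib import contextmanager
from typing import Callable, Iterator


def format_alloc_conf(options: dict) -> str:
    """Join options into the 'key:value,key:value' form used in the test above."""
    return ",".join(f"{key}:{value}" for key, value in options.items())


@contextmanager
def temporary_allocator_settings(
    apply: Callable[[str], None], enabled: str, restored: str
) -> Iterator[None]:
    """Apply `enabled` for the duration of the block, then apply `restored`.

    `apply` is expected to be torch.cuda.memory._set_allocator_settings;
    it is injected here so the sketch can be exercised without CUDA.
    """
    apply(enabled)
    try:
        yield
    finally:
        # Runs even if the test body raises, so later tests see `restored`.
        apply(restored)


# Stand-in for the real setter: records each settings string it receives.
applied = []

on = format_alloc_conf(
    {"pinned_use_cuda_host_register": True, "pinned_num_register_threads": 8}
)
# Assumption: turning the flag off again counts as restoring the default.
off = format_alloc_conf({"pinned_use_cuda_host_register": False})

with temporary_allocator_settings(applied.append, on, off):
    pass  # the pinned-memory assertions of the test would go here
```

In the test itself, the `with` block would wrap the pinned-memory checks and `applied.append` would be replaced by the real setter. The restore string is supplied explicitly because this sketch does not assume any way to read the current allocator settings back.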

banitag1 added a commit to banitag1/pytorch that referenced this pull request Oct 5, 2023
Summary:

This diff adds another option to create cuda pinned memory using cudaHostRegister to avoid large lock wait time with cudaHostAlloc.

Reviewed By: zdevito

Differential Revision: D45843715

@facebook-github-bot

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 6, 2023
@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot pushed a commit that referenced this pull request Oct 6, 2023
Follow up to #110123, removing the CUDA_VERSION check for ROCm because HIP already has hipMallocAsync() and doesn't need the version check there.

Follow up to #108488, fixing the failing unit tests by accepting either a "cuda" or "hip" attribute for the caching allocator options. This is aligned with the masquerading strategy for ROCm/HIP.

Pull Request resolved: #110715
Approved by: https://github.com/ezyang
Labels: ciflow/trunk (Trigger trunk jobs on your pull request), fb-exported, Merged

4 participants