
[Nested Tensor] do not use at::cuda::getDefaultCUDAStream(), again #91180

Closed

Conversation

jeffdaily
Collaborator

Without this fix, Nested Tensor kernels won't sync with the current stream, resulting in flaky unit tests in test_nestedtensor.py.

This is the second time the wrong stream has been used in NestedTensor code. See #84134 for another example.
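For readers unfamiliar with the issue, here is a minimal sketch of the pattern (the fill_kernel / fill_cuda names below are illustrative, not taken from this PR): a kernel launched on at::cuda::getDefaultCUDAStream() is not ordered with respect to work on whatever stream the caller has made current (e.g. via torch.cuda.stream(...)), so downstream ops can read its output before the kernel finishes; querying at::cuda::getCurrentCUDAStream() keeps the launch on the stream the rest of ATen is using.

```cpp
// Illustrative sketch only -- fill_kernel/fill_cuda are hypothetical names,
// not the NestedTensor kernels touched by this PR.
#include <ATen/cuda/CUDAContext.h>
#include <cuda_runtime.h>

__global__ void fill_kernel(float* out, float value, int64_t n) {
  int64_t i = static_cast<int64_t>(blockIdx.x) * blockDim.x + threadIdx.x;
  if (i < n) {
    out[i] = value;
  }
}

void fill_cuda(float* out, float value, int64_t n) {
  const int threads = 256;
  const int blocks = static_cast<int>((n + threads - 1) / threads);

  // Wrong: the default stream ignores any stream made current by the caller,
  // so later ops on the current stream are not synchronized with this kernel.
  //   auto stream = at::cuda::getDefaultCUDAStream();

  // Right: launch on the stream PyTorch is currently using for this device.
  cudaStream_t stream = at::cuda::getCurrentCUDAStream();
  fill_kernel<<<blocks, threads, 0, stream>>>(out, value, n);
}
```

Because the mismatch only matters when the current stream differs from the default one, failures surface intermittently, which matches the flakiness seen in test_nestedtensor.py.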

@jeffdaily added the ciflow/trunk (Trigger trunk jobs on your pull request) label on Dec 20, 2022
@pytorch-bot bot commented Dec 20, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/91180

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 Failures

As of commit 44f6d32:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot bot added the release notes: cuda (release notes category) label on Dec 20, 2022
@jeffdaily
Collaborator Author

Example CI run where this caused a failure on ROCm:

https://hud.pytorch.org/pytorch/pytorch/commit/7330eabe3675e62d580b3c7d453b1f7f356e2c61

@mikaylagawarecki
Contributor

Thanks @jeffdaily for the fix; apologies for the incorrect usage of getDefaultCUDAStream.

@jeffdaily
Collaborator Author

@pytorchbot merge -f "only failure due to offline ROCm runner"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.
