Apply cuda::barrier and elect_one feedback #6344
Merged
bernhardmgruber merged 2 commits into NVIDIA:main on Oct 30, 2025
ahendriksen
reviewed
Oct 27, 2025
    const auto uniform_warp_id = __shfl_sync(~0, warp_id, 0); // broadcast from lane 0
    return uniform_warp_id == 0 && cuda::ptx::elect_sync(~0); // elect a leader thread among warp 0
  ),
  (::cuda::device::__cuda_elect_sync_is_not_supported_before_SM_90__(); _CCCL_UNREACHABLE();));
Contributor
Shouldn't the `return threadIdx.x == 0;` go in the else here?
AFAIK PTX ISA 8.0 shipped with CTK 12.0. CCCL supports 12.0 and up, so there is no need for the __cccl_ptx_isa ifdef.
Contributor
Author
Then I would need to add it twice: once as the else branch of the NV_IF_TARGET and once as the else branch of the _CCCL_CUDA_COMPILATION() && __cccl_ptx_isa >= 800 check.
Regarding PTX ISA: I don't know whether the clang CUDA versions we test already support __cccl_ptx_isa >= 800.
Contributor
Author
ok, let's just try
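The trade-off discussed in this thread can be sketched as follows. This is a minimal illustration, not the code from this PR: `elect_one_in_block` and `count_elected` are hypothetical names, the sketch assumes a 1D thread block whose size is a multiple of 32, and it relies on the `__cccl_ptx_isa` macro mentioned above being visible after including `<cuda/ptx>`. It shows one possible way to keep a single `threadIdx.x == 0` fallback by letting control fall through when either guard is not satisfied.

```cuda
#include <cuda/ptx>  // cuda::ptx::elect_sync
#include <nv/target> // NV_IF_TARGET, NV_PROVIDES_SM_90

// Hypothetical helper (not the function from this PR).
// Assumption: 1D thread block whose size is a multiple of 32.
__device__ inline bool elect_one_in_block()
{
#if __cccl_ptx_isa >= 800
  NV_IF_TARGET(NV_PROVIDES_SM_90, (
    const auto warp_id         = threadIdx.x / 32;
    const auto uniform_warp_id = __shfl_sync(~0u, warp_id, 0); // broadcast from lane 0
    return uniform_warp_id == 0 && cuda::ptx::elect_sync(~0u); // elect a leader in warp 0
  ))
#endif
  // Single fallback: reached when the PTX ISA is too old or the target is below SM 90.
  return threadIdx.x == 0;
}

// Hypothetical kernel that counts how many threads of the block were elected.
__global__ void count_elected(int* counter)
{
  if (elect_one_in_block())
  {
    atomicAdd(counter, 1);
  }
}
```

Launching `count_elected` with a single block should leave the counter at 1 on either code path. Whether a fall-through fallback is preferable to the hard compile-time error in the diff above is a design choice: the fallback keeps older targets working, while the error makes unsupported use explicit.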
miscco
approved these changes
Oct 27, 2025
This is a follow-up to NVIDIA#6329 after feedback from ahendriksen
Force-pushed from 973ef6c to 869abbe
Contributor
🥳 CI Workflow Results
🟩 Finished in 7h 16m: Pass: 100%/134 | Total: 6d 15h | Max: 5h 10m | Hits: 51%/265638