Implement parallel cuda::std::replace_copy#7410
Merged
miscco merged 1 commit into NVIDIA:main on Feb 5, 2026
Contributor
This PR states to add |
Contributor
Author
replace_copy is implemented in terms of A kingdom for stacked PRs |
Contributor
Author
Benchmarks against Thrust look great: |
This implements the `replace_copy{_if}` algorithms for the CUDA backend.
* std::replace_copy, see https://en.cppreference.com/w/cpp/algorithm/replace_copy.html
* std::replace_copy_if, see https://en.cppreference.com/w/cpp/algorithm/replace_copy.html
It provides tests and benchmarks similar to Thrust's and some boilerplate for libcu++.
The functionality is not publicly available yet and is implemented in a private internal header.
Fixes NVIDIA#7409
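For reference, the behavior these algorithms are expected to match can be sketched with the serial standard-library versions. This is only an illustration of the semantics described on the linked cppreference page: the helper names `replace_copy_example` and `replace_copy_if_example` are made up for this sketch, and it deliberately uses plain `std::` calls rather than the new parallel overloads, which are not publicly exposed yet.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: copies `in` into a new vector, writing `new_value`
// wherever an element compares equal to `old_value`. The input is untouched.
inline std::vector<int> replace_copy_example(const std::vector<int>& in, int old_value, int new_value)
{
  std::vector<int> out(in.size());
  std::replace_copy(in.begin(), in.end(), out.begin(), old_value, new_value);
  return out;
}

// Hypothetical helper: same idea, but the replacement is driven by a
// predicate (here: negative values) instead of an equality comparison.
inline std::vector<int> replace_copy_if_example(const std::vector<int>& in, int new_value)
{
  std::vector<int> out(in.size());
  std::replace_copy_if(in.begin(), in.end(), out.begin(), [](int x) { return x < 0; }, new_value);
  return out;
}
```

Both helpers leave the input range unchanged and write the result to a separate output range, which is what distinguishes `replace_copy{_if}` from the in-place `replace{_if}`.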
Contributor
🥳 CI Workflow Results
🟩 Finished in 1h 29m: Pass: 100%/95 | Total: 17h 19m | Max: 1h 05m | Hits: 98%/248496
bernhardmgruber
approved these changes
Feb 5, 2026
Comment on lines +52 to +56
_CCCL_HOST_API constexpr __replace_copy_select(const _Tp& __old_value,
                                               const _Tp& __new_value) noexcept(is_nothrow_copy_constructible_v<_Tp>)
    : __old_value_(__old_value)
    , __new_value_(__new_value)
{}
Contributor
Suggestion: just use aggregate init and drop the ctor.
Contributor
Author
This does not work with all supported CTK versions.
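To make the trade-off in this thread concrete, here is a minimal host-only sketch of a functor with an explicit constructor, mirroring the shape of the one in the diff. Everything beyond the constructor is an assumption: the name `replace_copy_select_sketch`, the `operator()` body, and the use of `std::transform` to apply it are illustrative only and are not taken from the PR.

```cpp
#include <algorithm>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the functor under review (names are assumptions).
template <class T>
struct replace_copy_select_sketch
{
  T old_value_;
  T new_value_;

  // Explicit constructor, as in the diff. Dropping it would make the type an
  // aggregate, which the author reports does not work with every supported CTK.
  constexpr replace_copy_select_sketch(const T& old_value, const T& new_value) noexcept(
    std::is_nothrow_copy_constructible_v<T>)
      : old_value_(old_value)
      , new_value_(new_value)
  {}

  // Assumed call operator: yields new_value_ for elements equal to
  // old_value_, and the element itself otherwise.
  constexpr T operator()(const T& value) const
  {
    return value == old_value_ ? new_value_ : value;
  }
};

// Applies the functor element-wise; std::transform is used purely for illustration.
inline std::vector<int> apply_select_sketch(const std::vector<int>& in, int old_value, int new_value)
{
  std::vector<int> out(in.size());
  std::transform(in.begin(), in.end(), out.begin(), replace_copy_select_sketch<int>(old_value, new_value));
  return out;
}
```

With the constructor in place, the functor is built with ordinary parenthesized initialization, so it does not rely on aggregate initialization being available.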
Comment on lines +52 to +56
_CCCL_HOST_API constexpr __replace_copy_if_select(_UnaryPred __pred, const _Tp& __new_value) noexcept(
  is_nothrow_move_constructible_v<_UnaryPred> && is_nothrow_copy_constructible_v<_Tp>)
    : __pred_(__pred)
    , __new_value_(__new_value)
{}
Contributor
Author
Ditto, this does not work everywhere.