feat(example): add gpu-pipeline bridging capy with NVIDIA nvexec #290
Demonstrates that capy::await_sender and capy::as_sender compose with nvexec::stream_scheduler, not just CPU schedulers. Scene 1: a capy coroutine co_awaits a SAXPY __global__ kernel scheduled on a CUDA stream, with continues_on(cpu) landing completion on host before the bridge connects. Scene 2: capy's read_some is exposed as a stdexec sender, driven by sync_wait, with upon_error catching an injected eof.

Gated behind BOOST_CAPY_BUILD_NVEXEC_EXAMPLES (default OFF), which hard-errors if BOOST_CAPY_BUILD_STDEXEC_EXAMPLES is off or CMAKE_CXX_STANDARD < 23, then enables the CUDA language at the top level. Bridge headers are copied verbatim from bench/stdexec/ so example tweaks can land without disturbing the bench.

The README documents the working toolchain (clang as both host and CUDA compiler with CUDA_SEPARABLE_COMPILATION OFF). nvc++ 26.3 does not enable C++20 coroutines, so the nominally blessed nvexec compiler cannot compile capy.
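For readers unfamiliar with the operation Scene 1 runs on the GPU, SAXPY computes y[i] = a * x[i] + y[i] for every element. The following is a minimal host-side sketch of that arithmetic in plain C++, not the kernel from this PR; the name saxpy_reference is hypothetical and uses nothing from capy, stdexec, or nvexec.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference for SAXPY: y[i] = a * x[i] + y[i].
// Hypothetical helper for illustration only; it is not part of the PR,
// capy, stdexec, or nvexec.
void saxpy_reference(float a, const std::vector<float>& x, std::vector<float>& y)
{
    // Both vectors must have the same length.
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        y[i] = a * x[i] + y[i];
}
```

A reference like this could be used on the host, after the coroutine resumes on the CPU scheduler, to check the values the device kernel wrote back; whether the example performs such a check is not stated in the description.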
An automated preview of the documentation is available at https://290.capy.prtest3.cppalliance.org/index.html
If more commits are pushed to the pull request, the docs will rebuild at the same URL.
2026-05-28 16:16:56 UTC
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff

| | develop | #290 |
|---|---|---|
| Coverage | 92.27% | 92.27% |
| Files | 164 | 164 |
| Lines | 8862 | 8862 |
| Hits | 8177 | 8177 |
| Misses | 685 | 685 |
GCOVR code coverage report https://290.capy.prtest3.cppalliance.org/gcovr/index.html
Build time: 2026-05-28 16:36:29 UTC