Add libcu++ dependency; initial round of `NV_IF_TARGET` ports. #1605

alliepiper · 2022-01-24T23:11:23Z

This PR contains an initial set of changes necessary to migrate Thrust and CUB to `NV_IF_TARGET` and remove the dependence on `__CUDA_ARCH__`. It does not remove all usages of `__CUDA_ARCH__`, but rather focuses on the following:

- Establish the libcu++ dependency for both Thrust and CUB.
- Remove obsolete checks for unsupported CUDA architectures.
- Migrate host/device divergent code from `#ifdef __CUDA_ARCH__` to use `NV_IF_TARGET`.
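For readers unfamiliar with the pattern, the migration in the last item can be sketched roughly as follows. This is an illustrative sketch, not code from this PR: `execution_space`, `check_host_branch`, and the `SKETCH_*` helpers are hypothetical names. When `<nv/target>` is not on the include path, a host-only stand-in for `NV_IF_TARGET` is defined so the snippet still builds with a plain C++ compiler; that stand-in is an assumption of the sketch and is not how libcu++ implements the macro.

```cpp
#include <cassert>
#include <cstring>

// Use the real macros when libcu++'s <nv/target> is available.
#if defined(__has_include)
#  if __has_include(<nv/target>)
#    include <nv/target>
#  endif
#endif

// Hypothetical host-only stand-in (NOT libcu++'s implementation): it only
// understands NV_IS_HOST / NV_IS_DEVICE and always keeps the host block.
#ifndef NV_IF_TARGET
#  define SKETCH_STRIP(...) __VA_ARGS__
#  define SKETCH_PICK_NV_IS_HOST(true_block, false_block) SKETCH_STRIP true_block
#  define SKETCH_PICK_NV_IS_DEVICE(true_block, false_block) SKETCH_STRIP false_block
#  define NV_IF_TARGET(cond, true_block, false_block) \
     SKETCH_PICK_##cond(true_block, false_block)
#endif

// Hypothetical helper: only CUDA compilers know the execution space qualifiers.
#if defined(__CUDACC__)
#  define SKETCH_HOST_DEVICE __host__ __device__
#else
#  define SKETCH_HOST_DEVICE
#endif

// Before (legacy pattern this PR moves away from):
//
//   #ifdef __CUDA_ARCH__
//     return "device";
//   #else
//     return "host";
//   #endif
//
// After: each branch is a parenthesized block passed to NV_IF_TARGET.
SKETCH_HOST_DEVICE inline const char* execution_space()
{
  NV_IF_TARGET(NV_IS_HOST,
               (return "host";),
               (return "device";));
}

// Host-side check: when called from host code, the host branch is selected.
inline void check_host_branch()
{
  assert(std::strcmp(execution_space(), "host") == 0);
}
```

Calling `check_host_branch()` from ordinary host code exercises the host branch; the device branch is only reachable when the function is compiled for and called from device code.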

This also includes various bug fixes for issues exposed by the above.

Future PRs will address the remaining usages of `__CUDA_ARCH__` in the CDP macros and the kernel dispatch infrastructure.

Pre-written Release Notes

Breaking Changes

#1605: Add libcu++ dependency.
#1605: The following macros are no longer defined by default. They can be re-enabled by defining `THRUST_PROVIDE_LEGACY_ARCH_MACROS`. These will be removed completely in a future release.
- `THRUST_IS_HOST_CODE`: Replace with `NV_IF_TARGET`.
- `THRUST_IS_DEVICE_CODE`: Replace with `NV_IF_TARGET`.
- `THRUST_INCLUDE_HOST_CODE`: Replace with `NV_IF_TARGET`.
- `THRUST_INCLUDE_DEVICE_CODE`: Replace with `NV_IF_TARGET`.
- `THRUST_DEVICE_CODE`: Replace with `NV_IF_TARGET`.

Other Enhancements

#1605: Removed special case code for unsupported CUDA architectures.
#1605: Replace several usages of `__CUDA_ARCH__` with `<nv/target>` to handle host/device code divergence.

Bug Fixes

#1605: Fix some execution space warnings in the allocator library.

alliepiper · 2022-01-24T23:12:06Z

run tests

alliepiper · 2022-01-25T06:25:38Z

run tests

alliepiper · 2022-02-04T16:42:11Z

run tests

alliepiper · 2022-02-04T22:17:43Z

run tests

alliepiper · 2022-02-05T04:38:24Z

run tests

alliepiper · 2022-03-23T19:47:32Z

run tests

alliepiper · 2022-03-23T22:51:25Z

run tests

alliepiper · 2022-03-24T16:56:18Z

run tests

alliepiper · 2022-03-24T21:35:44Z

run tests

alliepiper · 2022-03-24T21:42:34Z

run tests

alliepiper · 2022-04-13T22:32:42Z

run tests

alliepiper · 2022-04-14T19:03:20Z

run tests

alliepiper · 2022-05-10T21:22:56Z

Rebased. Now that the version has been bumped to 2.0.0 we can start to seriously think about merging this.

run tests

gevtushenko

There are a few comments that I consider critical, but the underlying issues weren't introduced by this PR, so those comments shouldn't block it. The rest of the comments are optional.

testing/allocator.cu

thrust/system/cuda/config.h

thrust/system/cuda/detail/util.h

thrust/system/cuda/detail/core/util.h

thrust/system/cuda/detail/util.h

thrust/system/cuda/detail/core/util.h

gevtushenko · 2022-05-11T10:27:42Z

thrust/system/cuda/detail/malloc_and_free.h

 #include <thrust/system/cuda/detail/util.h>
 #include <thrust/system/detail/bad_alloc.h>
 #include <thrust/detail/malloc_and_free.h>

+#ifdef THRUST_CACHING_DEVICE_MALLOC


I don't see any documentation on this macro, is it still in use?

alliepiper · 2022-05-16T21:38:04Z

run tests

There's no way for a user to meaningfully use this, since libcudacxx is a required dependency. It is checked during the initial `find_package(Thrust)` call, before the user would have access to Thrust's CMake API. Updated the CMake README.md with instructions for using an explicit libcudacxx target.

- The `g_state` flag wasn't reset between executions.
- The `destroy` method was being invoked in the current host system, not the system that owned the allocated memory (always cpp). This broke on MSVC's OpenMP implementation, where it seemed to be asserting the `g_state` flag before it was updated by `destroy`. This only happened on MSVC when host system = OMP, and appears to be a bug/miscompile in MSVC (repro'd on 2019). Fixed by explicitly tagging the allocator system to cpp.
- Added check that `destroy` is not invoked on empty vectors.

alliepiper · 2022-05-16T22:05:33Z

run tests

alliepiper added the blocked Cannot make progress. label Jan 24, 2022

alliepiper self-assigned this Jan 24, 2022

alliepiper added this to Inbox in PR Tracking via automation Jan 24, 2022

alliepiper marked this pull request as draft January 24, 2022 23:11

alliepiper moved this from Inbox to Drafts in PR Tracking Jan 24, 2022

alliepiper added this to the 1.17.0 milestone Jan 25, 2022

alliepiper force-pushed the if_target_prep branch from fc0846e to b771cda Compare January 25, 2022 06:24

alliepiper force-pushed the if_target_prep branch 2 times, most recently from 058166e to 9065dda Compare February 4, 2022 16:42

alliepiper force-pushed the if_target_prep branch 2 times, most recently from 17cfee7 to e12b463 Compare February 4, 2022 22:17

alliepiper force-pushed the if_target_prep branch from e12b463 to d51b751 Compare February 5, 2022 04:37

alliepiper mentioned this pull request Feb 7, 2022

Add find_package and add_subdirectory CMake support. NVIDIA/libcudacxx#242

Merged

alliepiper force-pushed the if_target_prep branch from d51b751 to 2f2d141 Compare March 23, 2022 19:41

alliepiper force-pushed the if_target_prep branch from 2f2d141 to bbc43d0 Compare March 23, 2022 22:51

alliepiper force-pushed the if_target_prep branch from bbc43d0 to 193be7c Compare March 24, 2022 16:55

alliepiper force-pushed the if_target_prep branch from 193be7c to 42c32ad Compare March 24, 2022 21:34

alliepiper mentioned this pull request Mar 24, 2022

Add libcu++ dependency; initial round of NV_IF_TARGET ports. NVIDIA/cub#448

Merged

alliepiper changed the title ~~Draft: libcudacxx, if-target prep/testing~~ Add libcu++ dependency; initial round of NV_IF_TARGET ports. Mar 24, 2022

alliepiper marked this pull request as ready for review March 24, 2022 21:42

alliepiper mentioned this pull request Apr 14, 2022

Update CDP support macros for if-target compatibility #1661

Merged

alliepiper modified the milestones: 1.17.0, 2.0.0 Apr 25, 2022

brycelelbach mentioned this pull request Nov 8, 2023

Switch to cuda::std::tuple/<cuda/std/tuple> NVIDIA/cccl#742

Closed

2 tasks

robertmaynard approved these changes May 3, 2022


alliepiper force-pushed the if_target_prep branch from e9953c8 to 50316c7 Compare May 10, 2022 21:20

gevtushenko approved these changes May 11, 2022


alliepiper added 2 commits May 16, 2022 17:31

Add libcudacxx submodule, initialized to version 1.8.0.

97e63f9

Style fixes for thrust-config.cmake.

b19385a

alliepiper force-pushed the if_target_prep branch from 50316c7 to c9fe022 Compare May 16, 2022 21:32

alliepiper moved this from Need Review to Tests Pending in PR Tracking May 16, 2022

alliepiper added 8 commits May 16, 2022 18:05

Bump CUB for NV_IF_TARGET refactor.

807e9e0

Remove checks for obsolete architectures.

9e4f0a3

Refactor to use NV_IF_TARGET.

3ea8940

Remove unreachable code.

fdcd8e1

Initialize members in cuda_optional detail class.

59a72c0

Fix some new and exciting exec_space [subobject] warnings.

dd561bf

alliepiper force-pushed the if_target_prep branch from c9fe022 to 4cdf6de Compare May 16, 2022 22:05

alliepiper merged commit b223930 into NVIDIA:main May 17, 2022

PR Tracking automation moved this from Tests Pending to Done May 17, 2022

alliepiper deleted the if_target_prep branch May 17, 2022 17:48

alliepiper mentioned this pull request May 20, 2022

Enable usage of libcu++ in Thrust/CUB #1213

Closed

3 tasks

alliepiper mentioned this pull request Mar 1, 2022

Update to new NV_IF_TARGET macros NVIDIA/cccl#760

Open
