Add NVTX support and RMM_FUNC_RANGE() macro #1558

harrism · 2024-05-09T03:12:24Z

Description

Let's get RMM allocate/deallocate calls showing up in profiler timelines.

Closes #495

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

CMakeLists.txt

bdice · 2024-05-09T14:44:22Z

cmake/thirdparty/get_nvtx.cmake

+# =============================================================================
+
+# This function finds NVTX and sets any additional necessary environment variables.
+function(find_and_configure_nvtx)


We should probably set up NVTX in rapids-cmake since we have three repos using the same code (rmm, cudf, cuspatial -- can't recall if we finished cuspatial but it's supposed to use this).

I agree, I was surprised to find it wasn't already in rapids-cmake.

rapidsai/rapids-cmake#606

bdice · 2024-05-09T14:58:27Z

The build failure shows:

    /usr/bin/sccache /home/coder/.conda/envs/rapids/bin/x86_64-conda-linux-gnu-c++ -DFMT_HEADER_ONLY=1 -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE -DSPDLOG_ACTIVE_LEVEL=SPDLOG_LEVEL_ -DSPDLOG_FMT_EXTERNAL -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -D_torch_allocator_EXPORTS -I/home/coder/rmm/build/conda/cuda-12.2/release/_deps/cccl-src/thrust/thrust/cmake/../.. -I/home/coder/rmm/build/conda/cuda-12.2/release/_deps/cccl-src/libcudacxx/lib/cmake/libcudacxx/../../../include -I/home/coder/rmm/build/conda/cuda-12.2/release/_deps/cccl-src/cub/cub/cmake/../.. -isystem /home/coder/rmm/include -isystem /home/coder/rmm/build/conda/cuda-12.2/release/_deps/fmt-src/include -isystem /home/coder/rmm/build/conda/cuda-12.2/release/_deps/spdlog-src/include -fvisibility-inlines-hidden -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/coder/.conda/envs/rapids/include  -I/home/coder/.conda/envs/rapids/targets/x86_64-linux/include  -L/home/coder/.conda/envs/rapids/targets/x86_64-linux/lib -L/home/coder/.conda/envs/rapids/targets/x86_64-linux/lib/stubs -O3 -DNDEBUG -fPIC -MD -MT rmm/_lib/CMakeFiles/_torch_allocator.dir/_torch_allocator.cpp.o -MF rmm/_lib/CMakeFiles/_torch_allocator.dir/_torch_allocator.cpp.o.d -o rmm/_lib/CMakeFiles/_torch_allocator.dir/_torch_allocator.cpp.o -c /home/coder/rmm/python/rmm/rmm/_lib/_torch_allocator.cpp
    In file included from /home/coder/rmm/include/rmm/mr/device/device_memory_resource.hpp:20,
                     from /home/coder/rmm/include/rmm/mr/device/cuda_memory_resource.hpp:20,
                     from /home/coder/rmm/include/rmm/mr/device/per_device_resource.hpp:21,
                     from /home/coder/rmm/python/rmm/rmm/_lib/_torch_allocator.cpp:19:
    /home/coder/rmm/include/rmm/detail/nvtx/ranges.hpp:19:10: fatal error: nvtx3/nvtx3.hpp: No such file or directory
       19 | #include <nvtx3/nvtx3.hpp>
          |          ^~~~~~~~~~~~~~~~~
    compilation terminated.

This is because we're using NVTX as a BUILD_LOCAL_INTERFACE, so it doesn't seem to get picked up by the rmm Python package from the librmm header-only library:

target_link_libraries(rmm INTERFACE $<BUILD_LOCAL_INTERFACE:nvtx3-cpp>)

I know we had to enable this for cuDF in rapidsai/cudf#15271 because not doing so led to breakage in static builds. I don't know how this should be handled for RMM as a header-only library. Maybe we just need NVTX to have a normal INTERFACE linkage, since it's required by public RMM headers?

edit: I checked with @KyleFromNVIDIA and attempted this change in 702915b.

jrhemstad · 2024-05-13T17:35:36Z

include/rmm/detail/nvtx/ranges.hpp

It really makes me happy to see how simple this is :)

harrism · 2024-05-22T01:51:04Z

Depends on rapidsai/rapids-cmake#606

wence-

Looks good!

wence- · 2024-05-22T09:16:18Z

include/rmm/detail/nvtx/ranges.hpp

@@ -0,0 +1,61 @@
+/*
+ * Copyright (c) 2020-2024, NVIDIA CORPORATION.


Suggested change

- * Copyright (c) 2020-2024, NVIDIA CORPORATION.
+ * Copyright (c) 2024, NVIDIA CORPORATION.

Fixed, thanks.

wence- · 2024-05-22T09:17:37Z

include/rmm/mr/device/device_memory_resource.hpp

@@ -117,6 +118,7 @@ class device_memory_resource {
   */
  void* allocate(std::size_t bytes, cuda_stream_view stream = cuda_stream_view{})
  {
+    RMM_FUNC_RANGE();


question: Do we want the same set of annotations for the host_memory_resource as well?

Yes. I added them. Note that the new pinned_host_memory_resource does not derive from host_memory_resource so I had to add ranges to it explicitly. This is the way it will have to be done for all MRs that just implement the concepts in the future, unfortunately.
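
To make the point about base-class versus concept-style resources concrete, here is a minimal sketch that does not use NVTX at all. Everything in it is hypothetical: `scoped_range`, `SKETCH_FUNC_RANGE()`, `range_log()` and the three resource classes are stand-ins invented for illustration, and the range simply appends "push"/"pop" entries to a vector where a real range would notify the profiler. RMM's actual `RMM_FUNC_RANGE()` lives in include/rmm/detail/nvtx/ranges.hpp and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical event log standing in for a profiler timeline.
inline std::vector<std::string>& range_log()
{
  static std::vector<std::string> log;
  return log;
}

// Hypothetical RAII range: records "push" on construction, "pop" on destruction.
class scoped_range {
 public:
  explicit scoped_range(char const* name) : name_{name}
  {
    assert(name_ != nullptr);
    range_log().push_back(std::string{"push "} + name_);
  }
  ~scoped_range() { range_log().push_back(std::string{"pop "} + name_); }
  scoped_range(scoped_range const&)            = delete;
  scoped_range& operator=(scoped_range const&) = delete;

 private:
  char const* name_;
};

// Hypothetical macro: names the range after the enclosing function.
#define SKETCH_FUNC_RANGE() scoped_range sketch_range_{__func__}

// Base-class style: the public wrappers are annotated once, and every derived
// resource gets the ranges without any extra code.
class base_resource {
 public:
  virtual ~base_resource() = default;
  void* allocate(std::size_t bytes)
  {
    SKETCH_FUNC_RANGE();
    return do_allocate(bytes);
  }
  void deallocate(void* ptr)
  {
    SKETCH_FUNC_RANGE();
    do_deallocate(ptr);
  }

 private:
  virtual void* do_allocate(std::size_t bytes) = 0;
  virtual void do_deallocate(void* ptr)        = 0;
};

class malloc_resource final : public base_resource {
  void* do_allocate(std::size_t bytes) override { return std::malloc(bytes); }
  void do_deallocate(void* ptr) override { std::free(ptr); }
};

// Concept style: there is no shared base class, so the resource has to place
// the macro in its own member functions.
class standalone_resource {
 public:
  void* allocate(std::size_t bytes)
  {
    SKETCH_FUNC_RANGE();
    return std::malloc(bytes);
  }
  void deallocate(void* ptr)
  {
    SKETCH_FUNC_RANGE();
    std::free(ptr);
  }
};
```

Calling `allocate` then `deallocate` on either resource leaves a balanced push/pop pair per call in `range_log()`. The only difference is where the annotation has to be written, which is the maintenance cost described above.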

Adds `rapids_cpm_nvtx3`, adapted from the code that is currently implemented in libcudf. This aims to help with rapidsai/rmm#1558.

Authors:
- Bradley Dice (https://github.com/bdice)
- Mark Harris (https://github.com/harrism)
- Robert Maynard (https://github.com/robertmaynard)

Approvers:
- Robert Maynard (https://github.com/robertmaynard)

URL: #606

vyasr

Couple of small things. I would like to hear Robert's opinion on one since I think it's perpetuating a form of tech debt that we have throughout RAPIDS CMake code and I'd like to establish a different practice, but it shouldn't block merging so feel free to ping me tomorrow if it isn't addressed by the freeze deadline.

CMakeLists.txt

vyasr · 2024-05-22T23:22:35Z

cmake/thirdparty/get_nvtx.cmake

+
+endfunction()
+
+find_and_configure_nvtx3()


TBH I find these extra modules to be overkill when they're so trivial and would prefer that we inline them. In a few of the more recent cases we have done so. @robertmaynard WDYT? This feels like a pattern that's been copy-pasted many times into a number of scenarios where it isn't really needed.

Specifically I'm talking about just moving lines 18/19 into CMakeLists.txt and removing this file.

While I agree, I also feel that this PR is consistent with the style of 3 other rapids-cmake packages used in RMM (CCCL, fmt, spdlog -- which has additional local cmake code for some reason). Therefore I would like to defer this change to a later PR, which should be applied in unison with the same style change across all of RAPIDS.

vyasr

CMake approval.

harrism · 2024-05-23T00:00:34Z

/merge

bdice · 2024-05-23T12:49:58Z

CMakeLists.txt

 target_compile_features(rmm INTERFACE cxx_std_17 $<BUILD_INTERFACE:cuda_std_17>)
 target_compile_definitions(rmm INTERFACE LIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE)

+# Disable NVTX if necessary
+if(NOT USE_NVTX)
+  target_compile_definitions(rmm INTERFACE NVTX_DISABLE)


The option for disabling NVTX does not appear to be used. This seems like a bug.

It's an NVTX feature: the NVTX headers check for NVTX_DISABLE themselves, so RMM only needs to define it.
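
As a rough illustration of how a compile definition can be consumed entirely inside a library's own headers, the sketch below uses a hypothetical `SKETCH_ANNOTATIONS_DISABLE` macro and a hypothetical `annotation` type. It is not NVTX code; it only shows the general pattern in which the consuming project defines a macro and never refers to it again in its own sources.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log standing in for whatever the annotation library records.
inline std::vector<std::string>& annotation_log()
{
  static std::vector<std::string> log;
  return log;
}

// Hypothetical "library header" section. Defining SKETCH_ANNOTATIONS_DISABLE
// before this point (for example with a -D compiler flag) swaps the recording
// type for an empty one.
#ifdef SKETCH_ANNOTATIONS_DISABLE
struct annotation {
  explicit annotation(char const*) {}
};
constexpr bool annotations_enabled = false;
#else
struct annotation {
  explicit annotation(char const* name)
  {
    assert(name != nullptr);
    annotation_log().push_back(name);
  }
};
constexpr bool annotations_enabled = true;
#endif

// Consumer code is identical in both configurations and never mentions the
// disable macro itself.
inline int annotated_work(int x)
{
  annotation a{__func__};
  return x * 2;
}
```

With the macro undefined, each call to `annotated_work` appends one entry to `annotation_log()`. With it defined, the same source compiles to a function that records nothing.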

harrism added 3 commits May 9, 2024 02:56

Add nvtx to cmake

378d5ea

add RMM_FUNC_RANGE

4f8d767

Use RMM_FUNC_RANGE

9546d45

harrism added feature request New feature or request non-breaking Non-breaking change labels May 9, 2024

github-actions bot added CMake cpp Pertains to C++ code labels May 9, 2024

bdice reviewed May 9, 2024

Fix style and fix cudf to rmm.

500d07e

bdice reviewed May 9, 2024

Don't use BUILD_LOCAL_INTERFACE.

702915b

bdice mentioned this pull request May 10, 2024

Add rapids_cpm_nvtx3. rapidsai/rapids-cmake#606

Merged

7 tasks

bdice added 4 commits May 13, 2024 11:15

Rename to nvtx3.

5b1db3d

Merge branch 'branch-24.06' into fea-add-nvtx

7ff9735

Use testing branch of rapids-cmake.

31b93c9

Merge branch 'fea-add-nvtx' of https://github.com/harrism/rmm into fea-add-nvtx

ef29637

bdice reviewed May 13, 2024

jrhemstad reviewed May 13, 2024

harrism mentioned this pull request May 22, 2024

[REVIEW] Add NVTX ranges to RMM #336

Closed

harrism marked this pull request as ready for review May 22, 2024 01:51

harrism requested review from a team as code owners May 22, 2024 01:51

harrism requested review from rongou and jrhemstad May 22, 2024 01:51

wence- approved these changes May 22, 2024

harrism added 2 commits May 22, 2024 21:48

Remove temporary reference to bdice rapids-cmake branch

7abab23

Copyright

8ede052

Add NVTX ranges to host MRs

6770320

vyasr reviewed May 22, 2024

PUBLIC -> INTERFACE

275cd52

vyasr approved these changes May 22, 2024

rapids-bot bot merged commit 4e9482f into rapidsai:branch-24.06 May 23, 2024
57 checks passed

harrism mentioned this pull request May 23, 2024

Inline contents of simple cmake/thirdparty files rapidsai/build-planning#64

Open

bdice reviewed May 23, 2024
