Support CUDA compilation with Clang compiler #161

jrmadsen · 2021-03-09T16:30:22Z

- Support setting CMAKE_CUDA_COMPILER=clang++ or, via the environment, CUDACXX=clang++
- Improved automatic CUDA architecture detection
- Increased Kokkos testing
- Renamed generic macros such as GLOBAL_CALLABLE to TIMEMORY_GLOBAL_FUNCTION, etc.
- Fixed kokkosp weak-binding symbols on macOS
- Renamed generic macros _UNIX, _MACOS, etc. to TIMEMORY_UNIX, TIMEMORY_MACOS, etc.
  - The previous macros are still defined but will be removed eventually
- Replaced __NVCC__ with __CUDACC__
  - Define _TIMEMORY_OPENMP_TARGET when compiling timemory for OpenMP target, since OpenMP target compilation may define __CUDACC__ but does not allow compiling raw CUDA
- TIMEMORY_FOLD_EXPRESSION(...) now implements C++17 fold expressions instead of the C++14 fold expression via initializer list
- Miscellaneous CI changes to reduce turnover
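The fold-expression change in the list above can be illustrated with a minimal sketch. `DEMO_FOLD_EXPRESSION` and `collect` are hypothetical names chosen for this example, and selecting the branch on `__cplusplus` is an assumption; the actual definition of TIMEMORY_FOLD_EXPRESSION may differ.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// DEMO_FOLD_EXPRESSION is a hypothetical stand-in, not timemory's macro.
#if __cplusplus >= 201703L
// C++17: unary fold over the comma operator.
#    define DEMO_FOLD_EXPRESSION(...) ((__VA_ARGS__), ...)
#else
// C++14: expand the pack inside a braced initializer list, which
// evaluates its elements left to right; the list itself is discarded.
#    define DEMO_FOLD_EXPRESSION(...)                                        \
        (void) std::initializer_list<int> { ((__VA_ARGS__), 0)... }
#endif

// Hypothetical helper: applies one statement to every pack element.
template <typename... Args>
std::vector<int> collect(Args... args)
{
    std::vector<int> out;
    DEMO_FOLD_EXPRESSION(out.push_back(args));
    return out;
}
```

Calling `collect(1, 2, 3)` appends the arguments in order under either branch, so callers do not need to know which language standard is in effect.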

- protect NVCC flags with gen-expr
- GPU "CALLABLE" -> TIMEMORY macros
- Use __CUDACC__ instead of __NVCC__
- _TIMEMORY_OPENMP_TARGET guards
- disable cupti_counters for CUDA 11
- define malloc_gotcha types explicitly
- threading::affinity::set
- CI testing for cuda + clang
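The switch from __NVCC__ to __CUDACC__ and the OpenMP target guard might look roughly like the sketch below. All `DEMO_*` names and the `square` function are hypothetical and are not timemory's macros; `DEMO_OPENMP_TARGET` only mimics the role the PR gives _TIMEMORY_OPENMP_TARGET. The sketch relies on nvcc defining both __NVCC__ and __CUDACC__, while clang in CUDA mode defines __CUDACC__ (plus __clang__ and __CUDA__) but not __NVCC__.

```cpp
#include <cassert>

// All DEMO_* names are hypothetical stand-ins, not timemory's macros.

// __CUDACC__ is set by both nvcc and clang during CUDA compilation, so it
// is the portable check. DEMO_OPENMP_TARGET would be set by the build when
// an OpenMP target compiler defines __CUDACC__ but rejects raw CUDA.
#if defined(__CUDACC__) && !defined(DEMO_OPENMP_TARGET)
#    define DEMO_HOST_DEVICE_FUNCTION __host__ __device__
#    define DEMO_CUDA_ENABLED 1
#else
#    define DEMO_HOST_DEVICE_FUNCTION
#    define DEMO_CUDA_ENABLED 0
#endif

// Tell the two CUDA compilers apart only where it actually matters.
#if defined(__NVCC__)
#    define DEMO_CUDA_COMPILER "nvcc"
#elif defined(__clang__) && defined(__CUDA__)
#    define DEMO_CUDA_COMPILER "clang"
#else
#    define DEMO_CUDA_COMPILER "none"
#endif

// Compiles as a plain host function when CUDA is unavailable.
DEMO_HOST_DEVICE_FUNCTION inline int square(int x) { return x * x; }

inline const char* cuda_compiler() { return DEMO_CUDA_COMPILER; }
```

Keeping the decorator behind one macro means the same header builds with nvcc, with clang in CUDA mode, and with an ordinary host compiler.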

- _UNIX == TIMEMORY_UNIX
- _LINUX == TIMEMORY_LINUX
- _WINDOWS == TIMEMORY_WINDOWS
- _MACOS == TIMEMORY_MACOS
- Added tests when kokkos sample is built
- Updated kokkos-common memory_entry_t to use user_kokkosp_bundle
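One way to rename platform macros while keeping the old names alive is sketched below. The `DEMO_*` names, the detection conditions, and the helper functions are assumptions made for illustration; they are not copied from timemory, whose real macros are TIMEMORY_UNIX, TIMEMORY_LINUX, TIMEMORY_WINDOWS and TIMEMORY_MACOS.

```cpp
#include <cassert>
#include <string>

// DEMO_* names are hypothetical; detection conditions are assumptions.
#if defined(_WIN32) || defined(_WIN64)
#    define DEMO_WINDOWS 1
#elif defined(__APPLE__)
#    define DEMO_MACOS 1
#    define DEMO_UNIX 1
#elif defined(__linux__)
#    define DEMO_LINUX 1
#    define DEMO_UNIX 1
#endif

// Backward-compatible alias: the old generic name keeps working until it
// is removed, but it is always derived from the prefixed macro.
#if defined(DEMO_UNIX) && !defined(DEMO_LEGACY_UNIX)
#    define DEMO_LEGACY_UNIX DEMO_UNIX
#endif

inline const char* platform_name()
{
#if defined(DEMO_WINDOWS)
    return "windows";
#elif defined(DEMO_MACOS)
    return "macos";
#elif defined(DEMO_LINUX)
    return "linux";
#else
    return "unknown";
#endif
}

// True when the legacy alias agrees with the prefixed macro (or when
// neither is defined on this platform).
constexpr bool legacy_alias_matches()
{
#if defined(DEMO_UNIX)
    return DEMO_LEGACY_UNIX == DEMO_UNIX;
#else
    return true;
#endif
}
```

Prefixed names avoid collisions with macros from other libraries, and the alias gives downstream code time to migrate.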

- TIMEMORY_LIBRARY_SOURCE
- Fix visibility for kp_timemory on linux
- Fix weak binding on macOS
- Better data_tracker_tests

codecov · 2021-03-09T18:32:51Z

Codecov Report

Merging #161 (5277e0f) into develop (0653050) will increase coverage by 0.11%.
The diff coverage is 95.66%.

@@             Coverage Diff             @@
##           develop     #161      +/-   ##
===========================================
+ Coverage    82.60%   82.70%   +0.11%     
===========================================
  Files          240      240              
  Lines        16238    16245       +7     
===========================================
+ Hits         13411    13434      +23     
+ Misses        2827     2811      -16

Impacted Files	Coverage Δ
source/kokkosp.cpp	`96.08% <ø> (+6.54%)`	⬆️
source/library.cpp	`75.62% <ø> (ø)`
source/pthread.cpp	`15.79% <ø> (ø)`
source/timemory/api/kokkosp.hpp	`100.00% <ø> (ø)`
source/timemory/backends/cpu.hpp	`85.19% <ø> (ø)`
source/timemory/backends/cupti.hpp	`70.17% <ø> (ø)`
source/timemory/backends/memory.hpp	`100.00% <ø> (ø)`
source/timemory/backends/papi.hpp	`83.55% <ø> (ø)`
source/timemory/backends/process.hpp	`100.00% <ø> (ø)`
source/timemory/backends/threading.hpp	`77.03% <ø> (ø)`
... and 41 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0653050...5277e0f.

* Added support for cuda compilation with clang
* Miscellaneous updates for clang + cuda
  - protect NVCC flags with gen-expr
  - GPU "CALLABLE" -> TIMEMORY macros
  - Use __CUDACC__ instead of __NVCC__
  - _TIMEMORY_OPENMP_TARGET guards
  - disable cupti_counters for CUDA 11
  - define malloc_gotcha types explicitly
  - threading::affinity::set
  - CI testing for cuda + clang
* Removed error check
* Macro updates + kokkos tests
  - _UNIX == TIMEMORY_UNIX
  - _LINUX == TIMEMORY_LINUX
  - _WINDOWS == TIMEMORY_WINDOWS
  - _MACOS == TIMEMORY_MACOS
  - Added tests when kokkos sample is built
  - Updated kokkos-common memory_entry_t to use user_kokkosp_bundle
* Update add_secondary.hpp
* CI updates
* Find launch_compiler component when finding Kokkos
* Fix to TIMEMORY_PYTHON_PLOTTER pre-processor def
* Use separable compilation for kokkos instead of launch_compiler
* cmake_defines fixes for default
* Update kokkos_compilation usage
* Enable HW counters for CUDA 11
* Kokkos-connector updates
* Remove kokkos_compilation macro usage
* More kokkos-connector updates
* Testing updates
* Miscellaneous testing fixes
  - TIMEMORY_LIBRARY_SOURCE
  - Fix visibility for kp_timemory on linux
  - Fix weak binding on macOS
  - Better data_tracker_tests

jrmadsen added 19 commits March 3, 2021 13:10

Added support for cuda compilation with clang

919d983

Removed error check

555a9da

Macro updates + kokkos tests

d8aada4

- _UNIX == TIMEMORY_UNIX
- _LINUX == TIMEMORY_LINUX
- _WINDOWS == TIMEMORY_WINDOWS
- _MACOS == TIMEMORY_MACOS
- Added tests when kokkos sample is built
- Updated kokkos-common memory_entry_t to use user_kokkosp_bundle

Update add_secondary.hpp

671391c

CI updates

160a637

Find launch_compiler component when finding Kokkos

da56fa1

Fix to TIMEMORY_PYTHON_PLOTTER pre-processor def

2a72069

Use separable compilation for kokkos instead of launch_compiler

7303818

cmake_defines fixes for default

1ccbd7f

Update kokkos_compilation usage

df7f9a8

Enable HW counters for CUDA 11

a342837

Fix to last commit

8ba3e14

Fix to last commit

1213937

Kokkos-connector updates

e043b35

Remove kokkos_compilation macro usage

574d5e6

More kokkos-connector updates

d3531b4

Testing updates

b3aa0e5

Miscellaneous testing fixes

5277e0f

- TIMEMORY_LIBRARY_SOURCE
- Fix visibility for kp_timemory on linux
- Fix weak binding on macOS
- Better data_tracker_tests

jrmadsen merged commit 63f2106 into develop Mar 9, 2021

jrmadsen deleted the clang-cuda-support branch March 9, 2021 20:58
