This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Fix <chrono> and <atomic> build errors with clang-cuda. #304

Merged
merged 1 commit into main from bugfix/clang_cuda_atomic_support
Oct 11, 2022

Conversation

wmaxey
Member

@wmaxey wmaxey commented Aug 19, 2022

This fixes compilation in clang-cuda when atomics are included.

EDIT:
Testing support has been split out into a separate branch, which will also include the Docker changes needed to actually run these tests.

@wmaxey wmaxey requested review from griwes and miscco August 19, 2022 04:03
@wmaxey wmaxey self-assigned this Aug 19, 2022
@wmaxey wmaxey added this to the 1.9.0 milestone Aug 19, 2022
@@ -1662,7 +1662,7 @@ extern "C" _LIBCUDACXX_FUNC_VIS void __sanitizer_annotate_contiguous_container(
#endif

// CUDA Atomics supersede host atomics in order to insert the host/device dispatch layer
#if defined(_LIBCUDACXX_COMPILER_NVCC) || defined(_LIBCUDACXX_COMPILER_NVRTC) || defined(_LIBCUDACXX_COMPILER_PGI)
#if defined(_LIBCUDACXX_COMPILER_NVCC) || defined(_LIBCUDACXX_COMPILER_NVRTC) || defined(_LIBCUDACXX_COMPILER_PGI) || defined(__CUDACC__)
Collaborator

Should we add a _LIBCUDACXX_COMPILER_CLANG_CUDA definition that we can reuse elsewhere?

Member Author

I don't see why not. I'll edit that in.
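
For illustration, a minimal sketch of what such a reusable detection macro could look like. The name _LIBCUDACXX_COMPILER_CLANG_CUDA comes from the suggestion above; its placement and the surrounding check are assumptions, not the final code in this PR:

// Hypothetical: detect clang compiling in CUDA mode once ...
#if defined(__clang__) && defined(__CUDA__)
#  define _LIBCUDACXX_COMPILER_CLANG_CUDA
#endif

// ... then reuse it wherever the host/device dispatch layer must be enabled.
#if defined(_LIBCUDACXX_COMPILER_NVCC) || defined(_LIBCUDACXX_COMPILER_NVRTC) \
 || defined(_LIBCUDACXX_COMPILER_PGI)  || defined(_LIBCUDACXX_COMPILER_CLANG_CUDA)
// CUDA atomics supersede host atomics here.
#endif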

# define SHARED
#endif

#if defined(__clang__) && defined(__CUDA__)
Member Author

We generally try to refrain from using libcudacxx macros in the tests. This allows us to ensure the behavior matches up.
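
As a rough illustration of that policy (the names and branch structure below are assumptions, not the code in this hunk), a test helper spells out the compiler check directly instead of reusing the library's detection macro:

// Hypothetical test-side guard: check the raw compiler macros (__CUDACC__ for
// NVCC, __clang__ && __CUDA__ for clang-cuda) rather than _LIBCUDACXX_COMPILER_*,
// so the test independently confirms the library's own detection logic.
#if defined(__CUDACC__) || (defined(__clang__) && defined(__CUDA__))
#  define SHARED __shared__
#else
#  define SHARED
#endif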


set(LIBCUDACXX_TEST_LINKER_FLAGS
"${LIBCUDACXX_TEST_LINKER_FLAGS} \
-L${CUDA_TOOLKIT_ROOT_DIR}/lib64 -lcuda -lcudart")
Collaborator

You should use:

find_package(CUDAToolkit)
target_link_libraries(tests PRIVATE CUDA::cuda_driver CUDA::cudart)

Member Author

You're a saint, Robert.

I thought this was done for me when enabling CUDA as a language. That's not the case?

Collaborator

If CMake is generating the build files, it will make sure that cudart is part of the link line (with controls over static vs. shared linking). It seems that in your case you are passing this information down to lit, which does the build generation, so you might need to stick with the 'bad' (original) approach.

@@ -42,6 +42,18 @@ set(LIBCUDACXX_TEST_COMPILER_FLAGS
-I${CMAKE_SOURCE_DIR}/include"
CACHE INTERNAL "Flags for libcxx testing." FORCE)

if (${CMAKE_CUDA_COMPILER_ID} STREQUAL "Clang")
set(LIBCUDACXX_TEST_COMPILER_FLAGS
Collaborator

string(APPEND ...) is the preferred CMake style.

@@ -101,7 +105,7 @@ struct device_shared_memory_provider {

__device__
T * get() {
__shared__ alignas(T) char buffer[shared_offset];
alignas(T) __shared__ char buffer[shared_offset];
Collaborator

This being needed is really funny to me.
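
For reference, a small device-code sketch of the pattern this hunk fixes (the helper name below is made up; the point is only the specifier order). The original ordering failed to build under clang-cuda, while NVCC accepted either form:

#include <cstddef>

// Sketch: the alignment specifier has to precede the __shared__ qualifier
// for clang-cuda to accept the declaration.
template <class T, std::size_t N>
__device__ T* shared_buffer()
{
    alignas(T) __shared__ char buffer[N * sizeof(T)];    // builds with clang-cuda
    // __shared__ alignas(T) char buffer[N * sizeof(T)]; // rejected by clang-cuda
    return reinterpret_cast<T*>(buffer);
}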

@@ -146,6 +146,9 @@ def _initTypeAndVersion(self):
if self.type == 'nvcc':
# Treat C++ as CUDA when the compiler is NVCC.
self.source_lang = 'cu'
elif self.type == 'clang':
# Treate C++ as clang-cuda when the compiler is Clang.
Collaborator

typo ("treate")

Also "treat c++ as clang-cuda" doesn't particularly make sense, did you mean to say "as CUDA" as above?

Comment on lines 754 to 761
# I don't think this is required, since removing it helps clang-cuda compile and libcudacxx only supports building in CUDA modes?
# if self.cxx.type != 'nvcc' and self.cxx.type != 'pgi':
# self.cxx.compile_flags += ['-nostdinc++']
Collaborator

Yeah I think you are right. Change the comment to state this as a fact, with a note to reenable it if we ever decide to start actually testing in a C++ mode.

self.cxx.link_flags += [abs_path]
else:
self.cxx.link_flags += ['-lc++']
# Device code does not have binary components, don't link libc++
Collaborator

Same as above, add a note to reenable this if we decide to start testing in a C++ mode.

@@ -50,7 +50,7 @@ system_clock::time_point system_clock::now() _NOEXCEPT
{
#ifdef __CUDA_ARCH__
uint64_t __time;
asm volatile("mov.u64 %0, %globaltimer;":"=l"(__time)::);
asm volatile("mov.u64 %0, %%globaltimer;":"=l"(__time)::);
Collaborator

I am really disappointed that NVCC accepts the old inline asm.
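
For reference, a self-contained sketch of the corrected pattern (device-side CUDA; the function name is made up):

// Reading the PTX %globaltimer special register from device code. In
// GCC/Clang-style extended asm a literal '%' must be written as "%%" so it is
// not parsed as the start of an operand reference; NVCC tolerates the
// unescaped form, which is why the bug only showed up under clang-cuda.
__device__ unsigned long long read_globaltimer()
{
    unsigned long long t;
    asm volatile("mov.u64 %0, %%globaltimer;" : "=l"(t));
    return t;
}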

@wmaxey wmaxey force-pushed the bugfix/clang_cuda_atomic_support branch from 525abbe to 676511d on October 11, 2022 20:05
@wmaxey wmaxey force-pushed the bugfix/clang_cuda_atomic_support branch from 676511d to 009be41 on October 11, 2022 20:15
Member Author

wmaxey commented Oct 11, 2022

I've split the fixes out from the CMake changes. I'll move those into another branch and add them to some Dockerfile steps later for proper testing support. I haven't done that here because they have the side effect of breaking our containers, so I'm just merging the fixes instead.

@wmaxey wmaxey changed the title from "Add support for testing libcudacxx with clang-cuda enabled." to "Fix <chrono> and <atomic> build errors with clang-cuda." on Oct 11, 2022
@wmaxey wmaxey merged commit 2253815 into main Oct 11, 2022
@wmaxey wmaxey deleted the bugfix/clang_cuda_atomic_support branch October 11, 2022 22:47