Clang error compiling included C++ headers with GPU locale model #19754

milthorpe · 2022-05-04T10:02:54Z

Summary of Problem

A Chapel package module (ZMQ) that includes C++ code does not compile with the GPU locale model.

Steps to Reproduce

If I try to compile the arkouda code to check for ZeroMQ, which requires a C library wrapper, it works fine with the regular locale model:

milthorpe@equinox:/noback/milthorpe/chapel[main ?]$ chpl ../arkouda/dep/checkZMQ.chpl -I`spack location -i libzmq%gcc@11.1.0 arch=linux-ubuntu20.04-broadwell`/include -L`spack location -i libzmq%gcc@11.1.0 arch=linux-ubuntu20.04-broadwell`/lib
milthorpe@equinox:/noback/milthorpe/chapel[main !?]$ ./checkZMQ
Found ZMQ version: 4.3.3

but crashes when compiled with the GPU locale model:

milthorpe@equinox:/noback/milthorpe/chapel[main !?]$ CHPL_LLVM=bundled CHPL_CUDA_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/21.7/cuda/11.4 CHPL_LOCALE_MODEL=gpu chpl --gpu-arch sm_60 --no-checks ../arkouda/dep/checkZMQ.chpl -I`spack location -i libzmq%gcc@11.1.0 arch=linux-ubuntu20.04-broadwell`/include -L`spack location -i libzmq%gcc@11.1.0 arch=linux-ubuntu20.04-broadwell`/lib
warning: Unknown CUDA version. cuda.h: CUDA_VERSION=11040. Assuming the latest supported version 10.1
In file included from <built-in>:3:
In file included from /tmp/chpl-milthorpe.deleteme-aIK9XQ/command-line-includes.h:7:
In file included from /noback/milthorpe/chapel/modules/packages/ZMQHelper/zmq_helper.h:26:
In file included from /noback/milthorpe/chapel/runtime/include/qio/qio.h:26:
In file included from /noback/milthorpe/chapel/runtime/include/qio/qbuffer.h:39:
In file included from /noback/milthorpe/chapel/runtime/include/qio/deque.h:44:
In file included from /noback/milthorpe/chapel/runtime/include/chpl-mem.h:30:
In file included from /noback/milthorpe/chapel/runtime/include/arg.h:24:
In file included from /noback/milthorpe/chapel/runtime/include/chpltypes.h:39:
In file included from /noback/milthorpe/chapel/third-party/llvm/install/linux64-x86_64-clang/lib/clang/13.0.0/include/cuda_wrappers/complex:33:
/noback/milthorpe/spack/opt/spack/linux-ubuntu20.04-broadwell/gcc-9.4.0/gcc-11.1.0-wnfs2uguhyb3qe4tnvgy3ff43hruxwsw/lib/gcc/x86_64-pc-linux-gnu/11.1.0/../../../../include/c++/11.1.0/type_traits:56:3: error: templates must have C++ linkage
  template<typename _Tp, _Tp __v>
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/chpl-milthorpe.deleteme-aIK9XQ/command-line-includes.h:2:1: note: extern "C" language linkage specification begins here
extern "C" {
^
...
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated when compiling for sm_60.
error: error running clang during code generation

The clang error indicates that the extern "C" in the generated file command-line-includes.h is the problem. If I remove the extern declaration, I can compile and run with the GPU locale model, but I'm not sure if this is the correct solution.

From what I can see, command-line-includes.h is including a bunch of mostly-C header files, but one include chain runs from modules/packages/ZMQHelper/zmq_helper.h down to runtime/include/chpltypes.h, which includes a C++ header file clang/13.0.0/include/cuda_wrappers/complex that includes templated functions. This won't compile in the context of an extern "C" include, but I'm not sure of the best solution to ensure C++ code is correctly included as such.

The compiler function isCHeader just checks whether the filename suffix is .h, which won't work for files like the runtime headers chpltypes.h, chplcast.h which function either as C or C++ headers depending on how they're compiled (#ifdef __cplusplus).

Configuration Information

Output of chpl --version:

chpl version 1.27.0 pre-release (298724cbcb)
  built with LLVM version 13.0.0

Output of $CHPL_HOME/util/printchplenv --anonymize:

CHPL_TARGET_PLATFORM: linux64
CHPL_TARGET_COMPILER: llvm
CHPL_TARGET_ARCH: x86_64
CHPL_TARGET_CPU: native
CHPL_LOCALE_MODEL: gpu *
CHPL_COMM: none
CHPL_TASKS: qthreads
CHPL_LAUNCHER: none
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: jemalloc
CHPL_ATOMICS: cstdlib
CHPL_GMP: bundled
CHPL_HWLOC: bundled
CHPL_RE2: bundled
CHPL_LLVM: bundled
CHPL_AUX_FILESYS: none

Back-end compiler and version, e.g. gcc --version or clang --version:

clang version 13.0.0 (git@github.com:milthorpe/chapel.git 298724cbcb3712d300a98d4734d6e233b87ad201)

The text was updated successfully, but these errors were encountered:

bradcray · 2022-05-27T04:49:53Z

Based on today's discussion over on #19891, I'm wondering whether the right thing to do here would be to remove the code from compiler/llvm/clang.cpp that adds the extern "C" block wholesale to the command-line-includes.h file and rely instead on the header files themselves to be appropriately guarded with extern "C" and/or #ifdef __cplusplus blocks (?).

Something that's unclear to me in the case of the complex includes in ctypes.h is whether part of the problem here is that the header file for C vs. C++ complex numbers is one and the same? Or is it two different files in two different search paths that a compiler will choose between for C vs. C++? Maybe put another way, is there a way we can simply have this code path use the C complex types and version of the header for all compiles (since this is one of the few places I'm aware of where our runtime tries to inject C++ code into the mix?). This SO post gives me some slight hope that perhaps the answer is "yes", but I'm not confident: https://stackoverflow.com/questions/23414270/c-complex-and-complex-h-in-the-same-file and won't have time to try it tonight (and perhaps it's too g++-specific? though does any C/C++ compiler not use the gnu headers these days?)

mppf · 2022-05-27T12:17:58Z

We might need to provide a flag or something so that you can indicate if a header file is a C or C++ header.

Or, maybe, we just need to #include the usual Chapel runtime stuff before the command line includes, so that something like chpltypes.h will be included first in C++ mode (where it can do the right thing w.r.t. extern "C" and C++ headers), and then the include guard #define would prevent it from being included again within the extern "C" block in command-line-includes.h. That seems like a less drastic solution if we can get it to work.

hokiegeek2 · 2022-07-12T20:24:25Z

@mppf @bradcray this is what I am seeing when I attempt to compile Arkouda w GPU locales:

hokiegeek2 · 2022-07-12T21:21:07Z

ah, wonder if it's the compute capability, will set that, recompile Chapel, and see what happens:

e-kayrakli · 2022-07-12T22:05:27Z

The solution to the immediate problem is to put some #includes that are in #ifdef __cplusplus in extern "C++" blocks. Outer extern "C" blocks break linkage for those functions that require C++ linkage.

However, this'll bump into lack of complex support in the GPU locale model after adding missing extern "C++"s. Which is something we're aware of (https://github.com/chapel-lang/chapel/blob/main/test/gpu/native/math.chpl#L20-L23).

diff --git a/runtime/include/atomics/cstdlib/chpl-atomics.h b/runtime/include/atomics/cstdlib/chpl-atomics.h
index 962a90e6cd..76d3322db6 100644
--- a/runtime/include/atomics/cstdlib/chpl-atomics.h
+++ b/runtime/include/atomics/cstdlib/chpl-atomics.h
@@ -32,6 +32,7 @@
 #endif

 #if defined(__cplusplus)
+extern "C++" {
   #if __cplusplus >= 201103L
     #include <atomic>
     #define Atomic(T) std::atomic<T>
@@ -55,9 +56,12 @@
   #else
     #error "The cstdlib atomics need at least C++11.  Use intrinsics or locks."
   #endif
+}
 #elif __STDC_VERSION__ >= 201112L && !defined(__STDC_NO_ATOMICS__)
+extern "C++" {
   #include <stdatomic.h>
   #define Atomic(T) _Atomic T
+}
 #else
   #error "The cstdlib atomics need at least C11.  Use intrinsics or locks."
 #endif
diff --git a/runtime/include/chpltypes.h b/runtime/include/chpltypes.h
index ebb21c82fb..ab2b3de379 100644
--- a/runtime/include/chpltypes.h
+++ b/runtime/include/chpltypes.h
@@ -36,9 +36,11 @@
 typedef float complex        _complex64;
 typedef double complex       _complex128;
 #else
+extern "C++" {
 #include <complex>
 typedef std::complex<float>  _complex64;
 typedef std::complex<double> _complex128;
+}
 #endif

 #ifdef __cplusplus

This patch makes chpl test/library/packages/ZMQ/hello.chpl compile beyond the error reported in this issue (until it fails with known lack of complex support).

I don't think the lack of complex support is something too challenging to handle, but it is not near the top of our priorities. We can make it a priority, though. Is this a blocker @milthorpe and/or @hokiegeek2 ?

hokiegeek2 · 2022-07-14T22:07:09Z

@e-kayrakli thanks for the follow-up! Um, I'd really appreciate getting this issue resolved as I am trying to prototype Arkouda on GPUs where downselected pdarrays can be shared with Python tools such as Ray and RAPIDS vi locales being on GPUs. Even if it's just a code change I need to grab and patch my 1.27.0 codebase, that would be awesome!

@stonea

Fix some issues with the GPU support to enable simple Arkouda builds This branch makes bunch of small-ish changes in an attempt to get Arkouda to compile with the GPU locale model. I can separate this into multiple PRs, but I want to test it some more as a whole. Fixes with this PR: - Resolves #19754 - Stops adding `extern "C"` in the C headers included at the command line. This was proposed by @milthorhpe. - Fixes an interop test broken by this. - To be able to do that makes runtime's generated code interface for complex numbers C-based even if we're compiling with C++. (A prior PR that made some improvements in that direction is #16847) - This enables defining `c_string_to_complex*` functions to be included in the generated executable, because their return types are no longer `std::complex<>` which you can't link with C linkage. - Adds a `no gpu codegen` pragma to thwart gpuization within some internal functions. Currently this only applies to: - `chpl__initCopy_shapeHelp`: Has PRIM_ASSIGN that gets normalized after gpuization, but not resolved. We probably need this function, and the lack might be preventing us from using loop expressions to initialize arrays on gpus. - Fixes `.name` implementation of the `GPULocale` type. - Adds a relevant test. - Adds support for `BitOps` module. - Adds a relevant test. - While there, cleans up `gpuTransforms.cpp` a bit. [Reviewed by @stonea] Test: - [x] gpu/native - [x] gpu/interop - [x] `make check` + spot checks in `types/complex` with `intel` as the host+target compiler - [x] `make check` + spot checks in `types/complex` with `cray` as the host+target compiler

This comment was marked as resolved.

Sign in to view

e-kayrakli added area: Compiler area: GPU Support labels May 4, 2022

This comment was marked as resolved.

Sign in to view

bradcray mentioned this issue May 27, 2022

linker error in GPU code generation with llvm 13.0.1 #19740

Closed

e-kayrakli mentioned this issue Jul 18, 2022

Fix some issues with the GPU support to enable simple Arkouda builds #20239

Merged

4 tasks

e-kayrakli closed this as completed in #20239 Jul 28, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clang error compiling included C++ headers with GPU locale model #19754

Clang error compiling included C++ headers with GPU locale model #19754

milthorpe commented May 4, 2022 •

edited

Loading

This comment was marked as resolved.

This comment was marked as resolved.

bradcray commented May 27, 2022

mppf commented May 27, 2022

hokiegeek2 commented Jul 12, 2022

hokiegeek2 commented Jul 12, 2022

e-kayrakli commented Jul 12, 2022

hokiegeek2 commented Jul 14, 2022

Clang error compiling included C++ headers with GPU locale model #19754

Clang error compiling included C++ headers with GPU locale model #19754

Comments

milthorpe commented May 4, 2022 • edited Loading

Summary of Problem

Steps to Reproduce

Configuration Information

This comment was marked as resolved.

This comment was marked as resolved.

bradcray commented May 27, 2022

mppf commented May 27, 2022

hokiegeek2 commented Jul 12, 2022

hokiegeek2 commented Jul 12, 2022

e-kayrakli commented Jul 12, 2022

hokiegeek2 commented Jul 14, 2022

milthorpe commented May 4, 2022 •

edited

Loading