[SYCL][CUDA] sycl::atomic_ref usage on CUDA backend produces linking error #5647

krasznaa · 2022-02-23T17:08:19Z

Describe the bug

In the past I've made use of sycl::atomic for atomic operations in our code, which used to behave well on all backends that I tried. (Including the CUDA and HIP ones.) Now that the latest nightlies warn about sycl::atomic being deprecated, I tried to teach our code to use sycl::atomic_ref instead when it is available.

But when I do that, building the code for an NVIDIA backend fails like:

ptxas fatal   : Unresolved extern function '_Z18__spirv_AtomicIAddPiN5__spv5Scope4FlagENS0_19MemorySemanticsMask4FlagEi'
llvm-foreach: 
clang-15: error: ptxas command failed with exit code 255 (use -v to see invocation)
clang version 15.0.0 (https://github.com/intel/llvm.git 64e92cbc41d1bcde9c728798e4e1fed9e3fab253)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin
clang-15: note: diagnostic msg: Error generating preprocessed source(s).

To Reproduce

Take the following example:

#include <CL/sycl.hpp>
#include <cassert>
#include <iostream>

//#define USE_ATOMIC_REF

int main() {

  // Set up the queue.
  sycl::queue queue;
  std::cout << "Running on device: " << queue.get_device().get_info<sycl::info::device::name>() << std::endl;

  // Set up the array.
  const std::size_t ARRAY_SIZE = 100;
  int* mem = static_cast<int*>(sycl::malloc_shared(ARRAY_SIZE * sizeof(int), queue));
  for (std::size_t i = 0; i < ARRAY_SIZE; ++i) {
    mem[i] = 0;
  }

  // Modify the array atomically on a device.
  queue.submit([mem, ARRAY_SIZE](sycl::handler& h) {
		 h.parallel_for<class atomic_test>(sycl::range<1>(ARRAY_SIZE),
						   [mem](sycl::id<1> id) {
#ifdef USE_ATOMIC_REF
						     sycl::atomic_ref<int, sycl::memory_order::relaxed,
								      sycl::memory_scope::device> aref(mem[id]);
						     aref.fetch_add(1);
#else
						     sycl::atomic_fetch_add<int>(sycl::atomic<int>(sycl::global_ptr<int>(mem + id)), 1);
#endif
						   });
	       }).wait_and_throw();

  // Check the array's payload.
  for (std::size_t i = 0; i < ARRAY_SIZE; ++i) {
    assert(mem[i] == 1);
  }

  // Finish up.
  std::cout << "All OK!" << std::endl;
  sycl::free(mem, queue);
  return 0;
}

In this exact form (without USE_ATOMIC_REF being defined) it builds and runs correctly, albeit with a good number of warnings.

[bash][Legolas]:sycl > clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -g atomic_ref_error.cpp 
atomic_ref_error.cpp:30:46: warning: 'atomic<int, sycl::access::address_space::global_space>' is deprecated: sycl::atomic is deprecated since SYCL 2020 [-Wdeprecated-declarations]
                                                     sycl::atomic_fetch_add<int>(sycl::atomic<int>(sycl::global_ptr<int>(mem + id)), 1);
                                                                                       ^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/atomic.hpp:171:7: note: 'atomic<int, sycl::access::address_space::global_space>' has been explicitly marked deprecated here
class __SYCL2020_DEPRECATED(
      ^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:54:40: note: expanded from macro '__SYCL2020_DEPRECATED'
#define __SYCL2020_DEPRECATED(message) __SYCL_DEPRECATED(message)
                                       ^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:45:38: note: expanded from macro '__SYCL_DEPRECATED'
#define __SYCL_DEPRECATED(message) [[deprecated(message)]]
                                     ^
warning: linking module '/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/lib/clang/15.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc': Linking two modules of different target triples: '/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/lib/clang/15.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc' is 'nvptx64-unknown-nvidiacl' whereas 'atomic_ref_error.cpp' is 'nvptx64-nvidia-cuda'
 [-Wlinker-warnings]
2 warnings generated.
atomic_ref_error.cpp:30:46: warning: 'atomic<int, sycl::access::address_space::global_space>' is deprecated: sycl::atomic is deprecated since SYCL 2020 [-Wdeprecated-declarations]
                                                     sycl::atomic_fetch_add<int>(sycl::atomic<int>(sycl::global_ptr<int>(mem + id)), 1);
                                                                                       ^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/atomic.hpp:171:7: note: 'atomic<int, sycl::access::address_space::global_space>' has been explicitly marked deprecated here
class __SYCL2020_DEPRECATED(
      ^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:54:40: note: expanded from macro '__SYCL2020_DEPRECATED'
#define __SYCL2020_DEPRECATED(message) __SYCL_DEPRECATED(message)
                                       ^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:45:38: note: expanded from macro '__SYCL_DEPRECATED'
#define __SYCL_DEPRECATED(message) [[deprecated(message)]]
                                     ^
1 warning generated.
[bash][Legolas]:sycl > SYCL_DEVICE_FILTER=CUDA ./a.out 
Running on device: NVIDIA GeForce RTX 3080
All OK!
[bash][Legolas]:sycl >

But if I un-comment the line defining USE_ATOMIC_REF, the build fails with:

[bash][Legolas]:sycl > clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -g atomic_ref_error.cpp 
warning: linking module '/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/lib/clang/15.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc': Linking two modules of different target triples: '/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/lib/clang/15.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc' is 'nvptx64-unknown-nvidiacl' whereas 'atomic_ref_error.cpp' is 'nvptx64-nvidia-cuda'
 [-Wlinker-warnings]
1 warning generated.
ptxas fatal   : Unresolved extern function '_Z18__spirv_AtomicIAddPiN5__spv5Scope4FlagENS0_19MemorySemanticsMask4FlagEi'
llvm-foreach: 
clang-15: error: ptxas command failed with exit code 255 (use -v to see invocation)
clang version 15.0.0 (https://github.com/intel/llvm.git 64e92cbc41d1bcde9c728798e4e1fed9e3fab253)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin
clang-15: note: diagnostic msg: Error generating preprocessed source(s).
[bash][Legolas]:sycl >

I tried a few different arguments for the memory order and memory scope template arguments, but that didn't seem to make much of a difference.

Environment (please complete the following information)

OS: Ubuntu 20.04
Target device and vendor: NVIDIA GPU
DPC++ version:

[bash][Legolas]:sycl > clang++ -v
clang version 15.0.0 (https://github.com/intel/llvm.git 64e92cbc41d1bcde9c728798e4e1fed9e3fab253)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
Found CUDA installation: /home/krasznaa/software/cuda/11.5.2/x86_64-ubuntu2004, version 11.5
Found HIP installation: /opt/rocm, version 4.2.21155-37cb3a34
[bash][Legolas]:sycl >

Dependencies version: N/A

Pinging @fwyzard and @ivorobts.

The text was updated successfully, but these errors were encountered:

Michoumichmich · 2022-02-23T17:28:31Z

I had the same error previously. I believe it's because the default address space is generic. Using sycl::atomic_ref<int, sycl::memory_order::relaxed, sycl::memory_scope::device, sycl::access::address_space::global_space> aref(mem[id]); works with your code (on CUDA).

Michoumichmich · 2022-02-23T17:40:01Z

Yes it generic:

llvm/sycl/include/CL/sycl/atomic_ref.hpp

Line 659 in 5e6995e

access::address_space AddressSpace =

and it's not supported on CUDA: #5215

krasznaa · 2022-02-23T21:56:53Z

Thanks a lot! That indeed works. I'll have to check tomorrow if that combination of template parameters will work in our actual code under all circumstances, but it smells like it should.

Michoumichmich · 2022-02-23T22:01:48Z

@krasznaa Glad it helped :) You might still need to adapt the address_space (as global_space was just a suggestion, of course)

zhongMou-lilSister · 2022-06-24T21:57:10Z

I wrote

              atomic_ref<float, 
                memory_order::relaxed, 
                memory_scope::device,
                access::address_space::global_space> ao (shRow[colW - i * COL_BLK]);
              ao.fetch_add(valY * valW);

However, it gives this error

fatal error: error in backend: Cannot cast between two non-generic address spaces
llvm-foreach: 
clang-15: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 15.0.0 (https://github.com/intel/llvm a258a552dc28a10dda618aeb6da1057797bc18f8)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/name/sycl_workspace/llvm/build/bin
clang-15: note: diagnostic msg: Error generating preprocessed source(s).

zhongMou-lilSister · 2022-06-24T22:12:24Z

I wrote

              atomic_ref<float, 
                memory_order::relaxed, 
                memory_scope::device,
                access::address_space::global_space> ao (shRow[colW - i * COL_BLK]);
              ao.fetch_add(valY * valW);

However, it gives this error

fatal error: error in backend: Cannot cast between two non-generic address spaces
llvm-foreach: 
clang-15: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 15.0.0 (https://github.com/intel/llvm a258a552dc28a10dda618aeb6da1057797bc18f8)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/name/sycl_workspace/llvm/build/bin
clang-15: note: diagnostic msg: Error generating preprocessed source(s).

When I tried atomic<float>(global_ptr<float>(&shRow[colW - i * COL_BLK])).fetch_add(valY * valW);, it says float aren't available yet. error: static_assert failed due to requirement '!std::is_same<float, float>::value' "SYCL atomic function not available for float type", and it seems that this "atomic().fetch_add()" function is somewhat deprecated.

Thanks in advance!

npmiller · 2023-01-12T10:40:04Z

This has been fixed in the CUDA backend in: #7391

krasznaa · 2023-01-19T13:11:49Z

Let me close this then. It has not affected out code in quite some time.

krasznaa added the bug Something isn't working label Feb 23, 2022

krasznaa mentioned this issue Feb 23, 2022

Update vecmem::atomic to use sycl::atomic_ref when it's available acts-project/vecmem#166

Closed

AerialMantis added cuda CUDA back-end compiler Compiler related issue labels Feb 23, 2022

t4c1 mentioned this issue Mar 21, 2022

[SYCL][CUDA][libclc] Add support for generic AS in atomics #5849

Closed

bader mentioned this issue Aug 1, 2022

SYCL-CTS test_vector_swizzles doesn't compile for NVPTX target. #6505

Closed

krasznaa closed this as completed Jan 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SYCL][CUDA] sycl::atomic_ref usage on CUDA backend produces linking error #5647

[SYCL][CUDA] sycl::atomic_ref usage on CUDA backend produces linking error #5647

krasznaa commented Feb 23, 2022

Michoumichmich commented Feb 23, 2022

Michoumichmich commented Feb 23, 2022 •

edited

Loading

krasznaa commented Feb 23, 2022

Michoumichmich commented Feb 23, 2022 •

edited

Loading

zhongMou-lilSister commented Jun 24, 2022 •

edited

Loading

zhongMou-lilSister commented Jun 24, 2022

npmiller commented Jan 12, 2023

krasznaa commented Jan 19, 2023

[SYCL][CUDA] sycl::atomic_ref usage on CUDA backend produces linking error #5647

[SYCL][CUDA] sycl::atomic_ref usage on CUDA backend produces linking error #5647

Comments

krasznaa commented Feb 23, 2022

Describe the bug

To Reproduce

Environment (please complete the following information)

Michoumichmich commented Feb 23, 2022

Michoumichmich commented Feb 23, 2022 • edited Loading

krasznaa commented Feb 23, 2022

Michoumichmich commented Feb 23, 2022 • edited Loading

zhongMou-lilSister commented Jun 24, 2022 • edited Loading

zhongMou-lilSister commented Jun 24, 2022

npmiller commented Jan 12, 2023

krasznaa commented Jan 19, 2023

Michoumichmich commented Feb 23, 2022 •

edited

Loading

Michoumichmich commented Feb 23, 2022 •

edited

Loading

zhongMou-lilSister commented Jun 24, 2022 •

edited

Loading