Atomics build failure when building for arm64 (undefined symbol: __aarch64_ldadd8_acq_rel...) #2527

Closed
lpmitchell opened this issue Nov 7, 2023 · 1 comment

Comments

@lpmitchell

When compiling on Arm (aarch64) with CMake, any CatBoost version from 1.2.0 onward fails to build with link errors related to atomics.

Catboost version: 1.2.2
Operating System: Linux / aarch64
CPU: Any 64-bit arm processor (e.g. Amazon Graviton)
GPU: None
Details:

I have tested this on various Linux distributions and toolchain versions (gcc, llvm, cmake).

Reproducible steps (Ubuntu 20.04, arm64):

1) Install dependencies

apt-get update
apt-get install -y cmake clang lld gcc python3-pip ninja-build
pip install "conan==1.59.0"

2) Verify dependencies:

root@ubuntu:/# lscpu | grep 'Arch\|Flags'
Architecture:                       aarch64
Flags:                              fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs

root@ubuntu:/# cmake --version
cmake version 3.22.1

root@ubuntu:/# gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

root@ubuntu:/# clang --version
Ubuntu clang version 14.0.0-1ubuntu1.1
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

root@ubuntu:/# ld.lld --version
Ubuntu LLD 14.0.0 (compatible with GNU linkers)

root@ubuntu:/# conan --version
Conan version 1.59.0

root@ubuntu:/# ninja --version
1.10.1

root@ubuntu:/# python3 --version
Python 3.10.12
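
The `atomics` entry in the Flags line above is how Linux reports the Armv8.1 LSE atomic instructions, which the outline-atomics helpers probe for at run time. A minimal shell sketch for checking that flag follows; the function name `has_cpu_flag` is hypothetical, and it only parses `lscpu` or `/proc/cpuinfo` style text:

```shell
#!/bin/sh
# has_cpu_flag FLAG
# Reads lscpu or /proc/cpuinfo style text on stdin and succeeds if FLAG
# appears as a whole word on the "Flags"/"Features" line.
# (Hypothetical helper, not part of CatBoost or any distribution.)
has_cpu_flag() {
    grep -E '^(Flags|Features)' \
        | tr -s ' \t' '\n\n' \
        | grep -xF -- "$1" >/dev/null
}

# Usage on a live system:
#   lscpu | has_cpu_flag atomics && echo "LSE atomics reported by the kernel"
```

The check says nothing about the build itself; it only confirms what the hardware reports.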

3) Invoke build

python3 build/build_native.py --build-root-dir=/catboost/out --targets catboostmodel --verbose

4) Observe failure:

FAILED: contrib/tools/flatc/bin/flatc
: && /usr/bin/clang++ -fexceptions   -fno-common   -fcolor-diagnostics   -faligned-allocation   -fdebug-default-version=4   -ffunction-sections   -fdata-sections   -Wall   -Wextra   -Wno-parentheses   -Wno-implicit-const-int-float-conversion   -Wno-unknown-warning-option   -pipe   -D_THREAD_SAFE   -D_PTHREADS   -D_REENTRANT   -D_LARGEFILE_SOURCE   -D__STDC_CONSTANT_MACROS   -D__STDC_FORMAT_MACROS   -D__LONG_LONG_SUPPORTED  -D_GNU_SOURCE -DLIBCXX_BUILDING_LIBCXXRT -fuse-init-array -D_FILE_OFFSET_BITS=64 -fsigned-char   -Woverloaded-virtual   -Wimport-preprocessor-directive-pedantic   -Wno-undefined-var-template   -Wno-return-std-move   -Wno-defaulted-function-deleted   -Wno-pessimizing-move   -Wno-deprecated-anon-enum-enum-conversion   -Wno-deprecated-enum-enum-conversion   -Wno-deprecated-enum-float-conversion   -Wno-ambiguous-reversed-operator   -Wno-deprecated-volatile  -O3 -DNDEBUG -fuse-ld=lld    -nodefaultlibs -ldl -lrt -Wl,--no-as-needed -fPIC -lpthread contrib/tools/flatc/bin/CMakeFiles/flatc.dir/__/__/__/libs/flatbuffers/src/flatc_main.cpp.o contrib/tools/flatc/bin/CMakeFiles/flatc.dir/__vcs_version__.c.o -o contrib/tools/flatc/bin/flatc  contrib/libs/flatbuffers/flatc/liblibs-flatbuffers-flatc.a  library/cpp/malloc/jemalloc/libcpp-malloc-jemalloc.a  contrib/restricted/abseil-cpp/absl/base/libabseil-cpp-absl-base.a  library/cpp/malloc/api/libcpp-malloc-api.a  contrib/libs/jemalloc/libcontrib-libs-jemalloc.a  contrib/libs/cxxsupp/libcxx/liblibs-cxxsupp-libcxx.a  contrib/libs/cxxsupp/libcxxabi-parts/liblibs-cxxsupp-libcxxabi-parts.a  contrib/libs/cxxsupp/libcxxrt/liblibs-cxxsupp-libcxxrt.a  contrib/libs/cxxsupp/builtins/liblibs-cxxsupp-builtins.a  contrib/libs/libunwind/libcontrib-libs-libunwind.a  -lc -lm && :
ld.lld: error: undefined symbol: __aarch64_ldadd8_acq_rel
>>> referenced by flatc_main.cpp
>>>               contrib/tools/flatc/bin/CMakeFiles/flatc.dir/__/__/__/libs/flatbuffers/src/flatc_main.cpp.o:(main)
>>> referenced by flatc_main.cpp
>>>               contrib/tools/flatc/bin/CMakeFiles/flatc.dir/__/__/__/libs/flatbuffers/src/flatc_main.cpp.o:(main)
>>> referenced by flatc_main.cpp
>>>               contrib/tools/flatc/bin/CMakeFiles/flatc.dir/__/__/__/libs/flatbuffers/src/flatc_main.cpp.o:(main)
>>> referenced 342 more times

ld.lld: error: undefined symbol: __aarch64_ldadd8_relax
... etc

Full error log: https://gist.github.com/lpmitchell/297ca3ab99f61f3db1f24d1f612ae35d
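
The undefined `__aarch64_*` names are the out-of-line atomics helpers that compilers call when outline atomics are enabled for AArch64. The cross-compilation log later in this report shows one of them living in `libgcc.a`, and the link line above passes `-nodefaultlibs`, so one plausible but unconfirmed reading is that nothing on that link line provides them. A small sketch for listing which helpers an object file asks for, assuming `nm` from binutils or LLVM is installed; `list_outline_atomics` is a hypothetical name:

```shell
#!/bin/sh
# list_outline_atomics
# Reads `nm -u` style output on stdin and prints the distinct symbols that
# look like outline-atomics helpers (__aarch64_<op><size>_<ordering>).
# (Hypothetical helper; the pattern is a loose match, not an official list.)
list_outline_atomics() {
    awk '{ print $NF }' \
        | grep -E '^__aarch64_[a-z]+[0-9]+_[a-z_]+$' \
        | LC_ALL=C sort -u
}

# Usage against the object named in the error above:
#   nm -u contrib/tools/flatc/bin/CMakeFiles/flatc.dir/__/__/__/libs/flatbuffers/src/flatc_main.cpp.o \
#       | list_outline_atomics
```

An empty result after rebuilding with `-mno-outline-atomics` would suggest the flag reached the compile step for that file.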

Attempted fixes:

Cross compiling:

When cross-compiling from x86, I am able to produce a binary. However, the same missing atomics symbol error occurs when linking the dynamic library:

/usr/local/go/pkg/tool/linux_arm64/link: running gcc failed: exit status 1
/usr/bin/ld: $WORK/b001/exe/a.out: hidden symbol `__aarch64_cas8_acq_rel' in /usr/lib/gcc/aarch64-linux-gnu/11/libgcc.a(cas_8_4.o) is referenced by DSO
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status

-mno-outline-atomics:

This seems to have no impact on the build failure. Note the -mno-outline-atomics flag in the clang++ call:

FAILED: contrib/tools/flatc/bin/flatc
: && /usr/bin/clang++ -fexceptions   -fno-common   -fcolor-diagnostics   -faligned-allocation   -fdebug-default-version=4   -ffunction-sections   -fdata-sections   -Wall   -Wextra   -Wno-parentheses   -Wno-implicit-const-int-float-conversion   -Wno-unknown-warning-option   -pipe   -D_THREAD_SAFE   -D_PTHREADS   -D_REENTRANT   -D_LARGEFILE_SOURCE   -D__STDC_CONSTANT_MACROS   -D__STDC_FORMAT_MACROS   -D__LONG_LONG_SUPPORTED  -D_GNU_SOURCE -DLIBCXX_BUILDING_LIBCXXRT -fuse-init-array -D_FILE_OFFSET_BITS=64 -fsigned-char   -Woverloaded-virtual   -Wimport-preprocessor-directive-pedantic   -Wno-undefined-var-template   -Wno-return-std-move   -Wno-defaulted-function-deleted   -Wno-pessimizing-move   -Wno-deprecated-anon-enum-enum-conversion   -Wno-deprecated-enum-enum-conversion   -Wno-deprecated-enum-float-conversion   -Wno-ambiguous-reversed-operator   -Wno-deprecated-volatile  -O3 -DNDEBUG -fuse-ld=lld -mno-outline-atomics    -nodefaultlibs -ldl -lrt -Wl,--no-as-needed -fPIC -lpthread contrib/tools/flatc/bin/CMakeFiles/flatc.dir/__/__/__/libs/flatbuffers/src/flatc_main.cpp.o contrib/tools/flatc/bin/CMakeFiles/flatc.dir/__vcs_version__.c.o -o contrib/tools/flatc/bin/flatc  contrib/libs/flatbuffers/flatc/liblibs-flatbuffers-flatc.a  library/cpp/malloc/jemalloc/libcpp-malloc-jemalloc.a  contrib/restricted/abseil-cpp/absl/base/libabseil-cpp-absl-base.a  library/cpp/malloc/api/libcpp-malloc-api.a  contrib/libs/jemalloc/libcontrib-libs-jemalloc.a  contrib/libs/cxxsupp/libcxx/liblibs-cxxsupp-libcxx.a  contrib/libs/cxxsupp/libcxxabi-parts/liblibs-cxxsupp-libcxxabi-parts.a  contrib/libs/cxxsupp/libcxxrt/liblibs-cxxsupp-libcxxrt.a  contrib/libs/cxxsupp/builtins/liblibs-cxxsupp-builtins.a  contrib/libs/libunwind/libcontrib-libs-libunwind.a  -lc -lm && :
ld.lld: error: undefined symbol: __aarch64_ldadd8_acq_rel
>>> referenced by flatc_main.cpp
...
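
One thing worth checking here: the command quoted above is a link step (its inputs are `.o` and `.a` files), while `-mno-outline-atomics` changes code generation, so it only matters on the compile commands that produced those objects. The log does not show whether the flag reached them. A hedged sketch for finding compile commands that lack a given flag; `compile_steps_missing_flag` is a hypothetical name, and it assumes compile commands contain a standalone `-c`, as CMake-generated Ninja rules normally do:

```shell
#!/bin/sh
# compile_steps_missing_flag FLAG
# Reads one build command per line on stdin and prints the compile commands
# (those containing a standalone -c) that do not mention FLAG.
# (Hypothetical helper for inspecting build logs.)
compile_steps_missing_flag() {
    grep -E '(^| )-c( |$)' | grep -v -F -- " $1"
}

# Usage with Ninja's built-in "commands" tool:
#   ninja -C "$CMAKE_NATIVE_TOOLS_BINARY_DIR" -t commands flatc \
#       | compile_steps_missing_flag -mno-outline-atomics
```

If this prints compile commands, those objects were built without the flag and would still reference the helpers.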

Manual steps (without using build_native.py) - Ninja

The same compilation error occurs when following manual steps to build directly:

cmake $CATBOOST_SRC_ROOT -G "Ninja" -B$CMAKE_NATIVE_TOOLS_BINARY_DIR -DCATBOOST_COMPONENTS=none -DCMAKE_BUILD_TYPE=Release -DCMAKE_TOOLCHAIN_FILE=$CATBOOST_SRC_ROOT/build/toolchains/clang.toolchain
# ok
ninja -C $CMAKE_NATIVE_TOOLS_BINARY_DIR archiver cpp_styleguide enum_parser flatc protoc rescompiler triecompiler
# ok - tools built successfully

conan install -s build_type=Release -if $CMAKE_TARGET_PLATFORM_BINARY_DIR --build=missing $CATBOOST_SRC_ROOT/conanfile.txt
# ok
conan install -s build_type=Release -if $CMAKE_TARGET_PLATFORM_BINARY_DIR --build=missing --no-imports -pr:h=$CATBOOST_SRC_ROOT/cmake/conan-profiles/linux.aarch64.profile -pr:b=default $CATBOOST_SRC_ROOT/conanfile.txt
# ok
cmake $CATBOOST_SRC_ROOT -B $CMAKE_TARGET_PLATFORM_BINARY_DIR -G "Ninja" -DCMAKE_BUILD_TYPE=Release -DCMAKE_TOOLCHAIN_FILE=$CATBOOST_SRC_ROOT/build/toolchains/cross-build.host.linux.target.aarch64-linux-gnu.clang.toolchain -DCATBOOST_COMPONENTS=libs -DTOOLS_ROOT=$CMAKE_NATIVE_TOOLS_BINARY_DIR
# ok (Build files have been written to: /catboost/out)

ninja -C $CMAKE_TARGET_PLATFORM_BINARY_DIR catboostmodel
# Build error (same as above)

Manual steps - Unix Makefile

cmake $CATBOOST_SRC_ROOT -G "Unix Makefiles" -B$CMAKE_NATIVE_TOOLS_BINARY_DIR -DCATBOOST_COMPONENTS=none -DCMAKE_BUILD_TYPE=Release -DCMAKE_TOOLCHAIN_FILE=$CATBOOST_SRC_ROOT/build/toolchains/clang.toolchain
# ok
make -j 8 -C $CMAKE_NATIVE_TOOLS_BINARY_DIR archiver cpp_styleguide enum_parser flatc protoc rescompiler triecompiler
# Build error (same as above, when building archiver)

Any help on this issue would be greatly appreciated, or any information on the build environment used for the binary releases may also be useful.

Many thanks!

@andrey-khropov
Member

andrey-khropov commented Nov 30, 2023

I've been able to reproduce this error, but on Ubuntu 22.04. The versions of the packages you've provided above suggest that you've likely made a mistake specifying Ubuntu 20.04 instead of 22.04.

But adding -mno-outline-atomics helps to resolve the issue: I've been able to build libcatboostmodel successfully and then tried to use it (link with it) in an executable program, also without errors.

I've added -mno-outline-atomics like this:

diff --git a/cmake/global_flags.compiler.gnu.cmake b/cmake/global_flags.compiler.gnu.cmake
index 3dcde4027f..62df102035 100644
--- a/cmake/global_flags.compiler.gnu.cmake
+++ b/cmake/global_flags.compiler.gnu.cmake
@@ -44,6 +44,10 @@ else()
   string(APPEND _GNU_COMMON_C_CXX_FLAGS " -D_FILE_OFFSET_BITS=64")
 endif()
 
+if (CMAKE_SYSTEM_PROCESSOR MATCHES "^(arm.*|aarch64)")
+  string(APPEND _GNU_COMMON_C_CXX_FLAGS " -mno-outline-atomics")
+endif()
+
 if (CMAKE_SYSTEM_PROCESSOR MATCHES "^(arm.*|aarch64|ppc64le)")
   string(APPEND _GNU_COMMON_C_CXX_FLAGS " -fsigned-char")
 endif()

It's likely that I will commit this fix permanently, but please check that it works for you as well.
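
To see which machines the new condition in the patch covers, the regex `^(arm.*|aarch64)` can be tried against processor names in the shell. This is only a sketch: `patch_applies` is a hypothetical name, and it assumes `CMAKE_SYSTEM_PROCESSOR` carries a `uname -m` style value, which is typical for native Linux builds but is set by the toolchain file when cross-compiling:

```shell
#!/bin/sh
# patch_applies ARCH
# Succeeds if ARCH matches the regex used in the patch, ^(arm.*|aarch64),
# i.e. the flag would be appended for that processor name.
# (Hypothetical helper; CMake's MATCHES is also unanchored at the end.)
patch_applies() {
    printf '%s\n' "$1" | grep -E '^(arm.*|aarch64)' >/dev/null
}

# Usage on the build machine:
#   patch_applies "$(uname -m)" && echo "-mno-outline-atomics would be added"
```

Note that 32-bit names such as `armv7l` match as well. Since the option is an AArch64 one, it may be worth confirming how the compiler treats it on those targets.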
