ARROW-8227: [C++] Refine SIMD feature definitions #6794
This patch moves SIMD feature definitions from source code to cmake, and supports more flexible Arm64 CPU feature settings.

Binary building is controlled by two factors: compiler capability and build requirement. Compiler capability is detected in cmake by trying flags like "-mavx2". Build requirement is passed by cmake command line such as "-DARROW_SIMD_LEVEL=AVX2". Combining these two factors, we can define SIMD feature macros ARROW_HAVE_AVX2, which controls conditional compiling of related SIMD implementations in source code.

Currently we set compiler options (e.g. -msse4.2) in cmake but define SIMD features by checking compiler macros in source code like below:

#if defined(__SSE4_2__)
#define ARROW_HAVE_SSE4_2 1
#endif

Putting them together in cmake eases maintenance.
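As a rough illustration of the consumer side of this scheme, the sketch below shows how source code might branch on a feature macro that the build system defines. It assumes cmake passes -DARROW_HAVE_SSE4_2 only together with -msse4.2 on x86-64; CountSetBits and SumSetBits are hypothetical helpers used for illustration, not Arrow APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: the build system passes -DARROW_HAVE_SSE4_2 only together with
// -msse4.2 on x86-64, so the intrinsic is available whenever the macro is.
#if defined(ARROW_HAVE_SSE4_2)
#include <nmmintrin.h>
#endif

// Hypothetical helper (not an Arrow API): counts the set bits in one word.
inline int CountSetBits(std::uint64_t value) {
#if defined(ARROW_HAVE_SSE4_2)
  // Hardware path, compiled only when the build system enabled SSE4.2.
  return static_cast<int>(_mm_popcnt_u64(value));
#else
  // Portable fallback: clear the lowest set bit until none remain.
  int count = 0;
  while (value != 0) {
    value &= value - 1;
    ++count;
  }
  return count;
#endif
}

// Hypothetical helper: total number of set bits over an array of words.
inline std::int64_t SumSetBits(const std::uint64_t* values, std::size_t length) {
  assert(values != nullptr || length == 0);
  std::int64_t total = 0;
  for (std::size_t i = 0; i < length; ++i) {
    total += CountSetBits(values[i]);
  }
  return total;
}
```

With this layout the source file never inspects compiler macros such as __SSE4_2__ directly; building without the definition simply exercises the fallback branch.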
@kou @jianxind this patch changes some of your previous code, please review if it's okay, thanks.
+1
set(CXX_SUPPORTS_SSE4_2 TRUE)
else()
set(ARROW_SSE2_FLAG "-msse2")
set(ARROW_SSE42_FLAG "-msse4.2")
Could you use the name ARROW_SSE4_2_FLAG for consistency?
Other SSE4.2 variables use SSE4_2 instead of SSE42.
Done
@github-actions crossbow submit -g linux-arm
Revision: 3b5ad09 Submitted crossbow builds: ursa-labs/crossbow @ actions-60
OK for me. I verified the AVX512 build and unit tests locally.
Umm, it seems that we can't use https://github.com/ursa-labs/crossbow/runs/550852423
Should we specify |
Thank you very much. This is a very nice improvement.
Is it desirable to use CMAKE_SYSTEM_PROCESSOR at the toplevel to select some of the CMake behaviour? For example, it doesn't make sense to check Altivec support when compiling for x86 or ARM...
# power compiler flags
check_cxx_compiler_flag("-maltivec" CXX_SUPPORTS_ALTIVEC)
set(ARROW_ALTIVEC_FLAG "-maltivec")
I don't think we should check any -m flag under MSVC. The check will fail and take a bit of time.
(by a bit of time, I mean each of these checks seems to take 1 second on my Windows VM)
Yes, better to avoid these unnecessary checks. Will do.
CMAKE_SYSTEM_PROCESSOR may break cross-building, but I don't think it's reasonable to cross-build a complex application like Arrow.
if(CXX_SUPPORTS_SSE4_2)
set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${ARROW_SSE4_2_FLAG}")
add_definitions(-DARROW_HAVE_SSE4_2 -DARROW_HAVE_SSE2)
elseif(CXX_SUPPORTS_SSE2)
I'm not sure we should fall back here. Is it useful to support compilers with SSE2 but without SSE4.2 support?
For Arrow-supported compilers (gcc 4.8+, clang-7+, msvc2015+), SSE2 (actually SSE4.2) must be available.
Some source code still uses "ifdef SSE2". I will remove the SSE2 definition here and refine the code that uses it.
else()
set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} ${ARROW_ARMV8_CRC_FLAG}")
if(CXX_SUPPORTS_ARMV8)
if(NOT CXX_SUPPORTS_ARMV8_ARCH)
I don't understand the difference between CXX_SUPPORTS_ARMV8 and CXX_SUPPORTS_ARMV8_ARCH. Can you explain?
This will be clearer once the code is refined with CMAKE_SYSTEM_PROCESSOR.
CXX_SUPPORTS_ARMV8 detects whether the compiler supports arm64 at all (-march=armv8-a).
CXX_SUPPORTS_ARMV8_ARCH detects whether the compiler supports the arch requirement given on the cmake command line (-DARROW_ARMV8_ARCH=armv8.1-a+crypto).
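To make the effect of the requested arch more concrete, here is a hedged sketch of how a source file might consume an Arm64 feature macro once cmake has accepted the arch string. The macro name ARROW_HAVE_ARMV8_CRC is an assumption made for this example, and Crc32c is a hypothetical helper, not an Arrow API. The fallback branch is a plain bitwise CRC-32C, so the same file builds on any target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: ARROW_HAVE_ARMV8_CRC is defined by the build system only when
// the selected -march value includes the CRC extension (e.g. armv8-a+crc).
#if defined(ARROW_HAVE_ARMV8_CRC)
#include <arm_acle.h>
#endif

// Hypothetical helper (not an Arrow API): CRC-32C (Castagnoli) of a buffer.
inline std::uint32_t Crc32c(const unsigned char* data, std::size_t length) {
  assert(data != nullptr || length == 0);
  std::uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < length; ++i) {
#if defined(ARROW_HAVE_ARMV8_CRC)
    // Hardware path: one CRC32C instruction per input byte.
    crc = __crc32cb(crc, data[i]);
#else
    // Portable fallback: bitwise update with reflected polynomial 0x82F63B78.
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit) {
      // The mask is all ones when the low bit is set, zero otherwise.
      const std::uint32_t mask = 0u - (crc & 1u);
      crc = (crc >> 1) ^ (0x82F63B78u & mask);
    }
#endif
  }
  return crc ^ 0xFFFFFFFFu;
}
```

Both branches compute the same checksum, so the macro only selects how the work is done, which is what makes a single cmake-side switch sufficient.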
@@ -22,31 +22,20 @@

#include "arrow/util/macros.h"

#ifdef ARROW_USE_SIMD
Is ARROW_USE_SIMD still referenced in the codebase? If not, can you remove it from CMake?
It's no longer used in the codebase.
It's used in cmake, where it acts as a central switch to turn all SIMD flags such as ARROW_HAVE_SSE4_2 on or off.
It looks like it's still useful? It's listed in the benchmark doc. Code comment below:
#Disable this option to exercise non-SIMD fallbacks
define_option(ARROW_USE_SIMD "Build with SIMD optimizations" ON)
Ah, ok. Let's keep it then.
@kou , about And from build log: https://github.com/ursa-labs/crossbow/runs/550852423 I see we're testing arm64 build inside a container based on arm64v8/centos:7. [EDIT] |
@cyb70289 Thanks for the information.
Their test failures are caused by a Thrift download error. It will be fixed by another try, so we don't need to do anything about it.
@kou CentOS 7 EOL is 2024, so it looks like we should keep it. But gcc 4.8 is too old and has poor support for aarch64. I think we can install a newer gcc version on CentOS 7 in our aarch64 CI job. I will do some tests to see if it's workable.
We can use g++ 8 on CentOS 7 aarch64 via the devtoolset-8 package. I used https://github.com/apache/arrow/blob/master/cpp/examples/arrow/row-wise-conversion-example.cc for the test:

$ c++ -o row-wise-conversion-example{,.cc} $(pkg-config --cflags --libs arrow) -std=gnu++11 -O0 -g3 -DNDEBUG
$ gdb --args ./row-wise-conversion-example
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /row-wise-conversion-example...done.
(gdb) r
Starting program: /./row-wise-conversion-example
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff3dff700 (LWP 12925)]
Program received signal SIGSEGV, Segmentation fault.
0x00000000004135bd in __gnu_cxx::new_allocator<arrow::ListType>::destroy<arrow::ListType> (
this=0x62c770, __p=0x61e310 <vtable for arrow::ListType+16>)
at /usr/include/c++/4.8.2/ext/new_allocator.h:124
124 destroy(_Up* __p) { __p->~_Up(); }
Missing separate debuginfos, use: debuginfo-install brotli-1.0.7-5.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 glibc-2.17-292.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libstdc++-4.8.5-39.el7.x86_64 libzstd-1.4.4-1.el7.x86_64 lz4-1.7.5-2.el7.x86_64 snappy-1.1.0-3.el7.x86_64 zlib-1.2.7-18.el7.x86_64

Here is the change to use g++ 8 on CentOS 7:

diff --git a/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in b/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in
index fc6f5158f..95444b4be 100644
--- a/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in
+++ b/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in
@@ -19,6 +19,8 @@
%define _centos_ver %{?centos_ver:%{centos_ver}}%{!?centos_ver:8}
+%define use_cxx11_abi (%{_centos_ver} >= 8)
+
%define boost_version %( \
if [ "%{_centos_ver}" = 7 ]; then \
echo 169; \
@@ -62,7 +64,6 @@ BuildRequires: brotli-devel
%endif
BuildRequires: bzip2-devel
BuildRequires: cmake%{cmake_version}
-BuildRequires: gcc-c++
BuildRequires: gflags-devel
BuildRequires: git
%if %{_centos_ver} >= 7
@@ -114,6 +115,9 @@ Apache Arrow is a data processing library for analysis.
cpp_build_type=release
mkdir cpp/build
cd cpp/build
+%if !%{use_cxx11_abi}
+CXXFLAGS="%optflags -D_GLIBCXX_USE_CXX11_ABI=0"
+%endif
%cmake3 .. \
%if %{use_flight}
-DARROW_FLIGHT=ON \
diff --git a/dev/tasks/linux-packages/apache-arrow/yum/centos-7/Dockerfile b/dev/tasks/linux-packages/apache-arrow/yum/centos-7/Dockerfile
index bfc34819a..e0f366ea0 100644
--- a/dev/tasks/linux-packages/apache-arrow/yum/centos-7/Dockerfile
+++ b/dev/tasks/linux-packages/apache-arrow/yum/centos-7/Dockerfile
@@ -22,11 +22,15 @@ COPY qemu-* /usr/bin/
ARG DEBUG
+ENV \
+ DEVTOOLSET_VERSION=8
+
RUN \
quiet=$([ "${DEBUG}" = "yes" ] || echo "--quiet") && \
yum update -y ${quiet} && \
- yum install -y ${quiet} epel-release && \
- yum groupinstall -y ${quiet} "Development Tools" && \
+ yum install -y ${quiet} \
+ centos-release-scl \
+ epel-release && \
yum install -y ${quiet} \
autoconf-archive \
bison \
@@ -34,6 +38,7 @@ RUN \
brotli-devel \
bzip2-devel \
cmake3 \
+ devtoolset-${DEVTOOLSET_VERSION} \
flex \
gflags-devel \
git \
Not sure of the reason. I did a quick test in a centos:7 x86 container: I manually installed the required packages, then built and tested Arrow. It works correctly, and row-wise-conversion-example.cc is also OK. For this aarch64 gcc4.8
Ah, sorry. We need to use g++ 8 for build (we can use g++ 8 in
I think so too. Could you work on this? We already have some |
Sure. I'll refine this patch to include the check.
Main changes:
|
Thank you very much. Looks mostly good, just one question.
@@ -36,10 +36,6 @@
#include "arrow/util/sse_util.h"

// enable SIMD whitespace skipping, if available
#if defined(ARROW_HAVE_SSE2)
#define RAPIDJSON_SSE2 1
We should keep defining RAPIDJSON_SSE2, no?
From the rapidjson code below, it looks like RAPIDJSON_SSE2 should be kept (though I suspect that code should check SSE42 first). I will update this patch to include RAPIDJSON_SSE2.
https://github.com/Tencent/rapidjson/blob/master/test/perftest/rapidjsontest.cpp#L30
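One possible arrangement, sketched under assumptions and not necessarily what the patch ends up doing, is to derive the RapidJSON switches from the cmake-provided ARROW_HAVE_* macros before any RapidJSON header is included, checking SSE4.2 first. RAPIDJSON_SSE2 and RAPIDJSON_SSE42 are RapidJSON's own configuration macros; JsonSimdMode and the two helper functions are hypothetical names added only so the selection can be observed.

```cpp
#include <cassert>

// Assumption: ARROW_HAVE_SSE4_2 / ARROW_HAVE_SSE2 come from the build system.
// Prefer the SSE4.2 variant when it is available, otherwise fall back to SSE2.
#if defined(ARROW_HAVE_SSE4_2)
#define RAPIDJSON_SSE42 1
#elif defined(ARROW_HAVE_SSE2)
#define RAPIDJSON_SSE2 1
#endif

// The RapidJSON headers would be included here, after the macros are set:
// #include <rapidjson/reader.h>

// Hypothetical enum, for illustration only.
enum class JsonSimdMode { kScalar, kSse2, kSse42 };

// Reports which whitespace-skipping variant the macros above selected.
inline JsonSimdMode ActiveJsonSimdMode() {
#if defined(RAPIDJSON_SSE42)
  return JsonSimdMode::kSse42;
#elif defined(RAPIDJSON_SSE2)
  return JsonSimdMode::kSse2;
#else
  return JsonSimdMode::kScalar;
#endif
}

// Human-readable name for logging or test output.
inline const char* JsonSimdModeName(JsonSimdMode mode) {
  switch (mode) {
    case JsonSimdMode::kSse42: return "SSE4.2";
    case JsonSimdMode::kSse2: return "SSE2";
    case JsonSimdMode::kScalar: return "scalar";
  }
  assert(false && "unreachable: unknown JsonSimdMode");
  return "";
}
```

The ordering matters because the macros must be visible before the RapidJSON headers are parsed; defining them afterwards has no effect on the code those headers generate.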
Change-Id: I7cc37a36a87c70b4e91387bb7450547e771a1d67
CI failures are unrelated, merging.