ARROW-17790: [C++][Gandiva] Adapt to LLVM opaque pointer #14187

js8544 · 2022-09-21T06:19:17Z

Starting from LLVM 13, LLVM IR has been shifting towards a unified opaque pointer type, i.e. pointers without pointee types. It has provided workarounds until LLVM 15. The temporary workarounds need to be replaced in order to support LLVM 15 and onwards. We need to supply the pointee type to the CreateGEP and CreateLoad methods.

For more background info, see https://llvm.org/docs/OpaquePointers.html and https://lists.llvm.org/pipermail/llvm-dev/2015-February/081822.html
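To illustrate the change described above, here is a minimal toy model, not the LLVM API. All names in it (`ToyType`, `TypedPtr`, `OpaquePtr`, `LoadDeduced`, `LoadExplicit`, `GepExplicit`) are hypothetical and exist only for this sketch. It shows why a load or GEP can no longer ask the pointer for its element type once pointers are opaque, so the caller has to pass that type in, which is what supplying the pointee type to CreateLoad and CreateGEP amounts to.

```cpp
#include <cassert>
#include <string>

// Toy model for illustration only; none of these names are LLVM APIs.
struct ToyType {
  std::string name;  // e.g. "i32"
};

// Typed-pointer world: the pointer type records what it points to.
struct TypedPtr {
  const ToyType* pointee;
};

// Opaque-pointer world: a pointer is just "ptr" and carries no pointee type.
struct OpaquePtr {};

// Old style: the element type is deduced from the pointer itself.
inline std::string LoadDeduced(const TypedPtr& p) {
  return "load " + p.pointee->name + ", " + p.pointee->name + "* %p";
}

// New style: the caller passes the element type explicitly.
inline std::string LoadExplicit(const ToyType& ty, const OpaquePtr&) {
  return "load " + ty.name + ", ptr %p";
}

// Same idea for element addressing (GEP): the element type comes from the
// explicit argument, not from the pointer.
inline std::string GepExplicit(const ToyType& ty, const OpaquePtr&, int index) {
  return "getelementptr " + ty.name + ", ptr %p, i64 " + std::to_string(index);
}

// Small self-check that shows the two styles side by side.
inline void RunToyDemo() {
  ToyType i32{"i32"};
  TypedPtr typed{&i32};
  assert(LoadDeduced(typed) == "load i32, i32* %p");
  assert(LoadExplicit(i32, OpaquePtr{}) == "load i32, ptr %p");
}
```

The strings mirror the textual IR forms with typed pointers (`i32*`) and opaque pointers (`ptr`). The explicit style does not depend on what the pointer knows about itself, so the same call shape works whether or not pointers are opaque.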

Related issues:

https://issues.apache.org/jira/browse/ARROW-14363

https://issues.apache.org/jira/browse/ARROW-17728

https://issues.apache.org/jira/browse/ARROW-17775

github-actions · 2022-09-21T06:19:48Z

https://issues.apache.org/jira/browse/ARROW-17790

pitrou · 2022-09-21T07:01:35Z

Is it possible to keep compatibility with pre-13 LLVM using some sort of compatibility wrappers?
cc @kou

js8544 · 2022-09-21T07:23:18Z

@pitrou Sorry to bother you, Antoine, but I have a newbie question: how can I run Gandiva-related benchmarks with ursabot?

Is it possible to keep compatibility with pre-13 LLVM using some sort of compatibility wrappers? cc @kou

@pitrou As far as I understand, it is compatible with previous versions. Previously we used a helper function to let pointers deduce their own types. Now we supply their types directly.
Is there a way to check compatibility with old versions?

pitrou · 2022-09-21T07:34:34Z

One simple possibility is to build and run benchmarks locally. Build Arrow C++ in release mode with -DARROW_BUILD_BENCHMARKS=ON. Then you'll get benchmarks as executable files in the build directory that you can run individually.

pitrou · 2022-09-21T07:35:34Z

Is there a way to check compatibility with old versions?

Not sure. That might be tested on some of our nightly builds...

pitrou · 2022-09-21T07:37:21Z

Hmm, judging by this error, we might have to bump the Clang version on the macOS C++ builder as well:
https://github.com/apache/arrow/actions/runs/3095675584/jobs/5010342464#step:9:1738

@kou @assignUser Is there a way to do that?

pitrou · 2022-09-21T11:27:03Z

@github-actions crossbow submit -g cpp

github-actions · 2022-09-21T11:58:30Z

Revision: 5f40ace

Submitted crossbow builds: ursacomputing/crossbow @ actions-5342584c40

Task	Status
test-alpine-linux-cpp
test-build-cpp-fuzz
test-conda-cpp
test-conda-cpp-valgrind
test-debian-10-cpp-amd64
test-debian-10-cpp-i386
test-debian-11-cpp-amd64
test-debian-11-cpp-i386
test-fedora-35-cpp
test-ubuntu-18.04-cpp
test-ubuntu-18.04-cpp-release
test-ubuntu-18.04-cpp-static
test-ubuntu-20.04-cpp
test-ubuntu-20.04-cpp-17
test-ubuntu-20.04-cpp-bundled
test-ubuntu-20.04-cpp-thread-sanitizer
test-ubuntu-22.04-cpp

js8544 · 2022-09-21T13:43:16Z

@pitrou I updated the CI scripts to use the clang version installed by brew. It yields many new warnings. Should I fix those warnings in this PR?

pitrou · 2022-09-21T13:44:28Z

@js8544 If the warnings fail the build, then yes. Are you able to reproduce locally for faster iterations?

js8544 · 2022-09-21T14:40:52Z

I just installed clang 15 on my machine and can reproduce now. Let me fix the warnings.

js8544 · 2022-09-21T15:05:57Z

@pitrou Most of these warnings are unrelated to Gandiva, and I feel that fixing them in this PR would make it confusing for future maintenance.
Can I create another issue and PR to: 1. use the brew-installed clang for macOS in the CI scripts, and 2. fix clang-15 compatibility?

pitrou · 2022-09-21T15:22:09Z

@js8544 Definitely!

js8544 · 2022-09-21T16:04:25Z

It turns out they have to be submitted together, because a successful clang-15 build requires this Gandiva change. However, I did create another issue to track it: https://issues.apache.org/jira/browse/ARROW-17805

js8544 · 2022-09-21T16:06:20Z

We need to temporarily turn off gcsfs_test, s3fs_test, flight_internals_test and flight_test due to this issue: boostorg/container_hash#24

pitrou · 2022-09-21T16:12:55Z

@js8544 Can you show a snippet of the error(s) with boost?

Edit: looking at https://github.com/boostorg/config/pull/440/files, perhaps we can just define BOOST_NO_CXX98_FUNCTION_BASE before including boost?

js8544 · 2022-09-21T16:15:05Z

In file included from /Users/jinshang/Projects/arrow/cpp/src/arrow/filesystem/gcsfs_test.cc:34:
In file included from /usr/local/include/boost/process.hpp:24:
In file included from /usr/local/include/boost/process/async_system.hpp:22:
In file included from /usr/local/include/boost/process/child.hpp:22:
In file included from /usr/local/include/boost/process/detail/execute_impl.hpp:24:
In file included from /usr/local/include/boost/process/detail/posix/executor.hpp:14:
In file included from /usr/local/include/boost/process/error.hpp:34:
In file included from /usr/local/include/boost/type_index.hpp:29:
In file included from /usr/local/include/boost/type_index/stl_type_index.hpp:47:
/usr/local/include/boost/container_hash/hash.hpp:132:33: error: no template named 'unary_function' in namespace 'std'; did you mean '__unary_function'?
struct hash_base : std::unary_function<T, std::size_t> {};
~~~~~^
/usr/local/opt/llvm/bin/../include/c++/v1/__functional/unary_function.h:46:1: note: '__unary_function' declared here
using __unary_function = __unary_function_keep_layout_base<_Arg, _Result>;
^
1 error generated.

js8544 · 2022-09-21T16:15:32Z

@js8544 Can you show a snippet of the error(s) with boost?

Edit: looking at https://github.com/boostorg/config/pull/440/files, perhaps we can just define BOOST_NO_CXX98_FUNCTION_BASE before including boost?

Makes sense, I'll try it.
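The suggestion above relies on a common pattern: a header picks its implementation based on a macro, so the macro has to be defined before the header is included. The following is a minimal, self-contained sketch of that pattern. It does not include Boost; `DEMO_NO_CXX98_FUNCTION_BASE` and the `demo` namespace are hypothetical stand-ins, and the assumption that Boost's `hash_base` is guarded in a similar way is based only on the linked boostorg/config change and the error log above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <type_traits>

// Hypothetical stand-in for BOOST_NO_CXX98_FUNCTION_BASE. It must be defined
// before the "header" part below is seen by the preprocessor.
#define DEMO_NO_CXX98_FUNCTION_BASE

// ---- imagine everything in this namespace lives in a third-party header ----
namespace demo {
#if defined(DEMO_NO_CXX98_FUNCTION_BASE)
// Modern path: provide the nested typedefs directly.
template <typename T>
struct hash_base {
  typedef T argument_type;
  typedef std::size_t result_type;
};
#else
// Legacy path: std::unary_function was removed in C++17, so this branch
// fails to compile on standard libraries that no longer ship it.
template <typename T>
struct hash_base : std::unary_function<T, std::size_t> {};
#endif

template <typename T>
struct hash : hash_base<T> {
  std::size_t operator()(const T& value) const { return std::hash<T>()(value); }
};
}  // namespace demo

// Small self-check: the typedefs exist without std::unary_function.
inline void RunMacroDemo() {
  static_assert(
      std::is_same<demo::hash<int>::result_type, std::size_t>::value,
      "result_type comes from the macro-selected base");
  assert(demo::hash<int>()(42) == std::hash<int>()(42));
}
```

Because the preprocessor discards the legacy branch, the reference to `std::unary_function` is never compiled. If the define were placed after the header section, the legacy branch would be selected instead, which is why the position of the define relative to the include matters.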

js8544 · 2022-09-22T03:30:59Z

From https://github.com/apache/arrow/actions/runs/3099507453 and https://github.com/apache/arrow/actions/runs/3099507450 it seems to be working but the Mac build timed out. I force pushed with no change to trigger a rebuild.

js8544 · 2022-09-22T03:51:14Z

As far as I understand, it is compatible with previous versions. Previously we used a helper function to let pointers deduce their own types. Now we supply their types directly.

Also from https://app.travis-ci.com/github/apache/arrow/jobs/583496235 it seems to be compatible with LLVM 10

kou

+1

Thanks!

kou · 2022-09-22T05:05:04Z

cpp/src/arrow/filesystem/gcsfs_test.cc

@@ -31,6 +31,7 @@
 // We need BOOST_USE_WINDOWS_H definition with MinGW when we use
 // boost/process.hpp. See BOOST_USE_WINDOWS_H=1 in
 // cpp/cmake_modules/ThirdpartyToolchain.cmake for details.
+#define BOOST_NO_CXX98_FUNCTION_BASE  // ARROW-17805


Could you move this to #20 (before the // This boost/asio/... comment) so that the // We need ... comment above is not mistaken as applying to this define?

kou · 2022-09-22T05:05:10Z

cpp/src/arrow/filesystem/s3_test_util.cc

@@ -37,6 +37,7 @@
 // We need BOOST_USE_WINDOWS_H definition with MinGW when we use
 // boost/process.hpp. See BOOST_USE_WINDOWS_H=1 in
 // cpp/cmake_modules/ThirdpartyToolchain.cmake for details.
+#define BOOST_NO_CXX98_FUNCTION_BASE  // ARROW-17805


kou · 2022-09-22T05:06:00Z