Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++][Gandiva] Support LLVM 17 #37834

Closed
MehdiChinoune opened this issue Sep 22, 2023 · 7 comments · Fixed by #37867
Closed

[C++][Gandiva] Support LLVM 17 #37834

MehdiChinoune opened this issue Sep 22, 2023 · 7 comments · Fixed by #37867
Assignees
Milestone

Comments

@MehdiChinoune
Copy link

Describe the bug, including details regarding any error messages, version, and platform.

Arrow couldn't find llvm 17.0.1 because It's not listed.

Component(s)

C++

@kou kou changed the title Support LLVM 17 [C++][Gandiva] Support LLVM 17 Sep 23, 2023
@kou
Copy link
Member

kou commented Sep 23, 2023

I tried this a bit:

diff --git a/cpp/CMakeLists.txt b/cpp/CMakeLists.txt
index f2906b960..f0acab038 100644
--- a/cpp/CMakeLists.txt
+++ b/cpp/CMakeLists.txt
@@ -152,6 +152,7 @@ set(ARROW_DOC_DIR "share/doc/${PROJECT_NAME}")
 set(BUILD_SUPPORT_DIR "${CMAKE_SOURCE_DIR}/build-support")
 
 set(ARROW_LLVM_VERSIONS
+    "17.0"
     "16.0"
     "15.0"
     "14.0"
diff --git a/cpp/src/gandiva/CMakeLists.txt b/cpp/src/gandiva/CMakeLists.txt
index db260b5ac..d2810c39f 100644
--- a/cpp/src/gandiva/CMakeLists.txt
+++ b/cpp/src/gandiva/CMakeLists.txt
@@ -31,8 +31,6 @@ if(ARROW_WITH_ZSTD AND "${zstd_SOURCE}" STREQUAL "SYSTEM")
   provide_find_module(zstdAlt "Gandiva")
 endif()
 
-add_definitions(-DGANDIVA_LLVM_VERSION=${LLVM_VERSION_MAJOR})
-
 # Set the path where the bitcode file generated, see precompiled/CMakeLists.txt
 set(GANDIVA_PRECOMPILED_BC_PATH "${CMAKE_CURRENT_BINARY_DIR}/irhelpers.bc")
 set(GANDIVA_PRECOMPILED_CC_PATH "${CMAKE_CURRENT_BINARY_DIR}/precompiled_bitcode.cc")
diff --git a/cpp/src/gandiva/engine.cc b/cpp/src/gandiva/engine.cc
index 7d75793a3..511f7cc11 100644
--- a/cpp/src/gandiva/engine.cc
+++ b/cpp/src/gandiva/engine.cc
@@ -53,7 +53,11 @@
 #include <llvm/IR/LegacyPassManager.h>
 #include <llvm/IR/Verifier.h>
 #include <llvm/Linker/Linker.h>
+#if LLVM_VERSION_MAJOR >= 17
+#include <llvm/TargetParser/SubtargetFeature.h>
+#else
 #include <llvm/MC/SubtargetFeature.h>
+#endif
 #include <llvm/Support/DynamicLibrary.h>
 #include <llvm/Support/Host.h>
 #if LLVM_VERSION_MAJOR >= 14

But it seems that we need to migrate our code to the new pass manager: https://llvm.org/docs/NewPassManager.html
Because the legacy pass manager was removed in LLVM 17.0.1: https://releases.llvm.org/17.0.1/docs/ReleaseNotes.html#changes-to-llvm-infrastructure

@js8544 @niyue Are you interested in working on this?

@niyue
Copy link
Contributor

niyue commented Sep 24, 2023

@kou, I haven't delved into the new pass manager API, but I've been contemplating the potential benefits of leveraging the newer LLVM JIT API. This could allow us to phase out some of the older API dependencies. To keep discussions focused and avoid sidetracking this thread, I've opened a separate feature request at #37848

@niyue
Copy link
Contributor

niyue commented Sep 25, 2023

@kou I gave it a try to update Gandiva's code to use the new LLVM PassManager:
main...niyue:arrow:feature/new-passmanager
But one of the test case failed:

TEST_F(TestLLVMGenerator, TestAdd) {
  // ....
  std::string fn_name = "codegen";

  ASSERT_OK(generator->engine_->LoadFunctionIRs());
  ASSERT_OK(generator->CodeGenExprValue(func_dex, 4, desc_sum, 0, fn_name,
                                        SelectionVector::MODE_NONE));

  ASSERT_OK(generator->engine_->FinalizeModule());
  auto ir = generator->engine_->DumpIR();
  EXPECT_THAT(ir, testing::HasSubstr("vector.body")); 

EXPECT_THAT(ir, testing::HasSubstr("vector.body")); This assertion failed with message like this:

Value of: ir
Expected: has substring "vector.body"
  Actual: "; ModuleID = 'codegen'\nsource_filename = \"codegen\"\ntarget datalayout = \"e-m:o-i64:64-i128:128-n32:64-S128\"\ntarget triple = \"arm64-apple-macosx13.0.0\"\n\n; Function Attrs: nofree norecurse nosync nounwind\ndefine i32 @codegen(i64* nocapture readonly %inputs_addr, i64* nocapture readonly %inputs_addr_offsets, i64* nocapture readnone %local_bitmaps, i64* nocapture readnone %holder_ptrs, i16* nocapture readnone %selection_vector, i64 %context_ptr, i64 %nrecords) local_unnamed_addr #0 {\nentry:\n  %out_mem_addr = getelementptr i64, i64* %inputs_addr, i64 4\n  %out_mem = load i64, i64* %out_mem_addr, align 8\n  %out_darray = inttoptr i64 %out_mem to i32*\n  %0 = load i64, i64* %inputs_addr_offsets, align 8\n  %1 = getelementptr i64, i64* %inputs_addr_offsets, i64 2\n  %2 = load i64, i64* %1, align 8\n  %f0_mem = load i64, i64* %inputs_addr, align 8\n  %f0_darray = inttoptr i64 %f0_mem to i32*\n  %f1_mem_addr = getelementptr i64, i64* %inputs_addr, i64 2\n  %f1_mem = load i64, i64* %f1_mem_addr, align 8\n  %f1_darray = inttoptr i64 %f1_mem to i32*\n  %smax = call i64 @llvm.smax.i64(i64 %nrecords, i64 1)\n  %scevgep6 = getelementptr i32, i32* %f1_darray, i64 %2\n  %scevgep9 = getelementptr i32, i32* %f0_darray, i64 %0\n  br label %loop\n\nloop:                                             ; preds = %loop, %entry\n  %lsr.iv10 = phi i32* [ %scevgep11, %loop ], [ %scevgep9, %entry ]\n  %lsr.iv7 = phi i32* [ %scevgep8, %loop ], [ %scevgep6, %entry ]\n  %lsr.iv5 = phi i32* [ %scevgep, %loop ], [ %out_darray, %entry ]\n  %lsr.iv = phi i64 [ %lsr.iv.next, %loop ], [ %smax, %entry ]\n  %f0 = load i32, i32* %lsr.iv10, align 4\n  %f1 = load i32, i32* %lsr.iv7, align 4\n  %3 = add nsw i32 %f1, %f0\n  store i32 %3, i32* %lsr.iv5, align 4\n  %lsr.iv.next = add nsw i64 %lsr.iv, -1\n  %scevgep = getelementptr i32, i32* %lsr.iv5, i64 1\n  %scevgep8 = getelementptr i32, i32* %lsr.iv7, i64 1\n  %scevgep11 = getelementptr i32, i32* %lsr.iv10, i64 1\n  %exitcond.not = icmp eq i64 %lsr.iv.next, 0\n  br i1 %exitcond.not, label %exit, label %loop\n\nexit:                                             ; preds = %loop\n  ret i32 0\n}\n\n; Function Attrs: nofree nosync nounwind readnone speculatable willreturn\ndeclare i64 @llvm.smax.i64(i64, i64) #1\n\nattributes #0 = { nofree norecurse nosync nounwind }\nattributes #1 = { nofree nosync nounwind readnone speculatable willreturn }\n\n!llvm.ident = !{!0, !0, !0, !0, !0, !0, !0, !0, !0, !0, !0}\n!llvm.module.flags = !{!1, !2, !3, !4, !5, !6, !7, !8, !9}\n\n!0 = !{!\"Homebrew clang version 14.0.6\"}\n!1 = !{i32 2, !\"SDK Version\", [2 x i32] [i32 13, i32 3]}\n!2 = !{i32 1, !\"wchar_size\", i32 4}\n!3 = !{i32 1, !\"branch-target-enforcement\", i32 0}\n!4 = !{i32 1, !\"sign-return-address\", i32 0}\n!5 = !{i32 1, !\"sign-return-address-all\", i32 0}\n!6 = !{i32 1, !\"sign-return-address-with-bkey\", i32 0}\n!7 = !{i32 7, !\"PIC Level\", i32 2}\n!8 = !{i32 7, !\"uwtable\", i32 1}\n!9 = !{i32 7, !\"frame-pointer\", i32 1}\n"

I've observed that the code I developed produces a different IR compared to the previously generated IR. Notably, the vector.body is absent in the IR, which I suspect is related to loop vectorization. I've attempted to troubleshoot by commenting out certain passes and adjusting the optimization level, but to no avail. I'm not well-versed in this area, so any insights or guidance would be greatly appreciated. Thank you.

@kou
Copy link
Member

kou commented Sep 25, 2023

Thanks!

It seems that the following passes are related:

      llvm::TargetIRAnalysis target_analysis =
          execution_engine_->getTargetMachine()->getTargetIRAnalysis();
      pass_manager->add(llvm::createTargetTransformInfoWrapperPass(target_analysis));
      pass_manager->add(llvm::createFunctionInliningPass());

Because vector.body can be emitted with the following patch:

diff --git a/cpp/src/gandiva/engine.cc b/cpp/src/gandiva/engine.cc
index 7d75793a3..59fbad1d3 100644
--- a/cpp/src/gandiva/engine.cc
+++ b/cpp/src/gandiva/engine.cc
@@ -297,14 +297,14 @@ Status Engine::FinalizeModule() {
           execution_engine_->getTargetMachine()->getTargetIRAnalysis();
       pass_manager->add(llvm::createTargetTransformInfoWrapperPass(target_analysis));
       pass_manager->add(llvm::createFunctionInliningPass());
-      pass_manager->add(llvm::createInstructionCombiningPass());
-      pass_manager->add(llvm::createPromoteMemoryToRegisterPass());
-      pass_manager->add(llvm::createGVNPass());
-      pass_manager->add(llvm::createNewGVNPass());
-      pass_manager->add(llvm::createCFGSimplificationPass());
-      pass_manager->add(llvm::createLoopVectorizePass());
-      pass_manager->add(llvm::createSLPVectorizerPass());
-      pass_manager->add(llvm::createGlobalOptimizerPass());
+      // pass_manager->add(llvm::createInstructionCombiningPass());
+      // pass_manager->add(llvm::createPromoteMemoryToRegisterPass());
+      // pass_manager->add(llvm::createGVNPass());
+      // pass_manager->add(llvm::createNewGVNPass());
+      // pass_manager->add(llvm::createCFGSimplificationPass());
+      // pass_manager->add(llvm::createLoopVectorizePass());
+      // pass_manager->add(llvm::createSLPVectorizerPass());
+      // pass_manager->add(llvm::createGlobalOptimizerPass());
 
       // run the optimiser
       llvm::PassManagerBuilder pass_builder;

If I also comment out pass_manager->add(llvm::createTargetTransformInfoWrapperPass(target_analysis)) or pass_manager->add(llvm::createFunctionInliningPass()), vector.body isn't emitted.

Are there corresponding passes for them in new LLVM PassManager?

@niyue
Copy link
Contributor

niyue commented Sep 26, 2023

Thanks for the pointer, I read previously the target analysis pass is done by the new pass manager automatically but this could be wrong, I will do some more investigation to see if I can figure it out.

@niyue
Copy link
Contributor

niyue commented Sep 26, 2023

@kou thanks for your pointer, I drafted a PR (#37867), which could pass the failed test case I mentioned above, but more tests, in particular performance tests may be needed to verify the change.

@kou
Copy link
Member

kou commented Sep 26, 2023

Thanks!

@kou kou closed this as completed in #37867 Oct 3, 2023
kou added a commit that referenced this issue Oct 3, 2023
### Rationale for this change
In #37834, to support LLVM 17, we need to migrate to use the new LLVM PassManager API.

### What changes are included in this PR?

 This PR tries to migrate the legacy PassManager to the new PassManager.

### Are these changes tested?
It should be covered by existing unit tests. But more performance tests may be needed to verify this change.

### Are there any user-facing changes?
No

* Closes: #37834

Lead-authored-by: Yue Ni <niyue.com@gmail.com>
Co-authored-by: Sutou Kouhei <kou@clear-code.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
@kou kou added this to the 14.0.0 milestone Oct 3, 2023
JerAguilon pushed a commit to JerAguilon/arrow that referenced this issue Oct 23, 2023
…#37867)

### Rationale for this change
In apache#37834, to support LLVM 17, we need to migrate to use the new LLVM PassManager API.

### What changes are included in this PR?

 This PR tries to migrate the legacy PassManager to the new PassManager.

### Are these changes tested?
It should be covered by existing unit tests. But more performance tests may be needed to verify this change.

### Are there any user-facing changes?
No

* Closes: apache#37834

Lead-authored-by: Yue Ni <niyue.com@gmail.com>
Co-authored-by: Sutou Kouhei <kou@clear-code.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
…#37867)

### Rationale for this change
In apache#37834, to support LLVM 17, we need to migrate to use the new LLVM PassManager API.

### What changes are included in this PR?

 This PR tries to migrate the legacy PassManager to the new PassManager.

### Are these changes tested?
It should be covered by existing unit tests. But more performance tests may be needed to verify this change.

### Are there any user-facing changes?
No

* Closes: apache#37834

Lead-authored-by: Yue Ni <niyue.com@gmail.com>
Co-authored-by: Sutou Kouhei <kou@clear-code.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
dgreiss pushed a commit to dgreiss/arrow that referenced this issue Feb 19, 2024
…#37867)

### Rationale for this change
In apache#37834, to support LLVM 17, we need to migrate to use the new LLVM PassManager API.

### What changes are included in this PR?

 This PR tries to migrate the legacy PassManager to the new PassManager.

### Are these changes tested?
It should be covered by existing unit tests. But more performance tests may be needed to verify this change.

### Are there any user-facing changes?
No

* Closes: apache#37834

Lead-authored-by: Yue Ni <niyue.com@gmail.com>
Co-authored-by: Sutou Kouhei <kou@clear-code.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants