New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LLVM 10 #599
LLVM 10 #599
Conversation
LLVM 10 removes llvm::make_unique in favor of std::make_unique. However, this requires C++14 and is therefore unsuitable for LLVM 9 that forces -std=c++11. Update the code to use both conditionally. This fixes all issues with LLVM 10.
This is needed as the llvmlite package we are building needs to differentiate itself somehow from the 'other' dev builds.
Build-farm id: |
Note to self:
|
This works with the CUDA target, with the exception of
It seems that for code like: def foo(x):
# Triggers a generation of llvm.memset
for i in range(x.size):
x[i] = 0 a {
B0.endif:
%.74 = icmp sgt i64 %arg.x.2, 0
br i1 %.74, label %B26.loopexit, label %B26
B26.loopexit: ; preds = %B0.endif
%arg.x.41 = bitcast i32* %arg.x.4 to i8*
%0 = shl nuw i64 %arg.x.2, 2
call void @llvm.memset.p0i8.i64(i8* align 4 %arg.x.41, i8 0, i64 %0, i1 false)
br label %B26
B26: ; preds = %B26.loopexit, %B0.endif
store i8* null, i8** %.ret, align 8
ret i32 0
} but now it looks like: entry:
%.74.inv = icmp sgt i64 %arg.x.2, 0
br i1 %.74.inv, label %B14, label %B26
B14: ; preds = %entry, %B14
%.110.sroa.0.0114 = phi i64 [ %.110.sroa.0.0, %B14 ], [ 0, %entry ]
%.120113 = phi i1 [ %.120, %B14 ], [ %.74.inv, %entry ]
%.129108112 = phi i64 [ %.129107, %B14 ], [ %arg.x.2, %entry ]
%.133110111 = phi i64 [ %.133109, %B14 ], [ 0, %entry ]
%.129 = sext i1 %.120113 to i64
%.129107 = add nsw i64 %.129108112, %.129
%.133 = zext i1 %.120113 to i64
%.133109 = add i64 %.133110111, %.133
%.198 = icmp slt i64 %.110.sroa.0.0114, 0
%.199 = select i1 %.198, i64 %arg.x.5.0, i64 0
%.200 = add i64 %.199, %.110.sroa.0.0114
%.213 = getelementptr i32, i32* %arg.x.4, i64 %.200
store i32 0, i32* %.213, align 4
%.120 = icmp sgt i64 %.129107, 0
%.110.sroa.0.0 = select i1 %.120, i64 %.133109, i64 0
br i1 %.120, label %B14, label %B26
B26: ; preds = %B14, %entry
store i8* null, i8** %.ret, align 8
ret i32 0
} i.e. a loop is generated instead of the |
@gmarkall awesome, thanks for looking into this! |
It seems that LLVM 10 doesn't do loop idiom recognition at an opt level of 1 - If I apply to Numba: diff --git a/numba/cuda/codegen.py b/numba/cuda/codegen.py
index e201a2101..0d74b1edc 100644
--- a/numba/cuda/codegen.py
+++ b/numba/cuda/codegen.py
@@ -19,7 +19,7 @@ class CUDACodeLibrary(CodeLibrary):
# Run some lightweight optimization to simplify the module.
# This seems to workaround a libnvvm compilation bug (see #1341)
pmb = ll.PassManagerBuilder()
- pmb.opt_level = 1
+ pmb.opt_level = 3
pmb.disable_unit_at_a_time = False
pmb.disable_unroll_loops = True
pmb.loop_vectorize = False then the |
I can't see any negative knock-on effects of using the patch above to use optimization level 3 - the Numba and cuDF test suites pass with no issues, and it will probably result in easier to read optimized IR for future debugging purposes. @esc @sklam @stuartarchibald do you have any objections / concerns about just applying the above patch, when llvmlite moves to LLVM 10? |
@gmarkall I have an LLVM 10 enabled version of |
As suggested by @gmarkall : numba/llvmlite#599 (comment)
With LLVM 10 the following CMake line no longer appears to work. ``` llvm_map_components_to_libnames(llvm_libs all) ``` ... and this commit is a suitable workaround. Current working hypothesis is that this might have been caused by the change in: llvm/llvm-project@ab41180#diff-cebfde545aa260d20ca6b0cdc3da08daR270 However the docs at https://llvm.org/docs/CMake.html are not very clear about this. And in addition the output of `llvm-config --components` does list `all` as an option. Many thanks to @angloyna for debugging this!!
conda-recipes/llvmlite/meta.yaml
Outdated
@@ -1,4 +1,4 @@ | |||
{% set VERSION_SUFFIX = "" %} # debug version suffix, appended to the version | |||
{% set VERSION_SUFFIX = "_llvm10" %} # debug version suffix, appended to the version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: this change must be reverted before merging to master
.
@sklam @stuartarchibald I have implemented the LLVM version split for |
This reverts commit 3f66129. Once numba/numba#6030 is merged, it will no longer be necessary to disable the autoupgrade of atomic intrinsics for NVPTX, because LLVM from llvmlite will not be used to optimize the IR before sending it to NVVM.
The page at: https://releases.llvm.org/10.0.1/docs/ Currently (Tue Aug 4 10:48:42 2020 GMT+2) returns a 404 - page not found and this breaks our build as Sphinx can't setup the intersphinx links.
As we are doing a split release for 0.34.0 and `aarch64` will need LLVM 9.0.* -- we allow to build with this version again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch, have done a first pass review.
Fix missing word and missing punctuation. Co-authored-by: stuartarchibald <stuartarchibald@users.noreply.github.com>
We only support 10.0.* and 9.0.* from now on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Final builds should not have a version suffix.
the version revert looks good |
damn, you have taken too long to review my PRs that I am using llvm 12 currently. |
BFID: |
🎉 |
💯 |
As title, upgrade to LLVM 10.