[LLVM][GPU][+refactoring] Replacement of math intrinsics with library calls #835
Logfiles from GitLab pipeline #44990 (:white_check_mark:) have been uploaded here!
LGTM
Just a few small suggestions
DISPATCH("llvm.pow.f64", "_ZGVeN8vv_pow", FIXED(8))
// clang-format on
};
#undef DISPATCH
Should `FIXED` also be undefined?
Suggested change:
#undef DISPATCH
#undef FIXED
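The point about undefining both helper macros can be illustrated with a small standalone sketch. The macro definitions, the `VecDescSketch` struct, and the `table_sketch` name are assumptions made for illustration (the PR's actual definitions are not shown in this thread); only the two mangled libmvec names and the `DISPATCH(..., FIXED(8))` call shape come from the snippet above or from libmvec itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type; the PR's real descriptor type is not shown here.
struct VecDescSketch {
    std::string scalar_name;  // LLVM intrinsic name
    std::string vector_name;  // vector math library symbol
    unsigned width;           // vectorization factor
};

// Helper macros that exist only to build the table below.
// Their bodies are assumptions, not the PR's definitions.
#define FIXED(w) (w)
#define DISPATCH(scalar, vec, width) VecDescSketch{scalar, vec, width},

static const std::vector<VecDescSketch> table_sketch = {
    // clang-format off
    DISPATCH("llvm.exp.f64", "_ZGVeN8v_exp", FIXED(8))
    DISPATCH("llvm.pow.f64", "_ZGVeN8vv_pow", FIXED(8))
    // clang-format on
};

// Undefine every helper macro so none of them leaks into later code.
#undef DISPATCH
#undef FIXED
```

After the two `#undef` lines, a later use of `FIXED(...)` in the same translation unit no longer expands as a macro, which is the hygiene the suggestion asks for.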
// Add vectorizable functions to the target library info.
switch (library->second) {
case VecLib::LIBMVEC_X86:
if (!triple.isX86() || !triple.isArch64Bit())
break;
default:
tli.addVectorizableFunctionsFromVecLib(library->second);
break;
}
Just a personal opinion: I'm not sure what the proper way to do this is, or what the benefit of the switch would be, but I think it would be easier to understand if written like this:
// Add vectorizable functions to the target library info.
if (library->second != VecLib::LIBMVEC_X86 || (triple.isX86() && triple.isArch64Bit())) {
tli.addVectorizableFunctionsFromVecLib(library->second);
}
Feel free to keep this as you prefer.
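The switch relies on the `LIBMVEC_X86` case falling through into `default` when the target is 64-bit x86. A minimal standalone sketch can check that the single-condition form selects the same cases. `VecLibSketch` and `TripleSketch` are hypothetical stand-ins for the LLVM types, and both predicate functions are illustrative names, not part of the PR.

```cpp
#include <cassert>

// Hypothetical stand-ins for the LLVM vector-library enum and llvm::Triple.
enum class VecLibSketch { Accelerate, LIBMVEC_X86, SVML };

struct TripleSketch {
    bool x86;
    bool arch64;
    bool isX86() const { return x86; }
    bool isArch64Bit() const { return arch64; }
};

// Mirrors the original switch: returns true where the library would be added.
bool should_add_switch(VecLibSketch lib, const TripleSketch& triple) {
    switch (lib) {
    case VecLibSketch::LIBMVEC_X86:
        if (!triple.isX86() || !triple.isArch64Bit())
            break;
        // falls through
    default:
        return true;
    }
    return false;
}

// Mirrors the suggested single condition.
bool should_add_if(VecLibSketch lib, const TripleSketch& triple) {
    return lib != VecLibSketch::LIBMVEC_X86 ||
           (triple.isX86() && triple.isArch64Bit());
}

// Compares both forms over every library and every target combination.
bool forms_agree() {
    const VecLibSketch libs[] = {VecLibSketch::Accelerate,
                                 VecLibSketch::LIBMVEC_X86,
                                 VecLibSketch::SVML};
    for (VecLibSketch lib : libs) {
        for (int bits = 0; bits < 4; ++bits) {
            TripleSketch t{(bits & 1) != 0, (bits & 2) != 0};
            if (should_add_switch(lib, t) != should_add_if(lib, t))
                return false;
        }
    }
    return true;
}
```

Under these assumptions the two forms agree on all inputs, so the choice between them is purely a matter of readability.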
// Map of supported replacements. For now it is only exp.
static const std::map<std::string, std::string> libdevice_name = {
{"llvm.exp.f32", "__nv_expf"},
I think it would be good to also add `pow`, and maybe to look for any other math functions commonly used in the mod files and add them to the x86 and aarch64 maps as well. I can also look at this in the next few days.
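As a rough sketch of what the extended table might look like with `pow` included, the following standalone lookup uses libdevice's naming convention (an `f` suffix for single precision). Only the `llvm.exp.f32` entry appears in the snippet above; the other three entries, the `libdevice_name_sketch` map, and the `lookup_libdevice` helper are assumptions for illustration and may differ from what was merged.

```cpp
#include <cassert>
#include <map>
#include <string>

// Intrinsic name -> libdevice function name.
// Only the first entry is taken from the reviewed snippet; the rest are
// assumed from libdevice's single/double precision naming.
static const std::map<std::string, std::string> libdevice_name_sketch = {
    {"llvm.exp.f32", "__nv_expf"},
    {"llvm.exp.f64", "__nv_exp"},
    {"llvm.pow.f32", "__nv_powf"},
    {"llvm.pow.f64", "__nv_pow"},
};

// Hypothetical helper: returns the replacement name, or an empty string
// when the intrinsic has no entry and should be left untouched.
std::string lookup_libdevice(const std::string& intrinsic) {
    auto it = libdevice_name_sketch.find(intrinsic);
    return it == libdevice_name_sketch.end() ? std::string() : it->second;
}
```

Returning an empty string for unknown intrinsics keeps unsupported calls unchanged, so new functions can be added to the table one at a time.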
Logfiles from GitLab pipeline #47044 (:no_entry:) have been uploaded here!
Logfiles from GitLab pipeline #47063 (:white_check_mark:) have been uploaded here!
Logfiles from GitLab pipeline #47062 (:white_check_mark:) have been uploaded here!
Logfiles from GitLab pipeline #47154 (:white_check_mark:) have been uploaded here!
Logfiles from GitLab pipeline #47153 (:white_check_mark:) have been uploaded here!
… calls (#835)
Added an LLVM pass that replaces math intrinsics with calls to math library. In particular:
* Functionality of replacement with SIMD functions is factored out into a separate file and LLVM version dependencies are dropped (LLVM 13 is already used anyway).
* A pass to replace intrinsics with libdevice calls when targeting CUDA platforms has been added. So far only `exp` and `pow` are supported (single and double precision).
* Added a test to check the replacement
Co-authored-by: Ioannis Magkanaris <iomagkanaris@gmail.com>
This PR adds an LLVM pass that replaces math intrinsics with calls to a math library. In particular:
* Functionality of replacement with SIMD functions is factored out into a separate file and LLVM version dependencies are dropped (we already use LLVM 13 anyway).
* A pass to replace intrinsics with libdevice calls when targeting CUDA platforms has been added. So far only `exp` is supported (single and double precision).
* Added a test to check the replacement.
Note: factoring the replacement functionality into a separate file allows us to completely drop the dependency on target information inside `LLVMCodegenVisitor` 😊