[LLVM][refactoring] Annotations pass and no more wrappers #893
Conversation
Additional comments: @ohm314 I noticed that in the Python bindings we do not use fast math flags: is this intentional? `nmodl::codegen::CodegenLLVMVisitor visitor(modname, cfg.output_dir, platform, 0);` Also @pramodk @iomaganaris: originally we have
Logfiles from GitLab pipeline #64962 (:white_check_mark:) have been uploaded here! Status and direct links:
Overall LGTM
Just a couple of questions.
Also, regarding your question about `set_loop_metadata`: I think it's a good idea. Maybe it makes sense for benchmarking different vector widths, but apart from that, in a production environment we would like to be able to vectorize as much as possible, so we could enable auto-vectorization. If auto-vectorization is enabled in all cases, I think we should also enable the math library replacement for all vector lengths. @pramodk what do you think?
Force-pushed from 45f422e to 5539a06
…ts (#894) * Made the pass to preserve analysis usage
(If @iomaganaris has reviewed & approved this, we could get this in)
Agreed that there is no need for the use case you mentioned. I just wonder if there could be other metadata that could help. Not exactly applicable here, but attributes like
I think this makes sense!
@pramodk @iomaganaris took me a while to address one single comment 😄
LGTM 👍
* New `Annotator` class to specialise how NMODL compute kernels (`llvm::Function`s) are annotated
* New `ReplacePass` class that replaces the standard math function calls in the kernels with the optimised math libraries passed to NMODL
* Replace kernel wrappers with generation of the compute functions with a `void*` parameter, which is then cast to the proper struct type internally by the compute function
* Identify compute kernels (`CodegenFunction`) with a new `is_kernel` flag
  - `is_kernel` is mapped to the LLVM metadata node `nmodl.compute-kernel`, which allows anyone to easily identify compute kernels in LLVM IR
* Remove the restriction on auto-vectorising loops by an external compiler (via certain LLVM IR loop metadata) when the vector length is set to 1 in NMODL
This PR fixes two things:

A new `Annotator` class is added with an abstract `annotate()` method. Users of MOD2IR can override this method (see `CUDAAnnotator` for example) to specialise how NMODL compute kernels (`llvm::Function`s) are annotated. This includes a helper LLVM module pass as well. Ideally, we want some config class (`Platform`?) to set up the right `Annotator`, but since there is no such class yet, it is done in `nmodl::utils`, similarly to MathFunctionReplacement.

It turns out that the NVPTX backend infers address spaces only for "kernel" functions and not for "device" functions. One of the reasons the current GPU code generation was working was the inlining of the compute ("device") functions into wrapper ("kernel") functions. This PR fixes the issue by removing wrapping altogether, instead providing a new `wrap_kernel_functions` flag to the visitor. If the flag is set, the compute kernel is generated with a `void*` pointer and an additional `bitcast`; otherwise, the struct pointer type is used directly.

`CodegenFunction` has a new `is_kernel` flag, so the LLVM visitor knows straight away whether a function is a kernel; no more type checks (single argument, struct pointer, etc.) are needed. `is_kernel` is mapped to the LLVM metadata node `nmodl.compute-kernel`, which allows anyone to easily identify compute kernels in LLVM IR. This metadata handling is part of the `Annotator`'s API.