
[LLVM][refactoring] Annotations pass and no more wrappers #893

Merged: 11 commits, Aug 22, 2022

Conversation

georgemitenkov (Collaborator)
This PR changes three things:

  1. A new Annotator class is added with an abstract annotate() method. Users of MOD2IR can override this method (see CUDAAnnotator, for example) to specialise how NMODL compute kernels (llvm::Functions) are annotated. This includes a helper LLVM module pass as well. Ideally, we would want some config class (Platform?) to set up the right Annotator, but since there is no such class yet, this is done in nmodl::utils, similarly to MathFunctionReplacement.

  2. It turns out that the NVPTX backend infers address spaces only for "kernel" functions and not for "device" functions. One of the reasons the current GPU code generation was working at all was the inlining of the compute kernel ("device") functions into wrapper ("kernel") functions. This PR fixes the issue by removing wrapping altogether and instead providing a new wrap_kernel_functions flag to the visitor. If the flag is set, the compute kernel is generated with a void* pointer and an additional bitcast; otherwise the struct pointer type is used directly.

;; wrap_kernel_functions = false
define void @nrn_state_hh(%__instance_var__type* noalias nocapture readonly %mech1)

;; wrap_kernel_functions = true
define void @nrn_state_hh(i8* noalias nocapture readonly %mech1)
  3. Finally, to identify compute kernels, CodegenFunction has a new is_kernel flag. Hence, the LLVM visitor knows straight away whether a function is a kernel. No more type checks (single argument, struct pointer, etc.) are needed. is_kernel is mapped to an LLVM metadata node nmodl.compute-kernel that allows anyone to easily identify compute kernels in LLVM IR. This metadata handling is part of the Annotator's API.
define void @nrn_state_hh(%__instance_var__type* %mech1) #0 !nmodl.annotations !0 {
  ;; some code
}

;; This says that `nrn_state_hh` is a kernel!
!0 = !{!"nmodl.compute-kernel"}

georgemitenkov (Collaborator, Author) commented Jul 11, 2022

Additional comments:

@ohm314 I noticed that in Python bindings we do not use fast math flags: is this intentional?

nmodl::codegen::CodegenLLVMVisitor visitor(modname, cfg.output_dir, platform, 0);

Also @pramodk @iomaganaris: originally we had a set_loop_metadata call when generating the FOR loop, which disables vectorisation when we run with vector-width 1. This makes comparison with "auto-vectorisation" meaningless. Maybe we want to remove it? I put a TODO in the changed files.

bbpbuildbot (Collaborator): Logfiles from GitLab pipeline #64962 (:white_check_mark:) have been uploaded!


iomaganaris (Contributor) left a comment

Overall LGTM, just a couple of questions.
Also, regarding your question about set_loop_metadata: I think removing it is a good idea. It may make sense for benchmarking different vector widths, but apart from that, in a production environment we would like to be able to vectorise as much as possible, so we could enable auto-vectorisation. If auto-vectorisation is enabled in all cases, I think we should also enable the math library replacement for all vector lengths. @pramodk, what do you think?

(Review thread on src/codegen/llvm/codegen_llvm_visitor.cpp, resolved)

…ts (#894)

* Made the pass to preserve analysis usage

pramodk (Contributor) commented Jul 22, 2022

(If @iomaganaris has reviewed & approved this, we could get this in)

pramodk (Contributor) commented Jul 22, 2022

> originally we had a set_loop_metadata call when generating the FOR loop, which disables vectorisation when we run with vector-width 1.

> Also, regarding your question about set_loop_metadata: I think removing it is a good idea. It may make sense for benchmarking different vector widths, but apart from that, in a production environment we would like to be able to vectorise as much as possible, so we could enable auto-vectorisation.

Agreed that there is no need for the use case you mentioned. I just wonder if there could be other metadata that could help. Not exactly applicable here, but attributes like #pragma loop ivdep are a kind of hint/metadata? 🤔
But anyway, if we don't have any use case now then just remove it.

> If auto-vectorisation is enabled in all cases, I think we should also enable the math library replacement for all vector lengths. @pramodk, what do you think?

I think this makes sense!

georgemitenkov (Collaborator, Author)

@pramodk @iomaganaris took me a while to address one single comment 😄


iomaganaris (Contributor) left a comment

LGTM 👍

@iomaganaris iomaganaris merged commit d06efbe into llvm Aug 22, 2022
@iomaganaris iomaganaris deleted the george/annotations branch August 22, 2022 13:44
iomaganaris pushed a commit that referenced this pull request Sep 15, 2022
* New `Annotator` class to specialise how NMODL compute kernels (llvm::Functions) are annotated
* New `ReplacePass` class that replaces the kernels' standard math function calls with the optimised math libraries passed to NMODL
* Replace kernel wrappers with generation of the compute functions with a `void*` parameter, which is then cast to the proper struct type internally by the compute function
* Identify compute kernels' `CodegenFunction` with a new `is_kernel` flag
  - `is_kernel` is mapped to the LLVM metadata node `nmodl.compute-kernel`, which allows anyone to easily identify compute kernels in LLVM IR
* Remove the LLVM IR loop metadata that prevented external compilers from auto-vectorising loops when the vector length is set to 1 in NMODL
Labels: llvm, refactoring (Code refactoring)
4 participants