New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[flang][OpenMP] Parallel region codegen support #424
Conversation
Pattern matching at LLVMIR level is getting a bit problematic to deal with. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @SouraVX. Great going!
A few comments inline.
flang/include/flang/Lower/OpenMP.h
Outdated
void genOpenMPConstruct(AbstractConverter &, pft::Evaluation &, | ||
const parser::OpenMPConstruct &); | ||
const parser::OpenMPConstruct &, GenFIRCBTy genFIRCB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are callbacks used to match the OpenMPIRBuilder style? Or do you think it is necessary here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using Callbacks approach has following advantages:
- No duplication, otherwise might need to pull out lot of code from
Bridge.cpp
(where most of the lowering code is residing). - Less error prone, since for actual codegen we fall back to
Flang
codegen cleanly. - Code flow and interfacing of
OpenMP
codegen from rest of the codegen remains pretty independent/good/.
flang/lib/Lower/OpenMP.cpp
Outdated
@@ -21,6 +21,8 @@ | |||
|
|||
#define TODO() llvm_unreachable("not yet implemented") | |||
|
|||
using GenFIRCBTy = std::function<void(Fortran::lower::pft::Evaluation &eval)>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a re-declaration? There is one in OpenMP.h as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's needed. I cross-checked once again :)
@@ -0,0 +1,84 @@ | |||
! This test checks lowering of OpenMP parallel Directive with arbitrary code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add more tests?
- With a fir loop inside.
- With an if statement.
- with a goto statement (branching to inside the region).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure will take care of this in next revision.
4c3904b
to
9cdf257
Compare
2607340
to
edfb6c9
Compare
bbc -fopenmp loop.f90 -emit-fir -o - |tco
|
Might need the conversion pattern check openmptollvm in the conversions directory. |
Thanks @kiranchandramohan, legalization worked. But some invalid type conversion getting triggered --- related to
|
There is probably an ordering issue here. The openmp to llvm conversion is not aware of fir. I was hopping that the fir to llvm pass will be called automatically. Will swapping the order of fir to llvm and openmp to llvm get it to work? @ericschweitz can probably help here. |
Yes, observing same. IIUC due to current ordering this pass is lowering all
Swapping or ordering this way works fine: but earlier issue persist(i.e
|
May be what is required is an openmp to fir pass. |
Will adding the parallel conversion pattern in the fir to llvm pass make it work? You will have to mark the parallel operation as illegal. |
@kiranchandramohan, this error:
is due to
After investigating through lowering part, I'm also sceptic WRT lowering that has happened. Code inside the region utilizes
@schweitzpgi @jeanPerier , do you guys think this lowering is legit ? Also I had a brief discussion with @kiranchandramohan , he was suggesting to proceed with this PR without the |
This looks strange and (likely) is failing legalization.
Are you converting OMP dialect to standard dialect or straight to LLVM-IR? Code gen to LLVM IR dialect in |
flang/lib/Lower/Bridge.cpp
Outdated
auto genFIRCB = [&](Fortran::lower::pft::Evaluation &eval) { | ||
genFIR(eval, /*unstructuredContext=*/false); | ||
}; | ||
genOpenMPConstruct(*this, getEval(), omp, genFIRCB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also set the insertionPoint correctly in genOpenMPConstruct
and call genFIR
in the eval loop right here so you don't need to pass a callback function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your suggestion! it simplified the change a lot. 👍
Executable can be generated and tested as: ``` $ bbc -fopenmp -emit-fir parallel.f90 -o -| tco | llc -filetype=obj -o parallel.o $ clang parallel.o -L/PATH/lib -lFortranRuntime -lFortranDecimal -L/PATH/lib/ -lomp -lstdc++ -lm $ ./a.out ``` Extended test case for `if statement`. TODO: extend for `do loop` Added OpenMPToLLVMPass to bbc pass pipeline
ebe0d09
to
4f1e425
Compare
This only happens if we add But primary problem still persist, we(@kiranchandramohan too) are considering to create separate issue and work on it.
I've squashed this PR. |
Yes, I agree with @SouraVX that we can handle the ordering of lowering passes or addition of conversion patterns as a separate patch. If the basic lowering of a parallel region works in this patch then that is fine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I'm good with making progress and noting TODO items. I assume this is ready to merge now? |
After skeleton of the `Parallel Op` is created set the insertion point to start of the block. So that later `CodeGen` can proceed. Note: This patch reflects the work that can be upstreamed from PR(merged) PR: flang-compiler#424 Reviewed By: schweitz, kiranchandramohan Differential Revision: https://reviews.llvm.org/D88221
After skeleton of the `Parallel Op` is created set the insertion point to start of the block. So that later `CodeGen` can proceed. Note: This patch reflects the work that can be upstreamed from PR(merged) PR: flang-compiler/f18-llvm-project#424 Reviewed By: schweitz, kiranchandramohan Differential Revision: https://reviews.llvm.org/D88221
After skeleton of the `Parallel Op` is created set the insertion point to start of the block. So that later `CodeGen` can proceed. Note: This patch reflects the work that can be upstreamed from PR(merged) PR: flang-compiler/f18-llvm-project#424 Reviewed By: schweitz, kiranchandramohan Differential Revision: https://reviews.llvm.org/D88221
Executable can be generated and tested as: