[flang][OpenMP] Parallel region codegen support #424

Merged
merged 1 commit into from Sep 22, 2020


@SouraVX SouraVX commented Sep 13, 2020

Executable can be generated and tested as:

$ bbc -fopenmp -emit-fir parallel.f90 -o -| tco | llc -filetype=obj -o parallel.o
$ clang parallel.o -L/PATH/lib -lFortranRuntime -lFortranDecimal -L/PATH/lib/ -lomp -lstdc++ -lm
$ ./a.out

@SouraVX
Collaborator Author

SouraVX commented Sep 13, 2020

Pattern matching at the LLVM IR level is getting a bit problematic to deal with.

Collaborator

@kiranchandramohan kiranchandramohan left a comment

Thanks @SouraVX. Great going!
A few comments inline.

  void genOpenMPConstruct(AbstractConverter &, pft::Evaluation &,
-                         const parser::OpenMPConstruct &);
+                         const parser::OpenMPConstruct &, GenFIRCBTy genFIRCB);
Collaborator

Are callbacks used to match the OpenMPIRBuilder style? Or do you think it is necessary here?

Collaborator Author

The callback approach has the following advantages:

  1. No duplication; otherwise a lot of code might need to be pulled out of Bridge.cpp (where most of the lowering code resides).
  2. Less error-prone, since the actual codegen falls back cleanly to the Flang codegen.
  3. The code flow and the interface between the OpenMP codegen and the rest of the codegen stay fairly independent and clean.
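
To make the callback argument above concrete, here is a minimal, self-contained sketch of the pattern. It is not the code from this patch: `Evaluation`, `GenFIRCallback`, `lowerParallelRegion` and `lowerExample` are hypothetical stand-ins, and the "IR" is just a list of strings. It only shows how the OpenMP side can emit the region shell while the body lowering stays with the caller.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a PFT evaluation; not the real Flang type.
struct Evaluation {
  std::string name;
};

// Same shape as the GenFIRCBTy alias in the patch, over the toy Evaluation.
using GenFIRCallback = std::function<void(Evaluation &)>;

// Toy "OpenMP lowering": emits the region shell and delegates every nested
// evaluation to the lowering routine supplied by the caller.
void lowerParallelRegion(std::vector<Evaluation> &body,
                         std::vector<std::string> &out,
                         const GenFIRCallback &genBody) {
  assert(genBody && "a body-lowering callback is required");
  out.push_back("omp.parallel {");
  for (Evaluation &eval : body)
    genBody(eval);
  out.push_back("  omp.terminator");
  out.push_back("}");
}

// The caller owns the general lowering code and hands it over as a lambda,
// so nothing has to be copied into the OpenMP lowering itself.
std::vector<std::string> lowerExample() {
  std::vector<Evaluation> body{Evaluation{"print"}, Evaluation{"assign"}};
  std::vector<std::string> out;
  lowerParallelRegion(body, out, [&out](Evaluation &eval) {
    out.push_back("  fir:" + eval.name);
  });
  return out;
}
```

Calling `lowerExample()` yields the shell lines with the two body lines placed between them, which is the "no duplication" point in miniature.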

@@ -21,6 +21,8 @@

#define TODO() llvm_unreachable("not yet implemented")

using GenFIRCBTy = std::function<void(Fortran::lower::pft::Evaluation &eval)>;
Collaborator

Is this a re-declaration? There is one in OpenMP.h as well.

Collaborator Author

Yes, that's needed. I cross-checked once again :)
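
As a side note on the re-declaration question: C++ does allow an alias to be declared more than once in the same namespace scope, as long as every declaration names the same type. The sketch below illustrates only that language rule. `Evaluation` and `countCalls` are hypothetical, and whether the two declarations in this patch actually sit in the same namespace is an assumption, not something the thread confirms.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for Fortran::lower::pft::Evaluation.
struct Evaluation {};

// As if it came from one header.
using GenFIRCBTy = std::function<void(Evaluation &)>;
// As if it came from a second header: an identical redeclaration is accepted.
using GenFIRCBTy = std::function<void(Evaluation &)>;

// Uses the alias once to show that both declarations name one type.
int countCalls() {
  int calls = 0;
  GenFIRCBTy callback = [&calls](Evaluation &) { ++calls; };
  assert(callback && "callback must be set before use");
  Evaluation eval;
  callback(eval);
  callback(eval);
  return calls;
}
```

If the second declaration named a different type, the compiler would reject it, so a duplicated alias like this is harmless but also easy to let drift.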

@@ -0,0 +1,84 @@
! This test checks lowering of OpenMP parallel Directive with arbitrary code
Collaborator

Can you add more tests?

  1. With a fir loop inside.
  2. With an if statement.
  3. With a goto statement (branching to inside the region).

Collaborator Author

Sure, I will take care of this in the next revision.

@SouraVX
Collaborator Author

SouraVX commented Sep 16, 2020

The do loop test is failing at legalization to the LLVM IR dialect. I'm looking into it.

program main
!$OMP PARALLEL NUM_THREADS(2)
      do i = 2, 5
      print*, "Hello"
      end do
!$OMP END PARALLEL
end

bbc -fopenmp loop.f90 -emit-fir -o - |tco

loc("<stdin>":24:13): error: failed to legalize operation 'std.addi'
error: error in converting to LLVM-IR dialect

@kiranchandramohan
Collaborator

This might need the conversion patterns; check openmptollvm in the conversions directory.

@SouraVX
Collaborator Author

SouraVX commented Sep 16, 2020

Thanks @kiranchandramohan, legalization worked. But an invalid type conversion is now getting triggered, and it is related to fir.
@schweitzpgi Any hints? Has this sort of bug been reported previously?

bbc -fopenmp loop.f90 --dump-module-on-failure -emit-llvm
error: 'fir.convert' op invalid type conversion
oops, pass manager reported failure
....
%0 = llvm.mlir.constant(2 : i32) : !llvm.i32
      %1 = llvm.mlir.constant(2 : index) : !llvm.i64
      %2 = llvm.mlir.constant(1 : index) : !llvm.i64
      %3 = llvm.mlir.constant(-1 : i32) : !llvm.i32
      %4 = llvm.mlir.constant(4 : i32) : !llvm.i32
      %5 = llvm.mlir.constant(5 : index) : !llvm.i64
      %6 = llvm.mlir.constant(4 : index) : !llvm.i64
      %7 = llvm.mlir.constant(0 : index) : !llvm.i64

....
omp.parallel num_threads(%0 : !llvm.i32) {
      llvm.br ^bb1(%1, %1, %6 : !llvm.i64, !llvm.i64, !llvm.i64)
    ^bb1(%9: !llvm.i64, %10: !llvm.i64, %11: !llvm.i64):  // 2 preds: ^bb0, ^bb2
      %12 = llvm.icmp "sgt" %11, %7 : !llvm.i64
      llvm.cond_br %12, ^bb2, ^bb3
    ^bb2:  // pred: ^bb1
      %13 = fir.convert %9 : (!llvm.i64) -> i32  // This is triggering invalid type conversion
      fir.store %13 to %8 : !fir.ref<i32>
      %14 = fir.address_of(@_QQcl.6C6F6F702E663930) : !fir.ref<!fir.array<9x!fir.char<1>>>
      %15 = fir.convert %14 : (!fir.ref<!fir.array<9x!fir.char<1>>>) -> !fir.ref<i8>
      %16 = fir.call @_FortranAioBeginExternalListOutput(%3, %15, %4) : (!llvm.i32, !fir.ref<i8>, !llvm.i32) -> !fir.ref<i8>
      %17 = fir.address_of(@_QQcl.48656C6C6F) : !fir.ref<!fir.array<5x!fir.char<1>>>
      %18 = fir.convert %17 : (!fir.ref<!fir.array<5x!fir.char<1>>>) -> !fir.ref<i8>
      %19 = fir.convert %5 : (!llvm.i64) -> i64
      %20 = fir.call @_FortranAioOutputAscii(%16, %18, %19) : (!fir.ref<i8>, !fir.ref<i8>, i64) -> i1
      %21 = fir.call @_FortranAioEndIoStatement(%16) : (!fir.ref<i8>) -> i32
      %22 = llvm.add %9, %2 : !llvm.i64
      %23 = llvm.sub %11, %2 : !llvm.i64
      llvm.br ^bb1(%22, %22, %23 : !llvm.i64, !llvm.i64, !llvm.i64)
    ^bb3:  // pred: ^bb1
      %24 = fir.convert %10 : (!llvm.i64) -> i32
      fir.store %24 to %8 : !fir.ref<i32>
      omp.terminator
...
    }

@kiranchandramohan
Collaborator

There is probably an ordering issue here. The openmp to llvm conversion is not aware of fir. I was hoping that the fir to llvm pass would be called automatically. Will swapping the order of fir to llvm and openmp to llvm get it to work? @ericschweitz can probably help here.

@SouraVX
Collaborator Author

SouraVX commented Sep 16, 2020

> There is probably an ordering issue here. The openmp to llvm conversion is not aware of fir. I was hoping that the fir to llvm pass would be called automatically. Will swapping the order of fir to llvm and openmp to llvm get it to work? @ericschweitz can probably help here.

Yes, I'm observing the same. IIUC, with the current ordering this pass lowers all std operations to the llvm dialect too early, and once an operation is lowered it's probably not feasible for FIR to lift the abstraction back up and work on it.
After inserting this pass in the pipeline, I see the following failures:

Failed Tests (5):
  Flang :: Lower/OpenMP/empty-omp-parallel.f90
  Flang :: Lower/OpenMP/omp-parallel-region.f90
  Flang :: Lower/array-init.f90
  Flang :: Lower/end-to-end-character-assignment.f90
  Flang :: Lower/intrinsic-wrappers.f90

> Will swapping the order of fir to llvm and openmp to llvm get it to work?

Ordering the passes this way works fine, but the earlier issue persists (i.e. failed to legalize operation 'std.addi'):

 pm.addPass(fir::createFIRToLLVMPass(nameUniquer));
 pm.addPass(mlir::createConvertOpenMPToLLVMPass());
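
For readers following the ordering discussion, here is a toy model of why lowering std operations too early can break a later fir lowering. It is a deliberately simplified sketch and does not use any MLIR API: `ToyOp`, `lowerStdEarly`, `lowerFirAndStd` and the two `run...` helpers are all hypothetical. It models only the `fir.convert` invalid type conversion failure, not the separate `std.addi` legalization error.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy op: the dialect it lives in and the type of the value it consumes.
struct ToyOp {
  std::string dialect;      // "std", "fir" or "llvm"
  std::string operandType;  // "index" or "!llvm.i64"
};

// Toy stand-in for a pass that lowers std ops on its own. Every value they
// produce becomes LLVM-typed, so every user now sees an LLVM-typed operand.
void lowerStdEarly(std::vector<ToyOp> &ops) {
  for (ToyOp &op : ops) {
    if (op.dialect == "std")
      op.dialect = "llvm";
    op.operandType = "!llvm.i64";
  }
}

// Toy stand-in for a pass that lowers fir and std together. It refuses a fir
// op whose operand was already rewritten to an LLVM type by an earlier pass.
bool lowerFirAndStd(std::vector<ToyOp> &ops) {
  for (const ToyOp &op : ops)
    if (op.dialect == "fir" && op.operandType == "!llvm.i64")
      return false;  // analogous to "invalid type conversion"
  for (ToyOp &op : ops) {
    op.dialect = "llvm";
    op.operandType = "!llvm.i64";
  }
  return true;
}

// One std op feeding one fir op, both still on high-level types.
std::vector<ToyOp> sampleRegion() {
  return {ToyOp{"std", "index"}, ToyOp{"fir", "index"}};
}

// Early std lowering followed by the combined pass.
bool runEarlyStdLowering() {
  std::vector<ToyOp> ops = sampleRegion();
  assert(!ops.empty());
  lowerStdEarly(ops);
  return lowerFirAndStd(ops);
}

// The combined pass runs first and sees untouched operands.
bool runFirFirst() {
  std::vector<ToyOp> ops = sampleRegion();
  assert(!ops.empty());
  return lowerFirAndStd(ops);
}
```

In this model `runEarlyStdLowering()` fails while `runFirFirst()` succeeds, mirroring the observation that the pass which understands fir has to see the operations before their operands are rewritten.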

@kiranchandramohan
Collaborator

Maybe what is required is an openmp to fir pass.

@kiranchandramohan
Collaborator

Will adding the parallel conversion pattern in the fir to llvm pass make it work? You will have to mark the parallel operation as illegal.

@SouraVX
Collaborator Author

SouraVX commented Sep 17, 2020

@kiranchandramohan, this error:

loc("<stdin>":24:13): error: failed to legalize operation 'std.addi'
error: error in converting to LLVM-IR dialect

is because the Standard dialect is not registered as a legal dialect in the conversion target below. Adding it as legal doesn't make sense, since the conversion patterns are populated right there.

target.addLegalDialect<mlir::omp::OpenMPDialect>();
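
As a rough illustration of the legality rule being discussed, the sketch below models a conversion target as a set of legal dialects plus a set of op names that have a conversion pattern. This is a toy model under simplifying assumptions, not the MLIR dialect conversion framework: `ToyTarget`, `firstIllegalOp` and the two example helpers are hypothetical, and real legalization also checks what a pattern rewrites the op into. It does not diagnose the actual cause of the `std.addi` failure in this thread.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy conversion target.
struct ToyTarget {
  std::set<std::string> legalDialects;  // dialects accepted as-is
  std::set<std::string> patterns;       // op names with a conversion pattern
};

// Returns the first op that is neither in a legal dialect nor covered by a
// pattern, or an empty string when every op can be legalized.
std::string firstIllegalOp(const ToyTarget &target,
                           const std::vector<std::string> &ops) {
  for (const std::string &op : ops) {
    const std::string::size_type dot = op.find('.');
    assert(dot != std::string::npos && "expected a dialect.op name");
    if (target.legalDialects.count(op.substr(0, dot)))
      continue;
    if (target.patterns.count(op))
      continue;
    return op;
  }
  return "";
}

// omp is legal and fir.convert has a pattern, but nothing handles std.addi.
std::string exampleMissingPattern() {
  ToyTarget target{{"llvm", "omp"}, {"fir.convert"}};
  const std::vector<std::string> ops{"omp.parallel", "fir.convert", "std.addi"};
  return firstIllegalOp(target, ops);
}

// Same ops, with a pattern for std.addi registered as well.
std::string exampleWithPattern() {
  ToyTarget target{{"llvm", "omp"}, {"fir.convert", "std.addi"}};
  const std::vector<std::string> ops{"omp.parallel", "fir.convert", "std.addi"};
  return firstIllegalOp(target, ops);
}
```

The first helper reports `std.addi`, the second reports nothing, which is the sense in which either a legality declaration or an applicable pattern is needed for each op that reaches the conversion.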

After looking through the lowering part, I'm also sceptical about the lowering that has happened. The code inside the region uses the fir lowering code.
So how come it creates operations from the Standard dialect? Shouldn't it have created the fir analogs of these operations?

 %14 = addi %1, %c1 : index
 %15 = subi %3, %c1 : index

@schweitzpgi @jeanPerier, do you think this lowering is legit?

Also, I had a brief discussion with @kiranchandramohan; he suggested proceeding with this PR without the loop test case and handling this issue separately.
Does that sound good to you folks?

@schweitzpgi

This looks strange and (likely) is failing legalization.

%19 = fir.convert %5 : (!llvm.i64) -> i64

Are you converting OMP dialect to standard dialect or straight to LLVM-IR? Code gen to LLVM IR dialect in tco does both standard and fir dialects together. It should be easy enough to add more patterns.

auto genFIRCB = [&](Fortran::lower::pft::Evaluation &eval) {
  genFIR(eval, /*unstructuredContext=*/false);
};
genOpenMPConstruct(*this, getEval(), omp, genFIRCB);
Collaborator

You can also set the insertionPoint correctly in genOpenMPConstruct and call genFIR in the eval loop right here so you don't need to pass a callback function.

Collaborator Author

Thanks for your suggestion! It simplified the change a lot. 👍
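
The insertion-point suggestion in this thread can be sketched without any callback. The example is a toy model, not the MLIR `OpBuilder`: `ToyBuilder`, `genParallelSkeleton` and `lowerWithoutCallback` are hypothetical names, and the "IR" is a list of strings with a cursor. It shows the idea of creating the region skeleton first, leaving the cursor inside it, and letting the caller's ordinary loop emit the body.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature of an IR builder: a list of lines plus a cursor.
struct ToyBuilder {
  std::vector<std::string> lines;
  std::size_t insertAt = 0;  // index at which the next line is inserted

  void emit(const std::string &line) {
    assert(insertAt <= lines.size() && "insertion point out of range");
    lines.insert(lines.begin() + static_cast<std::ptrdiff_t>(insertAt), line);
    ++insertAt;
  }
};

// Creates the region skeleton and leaves the cursor at the start of the
// region body, so later emits land before the terminator.
void genParallelSkeleton(ToyBuilder &builder) {
  builder.emit("omp.parallel {");
  const std::size_t bodyStart = builder.insertAt;
  builder.emit("  omp.terminator");
  builder.emit("}");
  builder.insertAt = bodyStart;
}

// The caller's ordinary evaluation loop does the body lowering itself.
std::vector<std::string> lowerWithoutCallback() {
  ToyBuilder builder;
  genParallelSkeleton(builder);
  const std::vector<std::string> statements = {"print", "assign"};
  for (const std::string &stmt : statements)
    builder.emit("  fir:" + stmt);
  return builder.lines;
}
```

Because the skeleton routine only moves the cursor, it needs no knowledge of how the body is lowered, which is what removes the need to pass a callback.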

Executable can be generated and tested as:
```
$ bbc -fopenmp -emit-fir parallel.f90 -o -| tco | llc -filetype=obj -o parallel.o
$ clang parallel.o -L/PATH/lib -lFortranRuntime -lFortranDecimal -L/PATH/lib/ -lomp -lstdc++ -lm
$ ./a.out
```

Extended test case for `if statement`. TODO: extend for `do loop`

Added OpenMPToLLVMPass to bbc pass pipeline
@SouraVX
Collaborator Author

SouraVX commented Sep 21, 2020

> This looks strange and (likely) is failing legalization.
>
> %19 = fir.convert %5 : (!llvm.i64) -> i64
>
> Are you converting OMP dialect to standard dialect or straight to LLVM-IR? Code gen to LLVM IR dialect in tco does both standard and fir dialects together. It should be easy enough to add more patterns.

This only happens if we add the pm.addPass(mlir::createConvertOpenMPToLLVMPass()); pass before FIRToLLVM, and we figured out that this is too early and wrong.

But the primary problem still persists; we (@kiranchandramohan too) are considering creating a separate issue to work on it.

loc("<stdin>":24:13): error: failed to legalize operation 'std.addi'
error: error in converting to LLVM-IR dialect

I've squashed this PR.
Does that sound good to you @schweitzpgi ?

@kiranchandramohan
Collaborator

Yes, I agree with @SouraVX that we can handle the ordering of lowering passes or addition of conversion patterns as a separate patch. If the basic lowering of a parallel region works in this patch then that is fine.

Collaborator

@kiranchandramohan kiranchandramohan left a comment

LGTM.

@schweitzpgi

I'm good with making progress and noting TODO items.

I assume this is ready to merge now?

@schweitzpgi schweitzpgi merged commit 2f3a180 into flang-compiler:fir-dev Sep 22, 2020
SouraVX added a commit to llvm/llvm-project that referenced this pull request Sep 25, 2020
After skeleton of the `Parallel Op` is created set the insertion point to start of the block. So that later `CodeGen` can proceed.

Note: This patch reflects the work that can be upstreamed from PR(merged)
PR: flang-compiler#424

Reviewed By: schweitz, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D88221
arichardson pushed a commit to arichardson/llvm-project that referenced this pull request Mar 24, 2021
After skeleton of the `Parallel Op` is created set the insertion point to start of the block. So that later `CodeGen` can proceed.

Note: This patch reflects the work that can be upstreamed from PR(merged)
PR: flang-compiler/f18-llvm-project#424

Reviewed By: schweitz, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D88221
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this pull request Oct 7, 2022
After skeleton of the `Parallel Op` is created set the insertion point to start of the block. So that later `CodeGen` can proceed.

Note: This patch reflects the work that can be upstreamed from PR(merged)
PR: flang-compiler/f18-llvm-project#424

Reviewed By: schweitz, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D88221