Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OpenMP] Fix target data region codegen being omitted for device pass #85218

Merged
merged 3 commits into from
Mar 19, 2024

Conversation

TIFitis
Copy link
Member

@TIFitis TIFitis commented Mar 14, 2024

This patch enables the BodyCodeGen callback to still trigger for the TargetData nested region during the device pass. There maybe Target code nested within the TargetData region for which this is required.

Also add tests for the same.

@llvmbot llvmbot added flang:openmp clang:openmp OpenMP related changes to Clang openmp:libomptarget OpenMP offload runtime labels Mar 14, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 14, 2024

@llvm/pr-subscribers-flang-openmp

Author: Akash Banerjee (TIFitis)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/85218.diff

2 Files Affected:

  • (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+4-1)
  • (added) openmp/libomptarget/test/offloading/fortran/target-nested-target-data.f90 (+31)
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index f6cf358119fb71..bd58f9ebba1c21 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -4791,8 +4791,11 @@ OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createTargetData(
     return InsertPointTy();
 
   // Disable TargetData CodeGen on Device pass.
-  if (Config.IsTargetDevice.value_or(false))
+  if (Config.IsTargetDevice.value_or(false)){
+    if(BodyGenCB)
+      Builder.restoreIP(BodyGenCB(Builder.saveIP(), BodyGenTy::NoPriv));
     return Builder.saveIP();
+  }
 
   Builder.restoreIP(CodeGenIP);
   bool IsStandAlone = !BodyGenCB;
diff --git a/openmp/libomptarget/test/offloading/fortran/target-nested-target-data.f90 b/openmp/libomptarget/test/offloading/fortran/target-nested-target-data.f90
new file mode 100644
index 00000000000000..5508690c82ddd9
--- /dev/null
+++ b/openmp/libomptarget/test/offloading/fortran/target-nested-target-data.f90
@@ -0,0 +1,31 @@
+! Basic offloading test of arrays with provided lower
+! and upper bounds as specified by OpenMP's sectioning
+! REQUIRES: flang
+! UNSUPPORTED: nvptx64-nvidia-cuda-LTO
+! UNSUPPORTED: aarch64-unknown-linux-gnu
+! UNSUPPORTED: aarch64-unknown-linux-gnu-LTO
+! UNSUPPORTED: x86_64-pc-linux-gnu
+! UNSUPPORTED: x86_64-pc-linux-gnu-LTO
+
+! RUN: %libomptarget-compile-fortran-run-and-check-generic
+program main
+   integer :: A(10), B(10), C(10)
+
+   do I = 1, 10
+      A(I) = 1
+      B(I) = 2
+   end do
+   !$omp target data map(to: A, B) map(alloc: C)
+   !$omp target map(from: C)
+   do I = 1, 10
+      C(I) = A(I) + B(I) ! assigns 3, A:1 + B:2
+   end do
+   !$omp end target
+   !$omp target update from(C) ! updates C device -> host
+   !$omp end target data
+
+   print *, C ! should be all 3's
+
+end program
+
+! CHECK: 3 3 3 3 3 3 3 3 3 3

Copy link

github-actions bot commented Mar 14, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@agozillon agozillon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, minus the code formatter issues! Please do wait on a secondary review though. It would also be very cool to have another runtime test or two as well, just to see if we can dig up any other issues if they exist or get a little more code coverage. But that's only if you wish to do so, I imagine there's other more critical things to be done elsewhere.

@skatrak
Copy link
Contributor

skatrak commented Mar 14, 2024

Thanks Akash, it looks ok for me as well, but I'm missing one OMPIRBuilder test to check not just the overall behavior but also the IR produced (llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp).

@TIFitis
Copy link
Member Author

TIFitis commented Mar 18, 2024

@skatrak I've added a test to the OpenMPIRBuilderTest file. Let me know if it's how you wanted.

Copy link
Contributor

@skatrak skatrak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Akash, LGTM. Could you add a brief commit description before you merge?

@TIFitis TIFitis merged commit e9da5f0 into llvm:main Mar 19, 2024
3 of 4 checks passed
@jdoerfert
Copy link
Member

This patch enables the BodyCodeGen callback to still trigger for the TargetData nested region during the device pass. There maybe Target code nested within the TargetData region for which this is required.

Could you elaborate on this? What code associated with the target data is required on the device? That doesn't connect in my mind and I'm worried this hides a different problem.
The test doesn't help me understand what happened before either.

@TIFitis
Copy link
Member Author

TIFitis commented Mar 19, 2024

Could you elaborate on this? What code associated with the target data is required on the device? That doesn't connect in my mind and I'm worried this hides a different problem. The test doesn't help me understand what happened before either.

Take the following test:

!$omp target data map(to: A, B) map(alloc: C)
!$omp target
  do I = 1, 10
     C(I) = A(I) + B(I) ! assigns 3, A:1 + B:2
  end do
!$omp end target
!$omp end target data

We don't need to emit any code for target data on the device pass, but the target directive nested inside it needs device pass codegen. By not having the BodyGenCB callback, we were omitting this codegen which this patch brings back.

Note that both Clang and Flang don't actually emit any code for the target data region unless it is target code.

The following example still doesn't produce any code for the device pass related to target data or it's nested region.

!$omp target data map(to: A, B) map(alloc: C)
  do I = 1, 10
     C(I) = A(I) + B(I) ! assigns 3, A:1 + B:2
  end do
!$omp end target data

@skatrak
Copy link
Contributor

skatrak commented Mar 20, 2024

This seems to be yet another manifestation of this problem: #85239, #84611. It appears the workaround implemented in this patch wouldn't be needed (at least for Flang) after solving the problem of skipping host operations during device compilation while still processing target regions nested inside them.

chencha3 pushed a commit to chencha3/llvm-project that referenced this pull request Mar 23, 2024
…llvm#85218)

This patch enables the BodyCodeGen callback to still trigger for the
TargetData nested region during the device pass. There maybe Target code
nested within the TargetData region for which this is required.

Also add tests for the same.
@jdoerfert
Copy link
Member

I think I agree with @skatrak, this looks like a narrow workaround.
We should not do any host codegen on the device lowering path, target data or other things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:openmp OpenMP related changes to Clang flang:openmp openmp:libomptarget OpenMP offload runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants