-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[OpenMP][MLIR] Fix initTargetRuntimeAttrs, LoopTripCount calculation when host_eval vars is empty #157717
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-mlir-openmp @llvm/pr-subscribers-mlir Author: Chaitanya (skc7) ChangesIn OpenMPToLLVMIRTranslation, If This PR fixes this issue by adding a check to presence of host_eval vars before calculating the loopTripCount. Full diff: https://github.com/llvm/llvm-project/pull/157717.diff 2 Files Affected:
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 6694de8383534..019ce626903b2 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -5393,8 +5393,11 @@ initTargetRuntimeAttrs(llvm::IRBuilderBase &builder,
Value numThreads, numTeamsLower, numTeamsUpper, teamsThreadLimit;
llvm::SmallVector<Value> lowerBounds(numLoops), upperBounds(numLoops),
steps(numLoops);
- extractHostEvalClauses(targetOp, numThreads, numTeamsLower, numTeamsUpper,
- teamsThreadLimit, &lowerBounds, &upperBounds, &steps);
+ if (!targetOp.getHostEvalVars().empty()) {
+ extractHostEvalClauses(targetOp, numThreads, numTeamsLower, numTeamsUpper,
+ teamsThreadLimit, &lowerBounds, &upperBounds,
+ &steps);
+ }
// TODO: Handle constant 'if' clauses.
if (Value targetThreadLimit = targetOp.getThreadLimit())
@@ -5424,6 +5427,8 @@ initTargetRuntimeAttrs(llvm::IRBuilderBase &builder,
// here, since we're only interested in the trip count.
for (auto [loopLower, loopUpper, loopStep] :
llvm::zip_equal(lowerBounds, upperBounds, steps)) {
+ if (!loopLower || !loopUpper || !loopStep)
+ break;
llvm::Value *lowerBound = moduleTranslation.lookupValue(loopLower);
llvm::Value *upperBound = moduleTranslation.lookupValue(loopUpper);
llvm::Value *step = moduleTranslation.lookupValue(loopStep);
diff --git a/mlir/test/Target/LLVMIR/omptarget-loopnest-no-host-eval.mlir b/mlir/test/Target/LLVMIR/omptarget-loopnest-no-host-eval.mlir
new file mode 100644
index 0000000000000..8628c115364f9
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/omptarget-loopnest-no-host-eval.mlir
@@ -0,0 +1,28 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+// Ensure that the -mlir-to-llmvir pass doesn't crash.
+
+// CHECK: define void @_QQmain()
+
+module attributes {llvm.target_triple = "x86_64-unknown-linux-gnu", omp.is_gpu = false, omp.is_target_device = false} {
+ llvm.func @_QQmain() {
+ omp.target {
+ %0 = llvm.mlir.constant(1000 : i32) : i32
+ %1 = llvm.mlir.constant(1 : i32) : i32
+ omp.teams {
+ omp.parallel {
+ omp.distribute {
+ omp.wsloop {
+ omp.loop_nest (%arg0) : i32 = (%1) to (%0) inclusive step (%1) {
+ omp.yield
+ }
+ } {omp.composite}
+ } {omp.composite}
+ omp.terminator
+ } {omp.composite}
+ omp.terminator
+ }
+ omp.terminator
+ }
+ llvm.return
+ }
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this change is correct. Having host-evaluated loop bounds is a hard requirement for SPMD kernels (i.e. target teams distribute parallel do
). Even if they are constants, they must be passed as host_eval
arguments to omp.target
when compiling for the host. See for example the unit test at "flang/test/Lower/OpenMP/target-spmd.f90".
It is only when compiling for a target device (i.e. the module attribute omp.is_target_device=true
) that we don't use host_eval
.
What is the motivation for this change, is there any code currently producing such invalid MLIR?
This change is required for #140523 workdistribute construct lowering, where in a fir.do_loop unordered is identified under The test mentioned in this PR passes the verifier, but crashes at lowering from openmp to llvm ir. |
If that's the case, then I think some changes would probably have to be done to workdistribute lowering. I think it should conform to the rules of the dialect, so we can properly codegen for the resulting target region. If you look at the current work being done for automatic offloading of do-concurrent (#155987), you can see it is able to follow these rules and produce the expected MLIR. I think the core issue here is that, whenever |
Closing this PR, since host_eval is needed in host compilation. |
In OpenMPToLLVMIRTranslation, If
host_eval
vars are empty, LoopTripCount cannot be calculated.This PR fixes this issue by adding a check to presence of host_eval vars before calculating the loopTripCount.