[flang][OpenMP] Skip runtime mapping with no offload targets #144534

Open: wants to merge 3 commits into base: main

kparzysz
Contributor

When no offload targets are specified flang will ignore "target" constructs, but not "target data" constructs. This patch makes the behavior consistent across all offload-related operations.

While ignoring "target" may produce semantically incorrect code, it may still be a useful debugging tool.

@kparzysz kparzysz requested review from skatrak, tblah and mjklemm on June 17, 2025 14:38
@llvmbot llvmbot added the mlir:llvm, mlir, flang, mlir:openmp, flang:fir-hlfir and flang:openmp labels on Jun 17, 2025
@llvmbot
Member

llvmbot commented Jun 17, 2025

@llvm/pr-subscribers-mlir-openmp
@llvm/pr-subscribers-mlir-llvm
@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-flang-openmp

Author: Krzysztof Parzyszek (kparzysz)

Changes

When no offload targets are specified flang will ignore "target" constructs, but not "target data" constructs. This patch makes the behavior consistent across all offload-related operations.

While ignoring "target" may produce semantically incorrect code, it may still be a useful debugging tool.


Patch is 30.76 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/144534.diff

5 Files Affected:

  • (added) flang/test/Lower/ignore-target-data.f90 (+20)
  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+11)
  • (modified) mlir/test/Target/LLVMIR/omptarget-llvm.mlir (+184-164)
  • (modified) mlir/test/Target/LLVMIR/omptargetdata-nowait-llvm.mlir (+24-18)
  • (modified) mlir/test/Target/LLVMIR/openmp-data-target-device.mlir (+1-1)
diff --git a/flang/test/Lower/ignore-target-data.f90 b/flang/test/Lower/ignore-target-data.f90
new file mode 100644
index 0000000000000..a2182ee0f5b0b
--- /dev/null
+++ b/flang/test/Lower/ignore-target-data.f90
@@ -0,0 +1,20 @@
+!RUN: %flang_fc1 -emit-llvm -fopenmp %s -o - | FileCheck %s
+
+!Make sure that there are no calls to the mapper.
+
+!CHECK-NOT: call{{.*}}__tgt_target_data_begin_mapper
+!CHECK-NOT: call{{.*}}__tgt_target_data_end_mapper
+
+program test
+
+call f(1, 2)
+
+contains
+
+subroutine f(x, y)
+  integer :: x, y
+  !$omp target data map(tofrom: x, y)
+  x = x + y
+  !$omp end target data
+end subroutine
+end
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 90ce06a0345c0..5985f2b0f5621 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -4378,6 +4378,9 @@ convertOmpTargetData(Operation *op, llvm::IRBuilderBase &builder,
   llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder();
   llvm::OpenMPIRBuilder::TargetDataInfo info(/*RequiresDevicePointerInfo=*/true,
                                              /*SeparateBeginEndCalls=*/true);
+  bool isTargetDevice = ompBuilder->Config.isTargetDevice();
+  bool isOffloadEntry =
+      isTargetDevice || !ompBuilder->Config.TargetTriples.empty();
 
   LogicalResult result =
       llvm::TypeSwitch<Operation *, LogicalResult>(op)
@@ -4402,6 +4405,8 @@ convertOmpTargetData(Operation *op, llvm::IRBuilderBase &builder,
           .Case([&](omp::TargetEnterDataOp enterDataOp) -> LogicalResult {
             if (failed(checkImplementationStatus(*enterDataOp)))
               return failure();
+            if (!isOffloadEntry)
+              return success();
 
             if (auto ifVar = enterDataOp.getIfExpr())
               ifCond = moduleTranslation.lookupValue(ifVar);
@@ -4422,6 +4427,8 @@ convertOmpTargetData(Operation *op, llvm::IRBuilderBase &builder,
           .Case([&](omp::TargetExitDataOp exitDataOp) -> LogicalResult {
             if (failed(checkImplementationStatus(*exitDataOp)))
               return failure();
+            if (!isOffloadEntry)
+              return success();
 
             if (auto ifVar = exitDataOp.getIfExpr())
               ifCond = moduleTranslation.lookupValue(ifVar);
@@ -4442,6 +4449,8 @@ convertOmpTargetData(Operation *op, llvm::IRBuilderBase &builder,
           .Case([&](omp::TargetUpdateOp updateDataOp) -> LogicalResult {
             if (failed(checkImplementationStatus(*updateDataOp)))
               return failure();
+            if (!isOffloadEntry)
+              return success();
 
             if (auto ifVar = updateDataOp.getIfExpr())
               ifCond = moduleTranslation.lookupValue(ifVar);
@@ -4467,6 +4476,8 @@ convertOmpTargetData(Operation *op, llvm::IRBuilderBase &builder,
 
   if (failed(result))
     return failure();
+  if (!isOffloadEntry)
+    return success();
 
   using InsertPointTy = llvm::OpenMPIRBuilder::InsertPointTy;
   MapInfoData mapData;
diff --git a/mlir/test/Target/LLVMIR/omptarget-llvm.mlir b/mlir/test/Target/LLVMIR/omptarget-llvm.mlir
index 971bea2068544..e6ea3aaeec656 100644
--- a/mlir/test/Target/LLVMIR/omptarget-llvm.mlir
+++ b/mlir/test/Target/LLVMIR/omptarget-llvm.mlir
@@ -1,15 +1,17 @@
 // RUN: mlir-translate -mlir-to-llvmir -split-input-file %s | FileCheck %s
 
-llvm.func @_QPopenmp_target_data() {
-  %0 = llvm.mlir.constant(1 : i64) : i64
-  %1 = llvm.alloca %0 x i32 {bindc_name = "i", in_type = i32, operand_segment_sizes = array<i32: 0, 0>, uniq_name = "_QFopenmp_target_dataEi"} : (i64) -> !llvm.ptr
-  %2 = omp.map.info var_ptr(%1 : !llvm.ptr, i32)   map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""}
-  omp.target_data map_entries(%2 : !llvm.ptr) {
-    %3 = llvm.mlir.constant(99 : i32) : i32
-    llvm.store %3, %1 : i32, !llvm.ptr
-    omp.terminator
+module attributes {omp.target_triples = ["amdgcn-amd-amdhsa"]} {
+  llvm.func @_QPopenmp_target_data() {
+    %0 = llvm.mlir.constant(1 : i64) : i64
+    %1 = llvm.alloca %0 x i32 {bindc_name = "i", in_type = i32, operand_segment_sizes = array<i32: 0, 0>, uniq_name = "_QFopenmp_target_dataEi"} : (i64) -> !llvm.ptr
+    %2 = omp.map.info var_ptr(%1 : !llvm.ptr, i32)   map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""}
+    omp.target_data map_entries(%2 : !llvm.ptr) {
+      %3 = llvm.mlir.constant(99 : i32) : i32
+      llvm.store %3, %1 : i32, !llvm.ptr
+      omp.terminator
+    }
+    llvm.return
   }
-  llvm.return
 }
 
 // CHECK:         @.offload_sizes = private unnamed_addr constant [1 x i64] [i64 4]
@@ -38,23 +40,25 @@ llvm.func @_QPopenmp_target_data() {
 
 // -----
 
-llvm.func @_QPopenmp_target_data_region(%0 : !llvm.ptr) {
-  %1 = llvm.mlir.constant(1023 : index) : i64
-  %2 = llvm.mlir.constant(0 : index) : i64
-  %3 = llvm.mlir.constant(1024 : index) : i64
-  %4 = llvm.mlir.constant(1 : index) : i64
-  %5 = omp.map.bounds   lower_bound(%2 : i64) upper_bound(%1 : i64) extent(%3 : i64) stride(%4 : i64) start_idx(%4 : i64)
-  %6 = omp.map.info var_ptr(%0 : !llvm.ptr, !llvm.array<1024 x i32>)   map_clauses(from) capture(ByRef) bounds(%5)  -> !llvm.ptr {name = ""}
-  omp.target_data map_entries(%6 : !llvm.ptr) {
-    %7 = llvm.mlir.constant(99 : i32) : i32
-    %8 = llvm.mlir.constant(1 : i64) : i64
-    %9 = llvm.mlir.constant(1 : i64) : i64
-    %10 = llvm.mlir.constant(0 : i64) : i64
-    %11 = llvm.getelementptr %0[0, %10] : (!llvm.ptr, i64) -> !llvm.ptr, !llvm.array<1024 x i32>
-    llvm.store %7, %11 : i32, !llvm.ptr
-    omp.terminator
+module attributes {omp.target_triples = ["amdgcn-amd-amdhsa"]} {
+  llvm.func @_QPopenmp_target_data_region(%0 : !llvm.ptr) {
+    %1 = llvm.mlir.constant(1023 : index) : i64
+    %2 = llvm.mlir.constant(0 : index) : i64
+    %3 = llvm.mlir.constant(1024 : index) : i64
+    %4 = llvm.mlir.constant(1 : index) : i64
+    %5 = omp.map.bounds   lower_bound(%2 : i64) upper_bound(%1 : i64) extent(%3 : i64) stride(%4 : i64) start_idx(%4 : i64)
+    %6 = omp.map.info var_ptr(%0 : !llvm.ptr, !llvm.array<1024 x i32>)   map_clauses(from) capture(ByRef) bounds(%5)  -> !llvm.ptr {name = ""}
+    omp.target_data map_entries(%6 : !llvm.ptr) {
+      %7 = llvm.mlir.constant(99 : i32) : i32
+      %8 = llvm.mlir.constant(1 : i64) : i64
+      %9 = llvm.mlir.constant(1 : i64) : i64
+      %10 = llvm.mlir.constant(0 : i64) : i64
+      %11 = llvm.getelementptr %0[0, %10] : (!llvm.ptr, i64) -> !llvm.ptr, !llvm.array<1024 x i32>
+      llvm.store %7, %11 : i32, !llvm.ptr
+      omp.terminator
+    }
+    llvm.return
   }
-  llvm.return
 }
 
 // CHECK:         @.offload_sizes = private unnamed_addr constant [1 x i64] [i64 4096]
@@ -85,50 +89,52 @@ llvm.func @_QPopenmp_target_data_region(%0 : !llvm.ptr) {
 
 // -----
 
-llvm.func @_QPomp_target_enter_exit(%1 : !llvm.ptr, %3 : !llvm.ptr) {
-  %4 = llvm.mlir.constant(1 : i64) : i64
-  %5 = llvm.alloca %4 x i32 {bindc_name = "dvc", in_type = i32, operandSegmentSizes = array<i32: 0, 0>, uniq_name = "_QFomp_target_enter_exitEdvc"} : (i64) -> !llvm.ptr
-  %6 = llvm.mlir.constant(1 : i64) : i64
-  %7 = llvm.alloca %6 x i32 {bindc_name = "i", in_type = i32, operandSegmentSizes = array<i32: 0, 0>, uniq_name = "_QFomp_target_enter_exitEi"} : (i64) -> !llvm.ptr
-  %8 = llvm.mlir.constant(5 : i32) : i32
-  llvm.store %8, %7 : i32, !llvm.ptr
-  %9 = llvm.mlir.constant(2 : i32) : i32
-  llvm.store %9, %5 : i32, !llvm.ptr
-  %10 = llvm.load %7 : !llvm.ptr -> i32
-  %11 = llvm.mlir.constant(10 : i32) : i32
-  %12 = llvm.icmp "slt" %10, %11 : i32
-  %13 = llvm.load %5 : !llvm.ptr -> i32
-  %14 = llvm.mlir.constant(1023 : index) : i64
-  %15 = llvm.mlir.constant(0 : index) : i64
-  %16 = llvm.mlir.constant(1024 : index) : i64
-  %17 = llvm.mlir.constant(1 : index) : i64
-  %18 = omp.map.bounds   lower_bound(%15 : i64) upper_bound(%14 : i64) extent(%16 : i64) stride(%17 : i64) start_idx(%17 : i64)
-  %map1 = omp.map.info var_ptr(%1 : !llvm.ptr, !llvm.array<1024 x i32>)   map_clauses(to) capture(ByRef) bounds(%18) -> !llvm.ptr {name = ""}
-  %19 = llvm.mlir.constant(511 : index) : i64
-  %20 = llvm.mlir.constant(0 : index) : i64
-  %21 = llvm.mlir.constant(512 : index) : i64
-  %22 = llvm.mlir.constant(1 : index) : i64
-  %23 = omp.map.bounds   lower_bound(%20 : i64) upper_bound(%19 : i64) extent(%21 : i64) stride(%22 : i64) start_idx(%22 : i64)
-  %map2 = omp.map.info var_ptr(%3 : !llvm.ptr, !llvm.array<512 x i32>)   map_clauses(exit_release_or_enter_alloc) capture(ByRef) bounds(%23) -> !llvm.ptr {name = ""}
-  omp.target_enter_data   if(%12) device(%13 : i32) map_entries(%map1, %map2 : !llvm.ptr, !llvm.ptr)
-  %24 = llvm.load %7 : !llvm.ptr -> i32
-  %25 = llvm.mlir.constant(10 : i32) : i32
-  %26 = llvm.icmp "sgt" %24, %25 : i32
-  %27 = llvm.load %5 : !llvm.ptr -> i32
-  %28 = llvm.mlir.constant(1023 : index) : i64
-  %29 = llvm.mlir.constant(0 : index) : i64
-  %30 = llvm.mlir.constant(1024 : index) : i64
-  %31 = llvm.mlir.constant(1 : index) : i64
-  %32 = omp.map.bounds   lower_bound(%29 : i64) upper_bound(%28 : i64) extent(%30 : i64) stride(%31 : i64) start_idx(%31 : i64)
-  %map3 = omp.map.info var_ptr(%1 : !llvm.ptr, !llvm.array<1024 x i32>)   map_clauses(from) capture(ByRef) bounds(%32) -> !llvm.ptr {name = ""}
-  %33 = llvm.mlir.constant(511 : index) : i64
-  %34 = llvm.mlir.constant(0 : index) : i64
-  %35 = llvm.mlir.constant(512 : index) : i64
-  %36 = llvm.mlir.constant(1 : index) : i64
-  %37 = omp.map.bounds   lower_bound(%34 : i64) upper_bound(%33 : i64) extent(%35 : i64) stride(%36 : i64) start_idx(%36 : i64)
-  %map4 = omp.map.info var_ptr(%3 : !llvm.ptr, !llvm.array<512 x i32>)   map_clauses(exit_release_or_enter_alloc) capture(ByRef) bounds(%37) -> !llvm.ptr {name = ""}
-  omp.target_exit_data   if(%26) device(%27 : i32) map_entries(%map3, %map4 : !llvm.ptr, !llvm.ptr)
-  llvm.return
+module attributes {omp.target_triples = ["amdgcn-amd-amdhsa"]} {
+  llvm.func @_QPomp_target_enter_exit(%1 : !llvm.ptr, %3 : !llvm.ptr) {
+    %4 = llvm.mlir.constant(1 : i64) : i64
+    %5 = llvm.alloca %4 x i32 {bindc_name = "dvc", in_type = i32, operandSegmentSizes = array<i32: 0, 0>, uniq_name = "_QFomp_target_enter_exitEdvc"} : (i64) -> !llvm.ptr
+    %6 = llvm.mlir.constant(1 : i64) : i64
+    %7 = llvm.alloca %6 x i32 {bindc_name = "i", in_type = i32, operandSegmentSizes = array<i32: 0, 0>, uniq_name = "_QFomp_target_enter_exitEi"} : (i64) -> !llvm.ptr
+    %8 = llvm.mlir.constant(5 : i32) : i32
+    llvm.store %8, %7 : i32, !llvm.ptr
+    %9 = llvm.mlir.constant(2 : i32) : i32
+    llvm.store %9, %5 : i32, !llvm.ptr
+    %10 = llvm.load %7 : !llvm.ptr -> i32
+    %11 = llvm.mlir.constant(10 : i32) : i32
+    %12 = llvm.icmp "slt" %10, %11 : i32
+    %13 = llvm.load %5 : !llvm.ptr -> i32
+    %14 = llvm.mlir.constant(1023 : index) : i64
+    %15 = llvm.mlir.constant(0 : index) : i64
+    %16 = llvm.mlir.constant(1024 : index) : i64
+    %17 = llvm.mlir.constant(1 : index) : i64
+    %18 = omp.map.bounds   lower_bound(%15 : i64) upper_bound(%14 : i64) extent(%16 : i64) stride(%17 : i64) start_idx(%17 : i64)
+    %map1 = omp.map.info var_ptr(%1 : !llvm.ptr, !llvm.array<1024 x i32>)   map_clauses(to) capture(ByRef) bounds(%18) -> !llvm.ptr {name = ""}
+    %19 = llvm.mlir.constant(511 : index) : i64
+    %20 = llvm.mlir.constant(0 : index) : i64
+    %21 = llvm.mlir.constant(512 : index) : i64
+    %22 = llvm.mlir.constant(1 : index) : i64
+    %23 = omp.map.bounds   lower_bound(%20 : i64) upper_bound(%19 : i64) extent(%21 : i64) stride(%22 : i64) start_idx(%22 : i64)
+    %map2 = omp.map.info var_ptr(%3 : !llvm.ptr, !llvm.array<512 x i32>)   map_clauses(exit_release_or_enter_alloc) capture(ByRef) bounds(%23) -> !llvm.ptr {name = ""}
+    omp.target_enter_data   if(%12) device(%13 : i32) map_entries(%map1, %map2 : !llvm.ptr, !llvm.ptr)
+    %24 = llvm.load %7 : !llvm.ptr -> i32
+    %25 = llvm.mlir.constant(10 : i32) : i32
+    %26 = llvm.icmp "sgt" %24, %25 : i32
+    %27 = llvm.load %5 : !llvm.ptr -> i32
+    %28 = llvm.mlir.constant(1023 : index) : i64
+    %29 = llvm.mlir.constant(0 : index) : i64
+    %30 = llvm.mlir.constant(1024 : index) : i64
+    %31 = llvm.mlir.constant(1 : index) : i64
+    %32 = omp.map.bounds   lower_bound(%29 : i64) upper_bound(%28 : i64) extent(%30 : i64) stride(%31 : i64) start_idx(%31 : i64)
+    %map3 = omp.map.info var_ptr(%1 : !llvm.ptr, !llvm.array<1024 x i32>)   map_clauses(from) capture(ByRef) bounds(%32) -> !llvm.ptr {name = ""}
+    %33 = llvm.mlir.constant(511 : index) : i64
+    %34 = llvm.mlir.constant(0 : index) : i64
+    %35 = llvm.mlir.constant(512 : index) : i64
+    %36 = llvm.mlir.constant(1 : index) : i64
+    %37 = omp.map.bounds   lower_bound(%34 : i64) upper_bound(%33 : i64) extent(%35 : i64) stride(%36 : i64) start_idx(%36 : i64)
+    %map4 = omp.map.info var_ptr(%3 : !llvm.ptr, !llvm.array<512 x i32>)   map_clauses(exit_release_or_enter_alloc) capture(ByRef) bounds(%37) -> !llvm.ptr {name = ""}
+    omp.target_exit_data   if(%26) device(%27 : i32) map_entries(%map3, %map4 : !llvm.ptr, !llvm.ptr)
+    llvm.return
+  }
 }
 
 // CHECK:         @.offload_sizes = private unnamed_addr constant [2 x i64] [i64 4096, i64 2048]
@@ -205,18 +211,20 @@ llvm.func @_QPomp_target_enter_exit(%1 : !llvm.ptr, %3 : !llvm.ptr) {
 
 // -----
 
-llvm.func @_QPopenmp_target_use_dev_ptr() {
-  %0 = llvm.mlir.constant(1 : i64) : i64
-  %a = llvm.alloca %0 x !llvm.ptr : (i64) -> !llvm.ptr
-  %map1 = omp.map.info var_ptr(%a : !llvm.ptr, !llvm.ptr)   map_clauses(from) capture(ByRef) -> !llvm.ptr {name = ""}
-  %map2 = omp.map.info var_ptr(%a : !llvm.ptr, !llvm.ptr)   map_clauses(from) capture(ByRef) -> !llvm.ptr {name = ""}
-  omp.target_data  map_entries(%map1 : !llvm.ptr) use_device_ptr(%map2 -> %arg0 : !llvm.ptr)  {
-    %1 = llvm.mlir.constant(10 : i32) : i32
-    %2 = llvm.load %arg0 : !llvm.ptr -> !llvm.ptr
-    llvm.store %1, %2 : i32, !llvm.ptr
-    omp.terminator
+module attributes {omp.target_triples = ["amdgcn-amd-amdhsa"]} {
+  llvm.func @_QPopenmp_target_use_dev_ptr() {
+    %0 = llvm.mlir.constant(1 : i64) : i64
+    %a = llvm.alloca %0 x !llvm.ptr : (i64) -> !llvm.ptr
+    %map1 = omp.map.info var_ptr(%a : !llvm.ptr, !llvm.ptr)   map_clauses(from) capture(ByRef) -> !llvm.ptr {name = ""}
+    %map2 = omp.map.info var_ptr(%a : !llvm.ptr, !llvm.ptr)   map_clauses(from) capture(ByRef) -> !llvm.ptr {name = ""}
+    omp.target_data  map_entries(%map1 : !llvm.ptr) use_device_ptr(%map2 -> %arg0 : !llvm.ptr)  {
+      %1 = llvm.mlir.constant(10 : i32) : i32
+      %2 = llvm.load %arg0 : !llvm.ptr -> !llvm.ptr
+      llvm.store %1, %2 : i32, !llvm.ptr
+      omp.terminator
+    }
+    llvm.return
   }
-  llvm.return
 }
 
 // CHECK:         @.offload_sizes = private unnamed_addr constant [1 x i64] [i64 8]
@@ -249,18 +257,20 @@ llvm.func @_QPopenmp_target_use_dev_ptr() {
 
 // -----
 
-llvm.func @_QPopenmp_target_use_dev_addr() {
-  %0 = llvm.mlir.constant(1 : i64) : i64
-  %a = llvm.alloca %0 x !llvm.ptr : (i64) -> !llvm.ptr
-  %map = omp.map.info var_ptr(%a : !llvm.ptr, !llvm.ptr)   map_clauses(from) capture(ByRef) -> !llvm.ptr {name = ""}
-  %map2 = omp.map.info var_ptr(%a : !llvm.ptr, !llvm.ptr)   map_clauses(from) capture(ByRef) -> !llvm.ptr {name = ""}
-  omp.target_data  map_entries(%map : !llvm.ptr) use_device_addr(%map2 -> %arg0 : !llvm.ptr)  {
-    %1 = llvm.mlir.constant(10 : i32) : i32
-    %2 = llvm.load %arg0 : !llvm.ptr -> !llvm.ptr
-    llvm.store %1, %2 : i32, !llvm.ptr
-    omp.terminator
+module attributes {omp.target_triples = ["amdgcn-amd-amdhsa"]} {
+  llvm.func @_QPopenmp_target_use_dev_addr() {
+    %0 = llvm.mlir.constant(1 : i64) : i64
+    %a = llvm.alloca %0 x !llvm.ptr : (i64) -> !llvm.ptr
+    %map = omp.map.info var_ptr(%a : !llvm.ptr, !llvm.ptr)   map_clauses(from) capture(ByRef) -> !llvm.ptr {name = ""}
+    %map2 = omp.map.info var_ptr(%a : !llvm.ptr, !llvm.ptr)   map_clauses(from) capture(ByRef) -> !llvm.ptr {name = ""}
+    omp.target_data  map_entries(%map : !llvm.ptr) use_device_addr(%map2 -> %arg0 : !llvm.ptr)  {
+      %1 = llvm.mlir.constant(10 : i32) : i32
+      %2 = llvm.load %arg0 : !llvm.ptr -> !llvm.ptr
+      llvm.store %1, %2 : i32, !llvm.ptr
+      omp.terminator
+    }
+    llvm.return
   }
-  llvm.return
 }
 
 // CHECK:         @.offload_sizes = private unnamed_addr constant [1 x i64] [i64 8]
@@ -291,17 +301,19 @@ llvm.func @_QPopenmp_target_use_dev_addr() {
 
 // -----
 
-llvm.func @_QPopenmp_target_use_dev_addr_no_ptr() {
-  %0 = llvm.mlir.constant(1 : i64) : i64
-  %a = llvm.alloca %0 x i32 : (i64) -> !llvm.ptr
-  %map = omp.map.info var_ptr(%a : !llvm.ptr, i32)   map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""}
-  %map2 = omp.map.info var_ptr(%a : !llvm.ptr, i32)   map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""}
-  omp.target_data  map_entries(%map : !llvm.ptr) use_device_addr(%map2 -> %arg0 : !llvm.ptr)  {
-    %1 = llvm.mlir.constant(10 : i32) : i32
-    llvm.store %1, %arg0 : i32, !llvm.ptr
-    omp.terminator
+module attributes {omp.target_triples = ["amdgcn-amd-amdhsa"]} {
+  llvm.func @_QPopenmp_target_use_dev_addr_no_ptr() {
+    %0 = llvm.mlir.constant(1 : i64) : i64
+    %a = llvm.alloca %0 x i32 : (i64) -> !llvm.ptr
+    %map = omp.map.info var_ptr(%a : !llvm.ptr, i32)   map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""}
+    %map2 = omp.map.info var_ptr(%a : !llvm.ptr, i32)   map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""}
+    omp.target_data  map_entries(%map : !llvm.ptr) use_device_addr(%map2 -> %arg0 : !llvm.ptr)  {
+      %1 = llvm.mlir.constant(10 : i32) : i32
+      llvm.store %1, %arg0 : i32, !llvm.ptr
+      omp.terminator
+    }
+    llvm.return
   }
-  llvm.return
 }
 
 // CHECK:         @.offload_sizes = private unnamed_addr constant [1 x i64] [i64 4]
@@ -331,23 +343,25 @@ llvm.func @_QPopenmp_target_use_dev_addr_no_ptr() {
 
 // -----
 
-llvm.func @_QPopenmp_target_use_dev_addr_nomap() {
-  %0 = llvm.mlir.constant(1 : i64) : i64
-  %a = llvm.alloca %0 x !llvm.ptr : (i64) -> !llvm.ptr
-  %1 = llvm.mlir.constant(1 : i64) : i64
-  %b = llvm.alloca %0 x !llvm.ptr : (i64) -> !llvm.ptr
-  %map = omp.map.info var_ptr(%b : !llvm.ptr, !llvm.ptr)   map_clauses(from) capture(ByRef) -> !llvm.ptr {name = ""}
-  %map2 = omp.map.info var_ptr(%a : !llvm.ptr, !llvm.ptr)   map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""}
-  omp.target_data  map_entries(%map : !llvm.ptr) use_device_addr(%map2 -> %arg0 : !llvm.ptr)  {
-    %2 = llvm.mlir.constant(10 : i32) : i32
-    %3 = llvm.load %arg0 : !llvm.ptr -> !llvm.ptr
-    llvm.store %2, %3 : i32, !llvm.ptr
-    %4 = llvm.mlir.constant(20 : i32) : i32
-    %5 = llvm.load %b : !llvm.ptr -> !llvm.ptr
-    llvm.store %4, %5 : i32, !llvm.ptr
-    omp.terminator
+module attributes {omp.target_triples = ["amdgcn-amd-amdhsa"]} {
+  llvm.func @_QPopenmp_target_use_dev_addr_nomap() {
+    %0 = llvm.mlir.constant(1 : i64) : i64
+    %a = llvm.alloca %0 x !llvm.ptr : (i64) -> !llvm.ptr
+    %1 = llvm.mlir.constant(1 : i64) : i64
+    %b = llvm.alloca %0 x !llvm.ptr : (i64) -> !llvm.ptr
+    %map = omp.map.info var_ptr(%b : !llvm.ptr, !llvm.ptr)   map_clauses(from) capture(ByRef) -> !llvm.ptr {name = ""}
+    %map2 = omp.map.info var_ptr(%a : !llvm.ptr, !llvm.ptr)   map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""}
+    omp.target_data  map_entries(%map : !llvm.ptr) use_device_addr(%map2 -> %arg0 : !llvm.ptr)  {
+      %2 = llvm.mlir.constant(10 : i32) : i32
+      %3 = llvm.load %arg0 : !llvm.ptr -> !llvm.ptr
+      llvm.store %2, %3 : i32, !llvm.ptr
+      %4 = llvm.mlir.constant(20 : i32) : i32
+      %5 = llvm.load %b : !llvm.ptr -> !llvm.ptr
+      llvm.store %4, %5 : i32, !llvm.ptr
+      omp.terminator
+    }
+    llvm.return
   }
-  llvm.return
 }
 
 // CHECK:         @.offload_sizes = private unnamed_addr constant [2 x i64] [i64 8, i64 0]
@@ -387,25 +401,27 @@ llvm.func @_QPopenmp_target_use_dev_addr_nomap() {
 
 // -----
 
-llvm.func @_QPopenmp_target_use_dev_both() {
-  %0 = llvm.mlir.constant(1 : i64) : i64
-  %a = llvm.alloca %0 x !llvm.ptr : (i64) -> !llvm.ptr
-  %1 = llvm.mlir.constant(1 : i64) : i64
-  %b = llvm.alloca %0 x !llvm.ptr : (i64) -> !llvm.ptr
-  %map = omp.map.info var_ptr(%a : !llvm.ptr, !llvm.ptr)   map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""}
-  %map1...
[truncated]

@kparzysz
Contributor Author

In the three MLIR tests, I added a module with an "omp.target_triples" attribute to make sure that the expected code is still generated.
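
A minimal sketch of why the tests need that attribute, assuming the gate behaves as in the `convertOmpTargetData` hunk above. `OffloadConfig` and `mapperCallsForTargetData` are hypothetical stand-ins, not MLIR or LLVM APIs; only the predicate mirrors the patch, and the link between `omp.target_triples` and the triple list is an assumption drawn from this comment.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the two configuration inputs the patch reads
// (ompBuilder->Config.isTargetDevice() and Config.TargetTriples).
struct OffloadConfig {
  bool isTargetDevice = false;
  // Assumed to hold the triples listed in the module's omp.target_triples.
  std::vector<std::string> targetTriples;
};

// Same condition as the isOffloadEntry flag added by the patch.
bool isOffloadEntry(const OffloadConfig &config) {
  return config.isTargetDevice || !config.targetTriples.empty();
}

// Hypothetical helper: which mapper runtime calls a `target data` region
// would receive under the patched behavior.
std::vector<std::string> mapperCallsForTargetData(const OffloadConfig &config) {
  if (!isOffloadEntry(config))
    return {}; // no offload targets: runtime mapping is skipped
  return {"__tgt_target_data_begin_mapper", "__tgt_target_data_end_mapper"};
}
```

In this model, `assert(mapperCallsForTargetData(OffloadConfig{}).empty());` holds for a host-only configuration with no triples, which is why a test that still expects the mapper calls has to declare at least one triple such as "amdgcn-amd-amdhsa".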

Member

@skatrak skatrak left a comment

Thank you Krzysztof!

> When no offload targets are specified flang will ignore "target" constructs, but not "target data" constructs.

To point something out, this is not exactly what happens. Host implementations of target constructs will still be produced regardless of whether offload targets were specified. The difference when they aren't is that no kernel launches (i.e. __tgt_target_kernel plus kernel argument structure setup) are produced in the host, but rather host fallback code is branched to directly.

Having said that, I can't imagine why we would need to keep "target data"-related runtime calls when there's no offloading target, so the approach sounds good to me. Maybe it would be good for someone more familiar with current map support to confirm the removal of __tgt_target_data_{begin,end}_mapper is fine when running only on the host (CC: @agozillon, @TIFitis).
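
The distinction drawn above can be sketched as follows. This is a toy model of the host-side control flow, not flang's actual lowering: `runTargetRegionOnHost` and its trace strings are hypothetical, and the fall-through to the host version when a launch fails is an assumption about the usual offloading behavior.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model: returns the sequence of host-side steps for one `target` region.
// hasOffloadTargets: whether any offload target was specified at compile time.
// launchSucceeds:    outcome of the modelled kernel launch, if one is emitted.
std::vector<std::string> runTargetRegionOnHost(bool hasOffloadTargets,
                                               bool launchSucceeds) {
  std::vector<std::string> trace;
  if (hasOffloadTargets) {
    // Only emitted when there is something to offload to.
    trace.push_back("set up kernel argument structure");
    trace.push_back("__tgt_target_kernel");
    if (launchSucceeds)
      return trace; // region ran on the device
  }
  // The host implementation exists in both configurations; without offload
  // targets it is reached directly.
  trace.push_back("host fallback");
  return trace;
}
```

With `hasOffloadTargets` set to false the trace contains only the host fallback, which matches the point that `target` is not ignored in that configuration but lowered to its host version.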

Comment on lines +4408 to +4409
if (!isOffloadEntry)
  return success();

Nit: I think it would be more readable if we checked this condition and exited early before this TypeSwitch, rather than checking on each case and then later after it was executed.

Also, shouldn't we still process the body of omp.target_data? That code still should run on the host, AFAIU.

if (!isOffloadEntry && !isa<omp::TargetDataOp>(op))
  return success();

LogicalResult result = llvm::TypeSwitch...;

if (failed(result))
  return failure();

using InsertPointTy = ...
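
One reading of the suggestion above, as a self-contained sketch. `OpKind`, `LoweringPlan` and `planTargetDataLowering` are hypothetical names that do not exist in MLIR; the assumption is that standalone data directives become no-ops without offload targets, while the region of `omp.target_data` is still emitted so that it runs on the host.

```cpp
#include <cassert>

// Hypothetical stand-ins for the four ops handled by convertOmpTargetData.
enum class OpKind { TargetData, TargetEnterData, TargetExitData, TargetUpdate };

// What the translation would emit for one op in this model.
struct LoweringPlan {
  bool emitMapperCalls = false; // runtime mapping calls
  bool emitRegionBody = false;  // code inside the omp.target_data region
};

LoweringPlan planTargetDataLowering(OpKind kind, bool isOffloadEntry) {
  LoweringPlan plan;
  // Early exit before any per-op handling: standalone directives have no
  // region, so without offload targets there is nothing left to emit.
  if (!isOffloadEntry && kind != OpKind::TargetData)
    return plan;

  // omp.target_data always keeps its body; mapping is emitted only when
  // there is an offload target (or this is a device compilation).
  plan.emitRegionBody = (kind == OpKind::TargetData);
  plan.emitMapperCalls = isOffloadEntry;
  return plan;
}
```

A single guard ahead of the dispatch replaces the four per-case checks in the patch, and the `TargetData` exception keeps host code inside the region from being dropped.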

@kparzysz
Contributor Author

> but rather host fallback code is branched to directly.

Well... That was the intent here. I dug up an old patch I had, so it's possible that something is missing. I'll take a closer look at it.

@kparzysz kparzysz changed the title from "[flang][OpenMP] Ignore target-data operations with no offload targets" to "[flang][OpenMP] Skip runtime mapping with no offload targets" on Jun 18, 2025