Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Flang][MLIR][OpenMP] Allow setting OMP_MAP_PTR_AND_OBJ by frontends #84328

Closed
wants to merge 2 commits into from

Conversation

agozillon
Copy link
Contributor

This PR seeks to allow representation of the OMP_MAP_PTR_AND_OBJ flag in the OpenMP dialect, this allows more flexibility in where it is set.

In this particular case, the PR seeks to move the setting of this flag to the Flang frontend rather than have it assigned as a last minute thing in the MLIR -> LLVM-IR lowering. This allows us to better assign it in the cases where it's required, it also allows this flag to be utilised as an naïve indicator whether something is a pointer to some data (e.g. C/C++ pointer to malloc'd data or object) without complicating the existing IR much further for the moment.

Also some minor corrections to checking for pointer types in the lowering, swapping to utilising the OMP_MAP_PTR_AND_OBJ flag. As everything is now an opaque pointer now (which is around when these checks became a little redundant I imagine) and we effectively decay the type stored in the map to a non-ptr type it becomes difficult to detect what in actual fact is a pointer in the sense that clang checks if something is a pointer, i.e. at the AST type level, is it a C/C++ pointer type (via utility calls like isAnyPointerType). This is an area that needs some further verification and improvement in the near future, for the moment this suffices and makes the checks more correct.

This change is a pre-cursor refactoring PR for the derived type member mapping PR stack.

…he frontend

This PR seeks to allow representation of the OMP_MAP_PTR_AND_OBJ flag in the
OpenMP dialect, this allows more flexibility in where it is set.

In this particular case, the PR seeks to move the setting of this flag to the Flang frontend rather than have it assigned as a last minute thing in the MLIR -> LLVM-IR lowering. This allows us to better assign it in the cases where it's required, it also
allows this flag to be utilised as an naive indicator whether something is a pointer to
some data (e.g. C/C++ pointer to malloc'd data or object) without complicating the
existing IR much further for the moment.

Also some minor corrections to checking for pointer types in
the lowering, swapping to utilising the OMP_MAP_PTR_AND_OBJ
flag. As everything is now an opaque pointer now (which is around
when these checks became a little redundant I imagine) and we
effectively decay the type stored in the map to a non-ptr type it
becomes difficult to detect what in actual fact is a pointer in the
sense that clang checks if something is a pointer, i.e. at the
AST type level, is it a C/C++ pointer type (isAnyPointerType).
This is an area that needs some further verification and
improvement in the near future, for the moment this suffices
and makes the checks more correct.

This change is a pre-cursor refactoring PR for the derived
type member mapping PR stack.
@llvmbot llvmbot added mlir:llvm mlir flang Flang issues not falling into any other category mlir:openmp flang:fir-hlfir flang:openmp labels Mar 7, 2024
@agozillon agozillon requested a review from skatrak March 7, 2024 14:50
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 7, 2024

@llvm/pr-subscribers-flang-fir-hlfir
@llvm/pr-subscribers-mlir-llvm
@llvm/pr-subscribers-mlir-openmp

@llvm/pr-subscribers-flang-openmp

Author: None (agozillon)

Changes

This PR seeks to allow representation of the OMP_MAP_PTR_AND_OBJ flag in the OpenMP dialect, this allows more flexibility in where it is set.

In this particular case, the PR seeks to move the setting of this flag to the Flang frontend rather than have it assigned as a last minute thing in the MLIR -> LLVM-IR lowering. This allows us to better assign it in the cases where it's required, it also allows this flag to be utilised as an naïve indicator whether something is a pointer to some data (e.g. C/C++ pointer to malloc'd data or object) without complicating the existing IR much further for the moment.

Also some minor corrections to checking for pointer types in the lowering, swapping to utilising the OMP_MAP_PTR_AND_OBJ flag. As everything is now an opaque pointer now (which is around when these checks became a little redundant I imagine) and we effectively decay the type stored in the map to a non-ptr type it becomes difficult to detect what in actual fact is a pointer in the sense that clang checks if something is a pointer, i.e. at the AST type level, is it a C/C++ pointer type (via utility calls like isAnyPointerType). This is an area that needs some further verification and improvement in the near future, for the moment this suffices and makes the checks more correct.

This change is a pre-cursor refactoring PR for the derived type member mapping PR stack.


Patch is 38.52 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/84328.diff

18 Files Affected:

  • (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+5-1)
  • (modified) flang/lib/Lower/OpenMP/ClauseProcessor.h (+5-1)
  • (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+2)
  • (modified) flang/lib/Lower/OpenMP/Utils.cpp (+16)
  • (modified) flang/lib/Lower/OpenMP/Utils.h (+9)
  • (modified) flang/lib/Optimizer/Transforms/OMPDescriptorMapInfoGen.cpp (+11-2)
  • (modified) flang/test/Fir/convert-to-llvm-openmp-and-fir.fir (+2-2)
  • (modified) flang/test/Lower/OpenMP/FIR/array-bounds.f90 (+1-1)
  • (modified) flang/test/Lower/OpenMP/FIR/target.f90 (+1-1)
  • (modified) flang/test/Lower/OpenMP/allocatable-array-bounds.f90 (+3-3)
  • (modified) flang/test/Lower/OpenMP/allocatable-map.f90 (+4-4)
  • (modified) flang/test/Lower/OpenMP/array-bounds.f90 (+2-2)
  • (modified) flang/test/Lower/OpenMP/target.f90 (+1-1)
  • (modified) flang/test/Transforms/omp-descriptor-map-info-gen.fir (+2-2)
  • (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+7)
  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+22-7)
  • (modified) mlir/test/Dialect/OpenMP/ops.mlir (+2-2)
  • (modified) mlir/test/Target/LLVMIR/omptarget-fortran-allocatable-types-host.mlir (+6-7)
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index a41f8312a28c9e8..74e9b9accdc763a 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -919,6 +919,10 @@ bool ClauseProcessor::processMap(
 
         for (const Fortran::parser::OmpObject &ompObject :
              std::get<Fortran::parser::OmpObjectList>(mapClause->v.t).v) {
+          llvm::omp::OpenMPOffloadMappingFlags objectsMapTypeBits = mapTypeBits;
+          checkAndApplyDeclTargetMapFlags(converter, objectsMapTypeBits,
+                                          *getOmpObjectSymbol(ompObject));
+
           llvm::SmallVector<mlir::Value> bounds;
           std::stringstream asFortran;
 
@@ -943,7 +947,7 @@ bool ClauseProcessor::processMap(
               asFortran.str(), bounds, {},
               static_cast<
                   std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
-                  mapTypeBits),
+                  objectsMapTypeBits),
               mlir::omp::VariableCaptureKind::ByRef, symAddr.getType());
 
           mapOperands.push_back(mapOp);
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h
index 11aff0be25053db..538b8e17d56015b 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.h
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -203,6 +203,10 @@ bool ClauseProcessor::processMotionClauses(
                 : llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM;
 
         for (const Fortran::parser::OmpObject &ompObject : motionClause->v.v) {
+          llvm::omp::OpenMPOffloadMappingFlags objectsMapTypeBits = mapTypeBits;
+          checkAndApplyDeclTargetMapFlags(converter, objectsMapTypeBits,
+                                          *getOmpObjectSymbol(ompObject));
+
           llvm::SmallVector<mlir::Value> bounds;
           std::stringstream asFortran;
           Fortran::lower::AddrAndBoundsInfo info =
@@ -226,7 +230,7 @@ bool ClauseProcessor::processMotionClauses(
               asFortran.str(), bounds, {},
               static_cast<
                   std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
-                  mapTypeBits),
+                  objectsMapTypeBits),
               mlir::omp::VariableCaptureKind::ByRef, symAddr.getType());
 
           mapOperands.push_back(mapOp);
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 185e0316870e944..dc29bf1fd7e38b0 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -1127,6 +1127,8 @@ genTargetOp(Fortran::lower::AbstractConverter &converter,
           mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM;
         }
 
+        checkAndApplyDeclTargetMapFlags(converter, mapFlag, sym);
+
         mlir::Value mapOp = createMapInfoOp(
             converter.getFirOpBuilder(), baseOp.getLoc(), baseOp, mlir::Value{},
             name.str(), bounds, {},
diff --git a/flang/lib/Lower/OpenMP/Utils.cpp b/flang/lib/Lower/OpenMP/Utils.cpp
index 49517f62895dfe7..6991acdf02f8f01 100644
--- a/flang/lib/Lower/OpenMP/Utils.cpp
+++ b/flang/lib/Lower/OpenMP/Utils.cpp
@@ -34,6 +34,22 @@ namespace Fortran {
 namespace lower {
 namespace omp {
 
+void checkAndApplyDeclTargetMapFlags(
+    Fortran::lower::AbstractConverter &converter,
+    llvm::omp::OpenMPOffloadMappingFlags &mapFlags,
+    const Fortran::semantics::Symbol &symbol) {
+  if (auto declareTargetOp =
+          llvm::dyn_cast_if_present<mlir::omp::DeclareTargetInterface>(
+              converter.getModuleOp().lookupSymbol(
+                  converter.mangleName(symbol)))) {
+    // Only Link clauses have OMP_MAP_PTR_AND_OBJ applied, To clause
+    // seems to function differently.
+    if (declareTargetOp.getDeclareTargetCaptureClause() ==
+        mlir::omp::DeclareTargetCaptureClause::link)
+      mapFlags |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_PTR_AND_OBJ;
+  }
+}
+
 void genObjectList(const Fortran::parser::OmpObjectList &objectList,
                    Fortran::lower::AbstractConverter &converter,
                    llvm::SmallVectorImpl<mlir::Value> &operands) {
diff --git a/flang/lib/Lower/OpenMP/Utils.h b/flang/lib/Lower/OpenMP/Utils.h
index 76a15e8bcaab9bf..a19f0a294eb15d4 100644
--- a/flang/lib/Lower/OpenMP/Utils.h
+++ b/flang/lib/Lower/OpenMP/Utils.h
@@ -21,6 +21,10 @@ namespace fir {
 class FirOpBuilder;
 } // namespace fir
 
+namespace llvm::omp {
+enum class OpenMPOffloadMappingFlags : uint64_t;
+} // namespace llvm::omp
+
 namespace Fortran {
 
 namespace semantics {
@@ -50,6 +54,11 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
                 mlir::omp::VariableCaptureKind mapCaptureType, mlir::Type retTy,
                 bool isVal = false);
 
+void checkAndApplyDeclTargetMapFlags(
+    Fortran::lower::AbstractConverter &converter,
+    llvm::omp::OpenMPOffloadMappingFlags &mapFlags,
+    const Fortran::semantics::Symbol &symbol);
+
 void gatherFuncAndVarSyms(
     const Fortran::parser::OmpObjectList &objList,
     mlir::omp::DeclareTargetCaptureClause clause,
diff --git a/flang/lib/Optimizer/Transforms/OMPDescriptorMapInfoGen.cpp b/flang/lib/Optimizer/Transforms/OMPDescriptorMapInfoGen.cpp
index 6ffcf0746c76fc0..5e84984eca12e90 100644
--- a/flang/lib/Optimizer/Transforms/OMPDescriptorMapInfoGen.cpp
+++ b/flang/lib/Optimizer/Transforms/OMPDescriptorMapInfoGen.cpp
@@ -29,6 +29,7 @@
 #include "mlir/Pass/Pass.h"
 #include "mlir/Support/LLVM.h"
 #include "llvm/ADT/SmallPtrSet.h"
+#include "llvm/Frontend/OpenMP/OMPConstants.h"
 #include <iterator>
 
 namespace fir {
@@ -76,6 +77,11 @@ class OMPDescriptorMapInfoGenPass
     mlir::Value baseAddrAddr = builder.create<fir::BoxOffsetOp>(
         loc, descriptor, fir::BoxFieldAttr::base_addr);
 
+    llvm::omp::OpenMPOffloadMappingFlags baseAddrMapFlag =
+        llvm::omp::OpenMPOffloadMappingFlags(op.getMapType().value());
+    baseAddrMapFlag |=
+        llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_PTR_AND_OBJ;
+
     // Member of the descriptor pointing at the allocated data
     mlir::Value baseAddr = builder.create<mlir::omp::MapInfoOp>(
         loc, baseAddrAddr.getType(), descriptor,
@@ -83,8 +89,11 @@ class OMPDescriptorMapInfoGenPass
             fir::unwrapRefType(baseAddrAddr.getType()))
             .getElementType(),
         baseAddrAddr, mlir::SmallVector<mlir::Value>{}, op.getBounds(),
-        builder.getIntegerAttr(builder.getIntegerType(64, false),
-                               op.getMapType().value()),
+        builder.getIntegerAttr(
+            builder.getIntegerType(64, false),
+            static_cast<
+                std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
+                baseAddrMapFlag)),
         builder.getAttr<mlir::omp::VariableCaptureKindAttr>(
             mlir::omp::VariableCaptureKind::ByRef),
         builder.getStringAttr("") /*name*/);
diff --git a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
index a1fc614334dbce9..997eb161b639c52 100644
--- a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
+++ b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
@@ -914,8 +914,8 @@ func.func @omp_critical_() {
 func.func @omp_map_info_descriptor_type_conversion(%arg0 : !fir.ref<!fir.box<!fir.heap<i32>>>) {
   // CHECK: %[[GEP:.*]] = llvm.getelementptr %[[ARG_0]][0, 0] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>
   %0 = fir.box_offset %arg0 base_addr : (!fir.ref<!fir.box<!fir.heap<i32>>>) -> !fir.llvm_ptr<!fir.ref<i32>>
-  // CHECK: %[[MEMBER_MAP:.*]] = omp.map_info var_ptr(%[[GEP]] : !llvm.ptr, i32) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = ""}
-  %1 = omp.map_info var_ptr(%0 : !fir.llvm_ptr<!fir.ref<i32>>, i32) map_clauses(tofrom) capture(ByRef) -> !fir.llvm_ptr<!fir.ref<i32>> {name = ""}
+  // CHECK: %[[MEMBER_MAP:.*]] = omp.map_info var_ptr(%[[GEP]] : !llvm.ptr, i32) map_clauses(ptr_and_obj, tofrom) capture(ByRef) -> !llvm.ptr {name = ""}
+  %1 = omp.map_info var_ptr(%0 : !fir.llvm_ptr<!fir.ref<i32>>, i32) map_clauses(ptr_and_obj, tofrom) capture(ByRef) -> !fir.llvm_ptr<!fir.ref<i32>> {name = ""}
   // CHECK: %[[DESC_MAP:.*]] = omp.map_info var_ptr(%[[ARG_0]] : !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8)>) map_clauses(always, delete) capture(ByRef) members(%[[MEMBER_MAP]] : !llvm.ptr) -> !llvm.ptr {name = ""}
   %2 = omp.map_info var_ptr(%arg0 : !fir.ref<!fir.box<!fir.heap<i32>>>, !fir.box<!fir.heap<i32>>) map_clauses(always, delete) capture(ByRef) members(%1 : !fir.llvm_ptr<!fir.ref<i32>>) -> !fir.ref<!fir.box<!fir.heap<i32>>> {name = ""}
   // CHECK: omp.target_exit_data map_entries(%[[DESC_MAP]] : !llvm.ptr) 
diff --git a/flang/test/Lower/OpenMP/FIR/array-bounds.f90 b/flang/test/Lower/OpenMP/FIR/array-bounds.f90
index 3cd284c46e727de..1e7ad6dc8ea0bcc 100644
--- a/flang/test/Lower/OpenMP/FIR/array-bounds.f90
+++ b/flang/test/Lower/OpenMP/FIR/array-bounds.f90
@@ -46,7 +46,7 @@ module assumed_array_routines
 !ALL: %[[DIMS1:.*]]:3 = fir.box_dims %arg0, %[[C0_1]] : (!fir.box<!fir.array<?xi32>>, index) -> (index, index, index)
 !ALL: %[[BOUNDS:.*]] = omp.bounds   lower_bound(%[[C3]] : index) upper_bound(%[[C4]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) start_idx(%[[C0]] : index) {stride_in_bytes = true}
 !ALL: %[[BOXADDRADDR:.*]] = fir.box_offset %0 base_addr : (!fir.ref<!fir.box<!fir.array<?xi32>>>) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>
-!ALL: %[[MAP_MEMBER:.*]] = omp.map_info var_ptr(%0 : !fir.ref<!fir.box<!fir.array<?xi32>>>, !fir.array<?xi32>) var_ptr_ptr(%[[BOXADDRADDR]] : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>> {name = ""}
+!ALL: %[[MAP_MEMBER:.*]] = omp.map_info var_ptr(%0 : !fir.ref<!fir.box<!fir.array<?xi32>>>, !fir.array<?xi32>) var_ptr_ptr(%[[BOXADDRADDR]] : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) map_clauses(ptr_and_obj, tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>> {name = ""}
 !ALL: %[[MAP:.*]] = omp.map_info var_ptr(%0 : !fir.ref<!fir.box<!fir.array<?xi32>>>, !fir.box<!fir.array<?xi32>>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_MEMBER]] : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) -> !fir.ref<!fir.array<?xi32>> {name = "arr_read_write(2:5)"}
 !ALL: %[[MAP2:.*]] = omp.map_info var_ptr(%[[ALLOCA]] : !fir.ref<i32>, i32)   map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref<i32> {name = "i"}
 !ALL: omp.target map_entries(%[[MAP_MEMBER]] -> %{{.*}}, %[[MAP]] -> %{{.*}}, %[[MAP2]] -> %{{.*}} : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>, !fir.ref<!fir.array<?xi32>>, !fir.ref<i32>) {
diff --git a/flang/test/Lower/OpenMP/FIR/target.f90 b/flang/test/Lower/OpenMP/FIR/target.f90
index 06772771647de8b..cb4bfa06e618a67 100644
--- a/flang/test/Lower/OpenMP/FIR/target.f90
+++ b/flang/test/Lower/OpenMP/FIR/target.f90
@@ -450,7 +450,7 @@ end subroutine omp_target_device_ptr
  subroutine omp_target_device_addr
    integer, pointer :: a
    !CHECK: %[[VAL_0:.*]] = fir.alloca !fir.box<!fir.ptr<i32>> {bindc_name = "a", uniq_name = "_QFomp_target_device_addrEa"}
-   !CHECK: %[[MAP_MEMBERS:.*]] = omp.map_info var_ptr({{.*}} : !fir.ref<!fir.box<!fir.ptr<i32>>>, i32) var_ptr_ptr({{.*}} : !fir.llvm_ptr<!fir.ref<i32>>) map_clauses(tofrom) capture(ByRef) -> !fir.llvm_ptr<!fir.ref<i32>> {name = ""}
+   !CHECK: %[[MAP_MEMBERS:.*]] = omp.map_info var_ptr({{.*}} : !fir.ref<!fir.box<!fir.ptr<i32>>>, i32) var_ptr_ptr({{.*}} : !fir.llvm_ptr<!fir.ref<i32>>) map_clauses(ptr_and_obj, tofrom) capture(ByRef) -> !fir.llvm_ptr<!fir.ref<i32>> {name = ""}
    !CHECK: %[[MAP:.*]] = omp.map_info var_ptr({{.*}} : !fir.ref<!fir.box<!fir.ptr<i32>>>, !fir.box<!fir.ptr<i32>>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_MEMBERS]] : !fir.llvm_ptr<!fir.ref<i32>>) -> !fir.ref<!fir.box<!fir.ptr<i32>>> {name = "a"}
    !CHECK: omp.target_data map_entries(%[[MAP_MEMBERS]], %[[MAP]] : {{.*}}) use_device_addr(%[[VAL_0]] : !fir.ref<!fir.box<!fir.ptr<i32>>>) {
    !$omp target data map(tofrom: a) use_device_addr(a)
diff --git a/flang/test/Lower/OpenMP/allocatable-array-bounds.f90 b/flang/test/Lower/OpenMP/allocatable-array-bounds.f90
index adf74efa9b596ed..42cb7cffe64945a 100644
--- a/flang/test/Lower/OpenMP/allocatable-array-bounds.f90
+++ b/flang/test/Lower/OpenMP/allocatable-array-bounds.f90
@@ -23,7 +23,7 @@
 !HOST: %[[BOX_3:.*]]:3 = fir.box_dims %[[LOAD_3]], %[[CONSTANT_3]] : (!fir.box<!fir.heap<!fir.array<?xi32>>>, index) -> (index, index, index)
 !HOST: %[[BOUNDS_1:.*]] = omp.bounds lower_bound(%[[LB_1]] : index) upper_bound(%[[UB_1]] : index) extent(%[[BOX_3]]#1 : index) stride(%[[BOX_2]]#2 : index) start_idx(%[[BOX_1]]#0 : index) {stride_in_bytes = true}
 !HOST: %[[VAR_PTR_PTR:.*]] = fir.box_offset %[[DECLARE_1]]#1 base_addr : (!fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>
-!HOST: %[[MAP_INFO_MEMBER:.*]] = omp.map_info var_ptr(%[[DECLARE_1]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>, !fir.array<?xi32>) var_ptr_ptr(%[[VAR_PTR_PTR]] : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS_1]]) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>> {name = ""}    
+!HOST: %[[MAP_INFO_MEMBER:.*]] = omp.map_info var_ptr(%[[DECLARE_1]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>, !fir.array<?xi32>) var_ptr_ptr(%[[VAR_PTR_PTR]] : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) map_clauses(ptr_and_obj, tofrom) capture(ByRef) bounds(%[[BOUNDS_1]]) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>> {name = ""}    
 !HOST: %[[MAP_INFO_1:.*]] = omp.map_info var_ptr(%[[DECLARE_1]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>, !fir.box<!fir.heap<!fir.array<?xi32>>>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_INFO_MEMBER]] : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {name = "sp_read(2:5)"}
 
 !HOST: %[[LOAD_3:.*]] = fir.load %[[DECLARE_2]]#0 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>
@@ -41,7 +41,7 @@
 !HOST: %[[BOX_5:.*]]:3 = fir.box_dims %[[LOAD_5]], %[[CONSTANT_5]] : (!fir.box<!fir.heap<!fir.array<?xi32>>>, index) -> (index, index, index)
 !HOST: %[[BOUNDS_2:.*]] = omp.bounds lower_bound(%[[LB_2]] : index) upper_bound(%[[UB_2]] : index) extent(%[[BOX_5]]#1 : index) stride(%[[BOX_4]]#2 : index) start_idx(%[[BOX_3]]#0 : index) {stride_in_bytes = true}
 !HOST: %[[VAR_PTR_PTR:.*]] = fir.box_offset %[[DECLARE_2]]#1 base_addr : (!fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>
-!HOST: %[[MAP_INFO_MEMBER:.*]] = omp.map_info var_ptr(%[[DECLARE_2]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>, !fir.array<?xi32>) var_ptr_ptr(%[[VAR_PTR_PTR]] : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS_2]]) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>> {name = ""}    
+!HOST: %[[MAP_INFO_MEMBER:.*]] = omp.map_info var_ptr(%[[DECLARE_2]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>, !fir.array<?xi32>) var_ptr_ptr(%[[VAR_PTR_PTR]] : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) map_clauses(ptr_and_obj, tofrom) capture(ByRef) bounds(%[[BOUNDS_2]]) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>> {name = ""}    
 !HOST: %[[MAP_INFO_2:.*]] = omp.map_info var_ptr(%[[DECLARE_2]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>, !fir.box<!fir.heap<!fir.array<?xi32>>>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_INFO_MEMBER]] : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {name = "sp_write(2:5)"}
 
 subroutine read_write_section()
@@ -80,7 +80,7 @@ module assumed_allocatable_array_routines
 !HOST: %[[BOX_3:.*]]:3 = fir.box_dims %[[LOAD_3]], %[[CONSTANT_3]] : (!fir.box<!fir.heap<!fir.array<?xi32>>>, index) -> (index, index, index)
 !HOST: %[[BOUNDS:.*]] = omp.bounds lower_bound(%[[LB]] : index) upper_bound(%[[UB]] : index) extent(%[[BOX_3]]#1 : index) stride(%[[BOX_2]]#2 : index) start_idx(%[[BOX_1]]#0 : index) {stride_in_bytes = true}
 !HOST: %[[VAR_PTR_PTR:.*]] = fir.box_offset %[[DECLARE]]#1 base_addr : (!fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>
-!HOST: %[[MAP_INFO_MEMBER:.*]] = omp.map_info var_ptr(%[[DECLARE]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>, !fir.array<?xi32>) var_ptr_ptr(%[[VAR_PTR_PTR]] : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>> {name = ""}    
+!HOST: %[[MAP_INFO_MEMBER:.*]] = omp.map_info var_ptr(%[[DECLARE]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>, !fir.array<?xi32>) var_ptr_ptr(%[[VAR_PTR_PTR]] : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) map_clauses(ptr_and_obj, tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>> {name = ""}    
 !HOST: %[[MAP_INFO:.*]] = omp.map_info var_ptr(%[[DECLARE]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>, !fir.box<!fir.heap<!fir.array<?xi32>>>) map_clauses(tofrom) capture(ByRef) members(%[[MAP_INFO_MEMBER]] : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {name = "arr_read_write(2:5)"}    
 subroutine assumed_shape_array(arr_read_write)
     integer, allocatable, intent(inout) :: arr_read_write(:)
diff --git a/flang/test/Lower/OpenMP/allocatable-map.f90 b/flang/test/Lower/OpenMP/allocatable-map.f90
index ddc20b582b26e66..4854a0cc82b72a4 100644
--- a/flang/test/Lower/OpenMP/allocatable-map.f90
+++ b/flang/test/Lower/OpenMP/allocatable-map.f90
@@ -2,12 +2,12 @@
 
 !HLFIRDIALECT: %[[POINTER:.*]]:2 = hlfir.declare %0 {fortran_attrs = #fir.var_attrs<pointer>, uniq_name = "_QFpointer_routineEpoint"} : (!fir.ref<!fir.box<!fir.ptr<i32>>>) -> (!fir.ref<!fir.box<!fir.ptr<i32>>>, !fir.ref<!fir.box<!fir.ptr<i32>>>)
 !HLFIRDIALECT: %[[BOX_OFF:.*]] = fir.box_offset %[[POINTER]]#1 base_addr : (!fir.ref<!fir.box<!fir.ptr<i32>>>) -> !fir.llvm_ptr<!fir.ref<i32>>
-!HLFIRDIALECT: %[[POINTER_MAP_MEMBER:.*]] = omp.map_info var_ptr(%[[POINTER]]#1 : !fir.ref<!fir.box<!fir.ptr<i32>>>, i32) var_ptr_ptr(%[[BOX_OFF]] : !fir.llvm_ptr<!fir.ref<i32>>)  map_clauses(implicit, tofrom) capture(ByRef) -> !fir.llvm_ptr<!fir.ref<i32>> {name = ""}
-!HLFIRDIALECT: %[[POINTER_MAP:.*]] = omp.map_info var_ptr(%[[POINTER]]#1 : !fir.ref<!fir.box<!fir.ptr<i32>>>, !fir.box<!fir.ptr<i32>>) map_clauses(implicit, tofrom) capture(ByRef) members(%[[POINTER_MAP_MEMBER]] : !fir.llvm_ptr<!fir.ref<i32>>) -> !fir.ref<!fir.box<!fir.ptr<i32>>> {name = "point"}
-!HLFIRDIALECT: omp.target map_entries({{.*}}, %[[POINTER_MAP_MEMBER]] -> {{.*}}, %[[POINTER_MAP]] -> {{.*}} : {{.*}}, !fir.llvm_ptr<!fir.ref<i32>>, !fir.ref<!fir.box<!fir.ptr<i32>>>) {
+!HLFIRDIALECT: %[[POINTER_MAP_MEMBER:.*]] = omp.map_info var_ptr(%[[POINTER]]#1 : !fir.ref<!fir.box<!fir.ptr<i32>>>, i32) var_ptr_ptr(%[[BOX_OFF]] : !fir.llvm_ptr<!fir.ref<i32>>)  map_clauses(ptr_and_obj, tofrom) capture(ByRef) -> !fir.llvm_ptr<!fir.ref<i32>> {name = ""}
+!HLFIRDIALECT: %[[POINTER_MAP:.*]] = omp.map_info var_ptr(%[[POINTER]]#1 : !fir.ref<!fir.box<!fir.ptr<i32>>>, !fir.box<!fir.ptr<i32>>) map_clauses(tofrom) capture(ByRef) members(%[[POINTER_MAP_MEMBER]] : !fir.llvm_ptr<!fir.ref<i32>>) -> !fir.ref<!fir.box<!fir.ptr<i32>>> {name = "point"}
+!HLFIRDIALECT: omp.target map_entries(%[[POINTER_MAP_MEMBER]] -> {{.*}}, %[[POINTER_MAP]] -> {{.*}} : !fir.llvm_ptr<!fir.ref<i32>>, !fir.ref<!fir.box<!fir.ptr<i32>>>) {
 subroutine pointer_routine()
     integer, pointer :: point 
-!$omp target map(tofrom:pointer)
+!$omp target map(tofrom:point)
     point = 1
 !$omp end target
 end subroutine pointer_routine
diff --git a/flang/test/Lower/OpenMP/array-bounds.f90 b/flang/test/Lower/OpenMP/array-bounds.f90
index 7d76ff4b106a0c0..706428544afb3d8 100644
--- a/flang/test/Lower/OpenMP/array-bounds.f90
+++ b/flang/test/Lower/OpenMP/array-bounds.f90
@@ -51,7 +51,7 @@ module assumed_array_routines
 !HOST: %[[DIMS1:.*]]:3 = fir.box_dims %[[ARG0_DECL]]#1, %[[C0_1]] : (!fir.box<!fir.array<?xi32>>, index) -> (index, index, index)
 !HOST: %[[BOUNDS:.*]] = omp.bounds   lower_bound(%[[C3]] : index) upper_bound(%[[C4]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) start_idx(%[[C0]] : index) {st...
[truncated]

@agozillon
Copy link
Contributor Author

agozillon commented Mar 7, 2024

This PR carves out a bit of the necessary change set from the derived type member map PR stack that can be made into it's own self sufficient PR as it is independent of the larger change set.

The idea is to detach the setting of OMP_MAP_AND_OBJ_PTR from the lowering phase and allow it to be specifiable by the frontend where there's more information to set it. e.g. descriptor base addresses require the flag, but currently no other member mapping requires it and there is likely some other cases for Fortran (addendum of descriptors as a first) and more in the case of C/C++.

Hoping to have another PR that will remove some of the complexity from the derived type member PR stack into an individual PR soon if all goes well with extracting it. Once that's up I'll look at redacting the change set contained in the two new PRs from the PR stack, the stack will then unfortunately fail until these two PRs it's dependent on are landed, but that seems like a reasonable trade off to make it more bite sized for reviewers.

Copy link
Contributor

@kparzysz kparzysz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionally it looks good to me. I added some comments related to readability.

@@ -34,6 +34,22 @@ namespace Fortran {
namespace lower {
namespace omp {

void checkAndApplyDeclTargetMapFlags(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should return a value instead of modifying the input.

builder.getIntegerAttr(
builder.getIntegerType(64, false),
static_cast<
std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as prior comments about this.

@@ -943,7 +947,7 @@ bool ClauseProcessor::processMap(
asFortran.str(), bounds, {},
static_cast<
std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you perhaps add a "using" alias for the underlying type and use it here? This would improve the readability a bit.

@@ -226,7 +230,7 @@ bool ClauseProcessor::processMotionClauses(
asFortran.str(), bounds, {},
static_cast<
std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing about "using". This should reduce the nesting level of the expression.

Comment on lines 41 to 44
if (auto declareTargetOp =
llvm::dyn_cast_if_present<mlir::omp::DeclareTargetInterface>(
converter.getModuleOp().lookupSymbol(
converter.mangleName(symbol)))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you break it up into a few lines?

Copy link
Contributor Author

@agozillon agozillon Mar 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do, although, I think this is a good example of reviewers having different tastes! ;-) I believe I was asked by @ergawy to collapse this into a single if in a prior iteration. I have no preference either way so happy to revert it to something more alike the the original form. It is however, entirely my bad for losing the previous PR history through my own choices!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In such cases use your best judgment. IMO this looks horrible, but I'll respect your choice. :)
Just respond with what you prefer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Kareem complains, blame me. :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of the large one liner look either, I tend to think it's easier to approach/read things when they're a little more broken down into steps. But it's all preference at the end of the day and no one to blame!

Comment on lines 1830 to 1832
std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
mapFlag &
llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_PTR_AND_OBJ) != 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please break it up, it's hard to see what's going on here at the first glance.

@agozillon
Copy link
Contributor Author

Thank you very much for the quick review @kparzysz I'll try update the PR with the requested changes by tomorrow or as soon as I can get to it after opening up the other PR, which is unfortunately a little dependent on the functionality provided by this PR (but still makes sense to have seperate, unfortunately missed the boat on making it a PR stack as this one is already open though)

converter.mangleName(symbol)))) {
// Only Link clauses have OMP_MAP_PTR_AND_OBJ applied, To clause
// seems to function differently.
if (declareTargetOp.getDeclareTargetCaptureClause() ==
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As, we do not have else{ } block here, can these two nested if statements be combined to one if statement including check for link clause?

@@ -2034,12 +2046,13 @@ static void processMapMembersWithParent(
assert(memberDataIdx >= 0 && "could not find mapped member of structure");

// Same MemberOfFlag to indicate its link with parent and other members
// of, and we flag that it's part of a pointer and object coupling.
// of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - Maybe remove "of"

agozillon added a commit to agozillon/llvm-project that referenced this pull request Mar 7, 2024
…ectives

This PR refactors bounds offsetting by combining the two differing implementations
(one applying to initial derivd type member map imlpementation for descriptors and
the other for reguar arrays, effectively allocatable array vs regular array in fortran) now
that it's a little simpler to do.

The PR also moves the utilisation of createAlteredByCaptureMap into genMapInfoOp,
where it will be correctly applied to all MapInfoData, appropriately offsetting and
altering Pointer data set in the kernel argument structure on the host.  This
primarily means bounds will now correctly apply to enter/exit/update map
clauses as opposed to just the Target directive that is currently the case. A few
fortran runtime tests have been added to verify this new behaviour.

This PR depends on: llvm#84328 and is an extraction
of the larger derived type member map PR stack (so a requirement for it to land).
agozillon added a commit to agozillon/llvm-project that referenced this pull request Mar 7, 2024
…ectives

This PR refactors bounds offsetting by combining the two differing implementations
(one applying to initial derivd type member map imlpementation for descriptors and
the other for reguar arrays, effectively allocatable array vs regular array in fortran) now
that it's a little simpler to do.

The PR also moves the utilisation of createAlteredByCaptureMap into genMapInfoOp,
where it will be correctly applied to all MapInfoData, appropriately offsetting and
altering Pointer data set in the kernel argument structure on the host.  This
primarily means bounds will now correctly apply to enter/exit/update map
clauses as opposed to just the Target directive that is currently the case. A few
fortran runtime tests have been added to verify this new behaviour.

This PR depends on: llvm#84328 and is an extraction
of the larger derived type member map PR stack (so a requirement for it to land).
@vzakhari
Copy link
Contributor

vzakhari commented Mar 7, 2024

Thank you for the changes!

You may ignore this, but I think this low-level naming coming from libomptarget implementation looks off for a higher level dialect. As I understand, you need to distinguish the pointer and non-pointer members of the aggregate so that you can later properly set PTR_AND_OBJ. So can we use something like is_ptr_member to make this difference explicit? It reads better to me from the language point of view, at least for C/C++. It also looks okay for Fortran, if we assume the underlying memory representation of the descriptor (which, I believe, will be the only parent whose members can have is_ptr_member).

Feel free to go with PTR_AND_OBJ though :)

@agozillon
Copy link
Contributor Author

agozillon commented Mar 7, 2024

Thank you for the changes!

You may ignore this, but I think this low-level naming coming from libomptarget implementation looks off for a higher level dialect. As I understand, you need to distinguish the pointer and non-pointer members of the aggregate so that you can later properly set PTR_AND_OBJ. So can we use something like is_ptr_member to make this difference explicit? It reads better to me from the language point of view, at least for C/C++. It also looks okay for Fortran, if we assume the underlying memory representation of the descriptor (which, I believe, will be the only parent whose members can have is_ptr_member).

Feel free to go with PTR_AND_OBJ though :)

That is indeed correct for at least one of the uses, I think (but perhaps I am incorrect) it's more generally used to indicate that something is a pointer that's pointing to some memory, so both should be mapped accordingly, feels a little out of place in Fortran but the descriptor base address and declare target link need it set, the latter because it undergoes some transformations to be a global pointer from my recollection of working on it as opposed to the original type itself having anything to do with being a pointer <-> pointee coupling.

As for the naming I agree it's not the best and I would be happy to change it if you wish to suggest another name as I think is_ptr_member may not quite cover all of it's uses. I am not so sure if is_ptr would be too generic, is_ptr_pointee?

Primarily just cleanup, with the main addition being a new alias added to
OMPConstants.h to simplify casting back and forth.
@llvmbot llvmbot added the clang:openmp OpenMP related changes to Clang label Mar 8, 2024
@agozillon
Copy link
Contributor Author

Recent commit aims to address the current reviewer comments, nothing too major just the requested tidying up, largest addition was a type alias to OMPConstants.h to simplify the rather frequent requirement to cast back and forth for map types.

Copy link
Contributor

@kparzysz kparzysz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vzakhari
Copy link
Contributor

vzakhari commented Mar 8, 2024

Thank you for the changes!
You may ignore this, but I think this low-level naming coming from libomptarget implementation looks off for a higher level dialect. As I understand, you need to distinguish the pointer and non-pointer members of the aggregate so that you can later properly set PTR_AND_OBJ. So can we use something like is_ptr_member to make this difference explicit? It reads better to me from the language point of view, at least for C/C++. It also looks okay for Fortran, if we assume the underlying memory representation of the descriptor (which, I believe, will be the only parent whose members can have is_ptr_member).
Feel free to go with PTR_AND_OBJ though :)

That is indeed correct for at least one of the uses, I think (but perhaps I am incorrect) it's more generally used to indicate that something is a pointer that's pointing to some memory, so both should be mapped accordingly, feels a little out of place in Fortran but the descriptor base address and declare target link need it set, the latter because it undergoes some transformations to be a global pointer from my recollection of working on it as opposed to the original type itself having anything to do with being a pointer <-> pointee coupling.

As for the naming I agree it's not the best and I would be happy to change it if you wish to suggest another name as I think is_ptr_member may not quite cover all of it's uses. I am not so sure if is_ptr would be too generic, is_ptr_pointee?

is_ptr_pointee is better, but what if we go with just pointer? The varPtr is always a pointer, so the pointer attribute on top of it would just indicate that the actual value pointed to by varPtr contains a pointer. Within the context of the chained map_info, having the pointer attribute on a member map_info will always indicate that the pointee mapping and the pointer attachment have to be made, but it will also have the exact meaning, again, that *varPtr is a pointer within the parent object (whether it is a structure, descriptor or scalar pointer). I wonder if there are any cases where we would not need to do the pointee mapping and the pointer attachment when we have a member map_info for a pointer entity of a parent. If not, then pointer just sounds right to me.

Taking a step back, why the semantics that you try to represent with this new attribute is not represented just by the fact that varPtrPtr is present on the member map_info - can you please explain?

My first paragraph above is sort of off when I am looking at the current map_info representation for the members. Sorry that I missed the point in time when we started using the box reference as varPtr operand to represent the actual data mapping. I found this comment in #71766:

Why utilise the original symbol (descriptor and pointer to data) as opposed to the BaseAddr (pointer to data)?

Currently (and probably always, as I think the future specification iteration requests descriptor mapping to some extent, but I might be misunderstanding the change set) the default lowering of the kernel expects the descriptor as well as the underlying data to be mapped, the generated kernel directly accesses the descriptor for calculations such as indexing.

The main symbol/variable being accessed inside of the kernel is also the original descriptor, and there's a portion of kernel lowering for device where we need the original descriptor SSA/global used inside of the kernel so that we can remap all uses of it to the newly generated kernel argument (or derived accessor instructions to this kernel argument) that corresponds
to it. Which works perfectly fine if we map the original symbol, but if we map the BaseAddr, which is a BoxAddrOf operation to the BoxAddr it becomes a little trickier as we can't traceback to the original symbol (global or local allocation) we need, as every BoxAddrOf operation and a lot of other box related operations (e.g. box/embox) currently generates a new allocation
and the map_info that previously pointed to the BaseAddr linked to the original descriptor instruction (global, allocatable or function argument) no longer points to it, but points to one of the many allocations created during lowering, and then we can no longer rewrite it's kernel uses.

When testing with a mix of target + enter/exit, the alloca SSA each map_info was tracking was also completely different due to each BoxAddrOf generating it's own alloca, this could also prove to be a bit of an issue when mapping data across.

There might be a way around this I can't think of, but it seems we need to map the descriptor rather than the BaseAddr (internal pointer of the descriptor) currently.

I am not sure what complications you had with tracing back of the original symbol, why the tracing back is needed, and why the BoxAddrOf operations would product new allocations. Can you please explain?

Sorry if I missed the turn, but I was under impression that we agreed on varPtr being exactly the address of the memory to map with the bounds provided by the bounds, the parent-children relationship provided by the chained map_info, and the pointer attachment semantics represented by varPtrPtr (of course, on top of the pointer-pointee semantics provided, again, by the changed map_info).

@agozillon
Copy link
Contributor Author

agozillon commented Mar 8, 2024

Thank you for the changes!
You may ignore this, but I think this low-level naming coming from libomptarget implementation looks off for a higher level dialect. As I understand, you need to distinguish the pointer and non-pointer members of the aggregate so that you can later properly set PTR_AND_OBJ. So can we use something like is_ptr_member to make this difference explicit? It reads better to me from the language point of view, at least for C/C++. It also looks okay for Fortran, if we assume the underlying memory representation of the descriptor (which, I believe, will be the only parent whose members can have is_ptr_member).
Feel free to go with PTR_AND_OBJ though :)

That is indeed correct for at least one of the uses, I think (but perhaps I am incorrect) it's more generally used to indicate that something is a pointer that's pointing to some memory, so both should be mapped accordingly, feels a little out of place in Fortran but the descriptor base address and declare target link need it set, the latter because it undergoes some transformations to be a global pointer from my recollection of working on it as opposed to the original type itself having anything to do with being a pointer <-> pointee coupling.
As for the naming I agree it's not the best and I would be happy to change it if you wish to suggest another name as I think is_ptr_member may not quite cover all of it's uses. I am not so sure if is_ptr would be too generic, is_ptr_pointee?

is_ptr_pointee is better, but what if we go with just pointer? The varPtr is always a pointer, so the pointer attribute on top of it would just indicate that the actual value pointed to by varPtr contains a pointer. Within the context of the chained map_info, having the pointer attribute on a member map_info will always indicate that the pointee mapping and the pointer attachment have to be made, but it will also have the exact meaning, again, that *varPtr is a pointer within the parent object (whether it is a structure, descriptor or scalar pointer). I wonder if there are any cases where we would not need to do the pointee mapping and the pointer attachment when we have a member map_info for a pointer entity of a parent. If not, then pointer just sounds right to me.

Taking a step back, why the semantics that you try to represent with this new attribute is not represented just by the fact that varPtrPtr is present on the member map_info - can you please explain?

While varPtrPtr makes sense for use with descriptors, I am not sure it makes sense for declare target, where the type at a FIR/LLVM dialect level has nothing to do with a pointer/pointee coupling, more to do with the end state of declare target. So, I am not so sure correlating the two things is ideal, there may be cases where the map flag is necessary (i.e. declare target) but varPtrPtr isn't and vice versa.

My first paragraph above is sort of off when I am looking at the current map_info representation for the members. Sorry that I missed the point in time when we started using the box reference as varPtr operand to represent the actual data mapping. I found this comment in #71766:

Why utilise the original symbol (descriptor and pointer to data) as opposed to the BaseAddr (pointer to data)?

Currently (and probably always, as I think the future specification iteration requests descriptor mapping to some extent, but I might be misunderstanding the change set) the default lowering of the kernel expects the descriptor as well as the underlying data to be mapped, the generated kernel directly accesses the descriptor for calculations such as indexing.

The main symbol/variable being accessed inside of the kernel is also the original descriptor, and there's a portion of kernel lowering for device where we need the original descriptor SSA/global used inside of the kernel so that we can remap all uses of it to the newly generated kernel argument (or derived accessor instructions to this kernel argument) that corresponds
to it. Which works perfectly fine if we map the original symbol, but if we map the BaseAddr, which is a BoxAddrOf operation to the BoxAddr it becomes a little trickier as we can't traceback to the original symbol (global or local allocation) we need, as every BoxAddrOf operation and a lot of other box related operations (e.g. box/embox) currently generates a new allocation
and the map_info that previously pointed to the BaseAddr linked to the original descriptor instruction (global, allocatable or function argument) no longer points to it, but points to one of the many allocations created during lowering, and then we can no longer rewrite it's kernel uses.

When testing with a mix of target + enter/exit, the alloca SSA each map_info was tracking was also completely different due to each BoxAddrOf generating it's own alloca, this could also prove to be a bit of an issue when mapping data across.

There might be a way around this I can't think of, but it seems we need to map the descriptor rather than the BaseAddr (internal pointer of the descriptor) currently.

This comment exert is rather out of context and time unfortunately :-) This was at minimum 3 large revisions of the PR before its final state, and predates the usage of BoxAddrOf by a fair bit from my recollection.

I am not sure what complications you had with tracing back of the original symbol, why the tracing back is needed,

The tracing back was required, as when you are mapping a structure and member pointer pairing (or presumably any member of a structure) the address of the member needs to be part of the originating structure (or in it's address range) to be correctly attached by the runtime (at least that's how I understood the issue at the time, perhaps I misinterpreted it). So, the dislocation of the two mappings (member and parent) to seperate allocations was not ideal. However, this is not the case when using BoxAddrOf.

and why the BoxAddrOf operations would product new allocations. Can you please explain?

As for why the extra allocations are/were made in this case, I do not know, I didn't create the lowering in this case, I am sure there are reasons, but they're unknown to me. However, it was not related to BoxAddrOf.

Sorry if I missed the turn, but I was under impression that we agreed on varPtr being exactly the address of the memory to map with the bounds provided by the bounds, the parent-children relationship provided by the chained map_info, and the pointer attachment semantics represented by varPtrPtr (of course, on top of the pointer-pointee semantics provided, again, by the changed map_info).

That is the case currently as far as I am aware, I don't see where the misunderstanding is coming from unfortunately. Although, I am leaning more towards not having a "chaining of map_info", but that can be discussed in the relevant PR when it's ready for review again.

@vzakhari
Copy link
Contributor

While varPtrPtr makes sense for use with descriptors, I am not sure it makes sense for declare target, where the type at a FIR/LLVM dialect level has nothing to do with a pointer/pointee coupling, more to do with the end state of declare target. So, I am not so sure correlating the two things is ideal, there may be cases where the map flag is necessary (i.e. declare target) but varPtrPtr isn't and vice versa.

I see. I guess it would not make sense for the declare targets. Thanks for the explanation!

The tracing back was required, as when you are mapping a structure and member pointer pairing (or presumably any member of a structure) the address of the member needs to be part of the originating structure (or in it's address range) to be correctly attached by the runtime (at least that's how I understood the issue at the time, perhaps I misinterpreted it). So, the dislocation of the two mappings (member and parent) to seperate allocations was not ideal. However, this is not the case when using BoxAddrOf.

I might be wrong, but I think all the information for invoking libomptarget should be available in varPtr and varPtrPtr, i.e. the corresponding Args pointer is the value of in varPtr (which is the value of the data pointer potentially offset'ed due to array section), and the ArgBases pointer is the value in varPtrPtr (which is the address of the member within the data structure, be it a descriptor or a C structure).

Okay, I guess I missed my chance to get more details, since there have been a lot of rework, as you said :)

@agozillon
Copy link
Contributor Author

While varPtrPtr makes sense for use with descriptors, I am not sure it makes sense for declare target, where the type at a FIR/LLVM dialect level has nothing to do with a pointer/pointee coupling, more to do with the end state of declare target. So, I am not so sure correlating the two things is ideal, there may be cases where the map flag is necessary (i.e. declare target) but varPtrPtr isn't and vice versa.

I see. I guess it would not make sense for the declare targets. Thanks for the explanation!

That's no problem, the discussion made me think of another way that this PR may be do-able without the addition of OMP_MAP_PTR_AND_OBJ being frontend assignable as well at least for the moment. So thank you very much for that. There's a possibility I will go down that route instead as it's a lesser amount of changes, if that's the case and it works out, I'll close this one for the time being. While I think it's useful to have the flexibility to add this map type via other dialects or frontends, it's likely not worth the added complexity for the time being if it is avoidable, but it could be a requirement in the future.

The tracing back was required, as when you are mapping a structure and member pointer pairing (or presumably any member of a structure) the address of the member needs to be part of the originating structure (or in it's address range) to be correctly attached by the runtime (at least that's how I understood the issue at the time, perhaps I misinterpreted it). So, the dislocation of the two mappings (member and parent) to seperate allocations was not ideal. However, this is not the case when using BoxAddrOf.

I might be wrong, but I think all the information for invoking libomptarget should be available in varPtr and varPtrPtr, i.e. the corresponding Args pointer is the value of in varPtr (which is the value of the data pointer potentially offset'ed due to array section), and the ArgBases pointer is the value in varPtrPtr (which is the address of the member within the data structure, be it a descriptor or a C structure).

This is (I think roughly, with some differences that can be corrected over time) how it works for descriptors currently, the base address map will have both varPtr and varPtrPtr set, however, the varPtr currently is the same as the one set for the descriptor map_info for the moment (wasn't sure what to set it to at the time, we do the offset calculations in MLIR -> LLVM-IR lowering currently as well) and the varPtrPtr is the address of the member within the descriptor data structure (retrieved via BoxAddrOf). We currently only use one or the other in the lowering unfortunately, if varPtrPtr is provided we select it for the moment, otherwise we select the varPtr field. This is something that could do with a little work, but there's only one case currently that the varPtrPtr field is utilised so I've been unsure how to better integrate it.

Okay, I guess I missed my chance to get more details, since there have been a lot of rework, as you said :)

Not at all, I am always happy to make changes when they're suggested, so if you do catch anything you think could be done better please do and I will do my best to incorporate it, if not initially in another iteration :-)

@agozillon
Copy link
Contributor Author

agozillon commented Mar 12, 2024

Updated the following PR with changes to checkIfPointerMap that requires a much more concise set of changes for the moment, and should make this PR redundant and closeable: #84349

Edit: I'll leave this PR open for the time being until the other lands to make sure we do not wish to go down this route for now.

agozillon added a commit to agozillon/llvm-project that referenced this pull request Mar 20, 2024
…ectives

This PR refactors bounds offsetting by combining the two differing implementations
(one applying to initial derivd type member map imlpementation for descriptors and
the other for reguar arrays, effectively allocatable array vs regular array in fortran) now
that it's a little simpler to do.

The PR also moves the utilisation of createAlteredByCaptureMap into genMapInfoOp,
where it will be correctly applied to all MapInfoData, appropriately offsetting and
altering Pointer data set in the kernel argument structure on the host.  This
primarily means bounds will now correctly apply to enter/exit/update map
clauses as opposed to just the Target directive that is currently the case. A few
fortran runtime tests have been added to verify this new behaviour.

This PR depends on: llvm#84328 and is an extraction
of the larger derived type member map PR stack (so a requirement for it to land).
agozillon added a commit that referenced this pull request Mar 22, 2024
…ectives (#84349)

This PR refactors bounds offsetting by combining the two differing
implementations (one applying to initial derived type member map
implementation for descriptors and the other for regular arrays,
effectively allocatable array vs regular array in fortran) now that it's
a little simpler to do.

The PR also moves the utilization of createAlteredByCaptureMap into
genMapInfoOp, where it will be correctly applied to all MapInfoData,
appropriately offsetting and altering Pointer data set in the kernel
argument structure on the host. This primarily means bounds offsets will
now correctly apply to enter/exit/update map clauses as opposed to just
the Target directive that is currently the case. A few fortran runtime
tests have been added to verify this new behavior.

This PR depends on: #84328 and
is an extraction of the larger derived type member map PR stack (so a
requirement for it to land).
@agozillon
Copy link
Contributor Author

Other cleaner patch landed! Thank you @vzakhari for nudging my brain in the right direction with your comments.

So I will close this for the moment, hopefully this approach won't be required. And then I'll rebase the larger PR stack on top of the newly landed PR to hopefully tidy it up and prepare it all for review. I'll await the buildbots passing happily first though, just incase.

@agozillon agozillon closed this Mar 22, 2024
chencha3 pushed a commit to chencha3/llvm-project that referenced this pull request Mar 23, 2024
…ectives (llvm#84349)

This PR refactors bounds offsetting by combining the two differing
implementations (one applying to initial derived type member map
implementation for descriptors and the other for regular arrays,
effectively allocatable array vs regular array in fortran) now that it's
a little simpler to do.

The PR also moves the utilization of createAlteredByCaptureMap into
genMapInfoOp, where it will be correctly applied to all MapInfoData,
appropriately offsetting and altering Pointer data set in the kernel
argument structure on the host. This primarily means bounds offsets will
now correctly apply to enter/exit/update map clauses as opposed to just
the Target directive that is currently the case. A few fortran runtime
tests have been added to verify this new behavior.

This PR depends on: llvm#84328 and
is an extraction of the larger derived type member map PR stack (so a
requirement for it to land).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:openmp OpenMP related changes to Clang flang:fir-hlfir flang:openmp flang Flang issues not falling into any other category mlir:llvm mlir:openmp mlir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants