-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[Flang][OpenMP] Derived type explicit allocatable member mapping #111192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Created using spr 1.3.4
@llvm/pr-subscribers-flang-openmp @llvm/pr-subscribers-flang-fir-hlfir Author: None (agozillon) ChangesThis PR is one of 3 in a PR stack, this is the primary change set which seeks The changes in this PR primarily occur in the OpenMP lowering and In the OpenMP lowering several utility functions were added or extended The other large change area was in the OMPMapInfoFinalization pass, A subset of the changes were also aimed at making some of the utilities Otherwise I have added a set of new tests encompassing some of Patch is 271.45 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/111192.diff 18 Files Affected:
diff --git a/flang/include/flang/Optimizer/Builder/FIRBuilder.h b/flang/include/flang/Optimizer/Builder/FIRBuilder.h
index 09f7b892f1ecbe..b772c523caba49 100644
--- a/flang/include/flang/Optimizer/Builder/FIRBuilder.h
+++ b/flang/include/flang/Optimizer/Builder/FIRBuilder.h
@@ -215,6 +215,11 @@ class FirOpBuilder : public mlir::OpBuilder, public mlir::OpBuilder::Listener {
llvm::ArrayRef<mlir::Value> lenParams,
bool asTarget = false);
+ /// Create a two dimensional ArrayAttr containing integer data as
+ /// IntegerAttrs, effectively: ArrayAttr<ArrayAttr<IntegerAttr>>>.
+ mlir::ArrayAttr create2DI64ArrayAttr(
+ llvm::SmallVectorImpl<llvm::SmallVector<int64_t>> &intData);
+
/// Create a temporary using `fir.alloca`. This function does not hoist.
/// It is the callers responsibility to set the insertion point if
/// hoisting is required.
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index a4d2524bccf5c3..209e79e5182634 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -889,16 +889,17 @@ void ClauseProcessor::processMapObjects(
lower::StatementContext &stmtCtx, mlir::Location clauseLocation,
const omp::ObjectList &objects,
llvm::omp::OpenMPOffloadMappingFlags mapTypeBits,
- std::map<const semantics::Symbol *,
- llvm::SmallVector<OmpMapMemberIndicesData>> &parentMemberIndices,
+ std::map<Object, OmpMapParentAndMemberData> &parentMemberIndices,
llvm::SmallVectorImpl<mlir::Value> &mapVars,
llvm::SmallVectorImpl<const semantics::Symbol *> *mapSyms,
llvm::SmallVectorImpl<mlir::Location> *mapSymLocs,
llvm::SmallVectorImpl<mlir::Type> *mapSymTypes) const {
fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+
for (const omp::Object &object : objects) {
llvm::SmallVector<mlir::Value> bounds;
std::stringstream asFortran;
+ std::optional<omp::Object> parentObj;
lower::AddrAndBoundsInfo info =
lower::gatherDataOperandAddrAndBounds<mlir::omp::MapBoundsOp,
@@ -907,28 +908,46 @@ void ClauseProcessor::processMapObjects(
object.ref(), clauseLocation, asFortran, bounds,
treatIndexAsSection);
+ mlir::Value baseOp = info.rawInput;
+ if (object.sym()->owner().IsDerivedType()) {
+ omp::ObjectList objectList = gatherObjects(object, semaCtx);
+ assert(!objectList.empty() &&
+ "could not find parent objects of derived type member");
+ parentObj = objectList[0];
+ parentMemberIndices.emplace(parentObj.value(),
+ OmpMapParentAndMemberData{});
+
+ if (isMemberOrParentAllocatableOrPointer(object, semaCtx)) {
+ llvm::SmallVector<int64_t> indices;
+ generateMemberPlacementIndices(object, indices, semaCtx);
+ baseOp = createParentSymAndGenIntermediateMaps(
+ clauseLocation, converter, semaCtx, stmtCtx, objectList, indices,
+ parentMemberIndices[parentObj.value()], asFortran.str(),
+ mapTypeBits);
+ }
+ }
+
// Explicit map captures are captured ByRef by default,
// optimisation passes may alter this to ByCopy or other capture
// types to optimise
- mlir::Value baseOp = info.rawInput;
auto location = mlir::NameLoc::get(
mlir::StringAttr::get(firOpBuilder.getContext(), asFortran.str()),
baseOp.getLoc());
mlir::omp::MapInfoOp mapOp = createMapInfoOp(
firOpBuilder, location, baseOp,
/*varPtrPtr=*/mlir::Value{}, asFortran.str(), bounds,
- /*members=*/{}, /*membersIndex=*/mlir::DenseIntElementsAttr{},
+ /*members=*/{}, /*membersIndex=*/mlir::ArrayAttr{},
static_cast<
std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
mapTypeBits),
mlir::omp::VariableCaptureKind::ByRef, baseOp.getType());
- if (object.sym()->owner().IsDerivedType()) {
- addChildIndexAndMapToParent(object, parentMemberIndices, mapOp, semaCtx);
+ if (parentObj.has_value()) {
+ addChildIndexAndMapToParent(
+ object, parentMemberIndices[parentObj.value()], mapOp, semaCtx);
} else {
mapVars.push_back(mapOp);
- if (mapSyms)
- mapSyms->push_back(object.sym());
+ mapSyms->push_back(object.sym());
if (mapSymTypes)
mapSymTypes->push_back(baseOp.getType());
if (mapSymLocs)
@@ -949,9 +968,7 @@ bool ClauseProcessor::processMap(
llvm::SmallVector<const semantics::Symbol *> localMapSyms;
llvm::SmallVectorImpl<const semantics::Symbol *> *ptrMapSyms =
mapSyms ? mapSyms : &localMapSyms;
- std::map<const semantics::Symbol *,
- llvm::SmallVector<OmpMapMemberIndicesData>>
- parentMemberIndices;
+ std::map<Object, OmpMapParentAndMemberData> parentMemberIndices;
bool clauseFound = findRepeatableClause<omp::clause::Map>(
[&](const omp::clause::Map &clause, const parser::CharBlock &source) {
@@ -1003,17 +1020,15 @@ bool ClauseProcessor::processMap(
mapSymLocs, mapSymTypes);
});
- insertChildMapInfoIntoParent(converter, parentMemberIndices, result.mapVars,
- *ptrMapSyms, mapSymTypes, mapSymLocs);
-
+ insertChildMapInfoIntoParent(converter, semaCtx, stmtCtx, parentMemberIndices,
+ result.mapVars, mapSymTypes, mapSymLocs,
+ ptrMapSyms);
return clauseFound;
}
bool ClauseProcessor::processMotionClauses(lower::StatementContext &stmtCtx,
mlir::omp::MapClauseOps &result) {
- std::map<const semantics::Symbol *,
- llvm::SmallVector<OmpMapMemberIndicesData>>
- parentMemberIndices;
+ std::map<Object, OmpMapParentAndMemberData> parentMemberIndices;
llvm::SmallVector<const semantics::Symbol *> mapSymbols;
auto callbackFn = [&](const auto &clause, const parser::CharBlock &source) {
@@ -1034,9 +1049,9 @@ bool ClauseProcessor::processMotionClauses(lower::StatementContext &stmtCtx,
clauseFound =
findRepeatableClause<omp::clause::From>(callbackFn) || clauseFound;
- insertChildMapInfoIntoParent(converter, parentMemberIndices, result.mapVars,
- mapSymbols,
- /*mapSymTypes=*/nullptr, /*mapSymLocs=*/nullptr);
+ insertChildMapInfoIntoParent(
+ converter, semaCtx, stmtCtx, parentMemberIndices, result.mapVars,
+ /*mapSymTypes=*/nullptr, /*mapSymLocs=*/nullptr, &mapSymbols);
return clauseFound;
}
@@ -1110,9 +1125,7 @@ bool ClauseProcessor::processUseDeviceAddr(
llvm::SmallVectorImpl<mlir::Type> &useDeviceTypes,
llvm::SmallVectorImpl<mlir::Location> &useDeviceLocs,
llvm::SmallVectorImpl<const semantics::Symbol *> &useDeviceSyms) const {
- std::map<const semantics::Symbol *,
- llvm::SmallVector<OmpMapMemberIndicesData>>
- parentMemberIndices;
+ std::map<Object, OmpMapParentAndMemberData> parentMemberIndices;
bool clauseFound = findRepeatableClause<omp::clause::UseDeviceAddr>(
[&](const omp::clause::UseDeviceAddr &clause,
const parser::CharBlock &source) {
@@ -1125,9 +1138,9 @@ bool ClauseProcessor::processUseDeviceAddr(
&useDeviceSyms, &useDeviceLocs, &useDeviceTypes);
});
- insertChildMapInfoIntoParent(converter, parentMemberIndices,
- result.useDeviceAddrVars, useDeviceSyms,
- &useDeviceTypes, &useDeviceLocs);
+ insertChildMapInfoIntoParent(converter, semaCtx, stmtCtx, parentMemberIndices,
+ result.useDeviceAddrVars, &useDeviceTypes,
+ &useDeviceLocs, &useDeviceSyms);
return clauseFound;
}
@@ -1136,9 +1149,8 @@ bool ClauseProcessor::processUseDevicePtr(
llvm::SmallVectorImpl<mlir::Type> &useDeviceTypes,
llvm::SmallVectorImpl<mlir::Location> &useDeviceLocs,
llvm::SmallVectorImpl<const semantics::Symbol *> &useDeviceSyms) const {
- std::map<const semantics::Symbol *,
- llvm::SmallVector<OmpMapMemberIndicesData>>
- parentMemberIndices;
+ std::map<Object, OmpMapParentAndMemberData> parentMemberIndices;
+
bool clauseFound = findRepeatableClause<omp::clause::UseDevicePtr>(
[&](const omp::clause::UseDevicePtr &clause,
const parser::CharBlock &source) {
@@ -1151,9 +1163,9 @@ bool ClauseProcessor::processUseDevicePtr(
&useDeviceSyms, &useDeviceLocs, &useDeviceTypes);
});
- insertChildMapInfoIntoParent(converter, parentMemberIndices,
- result.useDevicePtrVars, useDeviceSyms,
- &useDeviceTypes, &useDeviceLocs);
+ insertChildMapInfoIntoParent(converter, semaCtx, stmtCtx, parentMemberIndices,
+ result.useDevicePtrVars, &useDeviceTypes,
+ &useDeviceLocs, &useDeviceSyms);
return clauseFound;
}
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h
index 0c8e7bd47ab5a6..0f6f0c9863582d 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.h
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -178,8 +178,7 @@ class ClauseProcessor {
lower::StatementContext &stmtCtx, mlir::Location clauseLocation,
const omp::ObjectList &objects,
llvm::omp::OpenMPOffloadMappingFlags mapTypeBits,
- std::map<const semantics::Symbol *,
- llvm::SmallVector<OmpMapMemberIndicesData>> &parentMemberIndices,
+ std::map<Object, OmpMapParentAndMemberData> &parentMemberIndices,
llvm::SmallVectorImpl<mlir::Value> &mapVars,
llvm::SmallVectorImpl<const semantics::Symbol *> *mapSyms,
llvm::SmallVectorImpl<mlir::Location> *mapSymLocs = nullptr,
diff --git a/flang/lib/Lower/OpenMP/Clauses.h b/flang/lib/Lower/OpenMP/Clauses.h
index 51bf0eab0f8d07..34639673a1bd5b 100644
--- a/flang/lib/Lower/OpenMP/Clauses.h
+++ b/flang/lib/Lower/OpenMP/Clauses.h
@@ -55,6 +55,13 @@ struct IdTyTemplate {
return designator == other.designator;
}
+ // Defining an "ordering" which allows types derived from this to be
+ // utilised in maps and other containers that require comparison
+ // operators for ordering
+ bool operator<(const IdTyTemplate &other) const {
+ return symbol < other.symbol;
+ }
+
operator bool() const { return symbol != nullptr; }
};
@@ -76,6 +83,10 @@ struct ObjectT<Fortran::lower::omp::IdTyTemplate<Fortran::lower::omp::ExprTy>,
Fortran::semantics::Symbol *sym() const { return identity.symbol; }
const std::optional<ExprTy> &ref() const { return identity.designator; }
+ bool operator<(const ObjectT<IdTy, ExprTy> &other) const {
+ return identity < other.identity;
+ }
+
IdTy identity;
};
} // namespace tomp::type
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 8195f4a897a90b..a1a6d8816e4bc1 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -935,7 +935,7 @@ static void genBodyOfTargetOp(
firOpBuilder, copyVal.getLoc(), copyVal,
/*varPtrPtr=*/mlir::Value{}, name.str(), bounds,
/*members=*/llvm::SmallVector<mlir::Value>{},
- /*membersIndex=*/mlir::DenseIntElementsAttr{},
+ /*membersIndex=*/mlir::ArrayAttr{},
static_cast<
std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_IMPLICIT),
@@ -1792,7 +1792,7 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
mlir::Value mapOp = createMapInfoOp(
firOpBuilder, location, baseOp, /*varPtrPtr=*/mlir::Value{},
name.str(), bounds, /*members=*/{},
- /*membersIndex=*/mlir::DenseIntElementsAttr{},
+ /*membersIndex=*/mlir::ArrayAttr{},
static_cast<
std::underlying_type_t<llvm::omp::OpenMPOffloadMappingFlags>>(
mapFlag),
diff --git a/flang/lib/Lower/OpenMP/Utils.cpp b/flang/lib/Lower/OpenMP/Utils.cpp
index 47bc12e1b8a030..2e25f374934b94 100644
--- a/flang/lib/Lower/OpenMP/Utils.cpp
+++ b/flang/lib/Lower/OpenMP/Utils.cpp
@@ -13,9 +13,15 @@
#include "Utils.h"
#include "Clauses.h"
+#include <DirectivesCommon.h>
+
+#include <flang/Evaluate/fold.h>
#include <flang/Lower/AbstractConverter.h>
+#include <flang/Lower/ConvertExprToHLFIR.h>
#include <flang/Lower/ConvertType.h>
#include <flang/Lower/PFTBuilder.h>
+#include <flang/Lower/StatementContext.h>
+#include <flang/Lower/SymbolMap.h>
#include <flang/Optimizer/Builder/FIRBuilder.h>
#include <flang/Optimizer/Builder/Todo.h>
#include <flang/Parser/parse-tree.h>
@@ -23,9 +29,6 @@
#include <flang/Semantics/tools.h>
#include <llvm/Support/CommandLine.h>
-#include <algorithm>
-#include <numeric>
-
llvm::cl::opt<bool> treatIndexAsSection(
"openmp-treat-index-as-section",
llvm::cl::desc("In the OpenMP data clauses treat `a(N)` as `a(N:N)`."),
@@ -117,14 +120,12 @@ void gatherFuncAndVarSyms(
symbolAndClause.emplace_back(clause, *object.sym());
}
-mlir::omp::MapInfoOp
-createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
- mlir::Value baseAddr, mlir::Value varPtrPtr, std::string name,
- llvm::ArrayRef<mlir::Value> bounds,
- llvm::ArrayRef<mlir::Value> members,
- mlir::DenseIntElementsAttr membersIndex, uint64_t mapType,
- mlir::omp::VariableCaptureKind mapCaptureType, mlir::Type retTy,
- bool partialMap) {
+mlir::omp::MapInfoOp createMapInfoOp(
+ fir::FirOpBuilder &builder, mlir::Location loc, mlir::Value baseAddr,
+ mlir::Value varPtrPtr, std::string name, mlir::ArrayRef<mlir::Value> bounds,
+ mlir::ArrayRef<mlir::Value> members, mlir::ArrayAttr membersIndex,
+ uint64_t mapType, mlir::omp::VariableCaptureKind mapCaptureType,
+ mlir::Type retTy, bool partialMap) {
if (auto boxTy = llvm::dyn_cast<fir::BaseBoxType>(baseAddr.getType())) {
baseAddr = builder.create<fir::BoxAddrOp>(loc, baseAddr);
retTy = baseAddr.getType();
@@ -145,11 +146,174 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
builder.getIntegerAttr(builder.getIntegerType(64, false), mapType),
builder.getAttr<mlir::omp::VariableCaptureKindAttr>(mapCaptureType),
builder.getStringAttr(name), builder.getBoolAttr(partialMap));
-
return op;
}
-static int
+omp::ObjectList gatherObjects(omp::Object obj,
+ semantics::SemanticsContext &semaCtx) {
+ omp::ObjectList objList;
+ std::optional<omp::Object> baseObj = obj;
+ while (baseObj.has_value()) {
+ objList.push_back(baseObj.value());
+ baseObj = getBaseObject(baseObj.value(), semaCtx);
+ }
+ return omp::ObjectList{llvm::reverse(objList)};
+}
+
+bool isDuplicateMemberMapInfo(OmpMapParentAndMemberData &parentMembers,
+ llvm::SmallVectorImpl<int64_t> &memberIndices) {
+ for (auto memberData : parentMembers.memberPlacementIndices)
+ if (std::equal(memberIndices.begin(), memberIndices.end(),
+ memberData.begin()))
+ return true;
+ return false;
+}
+
+static void generateArrayIndices(lower::AbstractConverter &converter,
+ fir::FirOpBuilder &firOpBuilder,
+ lower::StatementContext &stmtCtx,
+ mlir::Location clauseLocation,
+ llvm::SmallVectorImpl<mlir::Value> &indices,
+ omp::Object object) {
+ if (auto maybeRef = evaluate::ExtractDataRef(*object.ref())) {
+ evaluate::DataRef ref = *maybeRef;
+ if (auto *arr = std::get_if<evaluate::ArrayRef>(&ref.u)) {
+ for (auto v : arr->subscript()) {
+ if (std::holds_alternative<Triplet>(v.u)) {
+ llvm_unreachable("Triplet indexing in map clause is unsupported");
+ } else {
+ auto expr =
+ std::get<Fortran::evaluate::IndirectSubscriptIntegerExpr>(v.u);
+ mlir::Value subscript = fir::getBase(
+ converter.genExprValue(toEvExpr(expr.value()), stmtCtx));
+ mlir::Value one = firOpBuilder.createIntegerConstant(
+ clauseLocation, firOpBuilder.getIndexType(), 1);
+ subscript = firOpBuilder.createConvert(
+ clauseLocation, firOpBuilder.getIndexType(), subscript);
+ indices.push_back(firOpBuilder.create<mlir::arith::SubIOp>(
+ clauseLocation, subscript, one));
+ }
+ }
+ }
+ }
+}
+
+// When mapping members of derived types, there is a chance that one of the
+// members along the way to a mapped member is an descriptor. In which case
+// we have to make sure we generate a map for those along the way otherwise
+// we will be missing a chunk of data required to actually map the member
+// type to device. This function effectively generates these maps and the
+// appropriate data accesses required to generate these maps. It will avoid
+// creating duplicate maps, as duplicates are just as bad as unmapped
+// descriptor data in a lot of cases for the runtime (and unnecessary
+// data movement should be avoided where possible)
+mlir::Value createParentSymAndGenIntermediateMaps(
+ mlir::Location clauseLocation, lower::AbstractConverter &converter,
+ semantics::SemanticsContext &semaCtx, lower::StatementContext &stmtCtx,
+ omp::ObjectList &objectList, llvm::SmallVector<int64_t> &indices,
+ OmpMapParentAndMemberData &parentMemberIndices, std::string asFortran,
+ llvm::omp::OpenMPOffloadMappingFlags mapTypeBits) {
+
+ auto arrayExprWithSubscript = [](omp::Object obj) {
+ if (auto maybeRef = evaluate::ExtractDataRef(*obj.ref())) {
+ evaluate::DataRef ref = *maybeRef;
+ if (auto *arr = std::get_if<evaluate::ArrayRef>(&ref.u))
+ return !arr->subscript().empty();
+ }
+ return false;
+ };
+
+ fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+ lower::AddrAndBoundsInfo parentBaseAddr = lower::getDataOperandBaseAddr(
+ converter, firOpBuilder, *objectList[0].sym(), clauseLocation);
+ mlir::Value curValue = parentBaseAddr.addr;
+
+ // Iterate over all objects in the objectList, this should consist of all
+ // record types between the parent and the member being mapped (including
+ // the parent). The object list may also contain array objects as well,
+ // this can occur when specifying bounds or a specific element access
+ // within a member map, we skip these.
+ size_t currentIndex = 0;
+ for (size_t i = 0; i < objectList.size(); ++i) {
+ if (fir::SequenceType arrType = mlir::dyn_cast<fir::SequenceType>(
+ fir::unwrapPassByRefType(curValue.getType()))) {
+ if (arrayExprWithSubscript(objectList[i])) {
+ llvm::SmallVector<mlir::Value> indices;
+ generateArrayIndices(converter, firOpBuilder, stmtCtx, clauseLocation,
+ indices, objectList[i]);
+ assert(!indices.empty() && "missing expected indices for map clause");
+ curValue = firOpBuilder.create<fir::CoordinateOp>(
+ clauseLocation, firOpBuilder.getRefType(arrType.getEleTy()),
+ curValue, indices);
+ }
+ }
+
+ if (fir::RecordType recordType = mlir::dyn_cast<fir::RecordType>(
+ fir::unwrapPassByRefType(curValue.getType()))) {
+ mlir::Value idxConst = firOpBuilder.createIntegerConstant(
+ clauseLocation, firOpBuilder.getIndexType(), indices[currentIndex]);
+ mlir::Type memberTy =
+ recordType.getTypeList().at(indices[currentIndex]).second;
+ curValue = firOpBuilder.create<fir::CoordinateOp>(
+ clauseLocation, firOpBuilder.getRefType(memberTy), curValue,
+ idxConst);
+
+ if ((currentIndex == indices.size() - 1) ||
+ !fir::isTypeWithDescriptor(memberTy)) {
+ currentIndex++;
+ continue;
+ }
+
+ llvm::SmallVector<int64_t> interimIndices(
+ indices.begin(), std::next(indices.begin(), currentIndex + 1));
+ if (!isDuplicateMemberMapInfo(parentMemberIndices, interimIndices)) {
+ // Generate initial bounds operations using the standard lowering
+ ...
[truncated]
|
This portion of the PR stack underwent the most changes, as it's the most "complex" part of the PR stack, it's primarily underwent some additional complexity in the map lowering for member mapping which now attempts to incorporate bounds where possible amongst a some other things, it also incorporates the last iteration of reviewer comments on the previous PR stack alongside additional tests checking the new changes work as expected (and continue to). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First round, just reading through some tests to make sure I understand the effect of the changes. Will continue later. Thanks Andrew!
@@ -0,0 +1,161 @@ | |||
!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the whole file, can you capture the names of longs types in FileCheck variables and use the captured type names to improve readability and reduce line lengths a bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say this is a good idea and something that would be good to do in all tests added by this PR, and eventually in all flang tests dealing with complex types like these. It's very difficult at the moment to understand and maintain these tests because of it.
Also, test function/variable names could be shortened to improve legibility as well. I think we can add a comment to explain their purpose a little better if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me as the FIR type system tends to be a bit egregious, I'll do that for the main new tests I added in the PR, but going to avoid it in cases where I've not added anything significant or it's primarily just an alteration, as fully altering the files would be better done in a seperate tidy up PR as opposed to this one (one that could likely do the above for all of the OpenMP tests we have).
!CHECK: %[[MAP_REGULAR_NESTED_MEMBER:.*]] = omp.map.info var_ptr(%[[REGULAR_NESTED_MEMBER_COORD]] : !fir.ref<i32>, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref<i32> {name = "one_l%nest%k"} | ||
!CHECK: %[[DTYPE_BASE_ADDR:.*]] = fir.box_offset %[[DECLARE]]#1 base_addr : (!fir.ref<!fir.box<!fir.heap<!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTtop_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32,nest:!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTmiddle_layer{i:f32,array_i:!fir.array<10xi32>,array_k:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>}>>>>) -> !fir.llvm_ptr<!fir.ref<!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTtop_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32,nest:!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTmiddle_layer{i:f32,array_i:!fir.array<10xi32>,array_k:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>}>>> | ||
!CHECK: %[[MAP_DTYPE_BASE_ADDR:.*]] = omp.map.info var_ptr(%[[DECLARE]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTtop_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32,nest:!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTmiddle_layer{i:f32,array_i:!fir.array<10xi32>,array_k:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>}>>>>, !fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTtop_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32,nest:!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTmiddle_layer{i:f32,array_i:!fir.array<10xi32>,array_k:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>}>) var_ptr_ptr(%[[DTYPE_BASE_ADDR]] : !fir.llvm_ptr<!fir.ref<!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTtop_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32,nest:!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTmiddle_layer{i:f32,array_i:!fir.array<10xi32>,array_k:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>}>>>) map_clauses(tofrom) capture(ByRef) -> !fir.llvm_ptr<!fir.ref<!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTtop_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32,nest:!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTmiddle_layer{i:f32,array_i:!fir.array<10xi32>,array_k:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>}>>> {name = ""} | ||
!CHECK: %[[MAP_DTYPE:.*]] = omp.map.info var_ptr(%[[DECLARE]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTtop_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32,nest:!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTmiddle_layer{i:f32,array_i:!fir.array<10xi32>,array_k:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>}>>>>, !fir.box<!fir.heap<!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTtop_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32,nest:!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTmiddle_layer{i:f32,array_i:!fir.array<10xi32>,array_k:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>}>>>) map_clauses(to) capture(ByRef) members(%75, %69, %68, %73 : [0], [0,6,2], [0,6,2,0], [0,6,3] : !fir.llvm_ptr<!fir.ref<!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTtop_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32,nest:!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTmiddle_layer{i:f32,array_i:!fir.array<10xi32>,array_k:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>}>>>, !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>, !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>, !fir.ref<i32>) -> !fir.ref<!fir.box<!fir.heap<!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTtop_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32,nest:!fir.type<_QFtest_alloca_nested_derived_type_map_operand_and_block_additionTmiddle_layer{i:f32,array_i:!fir.array<10xi32>,array_k:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>}>>>> {name = "one_l"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you forgot to use the FileCheck vars for members
here.
!CHECK: %[[NESTED_MEMBER_BASE_ADDR:.*]] = fir.box_offset %[[NESTED_MEMBER_COORD]] base_addr : (!fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>> | ||
!CHECK: %[[MAP_NESTED_MEMBER_BASE_ADDR:.*]] = omp.map.info var_ptr(%[[NESTED_MEMBER_COORD]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>, !fir.array<?xi32>) var_ptr_ptr(%[[NESTED_MEMBER_BASE_ADDR]] : !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) map_clauses(tofrom) capture(ByRef) bounds(%[[BOUNDS]]) -> !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>> {name = ""} | ||
!CHECK: %[[MAP_NESTED_MEMBER_DESC:.*]] = omp.map.info var_ptr(%[[NESTED_MEMBER_COORD]] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>, !fir.box<!fir.heap<!fir.array<?xi32>>>) map_clauses(to) capture(ByRef) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> {name = "one_l%nest%array_k"} | ||
!CHECK: %[[MAP_PARENT:.*]] = omp.map.info var_ptr(%[[DECLARE]]#1 : !fir.ref<!fir.type<_QFtest_nested_derived_type_alloca_map_operand_and_block_additionTtop_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32,nest:!fir.type<_QFtest_nested_derived_type_alloca_map_operand_and_block_additionTmiddle_layer{i:f32,array_i:!fir.array<10xi32>,array_k:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>}>>, !fir.type<_QFtest_nested_derived_type_alloca_map_operand_and_block_additionTtop_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32,nest:!fir.type<_QFtest_nested_derived_type_alloca_map_operand_and_block_additionTmiddle_layer{i:f32,array_i:!fir.array<10xi32>,array_k:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>}>) map_clauses(tofrom) capture(ByRef) members(%62, %61 : [6,2], [6,2,0] : !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>>, !fir.llvm_ptr<!fir.ref<!fir.array<?xi32>>>) -> !fir.ref<!fir.type<_QFtest_nested_derived_type_alloca_map_operand_and_block_additionTtop_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32,nest:!fir.type<_QFtest_nested_derived_type_alloca_map_operand_and_block_additionTmiddle_layer{i:f32,array_i:!fir.array<10xi32>,array_k:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>}>> {name = "one_l", partial_map = true} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed Utils.h/.cpp
, submitting the current state of the review to prevent double-reviews when possible.
!CHECK: %[[ALLOCA:.*]] = fir.alloca !fir.type<_QFtest_derived_type_allocatable_map_operand_and_block_additionTone_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}> {bindc_name = "one_l", uniq_name = "_QFtest_derived_type_allocatable_map_operand_and_block_additionEone_l"} | ||
!CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %[[ALLOCA]] {uniq_name = "_QFtest_derived_type_allocatable_map_operand_and_block_additionEone_l"} : (!fir.ref<!fir.type<_QFtest_derived_type_allocatable_map_operand_and_block_additionTone_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>>) -> (!fir.ref<!fir.type<_QFtest_derived_type_allocatable_map_operand_and_block_additionTone_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>>, !fir.ref<!fir.type<_QFtest_derived_type_allocatable_map_operand_and_block_additionTone_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>>) | ||
!CHECK: %[[BOUNDS:.*]] = omp.map.bounds lower_bound({{.*}}) upper_bound({{.*}}) extent({{.*}}) stride({{.*}}) start_idx({{.*}}) {stride_in_bytes = true} | ||
!CHECK: %[[MEMBER_INDEX:.*]] = arith.constant 4 : index | ||
!CHECK: %[[MEMBER_COORD:.*]] = fir.coordinate_of %[[DECLARE]]#0, %[[MEMBER_INDEX]] : (!fir.ref<!fir.type<_QFtest_derived_type_allocatable_map_operand_and_block_additionTone_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>>, index) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@agozillon Here is what I meant, nothing fancy, just making sure the test it more readable by reusing short names for complicated types.
!CHECK: %[[ALLOCA:.*]] = fir.alloca !fir.type<_QFtest_derived_type_allocatable_map_operand_and_block_additionTone_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}> {bindc_name = "one_l", uniq_name = "_QFtest_derived_type_allocatable_map_operand_and_block_additionEone_l"} | |
!CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %[[ALLOCA]] {uniq_name = "_QFtest_derived_type_allocatable_map_operand_and_block_additionEone_l"} : (!fir.ref<!fir.type<_QFtest_derived_type_allocatable_map_operand_and_block_additionTone_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>>) -> (!fir.ref<!fir.type<_QFtest_derived_type_allocatable_map_operand_and_block_additionTone_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>>, !fir.ref<!fir.type<_QFtest_derived_type_allocatable_map_operand_and_block_additionTone_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>>) | |
!CHECK: %[[BOUNDS:.*]] = omp.map.bounds lower_bound({{.*}}) upper_bound({{.*}}) extent({{.*}}) stride({{.*}}) start_idx({{.*}}) {stride_in_bytes = true} | |
!CHECK: %[[MEMBER_INDEX:.*]] = arith.constant 4 : index | |
!CHECK: %[[MEMBER_COORD:.*]] = fir.coordinate_of %[[DECLARE]]#0, %[[MEMBER_INDEX]] : (!fir.ref<!fir.type<_QFtest_derived_type_allocatable_map_operand_and_block_additionTone_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}>>, index) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> | |
!CHECK: %[[ALLOCA:.*]] = fir.alloca !fir.type<[[ONE_LAYER_TY:_QFtest_derived_type_allocatable_map_operand_and_block_additionTone_layer{i:f32,scalar:!fir.box<!fir.heap<i32>>,array_i:!fir.array<10xi32>,j:f32,array_j:!fir.box<!fir.heap<!fir.array<?xi32>>>,k:i32}]]> {bindc_name = "one_l", uniq_name = "_QFtest_derived_type_allocatable_map_operand_and_block_additionEone_l"} | |
!CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %[[ALLOCA]] {uniq_name = "_QFtest_derived_type_allocatable_map_operand_and_block_additionEone_l"} : (!fir.ref<!fir.type<[[ONE_LAYER_TY]]>>) -> (!fir.ref<!fir.type<[[ONE_LAYER_TY]]>>, !fir.ref<!fir.type<[[ONE_LAYER_TY]]>>) | |
!CHECK: %[[BOUNDS:.*]] = omp.map.bounds lower_bound({{.*}}) upper_bound({{.*}}) extent({{.*}}) stride({{.*}}) start_idx({{.*}}) {stride_in_bytes = true} | |
!CHECK: %[[MEMBER_INDEX:.*]] = arith.constant 4 : index | |
!CHECK: %[[MEMBER_COORD:.*]] = fir.coordinate_of %[[DECLARE]]#0, %[[MEMBER_INDEX]] : (!fir.ref<!fir.type<[[ONE_LAYER_TY]]>>, index) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xi32>>>> |
// symbol for top_dtype. Other parent derived type instances that have | ||
// members mapped would have there own OmpMapParentAndMemberData entry | ||
// accessed via their own symbol. | ||
struct OmpMapParentAndMemberData { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be something like this:
struct OmpMapParentAndMemberData { | |
struct OmpMapParentAndMemberData { | |
struct MemberMappingData { | |
llvm::SmallVector<int64_t> placementIndices; | |
mlir::omp::MapInfo memberMap; | |
}; | |
llvm::SmallVector<MemberMappingData> allMembersData; | |
}; |
The reason I think this is better is that it ties all the mapping data for a single member in one structure which makes easier to understand what OmpMapParentAndMemberData
encapsulates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to keep it the way it is (at least for this iteration), as a structure of arrays as opposed to an array of structures, I had it as the latter originally, but it makes the functions for transforming from this structure to the array attributes that the map holds and vice versa more complicated and accessing and storing the data is a little bit more cumbersome. I believe the rather large description above also helps describe the intent of the structure, or I would hope at least or I could have skipped writing it :-)
uint64_t mapType, mlir::omp::VariableCaptureKind mapCaptureType, | ||
mlir::Type retTy, bool partialMap = false); | ||
|
||
void addChildIndexAndMapToParent(const omp::Object &object, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have this as a member method inside OmpMapParentAndMemberData
?
if (compObj.has_value() && | ||
semantics::IsAllocatableOrObjectPointer(compObj.value().sym())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (compObj.has_value() && | |
semantics::IsAllocatableOrObjectPointer(compObj.value().sym())) | |
if (semantics::IsAllocatableOrObjectPointer(compObj.value().sym())) |
for (parentIdx = 0; parentIdx < mapSyms.size(); ++parentIdx) { | ||
if (mapSyms[parentIdx] == indices.first) { | ||
|
||
for (parentIdx = 0; parentIdx < mapSymbols->size(); ++parentIdx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
find_if
can be used here, right?
return false; | ||
} | ||
|
||
static void generateArrayIndices(lower::AbstractConverter &converter, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add docs with an example.
// this can occur when specifying bounds or a specific element access | ||
// within a member map, we skip these. | ||
size_t currentIndex = 0; | ||
for (size_t i = 0; i < objectList.size(); ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use a range-based loop here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately we cannot at the moment as we utilise i
in a couple of locations at the moment!
// creating duplicate maps, as duplicates are just as bad as unmapped | ||
// descriptor data in a lot of cases for the runtime (and unnecessary | ||
// data movement should be avoided where possible) | ||
mlir::Value createParentSymAndGenIntermediateMaps( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is difficult for me to understand what this function does. Can you please document the paramemters (the ones specific to the function, no need to document converter
and other usual parameters.
Can you also add some pseudo-MLIR guiding examples to the comments to help clarify the behavior of the function?
void generateMemberPlacementIndices(const Object &object, | ||
llvm::SmallVectorImpl<int64_t> &indices, | ||
semantics::SemanticsContext &semaCtx) { | ||
indices.clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to assert the list is emtpy here.
llvm::SmallVectorImpl<int> &indices, | ||
semantics::SemanticsContext &semaCtx) { | ||
void generateMemberPlacementIndices(const Object &object, | ||
llvm::SmallVectorImpl<int64_t> &indices, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious why the change from int
to int64_t
in particular?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe as I opted to utilise an I64 array attribute to contain the member indices as it had better utility/support/ease of use from what I could see when moving over to the new member index style.
// mapping of BlockArgs to MapInfoOp's at the same placement in | ||
// each array (BlockArgs and MapVars). | ||
if (directiveOp) { | ||
directiveOp.getRegion().insertArgument(i + j, mapMember.getType(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably needs rewriting based on the BlockArgOpenMPOpInterface
, since this index-based approach easily breaks if other entry block argument-generating clauses are introduced before map
(such as in_reduction
). This may be a helpful reference:
llvm-project/flang/lib/Lower/OpenMP/OpenMP.cpp
Lines 1010 to 1014 in db1a762
unsigned insertIndex = | |
argIface.getMapBlockArgsStart() + argIface.numMapBlockArgs(); | |
targetOp.getMapVarsMutable().append(mapOp); | |
mlir::Value clonedValArg = region.insertArgument( | |
insertIndex, copyVal.getType(), copyVal.getLoc()); |
#include <flang/Lower/AbstractConverter.h> | ||
#include <flang/Lower/ConvertExprToHLFIR.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all of the header file changes necessary? Just making sure we only have the minimal set here.
void | ||
getMemberUserList(mlir::omp::MapInfoOp op, | ||
llvm::SmallVectorImpl<ParentAndPlacement> &mapMemberUsers) { | ||
for (auto *users : op->getUsers()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
for (auto *users : op->getUsers()) | |
for (auto *user : op->getUsers()) |
/// base address member. We take the position of the descriptor in | ||
/// the member indices list, which is the index data that the base | ||
/// addresses index will be based off of, as the base address is | ||
/// a member of the descriptor. We must also alter other member's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comment was cut-off.
/// parents member list. | ||
void | ||
getMemberUserList(mlir::omp::MapInfoOp op, | ||
llvm::SmallVectorImpl<ParentAndPlacement> &mapMemberUsers) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what scenario will mapMemberUsers
end up with more than one element?
As a test (just to learn more about the changes), I modified this function as follows and none of the tests failed:
--- a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
+++ b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
@@ -81,8 +81,10 @@ class MapInfoFinalizationPass
for (auto *users : op->getUsers())
if (auto map = mlir::dyn_cast_if_present<mlir::omp::MapInfoOp>(users))
for (auto [i, mapMember] : llvm::enumerate(map.getMembers()))
- if (mapMember.getDefiningOp() == op)
+ if (mapMember.getDefiningOp() == op) {
mapMemberUsers.push_back({map, i});
+ break;
+ }
}
llvm::SmallVector<int64_t>
Also, mapMemberUsers[0]
is the only element used below (in genDescriptorMemberMaps
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I answered some variation of this question in the previous PR iteration, so just going to copy paste it here:
"it's intentional for the moment, there should never be more than one mapMemberUsers currently when we reach this pass, as we do not generate MLIR that allows the re-use of MapInfoOp's across operations, so technically the only cases where a MapInfoOp should be used, is a parent map member operation (in which case we add an entry to mapMemberUsers) or when it's tied to an existing MapInfoOp using operation (TargetOp/TargetExitData etc.).
So, perhaps being a bit pre-emptive and adding some initial support for this case, for when we may end up trying to generating tidier MLIR where we re-use map info operations allowing them to have multiple users. As that would be ideal in the future (and I believe you've already written one such test previously).
Previously an assert checking that we had no more than 1 user was above but i think it is covered by the prior assert at line 488 so it felt a little redundant to have two of the same check."
I believe there's another comment at line 271-ish (after updates may be off by a bit) that mentions the above, and I believe I'll be re-adding the assert as requested by @skatrak :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Andrew for your work on this. I have a few relatively minor comments, but otherwise the general approach seems ok to me.
omp::ObjectList objectList = gatherObjects(object, semaCtx); | ||
assert(!objectList.empty() && | ||
"could not find parent objects of derived type member"); | ||
parentObj = objectList[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: It's not clear to me why the first returned object is guaranteed to be the parent. Perhaps adding a description to gatherObjects
would help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, consider the symbol: "parent%child%member", we pass the whole symbol, and start from the RHS and peel back the symbol layers until we get to the end, "parent" in this case, we then reverse the list, so "parent" is the first member and it goes from LHS to RHS rather than RHS to LHS! Mainly mentioning to help explain a bit, more than happy to add a comment for it though :-)
mlir::omp::MapInfoOp &mapOp, semantics::SemanticsContext &semaCtx); | ||
mlir::omp::MapInfoOp createMapInfoOp( | ||
fir::FirOpBuilder &builder, mlir::Location loc, mlir::Value baseAddr, | ||
mlir::Value varPtrPtr, std::string name, mlir::ArrayRef<mlir::Value> bounds, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mlir::Value varPtrPtr, std::string name, mlir::ArrayRef<mlir::Value> bounds, | |
mlir::Value varPtrPtr, llvm::StringRef name, mlir::ArrayRef<mlir::Value> bounds, |
} | ||
|
||
llvm::SmallVector<int64_t> | ||
getAsIntegers(llvm::ArrayRef<mlir::Attribute> values) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Take an llvm::SmallVectorImpl<int64_t> &
output argument instead.
mlir::omp::MapInfoOp mapInfo, | ||
llvm::SmallVector<llvm::SmallVector<int64_t>> &indices) { | ||
indices.reserve(mapInfo.getMembersIndexAttr().getValue().size()); | ||
for (auto v : mapInfo.getMembersIndexAttr().getValue()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: You can use llvm::transform
here too, for consistency.
/// awkward to work with) | ||
void getMemberIndicesAsVectors( | ||
mlir::omp::MapInfoOp mapInfo, | ||
llvm::SmallVector<llvm::SmallVector<int64_t>> &indices) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llvm::SmallVector<llvm::SmallVector<int64_t>> &indices) { | |
llvm::SmallVectorImpl<llvm::SmallVector<int64_t>> &indices) { |
mlir::omp::TargetEnterDataOp>(user)) | ||
return user; | ||
|
||
if (auto mapUser = llvm::dyn_cast_if_present<mlir::omp::MapInfoOp>(user)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: The check above already expects that user
is not null, and it seems like it would be unexpected that users would be null.
if (auto mapUser = llvm::dyn_cast_if_present<mlir::omp::MapInfoOp>(user)) | |
if (auto mapUser = llvm::dyn_cast<mlir::omp::MapInfoOp>(user)) |
for (auto memberData : parentMembers.memberPlacementIndices) | ||
if (std::equal(memberIndices.begin(), memberIndices.end(), | ||
memberData.begin())) | ||
return true; | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be simplified a bit further with something similar to this:
for (auto memberData : parentMembers.memberPlacementIndices) | |
if (std::equal(memberIndices.begin(), memberIndices.end(), | |
memberData.begin())) | |
return true; | |
return false; | |
return llvm::find_if(parentMembers.memberPlacementIndices, | |
[&](auto &memberData) { | |
return llvm::equal(memberIndices, memberData); | |
}) != parentMembers.memberPlacementIndices.end(); |
while (compObj) { | ||
indices.push_back(getComponentPlacementInParent(compObj->sym())); | ||
int64_t index = getComponentPlacementInParent(compObj->sym()); | ||
assert(index >= 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Add a comment to the assert.
@@ -0,0 +1,161 @@ | |||
!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say this is a good idea and something that would be good to do in all tests added by this PR, and eventually in all flang tests dealing with complex types like these. It's very difficult at the moment to understand and maintain these tests because of it.
Also, test function/variable names could be shortened to improve legibility as well. I think we can add a comment to explain their purpose a little better if needed.
|
||
// ----- | ||
|
||
module attributes {omp.is_target_device = false} { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Is this needed for this test case specifically? If so, that's fine, I'm only pointing it out because none of the others below have it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately had to create a new iteration of PRs for this stack, as I moved machine and SPR decided that it didn't like that for some reason or other, but I've moved away from using SPR for the moment and did the new ones "by hand".
So I am incredibly sorry about that and the spam. I've submitted review comments here in reply, which unfortunately cannot be carried over (wish I could do so trivially), otherwise the new PR stack has incorporated all the changes I could.
omp::ObjectList objectList = gatherObjects(object, semaCtx); | ||
assert(!objectList.empty() && | ||
"could not find parent objects of derived type member"); | ||
parentObj = objectList[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, consider the symbol: "parent%child%member", we pass the whole symbol, and start from the RHS and peel back the symbol layers until we get to the end, "parent" in this case, we then reverse the list, so "parent" is the first member and it goes from LHS to RHS rather than RHS to LHS! Mainly mentioning to help explain a bit, more than happy to add a comment for it though :-)
return op; | ||
} | ||
|
||
static int | ||
omp::ObjectList gatherObjects(omp::Object obj, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opted to call it gatherObjectsOf, as it does include the passed in object itself! Also might be worth noting it can be used with things other than derived types, i.e. arrays or common block members, it's only really utilised for derived types just now though so happy to change the argument name!
mlir::Location clauseLocation, | ||
llvm::SmallVectorImpl<mlir::Value> &indices, | ||
omp::Object object) { | ||
if (auto maybeRef = evaluate::ExtractDataRef(*object.ref())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
happy to do the early returns, but I'd like to keep the llvm_unreachable and not replace it with a continue, it'd be ideal to know if it was generating incorrect indices if it was used with an array ref with triplets!
// this can occur when specifying bounds or a specific element access | ||
// within a member map, we skip these. | ||
size_t currentIndex = 0; | ||
for (size_t i = 0; i < objectList.size(); ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately we cannot at the moment as we utilise i
in a couple of locations at the moment!
llvm::SmallVectorImpl<int> &indices, | ||
semantics::SemanticsContext &semaCtx) { | ||
void generateMemberPlacementIndices(const Object &object, | ||
llvm::SmallVectorImpl<int64_t> &indices, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe as I opted to utilise an I64 array attribute to contain the member indices as it had better utility/support/ease of use from what I could see when moving over to the new member index style.
// symbol for top_dtype. Other parent derived type instances that have | ||
// members mapped would have there own OmpMapParentAndMemberData entry | ||
// accessed via their own symbol. | ||
struct OmpMapParentAndMemberData { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to keep it the way it is (at least for this iteration), as a structure of arrays as opposed to an array of structures, I had it as the latter originally, but it makes the functions for transforming from this structure to the array attributes that the map holds and vice versa more complicated and accessing and storing the data is a little bit more cumbersome. I believe the rather large description above also helps describe the intent of the structure, or I would hope at least or I could have skipped writing it :-)
/// parents member list. | ||
void | ||
getMemberUserList(mlir::omp::MapInfoOp op, | ||
llvm::SmallVectorImpl<ParentAndPlacement> &mapMemberUsers) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I answered some variation of this question in the previous PR iteration, so just going to copy paste it here:
"it's intentional for the moment, there should never be more than one mapMemberUsers currently when we reach this pass, as we do not generate MLIR that allows the re-use of MapInfoOp's across operations, so technically the only cases where a MapInfoOp should be used, is a parent map member operation (in which case we add an entry to mapMemberUsers) or when it's tied to an existing MapInfoOp using operation (TargetOp/TargetExitData etc.).
So, perhaps being a bit pre-emptive and adding some initial support for this case, for when we may end up trying to generating tidier MLIR where we re-use map info operations allowing them to have multiple users. As that would be ideal in the future (and I believe you've already written one such test previously).
Previously an assert checking that we had no more than 1 user was above but i think it is covered by the prior assert at line 488 so it felt a little redundant to have two of the same check."
I believe there's another comment at line 271-ish (after updates may be off by a bit) that mentions the above, and I believe I'll be re-adding the assert as requested by @skatrak :-)
// we are expanding a parent that is a descriptor and we have to adjust | ||
// all of its members to reflect the insertion of the base address. | ||
if (!mapMemberUsers.empty()) { | ||
// Currently, there should only be one user per map when this pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an equivalent on entry to the pass, but happy to add another fail safe.
return user; | ||
|
||
if (auto mapUser = llvm::dyn_cast_if_present<mlir::omp::MapInfoOp>(user)) | ||
return getFirstTargetUser(mapUser); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to opt for some assert checks, not a huge fan of llvm_unreachable outside of certain cases!
@@ -0,0 +1,161 @@ | |||
!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me as the FIR type system tends to be a bit egregious, I'll do that for the main new tests I added in the PR, but going to avoid it in cases where I've not added anything significant or it's primarily just an alteration, as fully altering the files would be better done in a seperate tidy up PR as opposed to this one (one that could likely do the above for all of the OpenMP tests we have).
This PR is one of 3 in a PR stack, this is the primary change set which seeks
to extend the current derived type explicit member mapping support to
handle descriptor member mapping at arbitrary levels of nesting. The PR
stack seems to do this reasonably (from testing so far) but as you can
create quite complex mappings with derived types (in particular when adding
allocatable derived types or arrays of allocatable derived types) I imagine
there will be hiccups, which I am more than happy to address. There will
also be further extensions to this work to handle the implicit auto-magical
mapping of descriptor members in derived types and a few other changes
planned for the future (with some ideas on optimizing things).
The changes in this PR primarily occur in the OpenMP lowering and
the OMPMapInfoFinalization pass.
In the OpenMP lowering several utility functions were added or extended
to support the generation of appropriate intermediate member mappings
which are currently required when the parent (or multiple parents) of a
mapped member are descriptor types. We need to map the entirety of
these types or do a "deep copy" for lack of a better term, where we map
both the base address and the descriptor as without the copying of both
of these we lack the information in the case of the descriptor to access the
member or attach the pointers data to the pointer and in the latter case we
require the base address to map the chunk of data. Currently we do not
segment descriptor based derived types as we do with regular
non-descriptor derived types, we effectively map their entirety in all
cases at the moment, I hope to address this at some point in the future as
it adds a fair bit of a performance penalty to having nestings of allocatable
derived types as an example. The process of mapping all intermediate
descriptor members in a members path only occurs if a member has
an allocatable or object parent in its symbol path or the member itself
is a member or allocatable. This occurs in the
createParentSymAndGenIntermediateMaps function, which will also
generate the appropriate address for the allocatable member
within the derived type to use as a the varPtr field of the map (for
intermediate allocatable maps and final allocatable mappings). In
this case it's necessary as we can't utilise the usual Fortran::lower
functionality such as gatherDataOperandAddrAndBounds without
causing issues later in the lowering due to extra allocas being spawned
which seem to affect the pointer attachment (at least this is my
current assumption, it results in memory access errors on the device
due to incorrect map information generation). This is similar
to why we do not use the MLIR value generated for this and utilise
the original symbol provided when mapping descriptor types external
to derived types. Hopefully this can be rectified in the future so this
function can be simplified and more closely aligned to the other type
mappings. We also make use of fir::CoordinateOp as opposed to the
HLFIR version as the HLFIR version doesn't support the appropriate
lowering to FIR necessary at the moment, we also cannot use a
single CoordinateOp (similarly to a single GEP) as when we index
through a descriptor operation (BoxType) we encounter issues later
in the lowering, however in either case we need access to intermediate
descriptors so individual CoordinateOp's aid this (although, being
able to compress them into a smaller amount of CoordinateOp's may
simplify the IR and perhaps result in a better end product, something
to consider for the future).
The other large change area was in the OMPMapInfoFinalization pass,
where the pass had to be extended to support the expansion of box
types (or multiple nestings of box types) within derived types, or box
type derived types. This requires expanding each BoxType mapping
from one into two maps and then modifying all of the existing
member indices of the overarching parent mapping to account for
the addition of these new members alongside adjusting the existing
member indices to support the addition of these new maps which
extend the original member indices (as a base address of a box type
is currently considered a member of the box type at a position of
0 as when lowered to LLVM-IR it's a pointer contained at this position
in the descriptor type, however, this means extending mapped children
of this expanded descriptor type to additionally incorporate the new
member index in the correct location in its own index list). I believe
there is a reasonable amount of comments that should aid in
understanding this better, alongside the test alterations for the pass.
A subset of the changes were also aimed at making some of the utilities
for packing and unpacking the DenseIntElementsAttr
containing the member indices shareable across the lowering and
OMPMapInfoFinalization, this required moving some functions to the
Lower/Support/Utils.h header, and transforming the lowering structure
containing the member index data into something more similar to the
version used in OMPMapInfoFinalization. There we also some other
attempts at tidying things up in relation to the member index data
generation in the lowering, some of which required creating a logical
operator for the OpenMP ID class so it can be utilised as a map key
(it simply utilises the symbol address for the moment as ordering
isn't particularly important).
Otherwise I have added a set of new tests encompassing some of
the mappings currently supported by this PR (unfortunately as
you can have arbitrary nestings of all shapes and types it's not
very feasible to cover them all).