-
Notifications
You must be signed in to change notification settings - Fork 12k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[acc] Initial implementation of MemoryEffects on acc
operations
#75970
Conversation
The `acc` dialect operations now implement MemoryEffects interfaces in the following ways: - Data entry operations which may read host memory via `varPtr` are now marked as so. The majority of them do NOT actually read the host memory. For example, `acc.present` works on the basis of presence of pointer and not necessarily what the data points to - so they are not marked as reading the host memory. They still use `varPtr` though but this dependency is reflected through ssa. - Data clause operations which may mutate the data pointed to by `accPtr` are marked as doing so. - Data clause operations which update required structured or dynamic runtime counters are marked as reading and writing the newly defined `RuntimeCounters` resource. Some operations, like `acc.getdeviceptr` do not actually use the runtime counters - but are marked as reading them since the address obtained depends on the mapping operations which do update the runtime counters. Namely, `acc.getdeviceptr` cannot be moved across other mapping operations. - Constructs are marked as writing to the `ConstructResource`. This may be too strict but is needed for the following reasons: 1) Structured constructs may not use `accPtr` and instead use `varPtr` - when this is the case, data actions may be removed even when used. 2) Unstructured constructs are currently used to aggregate multiple data actions. We do not want such constructs removed or moved for now. - Terminators are marked as `Pure` as in other dialects. The current approach has the following limitations which may require further improvements: - Subsequent `acc.copyin` operations on same data do not actually read host memory pointed to by `varPtr` but are still marked as so. - Two `acc.delete` operations on same data may not mutate `accPtr` until the runtime counters are zero (but are still marked as mutating). - The `varPtrPtr` argument, when present, points to the address of location of `varPtr`. When mapping to target device, an `accPtrPtr` needs computed and this memory is mutated. This effect is not captured since the current operations do not produce `accPtrPtr`. - Runtime counter effects are imprecise since two operations with differing `varPtr` increment/decrement different counters. Additionally, operations with `varPtrPtr` mutate attachment counters. - The `ConstructResource` is too strict and likely can be relaxed with better modeling.
@llvm/pr-subscribers-openacc @llvm/pr-subscribers-mlir Author: Razvan Lupusoru (razvanlupusoru) ChangesThe
The current approach has the following limitations which may require further improvements:
Patch is 39.08 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/75970.diff 4 Files Affected:
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index 75432db33a790d..fae54eefb02f70 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -218,14 +218,18 @@ static void createDeclareDeallocFuncWithArg(
builder.create<mlir::acc::DeclareExitOp>(
loc, mlir::Value{}, mlir::ValueRange(entryOp.getAccPtr()));
- mlir::Value varPtr;
if constexpr (std::is_same_v<ExitOp, mlir::acc::CopyoutOp> ||
std::is_same_v<ExitOp, mlir::acc::UpdateHostOp>)
- varPtr = entryOp.getVarPtr();
- builder.create<ExitOp>(entryOp.getLoc(), entryOp.getAccPtr(), varPtr,
- entryOp.getBounds(), entryOp.getDataClause(),
- /*structured=*/false, /*implicit=*/false,
- builder.getStringAttr(*entryOp.getName()));
+ builder.create<ExitOp>(entryOp.getLoc(), entryOp.getAccPtr(),
+ entryOp.getVarPtr(), entryOp.getBounds(),
+ entryOp.getDataClause(),
+ /*structured=*/false, /*implicit=*/false,
+ builder.getStringAttr(*entryOp.getName()));
+ else
+ builder.create<ExitOp>(entryOp.getLoc(), entryOp.getAccPtr(),
+ entryOp.getBounds(), entryOp.getDataClause(),
+ /*structured=*/false, /*implicit=*/false,
+ builder.getStringAttr(*entryOp.getName()));
// Generate the post dealloc function.
modBuilder.setInsertionPointAfter(preDeallocOp);
@@ -368,14 +372,17 @@ static void genDataExitOperations(fir::FirOpBuilder &builder,
for (mlir::Value operand : operands) {
auto entryOp = mlir::dyn_cast_or_null<EntryOp>(operand.getDefiningOp());
assert(entryOp && "data entry op expected");
- mlir::Value varPtr;
if constexpr (std::is_same_v<ExitOp, mlir::acc::CopyoutOp> ||
std::is_same_v<ExitOp, mlir::acc::UpdateHostOp>)
- varPtr = entryOp.getVarPtr();
- builder.create<ExitOp>(entryOp.getLoc(), entryOp.getAccPtr(), varPtr,
- entryOp.getBounds(), entryOp.getDataClause(),
- structured, entryOp.getImplicit(),
- builder.getStringAttr(*entryOp.getName()));
+ builder.create<ExitOp>(
+ entryOp.getLoc(), entryOp.getAccPtr(), entryOp.getVarPtr(),
+ entryOp.getBounds(), entryOp.getDataClause(), structured,
+ entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName()));
+ else
+ builder.create<ExitOp>(entryOp.getLoc(), entryOp.getAccPtr(),
+ entryOp.getBounds(), entryOp.getDataClause(),
+ structured, entryOp.getImplicit(),
+ builder.getStringAttr(*entryOp.getName()));
}
}
@@ -2840,9 +2847,8 @@ static void createDeclareGlobalOp(mlir::OpBuilder &modBuilder,
else
builder.create<DeclareOp>(loc, mlir::Value{},
mlir::ValueRange(entryOp.getAccPtr()));
- mlir::Value varPtr;
if constexpr (std::is_same_v<GlobalOp, mlir::acc::GlobalDestructorOp>) {
- builder.create<ExitOp>(entryOp.getLoc(), entryOp.getAccPtr(), varPtr,
+ builder.create<ExitOp>(entryOp.getLoc(), entryOp.getAccPtr(),
entryOp.getBounds(), entryOp.getDataClause(),
/*structured=*/false, /*implicit=*/false,
builder.getStringAttr(*entryOp.getName()));
@@ -2930,14 +2936,18 @@ static void createDeclareDeallocFunc(mlir::OpBuilder &modBuilder,
builder.create<mlir::acc::DeclareExitOp>(
loc, mlir::Value{}, mlir::ValueRange(entryOp.getAccPtr()));
- mlir::Value varPtr;
if constexpr (std::is_same_v<ExitOp, mlir::acc::CopyoutOp> ||
std::is_same_v<ExitOp, mlir::acc::UpdateHostOp>)
- varPtr = entryOp.getVarPtr();
- builder.create<ExitOp>(entryOp.getLoc(), entryOp.getAccPtr(), varPtr,
- entryOp.getBounds(), entryOp.getDataClause(),
- /*structured=*/false, /*implicit=*/false,
- builder.getStringAttr(*entryOp.getName()));
+ builder.create<ExitOp>(entryOp.getLoc(), entryOp.getAccPtr(),
+ entryOp.getVarPtr(), entryOp.getBounds(),
+ entryOp.getDataClause(),
+ /*structured=*/false, /*implicit=*/false,
+ builder.getStringAttr(*entryOp.getName()));
+ else
+ builder.create<ExitOp>(entryOp.getLoc(), entryOp.getAccPtr(),
+ entryOp.getBounds(), entryOp.getDataClause(),
+ /*structured=*/false, /*implicit=*/false,
+ builder.getStringAttr(*entryOp.getName()));
// Generate the post dealloc function.
modBuilder.setInsertionPointAfter(preDeallocOp);
diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
index 4dc94782c1c9b5..36daf8de235f34 100644
--- a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
+++ b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
@@ -46,14 +46,22 @@
mlir::acc::UseDeviceOp, mlir::acc::ReductionOp, \
mlir::acc::DeclareDeviceResidentOp, mlir::acc::DeclareLinkOp, \
mlir::acc::CacheOp
+#define ACC_DATA_EXIT_OPS \
+ mlir::acc::CopyoutOp, mlir::acc::DeleteOp, mlir::acc::DetachOp, \
+ mlir::acc::UpdateHostOp
+#define ACC_DATA_CLAUSE_OPS ACC_DATA_ENTRY_OPS, ACC_DATA_EXIT_OPS
#define ACC_COMPUTE_CONSTRUCT_OPS \
mlir::acc::ParallelOp, mlir::acc::KernelsOp, mlir::acc::SerialOp
#define ACC_COMPUTE_CONSTRUCT_AND_LOOP_OPS \
ACC_COMPUTE_CONSTRUCT_OPS, mlir::acc::LoopOp
+#define OPENACC_DATA_CONSTRUCT_STRUCTURED_OPS \
+ mlir::acc::DataOp, mlir::acc::DeclareOp
+#define ACC_DATA_CONSTRUCT_UNSTRUCTURED_OPS \
+ mlir::acc::EnterDataOp, mlir::acc::ExitDataOp, mlir::acc::UpdateOp, \
+ mlir::acc::HostDataOp, mlir::acc::DeclareEnterOp, \
+ mlir::acc::DeclareExitOp
#define ACC_DATA_CONSTRUCT_OPS \
- mlir::acc::DataOp, mlir::acc::EnterDataOp, mlir::acc::ExitDataOp, \
- mlir::acc::UpdateOp, mlir::acc::HostDataOp, mlir::acc::DeclareEnterOp, \
- mlir::acc::DeclareExitOp, mlir::acc::DeclareOp
+ OPENACC_DATA_CONSTRUCT_STRUCTURED_OPS, ACC_DATA_CONSTRUCT_UNSTRUCTURED_OPS
#define ACC_COMPUTE_AND_DATA_CONSTRUCT_OPS \
ACC_COMPUTE_CONSTRUCT_OPS, ACC_DATA_CONSTRUCT_OPS
#define ACC_COMPUTE_LOOP_AND_DATA_CONSTRUCT_OPS \
@@ -73,9 +81,27 @@ namespace acc {
/// combined and the final mapping value would be 5 (4 | 1).
enum OpenACCExecMapping { NONE = 0, VECTOR = 1, WORKER = 2, GANG = 4 };
-/// Used to obtain the `varPtr` from a data entry operation.
-/// Returns empty value if not a data entry operation.
-mlir::Value getVarPtr(mlir::Operation *accDataEntryOp);
+/// Used to obtain the `varPtr` from a data clause operation.
+/// Returns empty value if not a data clause operation or is a data exit
+/// operation with no `varPtr`.
+mlir::Value getVarPtr(mlir::Operation *accDataClauseOp);
+
+/// Used to obtain the `accPtr` from a data clause operation.
+/// When a data entry operation, it obtains its result `accPtr` value.
+/// If a data exit operation, it obtains its operand `accPtr` value.
+/// Returns empty value if not a data clause operation.
+mlir::Value getAccPtr(mlir::Operation *accDataClauseOp);
+
+/// Used to obtain the `varPtrPtr` from a data clause operation.
+/// Returns empty value if not a data clause operation.
+mlir::Value getVarPtrPtr(mlir::Operation *accDataClauseOp);
+
+/// Used to obtain `bounds` from an acc data clause operation.
+/// Returns an empty vector if there are no bounds.
+mlir::SmallVector<mlir::Value> getBounds(mlir::Operation *accDataClauseOp);
+
+/// Used to obtain the `name` from an acc operation.
+std::optional<llvm::StringRef> getVarName(mlir::Operation *accOp);
/// Used to obtain the `dataClause` from a data entry operation.
/// Returns empty optional if not a data entry operation.
@@ -87,6 +113,12 @@ getDataClause(mlir::Operation *accDataEntryOp);
/// implicit flag.
bool getImplicitFlag(mlir::Operation *accDataEntryOp);
+/// Used to get an immutable range iterating over the data operands.
+mlir::ValueRange getDataOperands(mlir::Operation *accOp);
+
+/// Used to get a mutable range iterating over the data operands.
+mlir::MutableOperandRange getMutableDataOperands(mlir::Operation *accOp);
+
/// Used to obtain the attribute name for declare.
static constexpr StringLiteral getDeclareAttrName() {
return StringLiteral("acc.declare");
@@ -100,6 +132,16 @@ static constexpr StringLiteral getRoutineInfoAttrName() {
return StringLiteral("acc.routine_info");
}
+struct RuntimeCounters
+ : public mlir::SideEffects::Resource::Base<RuntimeCounters> {
+ mlir::StringRef getName() final { return "AccRuntimeCounters"; }
+};
+
+struct ConstructResource
+ : public mlir::SideEffects::Resource::Base<ConstructResource> {
+ mlir::StringRef getName() final { return "AccConstructResource"; }
+};
+
} // namespace acc
} // namespace mlir
diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
index 9d48b1f1c3f9af..4abc0d62dbfdae 100644
--- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
+++ b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
@@ -179,6 +179,15 @@ def OpenACC_DeviceTypeAttr : EnumAttr<OpenACC_Dialect,
let assemblyFormat = [{ ```<` $value `>` }];
}
+// Define a resource for the OpenACC runtime counters.
+def OpenACC_RuntimeCounters : Resource<"::mlir::acc::RuntimeCounters">;
+
+// Define a resource for the OpenACC constructs.
+// Useful to ensure that the constructs are not removed (even though
+// the data semantics are encoded in the operations linked via their
+// `dataOperands` list).
+def OpenACC_ConstructResource : Resource<"::mlir::acc::ConstructResource">;
+
// Used for data specification in data clauses (2.7.1).
// Either (or both) extent and upperbound must be specified.
def OpenACC_DataBoundsOp : OpenACC_Op<"bounds",
@@ -250,18 +259,18 @@ def OpenACC_DataBoundsOp : OpenACC_Op<"bounds",
//
// The bounds are represented in rank order. Rank 0 (inner-most dimension) is
// the first.
+//
class OpenACC_DataEntryOp<string mnemonic, string clause, string extraDescription,
- list<Trait> traits = []> :
+ list<Trait> traits = [], dag additionalArgs = (ins)> :
OpenACC_Op<mnemonic, !listconcat(traits,
[AttrSizedOperandSegments])> {
- let arguments = (ins OpenACC_PointerLikeTypeInterface:$varPtr,
- Optional<OpenACC_PointerLikeTypeInterface>:$varPtrPtr,
+ let arguments = !con(additionalArgs,
+ (ins Optional<OpenACC_PointerLikeTypeInterface>:$varPtrPtr,
Variadic<OpenACC_DataBoundsType>:$bounds, /* rank-0 to rank-{n-1} */
DefaultValuedAttr<OpenACC_DataClauseAttr,clause>:$dataClause,
DefaultValuedAttr<BoolAttr, "true">:$structured,
DefaultValuedAttr<BoolAttr, "false">:$implicit,
- OptionalAttr<StrAttr>:$name);
- let results = (outs OpenACC_PointerLikeTypeInterface:$accPtr);
+ OptionalAttr<StrAttr>:$name));
let description = !strconcat(extraDescription, [{
Description of arguments:
@@ -299,50 +308,73 @@ class OpenACC_DataEntryOp<string mnemonic, string clause, string extraDescriptio
// 2.5.13 private clause
//===----------------------------------------------------------------------===//
def OpenACC_PrivateOp : OpenACC_DataEntryOp<"private",
- "mlir::acc::DataClause::acc_private", ""> {
+ "mlir::acc::DataClause::acc_private", "", [],
+ (ins OpenACC_PointerLikeTypeInterface:$varPtr)> {
let summary = "Represents private semantics for acc private clause.";
+ let results = (outs Arg<OpenACC_PointerLikeTypeInterface,
+ "Address of device variable",[MemWrite]>:$accPtr);
}
//===----------------------------------------------------------------------===//
// 2.5.14 firstprivate clause
//===----------------------------------------------------------------------===//
def OpenACC_FirstprivateOp : OpenACC_DataEntryOp<"firstprivate",
- "mlir::acc::DataClause::acc_firstprivate", ""> {
+ "mlir::acc::DataClause::acc_firstprivate", "", [],
+ (ins Arg<OpenACC_PointerLikeTypeInterface,"Address of variable",[MemRead]>:$varPtr)> {
let summary = "Represents firstprivate semantic for the acc firstprivate "
"clause.";
+ let results = (outs Arg<OpenACC_PointerLikeTypeInterface,
+ "Address of device variable",[MemWrite]>:$accPtr);
}
//===----------------------------------------------------------------------===//
// 2.5.15 reduction clause
//===----------------------------------------------------------------------===//
def OpenACC_ReductionOp : OpenACC_DataEntryOp<"reduction",
- "mlir::acc::DataClause::acc_reduction", ""> {
+ "mlir::acc::DataClause::acc_reduction", "",
+ [MemoryEffects<[MemRead<OpenACC_RuntimeCounters>,
+ MemWrite<OpenACC_RuntimeCounters>]>],
+ (ins Arg<OpenACC_PointerLikeTypeInterface,"Address of variable",[MemRead]>:$varPtr)> {
let summary = "Represents reduction semantics for acc reduction clause.";
+ let results = (outs Arg<OpenACC_PointerLikeTypeInterface,
+ "Address of device variable",[MemWrite]>:$accPtr);
}
//===----------------------------------------------------------------------===//
// 2.7.4 deviceptr clause
//===----------------------------------------------------------------------===//
def OpenACC_DevicePtrOp : OpenACC_DataEntryOp<"deviceptr",
- "mlir::acc::DataClause::acc_deviceptr", ""> {
+ "mlir::acc::DataClause::acc_deviceptr", "",
+ [MemoryEffects<[MemRead<OpenACC_RuntimeCounters>]>],
+ (ins OpenACC_PointerLikeTypeInterface:$varPtr)> {
let summary = "Specifies that the variable pointer is a device pointer.";
+ let results = (outs OpenACC_PointerLikeTypeInterface:$accPtr);
}
//===----------------------------------------------------------------------===//
// 2.7.5 present clause
//===----------------------------------------------------------------------===//
def OpenACC_PresentOp : OpenACC_DataEntryOp<"present",
- "mlir::acc::DataClause::acc_present", ""> {
+ "mlir::acc::DataClause::acc_present", "",
+ [MemoryEffects<[MemRead<OpenACC_RuntimeCounters>,
+ MemWrite<OpenACC_RuntimeCounters>]>],
+ (ins OpenACC_PointerLikeTypeInterface:$varPtr)> {
let summary = "Specifies that the variable is already present on device.";
+ let results = (outs OpenACC_PointerLikeTypeInterface:$accPtr);
}
//===----------------------------------------------------------------------===//
// 2.7.7 copyin clause
//===----------------------------------------------------------------------===//
def OpenACC_CopyinOp : OpenACC_DataEntryOp<"copyin",
- "mlir::acc::DataClause::acc_copyin", ""> {
+ "mlir::acc::DataClause::acc_copyin", "",
+ [MemoryEffects<[MemRead<OpenACC_RuntimeCounters>,
+ MemWrite<OpenACC_RuntimeCounters>]>],
+ (ins Arg<OpenACC_PointerLikeTypeInterface,"Address of variable",[MemRead]>:$varPtr)> {
let summary = "Represents copyin semantics for acc data clauses like acc "
"copyin and acc copy.";
+ let results = (outs Arg<OpenACC_PointerLikeTypeInterface,
+ "Address of device variable",[MemWrite]>:$accPtr);
let extraClassDeclaration = [{
/// Check if this is a copyin with readonly modifier.
@@ -354,9 +386,14 @@ def OpenACC_CopyinOp : OpenACC_DataEntryOp<"copyin",
// 2.7.9 create clause
//===----------------------------------------------------------------------===//
def OpenACC_CreateOp : OpenACC_DataEntryOp<"create",
- "mlir::acc::DataClause::acc_create", ""> {
+ "mlir::acc::DataClause::acc_create", "",
+ [MemoryEffects<[MemRead<OpenACC_RuntimeCounters>,
+ MemWrite<OpenACC_RuntimeCounters>]>],
+ (ins OpenACC_PointerLikeTypeInterface:$varPtr)> {
let summary = "Represents create semantics for acc data clauses like acc "
"create and acc copyout.";
+ let results = (outs Arg<OpenACC_PointerLikeTypeInterface,
+ "Address of device variable",[MemWrite]>:$accPtr);
let extraClassDeclaration = [{
/// Check if this is a create with zero modifier.
@@ -368,18 +405,26 @@ def OpenACC_CreateOp : OpenACC_DataEntryOp<"create",
// 2.7.10 no_create clause
//===----------------------------------------------------------------------===//
def OpenACC_NoCreateOp : OpenACC_DataEntryOp<"nocreate",
- "mlir::acc::DataClause::acc_no_create", ""> {
+ "mlir::acc::DataClause::acc_no_create", "",
+ [MemoryEffects<[MemRead<OpenACC_RuntimeCounters>,
+ MemWrite<OpenACC_RuntimeCounters>]>],
+ (ins OpenACC_PointerLikeTypeInterface:$varPtr)> {
let summary = "Represents acc no_create semantics.";
+ let results = (outs OpenACC_PointerLikeTypeInterface:$accPtr);
}
//===----------------------------------------------------------------------===//
// 2.7.12 attach clause
//===----------------------------------------------------------------------===//
def OpenACC_AttachOp : OpenACC_DataEntryOp<"attach",
- "mlir::acc::DataClause::acc_attach", ""> {
+ "mlir::acc::DataClause::acc_attach", "",
+ [MemoryEffects<[MemRead<OpenACC_RuntimeCounters>,
+ MemWrite<OpenACC_RuntimeCounters>]>],
+ (ins Arg<OpenACC_PointerLikeTypeInterface,"Address of variable",[MemRead]>:$varPtr)> {
let summary = "Represents acc attach semantics which updates a pointer in "
"device memory with the corresponding device address of the "
"pointee.";
+ let results = (outs OpenACC_PointerLikeTypeInterface:$accPtr);
}
//===----------------------------------------------------------------------===//
@@ -397,8 +442,10 @@ def OpenACC_GetDevicePtrOp : OpenACC_DataEntryOp<"getdeviceptr",
operation is not visible. This operation can have a `dataClause` argument
that is any of the valid `mlir::acc::DataClause` entries.
\
- }]> {
+ }], [MemoryEffects<[MemRead<OpenACC_RuntimeCounters>]>],
+ (ins OpenACC_PointerLikeTypeInterface:$varPtr)> {
let summary = "Gets device address if variable exists on device.";
+ let results = (outs OpenACC_PointerLikeTypeInterface:$accPtr);
let hasVerifier = 0;
}
@@ -406,41 +453,55 @@ def OpenACC_GetDevicePtrOp : OpenACC_DataEntryOp<"getdeviceptr",
// 2.14.4 device clause
//===----------------------------------------------------------------------===//
def OpenACC_UpdateDeviceOp : OpenACC_DataEntryOp<"update_device",
- "mlir::acc::DataClause::acc_update_device", ""> {
+ "mlir::acc::DataClause::acc_update_device", "", [],
+ (ins Arg<OpenACC_PointerLikeTypeInterface,"Address of variable",[MemRead]>:$varPtr)> {
let summary = "Represents acc update device semantics.";
+ let results = (outs Arg<OpenACC_PointerLikeTypeInterface,
+ "Address of device variable",[MemWrite]>:$accPtr);
}
//===----------------------------------------------------------------------===//
// 2.8 use_device clause
//===----------------------------------------------------------------------===//
def OpenACC_UseDeviceOp : OpenACC_DataEntryOp<"use_device",
- "mlir::acc::DataClause::acc_use_device", ""> {
+ "mlir::acc::DataClause::acc_use_device", "",
+ [MemoryEffects<[MemRead<OpenACC_RuntimeCounters>]>],
+ (ins OpenACC_PointerLikeTypeInterface:$varPtr)> {
let summary = "Represents acc use_device semantics.";
+ let results = (outs OpenACC_PointerLikeTypeInterface:$accPtr);
}
//===----------------------------------------------------------------------===//
// 2.13.1 device_resident clause
//===---------------------------------------------------------------...
[truncated]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thanks for working on this. This is a very good start and we can iterate if needed.
I just have a small question on the private op.
let summary = "Represents private semantics for acc private clause."; | ||
let results = (outs Arg<OpenACC_PointerLikeTypeInterface, | ||
"Address of device variable",[MemWrite]>:$accPtr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Address of device variable",[MemWrite]>:$accPtr); | |
"Address of device variable",[MemAlloc]>:$accPtr); |
Can we use MemAlloc here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use MemAlloc here?
I was conflicted what to do about the memory effects on accPtr
for the private
operation. For firstprivate
it was more obvious because it gets written to. I agree that MemAlloc
makes sense - I was just hesitant because scope for such case is important. And current placement of the private
operation does not properly convey scope. So I think it is preferable to use MemWrite
here for now. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with MemWrite
for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thank you for making this happen, Razvan!
} | ||
|
||
//===----------------------------------------------------------------------===// | ||
// 2.5.15 reduction clause | ||
//===----------------------------------------------------------------------===// | ||
def OpenACC_ReductionOp : OpenACC_DataEntryOp<"reduction", | ||
"mlir::acc::DataClause::acc_reduction", ""> { | ||
"mlir::acc::DataClause::acc_reduction", "", | ||
[MemoryEffects<[MemRead<OpenACC_RuntimeCounters>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really required? I thought the data action is decomposed from the reduction clause as explicit copyin/copyout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really required? I thought the data action is decomposed from the reduction clause as explicit copyin/copyout.
Nice catch - I forgot that this is what was done. I currently modeled it as a copyin
but instead should be modeled like firstprivate
instead remembering this.
Since this operation gets decomposed to `acc.copyin` at entry to region, it does not need to model RuntimeCounters resource. So model it more like `acc.firstprivate`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the update!
The
acc
dialect operations now implement MemoryEffects interfaces in the following ways:varPtr
are now marked as so. The majority of them do NOT actually read the host memory. For example,acc.present
works on the basis of presence of pointer and not necessarily what the data points to - so they are not marked as reading the host memory. They still usevarPtr
though but this dependency is reflected through ssa.accPtr
are marked as doing so.RuntimeCounters
resource. Some operations, likeacc.getdeviceptr
do not actually use the runtime counters - but are marked as reading them since the address obtained depends on the mapping operations which do update the runtime counters. Namely,acc.getdeviceptr
cannot be moved across other mapping operations.ConstructResource
. This may be too strict but is needed for the following reasons: 1) Structured constructs may not useaccPtr
and instead usevarPtr
- when this is the case, data actions may be removed even when used. 2) Unstructured constructs are currently used to aggregate multiple data actions. We do not want such constructs removed or moved for now.Pure
as in other dialects.The current approach has the following limitations which may require further improvements:
acc.copyin
operations on same data do not actually read host memory pointed to byvarPtr
but are still marked as so.acc.delete
operations on same data may not mutateaccPtr
until the runtime counters are zero (but are still marked as mutating).varPtrPtr
argument, when present, points to the address of location ofvarPtr
. When mapping to target device, anaccPtrPtr
needs computed and this memory is mutated. This effect is not captured since the current operations do not produceaccPtrPtr
.varPtr
increment/decrement different counters. Additionally, operations withvarPtrPtr
mutate attachment counters.ConstructResource
is too strict and likely can be relaxed with better modeling.