[acc] Add acc.specialized_routine attribute #170766
Introduce a new attribute `acc.specialized_routine` to mark functions that have been specialized from a host function marked with `acc.routine_info`.

The new attribute captures:
- A SymbolRefAttr referencing the original `acc.routine` operation
- The parallelism level via the new `ParLevel` enum
- The original function name (since specialized functions may be renamed)

Example - before specialization:
```
acc.routine @routine_gang func(@foo) gang
acc.routine @routine_vector func(@foo) vector

func.func @foo() attributes {
  acc.routine_info = #acc.routine_info<[@routine_gang, @routine_vector]>
} { ... }
```

After specialization, there are three functions: the original function and two specialized versions (one per parallelism level):
```
acc.routine @routine_gang func(@foo) gang
acc.routine @routine_vector func(@foo) vector

// Original function (unchanged)
func.func @foo() attributes {
  acc.routine_info = #acc.routine_info<[@routine_gang, @routine_vector]>
} { ... }

// Specialized for gang parallelism
func.func @foo_gang() attributes {
  acc.specialized_routine = #acc.specialized_routine<@routine_gang, <gang_dim1>, "foo">
} { ... }

// Specialized for vector parallelism
func.func @foo_vector() attributes {
  acc.specialized_routine = #acc.specialized_routine<@routine_vector, <vector>, "foo">
} { ... }
```
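To make the three captured fields concrete, here is a minimal standalone C++ sketch that models the attribute as plain data and renders it in the textual form used in the examples above. It is not the MLIR API: `SpecializedRoutine`, `stringifyParLevel`, and `printSpecializedRoutine` are hypothetical names chosen for illustration, and only the enum keywords and numeric values are taken from the patch.

```cpp
#include <cassert>
#include <string>

// Standalone model only: these types borrow names from the PR but do not
// use or depend on the MLIR C++ API.
enum class ParLevel {
  seq = 0,
  gang_dim1 = 1, // plain "gang" is treated as gang dimension 1
  gang_dim2 = 2,
  gang_dim3 = 3,
  worker = 4,
  vector = 5
};

// Keyword for each level, matching the enum case strings in the patch.
inline std::string stringifyParLevel(ParLevel level) {
  switch (level) {
  case ParLevel::seq:       return "seq";
  case ParLevel::gang_dim1: return "gang_dim1";
  case ParLevel::gang_dim2: return "gang_dim2";
  case ParLevel::gang_dim3: return "gang_dim3";
  case ParLevel::worker:    return "worker";
  case ParLevel::vector:    return "vector";
  }
  assert(false && "unknown ParLevel");
  return "";
}

// Hypothetical plain-data stand-in for the three attribute parameters.
struct SpecializedRoutine {
  std::string routine;  // symbol name of the originating acc.routine op
  ParLevel level;       // parallelism level of this specialization
  std::string funcName; // name of the original, pre-specialization function
};

// Renders the record in the textual form shown in the PR description.
inline std::string printSpecializedRoutine(const SpecializedRoutine &attr) {
  return "#acc.specialized_routine<@" + attr.routine + ", <" +
         stringifyParLevel(attr.level) + ">, \"" + attr.funcName + "\">";
}
```

In this model, the record for `@foo_gang` would hold `routine_gang`, `ParLevel::gang_dim1`, and `foo`, so the original name stays recoverable even though the specialized function was renamed.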
@llvm/pr-subscribers-openacc @llvm/pr-subscribers-mlir-openacc

Author: Razvan Lupusoru (razvanlupusoru)

Full diff: https://github.com/llvm/llvm-project/pull/170766.diff

4 Files Affected:
diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
index 252a78648dd74..84fbf2c3d936c 100644
--- a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
+++ b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
@@ -177,14 +177,25 @@ static constexpr StringLiteral getDeclareActionAttrName() {
}
static constexpr StringLiteral getRoutineInfoAttrName() {
- return StringLiteral("acc.routine_info");
+ return RoutineInfoAttr::name;
}
-/// Used to check whether the current operation is an `acc routine`
-inline bool isAccRoutineOp(mlir::Operation *op) {
+static constexpr StringLiteral getSpecializedRoutineAttrName() {
+ return SpecializedRoutineAttr::name;
+}
+
+/// Used to check whether the current operation is marked with
+/// `acc routine`. The operation passed in should be a function.
+inline bool isAccRoutine(mlir::Operation *op) {
return op->hasAttr(mlir::acc::getRoutineInfoAttrName());
}
+/// Used to check whether this is a specialized accelerator version of
+/// `acc routine` function.
+inline bool isSpecializedAccRoutine(mlir::Operation *op) {
+ return op->hasAttr(mlir::acc::getSpecializedRoutineAttrName());
+}
+
static constexpr StringLiteral getFromDefaultClauseAttrName() {
return StringLiteral("acc.from_default");
}
diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
index 7a727bd7fb838..ba13a9c83a0b2 100644
--- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
+++ b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
@@ -152,6 +152,32 @@ def OpenACC_LoopParMode : I32EnumAttr<
let genSpecializedAttr = 0;
}
+// Parallelism level (gang/worker/vector/seq).
+// GangDim1 is the default gang level (equivalent to just "gang").
+// GangDim2/GangDim3 are for gang(dim:2) and gang(dim:3).
+def OpenACC_ParLevelSeq : I32EnumAttrCase<"seq", 0>;
+def OpenACC_ParLevelGangDim1 : I32EnumAttrCase<"gang_dim1", 1>;
+def OpenACC_ParLevelGangDim2 : I32EnumAttrCase<"gang_dim2", 2>;
+def OpenACC_ParLevelGangDim3 : I32EnumAttrCase<"gang_dim3", 3>;
+def OpenACC_ParLevelWorker : I32EnumAttrCase<"worker", 4>;
+def OpenACC_ParLevelVector : I32EnumAttrCase<"vector", 5>;
+
+def OpenACC_ParLevel : I32EnumAttr<"ParLevel",
+ "Parallelism level (gang/worker/vector/seq)",
+ [OpenACC_ParLevelSeq,
+ OpenACC_ParLevelGangDim1, OpenACC_ParLevelGangDim2,
+ OpenACC_ParLevelGangDim3,
+ OpenACC_ParLevelWorker, OpenACC_ParLevelVector]> {
+ let genSpecializedAttr = 0;
+ let cppNamespace = "::mlir::acc";
+}
+
+def OpenACC_ParLevelAttr : EnumAttr<OpenACC_Dialect,
+ OpenACC_ParLevel,
+ "par_level"> {
+ let assemblyFormat = [{ ```<` $value `>` }];
+}
+
def OpenACC_PrivateRecipe : I32EnumAttrCase<"private_recipe", 0>;
def OpenACC_FirstprivateRecipe : I32EnumAttrCase<"firstprivate_recipe", 1>;
def OpenACC_ReductionRecipe : I32EnumAttrCase<"reduction_recipe", 2>;
@@ -3336,6 +3362,58 @@ def RoutineInfoAttr : OpenACC_Attr<"RoutineInfo", "routine_info"> {
let assemblyFormat = "`<` `[` `` $accRoutines `]` `>`";
}
+def SpecializedRoutineAttr : OpenACC_Attr<"SpecializedRoutine",
+ "specialized_routine"> {
+ let summary = "Marks a specialized device version of an acc routine";
+
+ let description = [{
+ This attribute is attached to a function that was specialized from a host
+ function marked with `acc.routine_info`. It captures the parallelism level,
+ a reference to the original `acc.routine` operation, and the original
+ function name (since the specialized function may be renamed).
+
+ Example - before specialization:
+ ```mlir
+ acc.routine @routine_gang func(@foo) gang
+ acc.routine @routine_vector func(@foo) vector
+
+ func.func @foo() attributes {
+ acc.routine_info = #acc.routine_info<[@routine_gang, @routine_vector]>
+ } { ... }
+ ```
+
+ After specialization, there are three functions: the original function and
+ two specialized versions (one per parallelism level):
+ ```mlir
+ acc.routine @routine_gang func(@foo) gang
+ acc.routine @routine_vector func(@foo) vector
+
+ // Original function (unchanged)
+ func.func @foo() attributes {
+ acc.routine_info = #acc.routine_info<[@routine_gang, @routine_vector]>
+ } { ... }
+
+ // Specialized for gang parallelism
+ func.func @foo_gang() attributes {
+ acc.specialized_routine = #acc.specialized_routine<@routine_gang, <gang_dim1>, "foo">
+ } { ... }
+
+ // Specialized for vector parallelism
+ func.func @foo_vector() attributes {
+ acc.specialized_routine = #acc.specialized_routine<@routine_vector, <vector>, "foo">
+ } { ... }
+ ```
+ }];
+
+ let parameters = (ins
+ "SymbolRefAttr":$routine,
+ "ParLevelAttr":$level,
+ "StringAttr":$funcName
+ );
+
+ let assemblyFormat = "`<` $routine `,` $level `,` $funcName `>`";
+}
+
//===----------------------------------------------------------------------===//
// 2.14.1. Init Directive
//===----------------------------------------------------------------------===//
diff --git a/mlir/lib/Dialect/OpenACC/Transforms/ACCImplicitDeclare.cpp b/mlir/lib/Dialect/OpenACC/Transforms/ACCImplicitDeclare.cpp
index 766f690e21459..8cab2234ec370 100644
--- a/mlir/lib/Dialect/OpenACC/Transforms/ACCImplicitDeclare.cpp
+++ b/mlir/lib/Dialect/OpenACC/Transforms/ACCImplicitDeclare.cpp
@@ -360,7 +360,9 @@ class ACCImplicitDeclare
accOp.getRegion(), globalsToAccDeclare, accSupport, symTab);
})
.Case<FunctionOpInterface>([&](auto func) {
- if (acc::isAccRoutineOp(func) && !func.isExternal())
+ if ((acc::isAccRoutine(func) ||
+ acc::isSpecializedAccRoutine(func)) &&
+ !func.isExternal())
collectGlobalsFromDeviceRegion(func.getFunctionBody(),
globalsToAccDeclare, accSupport,
symTab);
diff --git a/mlir/test/Dialect/OpenACC/ops.mlir b/mlir/test/Dialect/OpenACC/ops.mlir
index 5a1c20bcf5a24..d31397c15769b 100644
--- a/mlir/test/Dialect/OpenACC/ops.mlir
+++ b/mlir/test/Dialect/OpenACC/ops.mlir
@@ -1810,6 +1810,59 @@ acc.routine @acc_func_rout9 func(@acc_func) bind("acc_func_gpu_gang_dim1") gang(
// -----
+// Test acc.specialized_routine attribute for specialized device functions
+acc.routine @routine_seq func(@device_func_seq) seq
+acc.routine @routine_gang func(@device_func_gang) gang
+acc.routine @routine_gang_dim2 func(@device_func_gang_dim2) gang(dim: 2 : i64)
+acc.routine @routine_gang_dim3 func(@device_func_gang_dim3) gang(dim: 3 : i64)
+acc.routine @routine_worker func(@device_func_worker) worker
+acc.routine @routine_vector func(@device_func_vector) vector
+
+func.func @device_func_seq() attributes {acc.specialized_routine = #acc.specialized_routine<@routine_seq, <seq>, "host_func_seq">} {
+ return
+}
+
+func.func @device_func_gang() attributes {acc.specialized_routine = #acc.specialized_routine<@routine_gang, <gang_dim1>, "host_func_gang">} {
+ return
+}
+
+func.func @device_func_gang_dim2() attributes {acc.specialized_routine = #acc.specialized_routine<@routine_gang_dim2, <gang_dim2>, "host_func_gang_dim2">} {
+ return
+}
+
+func.func @device_func_gang_dim3() attributes {acc.specialized_routine = #acc.specialized_routine<@routine_gang_dim3, <gang_dim3>, "host_func_gang_dim3">} {
+ return
+}
+
+func.func @device_func_worker() attributes {acc.specialized_routine = #acc.specialized_routine<@routine_worker, <worker>, "host_func_worker">} {
+ return
+}
+
+func.func @device_func_vector() attributes {acc.specialized_routine = #acc.specialized_routine<@routine_vector, <vector>, "host_func_vector">} {
+ return
+}
+
+// CHECK: acc.routine @routine_seq func(@device_func_seq) seq
+// CHECK: acc.routine @routine_gang func(@device_func_gang) gang
+// CHECK: acc.routine @routine_gang_dim2 func(@device_func_gang_dim2) gang(dim: 2 : i64)
+// CHECK: acc.routine @routine_gang_dim3 func(@device_func_gang_dim3) gang(dim: 3 : i64)
+// CHECK: acc.routine @routine_worker func(@device_func_worker) worker
+// CHECK: acc.routine @routine_vector func(@device_func_vector) vector
+// CHECK-LABEL: func.func @device_func_seq()
+// CHECK: attributes {acc.specialized_routine = #acc.specialized_routine<@routine_seq, <seq>, "host_func_seq">}
+// CHECK-LABEL: func.func @device_func_gang()
+// CHECK: attributes {acc.specialized_routine = #acc.specialized_routine<@routine_gang, <gang_dim1>, "host_func_gang">}
+// CHECK-LABEL: func.func @device_func_gang_dim2()
+// CHECK: attributes {acc.specialized_routine = #acc.specialized_routine<@routine_gang_dim2, <gang_dim2>, "host_func_gang_dim2">}
+// CHECK-LABEL: func.func @device_func_gang_dim3()
+// CHECK: attributes {acc.specialized_routine = #acc.specialized_routine<@routine_gang_dim3, <gang_dim3>, "host_func_gang_dim3">}
+// CHECK-LABEL: func.func @device_func_worker()
+// CHECK: attributes {acc.specialized_routine = #acc.specialized_routine<@routine_worker, <worker>, "host_func_worker">}
+// CHECK-LABEL: func.func @device_func_vector()
+// CHECK: attributes {acc.specialized_routine = #acc.specialized_routine<@routine_vector, <vector>, "host_func_vector">}
+
+// -----
+
func.func @acc_func() -> () {
"test.openacc_dummy_op"() {acc.declare_action = #acc.declare_action<postAlloc = @_QMacc_declareFacc_declare_allocateEa_acc_declare_update_desc_post_alloc>} : () -> ()
return
|
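The change to `ACCImplicitDeclare.cpp` widens the condition under which a function body is scanned for globals: it now accepts functions carrying either attribute, as long as they are not external. The following is a minimal standalone C++ sketch of that predicate, assuming a function can be reduced to a set of attribute names plus an external flag. `FuncModel` and `shouldCollectGlobals` are hypothetical names for illustration, and none of this uses the MLIR API.

```cpp
#include <set>
#include <string>

// Standalone model: a function is reduced to its attribute names and an
// "external" flag (declaration without a body). Not the MLIR API.
struct FuncModel {
  std::set<std::string> attrNames;
  bool isExternal;
};

// Models the renamed helper: true when the function carries
// `acc.routine_info`.
inline bool isAccRoutine(const FuncModel &func) {
  return func.attrNames.count("acc.routine_info") > 0;
}

// Models the new helper: true when the function carries
// `acc.specialized_routine`.
inline bool isSpecializedAccRoutine(const FuncModel &func) {
  return func.attrNames.count("acc.specialized_routine") > 0;
}

// Mirrors the widened condition from the diff: either kind of routine
// qualifies, but only if the function has a body to inspect.
inline bool shouldCollectGlobals(const FuncModel &func) {
  return (isAccRoutine(func) || isSpecializedAccRoutine(func)) &&
         !func.isExternal;
}
```

Under the previous condition, a specialized function such as `@foo_gang` (which carries only `acc.specialized_routine`) would not have been visited. With the widened check it is treated the same way as the original routine.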
@llvm/pr-subscribers-mlir

Author: Razvan Lupusoru (razvanlupusoru)
clementval
left a comment
LGTM
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/129/builds/34408