@razvanlupusoru

Introduce a new attribute acc.specialized_routine to mark functions that have been specialized from a host function marked with acc.routine_info.

The new attribute captures:

  • A SymbolRefAttr referencing the original acc.routine operation
  • The parallelism level via the new ParLevel enum
  • The original function name (since specialized functions may be renamed)
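
As a rough illustration of the second field, the `ParLevel` cases can be modelled as a plain C++ enum outside of MLIR. The numeric values follow the `I32EnumAttrCase` definitions in this PR's `OpenACCOps.td` change; the helper names `parLevelKeyword` and `parseParLevel` are hypothetical and are not the functions TableGen generates.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Standalone model of the ParLevel cases. The values mirror the
// I32EnumAttrCase definitions added to OpenACCOps.td in this PR.
enum class ParLevel : int {
  seq = 0,
  gang_dim1 = 1, // plain `gang`
  gang_dim2 = 2, // gang(dim: 2)
  gang_dim3 = 3, // gang(dim: 3)
  worker = 4,
  vector = 5,
};

// Hypothetical helper: the keyword printed between `<` and `>`.
inline std::string_view parLevelKeyword(ParLevel level) {
  switch (level) {
  case ParLevel::seq:       return "seq";
  case ParLevel::gang_dim1: return "gang_dim1";
  case ParLevel::gang_dim2: return "gang_dim2";
  case ParLevel::gang_dim3: return "gang_dim3";
  case ParLevel::worker:    return "worker";
  case ParLevel::vector:    return "vector";
  }
  assert(false && "unknown ParLevel");
  return {};
}

// Hypothetical helper: maps a keyword back to a level, or returns an
// empty optional when the keyword is not one of the six cases.
inline std::optional<ParLevel> parseParLevel(std::string_view keyword) {
  for (int v = 0; v <= 5; ++v) {
    auto level = static_cast<ParLevel>(v);
    if (parLevelKeyword(level) == keyword)
      return level;
  }
  return std::nullopt;
}
```

Note that plain `gang` is not a keyword in this model: the default gang level is spelled `gang_dim1`.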

Example - before specialization:

acc.routine @routine_gang func(@foo) gang
acc.routine @routine_vector func(@foo) vector

func.func @foo() attributes {
  acc.routine_info = #acc.routine_info<[@routine_gang, @routine_vector]>
} { ... }

After specialization, there are three functions:
the original function and two specialized versions (one per parallelism level):

acc.routine @routine_gang func(@foo) gang
acc.routine @routine_vector func(@foo) vector

// Original function (unchanged)
func.func @foo() attributes {
  acc.routine_info = #acc.routine_info<[@routine_gang, @routine_vector]>
} { ... }

// Specialized for gang parallelism
func.func @foo_gang() attributes {
  acc.specialized_routine = #acc.specialized_routine<@routine_gang, <gang_dim1>, "foo">
} { ... }

// Specialized for vector parallelism
func.func @foo_vector() attributes {
  acc.specialized_routine = #acc.specialized_routine<@routine_vector, <vector>, "foo">
} { ... }
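
The before/after example can be sketched as a small standalone C++ model. Nothing here uses the MLIR API: `RoutineDecl`, `SpecializedFunc`, `specialize`, and `printAttr` are hypothetical names, and the `<host>_<suffix>` naming simply follows `@foo_gang` and `@foo_vector` above. The attribute records the original name precisely because real specialized names may differ.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One acc.routine declaration attached to the host function.
// Plain strings keep the sketch small; the real attribute stores a
// SymbolRefAttr, a ParLevelAttr, and a StringAttr.
struct RoutineDecl {
  std::string symbol;       // e.g. "routine_gang"
  std::string levelKeyword; // e.g. "gang_dim1"
  std::string suffix;       // e.g. "gang" (assumed naming scheme)
};

// What a specialized function carries after specialization.
struct SpecializedFunc {
  std::string name;         // e.g. "foo_gang"
  std::string routine;      // symbol of the originating acc.routine
  std::string level;        // parallelism level keyword
  std::string originalName; // host function name, e.g. "foo"
};

// Produces one specialized record per routine declaration.
inline std::vector<SpecializedFunc>
specialize(const std::string &hostName,
           const std::vector<RoutineDecl> &routines) {
  assert(!hostName.empty() && "host function name must be non-empty");
  std::vector<SpecializedFunc> out;
  out.reserve(routines.size());
  for (const RoutineDecl &r : routines)
    out.push_back({hostName + "_" + r.suffix, r.symbol, r.levelKeyword,
                   hostName});
  return out;
}

// Renders the attribute in the textual form used in the example.
inline std::string printAttr(const SpecializedFunc &f) {
  return "#acc.specialized_routine<@" + f.routine + ", <" + f.level +
         ">, \"" + f.originalName + "\">";
}
```

Calling `specialize("foo", ...)` with the gang and vector routines yields two records whose printed attributes match the ones shown for `@foo_gang` and `@foo_vector`; the host function itself is left out of the result because it is unchanged.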

@llvmbot

llvmbot commented Dec 4, 2025

@llvm/pr-subscribers-openacc

@llvm/pr-subscribers-mlir-openacc

Author: Razvan Lupusoru (razvanlupusoru)

Full diff: https://github.com/llvm/llvm-project/pull/170766.diff

4 Files Affected:

  • (modified) mlir/include/mlir/Dialect/OpenACC/OpenACC.h (+14-3)
  • (modified) mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td (+78)
  • (modified) mlir/lib/Dialect/OpenACC/Transforms/ACCImplicitDeclare.cpp (+3-1)
  • (modified) mlir/test/Dialect/OpenACC/ops.mlir (+53)
diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
index 252a78648dd74..84fbf2c3d936c 100644
--- a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
+++ b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
@@ -177,14 +177,25 @@ static constexpr StringLiteral getDeclareActionAttrName() {
 }
 
 static constexpr StringLiteral getRoutineInfoAttrName() {
-  return StringLiteral("acc.routine_info");
+  return RoutineInfoAttr::name;
 }
 
-/// Used to check whether the current operation is an `acc routine`
-inline bool isAccRoutineOp(mlir::Operation *op) {
+static constexpr StringLiteral getSpecializedRoutineAttrName() {
+  return SpecializedRoutineAttr::name;
+}
+
+/// Used to check whether the current operation is marked with
+/// `acc routine`. The operation passed in should be a function.
+inline bool isAccRoutine(mlir::Operation *op) {
   return op->hasAttr(mlir::acc::getRoutineInfoAttrName());
 }
 
+/// Used to check whether this is a specialized accelerator version of
+/// `acc routine` function.
+inline bool isSpecializedAccRoutine(mlir::Operation *op) {
+  return op->hasAttr(mlir::acc::getSpecializedRoutineAttrName());
+}
+
 static constexpr StringLiteral getFromDefaultClauseAttrName() {
   return StringLiteral("acc.from_default");
 }
diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
index 7a727bd7fb838..ba13a9c83a0b2 100644
--- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
+++ b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
@@ -152,6 +152,32 @@ def OpenACC_LoopParMode : I32EnumAttr<
   let genSpecializedAttr = 0;
 }
 
+// Parallelism level (gang/worker/vector/seq).
+// GangDim1 is the default gang level (equivalent to just "gang").
+// GangDim2/GangDim3 are for gang(dim:2) and gang(dim:3).
+def OpenACC_ParLevelSeq      : I32EnumAttrCase<"seq", 0>;
+def OpenACC_ParLevelGangDim1 : I32EnumAttrCase<"gang_dim1", 1>;
+def OpenACC_ParLevelGangDim2 : I32EnumAttrCase<"gang_dim2", 2>;
+def OpenACC_ParLevelGangDim3 : I32EnumAttrCase<"gang_dim3", 3>;
+def OpenACC_ParLevelWorker   : I32EnumAttrCase<"worker", 4>;
+def OpenACC_ParLevelVector   : I32EnumAttrCase<"vector", 5>;
+
+def OpenACC_ParLevel : I32EnumAttr<"ParLevel",
+    "Parallelism level (gang/worker/vector/seq)",
+    [OpenACC_ParLevelSeq,
+     OpenACC_ParLevelGangDim1, OpenACC_ParLevelGangDim2,
+     OpenACC_ParLevelGangDim3,
+     OpenACC_ParLevelWorker, OpenACC_ParLevelVector]> {
+  let genSpecializedAttr = 0;
+  let cppNamespace = "::mlir::acc";
+}
+
+def OpenACC_ParLevelAttr : EnumAttr<OpenACC_Dialect,
+                                    OpenACC_ParLevel,
+                                    "par_level"> {
+  let assemblyFormat = [{ ```<` $value `>` }];
+}
+
 def OpenACC_PrivateRecipe : I32EnumAttrCase<"private_recipe", 0>;
 def OpenACC_FirstprivateRecipe : I32EnumAttrCase<"firstprivate_recipe", 1>;
 def OpenACC_ReductionRecipe : I32EnumAttrCase<"reduction_recipe", 2>;
@@ -3336,6 +3362,58 @@ def RoutineInfoAttr : OpenACC_Attr<"RoutineInfo", "routine_info"> {
   let assemblyFormat = "`<` `[` `` $accRoutines `]` `>`";
 }
 
+def SpecializedRoutineAttr : OpenACC_Attr<"SpecializedRoutine",
+                                          "specialized_routine"> {
+  let summary = "Marks a specialized device version of an acc routine";
+
+  let description = [{
+    This attribute is attached to a function that was specialized from a host
+    function marked with `acc.routine_info`. It captures the parallelism level,
+    a reference to the original `acc.routine` operation, and the original
+    function name (since the specialized function may be renamed).
+
+    Example - before specialization:
+    ```mlir
+    acc.routine @routine_gang func(@foo) gang
+    acc.routine @routine_vector func(@foo) vector
+
+    func.func @foo() attributes {
+      acc.routine_info = #acc.routine_info<[@routine_gang, @routine_vector]>
+    } { ... }
+    ```
+
+    After specialization, there are three functions: the original function and
+    two specialized versions (one per parallelism level):
+    ```mlir
+    acc.routine @routine_gang func(@foo) gang
+    acc.routine @routine_vector func(@foo) vector
+
+    // Original function (unchanged)
+    func.func @foo() attributes {
+      acc.routine_info = #acc.routine_info<[@routine_gang, @routine_vector]>
+    } { ... }
+
+    // Specialized for gang parallelism
+    func.func @foo_gang() attributes {
+      acc.specialized_routine = #acc.specialized_routine<@routine_gang, <gang_dim1>, "foo">
+    } { ... }
+
+    // Specialized for vector parallelism
+    func.func @foo_vector() attributes {
+      acc.specialized_routine = #acc.specialized_routine<@routine_vector, <vector>, "foo">
+    } { ... }
+    ```
+  }];
+
+  let parameters = (ins
+    "SymbolRefAttr":$routine,
+    "ParLevelAttr":$level,
+    "StringAttr":$funcName
+  );
+
+  let assemblyFormat = "`<` $routine `,` $level `,` $funcName `>`";
+}
+
 //===----------------------------------------------------------------------===//
 // 2.14.1. Init Directive
 //===----------------------------------------------------------------------===//
diff --git a/mlir/lib/Dialect/OpenACC/Transforms/ACCImplicitDeclare.cpp b/mlir/lib/Dialect/OpenACC/Transforms/ACCImplicitDeclare.cpp
index 766f690e21459..8cab2234ec370 100644
--- a/mlir/lib/Dialect/OpenACC/Transforms/ACCImplicitDeclare.cpp
+++ b/mlir/lib/Dialect/OpenACC/Transforms/ACCImplicitDeclare.cpp
@@ -360,7 +360,9 @@ class ACCImplicitDeclare
                     accOp.getRegion(), globalsToAccDeclare, accSupport, symTab);
               })
           .Case<FunctionOpInterface>([&](auto func) {
-            if (acc::isAccRoutineOp(func) && !func.isExternal())
+            if ((acc::isAccRoutine(func) ||
+                 acc::isSpecializedAccRoutine(func)) &&
+                !func.isExternal())
               collectGlobalsFromDeviceRegion(func.getFunctionBody(),
                                              globalsToAccDeclare, accSupport,
                                              symTab);
diff --git a/mlir/test/Dialect/OpenACC/ops.mlir b/mlir/test/Dialect/OpenACC/ops.mlir
index 5a1c20bcf5a24..d31397c15769b 100644
--- a/mlir/test/Dialect/OpenACC/ops.mlir
+++ b/mlir/test/Dialect/OpenACC/ops.mlir
@@ -1810,6 +1810,59 @@ acc.routine @acc_func_rout9 func(@acc_func) bind("acc_func_gpu_gang_dim1") gang(
 
 // -----
 
+// Test acc.specialized_routine attribute for specialized device functions
+acc.routine @routine_seq func(@device_func_seq) seq
+acc.routine @routine_gang func(@device_func_gang) gang
+acc.routine @routine_gang_dim2 func(@device_func_gang_dim2) gang(dim: 2 : i64)
+acc.routine @routine_gang_dim3 func(@device_func_gang_dim3) gang(dim: 3 : i64)
+acc.routine @routine_worker func(@device_func_worker) worker
+acc.routine @routine_vector func(@device_func_vector) vector
+
+func.func @device_func_seq() attributes {acc.specialized_routine = #acc.specialized_routine<@routine_seq, <seq>, "host_func_seq">} {
+  return
+}
+
+func.func @device_func_gang() attributes {acc.specialized_routine = #acc.specialized_routine<@routine_gang, <gang_dim1>, "host_func_gang">} {
+  return
+}
+
+func.func @device_func_gang_dim2() attributes {acc.specialized_routine = #acc.specialized_routine<@routine_gang_dim2, <gang_dim2>, "host_func_gang_dim2">} {
+  return
+}
+
+func.func @device_func_gang_dim3() attributes {acc.specialized_routine = #acc.specialized_routine<@routine_gang_dim3, <gang_dim3>, "host_func_gang_dim3">} {
+  return
+}
+
+func.func @device_func_worker() attributes {acc.specialized_routine = #acc.specialized_routine<@routine_worker, <worker>, "host_func_worker">} {
+  return
+}
+
+func.func @device_func_vector() attributes {acc.specialized_routine = #acc.specialized_routine<@routine_vector, <vector>, "host_func_vector">} {
+  return
+}
+
+// CHECK: acc.routine @routine_seq func(@device_func_seq) seq
+// CHECK: acc.routine @routine_gang func(@device_func_gang) gang
+// CHECK: acc.routine @routine_gang_dim2 func(@device_func_gang_dim2) gang(dim: 2 : i64)
+// CHECK: acc.routine @routine_gang_dim3 func(@device_func_gang_dim3) gang(dim: 3 : i64)
+// CHECK: acc.routine @routine_worker func(@device_func_worker) worker
+// CHECK: acc.routine @routine_vector func(@device_func_vector) vector
+// CHECK-LABEL: func.func @device_func_seq()
+// CHECK: attributes {acc.specialized_routine = #acc.specialized_routine<@routine_seq, <seq>, "host_func_seq">}
+// CHECK-LABEL: func.func @device_func_gang()
+// CHECK: attributes {acc.specialized_routine = #acc.specialized_routine<@routine_gang, <gang_dim1>, "host_func_gang">}
+// CHECK-LABEL: func.func @device_func_gang_dim2()
+// CHECK: attributes {acc.specialized_routine = #acc.specialized_routine<@routine_gang_dim2, <gang_dim2>, "host_func_gang_dim2">}
+// CHECK-LABEL: func.func @device_func_gang_dim3()
+// CHECK: attributes {acc.specialized_routine = #acc.specialized_routine<@routine_gang_dim3, <gang_dim3>, "host_func_gang_dim3">}
+// CHECK-LABEL: func.func @device_func_worker()
+// CHECK: attributes {acc.specialized_routine = #acc.specialized_routine<@routine_worker, <worker>, "host_func_worker">}
+// CHECK-LABEL: func.func @device_func_vector()
+// CHECK: attributes {acc.specialized_routine = #acc.specialized_routine<@routine_vector, <vector>, "host_func_vector">}
+
+// -----
+
 func.func @acc_func() -> () {
   "test.openacc_dummy_op"() {acc.declare_action = #acc.declare_action<postAlloc = @_QMacc_declareFacc_declare_allocateEa_acc_declare_update_desc_post_alloc>} : () -> ()
   return
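
The `ACCImplicitDeclare.cpp` hunk widens the condition under which a function body is scanned for globals. A minimal standalone sketch of that predicate follows, assuming a hypothetical `FuncModel` stand-in for a function op and a hypothetical `shouldCollectGlobals` wrapper; the real helpers take `mlir::Operation *` and call `hasAttr`.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for a function op: the names of its attributes
// plus a flag for declarations that have no body.
struct FuncModel {
  std::set<std::string> attrs;
  bool isExternal;
};

// Host function marked with acc.routine_info.
inline bool isAccRoutine(const FuncModel &f) {
  return f.attrs.count("acc.routine_info") > 0;
}

// Specialized device version marked with acc.specialized_routine.
inline bool isSpecializedAccRoutine(const FuncModel &f) {
  return f.attrs.count("acc.specialized_routine") > 0;
}

// Same shape as the updated condition in the pass: either kind of
// routine qualifies, but only when the function has a body.
inline bool shouldCollectGlobals(const FuncModel &f) {
  return (isAccRoutine(f) || isSpecializedAccRoutine(f)) && !f.isExternal;
}
```

Before this change only the first attribute was checked, so a specialized function carrying just `acc.specialized_routine` would have been skipped.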

@llvmbot

llvmbot commented Dec 4, 2025

@llvm/pr-subscribers-mlir

Author: Razvan Lupusoru (razvanlupusoru)


@clementval clementval left a comment

LGTM

@razvanlupusoru razvanlupusoru merged commit 834b8b7 into llvm:main Dec 4, 2025
12 of 13 checks passed
@llvm-ci

llvm-ci commented Dec 5, 2025

LLVM Buildbot has detected a new failure on builder ppc64le-mlir-rhel-clang running on ppc64le-mlir-rhel-test while building mlir at step 6 "test-build-check-mlir-build-only-check-mlir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/129/builds/34408

Here is the relevant piece of the build log for reference:
Step 6 (test-build-check-mlir-build-only-check-mlir) failure: 1200 seconds without output running [b'ninja', b'check-mlir'], attempting to kill
...
PASS: MLIR :: Pass/pipeline-options-parsing.mlir (3712 of 3723)
PASS: MLIR :: mlir-tblgen/op-error.td (3713 of 3723)
PASS: MLIR-Unit :: IR/./MLIRIRTests/0/130 (3714 of 3723)
PASS: MLIR-Unit :: IR/./MLIRIRTests/39/130 (3715 of 3723)
PASS: MLIR-Unit :: IR/./MLIRIRTests/38/130 (3716 of 3723)
PASS: MLIR-Unit :: Interfaces/./MLIRInterfacesTests/11/22 (3717 of 3723)
PASS: MLIR-Unit :: Interfaces/./MLIRInterfacesTests/13/22 (3718 of 3723)
PASS: MLIR-Unit :: Pass/./MLIRPassTests/10/13 (3719 of 3723)
PASS: MLIR-Unit :: Interfaces/./MLIRInterfacesTests/12/22 (3720 of 3723)
PASS: MLIR :: mlir-reduce/dce-test.mlir (3721 of 3723)
command timed out: 1200 seconds without output running [b'ninja', b'check-mlir'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=3275.356242

honeygoyal pushed a commit to honeygoyal/llvm-project that referenced this pull request Dec 9, 2025