Skip to content

[OpenACC][CIR] Implement copyin/copyout/create lowering for compute/c… #145976

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 27, 2025

Conversation

erichkeane
Copy link
Collaborator

…ombined

This patch does the lowering of copyin (represented as a
acc.copyin/acc.delete), copyout (acc.create/acc.copyin), and create
(acc.create/acc.delete).

Additionally, it found a few problems with #144806, so it fixes those as well.

…ombined

This patch does the lowering of copyin (represented as a
    acc.copyin/acc.delete), copyout (acc.create/acc.copyin), and create
(acc.create/acc.delete).

Additionally, it found a few problems with llvm#144806, so it fixes those as
well.
@llvmbot llvmbot added clang Clang issues not falling into any other category mlir mlir:openacc openacc ClangIR Anything related to the ClangIR project labels Jun 26, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 26, 2025

@llvm/pr-subscribers-mlir-openacc
@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-openacc

@llvm/pr-subscribers-clang

Author: Erich Keane (erichkeane)

Changes

…ombined

This patch does the lowering of copyin (represented as a
acc.copyin/acc.delete), copyout (acc.create/acc.copyin), and create
(acc.create/acc.delete).

Additionally, it found a few problems with #144806, so it fixes those as well.


Patch is 31.54 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/145976.diff

5 Files Affected:

  • (modified) clang/lib/CIR/CodeGen/CIRGenOpenACCClause.cpp (+51)
  • (added) clang/test/CIR/CodeGenOpenACC/combined-copyin-copyout-create.c (+160)
  • (added) clang/test/CIR/CodeGenOpenACC/compute-copyin-copyout-create.c (+128)
  • (modified) mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp (+4-2)
  • (modified) mlir/test/Dialect/OpenACC/ops.mlir (+11-1)
diff --git a/clang/lib/CIR/CodeGen/CIRGenOpenACCClause.cpp b/clang/lib/CIR/CodeGen/CIRGenOpenACCClause.cpp
index 1454cee336a09..fe4145959b206 100644
--- a/clang/lib/CIR/CodeGen/CIRGenOpenACCClause.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenOpenACCClause.cpp
@@ -858,6 +858,57 @@ class OpenACCClauseCIREmitter final
     }
   }
 
+  void VisitCopyInClause(const OpenACCCopyInClause &clause) {
+    if constexpr (isOneOfTypes<OpTy, mlir::acc::ParallelOp, mlir::acc::SerialOp,
+                               mlir::acc::KernelsOp>) {
+      for (auto var : clause.getVarList())
+        addDataOperand<mlir::acc::CopyinOp, mlir::acc::DeleteOp>(
+            var, mlir::acc::DataClause::acc_copyin, clause.getModifierList(),
+            /*structured=*/true,
+            /*implicit=*/false);
+    } else if constexpr (isCombinedType<OpTy>) {
+      applyToComputeOp(clause);
+    } else {
+      // TODO: When we've implemented this for everything, switch this to an
+      // unreachable. data, declare, combined constructs remain.
+      return clauseNotImplemented(clause);
+    }
+  }
+
+  void VisitCopyOutClause(const OpenACCCopyOutClause &clause) {
+    if constexpr (isOneOfTypes<OpTy, mlir::acc::ParallelOp, mlir::acc::SerialOp,
+                               mlir::acc::KernelsOp>) {
+      for (auto var : clause.getVarList())
+        addDataOperand<mlir::acc::CreateOp, mlir::acc::CopyoutOp>(
+            var, mlir::acc::DataClause::acc_copyout, clause.getModifierList(),
+            /*structured=*/true,
+            /*implicit=*/false);
+    } else if constexpr (isCombinedType<OpTy>) {
+      applyToComputeOp(clause);
+    } else {
+      // TODO: When we've implemented this for everything, switch this to an
+      // unreachable. data, declare, combined constructs remain.
+      return clauseNotImplemented(clause);
+    }
+  }
+
+  void VisitCreateClause(const OpenACCCreateClause &clause) {
+    if constexpr (isOneOfTypes<OpTy, mlir::acc::ParallelOp, mlir::acc::SerialOp,
+                               mlir::acc::KernelsOp>) {
+      for (auto var : clause.getVarList())
+        addDataOperand<mlir::acc::CreateOp, mlir::acc::DeleteOp>(
+            var, mlir::acc::DataClause::acc_create, clause.getModifierList(),
+            /*structured=*/true,
+            /*implicit=*/false);
+    } else if constexpr (isCombinedType<OpTy>) {
+      applyToComputeOp(clause);
+    } else {
+      // TODO: When we've implemented this for everything, switch this to an
+      // unreachable. data, declare, combined constructs remain.
+      return clauseNotImplemented(clause);
+    }
+  }
+
   void VisitUseDeviceClause(const OpenACCUseDeviceClause &clause) {
     if constexpr (isOneOfTypes<OpTy, mlir::acc::HostDataOp>) {
       for (auto var : clause.getVarList())
diff --git a/clang/test/CIR/CodeGenOpenACC/combined-copyin-copyout-create.c b/clang/test/CIR/CodeGenOpenACC/combined-copyin-copyout-create.c
new file mode 100644
index 0000000000000..d6179c012ee91
--- /dev/null
+++ b/clang/test/CIR/CodeGenOpenACC/combined-copyin-copyout-create.c
@@ -0,0 +1,160 @@
+// RUN: %clang_cc1 -fopenacc -Wno-openacc-self-if-potential-conflict -emit-cir -fclangir %s -o - | FileCheck %s
+
+void acc_combined(int parmVar) {
+  // CHECK: cir.func{{.*}} @acc_combined(%[[ARG:.*]]: !s32i{{.*}}) {
+  // CHECK-NEXT: %[[PARM:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["parmVar", init]
+
+  int localVar1;
+  // CHECK-NEXT: %[[LV1:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["localVar1"]
+  float localVar2;
+  // CHECK-NEXT: %[[LV2:.*]] = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["localVar2"]
+  // CHECK-NEXT: cir.store %[[ARG]], %[[PARM]]
+#pragma acc parallel loop copyin(parmVar) copyout(localVar1) create(localVar2)
+  for(int i = 0; i < 5; ++i);
+  // CHECK-NEXT: %[[COPYIN1:.*]] = acc.copyin varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {name = "parmVar"}
+  // CHECK-NEXT: %[[CREATE1:.*]] = acc.create varPtr(%[[LV1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copyout>, name = "localVar1"}
+  // CHECK-NEXT: %[[CREATE2:.*]] = acc.create varPtr(%[[LV2]] : !cir.ptr<!cir.float>) -> !cir.ptr<!cir.float> {name = "localVar2"}
+  // CHECK-NEXT: acc.parallel combined(loop) dataOperands(%[[COPYIN1]], %[[CREATE1]], %[[CREATE2]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>, !cir.ptr<!cir.float>) {
+  // CHECK-NEXT: acc.loop combined(parallel) {
+  // CHECK: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.delete accPtr(%[[CREATE2]] : !cir.ptr<!cir.float>) {dataClause = #acc<data_clause acc_create>, name = "localVar2"}
+  // CHECK-NEXT: acc.copyout accPtr(%[[CREATE1]] : !cir.ptr<!s32i>) to varPtr(%[[LV1]] : !cir.ptr<!s32i>) {name = "localVar1"}
+  // CHECK-NEXT: acc.delete accPtr(%[[COPYIN1]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copyin>, name = "parmVar"}
+
+#pragma acc serial loop copyin(parmVar, localVar1)
+  for(int i = 0; i < 5; ++i);
+  // CHECK-NEXT: %[[COPYIN1:.*]] = acc.copyin varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {name = "parmVar"}
+  // CHECK-NEXT: %[[COPYIN2:.*]] = acc.copyin varPtr(%[[LV1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {name = "localVar1"}
+  // CHECK-NEXT: acc.serial combined(loop) dataOperands(%[[COPYIN1]], %[[COPYIN2]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>) {
+  // CHECK-NEXT: acc.loop combined(serial) {
+  // CHECK: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.delete accPtr(%[[COPYIN2]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copyin>, name = "localVar1"}
+  // CHECK-NEXT: acc.delete accPtr(%[[COPYIN1]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copyin>, name = "parmVar"}
+
+#pragma acc kernels loop copyout(parmVar, localVar1)
+  for(int i = 0; i < 5; ++i);
+  // CHECK-NEXT: %[[CREATE1:.*]] = acc.create varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copyout>, name = "parmVar"}
+  // CHECK-NEXT: %[[CREATE2:.*]] = acc.create varPtr(%[[LV1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copyout>, name = "localVar1"}
+  // CHECK-NEXT: acc.kernels combined(loop) dataOperands(%[[CREATE1]], %[[CREATE2]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>) {
+  // CHECK-NEXT: acc.loop combined(kernels) {
+  // CHECK: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.terminator
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[CREATE2]] : !cir.ptr<!s32i>) to varPtr(%[[LV1]] : !cir.ptr<!s32i>) {name = "localVar1"}
+  // CHECK-NEXT: acc.copyout accPtr(%[[CREATE1]] : !cir.ptr<!s32i>) to varPtr(%[[PARM]] : !cir.ptr<!s32i>) {name = "parmVar"}
+
+#pragma acc parallel loop create (parmVar, localVar2)
+  for(int i = 0; i < 5; ++i);
+  // CHECK-NEXT: %[[CREATE1:.*]] = acc.create varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {name = "parmVar"}
+  // CHECK-NEXT: %[[CREATE2:.*]] = acc.create varPtr(%[[LV2]] : !cir.ptr<!cir.float>) -> !cir.ptr<!cir.float> {name = "localVar2"}
+  // CHECK-NEXT: acc.parallel combined(loop) dataOperands(%[[CREATE1]], %[[CREATE2]] : !cir.ptr<!s32i>, !cir.ptr<!cir.float>) {
+  // CHECK-NEXT: acc.loop combined(parallel) {
+  // CHECK: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.delete accPtr(%[[CREATE2]] : !cir.ptr<!cir.float>) {dataClause = #acc<data_clause acc_create>, name = "localVar2"}
+  // CHECK-NEXT: acc.delete accPtr(%[[CREATE1]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_create>, name = "parmVar"}
+
+#pragma acc serial loop copyin(capture: parmVar) copyin(always: localVar1)
+  for(int i = 0; i < 5; ++i);
+  // CHECK-NEXT: %[[COPYIN1:.*]] = acc.copyin varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {modifiers = #acc<data_clause_modifier capture>, name = "parmVar"}
+  // CHECK-NEXT: %[[COPYIN2:.*]] = acc.copyin varPtr(%[[LV1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {modifiers = #acc<data_clause_modifier always>, name = "localVar1"}
+  // CHECK-NEXT: acc.serial combined(loop) dataOperands(%[[COPYIN1]], %[[COPYIN2]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>) {
+  // CHECK-NEXT: acc.loop combined(serial) {
+  // CHECK: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.delete accPtr(%[[COPYIN2]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copyin>, modifiers = #acc<data_clause_modifier always>, name = "localVar1"}
+  // CHECK-NEXT: acc.delete accPtr(%[[COPYIN1]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copyin>, modifiers = #acc<data_clause_modifier capture>, name = "parmVar"}
+
+#pragma acc kernels loop copyout(capture: parmVar) copyout(always: localVar1)
+  for(int i = 0; i < 5; ++i);
+  // CHECK-NEXT: %[[CREATE1:.*]] = acc.create varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copyout>, modifiers = #acc<data_clause_modifier capture>, name = "parmVar"}
+  // CHECK-NEXT: %[[CREATE2:.*]] = acc.create varPtr(%[[LV1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copyout>, modifiers = #acc<data_clause_modifier always>, name = "localVar1"}
+  // CHECK-NEXT: acc.kernels combined(loop) dataOperands(%[[CREATE1]], %[[CREATE2]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>) {
+  // CHECK-NEXT: acc.loop combined(kernels) {
+  // CHECK: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.terminator
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[CREATE2]] : !cir.ptr<!s32i>) to varPtr(%[[LV1]] : !cir.ptr<!s32i>) {modifiers = #acc<data_clause_modifier always>, name = "localVar1"}
+  // CHECK-NEXT: acc.copyout accPtr(%[[CREATE1]] : !cir.ptr<!s32i>) to varPtr(%[[PARM]] : !cir.ptr<!s32i>) {modifiers = #acc<data_clause_modifier capture>, name = "parmVar"}
+
+#pragma acc parallel loop create(capture: parmVar)
+  for(int i = 0; i < 5; ++i);
+  // CHECK-NEXT: %[[CREATE1:.*]] = acc.create varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {modifiers = #acc<data_clause_modifier capture>, name = "parmVar"}
+  // CHECK-NEXT: acc.parallel combined(loop) dataOperands(%[[CREATE1]] : !cir.ptr<!s32i>) {
+  // CHECK-NEXT: acc.loop combined(parallel) {
+  // CHECK: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.delete accPtr(%[[CREATE1]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_create>, modifiers = #acc<data_clause_modifier capture>, name = "parmVar"}
+
+#pragma acc serial loop copyin(capture, always: parmVar, localVar1)
+  for(int i = 0; i < 5; ++i);
+  // CHECK-NEXT: %[[COPYIN1:.*]] = acc.copyin varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {modifiers = #acc<data_clause_modifier always,capture>, name = "parmVar"}
+  // CHECK-NEXT: %[[COPYIN2:.*]] = acc.copyin varPtr(%[[LV1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {modifiers = #acc<data_clause_modifier always,capture>, name = "localVar1"}
+  // CHECK-NEXT: acc.serial combined(loop) dataOperands(%[[COPYIN1]], %[[COPYIN2]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>) {
+  // CHECK-NEXT: acc.loop combined(serial) {
+  // CHECK: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.delete accPtr(%[[COPYIN2]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copyin>, modifiers = #acc<data_clause_modifier always,capture>, name = "localVar1"}
+  // CHECK-NEXT: acc.delete accPtr(%[[COPYIN1]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copyin>, modifiers = #acc<data_clause_modifier always,capture>, name = "parmVar"}
+
+#pragma acc kernels loop copyin(readonly, always, alwaysin, capture: parmVar, localVar1, localVar2)
+  for(int i = 0; i < 5; ++i);
+  // CHECK-NEXT: %[[COPYIN1:.*]] = acc.copyin varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {modifiers = #acc<data_clause_modifier always,readonly,capture>, name = "parmVar"}
+  // CHECK-NEXT: %[[COPYIN2:.*]] = acc.copyin varPtr(%[[LV1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {modifiers = #acc<data_clause_modifier always,readonly,capture>, name = "localVar1"}
+  // CHECK-NEXT: %[[COPYIN3:.*]] = acc.copyin varPtr(%[[LV2]] : !cir.ptr<!cir.float>) -> !cir.ptr<!cir.float> {modifiers = #acc<data_clause_modifier always,readonly,capture>, name = "localVar2"}
+  // CHECK-NEXT: acc.kernels combined(loop) dataOperands(%[[COPYIN1]], %[[COPYIN2]], %[[COPYIN3]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>, !cir.ptr<!cir.float>) {
+  // CHECK-NEXT: acc.loop combined(kernels) {
+  // CHECK: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.terminator
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.delete accPtr(%[[COPYIN3]] : !cir.ptr<!cir.float>) {dataClause = #acc<data_clause acc_copyin>, modifiers = #acc<data_clause_modifier always,readonly,capture>, name = "localVar2"}
+  // CHECK-NEXT: acc.delete accPtr(%[[COPYIN2]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copyin>, modifiers = #acc<data_clause_modifier always,readonly,capture>, name = "localVar1"}
+  // CHECK-NEXT: acc.delete accPtr(%[[COPYIN1]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copyin>, modifiers = #acc<data_clause_modifier always,readonly,capture>, name = "parmVar"}
+
+#pragma acc parallel loop copyout(zero, always, alwaysout, capture: parmVar, localVar1, localVar2)
+  for(int i = 0; i < 5; ++i);
+  // CHECK-NEXT: %[[CREATE1:.*]] = acc.create varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copyout>, modifiers = #acc<data_clause_modifier always,zero,capture>, name = "parmVar"}
+  // CHECK-NEXT: %[[CREATE2:.*]] = acc.create varPtr(%[[LV1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copyout>, modifiers = #acc<data_clause_modifier always,zero,capture>, name = "localVar1"}
+  // CHECK-NEXT: %[[CREATE3:.*]] = acc.create varPtr(%[[LV2]] : !cir.ptr<!cir.float>) -> !cir.ptr<!cir.float> {dataClause = #acc<data_clause acc_copyout>, modifiers = #acc<data_clause_modifier always,zero,capture>, name = "localVar2"}
+  // CHECK-NEXT: acc.parallel combined(loop) dataOperands(%[[CREATE1]], %[[CREATE2]], %[[CREATE3]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>, !cir.ptr<!cir.float>) {
+  // CHECK-NEXT: acc.loop combined(parallel) {
+  // CHECK: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[CREATE3]] : !cir.ptr<!cir.float>) to varPtr(%[[LV2]] : !cir.ptr<!cir.float>) {modifiers = #acc<data_clause_modifier always,zero,capture>, name = "localVar2"}
+  // CHECK-NEXT: acc.copyout accPtr(%[[CREATE2]] : !cir.ptr<!s32i>) to varPtr(%[[LV1]] : !cir.ptr<!s32i>) {modifiers = #acc<data_clause_modifier always,zero,capture>, name = "localVar1"}
+  // CHECK-NEXT: acc.copyout accPtr(%[[CREATE1]] : !cir.ptr<!s32i>) to varPtr(%[[PARM]] : !cir.ptr<!s32i>) {modifiers = #acc<data_clause_modifier always,zero,capture>, name = "parmVar"}
+
+#pragma acc serial loop create(zero, capture: parmVar, localVar1, localVar2)
+  for(int i = 0; i < 5; ++i);
+  // CHECK-NEXT: %[[CREATE1:.*]] = acc.create varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {modifiers = #acc<data_clause_modifier zero,capture>, name = "parmVar"}
+  // CHECK-NEXT: %[[CREATE2:.*]] = acc.create varPtr(%[[LV1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {modifiers = #acc<data_clause_modifier zero,capture>, name = "localVar1"}
+  // CHECK-NEXT: %[[CREATE3:.*]] = acc.create varPtr(%[[LV2]] : !cir.ptr<!cir.float>) -> !cir.ptr<!cir.float> {modifiers = #acc<data_clause_modifier zero,capture>, name = "localVar2"}
+  // CHECK-NEXT: acc.serial combined(loop) dataOperands(%[[CREATE1]], %[[CREATE2]], %[[CREATE3]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>, !cir.ptr<!cir.float>) {
+  // CHECK-NEXT: acc.loop combined(serial) {
+  // CHECK: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.delete accPtr(%[[CREATE3]] : !cir.ptr<!cir.float>) {dataClause = #acc<data_clause acc_create>, modifiers = #acc<data_clause_modifier zero,capture>, name = "localVar2"}
+  // CHECK-NEXT: acc.delete accPtr(%[[CREATE2]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_create>, modifiers = #acc<data_clause_modifier zero,capture>, name = "localVar1"}
+  // CHECK-NEXT: acc.delete accPtr(%[[CREATE1]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_create>, modifiers = #acc<data_clause_modifier zero,capture>, name = "parmVar"}
+}
diff --git a/clang/test/CIR/CodeGenOpenACC/compute-copyin-copyout-create.c b/clang/test/CIR/CodeGenOpenACC/compute-copyin-copyout-create.c
new file mode 100644
index 0000000000000..2180a3370939e
--- /dev/null
+++ b/clang/test/CIR/CodeGenOpenACC/compute-copyin-copyout-create.c
@@ -0,0 +1,128 @@
+// RUN: %clang_cc1 -fopenacc -Wno-openacc-self-if-potential-conflict -emit-cir -fclangir %s -o - | FileCheck %s
+
+void acc_compute(int parmVar) {
+  // CHECK: cir.func{{.*}} @acc_compute(%[[ARG:.*]]: !s32i{{.*}}) {
+  // CHECK-NEXT: %[[PARM:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["parmVar", init]
+
+  int localVar1;
+  // CHECK-NEXT: %[[LV1:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["localVar1"]
+  float localVar2;
+  // CHECK-NEXT: %[[LV2:.*]] = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["localVar2"]
+  // CHECK-NEXT: cir.store %[[ARG]], %[[PARM]]
+
+#pragma acc parallel copyin(parmVar) copyout(localVar1) create(localVar2)
+  ;
+  // CHECK-NEXT: %[[COPYIN1:.*]] = acc.copyin varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {name = "parmVar"}
+  // CHECK-NEXT: %[[CREATE1:.*]] = acc.create varPtr(%[[LV1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copyout>, name = "localVar1"}
+  // CHECK-NEXT: %[[CREATE2:.*]] = acc.create varPtr(%[[LV2]] : !cir.ptr<!cir.float>) -> !cir.ptr<!cir.float> {name = "localVar2"}
+  // CHECK-NEXT: acc.parallel dataOperands(%[[COPYIN1]], %[[CREATE1]], %[[CREATE2]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>, !cir.ptr<!cir.float>) {
+  // CHECK-NEXT: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.delete accPtr(%[[CREATE2]] : !cir.ptr<!cir.float>) {dataClause = #acc<data_clause acc_create>, name = "localVar2"}
+  // CHECK-NEXT: acc.copyout accPtr(%[[CREATE1]] : !cir.ptr<!s32i>) to varPtr(%[[LV1]] : !cir.ptr<!s32i>) {name = "localVar1"}
+  // CHECK-NEXT: acc.delete accPtr(%[[COPYIN1]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copyin>, name = "parmVar"}
+
+#pragma acc serial copyin(parmVar, localVar1)
+  ;
+  // CHECK-NEXT: %[[COPYIN1:.*]] = acc.copyin varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {name = "parmVar"}
+  // CHECK-NEXT: %[[COPYIN2:.*]] = acc.copyin varPtr(%[[LV1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {name = "localVar1"}
+  // CHECK-NEXT: acc.serial dataOperands(%[[COPYIN1]], %[[COPYIN2]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>) {
+  // CHECK-NEXT: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.delete accPtr(%[[COPYIN2]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copyin>, name = "localVar1"}
+  // CHECK-NEXT: acc.delete accPtr(%[[COPYIN1]] : !cir.ptr<!s32i>) {dataClause = #acc<data_clause acc_copyin>, name = "parmVar"}
+
+#pragma acc kernels copyout(parmVar, localVar1)
+  ;
+  // CHECK-NEXT: %[[CREATE1:.*]] = acc.create varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copyout>, name = "parmVar"}
+  // CHECK-NEXT: %[[CREATE2:.*]] = acc.create varPtr(%[[LV1]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {dataClause = #acc<data_clause acc_copyout>, name = "localVar1"}
+  // CHECK-NEXT: acc.kernels dataOperands(%[[CREATE1]], %[[CREATE2]] : !cir.ptr<!s32i>, !cir.ptr<!s32i>) {
+  // CHECK-NEXT: acc.terminator
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.copyout accPtr(%[[CREATE2]] : !cir.ptr<!s32i>) to varPtr(%[[LV1]] : !cir.ptr<!s32i>) {name = "localVar1"}
+  // CHECK-NEXT: acc.copyout accPtr(%[[CREATE1]] : !cir.ptr<!s32i>) to varPtr(%[[PARM]] : !cir.ptr<!s32i>) {name = "parmVar"}
+
+#pragma acc parallel create (parmVar, localVar2)
+  ;
+  // CHECK-NEXT: %[[CREATE1:.*]] = acc.create varPtr(%[[PARM]] : !cir.ptr<!s32i>) -> !cir.ptr<!s32i> {name = "parmVar"}
+  // CHECK-NEXT: %[[CREATE2:.*]] = acc.create varPtr(%[[LV2]] : !cir.ptr<!cir.float>) -> !cir.ptr<!cir.float> {name = "localVar2"}
+  // CHECK-NEXT: acc.parallel dataOperands(%[[CREATE1]], %[[CREATE2]] : !cir.ptr<!s32i>, !cir.ptr<!cir.float>) {
+  // CHECK-NEXT: acc.yield
+  // CHECK-NEXT: } loc
+  // CHECK-NEXT: acc.delete accPtr(%[[CREATE2]] : !cir.ptr<!cir.float>) {dataClause = #acc<data_clause acc_create>, name = "localVar2"}
+  // CHECK-NEXT...
[truncated]

Copy link
Contributor

@andykaylor andykaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, with one suggestion

void VisitCopyInClause(const OpenACCCopyInClause &clause) {
if constexpr (isOneOfTypes<OpTy, mlir::acc::ParallelOp, mlir::acc::SerialOp,
mlir::acc::KernelsOp>) {
for (auto var : clause.getVarList())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make the type of var explicit?

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@erichkeane
Copy link
Collaborator Author

All fixed up!

I had to make some changes to the dialect, so I'd very much like @razvanlupusoru to confirm/approve.

Copy link
Contributor

@razvanlupusoru razvanlupusoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@erichkeane erichkeane merged commit 3463aba into llvm:main Jun 27, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project mlir:openacc mlir openacc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants