[OpenACC][CIR] Implement atomic update lowering #164836

erichkeane · 2025-10-23T15:36:39Z

This is the 3rd of 4 forms of the 'atomic' construct. It allows increment/decrement, compound assignment, and assignment to a binary operation that references the original variable.
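
For illustration, the three statement shapes can be sketched as below. This is a minimal example that mirrors the shapes used in the new test file; `apply_updates` is a hypothetical name, not part of the patch. When compiled without `-fopenacc`, the pragmas are ignored and the statements run as ordinary C++.

```cpp
// Hypothetical helper showing the three statement shapes that
// 'atomic update' accepts. Not part of the patch.
int apply_updates(int x, int n) {
#pragma acc atomic update
  ++x;        // increment/decrement form
#pragma acc atomic update
  x += n;     // compound-assignment form
#pragma acc atomic update
  x = x * n;  // assignment to a binary operation that references 'x'
  return x;
}
```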

All of the above is enforced during Sema, so for lowering we ONLY need to know the variable on the LHS and the expression; this patch extracts just those two pieces.

The ACC dialect for acc.atomic.update uses a 'recipe' as well, which takes the current VALUE of the variable and yields the updated value.
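
One way to picture the recipe contract (value in, updated value out) is the sketch below. It is only a conceptual model written with `std::atomic`; it does not describe how the ACC dialect or any backend actually implements the operation, and `atomic_update` is a hypothetical name.

```cpp
#include <atomic>

// Conceptual model only: apply 'recipe' to the current value of 'x' and
// publish the result atomically with a compare-exchange retry loop.
template <typename T, typename Recipe>
T atomic_update(std::atomic<T> &x, Recipe recipe) {
  T old_value = x.load();
  T new_value = recipe(old_value);
  while (!x.compare_exchange_weak(old_value, new_value)) {
    // On failure, old_value holds the freshly observed value; recompute.
    new_value = recipe(old_value);
  }
  return new_value;
}
```

The recipe itself never touches the shared variable; it only maps the value it is handed to the value that should replace it.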

To simplify the implementation, our lowering simply creates an alloca inside the recipe, stores the passed-in value into it, emits the associated statement against that alloca, then loads and yields the result at the end.
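
As a rough C++ analogy for the recipe body this produces for `x += f;` (with `int x` and `float f`, as in the new test), consider the sketch below. The function name is hypothetical and the code is a model of the generated CIR, not something the patch emits.

```cpp
// Hypothetical model of the recipe body generated for 'x += f;'.
int recipe_for_plus_assign(int recipe_arg, float f) {
  int x_var = recipe_arg;  // alloca "x_var" plus store of the passed-in value
  x_var += f;              // the associated statement, with 'x' remapped to
                           // the temporary (int -> float, add, float -> int)
  return x_var;            // load from the temporary, then acc.yield
}
```

The temporary is redundant in this model, which matches the comment in the patch about relying on later optimization to remove the extra alloca/load/store.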

llvmbot · 2025-10-23T15:37:25Z

@llvm/pr-subscribers-clangir

Author: Erich Keane (erichkeane)

Full diff: https://github.com/llvm/llvm-project/pull/164836.diff

5 Files Affected:

(modified) clang/lib/AST/StmtOpenACC.cpp (+39-8)
(modified) clang/lib/CIR/CodeGen/CIRGenFunction.h (+6)
(modified) clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp (+58-6)
(added) clang/test/CIR/CodeGenOpenACC/atomic-update.cpp (+151)
(modified) clang/test/CIR/CodeGenOpenACC/openacc-not-implemented.cpp (+2-2)

diff --git a/clang/lib/AST/StmtOpenACC.cpp b/clang/lib/AST/StmtOpenACC.cpp
index 462a10d45fbf0..70fa6014aca3c 100644
--- a/clang/lib/AST/StmtOpenACC.cpp
+++ b/clang/lib/AST/StmtOpenACC.cpp
@@ -326,16 +326,30 @@ OpenACCAtomicConstruct *OpenACCAtomicConstruct::Create(
 
 static std::pair<const Expr *, const Expr *> getBinaryOpArgs(const Expr *Op) {
   if (const auto *BO = dyn_cast<BinaryOperator>(Op)) {
-    assert(BO->getOpcode() == BO_Assign);
+    assert(BO->isAssignmentOp());
     return {BO->getLHS(), BO->getRHS()};
   }
 
   const auto *OO = cast<CXXOperatorCallExpr>(Op);
-  assert(OO->getOperator() == OO_Equal);
-
+  assert(OO->isAssignmentOp());
   return {OO->getArg(0), OO->getArg(1)};
 }
 
+static std::pair<bool, const Expr *> getUnaryOpArgs(const Expr *Op) {
+  if (const auto *UO = dyn_cast<UnaryOperator>(Op))
+    return {true, UO->getSubExpr()};
+
+  if (const auto *OpCall = dyn_cast<CXXOperatorCallExpr>(Op)) {
+    // Post-inc/dec have a second unused argument to differentiate it, so we
+    // accept -- or ++ as unary, or any operator call with only 1 arg.
+    if (OpCall->getNumArgs() == 1 || OpCall->getOperator() != OO_PlusPlus ||
+        OpCall->getOperator() != OO_MinusMinus)
+      return {true, OpCall->getArg(0)};
+  }
+
+  return {false, nullptr};
+}
+
 const OpenACCAtomicConstruct::StmtInfo
 OpenACCAtomicConstruct::getAssociatedStmtInfo() const {
   // This ends up being a vastly simplified version of SemaOpenACCAtomic, since
@@ -343,18 +357,17 @@ OpenACCAtomicConstruct::getAssociatedStmtInfo() const {
   // asserts to ensure we don't get off into the weeds.
   assert(getAssociatedStmt() && "invalid associated stmt?");
 
+  const Expr *AssocStmt = cast<const Expr>(getAssociatedStmt());
   switch (AtomicKind) {
-  case OpenACCAtomicKind::None:
-  case OpenACCAtomicKind::Update:
   case OpenACCAtomicKind::Capture:
-    assert(false && "Only 'read'/'write' have been implemented here");
+    assert(false && "Only 'read'/'write'/'update' have been implemented here");
     return {};
   case OpenACCAtomicKind::Read: {
     // Read only supports the format 'v = x'; where both sides are a scalar
     // expression. This can come in 2 forms; BinaryOperator or
     // CXXOperatorCallExpr (rarely).
     std::pair<const Expr *, const Expr *> BinaryArgs =
-        getBinaryOpArgs(cast<const Expr>(getAssociatedStmt()));
+        getBinaryOpArgs(AssocStmt);
     // We want the L-value for each side, so we ignore implicit casts.
     return {BinaryArgs.first->IgnoreImpCasts(),
             BinaryArgs.second->IgnoreImpCasts(), /*expr=*/nullptr};
@@ -364,13 +377,31 @@ OpenACCAtomicConstruct::getAssociatedStmtInfo() const {
     // type, and 'x' is a scalar l value. As above, this can come in 2 forms;
     // Binary Operator or CXXOperatorCallExpr.
     std::pair<const Expr *, const Expr *> BinaryArgs =
-        getBinaryOpArgs(cast<const Expr>(getAssociatedStmt()));
+        getBinaryOpArgs(AssocStmt);
     // We want the L-value for ONLY the X side, so we ignore implicit casts. For
     // the right side (the expr), we emit it as an r-value so we need to
     // maintain implicit casts.
     return {/*v=*/nullptr, BinaryArgs.first->IgnoreImpCasts(),
             BinaryArgs.second};
   }
+  case OpenACCAtomicKind::None:
+  case OpenACCAtomicKind::Update: {
+    std::pair<bool, const Expr *> UnaryArgs = getUnaryOpArgs(AssocStmt);
+
+    // TODO: ERICH: Figure out what we are going to do to figure out this is an
+    // inc/dec?
+    if (UnaryArgs.first)
+      return {/*v=*/nullptr, UnaryArgs.second->IgnoreImpCasts(),
+              /*expr=*/nullptr};
+
+    std::pair<const Expr *, const Expr *> BinaryArgs =
+        getBinaryOpArgs(AssocStmt);
+    // For binary args, we just store the RHS as an expression (in the
+    // expression slot), since the codegen just wants the whole thing for a
+    // recipe.
+    return {/*v=*/nullptr, BinaryArgs.first->IgnoreImpCasts(),
+            BinaryArgs.second};
+  }
   }
 
   llvm_unreachable("unknown OpenACC atomic kind");
diff --git a/clang/lib/CIR/CodeGen/CIRGenFunction.h b/clang/lib/CIR/CodeGen/CIRGenFunction.h
index 5f9dbdc64b9e5..a8ffab79e8398 100644
--- a/clang/lib/CIR/CodeGen/CIRGenFunction.h
+++ b/clang/lib/CIR/CodeGen/CIRGenFunction.h
@@ -665,6 +665,12 @@ class CIRGenFunction : public CIRGenTypeCache {
     symbolTable.insert(vd, addr.getPointer());
   }
 
+  // Replaces the address of the local variable, if it exists.  Else does the
+  // same thing as setAddrOfLocalVar.
+  void replaceAddrOfLocalVar(const clang::VarDecl *vd, Address addr) {
+    localDeclMap.insert_or_assign(vd, addr);
+  }
+
   // A class to allow reverting changes to a var-decl's registration to the
   // localDeclMap. This is used in cases where things are being inserted into
   // the variable list but don't follow normal lookup/search rules, like in
diff --git a/clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp b/clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp
index 349b111c0d8fd..76765e351c2a6 100644
--- a/clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp
@@ -304,12 +304,21 @@ CIRGenFunction::emitOpenACCCacheConstruct(const OpenACCCacheConstruct &s) {
   return mlir::success();
 }
 
+const VarDecl *getLValueDecl(const Expr *E) {
+  // We are going to assume that after stripping implicit casts, that the LValue
+  // is just a DRE around the var-decl.
+
+  E = E->IgnoreImpCasts();
+
+  const auto *DRE = cast<DeclRefExpr>(E);
+  return cast<VarDecl>(DRE->getDecl());
+}
+
 mlir::LogicalResult
 CIRGenFunction::emitOpenACCAtomicConstruct(const OpenACCAtomicConstruct &s) {
-  // For now, we are only support 'read'/'write', so diagnose. We can switch on
-  // the kind later once we start implementing the other 2 forms. While we
-  if (s.getAtomicKind() != OpenACCAtomicKind::Read &&
-      s.getAtomicKind() != OpenACCAtomicKind::Write) {
+  // For now, we are only support 'read'/'write'/'update', so diagnose. We can
+  // switch on the kind later once we implement the 'capture' form.
+  if (s.getAtomicKind() == OpenACCAtomicKind::Capture) {
     cgm.errorNYI(s.getSourceRange(), "OpenACC Atomic Construct");
     return mlir::failure();
   }
@@ -318,11 +327,10 @@ CIRGenFunction::emitOpenACCAtomicConstruct(const OpenACCAtomicConstruct &s) {
   // expression it is associated with rather than emitting it inside of it.  So
   // it has custom emit logic.
   mlir::Location start = getLoc(s.getSourceRange().getBegin());
+  mlir::Location end = getLoc(s.getSourceRange().getEnd());
   OpenACCAtomicConstruct::StmtInfo inf = s.getAssociatedStmtInfo();
 
   switch (s.getAtomicKind()) {
-  case OpenACCAtomicKind::None:
-  case OpenACCAtomicKind::Update:
   case OpenACCAtomicKind::Capture:
     llvm_unreachable("Unimplemented atomic construct type, should have "
                      "diagnosed/returned above");
@@ -353,6 +361,50 @@ CIRGenFunction::emitOpenACCAtomicConstruct(const OpenACCAtomicConstruct &s) {
                        s.clauses());
     return mlir::success();
   }
+  case OpenACCAtomicKind::None:
+  case OpenACCAtomicKind::Update: {
+    mlir::Value x = emitLValue(inf.X).getPointer();
+    auto op =
+        mlir::acc::AtomicUpdateOp::create(builder, start, x, /*ifCond=*/{});
+    emitOpenACCClauses(op, s.getDirectiveKind(), s.getDirectiveLoc(),
+                       s.clauses());
+    mlir::LogicalResult res = mlir::success();
+    {
+      mlir::Type argTy = cast<cir::PointerType>(x.getType()).getPointee();
+      mlir::OpBuilder::InsertionGuard guardCase(builder);
+      std::array<mlir::Type, 1> recipeType{argTy};
+      std::array<mlir::Location, 1> recipeLoc{start};
+      auto *recipeBlock = builder.createBlock(
+          &op.getRegion(), op.getRegion().end(), recipeType, recipeLoc);
+      builder.setInsertionPointToEnd(recipeBlock);
+
+      // Since we have an initial value that we know is a scalar type, we can
+      // just emit the entire statement here after sneaking-in our 'alloca' in
+      // the right place, then loading out of it. Flang does a lot less work
+      // (probably does its own emitting!), but we have more complicated AST
+      // nodes to worry about, so we can just count on opt to remove the extra
+      // alloca/load/store set.
+      auto alloca = cir::AllocaOp::create(
+          builder, start, x.getType(), argTy, "x_var",
+          cgm.getSize(getContext().getTypeAlignInChars(inf.X->getType())));
+
+      alloca.setInitAttr(mlir::UnitAttr::get(&getMLIRContext()));
+      builder.CIRBaseBuilderTy::createStore(start, recipeBlock->getArgument(0),
+                                            alloca);
+
+      const VarDecl *xval = getLValueDecl(inf.X);
+      CIRGenFunction::DeclMapRevertingRAII declMapRAII{*this, xval};
+      replaceAddrOfLocalVar(
+          xval, Address{alloca, argTy, getContext().getDeclAlign(xval)});
+
+      res = emitStmt(s.getAssociatedStmt(), /*useCurrentScope=*/true);
+
+      auto load = cir::LoadOp::create(builder, start, {alloca});
+      mlir::acc::YieldOp::create(builder, end, {load});
+    }
+
+    return res;
+  }
   }
 
   llvm_unreachable("unknown OpenACC atomic kind");
diff --git a/clang/test/CIR/CodeGenOpenACC/atomic-update.cpp b/clang/test/CIR/CodeGenOpenACC/atomic-update.cpp
new file mode 100644
index 0000000000000..7ab6b62c4b41e
--- /dev/null
+++ b/clang/test/CIR/CodeGenOpenACC/atomic-update.cpp
@@ -0,0 +1,151 @@
+// RUN: %clang_cc1 -fopenacc -triple x86_64-linux-gnu -Wno-openacc-self-if-potential-conflict -emit-cir -fclangir -triple x86_64-linux-pc %s -o - | FileCheck %s
+
+struct HasOps {
+  operator float();
+  int thing();
+};
+
+void use(int x, unsigned int y, float f, HasOps ops) {
+  // CHECK: cir.func{{.*}}(%[[X_ARG:.*]]: !s32i{{.*}}, %[[Y_ARG:.*]]: !u32i{{.*}}, %[[F_ARG:.*]]: !cir.float{{.*}}){{.*}}, %[[OPS_ARG:.*]]: !rec_HasOps{{.*}}) {
+  // CHECK-NEXT: %[[X_ALLOCA:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["x", init]
+  // CHECK-NEXT: %[[Y_ALLOCA:.*]] = cir.alloca !u32i, !cir.ptr<!u32i>, ["y", init]
+  // CHECK-NEXT: %[[F_ALLOCA:.*]] = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["f", init]
+  // CHECK-NEXT: %[[OPS_ALLOCA:.*]] = cir.alloca !rec_HasOps, !cir.ptr<!rec_HasOps>, ["ops", init]
+  // CHECK-NEXT: cir.store %[[X_ARG]], %[[X_ALLOCA]] : !s32i, !cir.ptr<!s32i>
+  // CHECK-NEXT: cir.store %[[Y_ARG]], %[[Y_ALLOCA]] : !u32i, !cir.ptr<!u32i>
+  // CHECK-NEXT: cir.store %[[F_ARG]], %[[F_ALLOCA]] : !cir.float, !cir.ptr<!cir.float>
+  // CHECK-NEXT: cir.store %[[OPS_ARG]], %[[OPS_ALLOCA]] : !rec_HasOps, !cir.ptr<!rec_HasOps>
+
+  // CHECK-NEXT: acc.atomic.update %[[X_ALLOCA]] : !cir.ptr<!s32i> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !s32i{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !s32i, !cir.ptr<!s32i>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!s32i>, !s32i
+  // CHECK-NEXT: %[[INC:.*]] = cir.unary(inc, %[[TEMP_LOAD]]) nsw : !s32i, !s32i
+  // CHECK-NEXT: cir.store {{.*}}%[[INC]], %[[TEMP_ALLOCA]] : !s32i, !cir.ptr<!s32i>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!s32i>, !s32i
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !s32i
+  // CHECK-NEXT: }
+#pragma acc atomic update
+  ++x;
+
+  // CHECK-NEXT: acc.atomic.update %[[Y_ALLOCA]] : !cir.ptr<!u32i> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !u32i{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !u32i, !cir.ptr<!u32i>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !u32i, !cir.ptr<!u32i>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!u32i>, !u32i
+  // CHECK-NEXT: %[[INC:.*]] = cir.unary(inc, %[[TEMP_LOAD]]) : !u32i, !u32i
+  // CHECK-NEXT: cir.store {{.*}}%[[INC]], %[[TEMP_ALLOCA]] : !u32i, !cir.ptr<!u32i>
+  // 
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!u32i>, !u32i
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !u32i
+  // CHECK-NEXT: }
+#pragma acc atomic update
+  y++;
+
+  // CHECK-NEXT: acc.atomic.update %[[F_ALLOCA]] : !cir.ptr<!cir.float> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !cir.float{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !cir.float, !cir.ptr<!cir.float>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: %[[INC:.*]] = cir.unary(dec, %[[TEMP_LOAD]]) : !cir.float, !cir.float
+  // CHECK-NEXT: cir.store {{.*}}%[[INC]], %[[TEMP_ALLOCA]] : !cir.float, !cir.ptr<!cir.float>
+  // 
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !cir.float
+  // CHECK-NEXT: }
+#pragma acc atomic update
+  f--;
+
+  // CHECK-NEXT: acc.atomic.update %[[X_ALLOCA]] : !cir.ptr<!s32i> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !s32i{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !s32i, !cir.ptr<!s32i>
+  //
+  // CHECK-NEXT: %[[F_LOAD:.*]] = cir.load{{.*}} %[[F_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!s32i>, !s32i
+  // CHECK-NEXT: %[[INT_TO_F:.*]] = cir.cast int_to_float %[[TEMP_LOAD]] : !s32i -> !cir.float
+  // CHECK-NEXT: %[[ADD:.*]] = cir.binop(add, %[[INT_TO_F]], %[[F_LOAD]]) : !cir.float
+  // CHECK-NEXT: %[[F_TO_INT:.*]] = cir.cast float_to_int %[[ADD]] : !cir.float -> !s32i
+  // CHECK-NEXT: cir.store{{.*}} %[[F_TO_INT]], %[[TEMP_ALLOCA]] : !s32i, !cir.ptr<!s32i>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!s32i>, !s32i
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !s32i
+  // CHECK-NEXT: }
+#pragma acc atomic update
+  x += f;
+
+  // CHECK-NEXT: acc.atomic.update %[[F_ALLOCA]] : !cir.ptr<!cir.float> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !cir.float{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !cir.float, !cir.ptr<!cir.float>
+  //
+  // CHECK-NEXT: %[[Y_LOAD:.*]] = cir.load{{.*}} %[[Y_ALLOCA]] : !cir.ptr<!u32i>, !u32i
+  // CHECK-NEXT: %[[INT_TO_F:.*]] = cir.cast int_to_float %[[Y_LOAD]] : !u32i -> !cir.float
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: %[[DIV:.*]] = cir.binop(div, %[[TEMP_LOAD]], %[[INT_TO_F]]) : !cir.float
+  // CHECK-NEXT: cir.store{{.*}} %[[DIV]], %[[TEMP_ALLOCA]] : !cir.float, !cir.ptr<!cir.float>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !cir.float
+  // CHECK-NEXT: }
+#pragma acc atomic update
+  f /= y;
+
+  // CHECK-NEXT: acc.atomic.update %[[Y_ALLOCA]] : !cir.ptr<!u32i> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !u32i{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !u32i, !cir.ptr<!u32i>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !u32i, !cir.ptr<!u32i>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!u32i>, !u32i
+  // CHECK-NEXT: %[[CALL:.*]] = cir.call {{.*}}(%[[OPS_ALLOCA]]) : (!cir.ptr<!rec_HasOps>) -> !s32i
+  // CHECK-NEXT: %[[CALL_CAST:.*]] = cir.cast integral %[[CALL]] : !s32i -> !u32i
+  // CHECK-NEXT: %[[MUL:.*]] = cir.binop(mul, %[[TEMP_LOAD]], %[[CALL_CAST]]) : !u32i
+  // CHECK-NEXT: cir.store{{.*}} %[[MUL]], %[[TEMP_ALLOCA]] : !u32i, !cir.ptr<!u32i>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!u32i>, !u32i
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !u32i
+  // CHECK-NEXT: }
+
+#pragma acc atomic update
+  y = y * ops.thing();
+
+  // CHECK-NEXT: acc.atomic.update %[[X_ALLOCA]] : !cir.ptr<!s32i> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !s32i{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !s32i, !cir.ptr<!s32i>
+  //
+  // CHECK-NEXT: %[[CALL:.*]] = cir.call {{.*}}(%[[OPS_ALLOCA]]) : (!cir.ptr<!rec_HasOps>) -> !s32i
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!s32i>, !s32i
+  // CHECK-NEXT: %[[OR:.*]] = cir.binop(or, %[[CALL]], %[[INT_TO_F]]) : !s32i
+  // CHECK-NEXT: cir.store{{.*}} %[[OR]], %[[TEMP_ALLOCA]] : !s32i, !cir.ptr<!s32i>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!s32i>, !s32i
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !s32i
+  // CHECK-NEXT: }
+#pragma acc atomic update
+  x = ops.thing() | x;
+
+  // CHECK-NEXT: %[[X_LOAD:.*]] = cir.load{{.*}} %[[X_ALLOCA]] : !cir.ptr<!s32i>, !s32i
+  // CHECK-NEXT: %[[BOOL_CAST:.*]] = cir.cast int_to_bool %[[X_LOAD]] : !s32i -> !cir.bool
+  // CHECK-NEXT: %[[X_CAST:.*]] = builtin.unrealized_conversion_cast %[[BOOL_CAST]] : !cir.bool to i1
+  // CHECK-NEXT: acc.atomic.update if(%[[X_CAST]]) %[[F_ALLOCA]] : !cir.ptr<!cir.float> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !cir.float{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !cir.float, !cir.ptr<!cir.float>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: %[[CALL:.*]] = cir.call {{.*}}(%[[OPS_ALLOCA]]) : (!cir.ptr<!rec_HasOps>) -> !cir.float
+  // CHECK-NEXT: %[[SUB:.*]] = cir.binop(sub, %[[TEMP_LOAD]], %[[CALL]]) : !cir.float
+  // CHECK-NEXT: cir.store{{.*}} %[[SUB]], %[[TEMP_ALLOCA]] : !cir.float, !cir.ptr<!cir.float>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !cir.float
+  // CHECK-NEXT: }
+#pragma acc atomic update if (x)
+  f = f - ops;
+}
diff --git a/clang/test/CIR/CodeGenOpenACC/openacc-not-implemented.cpp b/clang/test/CIR/CodeGenOpenACC/openacc-not-implemented.cpp
index 33e12fe1cd833..b4d76e18bf345 100644
--- a/clang/test/CIR/CodeGenOpenACC/openacc-not-implemented.cpp
+++ b/clang/test/CIR/CodeGenOpenACC/openacc-not-implemented.cpp
@@ -3,8 +3,8 @@
 void HelloWorld(int *A, int *B, int *C, int N) {
 
 // expected-error@+1{{ClangIR code gen Not Yet Implemented: OpenACC Atomic Construct}}
-#pragma acc atomic
-  N = N + 1;
+#pragma acc atomic capture
+  B = A += ++N;
 
 // expected-error@+1{{ClangIR code gen Not Yet Implemented: OpenACC Declare Construct}}
 #pragma acc declare create(A)

llvmbot · 2025-10-23T15:37:26Z

@llvm/pr-subscribers-clang

Author: Erich Keane (erichkeane)

+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!u32i>, !u32i
+  // CHECK-NEXT: %[[INC:.*]] = cir.unary(inc, %[[TEMP_LOAD]]) : !u32i, !u32i
+  // CHECK-NEXT: cir.store {{.*}}%[[INC]], %[[TEMP_ALLOCA]] : !u32i, !cir.ptr<!u32i>
+  // 
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!u32i>, !u32i
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !u32i
+  // CHECK-NEXT: }
+#pragma acc atomic update
+  y++;
+
+  // CHECK-NEXT: acc.atomic.update %[[F_ALLOCA]] : !cir.ptr<!cir.float> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !cir.float{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !cir.float, !cir.ptr<!cir.float>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: %[[INC:.*]] = cir.unary(dec, %[[TEMP_LOAD]]) : !cir.float, !cir.float
+  // CHECK-NEXT: cir.store {{.*}}%[[INC]], %[[TEMP_ALLOCA]] : !cir.float, !cir.ptr<!cir.float>
+  // 
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !cir.float
+  // CHECK-NEXT: }
+#pragma acc atomic update
+  f--;
+
+  // CHECK-NEXT: acc.atomic.update %[[X_ALLOCA]] : !cir.ptr<!s32i> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !s32i{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !s32i, !cir.ptr<!s32i>
+  //
+  // CHECK-NEXT: %[[F_LOAD:.*]] = cir.load{{.*}} %[[F_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!s32i>, !s32i
+  // CHECK-NEXT: %[[INT_TO_F:.*]] = cir.cast int_to_float %[[TEMP_LOAD]] : !s32i -> !cir.float
+  // CHECK-NEXT: %[[ADD:.*]] = cir.binop(add, %[[INT_TO_F]], %[[F_LOAD]]) : !cir.float
+  // CHECK-NEXT: %[[F_TO_INT:.*]] = cir.cast float_to_int %[[ADD]] : !cir.float -> !s32i
+  // CHECK-NEXT: cir.store{{.*}} %[[F_TO_INT]], %[[TEMP_ALLOCA]] : !s32i, !cir.ptr<!s32i>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!s32i>, !s32i
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !s32i
+  // CHECK-NEXT: }
+#pragma acc atomic update
+  x += f;
+
+  // CHECK-NEXT: acc.atomic.update %[[F_ALLOCA]] : !cir.ptr<!cir.float> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !cir.float{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !cir.float, !cir.ptr<!cir.float>
+  //
+  // CHECK-NEXT: %[[Y_LOAD:.*]] = cir.load{{.*}} %[[Y_ALLOCA]] : !cir.ptr<!u32i>, !u32i
+  // CHECK-NEXT: %[[INT_TO_F:.*]] = cir.cast int_to_float %[[Y_LOAD]] : !u32i -> !cir.float
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: %[[DIV:.*]] = cir.binop(div, %[[TEMP_LOAD]], %[[INT_TO_F]]) : !cir.float
+  // CHECK-NEXT: cir.store{{.*}} %[[DIV]], %[[TEMP_ALLOCA]] : !cir.float, !cir.ptr<!cir.float>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !cir.float
+  // CHECK-NEXT: }
+#pragma acc atomic update
+  f /= y;
+
+  // CHECK-NEXT: acc.atomic.update %[[Y_ALLOCA]] : !cir.ptr<!u32i> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !u32i{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !u32i, !cir.ptr<!u32i>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !u32i, !cir.ptr<!u32i>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!u32i>, !u32i
+  // CHECK-NEXT: %[[CALL:.*]] = cir.call {{.*}}(%[[OPS_ALLOCA]]) : (!cir.ptr<!rec_HasOps>) -> !s32i
+  // CHECK-NEXT: %[[CALL_CAST:.*]] = cir.cast integral %[[CALL]] : !s32i -> !u32i
+  // CHECK-NEXT: %[[MUL:.*]] = cir.binop(mul, %[[TEMP_LOAD]], %[[CALL_CAST]]) : !u32i
+  // CHECK-NEXT: cir.store{{.*}} %[[MUL]], %[[TEMP_ALLOCA]] : !u32i, !cir.ptr<!u32i>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!u32i>, !u32i
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !u32i
+  // CHECK-NEXT: }
+
+#pragma acc atomic update
+  y = y * ops.thing();
+
+  // CHECK-NEXT: acc.atomic.update %[[X_ALLOCA]] : !cir.ptr<!s32i> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !s32i{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !s32i, !cir.ptr<!s32i>
+  //
+  // CHECK-NEXT: %[[CALL:.*]] = cir.call {{.*}}(%[[OPS_ALLOCA]]) : (!cir.ptr<!rec_HasOps>) -> !s32i
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!s32i>, !s32i
+  // CHECK-NEXT: %[[OR:.*]] = cir.binop(or, %[[CALL]], %[[TEMP_LOAD]]) : !s32i
+  // CHECK-NEXT: cir.store{{.*}} %[[OR]], %[[TEMP_ALLOCA]] : !s32i, !cir.ptr<!s32i>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!s32i>, !s32i
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !s32i
+  // CHECK-NEXT: }
+#pragma acc atomic update
+  x = ops.thing() | x;
+
+  // CHECK-NEXT: %[[X_LOAD:.*]] = cir.load{{.*}} %[[X_ALLOCA]] : !cir.ptr<!s32i>, !s32i
+  // CHECK-NEXT: %[[BOOL_CAST:.*]] = cir.cast int_to_bool %[[X_LOAD]] : !s32i -> !cir.bool
+  // CHECK-NEXT: %[[X_CAST:.*]] = builtin.unrealized_conversion_cast %[[BOOL_CAST]] : !cir.bool to i1
+  // CHECK-NEXT: acc.atomic.update if(%[[X_CAST]]) %[[F_ALLOCA]] : !cir.ptr<!cir.float> {
+  // CHECK-NEXT: ^bb0(%[[RECIPE_ARG:.*]]: !cir.float{{.*}}):
+  // CHECK-NEXT: %[[TEMP_ALLOCA:.*]] = cir.alloca !cir.float, !cir.ptr<!cir.float>, ["x_var", init]
+  // CHECK-NEXT: cir.store %[[RECIPE_ARG]], %[[TEMP_ALLOCA]] : !cir.float, !cir.ptr<!cir.float>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: %[[CALL:.*]] = cir.call {{.*}}(%[[OPS_ALLOCA]]) : (!cir.ptr<!rec_HasOps>) -> !cir.float
+  // CHECK-NEXT: %[[SUB:.*]] = cir.binop(sub, %[[TEMP_LOAD]], %[[CALL]]) : !cir.float
+  // CHECK-NEXT: cir.store{{.*}} %[[SUB]], %[[TEMP_ALLOCA]] : !cir.float, !cir.ptr<!cir.float>
+  //
+  // CHECK-NEXT: %[[TEMP_LOAD:.*]] = cir.load{{.*}} %[[TEMP_ALLOCA]] : !cir.ptr<!cir.float>, !cir.float
+  // CHECK-NEXT: acc.yield %[[TEMP_LOAD]] : !cir.float
+  // CHECK-NEXT: }
+#pragma acc atomic update if (x)
+  f = f - ops;
+}
diff --git a/clang/test/CIR/CodeGenOpenACC/openacc-not-implemented.cpp b/clang/test/CIR/CodeGenOpenACC/openacc-not-implemented.cpp
index 33e12fe1cd833..b4d76e18bf345 100644
--- a/clang/test/CIR/CodeGenOpenACC/openacc-not-implemented.cpp
+++ b/clang/test/CIR/CodeGenOpenACC/openacc-not-implemented.cpp
@@ -3,8 +3,8 @@
 void HelloWorld(int *A, int *B, int *C, int N) {
 
 // expected-error@+1{{ClangIR code gen Not Yet Implemented: OpenACC Atomic Construct}}
-#pragma acc atomic
-  N = N + 1;
+#pragma acc atomic capture
+  B = A += ++N;
 
 // expected-error@+1{{ClangIR code gen Not Yet Implemented: OpenACC Declare Construct}}
 #pragma acc declare create(A)
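
The recipe shape that the lowering above builds (alloca inside the recipe, store the incoming value, emit the user's statement against that alloca, then load and yield) can be modeled in plain C++. This is only a rough sketch under stated assumptions: `runRecipe`, `atomicUpdate`, and `demoIncrementThenAddFloat` are hypothetical names, and the compare-exchange loop is just one way a recipe could be applied atomically, not something the PR or the ACC dialect specifies.

```cpp
#include <atomic>

// Hypothetical stand-in for the acc.atomic.update recipe region: it receives
// the current VALUE of x and returns the updated value.
template <typename T, typename Stmt>
T runRecipe(T incoming, Stmt updateStmt) {
  T xVar = incoming;  // models the "x_var" alloca plus the store of the block argument
  updateStmt(xVar);   // the user's statement, with x remapped to the temporary
  return xVar;        // models the final load + acc.yield
}

// Assumption: the update is made atomic with a compare-exchange loop.
template <typename T, typename Stmt>
void atomicUpdate(std::atomic<T> &x, Stmt updateStmt) {
  T old = x.load();
  while (!x.compare_exchange_weak(old, runRecipe(old, updateStmt))) {
    // A failed exchange refreshes 'old'; the recipe is re-run on the new value.
  }
}

// Usage sketch mirroring two of the test cases: '++x' and 'x += f'.
int demoIncrementThenAddFloat() {
  std::atomic<int> x{5};
  float f = 2.5f;
  atomicUpdate(x, [](int &v) { ++v; });
  // Same conversion sequence as the test: int_to_float, add, float_to_int.
  atomicUpdate(x, [f](int &v) {
    v = static_cast<int>(static_cast<float>(v) + f);
  });
  return x.load();
}
```

Because the recipe only sees a value and returns a value, the statement itself never touches the shared variable directly, which is what remapping `x` to the temporary achieves in the lowering.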

bcardosolopes

LGTM % ultra nit

clang/lib/AST/StmtOpenACC.cpp

andykaylor

Looks good, with a few nits

andykaylor · 2025-10-24T17:44:17Z

clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp

+
+  E = E->IgnoreImpCasts();
+
+  const auto *DRE = cast<DeclRefExpr>(E);


Suggested change

-  const auto *DRE = cast<DeclRefExpr>(E);
+  const auto *dre = cast<DeclRefExpr>(e);

I'm sure you must love flipping between two coding styles in the same PR. 😄

erichkeane: As you can tell, it gets me all the time :D

andykaylor · 2025-10-24T17:45:17Z

clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp

+    mlir::LogicalResult res = mlir::success();
+    {
+      mlir::Type argTy = cast<cir::PointerType>(x.getType()).getPointee();
+      mlir::OpBuilder::InsertionGuard guardCase(builder);


It feels kind of weird for this not to be the first thing in the scope. Any reason for that?

erichkeane: Nope, good idea.
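
The point about putting the guard first can be illustrated with a small mock. This is a sketch only: `MockBuilder`, `MockInsertionGuard`, and `emitRecipe` are hypothetical stand-ins that imitate the save/restore behavior of `mlir::OpBuilder::InsertionGuard`, not the MLIR API itself.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified builder: the "insertion point" is just the index
// of the block that new operations are appended to.
struct MockBuilder {
  std::vector<std::vector<std::string>> blocks;
  std::size_t insertBlock = 0;

  MockBuilder() : blocks(1) {}

  // Creates a block and moves the insertion point into it.
  std::size_t createBlock() {
    blocks.emplace_back();
    insertBlock = blocks.size() - 1;
    return insertBlock;
  }
  void emit(const std::string &op) { blocks[insertBlock].push_back(op); }
};

// RAII guard: remembers the insertion point on construction and restores it
// on destruction.
class MockInsertionGuard {
public:
  explicit MockInsertionGuard(MockBuilder &b)
      : builder(b), saved(b.insertBlock) {}
  ~MockInsertionGuard() { builder.insertBlock = saved; }
  MockInsertionGuard(const MockInsertionGuard &) = delete;
  MockInsertionGuard &operator=(const MockInsertionGuard &) = delete;

private:
  MockBuilder &builder;
  std::size_t saved;
};

void emitRecipe(MockBuilder &builder) {
  {
    // First statement in the scope, so everything below is covered by it.
    MockInsertionGuard guard(builder);
    builder.createBlock();
    builder.emit("alloca");
    builder.emit("store");
    builder.emit("load");
    builder.emit("yield");
  } // insertion point is restored here
  builder.emit("after");
}
```

Declaring the guard before anything else in the scope makes it obvious that no later statement can move the insertion point without it being undone at the closing brace.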

andykaylor · 2025-10-24T17:45:43Z

clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp

+      mlir::OpBuilder::InsertionGuard guardCase(builder);
+      std::array<mlir::Type, 1> recipeType{argTy};
+      std::array<mlir::Location, 1> recipeLoc{start};
+      auto *recipeBlock = builder.createBlock(


Suggested change

-      auto *recipeBlock = builder.createBlock(
+      mlir::Block *recipeBlock = builder.createBlock(


erichkeane requested review from andykaylor, bcardosolopes, lanza and xlauko as code owners October 23, 2025 15:36

llvmbot added the clang, clang:frontend, and ClangIR labels Oct 23, 2025

bcardosolopes approved these changes Oct 23, 2025


clang/lib/AST/StmtOpenACC.cpp (outdated, resolved)

Remove stale comment

7e0e63c

andykaylor approved these changes Oct 24, 2025


Fix Andy's suggestions

abc6029

erichkeane enabled auto-merge (squash) October 24, 2025 18:20

Fix the variable name I didn't lowercase

a2bf2cf

erichkeane merged commit f7b4018 into llvm:main Oct 24, 2025
7 of 9 checks passed
