@andykaylor
Contributor

This implements handling to destroy global arrays that require destruction. Unlike classic codegen, CIR emits the destructor loop into a 'dtor' region associated with the global array variable. Later, during LoweringPrepare, this code is moved into a helper function and a call to __cxa_atexit arranges for it to be called during the shared object shutdown.
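
As a rough illustration of the end result (not the code CIR actually generates), the arrangement can be modeled in plain C++. Everything named here is hypothetical: `Elem`, `arrayDtorHelper`, `registerAtExit`, `runAtExit`, and `runDemo` are made up for the sketch, and the small registry only stands in for `__cxa_atexit`, which also takes a `__dso_handle` argument and is driven by the runtime rather than by user code.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Records the order in which elements are destroyed (illustration only).
static std::vector<int> destroyedIds;

// Hypothetical element type with a non-trivial destructor.
struct Elem {
  int id;
  explicit Elem(int i) : id(i) {}
  ~Elem() { destroyedIds.push_back(id); }
};

constexpr std::size_t kCount = 4;

// Raw storage standing in for the global array.
alignas(Elem) static unsigned char storage[kCount * sizeof(Elem)];

// Stand-in for the generated helper: it receives the array address as a
// void* and destroys the elements from last to first.
static void arrayDtorHelper(void *arr) {
  assert(arr && "expected array address");
  Elem *begin = static_cast<Elem *>(arr);
  for (std::size_t i = kCount; i-- > 0;)
    begin[i].~Elem();
}

// Toy registry standing in for __cxa_atexit (assumption: the real runtime
// also takes a DSO handle and runs the entries at shared object shutdown).
struct AtExitEntry {
  void (*fn)(void *);
  void *arg;
};
static std::vector<AtExitEntry> atExitEntries;

static int registerAtExit(void (*fn)(void *), void *arg) {
  atExitEntries.push_back({fn, arg});
  return 0;
}

// Runs the registered entries in reverse order of registration.
static void runAtExit() {
  while (!atExitEntries.empty()) {
    AtExitEntry entry = atExitEntries.back();
    atExitEntries.pop_back();
    entry.fn(entry.arg);
  }
}

// Constructs the array, registers the helper, then simulates shutdown.
static std::vector<int> runDemo() {
  destroyedIds.clear();
  Elem *first = nullptr;
  for (std::size_t i = 0; i < kCount; ++i) {
    Elem *elem = new (storage + i * sizeof(Elem)) Elem(static_cast<int>(i));
    if (i == 0)
      first = elem;
  }
  registerAtExit(arrayDtorHelper, first);
  runAtExit();
  return destroyedIds;
}
```

Calling `runDemo()` returns the ids in the order the destructors ran. Passing the array address through the `void *` argument mirrors the CIR-lowered helper in this patch, whereas classic codegen references the global directly inside the helper.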

@llvmbot added the clang (Clang issues not falling into any other category) and ClangIR (Anything related to the ClangIR project) labels Nov 21, 2025
@llvmbot
Member

llvmbot commented Nov 21, 2025

@llvm/pr-subscribers-clangir

Author: Andy Kaylor (andykaylor)

Changes

This implements handling to destroy global arrays that require destruction. Unlike classic codegen, CIR emits the destructor loop into a 'dtor' region associated with the global array variable. Later, during LoweringPrepare, this code is moved into a helper function and a call to __cxa_atexit arranges for it to be called during the shared object shutdown.


Full diff: https://github.com/llvm/llvm-project/pull/169070.diff

3 Files Affected:

  • (modified) clang/lib/CIR/CodeGen/CIRGenCXX.cpp (+13-3)
  • (modified) clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp (+110-11)
  • (modified) clang/test/CIR/CodeGen/global-init.cpp (+101-5)
diff --git a/clang/lib/CIR/CodeGen/CIRGenCXX.cpp b/clang/lib/CIR/CodeGen/CIRGenCXX.cpp
index a3e20817d2ca4..bfd0481073788 100644
--- a/clang/lib/CIR/CodeGen/CIRGenCXX.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenCXX.cpp
@@ -139,11 +139,21 @@ static void emitDeclDestroy(CIRGenFunction &cgf, const VarDecl *vd,
         cgf.getLoc(vd->getSourceRange()),
         mlir::FlatSymbolRefAttr::get(fnOp.getSymNameAttr()),
         mlir::ValueRange{cgm.getAddrOfGlobalVar(vd)});
+    assert(fnOp && "expected cir.func");
+    // TODO(cir): This doesn't do anything but check for unhandled conditions.
+    // What it is meant to do should really be happening in LoweringPrepare.
+    cgm.getCXXABI().registerGlobalDtor(vd, fnOp, nullptr);
   } else {
-    cgm.errorNYI(vd->getSourceRange(), "array destructor");
+    // Otherwise, a custom destroy is needed. Classic codegen creates a helper
+    // function here and emits the destroy into the helper function, which is
+    // called from __cxa_atexit.
+    // In CIR, we just emit the destroy into the dtor region. It will be moved
+    // into a separate function during the LoweringPrepare pass.
+    mlir::Value globalVal = cgf.getBuilder().createGetGlobal(addr);
+    CharUnits alignment = cgf.getContext().getDeclAlign(vd);
+    Address globalAddr{globalVal, cgf.convertTypeForMem(type), alignment};
+    cgf.emitDestroy(globalAddr, type, cgf.getDestroyer(dtorKind));
   }
-  assert(fnOp && "expected cir.func");
-  cgm.getCXXABI().registerGlobalDtor(vd, fnOp, nullptr);
 
   builder.setInsertionPointToEnd(block);
   if (block->empty()) {
diff --git a/clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp b/clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp
index 29b1211d2c351..91d817e09bfbb 100644
--- a/clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp
+++ b/clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp
@@ -76,6 +76,11 @@ struct LoweringPreparePass
   /// Build the function that initializes the specified global
   cir::FuncOp buildCXXGlobalVarDeclInitFunc(cir::GlobalOp op);
 
+  /// Handle the dtor region by registering destructor with __cxa_atexit
+  cir::FuncOp getOrCreateDtorFunc(CIRBaseBuilderTy &builder, cir::GlobalOp op,
+                                  mlir::Region &dtorRegion,
+                                  cir::CallOp &dtorCall);
+
   /// Build a module init function that calls all the dynamic initializers.
   void buildCXXGlobalInitFunc();
 
@@ -691,6 +696,101 @@ void LoweringPreparePass::lowerUnaryOp(cir::UnaryOp op) {
   op.erase();
 }
 
+cir::FuncOp LoweringPreparePass::getOrCreateDtorFunc(CIRBaseBuilderTy &builder,
+                                                     cir::GlobalOp op,
+                                                     mlir::Region &dtorRegion,
+                                                     cir::CallOp &dtorCall) {
+  assert(!cir::MissingFeatures::astVarDeclInterface());
+  assert(!cir::MissingFeatures::opGlobalThreadLocal());
+
+  cir::VoidType voidTy = builder.getVoidTy();
+  auto voidPtrTy = cir::PointerType::get(voidTy);
+
+  // Look for operations in dtorBlock
+  mlir::Block &dtorBlock = dtorRegion.front();
+
+  // The first operation should be a get_global to retrieve the address
+  // of the global variable we're destroying.
+  auto opIt = dtorBlock.getOperations().begin();
+  cir::GetGlobalOp ggop = mlir::cast<cir::GetGlobalOp>(*opIt);
+
+  // The simple case is just a call to a destructor, like this:
+  //
+  //   %0 = cir.get_global @globalS : !cir.ptr<!rec_S>
+  //   cir.call @_ZN1SD1Ev(%0) : (!cir.ptr<!rec_S>) -> ()
+  //   (implicit cir.yield)
+  //
+  // That is, if the second operation is a call that takes the get_global result
+  // as its only operand, and the only other operation is a yield, then we can
+  // just return the called function.
+  if (dtorBlock.getOperations().size() == 3) {
+    auto callOp = mlir::dyn_cast<cir::CallOp>(&*(++opIt));
+    auto yieldOp = mlir::dyn_cast<cir::YieldOp>(&*(++opIt));
+    if (yieldOp && callOp && callOp.getNumOperands() == 1 &&
+        callOp.getArgOperand(0) == ggop) {
+      dtorCall = callOp;
+      return getCalledFunction(callOp);
+    }
+  }
+
+  // Otherwise, we need to create a helper function to replace the dtor region.
+  // This name is kind of arbitrary, but it matches the name that classic
+  // codegen uses, based on the expected case that gets us here.
+  builder.setInsertionPointAfter(op);
+  SmallString<256> fnName("__cxx_global_array_dtor");
+  uint32_t cnt = dynamicInitializerNames[fnName]++;
+  if (cnt)
+    fnName += "." + llvm::Twine(cnt).str();
+
+  // Create the helper function.
+  auto fnType = cir::FuncType::get({voidPtrTy}, voidTy);
+  cir::FuncOp dtorFunc =
+      buildRuntimeFunction(builder, fnName, op.getLoc(), fnType,
+                           cir::GlobalLinkageKind::InternalLinkage);
+  mlir::Block *entryBB = dtorFunc.addEntryBlock();
+
+  // Move everything from the dtor region into the helper function.
+  entryBB->getOperations().splice(entryBB->begin(), dtorBlock.getOperations(),
+                                  dtorBlock.begin(), dtorBlock.end());
+
+  // Before erasing this, clone it back into the dtor region
+  cir::GetGlobalOp dtorGGop =
+      mlir::cast<cir::GetGlobalOp>(entryBB->getOperations().front());
+  builder.setInsertionPointToStart(&dtorBlock);
+  builder.clone(*dtorGGop.getOperation());
+
+  // Replace all uses of the helper function's get_global with the function
+  // argument.
+  mlir::Value dtorArg = entryBB->getArgument(0);
+  dtorGGop.replaceAllUsesWith(dtorArg);
+  dtorGGop.erase();
+
+  // Replace the yield in the final block with a return
+  mlir::Block &finalBlock = dtorFunc.getBody().back();
+  auto yieldOp = cast<cir::YieldOp>(finalBlock.getTerminator());
+  builder.setInsertionPoint(yieldOp);
+  cir::ReturnOp::create(builder, yieldOp->getLoc());
+  yieldOp->erase();
+
+  // Create a call to the helper function, passing the original get_global op
+  // as the argument.
+  cir::GetGlobalOp origGGop =
+      mlir::cast<cir::GetGlobalOp>(dtorBlock.getOperations().front());
+  builder.setInsertionPointAfter(origGGop);
+  mlir::Value ggopResult = origGGop.getResult();
+  dtorCall = builder.createCallOp(op.getLoc(), dtorFunc, ggopResult);
+
+  // Add a yield after the call.
+  auto finalYield = cir::YieldOp::create(builder, op.getLoc());
+
+  // Erase everything after the yield.
+  dtorBlock.getOperations().erase(std::next(mlir::Block::iterator(finalYield)),
+                                  dtorBlock.end());
+  dtorRegion.getBlocks().erase(std::next(dtorRegion.begin()), dtorRegion.end());
+
+  return dtorFunc;
+}
+
 cir::FuncOp
 LoweringPreparePass::buildCXXGlobalVarDeclInitFunc(cir::GlobalOp op) {
   // TODO(cir): Store this in the GlobalOp.
@@ -722,22 +822,20 @@ LoweringPreparePass::buildCXXGlobalVarDeclInitFunc(cir::GlobalOp op) {
   if (!dtorRegion.empty()) {
     assert(!cir::MissingFeatures::astVarDeclInterface());
     assert(!cir::MissingFeatures::opGlobalThreadLocal());
+
     // Create a variable that binds the atexit to this shared object.
     builder.setInsertionPointToStart(&mlirModule.getBodyRegion().front());
     cir::GlobalOp handle = buildRuntimeVariable(
         builder, "__dso_handle", op.getLoc(), builder.getI8Type(),
         cir::GlobalLinkageKind::ExternalLinkage, cir::VisibilityKind::Hidden);
 
-    // Look for the destructor call in dtorBlock
-    mlir::Block &dtorBlock = dtorRegion.front();
+    // If this is a simple call to a destructor, get the called function.
+    // Otherwise, create a helper function for the entire dtor region,
+    // replacing the current dtor region body with a call to the helper
+    // function.
     cir::CallOp dtorCall;
-    for (auto op : reverse(dtorBlock.getOps<cir::CallOp>())) {
-      dtorCall = op;
-      break;
-    }
-    assert(dtorCall && "Expected a dtor call");
-    cir::FuncOp dtorFunc = getCalledFunction(dtorCall);
-    assert(dtorFunc && "Expected a dtor call");
+    cir::FuncOp dtorFunc =
+        getOrCreateDtorFunc(builder, op, dtorRegion, dtorCall);
 
     // Create a runtime helper function:
     //    extern "C" int __cxa_atexit(void (*f)(void *), void *p, void *d);
@@ -751,8 +849,8 @@ LoweringPreparePass::buildCXXGlobalVarDeclInitFunc(cir::GlobalOp op) {
     cir::FuncOp fnAtExit =
         buildRuntimeFunction(builder, nameAtExit, op.getLoc(), fnAtExitType);
 
-    // Replace the dtor call with a call to __cxa_atexit(&dtor, &var,
-    // &__dso_handle)
+    // Replace the dtor (or helper) call with a call to
+    //   __cxa_atexit(&dtor, &var, &__dso_handle)
     builder.setInsertionPointAfter(dtorCall);
     mlir::Value args[3];
     auto dtorPtrTy = cir::PointerType::get(dtorFunc.getFunctionType());
@@ -768,6 +866,7 @@ LoweringPreparePass::buildCXXGlobalVarDeclInitFunc(cir::GlobalOp op) {
                                        handle.getSymName());
     builder.createCallOp(dtorCall.getLoc(), fnAtExit, args);
     dtorCall->erase();
+    mlir::Block &dtorBlock = dtorRegion.front();
     entryBB->getOperations().splice(entryBB->end(), dtorBlock.getOperations(),
                                     dtorBlock.begin(),
                                     std::prev(dtorBlock.end()));
diff --git a/clang/test/CIR/CodeGen/global-init.cpp b/clang/test/CIR/CodeGen/global-init.cpp
index 01e2868278514..3510e3e82f4e8 100644
--- a/clang/test/CIR/CodeGen/global-init.cpp
+++ b/clang/test/CIR/CodeGen/global-init.cpp
@@ -15,12 +15,14 @@
 // LLVM: @needsCtor = global %struct.NeedsCtor zeroinitializer, align 1
 // LLVM: @needsDtor = global %struct.NeedsDtor zeroinitializer, align 1
 // LLVM: @needsCtorDtor = global %struct.NeedsCtorDtor zeroinitializer, align 1
+// LLVM: @arrDtor = global [16 x %struct.ArrayDtor] zeroinitializer, align 16
 // LLVM: @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__sub_I_[[FILENAME:.*]], ptr null }]
 
 // OGCG: @needsCtor = global %struct.NeedsCtor zeroinitializer, align 1
 // OGCG: @needsDtor = global %struct.NeedsDtor zeroinitializer, align 1
 // OGCG: @__dso_handle = external hidden global i8
 // OGCG: @needsCtorDtor = global %struct.NeedsCtorDtor zeroinitializer, align 1
+// OGCG: @arrDtor = global [16 x %struct.ArrayDtor] zeroinitializer, align 16
 // OGCG: @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__sub_I_[[FILENAME:.*]], ptr null }]
 
 struct NeedsCtor {
@@ -145,11 +147,11 @@ float fp;
 int i = (int)fp;
 
 // CIR-BEFORE-LPP: cir.global external @i = ctor : !s32i {
-// CIR-BEFORE-LPP:   %0 = cir.get_global @i : !cir.ptr<!s32i>
-// CIR-BEFORE-LPP:   %1 = cir.get_global @fp : !cir.ptr<!cir.float>
-// CIR-BEFORE-LPP:   %2 = cir.load{{.*}} %1 : !cir.ptr<!cir.float>, !cir.float
-// CIR-BEFORE-LPP:   %3 = cir.cast float_to_int %2 : !cir.float -> !s32i
-// CIR-BEFORE-LPP:   cir.store{{.*}} %3, %0 : !s32i, !cir.ptr<!s32i>
+// CIR-BEFORE-LPP:   %[[I:.*]] = cir.get_global @i : !cir.ptr<!s32i>
+// CIR-BEFORE-LPP:   %[[FP:.*]] = cir.get_global @fp : !cir.ptr<!cir.float>
+// CIR-BEFORE-LPP:   %[[FP_VAL:.*]] = cir.load{{.*}} %[[FP]] : !cir.ptr<!cir.float>, !cir.float
+// CIR-BEFORE-LPP:   %[[FP_I32:.*]] = cir.cast float_to_int %[[FP_VAL]] : !cir.float -> !s32i
+// CIR-BEFORE-LPP:   cir.store{{.*}} %[[FP_I32]], %[[I]] : !s32i, !cir.ptr<!s32i>
 // CIR-BEFORE-LPP: }
 
 // CIR: cir.func internal private @__cxx_global_var_init.4()
@@ -169,6 +171,97 @@ int i = (int)fp;
 // OGCG:   %[[FP_I32:.*]] = fptosi float %[[TMP_FP]] to i32
 // OGCG:   store i32 %[[FP_I32]], ptr @i, align 4
 
+struct ArrayDtor {
+  ~ArrayDtor();
+};
+
+ArrayDtor arrDtor[16];
+
+// CIR-BEFORE-LPP:      cir.global external @arrDtor = #cir.zero : !cir.array<!rec_ArrayDtor x 16>
+// CIR-BEFORE-LPP-SAME:   dtor {
+// CIR-BEFORE-LPP:          %[[THIS:.*]] = cir.get_global @arrDtor : !cir.ptr<!cir.array<!rec_ArrayDtor x 16>>
+// CIR-BEFORE-LPP:          cir.array.dtor %[[THIS]] : !cir.ptr<!cir.array<!rec_ArrayDtor x 16>> {
+// CIR-BEFORE-LPP:          ^bb0(%[[ELEM:.*]]: !cir.ptr<!rec_ArrayDtor>):
+// CIR-BEFORE-LPP:            cir.call @_ZN9ArrayDtorD1Ev(%[[ELEM]]) nothrow : (!cir.ptr<!rec_ArrayDtor>) -> ()
+// CIR-BEFORE-LPP:            cir.yield
+// CIR-BEFORE-LPP:          }
+// CIR-BEFORE-LPP:        }
+
+// CIR: cir.global external @arrDtor = #cir.zero : !cir.array<!rec_ArrayDtor x 16> {alignment = 16 : i64}
+// CIR: cir.func internal private @__cxx_global_array_dtor(%[[ARR_ARG:.*]]: !cir.ptr<!void> {{.*}}) {
+// CIR:   %[[CONST15:.*]] = cir.const #cir.int<15> : !u64i
+// CIR:   %[[BEGIN:.*]] = cir.cast array_to_ptrdecay %[[ARR_ARG]] : !cir.ptr<!void> -> !cir.ptr<!rec_ArrayDtor>
+// CIR:   %[[END:.*]] = cir.ptr_stride %[[BEGIN]], %[[CONST15]] : (!cir.ptr<!rec_ArrayDtor>, !u64i) -> !cir.ptr<!rec_ArrayDtor>
+// CIR:   %[[CUR_ADDR:.*]] = cir.alloca !cir.ptr<!rec_ArrayDtor>, !cir.ptr<!cir.ptr<!rec_ArrayDtor>>, ["__array_idx"]
+// CIR:   cir.store %[[END]], %[[CUR_ADDR]] : !cir.ptr<!rec_ArrayDtor>, !cir.ptr<!cir.ptr<!rec_ArrayDtor>>
+// CIR:   cir.do {
+// CIR:     %[[CUR:.*]] = cir.load %[[CUR_ADDR]] : !cir.ptr<!cir.ptr<!rec_ArrayDtor>>, !cir.ptr<!rec_ArrayDtor>
+// CIR:     cir.call @_ZN9ArrayDtorD1Ev(%[[CUR]]) nothrow : (!cir.ptr<!rec_ArrayDtor>) -> ()
+// CIR:     %[[NEG_ONE:.*]] = cir.const #cir.int<-1> : !s64i
+// CIR:     %[[NEXT:.*]] = cir.ptr_stride %[[CUR]], %[[NEG_ONE]] : (!cir.ptr<!rec_ArrayDtor>, !s64i) -> !cir.ptr<!rec_ArrayDtor>
+// CIR:     cir.store %[[NEXT]], %[[CUR_ADDR]] : !cir.ptr<!rec_ArrayDtor>, !cir.ptr<!cir.ptr<!rec_ArrayDtor>>
+// CIR:     cir.yield
+// CIR:   } while {
+// CIR:     %[[CUR:.*]] = cir.load %[[CUR_ADDR]] : !cir.ptr<!cir.ptr<!rec_ArrayDtor>>, !cir.ptr<!rec_ArrayDtor>
+// CIR:     %[[CMP:.*]] = cir.cmp(ne, %[[CUR]], %[[BEGIN]]) : !cir.ptr<!rec_ArrayDtor>, !cir.bool
+// CIR:     cir.condition(%[[CMP]])
+// CIR:   }
+// CIR:   cir.return
+// CIR: }
+//
+// CIR: cir.func internal private @__cxx_global_var_init.5() {
+// CIR:   %[[ARR:.*]] = cir.get_global @arrDtor : !cir.ptr<!cir.array<!rec_ArrayDtor x 16>>
+// CIR:   %[[DTOR:.*]] = cir.get_global @__cxx_global_array_dtor : !cir.ptr<!cir.func<(!cir.ptr<!void>)>>
+// CIR:   %[[DTOR_CAST:.*]] = cir.cast bitcast %[[DTOR]] : !cir.ptr<!cir.func<(!cir.ptr<!void>)>> -> !cir.ptr<!cir.func<(!cir.ptr<!void>)>>
+// CIR:   %[[ARR_CAST:.*]] = cir.cast bitcast %[[ARR]] : !cir.ptr<!cir.array<!rec_ArrayDtor x 16>> -> !cir.ptr<!void>
+// CIR:   %[[HANDLE:.*]] = cir.get_global @__dso_handle : !cir.ptr<i8>
+// CIR:   cir.call @__cxa_atexit(%[[DTOR_CAST]], %[[ARR_CAST]], %[[HANDLE]]) : (!cir.ptr<!cir.func<(!cir.ptr<!void>)>>, !cir.ptr<!void>, !cir.ptr<i8>) -> ()
+
+// LLVM: define internal void @__cxx_global_array_dtor(ptr %[[ARR_ARG:.*]]) {
+// LLVM:   %[[BEGIN:.*]] = getelementptr %struct.ArrayDtor, ptr %[[ARR_ARG]], i32 0
+// LLVM:   %[[END:.*]] = getelementptr %struct.ArrayDtor, ptr %[[BEGIN]], i64 15
+// LLVM:   %[[CUR_ADDR:.*]] = alloca ptr
+// LLVM:   store ptr %[[END]], ptr %[[CUR_ADDR]]
+// LLVM:   br label %[[LOOP_BODY:.*]]
+// LLVM: [[LOOP_COND:.*]]:
+// LLVM:   %[[CUR:.*]] = load ptr, ptr %[[CUR_ADDR]]
+// LLVM:   %[[CMP:.*]] = icmp ne ptr %[[CUR]], %[[BEGIN]]
+// LLVM:   br i1 %[[CMP]], label %[[LOOP_BODY]], label %[[LOOP_END:.*]]
+// LLVM: [[LOOP_BODY]]:
+// LLVM:   %[[CUR:.*]] = load ptr, ptr %[[CUR_ADDR]]
+// LLVM:   call void @_ZN9ArrayDtorD1Ev(ptr %[[CUR]]) #0
+// LLVM:   %[[PREV:.*]] = getelementptr %struct.ArrayDtor, ptr %[[CUR]], i64 -1
+// LLVM:   store ptr %[[PREV]], ptr %[[CUR_ADDR]]
+// LLVM:   br label %[[LOOP_COND]]
+// LLVM: [[LOOP_END]]:
+// LLVM:   ret void
+// LLVM: }
+//
+// LLVM: define internal void @__cxx_global_var_init.5() {
+// LLVM:   call void @__cxa_atexit(ptr @__cxx_global_array_dtor, ptr @arrDtor, ptr @__dso_handle)
+
+// Note: OGCG defines these functions in reverse order of CIR->LLVM.
+// Note also: OGCG doesn't pass the address of the array to the destructor function.
+//            Instead, it uses the global directly in the helper function.
+
+// OGCG: define internal void @__cxx_global_var_init.5() {{.*}} section ".text.startup" {
+// OGCG:   call i32 @__cxa_atexit(ptr @__cxx_global_array_dtor, ptr null, ptr @__dso_handle)
+
+// OGCG: define internal void @__cxx_global_array_dtor(ptr noundef %[[ARG:.*]]) {{.*}} section ".text.startup" {
+// OGCG: entry:
+// OGCG:   %[[UNUSED_ADDR:.*]] = alloca ptr
+// OGCG:   store ptr %[[ARG]], ptr %[[UNUSED_ADDR]]
+// OGCG:   br label %[[LOOP_BODY:.*]]
+// OGCG: [[LOOP_BODY]]:
+// OGCG:   %[[PREV:.*]] = phi ptr [ getelementptr inbounds (%struct.ArrayDtor, ptr @arrDtor, i64 16), %entry ], [ %[[CUR:.*]], %[[LOOP_BODY]] ]
+// OGCG:   %[[CUR]] = getelementptr inbounds %struct.ArrayDtor, ptr %[[PREV]], i64 -1
+// OGCG:   call void @_ZN9ArrayDtorD1Ev(ptr noundef nonnull align 1 dereferenceable(1) %[[CUR]])
+// OGCG:   %[[DONE:.*]] = icmp eq ptr %[[CUR]], @arrDtor
+// OGCG:   br i1 %[[DONE]], label %[[LOOP_END:.*]], label %[[LOOP_BODY]]
+// OGCG: [[LOOP_END]]:
+// OGCG:   ret void
+// OGCG: }
+
 // Common init function for all globals with default priority
 
 // CIR: cir.func private @_GLOBAL__sub_I_[[FILENAME:.*]]() {
@@ -177,6 +270,7 @@ int i = (int)fp;
 // CIR:   cir.call @__cxx_global_var_init.2() : () -> ()
 // CIR:   cir.call @__cxx_global_var_init.3() : () -> ()
 // CIR:   cir.call @__cxx_global_var_init.4() : () -> ()
+// CIR:   cir.call @__cxx_global_var_init.5() : () -> ()
 
 // LLVM: define void @_GLOBAL__sub_I_[[FILENAME]]()
 // LLVM:   call void @__cxx_global_var_init()
@@ -184,6 +278,7 @@ int i = (int)fp;
 // LLVM:   call void @__cxx_global_var_init.2()
 // LLVM:   call void @__cxx_global_var_init.3()
 // LLVM:   call void @__cxx_global_var_init.4()
+// LLVM:   call void @__cxx_global_var_init.5()
 
 // OGCG: define internal void @_GLOBAL__sub_I_[[FILENAME]]() {{.*}} section ".text.startup" {
 // OGCG:   call void @__cxx_global_var_init()
@@ -191,3 +286,4 @@ int i = (int)fp;
 // OGCG:   call void @__cxx_global_var_init.2()
 // OGCG:   call void @__cxx_global_var_init.3()
 // OGCG:   call void @__cxx_global_var_init.4()
+// OGCG:   call void @__cxx_global_var_init.5()
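
The helper naming in `getOrCreateDtorFunc` follows a simple counter scheme: the first helper gets the bare name `__cxx_global_array_dtor`, and later ones get a `.N` suffix. A minimal standalone sketch of that scheme is below. It assumes a `std::map` as a stand-in for the pass's `dynamicInitializerNames` container (its actual type is not shown in the diff), and `makeUniqueHelperName` is a hypothetical function name.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the pass's per-name counter.
static std::map<std::string, std::uint32_t> nameCounters;

// Returns `base` on the first request, then `base.1`, `base.2`, and so on.
static std::string makeUniqueHelperName(const std::string &base) {
  // Post-increment: the first lookup yields 0 and leaves 1 in the map.
  std::uint32_t count = nameCounters[base]++;
  if (count == 0)
    return base;
  return base + "." + std::to_string(count);
}
```

With this scheme a second global array in the same module would get a helper named `__cxx_global_array_dtor.1`, in the same style as the `__cxx_global_var_init.N` functions checked in the test.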

@llvmbot
Member

llvmbot commented Nov 21, 2025

@llvm/pr-subscribers-clang

+// CIR:     cir.store %[[NEXT]], %[[CUR_ADDR]] : !cir.ptr<!rec_ArrayDtor>, !cir.ptr<!cir.ptr<!rec_ArrayDtor>>
+// CIR:     cir.yield
+// CIR:   } while {
+// CIR:     %[[CUR:.*]] = cir.load %[[CUR_ADDR]] : !cir.ptr<!cir.ptr<!rec_ArrayDtor>>, !cir.ptr<!rec_ArrayDtor>
+// CIR:     %[[CMP:.*]] = cir.cmp(ne, %[[CUR]], %[[BEGIN]]) : !cir.ptr<!rec_ArrayDtor>, !cir.bool
+// CIR:     cir.condition(%[[CMP]])
+// CIR:   }
+// CIR:   cir.return
+// CIR: }
+//
+// CIR: cir.func internal private @__cxx_global_var_init.5() {
+// CIR:   %[[ARR:.*]] = cir.get_global @arrDtor : !cir.ptr<!cir.array<!rec_ArrayDtor x 16>>
+// CIR:   %[[DTOR:.*]] = cir.get_global @__cxx_global_array_dtor : !cir.ptr<!cir.func<(!cir.ptr<!void>)>>
+// CIR:   %[[DTOR_CAST:.*]] = cir.cast bitcast %[[DTOR]] : !cir.ptr<!cir.func<(!cir.ptr<!void>)>> -> !cir.ptr<!cir.func<(!cir.ptr<!void>)>>
+// CIR:   %[[ARR_CAST:.*]] = cir.cast bitcast %[[ARR]] : !cir.ptr<!cir.array<!rec_ArrayDtor x 16>> -> !cir.ptr<!void>
+// CIR:   %[[HANDLE:.*]] = cir.get_global @__dso_handle : !cir.ptr<i8>
+// CIR:   cir.call @__cxa_atexit(%[[DTOR_CAST]], %[[ARR_CAST]], %[[HANDLE]]) : (!cir.ptr<!cir.func<(!cir.ptr<!void>)>>, !cir.ptr<!void>, !cir.ptr<i8>) -> ()
+
+// LLVM: define internal void @__cxx_global_array_dtor(ptr %[[ARR_ARG:.*]]) {
+// LLVM:   %[[BEGIN:.*]] = getelementptr %struct.ArrayDtor, ptr %[[ARR_ARG]], i32 0
+// LLVM:   %[[END:.*]] = getelementptr %struct.ArrayDtor, ptr %[[BEGIN]], i64 15
+// LLVM:   %[[CUR_ADDR:.*]] = alloca ptr
+// LLVM:   store ptr %[[END]], ptr %[[CUR_ADDR]]
+// LLVM:   br label %[[LOOP_BODY:.*]]
+// LLVM: [[LOOP_COND:.*]]:
+// LLVM:   %[[CUR:.*]] = load ptr, ptr %[[CUR_ADDR]]
+// LLVM:   %[[CMP:.*]] = icmp ne ptr %[[CUR]], %[[BEGIN]]
+// LLVM:   br i1 %[[CMP]], label %[[LOOP_BODY]], label %[[LOOP_END:.*]]
+// LLVM: [[LOOP_BODY]]:
+// LLVM:   %[[CUR:.*]] = load ptr, ptr %[[CUR_ADDR]]
+// LLVM:   call void @_ZN9ArrayDtorD1Ev(ptr %[[CUR]]) #0
+// LLVM:   %[[PREV:.*]] = getelementptr %struct.ArrayDtor, ptr %[[CUR]], i64 -1
+// LLVM:   store ptr %[[PREV]], ptr %[[CUR_ADDR]]
+// LLVM:   br label %[[LOOP_COND]]
+// LLVM: [[LOOP_END]]:
+// LLVM:   ret void
+// LLVM: }
+//
+// LLVM: define internal void @__cxx_global_var_init.5() {
+// LLVM:   call void @__cxa_atexit(ptr @__cxx_global_array_dtor, ptr @arrDtor, ptr @__dso_handle)
+
+// Note: OGCG defines these functions in reverse order of CIR->LLVM.
+// Note also: OGCG doesn't pass the address of the array to the destructor function.
+//            Instead, it uses the global directly in the helper function.
+
+// OGCG: define internal void @__cxx_global_var_init.5() {{.*}} section ".text.startup" {
+// OGCG:   call i32 @__cxa_atexit(ptr @__cxx_global_array_dtor, ptr null, ptr @__dso_handle)
+
+// OGCG: define internal void @__cxx_global_array_dtor(ptr noundef %[[ARG:.*]]) {{.*}} section ".text.startup" {
+// OGCG: entry:
+// OGCG:   %[[UNUSED_ADDR:.*]] = alloca ptr
+// OGCG:   store ptr %[[ARG]], ptr %[[UNUSED_ADDR]]
+// OGCG:   br label %[[LOOP_BODY:.*]]
+// OGCG: [[LOOP_BODY]]:
+// OGCG:   %[[PREV:.*]] = phi ptr [ getelementptr inbounds (%struct.ArrayDtor, ptr @arrDtor, i64 16), %entry ], [ %[[CUR:.*]], %[[LOOP_BODY]] ]
+// OGCG:   %[[CUR]] = getelementptr inbounds %struct.ArrayDtor, ptr %[[PREV]], i64 -1
+// OGCG:   call void @_ZN9ArrayDtorD1Ev(ptr noundef nonnull align 1 dereferenceable(1) %[[CUR]])
+// OGCG:   %[[DONE:.*]] = icmp eq ptr %[[CUR]], @arrDtor
+// OGCG:   br i1 %[[DONE]], label %[[LOOP_END:.*]], label %[[LOOP_BODY]]
+// OGCG: [[LOOP_END]]:
+// OGCG:   ret void
+// OGCG: }
+
 // Common init function for all globals with default priority
 
 // CIR: cir.func private @_GLOBAL__sub_I_[[FILENAME:.*]]() {
@@ -177,6 +270,7 @@ int i = (int)fp;
 // CIR:   cir.call @__cxx_global_var_init.2() : () -> ()
 // CIR:   cir.call @__cxx_global_var_init.3() : () -> ()
 // CIR:   cir.call @__cxx_global_var_init.4() : () -> ()
+// CIR:   cir.call @__cxx_global_var_init.5() : () -> ()
 
 // LLVM: define void @_GLOBAL__sub_I_[[FILENAME]]()
 // LLVM:   call void @__cxx_global_var_init()
@@ -184,6 +278,7 @@ int i = (int)fp;
 // LLVM:   call void @__cxx_global_var_init.2()
 // LLVM:   call void @__cxx_global_var_init.3()
 // LLVM:   call void @__cxx_global_var_init.4()
+// LLVM:   call void @__cxx_global_var_init.5()
 
 // OGCG: define internal void @_GLOBAL__sub_I_[[FILENAME]]() {{.*}} section ".text.startup" {
 // OGCG:   call void @__cxx_global_var_init()
@@ -191,3 +286,4 @@ int i = (int)fp;
 // OGCG:   call void @__cxx_global_var_init.2()
 // OGCG:   call void @__cxx_global_var_init.3()
 // OGCG:   call void @__cxx_global_var_init.4()
+// OGCG:   call void @__cxx_global_var_init.5()
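
For readers who do not want to trace the IR, the destruction loop that the checks above describe can be sketched in plain C++. This is only an illustration under stated assumptions: `Tracked`, `constructArray`, and `destroyArray` are hypothetical names that do not appear in the patch, the element count is hard-coded to match `arrDtor[16]`, and the loop follows the shape of the classic codegen (OGCG) output, starting one past the last element and stepping backwards until the first element has been destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical element type: each destructor call records the element's index.
struct Tracked {
  static std::vector<int> &log() {
    static std::vector<int> entries;
    return entries;
  }
  int id;
  explicit Tracked(int i) : id(i) {}
  ~Tracked() { log().push_back(id); }
};

constexpr std::size_t kCount = 16;

// Raw storage stands in for the zero-initialized global array.
alignas(Tracked) static unsigned char storage[kCount * sizeof(Tracked)];

// Constructs kCount elements in the raw storage and returns the first one.
Tracked *constructArray() {
  Tracked *first = nullptr;
  for (std::size_t i = 0; i < kCount; ++i) {
    Tracked *p = ::new (storage + i * sizeof(Tracked)) Tracked(static_cast<int>(i));
    if (i == 0)
      first = p;
  }
  return first;
}

// Takes the array as an opaque pointer, like a helper registered with
// __cxa_atexit, and destroys the elements from last to first.
void destroyArray(void *arr) {
  assert(arr && "array pointer must be non-null");
  Tracked *begin = static_cast<Tracked *>(arr);
  Tracked *cur = begin + kCount; // one past the last element
  do {
    --cur;
    cur->~Tracked();
  } while (cur != begin);
}
```

Calling `destroyArray(constructArray())` destroys all 16 elements in reverse construction order, which is the order C++ requires for array elements.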


// The simple case is just a call to a destructor, like this:
//
// %0 = cir.get_global %globalS : !cir.ptr<!rec_S>
Collaborator

It's a little weird that this global is required... Could we make a helper for 'createDestroy' (or whatever it is?) that takes the 'location' of the get and synthesizes it? It just seems odd that we have a 'get_global'.

Alternatively, can we assert that the global IS the current variable as a sanity assert?

Contributor Author

I agree that it's weird, but we need an mlir::Value to reference in subsequent operations. We could introduce a special operation that's only valid in GlobalOp ctor and dtor regions, like cir.get_current_global, but there would still need to be an operation here. A special-purpose op might be good because otherwise it would be tricky to decide during lowering which get_global ops should refer to the current global and which should not, in cases like this:

int someGlobal;
float anotherGlobal = (float)someGlobal;

which produces

cir.global external dso_local @someGlobal = #cir.int<0> : !s32i {alignment = 4 : i64}
cir.global external dso_local @anotherGlobal = ctor : !cir.float {
  %0 = cir.get_global @anotherGlobal : !cir.ptr<!cir.float>
  %1 = cir.get_global @someGlobal : !cir.ptr<!s32i>
  %2 = cir.load align(4) %1 : !cir.ptr<!s32i>, !s32i
  %3 = cir.cast int_to_float %2 : !s32i -> !cir.float
  cir.store align(4) %3, %0 : !cir.float, !cir.ptr<!cir.float>
}

Obviously, I'd want to defer that to a future change.

Collaborator

Oh yeah, that seems sensible. Just seemed a little odd. I guess I was envisioning some sort of 'get-self' type thing...

I guess I see what you mean: any call (in the non-array case) would need that. The one way I can think of would be to have the block take an argument that carries that value (like array.dtor's bb0).

But I can see that being fine in a followup/with a FIXME put somewhere.

SmallString<256> fnName("__cxx_global_array_dtor");
uint32_t cnt = dynamicInitializerNames[fnName]++;
if (cnt)
  fnName += "." + llvm::Twine(cnt).str();
Collaborator

Suggested change
- fnName += "." + llvm::Twine(cnt).str();
+ fnName += '.';
+ fnName += cnt;

Seems pretty wasteful to create/alloc the temporary string.

Contributor Author

Your suggested change here doesn't work. It appends "\01" rather than "1" because it's hitting the SmallString & operator+= (char C) and there is no operator for converting an integer to a string.
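
The pitfall described here can be reproduced without LLVM. The sketch below uses `std::string` as a stand-in for `llvm::SmallString` (an assumption made purely for illustration); `appendAsChar` and `appendAsDecimal` are hypothetical helper names. As with `SmallString`, appending an integer through the `char` overload of `+=` stores the raw byte value rather than its decimal text.

```cpp
#include <cstdint>
#include <string>

// Pitfall: the counter ends up in operator+=(char), so the byte 0x01 is
// appended instead of the character '1'. The cast makes the conversion that
// `name += cnt` would perform implicitly visible.
std::string appendAsChar(std::string name, std::uint32_t cnt) {
  name += '.';
  name += static_cast<char>(cnt);
  return name;
}

// Converting the counter to decimal text first gives the intended suffix.
std::string appendAsDecimal(std::string name, std::uint32_t cnt) {
  name += '.';
  name += std::to_string(cnt);
  return name;
}
```

With a count of 1, the first helper yields a name ending in the byte `\x01`, and the second yields a name ending in `.1`.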

Collaborator

Ah shoot, that's unfortunate. I was hoping we had an in-place overload :)

Can we at least do:

fnName += ("." + llvm::Twine(cnt)).str();
(OR: fnName += (llvm::Twine('.') + llvm::Twine(cnt)).str();)

rather than creating a string and immediately prepending to it before adding? I guess we're stuck with an extra allocation anyway, unless we resort to an ostringstream, and IDK if that is worth it for as little appending as we're doing.

Contributor Author

The change I made matches how we unique strings in other places. LLVM's getUniqueIntrinsicName does it the way you suggested. It's not a bad idea to update the CIR uses to do that.

Collaborator

Oh, I'm not complaining about HOW we're uniquing them, just how we are constructing the string.

The Twine::str() call causes a new string to be created. Then the operator+ prepends a "." to it, which requires a second string to be allocated (for the result), which can THEN finally be added to fnName.

I was suggesting we do the "." + llvm::Twine(cnt) math as a single twine+twine operation, which reduces the 2 string allocations to 1 (see the parens around the "." and the llvm::Twine ctor).
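
For context, the uniquing scheme this thread is about can be sketched with standard containers. This is a minimal sketch, not the code in LoweringPrepare: `NameUniquer` is a hypothetical class, and `std::unordered_map` and `std::string` stand in for the LLVM map and `SmallString` used in the patch. The first request for a base name returns it unchanged, and each later request appends `.N`, matching the `cnt` logic in the snippet under review.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the per-module name counter.
class NameUniquer {
  std::unordered_map<std::string, std::uint32_t> counts;

public:
  // Returns `base` the first time it is requested, then `base.1`, `base.2`, ...
  std::string unique(const std::string &base) {
    std::uint32_t cnt = counts[base]++; // post-increment: the first use sees 0
    if (cnt == 0)
      return base;
    std::string name = base;
    name += '.';
    name += std::to_string(cnt); // decimal text, appended to `name` in place
    return name;
  }
};
```

Each base name keeps its own counter, so helpers generated for different kinds of globals would not affect one another's suffixes.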

github-actions bot commented Nov 21, 2025

🐧 Linux x64 Test Results

  • 112083 tests passed
  • 4079 tests skipped

// called from __cxa_atexit.
// In CIR, we just emit the destroy into the dtor region. It will be moved
// into a separate function during the LoweringPrepare pass.
mlir::Value globalVal = cgf.getBuilder().createGetGlobal(addr);
Member

Suggested change
- mlir::Value globalVal = cgf.getBuilder().createGetGlobal(addr);
+ mlir::Value globalVal = builder.createGetGlobal(addr);

@andykaylor andykaylor merged commit 136c9da into llvm:main Nov 21, 2025
10 checks passed
@andykaylor andykaylor deleted the cir-global-array-dtor branch November 21, 2025 23:47
aadeshps-mcw pushed a commit to aadeshps-mcw/llvm-project that referenced this pull request Nov 26, 2025
Priyanshu3820 pushed a commit to Priyanshu3820/llvm-project that referenced this pull request Nov 26, 2025