# [clang] Fix self-capturing `__block` variables (#89475)
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang

Author: None (ille-apple)

## Changes

This is an updated version of https://reviews.llvm.org/D90434 from 2020. Due to time constraints I stopped working on the patch, and unfortunately never got back around to it until now.

Clang special-cases `__block` variables that are captured by blocks appearing in their own initializers, but this special case is broken for variables of aggregate type. Fix this by initializing such variables directly on the heap, and add a warning for when this needs to be done.

### Background on the special case
When a block that captures a `__block` variable is copied to the heap (e.g. with `Block_copy`), the variable itself is moved from the stack into a heap-allocated byref struct, and subsequent accesses go through that struct's forwarding pointer.

What's tricky is when this move happens while the variable is still being initialized. For example:

```c
int copyBlockAndReturnInt(void (^block)(void)) {
  Block_copy(block);
  return 42;
}

int main() {
  __block int val = copyBlockAndReturnInt(^{
    printf("%d\n", val);
  });
}
```

Here, when `copyBlockAndReturnInt` calls `Block_copy`, the variable `val` is moved to the heap while its initializer is still being evaluated, so the initializer's result must be stored to the new heap location rather than the original stack slot.

This can only happen if a block literal that captures the variable is located syntactically within the initializer itself. If it's before the initializer then it couldn't capture the variable, and if it's after the initializer then it couldn't execute during initialization. So Clang detects when such a literal exists inside the initializer; if one does, Clang defers computing the variable's address until after the initializer has been evaluated, reloading it through the forwarding pointer:

```cpp
if (capturedByInit)
  drillIntoBlockVariable(*this, lvalue, cast<VarDecl>(D));
```
Okay, but then why are there two different code paths? If the address calculation can be safely deferred, why not just do that all the time, rather than having a separate mode just for self-capturing `__block` variables? Because it can't always be safely deferred.

### Harder cases

First, consider the case where the variable being initialized is a C struct. Even without blocks involved, initializers like this are already problematic:

```c
struct Foo { int a, b; };
struct Foo foo = {1, foo.a};
```

But if any one of the field evaluations can move the object, then Clang would need to reload the forwarding pointer before storing each field. And that would need to apply recursively to any subfields. Such a scheme is definitely possible, but Clang doesn't implement it.

And if instead of a C struct you have a C++ class, supporting relocation to the heap becomes flat-out impossible. Consider:

```cpp
struct Foo {
  void (^block_)(void);
  Foo(void (^block)(void)) {
    printf("I am at %p\n", this);
    block_ = Block_copy(block);
    printf("I am now at %p\n", this); // shouldn't change!
  }
};
__block Foo foo(^{
  printf("%p\n", &foo);
});
```

C++ constructors can observe the address of the object being constructed, and expect that address not to change. Once the object has finished being constructed, it can be relocated by calling the move or copy constructor, and that's what Clang uses for block captures in the non-self-capturing case. But in this case, the variable can be copied to the heap before its constructor has even finished.

### Clang's current behavior

So what does Clang currently do? Well, for both C structs and C++ classes (as well as unions), it runs into this TODO dating to 2011:

```cpp
case TEK_Aggregate:
  [...]
  // TODO: how can we delay here if D is captured by its initializer?
```

Clang simply neglects to call `drillIntoBlockVariable`, so the initializer's result is stored into the stale stack copy of the variable while the heap copy is left uninitialized.

### Solution

As stated, it's impossible for self-capturing variables of at least some record types to support moving to the heap. But there is a way to make them work regardless: if the variable starts out on the heap, it never needs to move at all.

This patch applies this approach to all self-capturing `__block` variables of record type. The obvious downside of the approach is that we pay the cost of a heap allocation even if the block is never actually copied to the heap.

To assist performance-sensitive codebases, this patch adds a warning, `-Wimplicit-block-var-alloc` (off by default), that fires whenever such an implicit allocation is emitted.

### Allocation failure

Another downside of the approach is that because the heap allocation is completely implicit, there is no way to check for allocation failure. Normally, heap allocation happens only when you call `Block_copy`, which at least gives the caller a point at which failure could be observed.

...Or so I thought. It turns out that checking for allocation failure is not something the blocks runtime meaningfully supports today anyway. The point is, if recovery from allocation failure is already not a thing, then this patch isn't responsible for breaking it. Even if blocks are changed someday to support it, it would probably be desired only by a tiny minority of codebases, and those codebases could avoid implicit allocations by enabling the new warning and setting it as an error.

### ABI limitation

The blocks runtime has no API to allocate a byref struct directly on the heap, because it's never been necessary before. There is only `_Block_object_assign`, which copies an existing stack-allocated byref struct to the heap.

Instead, the codegen works with the existing runtime, as follows. First, a byref struct is allocated on the stack (including space for the object), but only the header is initialized; the object part is left uninitialized. Then `_Block_object_assign` is called to move the struct to the heap (which is safe because the copy helper is forced to be a no-op), `_Block_object_dispose` is called to release the extra reference that the assign took, and finally the object itself is initialized in place at its heap address, found through the forwarding pointer.

### Improved self-capture analysis

The existing check for self-capturing `__block` variables is a syntactic walk over the initializer in CodeGen (`isCapturedBy`), which conservatively treats some constructs as capturing when they are not. This patch replaces the whole thing with a more precise analysis.
This patch also moves the self-capture check earlier in the compilation process, from CodeGen to Sema. This is necessary in order to support the aforementioned warning.

### Constant initialization

This patch also fixes a related bug where self-capturing `__block` variables with constant initializers tripped this assert:

```cpp
assert(!capturedByInit && "constant init contains a capturing block?");
```

The assert is important, because constant initialization doesn't support the deferred-address code path, but it can fire in two situations where nothing is actually wrong.

First, the initializer can syntactically contain a capturing block literal but never evaluate it:

```cpp
__block constexpr int foo = true ? 1 : ^{ return foo; }();
```

This is addressed in this patch by forcing constant-initialized variables to be treated as non-self-capturing.

Second, the initializer can simply not be constant. With variables of certain types, Clang attempts constant emission even when it may fail; the patch moves the assert after that attempt, so it only fires when constant emission actually succeeds for a self-capturing variable.

Patch is 33.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/89475.diff

12 Files Affected:
diff --git a/clang/include/clang/AST/Decl.h b/clang/include/clang/AST/Decl.h
index 8b121896d66d15..dc816786d7f11d 100644
--- a/clang/include/clang/AST/Decl.h
+++ b/clang/include/clang/AST/Decl.h
@@ -1103,6 +1103,11 @@ class VarDecl : public DeclaratorDecl, public Redeclarable<VarDecl> {
LLVM_PREFERRED_TYPE(bool)
unsigned IsCXXCondDecl : 1;
+
+ /// Whether this is a __block variable that is captured by a block
+ /// referenced in its own initializer.
+ LLVM_PREFERRED_TYPE(bool)
+ unsigned CapturedByOwnInit : 1;
};
union {
@@ -1588,10 +1593,23 @@ class VarDecl : public DeclaratorDecl, public Redeclarable<VarDecl> {
/// escaping block.
bool isNonEscapingByref() const;
- void setEscapingByref() {
- NonParmVarDeclBits.EscapingByref = true;
+ void setEscapingByref();
+
+ /// Indicates this is a __block variable that is captured by a block
+ /// referenced in its own initializer.
+ bool isCapturedByOwnInit() const {
+ return !isa<ParmVarDecl>(this) && NonParmVarDeclBits.CapturedByOwnInit;
}
+ void setCapturedByOwnInit() {
+ assert(!isa<ParmVarDecl>(this));
+ NonParmVarDeclBits.CapturedByOwnInit = true;
+ }
+
+ /// Check if this is a __block variable which needs to be initialized
+ /// directly on the heap.
+ bool needsInitOnHeap() const;
+
bool isCXXCondDecl() const {
return isa<ParmVarDecl>(this) ? false : NonParmVarDeclBits.IsCXXCondDecl;
}
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 1bbe76ff6bd2ac..f19ee8fbc21f3a 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -2356,6 +2356,12 @@ def note_var_fixit_add_initialization : Note<
def note_uninit_fixit_remove_cond : Note<
"remove the %select{'%1' if its condition|condition if it}0 "
"is always %select{false|true}2">;
+def warn_implicit_block_var_alloc : Warning<
+ "variable %q0 will be initialized in a heap allocation">,
+ InGroup<DiagGroup<"implicit-block-var-alloc">>, DefaultIgnore;
+def note_because_captured_by_block : Note<
+ "because it is captured by a block used in its own initializer">;
+
def err_init_incomplete_type : Error<"initialization of incomplete type %0">;
def err_list_init_in_parens : Error<
"cannot initialize %select{non-class|reference}0 type %1 with a "
diff --git a/clang/lib/AST/Decl.cpp b/clang/lib/AST/Decl.cpp
index 474e0ccde5bbf7..fda9010ba126d7 100644
--- a/clang/lib/AST/Decl.cpp
+++ b/clang/lib/AST/Decl.cpp
@@ -2680,6 +2680,19 @@ bool VarDecl::isNonEscapingByref() const {
return hasAttr<BlocksAttr>() && !NonParmVarDeclBits.EscapingByref;
}
+void VarDecl::setEscapingByref() {
+ assert(!isa<ParmVarDecl>(this) && hasAttr<BlocksAttr>());
+ NonParmVarDeclBits.EscapingByref = true;
+}
+
+bool VarDecl::needsInitOnHeap() const {
+ // We need direct initialization on the heap for self-capturing __block
+ // variables of types that cause CodeGenFunction::getEvaluationKind to return
+ // TEK_Aggregate. The only such types that can be captured are records.
+ return isCapturedByOwnInit() &&
+ getType().getAtomicUnqualifiedType()->isRecordType();
+}
+
bool VarDecl::hasDependentAlignment() const {
QualType T = getType();
return T->isDependentType() || T->isUndeducedType() ||
diff --git a/clang/lib/CodeGen/CGBlocks.cpp b/clang/lib/CodeGen/CGBlocks.cpp
index 2742c39965b2c8..4bea057a15a52c 100644
--- a/clang/lib/CodeGen/CGBlocks.cpp
+++ b/clang/lib/CodeGen/CGBlocks.cpp
@@ -45,6 +45,13 @@ CGBlockInfo::CGBlockInfo(const BlockDecl *block, StringRef name)
name = name.substr(1);
}
+BlockByrefHelpers::BlockByrefHelpers(const BlockByrefInfo &info)
+ : CopyHelper(nullptr), DisposeHelper(nullptr),
+ // The alignment we care about for the purposes of uniquing byref
+ // helpers is the alignment of the actual byref value field.
+ Alignment(info.ByrefAlignment.alignmentAtOffset(info.FieldOffset)),
+ ForceNoopCopy(info.ForceNoopCopy) {}
+
// Anchor the vtable to this translation unit.
BlockByrefHelpers::~BlockByrefHelpers() {}
@@ -2127,8 +2134,8 @@ class ObjectByrefHelpers final : public BlockByrefHelpers {
BlockFieldFlags Flags;
public:
- ObjectByrefHelpers(CharUnits alignment, BlockFieldFlags flags)
- : BlockByrefHelpers(alignment), Flags(flags) {}
+ ObjectByrefHelpers(const BlockByrefInfo &info, BlockFieldFlags flags)
+ : BlockByrefHelpers(info), Flags(flags) {}
void emitCopy(CodeGenFunction &CGF, Address destField,
Address srcField) override {
@@ -2161,7 +2168,7 @@ class ObjectByrefHelpers final : public BlockByrefHelpers {
/// Emits the copy/dispose helpers for an ARC __block __weak variable.
class ARCWeakByrefHelpers final : public BlockByrefHelpers {
public:
- ARCWeakByrefHelpers(CharUnits alignment) : BlockByrefHelpers(alignment) {}
+ ARCWeakByrefHelpers(const BlockByrefInfo &info) : BlockByrefHelpers(info) {}
void emitCopy(CodeGenFunction &CGF, Address destField,
Address srcField) override {
@@ -2182,7 +2189,7 @@ class ARCWeakByrefHelpers final : public BlockByrefHelpers {
/// that's not of block-pointer type.
class ARCStrongByrefHelpers final : public BlockByrefHelpers {
public:
- ARCStrongByrefHelpers(CharUnits alignment) : BlockByrefHelpers(alignment) {}
+ ARCStrongByrefHelpers(const BlockByrefInfo &info) : BlockByrefHelpers(info) {}
void emitCopy(CodeGenFunction &CGF, Address destField,
Address srcField) override {
@@ -2218,8 +2225,8 @@ class ARCStrongByrefHelpers final : public BlockByrefHelpers {
/// variable that's of block-pointer type.
class ARCStrongBlockByrefHelpers final : public BlockByrefHelpers {
public:
- ARCStrongBlockByrefHelpers(CharUnits alignment)
- : BlockByrefHelpers(alignment) {}
+ ARCStrongBlockByrefHelpers(const BlockByrefInfo &info)
+ : BlockByrefHelpers(info) {}
void emitCopy(CodeGenFunction &CGF, Address destField,
Address srcField) override {
@@ -2248,9 +2255,9 @@ class CXXByrefHelpers final : public BlockByrefHelpers {
const Expr *CopyExpr;
public:
- CXXByrefHelpers(CharUnits alignment, QualType type,
+ CXXByrefHelpers(const BlockByrefInfo &info, QualType type,
const Expr *copyExpr)
- : BlockByrefHelpers(alignment), VarType(type), CopyExpr(copyExpr) {}
+ : BlockByrefHelpers(info), VarType(type), CopyExpr(copyExpr) {}
bool needsCopy() const override { return CopyExpr != nullptr; }
void emitCopy(CodeGenFunction &CGF, Address destField,
@@ -2276,8 +2283,8 @@ class NonTrivialCStructByrefHelpers final : public BlockByrefHelpers {
QualType VarType;
public:
- NonTrivialCStructByrefHelpers(CharUnits alignment, QualType type)
- : BlockByrefHelpers(alignment), VarType(type) {}
+ NonTrivialCStructByrefHelpers(const BlockByrefInfo &info, QualType type)
+ : BlockByrefHelpers(info), VarType(type) {}
void emitCopy(CodeGenFunction &CGF, Address destField,
Address srcField) override {
@@ -2336,7 +2343,7 @@ generateByrefCopyHelper(CodeGenFunction &CGF, const BlockByrefInfo &byrefInfo,
// Create a scope with an artificial location for the body of this function.
auto AL = ApplyDebugLocation::CreateArtificial(CGF);
- if (generator.needsCopy()) {
+ if (generator.needsCopy() && !generator.ForceNoopCopy) {
// dst->x
Address destField = CGF.GetAddrOfLocalVar(&Dst);
destField = Address(CGF.Builder.CreateLoad(destField), byrefInfo.Type,
@@ -2458,18 +2465,13 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType &byrefType,
auto &byrefInfo = getBlockByrefInfo(&var);
- // The alignment we care about for the purposes of uniquing byref
- // helpers is the alignment of the actual byref value field.
- CharUnits valueAlignment =
- byrefInfo.ByrefAlignment.alignmentAtOffset(byrefInfo.FieldOffset);
-
if (const CXXRecordDecl *record = type->getAsCXXRecordDecl()) {
const Expr *copyExpr =
CGM.getContext().getBlockVarCopyInit(&var).getCopyExpr();
if (!copyExpr && record->hasTrivialDestructor()) return nullptr;
return ::buildByrefHelpers(
- CGM, byrefInfo, CXXByrefHelpers(valueAlignment, type, copyExpr));
+ CGM, byrefInfo, CXXByrefHelpers(byrefInfo, type, copyExpr));
}
// If type is a non-trivial C struct type that is non-trivial to
@@ -2477,7 +2479,7 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType &byrefType,
if (type.isNonTrivialToPrimitiveDestructiveMove() == QualType::PCK_Struct ||
type.isDestructedType() == QualType::DK_nontrivial_c_struct)
return ::buildByrefHelpers(
- CGM, byrefInfo, NonTrivialCStructByrefHelpers(valueAlignment, type));
+ CGM, byrefInfo, NonTrivialCStructByrefHelpers(byrefInfo, type));
// Otherwise, if we don't have a retainable type, there's nothing to do.
// that the runtime does extra copies.
@@ -2499,7 +2501,7 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType &byrefType,
// byref routines.
case Qualifiers::OCL_Weak:
return ::buildByrefHelpers(CGM, byrefInfo,
- ARCWeakByrefHelpers(valueAlignment));
+ ARCWeakByrefHelpers(byrefInfo));
// ARC __strong __block variables need to be retained.
case Qualifiers::OCL_Strong:
@@ -2507,13 +2509,13 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType &byrefType,
// transfer possible.
if (type->isBlockPointerType()) {
return ::buildByrefHelpers(CGM, byrefInfo,
- ARCStrongBlockByrefHelpers(valueAlignment));
+ ARCStrongBlockByrefHelpers(byrefInfo));
// Otherwise, we transfer ownership of the retain from the stack
// to the heap.
} else {
return ::buildByrefHelpers(CGM, byrefInfo,
- ARCStrongByrefHelpers(valueAlignment));
+ ARCStrongByrefHelpers(byrefInfo));
}
}
llvm_unreachable("fell out of lifetime switch!");
@@ -2533,7 +2535,7 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType &byrefType,
flags |= BLOCK_FIELD_IS_WEAK;
return ::buildByrefHelpers(CGM, byrefInfo,
- ObjectByrefHelpers(valueAlignment, flags));
+ ObjectByrefHelpers(byrefInfo, flags));
}
Address CodeGenFunction::emitBlockByrefAddress(Address baseAddr,
@@ -2653,11 +2655,39 @@ const BlockByrefInfo &CodeGenFunction::getBlockByrefInfo(const VarDecl *D) {
info.FieldOffset = varOffset;
info.ByrefAlignment = std::max(varAlign, getPointerAlign());
+ // If we're initializing directly on the heap, then we should emit a
+ // no-op copy helper, both because we don't need a real one (the
+ // object will never move), and because a real one would break the
+ // pre-init _Block_object_assign.
+ info.ForceNoopCopy = D->needsInitOnHeap();
+
auto pair = BlockByrefInfos.insert({D, info});
assert(pair.second && "info was inserted recursively?");
return pair.first->second;
}
+void CodeGenFunction::emitByrefInitOnHeap(llvm::Value *P) {
+ // The object itself is initialized directly on the heap. But for ABI
+ // backwards compatibility reasons, we have to set up a fake byref struct on
+ // the stack (with the structural components initialized but not the object
+ // itself), then call _Block_object_assign to move it to the heap (which is
+ // safe because we forced a no-op copy helper), then call
+ // _Block_object_dispose to release the extra ref from _Block_object_assign.
+ //
+ // 'P' points to the fake byref struct.
+
+ BlockFieldFlags flags = BLOCK_FIELD_IS_BYREF;
+ // Ignored out-parameter. We'll use the forwarding pointer instead.
+ RawAddress out = CreateDefaultAlignTempAlloca(P->getType(), "initOnHeap.out");
+
+ llvm::Value *args[] = {Builder.CreateBitCast(out.getPointer(), VoidPtrTy),
+ Builder.CreateBitCast(P, VoidPtrTy),
+ llvm::ConstantInt::get(Int32Ty, flags.getBitMask())};
+ EmitNounwindRuntimeCall(CGM.getBlockObjectAssign(), args);
+
+ BuildBlockRelease(P, flags, false);
+}
+
/// Initialize the structural components of a __block variable, i.e.
/// everything but the actual object.
void CodeGenFunction::emitByrefStructureInit(const AutoVarEmission &emission) {
@@ -2699,8 +2729,8 @@ void CodeGenFunction::emitByrefStructureInit(const AutoVarEmission &emission) {
storeHeaderField(V, getPointerSize(), "byref.isa");
// Store the address of the variable into its own forwarding pointer.
- storeHeaderField(addr.emitRawPointer(*this), getPointerSize(),
- "byref.forwarding");
+ llvm::Value *pointer = addr.emitRawPointer(*this);
+ storeHeaderField(pointer, getPointerSize(), "byref.forwarding");
// Blocks ABI:
// c) the flags field is set to either 0 if no helper functions are
@@ -2764,6 +2794,9 @@ void CodeGenFunction::emitByrefStructureInit(const AutoVarEmission &emission) {
auto layoutInfo = CGM.getObjCRuntime().BuildByrefLayout(CGM, type);
storeHeaderField(layoutInfo, getPointerSize(), "byref.layout");
}
+
+ if (emission.NeedsInitOnHeap)
+ emitByrefInitOnHeap(pointer);
}
void CodeGenFunction::BuildBlockRelease(llvm::Value *V, BlockFieldFlags flags,
diff --git a/clang/lib/CodeGen/CGBlocks.h b/clang/lib/CodeGen/CGBlocks.h
index 8d10c4f69b2026..526dbbfd9b38d3 100644
--- a/clang/lib/CodeGen/CGBlocks.h
+++ b/clang/lib/CodeGen/CGBlocks.h
@@ -139,6 +139,7 @@ class BlockByrefInfo {
unsigned FieldIndex;
CharUnits ByrefAlignment;
CharUnits FieldOffset;
+ bool ForceNoopCopy;
};
/// Represents a type of copy/destroy operation that should be performed for an
diff --git a/clang/lib/CodeGen/CGDecl.cpp b/clang/lib/CodeGen/CGDecl.cpp
index ce6d6d8956076e..954dd1079215fd 100644
--- a/clang/lib/CodeGen/CGDecl.cpp
+++ b/clang/lib/CodeGen/CGDecl.cpp
@@ -1450,6 +1450,8 @@ CodeGenFunction::EmitAutoVarAlloca(const VarDecl &D) {
bool isEscapingByRef = D.isEscapingByref();
emission.IsEscapingByRef = isEscapingByRef;
+ emission.IsCapturedByOwnInit = D.isCapturedByOwnInit();
+ emission.NeedsInitOnHeap = D.needsInitOnHeap();
CharUnits alignment = getContext().getDeclAlign(&D);
@@ -1692,68 +1694,6 @@ CodeGenFunction::EmitAutoVarAlloca(const VarDecl &D) {
return emission;
}
-static bool isCapturedBy(const VarDecl &, const Expr *);
-
-/// Determines whether the given __block variable is potentially
-/// captured by the given statement.
-static bool isCapturedBy(const VarDecl &Var, const Stmt *S) {
- if (const Expr *E = dyn_cast<Expr>(S))
- return isCapturedBy(Var, E);
- for (const Stmt *SubStmt : S->children())
- if (isCapturedBy(Var, SubStmt))
- return true;
- return false;
-}
-
-/// Determines whether the given __block variable is potentially
-/// captured by the given expression.
-static bool isCapturedBy(const VarDecl &Var, const Expr *E) {
- // Skip the most common kinds of expressions that make
- // hierarchy-walking expensive.
- E = E->IgnoreParenCasts();
-
- if (const BlockExpr *BE = dyn_cast<BlockExpr>(E)) {
- const BlockDecl *Block = BE->getBlockDecl();
- for (const auto &I : Block->captures()) {
- if (I.getVariable() == &Var)
- return true;
- }
-
- // No need to walk into the subexpressions.
- return false;
- }
-
- if (const StmtExpr *SE = dyn_cast<StmtExpr>(E)) {
- const CompoundStmt *CS = SE->getSubStmt();
- for (const auto *BI : CS->body())
- if (const auto *BIE = dyn_cast<Expr>(BI)) {
- if (isCapturedBy(Var, BIE))
- return true;
- }
- else if (const auto *DS = dyn_cast<DeclStmt>(BI)) {
- // special case declarations
- for (const auto *I : DS->decls()) {
- if (const auto *VD = dyn_cast<VarDecl>((I))) {
- const Expr *Init = VD->getInit();
- if (Init && isCapturedBy(Var, Init))
- return true;
- }
- }
- }
- else
- // FIXME. Make safe assumption assuming arbitrary statements cause capturing.
- // Later, provide code to poke into statements for capture analysis.
- return true;
- return false;
- }
-
- for (const Stmt *SubStmt : E->children())
- if (isCapturedBy(Var, SubStmt))
- return true;
-
- return false;
-}
-
/// Determine whether the given initializer is trivial in the sense
/// that it requires no code to be generated.
bool CodeGenFunction::isTrivialInitializer(const Expr *Init) {
@@ -1918,8 +1858,7 @@ void CodeGenFunction::EmitAutoVarInit(const AutoVarEmission &emission) {
// Check whether this is a byref variable that's potentially
// captured and moved by its own initializer. If so, we'll need to
// emit the initializer first, then copy into the variable.
- bool capturedByInit =
- Init && emission.IsEscapingByRef && isCapturedBy(D, Init);
+ bool capturedByInit = emission.IsCapturedByOwnInit;
bool locIsByrefHeader = !capturedByInit;
const Address Loc =
@@ -1951,8 +1890,9 @@ void CodeGenFunction::EmitAutoVarInit(const AutoVarEmission &emission) {
llvm::Constant *constant = nullptr;
if (emission.IsConstantAggregate ||
D.mightBeUsableInConstantExpressions(getContext())) {
- assert(!capturedByInit && "constant init contains a capturing block?");
constant = ConstantEmitter(*this).tryEmitAbstractForInitializer(D);
+ assert(!(constant && capturedByInit) &&
+ "constant init contains a capturing block?");
if (constant && !constant->isZeroValue() &&
(trivialAutoVarInit !=
LangOptions::TrivialAutoVarInitKind::Uninitialized)) {
@@ -1975,6 +1915,10 @@ void CodeGenFunction::EmitAutoVarInit(const AutoVarEmission &emission) {
initializeWhatIsTechnicallyUninitialized(Loc);
LValue lv = MakeAddrLValue(Loc, type);
lv.setNonGC(true);
+ if (emission.NeedsInitOnHeap) {
+ drillIntoBlockVariable(*this, lv, &D);
+ capturedByInit = false; // We don't need to drill again.
+ }
return EmitExprAsInit(Init, &D, lv, capturedByInit);
}
@@ -2023,6 +1967,9 @@ void CodeGenFunction::EmitExprAsInit(const Expr *init, const ValueDecl *D,
return;
}
case TEK_Aggregate:
+ assert(
+ !capturedByInit &&
+ "Capture-by-initializer should have been handled by NeedsInitOnHeap!");
if (type->isAtomicType()) {
EmitAtomicInit(const_cast<Expr*>(init), lvalue);
} else {
@@ -2031,7 +1978,6 @@ void CodeGenFunction::EmitExprAsInit(const Expr *init, const ValueDecl *D,
Overlap = AggValueSlot::DoesNotOverlap;
else if (auto *FD = dyn_cast<FieldDecl>(D))
Overlap = getOverlapForFieldInit(FD);
- // TODO: how can we delay here if D is captured by its initializer?
EmitAggExpr(init, AggValueSlot::forLValue(
lvalue, *this, AggValueSlot::IsDestructed,
AggValueSlot::DoesNotNeedGCBarriers,
@@ -2120,27 +2066,31 @@ void CodeGenFunction::EmitAutoVarCleanups(const AutoVarEmission &emission) {
// us from jumping into any of these scopes anyway.
if (!HaveInsertPoint()) return;
- const VarDecl &D = *emission.Variable;
+ // If we're initializing directly on the heap, _Block_object_destroy will
+ // handle destruction, so we don't need to perform cleanups directly.
+ if (!emission.NeedsInitOnHeap) {
+ const VarDecl &D = *emission.Variable;
- // Check the type for a cleanup.
- if (QualType::DestructionKind dtorKind = D.needsDestruction(getContext()))
- emitAutoVarTypeCleanup(emission, dtorKind);
+ // Check the type for a cleanup.
+ if (QualType::DestructionKind dtorKind = D.needsDestruction(getContext()))
+ emitAutoVarTypeCleanup(emission, dtorKind);
- // In GC mode, honor objc_precise_lifetime.
- if (getLangOpts().getGC() != LangOptions::NonGC &&
- D.hasAttr<ObjCPreciseLifetimeAttr>()) {
- EHStack.pushCleanup<ExtendGCLifetime>(NormalCleanup, &D);
- }
+ // In GC mode, honor objc_precise_lifetime.
+ if (getLangOpts().getGC() != LangOptions::NonGC &&
+ D.hasAttr<ObjCPreciseLifetimeAttr>()) {
+ EHStack.pushCleanup<ExtendGCLifetime>(NormalCleanup, &D);
+ }
- // Handle the cleanup attribute.
- if (const CleanupAttr *CA = D.getAttr<CleanupAttr>()) {
- const FunctionDecl *FD = CA->getFunctionDec...
[truncated]
+ ForceNoopCopy(info.ForceNoopCopy) {}
+
// Anchor the vtable to this translation unit.
BlockByrefHelpers::~BlockByrefHelpers() {}
@@ -2127,8 +2134,8 @@ class ObjectByrefHelpers final : public BlockByrefHelpers {
BlockFieldFlags Flags;
public:
- ObjectByrefHelpers(CharUnits alignment, BlockFieldFlags flags)
- : BlockByrefHelpers(alignment), Flags(flags) {}
+ ObjectByrefHelpers(const BlockByrefInfo &info, BlockFieldFlags flags)
+ : BlockByrefHelpers(info), Flags(flags) {}
void emitCopy(CodeGenFunction &CGF, Address destField,
Address srcField) override {
@@ -2161,7 +2168,7 @@ class ObjectByrefHelpers final : public BlockByrefHelpers {
/// Emits the copy/dispose helpers for an ARC __block __weak variable.
class ARCWeakByrefHelpers final : public BlockByrefHelpers {
public:
- ARCWeakByrefHelpers(CharUnits alignment) : BlockByrefHelpers(alignment) {}
+ ARCWeakByrefHelpers(const BlockByrefInfo &info) : BlockByrefHelpers(info) {}
void emitCopy(CodeGenFunction &CGF, Address destField,
Address srcField) override {
@@ -2182,7 +2189,7 @@ class ARCWeakByrefHelpers final : public BlockByrefHelpers {
/// that's not of block-pointer type.
class ARCStrongByrefHelpers final : public BlockByrefHelpers {
public:
- ARCStrongByrefHelpers(CharUnits alignment) : BlockByrefHelpers(alignment) {}
+ ARCStrongByrefHelpers(const BlockByrefInfo &info) : BlockByrefHelpers(info) {}
void emitCopy(CodeGenFunction &CGF, Address destField,
Address srcField) override {
@@ -2218,8 +2225,8 @@ class ARCStrongByrefHelpers final : public BlockByrefHelpers {
/// variable that's of block-pointer type.
class ARCStrongBlockByrefHelpers final : public BlockByrefHelpers {
public:
- ARCStrongBlockByrefHelpers(CharUnits alignment)
- : BlockByrefHelpers(alignment) {}
+ ARCStrongBlockByrefHelpers(const BlockByrefInfo &info)
+ : BlockByrefHelpers(info) {}
void emitCopy(CodeGenFunction &CGF, Address destField,
Address srcField) override {
@@ -2248,9 +2255,9 @@ class CXXByrefHelpers final : public BlockByrefHelpers {
const Expr *CopyExpr;
public:
- CXXByrefHelpers(CharUnits alignment, QualType type,
+ CXXByrefHelpers(const BlockByrefInfo &info, QualType type,
const Expr *copyExpr)
- : BlockByrefHelpers(alignment), VarType(type), CopyExpr(copyExpr) {}
+ : BlockByrefHelpers(info), VarType(type), CopyExpr(copyExpr) {}
bool needsCopy() const override { return CopyExpr != nullptr; }
void emitCopy(CodeGenFunction &CGF, Address destField,
@@ -2276,8 +2283,8 @@ class NonTrivialCStructByrefHelpers final : public BlockByrefHelpers {
QualType VarType;
public:
- NonTrivialCStructByrefHelpers(CharUnits alignment, QualType type)
- : BlockByrefHelpers(alignment), VarType(type) {}
+ NonTrivialCStructByrefHelpers(const BlockByrefInfo &info, QualType type)
+ : BlockByrefHelpers(info), VarType(type) {}
void emitCopy(CodeGenFunction &CGF, Address destField,
Address srcField) override {
@@ -2336,7 +2343,7 @@ generateByrefCopyHelper(CodeGenFunction &CGF, const BlockByrefInfo &byrefInfo,
// Create a scope with an artificial location for the body of this function.
auto AL = ApplyDebugLocation::CreateArtificial(CGF);
- if (generator.needsCopy()) {
+ if (generator.needsCopy() && !generator.ForceNoopCopy) {
// dst->x
Address destField = CGF.GetAddrOfLocalVar(&Dst);
destField = Address(CGF.Builder.CreateLoad(destField), byrefInfo.Type,
@@ -2458,18 +2465,13 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType &byrefType,
auto &byrefInfo = getBlockByrefInfo(&var);
- // The alignment we care about for the purposes of uniquing byref
- // helpers is the alignment of the actual byref value field.
- CharUnits valueAlignment =
- byrefInfo.ByrefAlignment.alignmentAtOffset(byrefInfo.FieldOffset);
-
if (const CXXRecordDecl *record = type->getAsCXXRecordDecl()) {
const Expr *copyExpr =
CGM.getContext().getBlockVarCopyInit(&var).getCopyExpr();
if (!copyExpr && record->hasTrivialDestructor()) return nullptr;
return ::buildByrefHelpers(
- CGM, byrefInfo, CXXByrefHelpers(valueAlignment, type, copyExpr));
+ CGM, byrefInfo, CXXByrefHelpers(byrefInfo, type, copyExpr));
}
// If type is a non-trivial C struct type that is non-trivial to
@@ -2477,7 +2479,7 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType &byrefType,
if (type.isNonTrivialToPrimitiveDestructiveMove() == QualType::PCK_Struct ||
type.isDestructedType() == QualType::DK_nontrivial_c_struct)
return ::buildByrefHelpers(
- CGM, byrefInfo, NonTrivialCStructByrefHelpers(valueAlignment, type));
+ CGM, byrefInfo, NonTrivialCStructByrefHelpers(byrefInfo, type));
// Otherwise, if we don't have a retainable type, there's nothing to do.
// that the runtime does extra copies.
@@ -2499,7 +2501,7 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType &byrefType,
// byref routines.
case Qualifiers::OCL_Weak:
return ::buildByrefHelpers(CGM, byrefInfo,
- ARCWeakByrefHelpers(valueAlignment));
+ ARCWeakByrefHelpers(byrefInfo));
// ARC __strong __block variables need to be retained.
case Qualifiers::OCL_Strong:
@@ -2507,13 +2509,13 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType &byrefType,
// transfer possible.
if (type->isBlockPointerType()) {
return ::buildByrefHelpers(CGM, byrefInfo,
- ARCStrongBlockByrefHelpers(valueAlignment));
+ ARCStrongBlockByrefHelpers(byrefInfo));
// Otherwise, we transfer ownership of the retain from the stack
// to the heap.
} else {
return ::buildByrefHelpers(CGM, byrefInfo,
- ARCStrongByrefHelpers(valueAlignment));
+ ARCStrongByrefHelpers(byrefInfo));
}
}
llvm_unreachable("fell out of lifetime switch!");
@@ -2533,7 +2535,7 @@ CodeGenFunction::buildByrefHelpers(llvm::StructType &byrefType,
flags |= BLOCK_FIELD_IS_WEAK;
return ::buildByrefHelpers(CGM, byrefInfo,
- ObjectByrefHelpers(valueAlignment, flags));
+ ObjectByrefHelpers(byrefInfo, flags));
}
Address CodeGenFunction::emitBlockByrefAddress(Address baseAddr,
@@ -2653,11 +2655,39 @@ const BlockByrefInfo &CodeGenFunction::getBlockByrefInfo(const VarDecl *D) {
info.FieldOffset = varOffset;
info.ByrefAlignment = std::max(varAlign, getPointerAlign());
+ // If we're initializing directly on the heap, then we should emit a
+ // no-op copy helper, both because we don't need a real one (the
+ // object will never move), and because a real one would break the
+ // pre-init _Block_object_assign.
+ info.ForceNoopCopy = D->needsInitOnHeap();
+
auto pair = BlockByrefInfos.insert({D, info});
assert(pair.second && "info was inserted recursively?");
return pair.first->second;
}
+void CodeGenFunction::emitByrefInitOnHeap(llvm::Value *P) {
+ // The object itself is initialized directly on the heap. But for ABI
+ // backwards compatibility reasons, we have to set up a fake byref struct on
+ // the stack (with the structural components initialized but not the object
+ // itself), then call _Block_object_assign to move it to the heap (which is
+ // safe because we forced a no-op copy helper), then call
+ // _Block_object_dispose to release the extra ref from _Block_object_assign.
+ //
+ // 'P' points to the fake byref struct.
+
+ BlockFieldFlags flags = BLOCK_FIELD_IS_BYREF;
+ // Ignored out-parameter. We'll use the forwarding pointer instead.
+ RawAddress out = CreateDefaultAlignTempAlloca(P->getType(), "initOnHeap.out");
+
+ llvm::Value *args[] = {Builder.CreateBitCast(out.getPointer(), VoidPtrTy),
+ Builder.CreateBitCast(P, VoidPtrTy),
+ llvm::ConstantInt::get(Int32Ty, flags.getBitMask())};
+ EmitNounwindRuntimeCall(CGM.getBlockObjectAssign(), args);
+
+ BuildBlockRelease(P, flags, false);
+}
+
/// Initialize the structural components of a __block variable, i.e.
/// everything but the actual object.
void CodeGenFunction::emitByrefStructureInit(const AutoVarEmission &emission) {
@@ -2699,8 +2729,8 @@ void CodeGenFunction::emitByrefStructureInit(const AutoVarEmission &emission) {
storeHeaderField(V, getPointerSize(), "byref.isa");
// Store the address of the variable into its own forwarding pointer.
- storeHeaderField(addr.emitRawPointer(*this), getPointerSize(),
- "byref.forwarding");
+ llvm::Value *pointer = addr.emitRawPointer(*this);
+ storeHeaderField(pointer, getPointerSize(), "byref.forwarding");
// Blocks ABI:
// c) the flags field is set to either 0 if no helper functions are
@@ -2764,6 +2794,9 @@ void CodeGenFunction::emitByrefStructureInit(const AutoVarEmission &emission) {
auto layoutInfo = CGM.getObjCRuntime().BuildByrefLayout(CGM, type);
storeHeaderField(layoutInfo, getPointerSize(), "byref.layout");
}
+
+ if (emission.NeedsInitOnHeap)
+ emitByrefInitOnHeap(pointer);
}
void CodeGenFunction::BuildBlockRelease(llvm::Value *V, BlockFieldFlags flags,
diff --git a/clang/lib/CodeGen/CGBlocks.h b/clang/lib/CodeGen/CGBlocks.h
index 8d10c4f69b2026..526dbbfd9b38d3 100644
--- a/clang/lib/CodeGen/CGBlocks.h
+++ b/clang/lib/CodeGen/CGBlocks.h
@@ -139,6 +139,7 @@ class BlockByrefInfo {
unsigned FieldIndex;
CharUnits ByrefAlignment;
CharUnits FieldOffset;
+ bool ForceNoopCopy;
};
/// Represents a type of copy/destroy operation that should be performed for an
diff --git a/clang/lib/CodeGen/CGDecl.cpp b/clang/lib/CodeGen/CGDecl.cpp
index ce6d6d8956076e..954dd1079215fd 100644
--- a/clang/lib/CodeGen/CGDecl.cpp
+++ b/clang/lib/CodeGen/CGDecl.cpp
@@ -1450,6 +1450,8 @@ CodeGenFunction::EmitAutoVarAlloca(const VarDecl &D) {
bool isEscapingByRef = D.isEscapingByref();
emission.IsEscapingByRef = isEscapingByRef;
+ emission.IsCapturedByOwnInit = D.isCapturedByOwnInit();
+ emission.NeedsInitOnHeap = D.needsInitOnHeap();
CharUnits alignment = getContext().getDeclAlign(&D);
@@ -1692,68 +1694,6 @@ CodeGenFunction::EmitAutoVarAlloca(const VarDecl &D) {
return emission;
}
-static bool isCapturedBy(const VarDecl &, const Expr *);
-
-/// Determines whether the given __block variable is potentially
-/// captured by the given statement.
-static bool isCapturedBy(const VarDecl &Var, const Stmt *S) {
- if (const Expr *E = dyn_cast<Expr>(S))
- return isCapturedBy(Var, E);
- for (const Stmt *SubStmt : S->children())
- if (isCapturedBy(Var, SubStmt))
- return true;
- return false;
-}
-
-/// Determines whether the given __block variable is potentially
-/// captured by the given expression.
-static bool isCapturedBy(const VarDecl &Var, const Expr *E) {
- // Skip the most common kinds of expressions that make
- // hierarchy-walking expensive.
- E = E->IgnoreParenCasts();
-
- if (const BlockExpr *BE = dyn_cast<BlockExpr>(E)) {
- const BlockDecl *Block = BE->getBlockDecl();
- for (const auto &I : Block->captures()) {
- if (I.getVariable() == &Var)
- return true;
- }
-
- // No need to walk into the subexpressions.
- return false;
- }
-
- if (const StmtExpr *SE = dyn_cast<StmtExpr>(E)) {
- const CompoundStmt *CS = SE->getSubStmt();
- for (const auto *BI : CS->body())
- if (const auto *BIE = dyn_cast<Expr>(BI)) {
- if (isCapturedBy(Var, BIE))
- return true;
- }
- else if (const auto *DS = dyn_cast<DeclStmt>(BI)) {
- // special case declarations
- for (const auto *I : DS->decls()) {
- if (const auto *VD = dyn_cast<VarDecl>((I))) {
- const Expr *Init = VD->getInit();
- if (Init && isCapturedBy(Var, Init))
- return true;
- }
- }
- }
- else
- // FIXME. Make safe assumption assuming arbitrary statements cause capturing.
- // Later, provide code to poke into statements for capture analysis.
- return true;
- return false;
- }
-
- for (const Stmt *SubStmt : E->children())
- if (isCapturedBy(Var, SubStmt))
- return true;
-
- return false;
-}
-
/// Determine whether the given initializer is trivial in the sense
/// that it requires no code to be generated.
bool CodeGenFunction::isTrivialInitializer(const Expr *Init) {
@@ -1918,8 +1858,7 @@ void CodeGenFunction::EmitAutoVarInit(const AutoVarEmission &emission) {
// Check whether this is a byref variable that's potentially
// captured and moved by its own initializer. If so, we'll need to
// emit the initializer first, then copy into the variable.
- bool capturedByInit =
- Init && emission.IsEscapingByRef && isCapturedBy(D, Init);
+ bool capturedByInit = emission.IsCapturedByOwnInit;
bool locIsByrefHeader = !capturedByInit;
const Address Loc =
@@ -1951,8 +1890,9 @@ void CodeGenFunction::EmitAutoVarInit(const AutoVarEmission &emission) {
llvm::Constant *constant = nullptr;
if (emission.IsConstantAggregate ||
D.mightBeUsableInConstantExpressions(getContext())) {
- assert(!capturedByInit && "constant init contains a capturing block?");
constant = ConstantEmitter(*this).tryEmitAbstractForInitializer(D);
+ assert(!(constant && capturedByInit) &&
+ "constant init contains a capturing block?");
if (constant && !constant->isZeroValue() &&
(trivialAutoVarInit !=
LangOptions::TrivialAutoVarInitKind::Uninitialized)) {
@@ -1975,6 +1915,10 @@ void CodeGenFunction::EmitAutoVarInit(const AutoVarEmission &emission) {
initializeWhatIsTechnicallyUninitialized(Loc);
LValue lv = MakeAddrLValue(Loc, type);
lv.setNonGC(true);
+ if (emission.NeedsInitOnHeap) {
+ drillIntoBlockVariable(*this, lv, &D);
+ capturedByInit = false; // We don't need to drill again.
+ }
return EmitExprAsInit(Init, &D, lv, capturedByInit);
}
@@ -2023,6 +1967,9 @@ void CodeGenFunction::EmitExprAsInit(const Expr *init, const ValueDecl *D,
return;
}
case TEK_Aggregate:
+ assert(
+ !capturedByInit &&
+ "Capture-by-initializer should have been handled by NeedsInitOnHeap!");
if (type->isAtomicType()) {
EmitAtomicInit(const_cast<Expr*>(init), lvalue);
} else {
@@ -2031,7 +1978,6 @@ void CodeGenFunction::EmitExprAsInit(const Expr *init, const ValueDecl *D,
Overlap = AggValueSlot::DoesNotOverlap;
else if (auto *FD = dyn_cast<FieldDecl>(D))
Overlap = getOverlapForFieldInit(FD);
- // TODO: how can we delay here if D is captured by its initializer?
EmitAggExpr(init, AggValueSlot::forLValue(
lvalue, *this, AggValueSlot::IsDestructed,
AggValueSlot::DoesNotNeedGCBarriers,
@@ -2120,27 +2066,31 @@ void CodeGenFunction::EmitAutoVarCleanups(const AutoVarEmission &emission) {
// us from jumping into any of these scopes anyway.
if (!HaveInsertPoint()) return;
- const VarDecl &D = *emission.Variable;
+ // If we're initializing directly on the heap, _Block_object_destroy will
+ // handle destruction, so we don't need to perform cleanups directly.
+ if (!emission.NeedsInitOnHeap) {
+ const VarDecl &D = *emission.Variable;
- // Check the type for a cleanup.
- if (QualType::DestructionKind dtorKind = D.needsDestruction(getContext()))
- emitAutoVarTypeCleanup(emission, dtorKind);
+ // Check the type for a cleanup.
+ if (QualType::DestructionKind dtorKind = D.needsDestruction(getContext()))
+ emitAutoVarTypeCleanup(emission, dtorKind);
- // In GC mode, honor objc_precise_lifetime.
- if (getLangOpts().getGC() != LangOptions::NonGC &&
- D.hasAttr<ObjCPreciseLifetimeAttr>()) {
- EHStack.pushCleanup<ExtendGCLifetime>(NormalCleanup, &D);
- }
+ // In GC mode, honor objc_precise_lifetime.
+ if (getLangOpts().getGC() != LangOptions::NonGC &&
+ D.hasAttr<ObjCPreciseLifetimeAttr>()) {
+ EHStack.pushCleanup<ExtendGCLifetime>(NormalCleanup, &D);
+ }
- // Handle the cleanup attribute.
- if (const CleanupAttr *CA = D.getAttr<CleanupAttr>()) {
- const FunctionDecl *FD = CA->getFunctionDec...
[truncated]
Force-pushed from e41ced7 to 9c61a1d.
Rebased, because the original version failed CI due to a regression on
✅ With the latest revision this PR passed the C/C++ code formatter.
Ping :)
Clang special-cases `__block` variables whose initializer contains a block literal that captures the variable itself. But this special case doesn't work for variables of record types, causing Clang to emit broken code. Fix this by initializing such variables directly on the heap, and add a warning for when this needs to be done. rdar://70498802
Force-pushed from fda6b18 to 26be4b6.
Rebased due to merge conflict.
The serialization-related change looks trivial and good, but I feel it's better to leave the formal approval to the CG part reviewers.
Weekly ping.
This generally looks okay, but see the comment about exception safety.
```cpp
// Otherwise, we transfer ownership of the retain from the stack
// to the heap.
```
The indentation here is correct — it's a continuation of the same thought as the comment before the `if`.
Makes sense, but the re-indentation came from clang-format (because I modified the code next to it), and not doing it will upset the CI checks.

How about I move the first comment after the `if` and indent both of them?
```cpp
if (type->isBlockPointerType()) {
  // Block pointers need to be copied, and there's no direct
  // transfer possible.
  return ::buildByrefHelpers(CGM, byrefInfo,
                             ARCStrongBlockByrefHelpers(byrefInfo));
} else {
  // Otherwise, we transfer ownership of the retain from the stack
  // to the heap.
  return ::buildByrefHelpers(CGM, byrefInfo,
                             ARCStrongByrefHelpers(byrefInfo));
}
```
clang/lib/CodeGen/CGBlocks.cpp
Outdated
```diff
+  RawAddress out = CreateDefaultAlignTempAlloca(P->getType(), "initOnHeap.out");
+
+  llvm::Value *args[] = {Builder.CreateBitCast(out.getPointer(), VoidPtrTy),
+                         Builder.CreateBitCast(P, VoidPtrTy),
```
These bitcasts should no longer be necessary, right? Do we have a mode that still generates typed pointers in IR?
If you're trying to handle address-space differences, this is the wrong way to do it.
Ah, this is an artifact of the change having originally been written in 2020. Looks like I can just drop the bitcast. I don't think address-space differences are possible.
clang/lib/CodeGen/CGDecl.cpp
Outdated
```diff
@@ -2120,27 +2066,33 @@ void CodeGenFunction::EmitAutoVarCleanups(const AutoVarEmission &emission) {
   // us from jumping into any of these scopes anyway.
   if (!HaveInsertPoint()) return;

-  const VarDecl &D = *emission.Variable;
+  // If we're initializing directly on the heap, _Block_object_destroy will
+  // handle destruction, so we don't need to perform cleanups directly.
```
I think the right way to think of this is that the stack copy of the byref is always uninitialized in this situation, so there's nothing for us to destroy. Basically, we make an uninitialized byref structure on the stack, ask the runtime to move it to the heap with a trivial move function, and then initialize the object on the heap directly.
Indeed... and looking at this again, I put too much code inside the if statement.
Not only that, there's an existing bug here when combining `__attribute__((cleanup))` with `__block`: if the variable was copied to the heap, the cleanup function will be called after the byref allocation has already been freed.
Will fix.
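For the record, a sketch of the pattern being discussed (hypothetical helper names; blocks code, so it needs `clang -fblocks` and a blocks runtime):

```c
void my_cleanup(int *p);                 // hypothetical cleanup function
void retain_somewhere(void (^b)(void));  // assumed to Block_copy its argument

void f(void) {
  __attribute__((cleanup(my_cleanup))) __block int x = 0;
  retain_somewhere(^{ (void)x; });
  // If the block was copied, x has moved to the heap; the cleanup then runs
  // against the byref allocation, possibly after it has already been freed.
}
```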
clang/lib/CodeGen/CGBlocks.cpp
Outdated
```diff
   auto pair = BlockByrefInfos.insert({D, info});
   assert(pair.second && "info was inserted recursively?");
   return pair.first->second;
 }
+
+void CodeGenFunction::emitByrefInitOnHeap(llvm::Value *P) {
```
This isn't actually doing the initialization, so I think this is misnamed. This is just setting up for the initialization by allocating a byref on the heap.
Good point. I'll change the name to `emitByrefHeapAlloc`.
clang/lib/CodeGen/CGBlocks.cpp
Outdated
```diff
@@ -2764,6 +2794,9 @@ void CodeGenFunction::emitByrefStructureInit(const AutoVarEmission &emission) {
     auto layoutInfo = CGM.getObjCRuntime().BuildByrefLayout(CGM, type);
     storeHeaderField(layoutInfo, getPointerSize(), "byref.layout");
   }
+
+  if (emission.NeedsInitOnHeap)
+    emitByrefInitOnHeap(pointer);
```
So, this is not exception-safe, right? If the initialization throws, we have no way to release the byref object we're allocating here.
Now, it looks like the blocks runtime is itself not exception-safe, so maybe we can hand-wave this? `_Block_byref_assign_copy` allocates, calls the copy operation, and makes no effort to deallocate on a throw. On the other hand, the byref "copy" operation is typically a call to a move constructor, so it is a lot less likely in practice that it'll throw than that the original initialization will.
The only way I can think of to handle this would be to actually put a "successfully initialized" flag in the byref payload and then no-op the destructor if it isn't set. That's implementable but pretty awful. So maybe we should just acknowledge it's not exception-safe and suggest that the blocks runtime add new APIs to support this properly. What do you think?
Argh, you're right.
In principle the `BLOCK_HAS_COPY_DISPOSE` flag could be used for that purpose, instead of creating a separate field in the payload. It could start out unset (making the runtime skip the call to `byref_dispose`), and then be set after initialization.
But the initialization might send the block capturing the object to another thread. If initialization then throws, I assume that nobody will subsequently call the block, or if they do, the block won't actually try to use the object. That's straightforward UB. But it doesn't seem like it should be UB just to send the block. So there might be concurrent calls to `_Block_object_dispose`. Not only that, 'our' `_Block_object_dispose` call (the one that would hypothetically be added on the cleanup path to fix the memory leak) might not be the one to drop the last reference.
Therefore, `flags` would have to be modified atomically, particularly because the same word is also used for the reference count. And even with an atomic operation, you would still have theoretical UB in the blocks runtime when it reads `flags` non-atomically. Though, not any more UB than it's already incurring thanks to potential concurrent refcount modifications... But what if an alternate block runtime tries to cache `flags`? The only alternate runtime I know of is GNUstep's, and it looks like it doesn't cache it, but this still feels very dubious.
So then there's the alternative you suggested of just acknowledging that this isn't exception-safe (specifically, that it leaks memory if initialization throws). But that makes this whole patch feel like an awkward middle ground. If self-capturing `__block` variables are not going to be supported properly (pending new APIs), then maybe they shouldn't be supported at all and should just be an error (pending new APIs).
But if new APIs are added, teaching Clang how to recognize when the target OS supports them would be quite a lot of work for what is ultimately an edge case.
I'm going to see how ugly it ends up being to add a "successfully initialized" flag.
Updated to add the flag. It's not all that bad.
In fact, if, as I suggested in my last comment, we want to support the byref lingering after unwinding, then there must be some kind of runtime state – to track whether whoever drops the last reference needs to run the destructor. It would be more natural for the Blocks runtime to track that state rather than relying on compiler-generated code, but the state has to live somewhere.
If, on the other hand, we don't want to support the byref lingering after unwinding, then I could go back to the idea of mutating the existing `flags` field. In that scenario, the existence of runtime state would be unnecessary, but the implementation would be relatively unobtrusive.
This is an updated version of https://reviews.llvm.org/D90434 from 2020. Due to time constraints I stopped working on the patch, and unfortunately never got back around to it until now.
Clang special-cases `__block` variables whose initializer contains a block literal that captures the variable itself. But this special case doesn't work for variables of record types, causing Clang to emit broken code.

Fix this by initializing such variables directly on the heap, and add a warning for when this needs to be done.
### Background on the special case
__block
variables are unique in that their address can change during the lifetime of the variable. Each__block
variable starts on the stack. But when a block is retained usingBlock_copy
, both the block itself and any__block
variables it captures are moved to the heap.What's tricky is when this move happens while the variable is still being initialized. For example:
Here, when
Block_copy
is called,val
is moved to the heap while still in an uninitialized state. Then whencopyBlockAndReturnInt
returns,main
needs to store the return value to the variable's new location, not to the original stack slot.This can only happen if a block literal that captures the variable is located syntactically within the initializer itself. If it's before the initializer then it couldn't capture the variable, and if it's after the initializer then it couldn't execute during initialization.
So Clang detects when such a literal exists inside the initializer;
CGDecl.cpp
refers to this condition ascapturedByInit
. If it does, then the initialization code operates in a special mode. In several functions in that file, there's anLValue
object and an accompanyingcapturedByInit
flag. IfcapturedByInit
is false (the typical case), then the lvalue points to the object being initialized. But if it's true, the lvalue points to the byref header. Then, depending on the type of the variable, you end up at one of many checks of the form:drillIntoBlockVariable
emits a load of the forwarding pointer from the byref header (to find the object's current location), and then modifieslvalue
to point to the object, unifying the two cases. Importantly, this happens at the last minute: after emitting code to evaluate the initializer, but before emitting thestore
instruction to write the evaluated value into the variable.Okay, but then why are there two different code paths? If the address calculation can be safely deferred, why not just do that all the time rather than having a separate mode just for self-capturing
__block
variables?Because it can't always be safely deferred.
Harder cases
First, consider the case where variable being initialized is a C struct. This is already problematic.
drillIntoBlockVariable
is supposed to be called after evaluating the initializer but before doing astore
. But when initializing a struct, Clang doesn't evaluate the whole initializer before storing. Instead it evaluates the first field, stores the first field, evaluates the second field, stores the second field, and so on. This is actually necessary for correctness1 given code like:But if any one of the field evaluations can move the object, then Clang would need to reload the forwarding pointer before storing each field. And that would need to apply recursively to any subfields. Such a scheme is definitely possible, but Clang doesn't implement it.
And if instead of a C struct you have a C++ class, supporting relocation to the heap becomes flat-out impossible. Consider:
C++ constructors can observe the address of the object being constructed, and expect that address not to change. Once the object has finished being constructed, it can be relocated by calling the move or copy constructor, and that's what Clang uses for block captures in the non-self-capturing case. But in this case,
Block_copy
is called while the stack object is still running its constructor. It would be nonsensical to try to move out of it in this state.Clang's current behavior
So what does Clang currently do? Well, for both C structs and C++ classes (as well as unions2), it runs into this TODO dating to 2011:
Clang simply neglects to call
drillIntoBlockVariable
and falls into its normal initialization code. That means the lvalue is initialized as if it pointed to the object, when it actually points to the byref header. As such, the generated code is non-functional. And this is true for all__block
variables of record type that Clang detects as self-capturing, even if the initializer never actually callsBlock_copy
.Solution
As stated, it's impossible for self-capturing variables of at least some record types to support moving to the heap. But there is a way to make them work regardless. Any given
__block
variable can move at most once, from the stack to the heap. Once on the heap, it never needs to move again. So, as suggested by @rjmccall in 2020, we can create a heap allocation before initialization even starts, and then initialize the object directly on the heap, knowing its address will remain stable.This patch applies this approach to all self-capturing
__block
variables with record type, i.e. all variables that previously triggered the broken code generation.The obvious downside of the approach that we pay the cost of a heap allocation even if
Block_copy
is never called. Conveniently, this can never be a performance regression per se, since the affected cases didn't work at all before. Still, there is room for future improvement, such as by making mid-initialization relocation work for C structs3, or detecting initializers that do contain a block literal but can't possibly callBlock_copy
on it.To assist performance-sensitive codebases, this patch adds a warning,
-Wimplicit-block-var-alloc
, that triggers whenever a block requires an implicit allocation. Since most codebases probably don't care, this warning is disabled by default and is not included in any standard warning groups.Allocation failure
Another downside of the approach is that because the heap allocation is completely implicit, there is no way to check for allocation failure. Normally, heap allocation happens only when you call `Block_copy`, which means you can catch allocation failure by checking whether `Block_copy` returns NULL.

...Or so I thought. It turns out that checking `Block_copy`'s return value is only a half-measure. `Block_copy` does return NULL if there's an allocation failure when moving the block itself to the heap. But then `Block_copy` calls into the copy helper to move captured variables to the heap, which can also fail. If it does, the copy helper has no way to report that failure. Instead you just get a crash (which should probably be a proper abort; Apple folks, see rdar://126632757). This limitation is more or less baked into the ABI, since copy helpers return void. Even xnu, which uses blocks in kernel space, does not support allocation failure for them - though, luckily, xnu's version waits for memory to be available rather than crashing.

The point is, if recovery from allocation failure is already not a thing, then this patch isn't responsible for breaking it. Even if blocks are changed someday to support it, it would probably be desired only by a tiny minority of codebases, and those codebases could avoid implicit allocations by enabling the new warning and setting it as an error.
### ABI limitation
The blocks runtime has no API to allocate a byref struct directly on the heap, because it's never been necessary before. There is only `_Block_object_assign` to move one from the stack to the heap. It would be nice to add an API to allocate directly on the heap, but for the dual sakes of backward compatibility and laziness, I have not attempted to do so.

Instead, the codegen works with the existing runtime, as follows. First, a byref struct is allocated on the stack (including space for the object), but only the header is initialized; the object part is left uninitialized. Then `_Block_object_assign` is called to migrate it to the heap (which unfortunately pays the cost of memcpying the uninitialized stack memory to the heap). This produces the desired heap allocation, but also increases the reference count, which is balanced by immediately calling `_Block_object_dispose`. After that, the object can finally be initialized inside the new heap allocation.

### Improved self-capture analysis
The existing check for self-capturing `__block` variables is implemented in the function `isCapturedBy`. But it turns out that that implementation is overcomplicated for no good reason. In ancient days (2011), `isCapturedBy` only worked on `Expr`s, and it recursed on the `Expr`'s children, which it assumed would also be `Expr`s. This assumption turned out to be false for GNU statement expressions. The simple fix would have been to change `isCapturedBy` to accept and recurse on arbitrary `Stmt`s instead, since `children()` is defined for all `Stmt`s, not just `Expr`s. I'm not sure if there was some reason that wasn't possible at the time, but what happened instead is that a special case was added for statement expressions, which over a series of patches got more and more convoluted.

This patch replaces the whole thing with an `EvaluatedExprVisitor` (another suggestion from @rjmccall). This is not only simpler, but also more precise, eliminating false positive self-captures including:

- Block literals in unevaluated contexts (`sizeof(^{ ... })`).
- Some types of statements inside GNU statement expressions.
Constant initialization
This patch also fixes a related bug where self-capturing `__block` variables could trigger an assertion failure in CodeGen.

The assert is important, because constant initialization doesn't support the `drillIntoBlockVariable` mechanism. Now, a constant initializer obviously can't include a call to `Block_copy`; in fact, it can't include block literals at all unless they have no captures. But I found two cases where the assert can be triggered anyway.

First, the initializer can syntactically contain a capturing block literal but never evaluate it.
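One plausible shape of such an initializer (a hypothetical sketch, not from the patch; requires `-fblocks`):

```c
// The block literal syntactically captures x, but sizeof is an
// unevaluated context, so the block never runs and the initializer
// is still a constant expression.
__block unsigned long x = sizeof(^{ return x; });
```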
This is addressed in this patch by forcing constant-initialized variables to be treated as non-self-capturing.
Second, the initializer can simply not be constant. With variables of types like `const int`, Clang first tries to constant-evaluate the initializer, then falls back to runtime initialization if that fails. But the assert is located in a place where it triggers just based on the try. This is addressed in this patch by moving the assert down and conditioning it on constant evaluation succeeding.

### Footnotes

[^1]: Supporting such code appears to be required by the C++ spec (see https://timsong-cpp.github.io/cppwp/n4861/dcl.init.aggr#6), though not by the C spec (see https://port70.net/~nsz/c/c11/n1570.html#6.7.9p23).

[^2]: There are other `TEK_Aggregate` types besides records, but all such types either can't be captured by blocks (arrays) or can't be the type of a local variable at all.

[^3]: Though even in C, moving mid-initializer could break code that takes the address of the variable while the initializer is still running. However, it's a bit less obvious that such code needs to work; perhaps the user should just be aware the variable will be moving. It's not like C++ where construction is clearly an indivisible operation performed at a single address.