Fix codegen of consteval functions returning an empty class, and related issues #93115
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-clang
Author: Eli Friedman (efriedma-quic)
Changes
If a class is empty, don't store it to memory: the store might overwrite useful data. (See also d60c3d0.)
Fixes #93040.
Full diff: https://github.com/llvm/llvm-project/pull/93115.diff
2 Files Affected:
diff --git a/clang/lib/CodeGen/CGExprAgg.cpp b/clang/lib/CodeGen/CGExprAgg.cpp
index bba00257fd4f0..b1638fa318270 100644
--- a/clang/lib/CodeGen/CGExprAgg.cpp
+++ b/clang/lib/CodeGen/CGExprAgg.cpp
@@ -135,6 +135,17 @@ class AggExprEmitter : public StmtVisitor<AggExprEmitter> {
EnsureDest(E->getType());
if (llvm::Value *Result = ConstantEmitter(CGF).tryEmitConstantExpr(E)) {
+ // An empty record can overlap other data (if declared with
+ // no_unique_address); omit the store for such types - as there is no
+ // actual data to store.
+ if (CGF.getLangOpts().CPlusPlus) {
+ if (const RecordType *RT = E->getType()->getAs<RecordType>()) {
+ CXXRecordDecl *Record = cast<CXXRecordDecl>(RT->getDecl());
+ if (Record->isEmpty())
+ return;
+ }
+ }
+
Address StoreDest = Dest.getAddress();
// The emitted value is guaranteed to have the same size as the
// destination but can have a different type. Just do a bitcast in this
diff --git a/clang/test/CodeGenCXX/cxx2a-consteval.cpp b/clang/test/CodeGenCXX/cxx2a-consteval.cpp
index 075cab58358ab..5d5a62f9928fe 100644
--- a/clang/test/CodeGenCXX/cxx2a-consteval.cpp
+++ b/clang/test/CodeGenCXX/cxx2a-consteval.cpp
@@ -1,4 +1,3 @@
-// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// RUN: %clang_cc1 -emit-llvm %s -std=c++2a -triple x86_64-unknown-linux-gnu -o %t.ll
// RUN: FileCheck -check-prefix=EVAL -input-file=%t.ll %s
// RUN: FileCheck -check-prefix=EVAL-STATIC -input-file=%t.ll %s
@@ -275,3 +274,26 @@ void f() {
// EVAL-FN: call void @_ZN7GH821542S3C2Ei
}
}
+
+namespace GH93040 {
+struct C { char c = 1; };
+struct Empty { consteval Empty() {} };
+struct Test : C, Empty {
+ [[no_unique_address]] Empty e;
+};
+
+void f() {
+ Test test;
+
+// Make sure we don't overwrite the initialization of c.
+
+// EVAL-FN-LABEL: define {{.*}} void @_ZN7GH930404TestC2Ev
+// EVAL-FN: entry:
+// EVAL-FN-NEXT: [[THIS_ADDR:%.*]] = alloca ptr, align 8
+// EVAL-FN-NEXT: store ptr {{.*}}, ptr [[THIS_ADDR]], align 8
+// EVAL-FN-NEXT: [[THIS:%.*]] = load ptr, ptr [[THIS_ADDR]], align 8
+// EVAL-FN-NEXT: call void @_ZN7GH930401CC2Ev(ptr noundef nonnull align 1 dereferenceable(1) [[THIS]])
+// EVAL-FN-NEXT: %0 = getelementptr inbounds i8, ptr [[THIS]], i64 1
+// EVAL-FN-NEXT: ret void
+}
+}
Looks good. Do we also need to worry about overwriting tail padding here?
I didn't think so at first glance... but yes, we do, in certain obscure cases:
I'll look into reworking this.
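For readers following the tail-padding question: the sketch below is not the example the author referred to (that one is not reproduced here). It is a minimal illustration, using hypothetical types `Base` and `Derived`, of how a derived-class member can end up inside the bytes counted by `sizeof(Base)`. Under the Itanium C++ ABI the member is typically placed in the base's tail padding; other ABIs may lay it out differently, so the code reports the layout instead of assuming it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration; they do not come from the patch.
// The user-provided constructor makes Base non-POD for layout purposes,
// which is what lets some ABIs reuse its tail padding.
struct Base {
  Base() : i(0), c(0) {}
  int i;
  char c;
};

struct Derived : Base {
  Derived() : d(0) {}
  char d;
};

// Byte offset of Derived::d from the start of a Derived object.
inline std::size_t offset_of_d() {
  Derived obj;
  const char* start = reinterpret_cast<const char*>(&obj);
  const char* member = &obj.d;
  assert(member > start);  // d always follows the Base members
  return static_cast<std::size_t>(member - start);
}

// True when d lives inside the bytes that sizeof(Base) covers, i.e. when a
// store of sizeof(Base) bytes to the Base subobject would also overwrite d.
inline bool tail_padding_is_reused() {
  return offset_of_d() < sizeof(Base);
}
```

If `tail_padding_is_reused()` returns true on a given target, a store of `sizeof(Base)` bytes into the `Base` subobject of a `Derived` would also overwrite `d`, which is the hazard raised in the question above.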
clang/lib/CodeGen/CGExprAgg.cpp
Outdated
@@ -135,6 +135,17 @@ class AggExprEmitter : public StmtVisitor<AggExprEmitter> {
EnsureDest(E->getType());

if (llvm::Value *Result = ConstantEmitter(CGF).tryEmitConstantExpr(E)) {
// An empty record can overlap other data (if declared with
// no_unique_address); omit the store for such types - as there is no
Naive question: an empty record still needs one byte when its address is taken (hence this comment about no_unique_address, I guess), so why don't we see that in the diff?
See what, exactly? Given the derived class, computing the address of the base class doesn't take any instructions, because it's the same address.
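The point about the base class sharing an address can be checked with a small model. The types and helper names below (`Holder`, `naive_store`, `store_unless_empty`) are hypothetical, and the sketch uses an empty base class instead of the PR's `no_unique_address` test case; `memset` stands in for the store that codegen would emit. Assuming the usual empty-base layout, the one-byte store to the `Empty` subobject lands on `c`, while the checked variant leaves it alone.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical types for illustration; this is not the PR's test case.
struct Empty {};
struct Holder : Empty {
  char c = 1;
};

// Models an unconditional aggregate store: writes sizeof(T) zero bytes.
template <typename T>
void naive_store(T* dest) {
  std::memset(static_cast<void*>(dest), 0, sizeof(T));
}

// Models the check from the first revision of the patch: an empty class
// carries no data, so nothing is written at all.
template <typename T>
void store_unless_empty(T* dest) {
  if (std::is_empty<T>::value)
    return;
  std::memset(static_cast<void*>(dest), 0, sizeof(T));
}

// Value of c after "storing" an Empty into the base subobject, naively.
inline int c_after_naive_store() {
  Holder h;
  Empty* base = &h;  // derived-to-base conversion, no address adjustment
  assert(static_cast<void*>(base) == static_cast<void*>(&h.c));
  naive_store(base);
  return h.c;
}

// Same, but with the emptiness check in place.
inline int c_after_checked_store() {
  Holder h;
  Empty* base = &h;
  store_unless_empty(base);
  return h.c;
}
```

Here `c_after_naive_store()` shows the initializer of `c` being clobbered, and `c_after_checked_store()` shows it surviving once the store is skipped for empty types.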
bdfcc72 to 19f3b67
✅ With the latest revision this PR passed the C/C++ code formatter.
@@ -177,7 +179,12 @@ kernel void KernelTwoMember(struct StructTwoMember u) {
// AMDGCN-LABEL: define{{.*}} amdgpu_kernel void @KernelLargeTwoMember
// AMDGCN-SAME: (%struct.LargeStructTwoMember %[[u_coerce:.*]])
// AMDGCN: %[[u:.*]] = alloca %struct.LargeStructTwoMember, align 8, addrspace(5)
// AMDGCN: store %struct.LargeStructTwoMember %[[u_coerce]], ptr addrspace(5) %[[u]]
// AMDGCN: %[[U_PTR0:.*]] = getelementptr inbounds %struct.LargeStructTwoMember, ptr addrspace(5) %[[u]], i32 0, i32 0
Unifying the codepaths makes FCA promotion happen more often.
@@ -46,9 +46,9 @@ int mane() {
char1 f1{1};
char1 f2{1};

// CHECK: [[TMP:%.+]] = alloca i16
The revised version of casting integers is a bit more aggressive; it's hard to make it precisely match the old code while still preserving the correct semantics.
// CHECK-NEXT: [[VALUE_COERCE_FCA_0_1_EXTRACT:%.*]] = extractvalue [[STRUCT_UINT32X4X2_T]] [[VALUE_COERCE]], 0, 1
// CHECK-NEXT: call void @llvm.arm.mve.vst2q.p0.v4i32(ptr [[ADDR:%.*]], <4 x i32> [[VALUE_COERCE_FCA_0_0_EXTRACT]], <4 x i32> [[VALUE_COERCE_FCA_0_1_EXTRACT]], i32 0)
// CHECK-NEXT: call void @llvm.arm.mve.vst2q.p0.v4i32(ptr [[ADDR]], <4 x i32> [[VALUE_COERCE_FCA_0_0_EXTRACT]], <4 x i32> [[VALUE_COERCE_FCA_0_1_EXTRACT]], i32 1)
// CHECK-NEXT: [[TMP0:%.*]] = extractvalue [[STRUCT_UINT32X4X2_T:%.*]] [[VALUE_COERCE:%.*]], 0
Apparently I've stumbled over some limitation of instcombine.
@@ -44,20 +44,20 @@ struct S1 f1(struct S1 s1) { return s1; }

// CHECK-SOFT: define{{.*}} void @_Z2f22S2(ptr dead_on_unwind noalias nocapture writable writeonly sret(%struct.S2) align 8 %agg.result, [4 x i32] %s2.coerce)
// CHECK-HARD: define{{.*}} arm_aapcs_vfpcc [2 x <2 x i32>] @_Z2f22S2([2 x <2 x i32>] returned %s2.coerce)
// CHECK-FULL: define{{.*}} arm_aapcs_vfpcc %struct.S2 @_Z2f22S2(%struct.S2 returned %s2.coerce)
// CHECK-FULL: define{{.*}} arm_aapcs_vfpcc %struct.S2 @_Z2f22S2(%struct.S2 %s2.coerce)
This is also the instcombine issue.
19f3b67 to 816ceb2
(I'd like a re-review of the latest version: I made significant revisions to address the tail-padding issues.)
clang/lib/CodeGen/CodeGenFunction.h
Outdated
/// Build all the stores needed to initialize an aggregate at Dest with the
/// value Val.
This comment looks out of date.
…ted issues
If a class is empty, don't store it to memory: the store might overwrite useful data. Similarly, if a class has tail padding that might overlap other fields, don't store the tail padding to memory.
The problem here turned out a bit more general than I initially thought: basically all uses of EmitAggregateStore were broken. Call lowering had a method that did mostly the right thing, though: CreateCoercedStore. Adapt CreateCoercedStore so it always does the conservatively right thing, and use it for both calls and ConstantExpr.
Also, along the way, fix the "overlap" bit in AggValueSlot: the bit was set incorrectly for empty classes in some cases.
Fixes llvm#93040.
816ceb2 to 75a99e3
Is this worth backporting, given that it is a codegen bug fix?
IMO, it's worth considering, but if we want to go down this route, I think we need to do so relatively quickly -- we have about two weeks until rc3, and given the size of this change, I'm not certain we should try landing it any later than rc3 just due to risk.
This is maybe slightly risky in terms of possible regressions, but it is a fix for a miscompile, and we're early enough in the release process that it's probably fine.
/cherry-pick 1762e01
Failed to create pull request for issue93115 https://github.com/llvm/llvm-project/actions/runs/10255950901
/cherry-pick 1762e01
…ted issues (llvm#93115)
Fix codegen of consteval functions returning an empty class, and related issues
If a class is empty, don't store it to memory: the store might overwrite useful data. Similarly, if a class has tail padding that might overlap other fields, don't store the tail padding to memory.
The problem here turned out a bit more general than I initially thought: basically all uses of EmitAggregateStore were broken. Call lowering had a method that did mostly the right thing, though: CreateCoercedStore. Adapt CreateCoercedStore so it always does the conservatively right thing, and use it for both calls and ConstantExpr.
Also, along the way, fix the "overlap" bit in AggValueSlot: the bit was set incorrectly for empty classes in some cases.
Fixes llvm#93040.
(cherry picked from commit 1762e01)
/pull-request #102070
Fix codegen of consteval functions returning an empty class, and related issues
If a class is empty, don't store it to memory: the store might overwrite useful data. Similarly, if a class has tail padding that might overlap other fields, don't store the tail padding to memory.
The problem here turned out a bit more general than I initially thought: basically all uses of EmitAggregateStore were broken. Call lowering had a method that did mostly the right thing, though: CreateCoercedStore. Adapt CreateCoercedStore so it always does the conservatively right thing, and use it for both calls and ConstantExpr.
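One way to picture the "conservatively right" store described above is a copy that is bounded by how many destination bytes are known to be safe to overwrite. The following is a byte-level model under that assumption, not Clang's CreateCoercedStore: the function `store_bounded`, its parameters, and the buffer contents are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies at most `writable` bytes of `value` into `dest` and returns the
// number of bytes written. `writable` stands for the destination bytes that
// belong only to the object being stored: 0 for an empty class, or the data
// size for a class whose tail padding may hold another field.
inline std::size_t store_bounded(unsigned char* dest,
                                 const unsigned char* value,
                                 std::size_t value_size,
                                 std::size_t writable) {
  assert(dest != nullptr && value != nullptr);
  const std::size_t n = std::min(value_size, writable);
  if (n != 0)
    std::memcpy(dest, value, n);
  return n;
}

// An 8-byte value whose last 3 bytes are padding is stored into a
// destination whose bytes 5..7 hold a neighbouring field's data.
// Returns true if the neighbour's bytes are still intact afterwards.
inline bool neighbour_survives(std::size_t writable) {
  unsigned char storage[8] = {0, 0, 0, 0, 0, 0xAA, 0xBB, 0xCC};
  const unsigned char value[8] = {1, 2, 3, 4, 5, 0, 0, 0};
  store_bounded(storage, value, sizeof(value), writable);
  return storage[5] == 0xAA && storage[6] == 0xBB && storage[7] == 0xCC;
}
```

With `writable` set to 5 (the data size in this model) or 0 (the empty-class case) the neighbouring bytes survive; with `writable` set to the full 8 bytes they are overwritten, which corresponds to the unconditional store that the description calls broken.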
Also, along the way, fix the "overlap" bit in AggValueSlot: the bit was set incorrectly for empty classes in some cases.
Fixes #93040.