Fix codegen of consteval functions returning an empty class, and related issues #93115

Merged
efriedma-quic merged 1 commit into llvm:main from consteval-empty-struct
Aug 1, 2024

Conversation

efriedma-quic
Collaborator

@efriedma-quic efriedma-quic commented May 23, 2024

Fix codegen of consteval functions returning an empty class, and related issues

If a class is empty, don't store it to memory: the store might overwrite useful data. Similarly, if a class has tail padding that might overlap other fields, don't store the tail padding to memory.

The problem here turned out a bit more general than I initially thought: basically all uses of EmitAggregateStore were broken. Call lowering had a method that did mostly the right thing, though: CreateCoercedStore. Adapt CreateCoercedStore so it always does the conservatively right thing, and use it for both calls and ConstantExpr.

Also, along the way, fix the "overlap" bit in AggValueSlot: the bit was set incorrectly for empty classes in some cases.

Fixes #93040.
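
To make the hazard concrete, here is a minimal sketch (not code from this PR) that imitates the problematic store by hand. The types mirror the GH93040 test case without the consteval constructor and the no_unique_address member. The helper names `c_after_storing_empty_base` and `check_empty_base_store`, and the use of `std::memset` as a stand-in for the emitted store, are assumptions made for illustration. The sketch also assumes an Itanium C++ ABI target (for example x86-64 Linux), where the `Empty` base is placed at offset 0 and therefore shares its address with `C::c`.

```cpp
#include <cassert>
#include <cstring>

struct C { char c = 1; };
struct Empty {};
struct Test : C, Empty {};

// Hypothetical helper: returns C::c after writing sizeof(Empty) bytes at the
// address of the Empty base subobject.
inline int c_after_storing_empty_base() {
  Test t;                               // t.c is 1 after construction
  Empty *base = &t;                     // derived-to-base conversion
  std::memset(base, 0, sizeof(Empty));  // stand-in for a 1-byte aggregate store
  return t.c;
}

// Assumes the Itanium layout described in the lead-in.
inline void check_empty_base_store() {
  static_assert(sizeof(Empty) == 1, "an empty class still has size 1");
  assert(c_after_storing_empty_base() == 0);  // the initializer of c was lost
}
```

Calling `check_empty_base_store()` from a `main` exercises the sketch. Skipping the store entirely for empty classes, as the description proposes, leaves `c` at its initialized value.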

@llvmbot llvmbot added the clang and clang:codegen labels May 23, 2024
@llvmbot
Member

llvmbot commented May 23, 2024

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-clang

Author: Eli Friedman (efriedma-quic)

Changes

If a class is empty, don't store it to memory: the store might overwrite useful data.

(See also d60c3d0.)

Fixes #93040.


Full diff: https://github.com/llvm/llvm-project/pull/93115.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CGExprAgg.cpp (+11)
  • (modified) clang/test/CodeGenCXX/cxx2a-consteval.cpp (+23-1)
diff --git a/clang/lib/CodeGen/CGExprAgg.cpp b/clang/lib/CodeGen/CGExprAgg.cpp
index bba00257fd4f0..b1638fa318270 100644
--- a/clang/lib/CodeGen/CGExprAgg.cpp
+++ b/clang/lib/CodeGen/CGExprAgg.cpp
@@ -135,6 +135,17 @@ class AggExprEmitter : public StmtVisitor<AggExprEmitter> {
     EnsureDest(E->getType());
 
     if (llvm::Value *Result = ConstantEmitter(CGF).tryEmitConstantExpr(E)) {
+      // An empty record can overlap other data (if declared with
+      // no_unique_address); omit the store for such types - as there is no
+      // actual data to store.
+      if (CGF.getLangOpts().CPlusPlus) {
+        if (const RecordType *RT = E->getType()->getAs<RecordType>()) {
+          CXXRecordDecl *Record = cast<CXXRecordDecl>(RT->getDecl());
+          if (Record->isEmpty())
+            return;
+        }
+      }
+
       Address StoreDest = Dest.getAddress();
       // The emitted value is guaranteed to have the same size as the
       // destination but can have a different type. Just do a bitcast in this
diff --git a/clang/test/CodeGenCXX/cxx2a-consteval.cpp b/clang/test/CodeGenCXX/cxx2a-consteval.cpp
index 075cab58358ab..5d5a62f9928fe 100644
--- a/clang/test/CodeGenCXX/cxx2a-consteval.cpp
+++ b/clang/test/CodeGenCXX/cxx2a-consteval.cpp
@@ -1,4 +1,3 @@
-// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
 // RUN: %clang_cc1 -emit-llvm %s -std=c++2a -triple x86_64-unknown-linux-gnu -o %t.ll
 // RUN: FileCheck -check-prefix=EVAL -input-file=%t.ll %s
 // RUN: FileCheck -check-prefix=EVAL-STATIC -input-file=%t.ll %s
@@ -275,3 +274,26 @@ void f() {
     // EVAL-FN:     call void @_ZN7GH821542S3C2Ei
 }
 }
+
+namespace GH93040 {
+struct C { char c = 1; };
+struct Empty { consteval Empty() {} };
+struct Test : C, Empty {
+  [[no_unique_address]] Empty e;
+};
+
+void f() {
+  Test test;
+
+// Make sure we don't overwrite the initialization of c.
+
+// EVAL-FN-LABEL: define {{.*}} void @_ZN7GH930404TestC2Ev
+// EVAL-FN: entry:
+// EVAL-FN-NEXT:  [[THIS_ADDR:%.*]] = alloca ptr, align 8
+// EVAL-FN-NEXT:  store ptr {{.*}}, ptr [[THIS_ADDR]], align 8
+// EVAL-FN-NEXT:  [[THIS:%.*]] = load ptr, ptr [[THIS_ADDR]], align 8
+// EVAL-FN-NEXT:  call void @_ZN7GH930401CC2Ev(ptr noundef nonnull align 1 dereferenceable(1) [[THIS]])
+// EVAL-FN-NEXT:  %0 = getelementptr inbounds i8, ptr [[THIS]], i64 1
+// EVAL-FN-NEXT:  ret void
+}
+}

@llvmbot
Member

llvmbot commented May 23, 2024

@llvm/pr-subscribers-clang-codegen

Author: Eli Friedman (efriedma-quic)


@efriedma-quic efriedma-quic requested a review from mstorsjo May 23, 2024 00:36
Collaborator

@zygoloid zygoloid left a comment

Looks good. Do we also need to worry about overwriting tail padding here?

@efriedma-quic
Collaborator Author

I didn't think so at first glance... but yes, we do, in certain obscure cases:

#include <new>
struct A { char c; A(); };
struct __attribute((packed)) S  { char a; int x; __attribute((aligned(2))) char y; consteval S() : x(1), a(3), y(2) {} };
struct S2 { [[no_unique_address]] S s; [[no_unique_address]] A a; };
static_assert(sizeof(S)==8 && sizeof(S2)==8);
void f2(S2 *s) { new (&s->s) S; }

I'll look into reworking this.
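
The tail-padding hazard can also be reproduced without consteval or no_unique_address. The sketch below is an analogue of the example above, not a reduction of it: it relies on base-class tail padding reuse instead, and the names `Base`, `Derived`, `offset_of_z`, `z_after_full_size_store` and `check_tail_padding` are made up for illustration. It assumes the Itanium C++ ABI on a target with a 4-byte `int`, and uses `std::memset` as a hand-written stand-in for a store that covers all `sizeof(Base)` bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The user-provided constructor makes Base non-POD for layout purposes, so
// the Itanium C++ ABI is allowed to reuse its tail padding.
struct Base {
  int x;
  char y;
  Base() : x(1), y(2) {}
};

// z is placed inside the tail padding of the Base subobject.
struct Derived : Base {
  char z = 7;
};

// Byte offset of Derived::z from the start of the object.
inline std::size_t offset_of_z() {
  Derived d;
  const char *whole = reinterpret_cast<const char *>(&d);
  const char *member = reinterpret_cast<const char *>(&d.z);
  return static_cast<std::size_t>(member - whole);
}

// Zeroes sizeof(Base) bytes at the Base subobject, padding included, and
// returns what is left in z afterwards.
inline int z_after_full_size_store() {
  Derived d;
  void *base = static_cast<Base *>(&d);
  std::memset(base, 0, sizeof(Base));
  return d.z;
}

// Assumes the layout described in the lead-in.
inline void check_tail_padding() {
  assert(sizeof(Derived) == sizeof(Base));  // z did not grow the object
  assert(offset_of_z() < sizeof(Base));     // z lives in Base's tail padding
  assert(z_after_full_size_store() == 0);   // the full-size store clobbered z
}
```

A store limited to the data size of `Base` (the bytes up to and including `y`) would leave `z` intact, which is the behavior the question about tail padding is asking for.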

@@ -135,6 +135,17 @@ class AggExprEmitter : public StmtVisitor<AggExprEmitter> {
     EnsureDest(E->getType());
 
     if (llvm::Value *Result = ConstantEmitter(CGF).tryEmitConstantExpr(E)) {
+      // An empty record can overlap other data (if declared with
+      // no_unique_address); omit the store for such types - as there is no
Collaborator

Naive question: an empty record still needs one byte when its address is taken (hence this comment about no_unique_address, I guess), so why don't we see that in the diff?

Collaborator Author

See what, exactly? Given the derived class, computing the address of the base class doesn't take any instructions, because it's the same address.
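
As a rough illustration of that point, the sketch below measures the byte offset produced by a derived-to-base conversion. The types `A`, `B`, `Empty`, `D` and the helpers `base_offset`, `offset_of_empty_base`, `offset_of_b_base` and `check_base_offsets` are hypothetical and not taken from the PR. The expected offsets assume the Itanium C++ ABI, where the empty base sits at offset 0 and the second non-empty base follows the first.

```cpp
#include <cassert>
#include <cstddef>

struct A { int a; };
struct B { int b; };
struct Empty {};
struct D : A, B, Empty {};

// Byte distance between a D object and its BaseT subobject.
template <typename BaseT>
std::ptrdiff_t base_offset(D &d) {
  const char *whole = reinterpret_cast<const char *>(&d);
  const char *base = reinterpret_cast<const char *>(static_cast<BaseT *>(&d));
  return base - whole;
}

inline std::ptrdiff_t offset_of_empty_base() {
  D d{};
  return base_offset<Empty>(d);
}

inline std::ptrdiff_t offset_of_b_base() {
  D d{};
  return base_offset<B>(d);
}

// Assumes the layout described in the lead-in.
inline void check_base_offsets() {
  // Same address as the derived object: the conversion needs no arithmetic.
  assert(offset_of_empty_base() == 0);
  // A non-empty second base does need a pointer adjustment.
  assert(offset_of_b_base() == static_cast<std::ptrdiff_t>(sizeof(int)));
}
```

An offset of zero means the conversion is a no-op in the generated code, which is why no address computation for the empty base shows up in the expected IR.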

@cor3ntin cor3ntin requested a review from erichkeane May 23, 2024 19:08
@efriedma-quic efriedma-quic force-pushed the consteval-empty-struct branch from bdfcc72 to 19f3b67 Compare May 24, 2024 01:43
@efriedma-quic efriedma-quic changed the title Fix codegen of consteval functions returning an empty class. Fix codegen of consteval functions returning an empty class, and related issues May 24, 2024

github-actions bot commented May 24, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -177,7 +179,12 @@ kernel void KernelTwoMember(struct StructTwoMember u) {
// AMDGCN-LABEL: define{{.*}} amdgpu_kernel void @KernelLargeTwoMember
// AMDGCN-SAME: (%struct.LargeStructTwoMember %[[u_coerce:.*]])
// AMDGCN: %[[u:.*]] = alloca %struct.LargeStructTwoMember, align 8, addrspace(5)
// AMDGCN: store %struct.LargeStructTwoMember %[[u_coerce]], ptr addrspace(5) %[[u]]
// AMDGCN: %[[U_PTR0:.*]] = getelementptr inbounds %struct.LargeStructTwoMember, ptr addrspace(5) %[[u]], i32 0, i32 0
Collaborator Author

Unifying the codepaths makes FCA promotion happen more often.

@@ -46,9 +46,9 @@ int mane() {
char1 f1{1};
char1 f2{1};

// CHECK: [[TMP:%.+]] = alloca i16
Collaborator Author

The revised version of casting integers is a bit more aggressive; it's hard to make it precisely match the old code while still preserving the correct semantics.

// CHECK-NEXT: [[VALUE_COERCE_FCA_0_1_EXTRACT:%.*]] = extractvalue [[STRUCT_UINT32X4X2_T]] [[VALUE_COERCE]], 0, 1
// CHECK-NEXT: call void @llvm.arm.mve.vst2q.p0.v4i32(ptr [[ADDR:%.*]], <4 x i32> [[VALUE_COERCE_FCA_0_0_EXTRACT]], <4 x i32> [[VALUE_COERCE_FCA_0_1_EXTRACT]], i32 0)
// CHECK-NEXT: call void @llvm.arm.mve.vst2q.p0.v4i32(ptr [[ADDR]], <4 x i32> [[VALUE_COERCE_FCA_0_0_EXTRACT]], <4 x i32> [[VALUE_COERCE_FCA_0_1_EXTRACT]], i32 1)
// CHECK-NEXT: [[TMP0:%.*]] = extractvalue [[STRUCT_UINT32X4X2_T:%.*]] [[VALUE_COERCE:%.*]], 0
Collaborator Author

Apparently I've stumbled over some limitation of instcombine.

@@ -44,20 +44,20 @@ struct S1 f1(struct S1 s1) { return s1; }

// CHECK-SOFT: define{{.*}} void @_Z2f22S2(ptr dead_on_unwind noalias nocapture writable writeonly sret(%struct.S2) align 8 %agg.result, [4 x i32] %s2.coerce)
// CHECK-HARD: define{{.*}} arm_aapcs_vfpcc [2 x <2 x i32>] @_Z2f22S2([2 x <2 x i32>] returned %s2.coerce)
// CHECK-FULL: define{{.*}} arm_aapcs_vfpcc %struct.S2 @_Z2f22S2(%struct.S2 returned %s2.coerce)
// CHECK-FULL: define{{.*}} arm_aapcs_vfpcc %struct.S2 @_Z2f22S2(%struct.S2 %s2.coerce)
Collaborator Author

This is also the instcombine issue.

@efriedma-quic efriedma-quic force-pushed the consteval-empty-struct branch from 19f3b67 to 816ceb2 Compare June 9, 2024 21:02
@efriedma-quic
Collaborator Author

(I'd like a re-review of the latest version: I made significant revisions to address the tail-padding issues.)

Comment on lines 4773 to 4774
/// Build all the stores needed to initialize an aggregate at Dest with the
/// value Val.
Collaborator

This comment looks out of date.

…ted issues

If a class is empty, don't store it to memory: the store might overwrite
useful data.  Similarly, if a class has tail padding that might overlap
other fields, don't store the tail padding to memory.

The problem here turned out a bit more general than I initially thought:
basically all uses of EmitAggregateStore were broken. Call lowering had
a method that did mostly the right thing, though: CreateCoercedStore.
Adapt CreateCoercedStore so it always does the conservatively right
thing, and use it for both calls and ConstantExpr.

Also, along the way, fix the "overlap" bit in AggValueSlot: the bit was
set incorrectly for empty classes in some cases.

Fixes llvm#93040.
@efriedma-quic efriedma-quic force-pushed the consteval-empty-struct branch from 816ceb2 to 75a99e3 Compare July 30, 2024 06:31
@efriedma-quic efriedma-quic merged commit 1762e01 into llvm:main Aug 1, 2024
7 checks passed
@AZero13
Contributor

AZero13 commented Aug 2, 2024

Is this worth backporting, given that it is a codegen bugfix?

@AaronBallman
Collaborator

Is this worth backporting, given that it is a codegen bugfix?

IMO, it's worth considering, but if we want to go down this route, I think we need to do so relatively quickly -- we have about two weeks until rc3, and given the size of this change, I'm not certain we should try landing it any later than rc3 just due to risk.

@efriedma-quic efriedma-quic added this to the LLVM 19.X Release milestone Aug 5, 2024
@efriedma-quic
Collaborator Author

This is maybe slightly risky in terms of possible regressions, but it is a fix for a miscompile, and we're early enough in the release process that it's probably fine.

/cherry-pick 1762e01

@llvmbot
Member

llvmbot commented Aug 5, 2024

Failed to create pull request for issue93115 https://github.com/llvm/llvm-project/actions/runs/10255950901

@efriedma-quic
Collaborator Author

/cherry-pick 1762e01

llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Aug 5, 2024
…ted issues (llvm#93115)

Fix codegen of consteval functions returning an empty class, and related
issues

If a class is empty, don't store it to memory: the store might overwrite
useful data. Similarly, if a class has tail padding that might overlap
other fields, don't store the tail padding to memory.

The problem here turned out a bit more general than I initially thought:
basically all uses of EmitAggregateStore were broken. Call lowering had
a method that did mostly the right thing, though: CreateCoercedStore.
Adapt CreateCoercedStore so it always does the conservatively right
thing, and use it for both calls and ConstantExpr.

Also, along the way, fix the "overlap" bit in AggValueSlot: the bit was
set incorrectly for empty classes in some cases.

Fixes llvm#93040.

(cherry picked from commit 1762e01)
@llvmbot
Member

llvmbot commented Aug 5, 2024

/pull-request #102070

tru pushed a commit to llvmbot/llvm-project that referenced this pull request Sep 10, 2024
…ted issues (llvm#93115)

Labels
backend:AMDGPU, clang:codegen, clang
Projects
Development

Successfully merging this pull request may close these issues.

Codegen bug: Derived ctor with consteval base ctor
6 participants