[Clang] Set writable and dead_on_unwind attributes on sret arguments #77116

Merged
nikic merged 1 commit into llvm:main on Jan 11, 2024

nikic (Contributor) commented Jan 5, 2024

Set the writable and dead_on_unwind attributes for sret arguments. These indicate that the argument points to writable memory (and it's legal to introduce spurious writes to it on entry to the function) and that the argument memory will not be used if the call unwinds.

This enables additional MemCpyOpt/DSE/LICM optimizations.

I hope there isn't some subtle NRVO-related reason why this is illegal...
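
As a rough illustration of the two attributes, the following C++ sketch models the sret lowering by hand. `Big`, `mayThrow`, `fill`, and `fillLowered` are hypothetical names that do not appear in the patch, and `fillLowered` only approximates the IR Clang emits. The first store to `values[0]` is overwritten before the function returns, so it could only be observed if the caller read the return slot after an unwind. `dead_on_unwind` rules that out, which is the kind of fact DSE can use. `writable` additionally tells the optimizer that stores to the slot may be introduced on entry without trapping or racing.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical aggregate, large enough that common ABIs return it in memory.
struct Big {
  int values[64];
};

// Hypothetical helper standing in for any call that may unwind.
void mayThrow(bool fail) {
  if (fail)
    throw std::runtime_error("unwinding");
}

// Source-level form: the aggregate is returned by value.
Big fill(bool fail) {
  Big result;
  result.values[0] = 1; // overwritten after the call below
  mayThrow(fail);
  for (int i = 0; i < 64; ++i)
    result.values[i] = i + 2;
  return result;
}

// Hand-written model of the lowered form: the caller passes the return slot.
void fillLowered(Big *out, bool fail) {
  out->values[0] = 1; // dead unless `out` is read after an unwind
  mayThrow(fail);
  for (int i = 0; i < 64; ++i)
    out->values[i] = i + 2;
}
```

On the non-throwing path both forms produce the same contents, so removing the first store changes nothing a conforming caller can see.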

llvmbot added the clang, backend:AMDGPU, backend:RISC-V, backend:WebAssembly, backend:X86, clang:modules, clang:codegen, HLSL, coroutines, and clang:openmp labels on Jan 5, 2024
llvmbot (Collaborator) commented Jan 5, 2024

@llvm/pr-subscribers-clang-modules
@llvm/pr-subscribers-backend-webassembly
@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-coroutines
@llvm/pr-subscribers-hlsl
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-clang

Author: Nikita Popov (nikic)

Patch is 416.97 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/77116.diff

134 Files Affected:

  • (modified) clang/lib/CodeGen/CGCall.cpp (+2)
  • (modified) clang/test/CodeGen/2006-05-19-SingleEltReturn.c (+2-2)
  • (modified) clang/test/CodeGen/64bit-swiftcall.c (+11-11)
  • (modified) clang/test/CodeGen/CSKY/csky-abi.c (+2-2)
  • (modified) clang/test/CodeGen/CSKY/csky-hard-abi.c (+22-22)
  • (modified) clang/test/CodeGen/CSKY/csky-soft-abi.c (+22-22)
  • (modified) clang/test/CodeGen/PowerPC/aix-alignment.c (+2-2)
  • (modified) clang/test/CodeGen/PowerPC/powerpc-c99complex.c (+3-3)
  • (modified) clang/test/CodeGen/PowerPC/ppc-aggregate-abi.cpp (+11-11)
  • (modified) clang/test/CodeGen/PowerPC/ppc32-and-aix-struct-return.c (+11-11)
  • (modified) clang/test/CodeGen/PowerPC/ppc64-align-struct.c (+8-8)
  • (modified) clang/test/CodeGen/PowerPC/ppc64-elf-abi.c (+1-1)
  • (modified) clang/test/CodeGen/PowerPC/ppc64-soft-float.c (+23-23)
  • (modified) clang/test/CodeGen/PowerPC/ppc64-vector.c (+2-2)
  • (modified) clang/test/CodeGen/PowerPC/ppc64le-aggregates.c (+6-6)
  • (modified) clang/test/CodeGen/PowerPC/ppc64le-f128Aggregates.c (+2-2)
  • (modified) clang/test/CodeGen/RISCV/bfloat-abi.c (+2-2)
  • (modified) clang/test/CodeGen/RISCV/riscv-abi.cpp (+4-4)
  • (modified) clang/test/CodeGen/RISCV/riscv32-abi.c (+33-33)
  • (modified) clang/test/CodeGen/RISCV/riscv64-abi.c (+5-5)
  • (modified) clang/test/CodeGen/SystemZ/gnu-atomic-builtins-i128-8Al.c (+15-39)
  • (modified) clang/test/CodeGen/SystemZ/systemz-abi-vector.c (+62-62)
  • (modified) clang/test/CodeGen/SystemZ/systemz-abi.c (+47-47)
  • (modified) clang/test/CodeGen/SystemZ/systemz-abi.cpp (+18-18)
  • (modified) clang/test/CodeGen/SystemZ/systemz-inline-asm.c (+1-1)
  • (modified) clang/test/CodeGen/WebAssembly/wasm-arguments.c (+6-6)
  • (modified) clang/test/CodeGen/WebAssembly/wasm-varargs.c (+2-2)
  • (modified) clang/test/CodeGen/X86/x86_32-arguments-darwin.c (+9-9)
  • (modified) clang/test/CodeGen/X86/x86_32-arguments-iamcu.c (+1-1)
  • (modified) clang/test/CodeGen/X86/x86_64-arguments-nacl.c (+1-1)
  • (modified) clang/test/CodeGen/X86/x86_64-arguments-win32.c (+1-1)
  • (modified) clang/test/CodeGen/X86/x86_64-arguments.c (+3-3)
  • (modified) clang/test/CodeGen/aarch64-sve-acle-__ARM_FEATURE_SVE_VECTOR_OPERATORS.c (+1-1)
  • (modified) clang/test/CodeGen/aarch64-varargs.c (+2-2)
  • (modified) clang/test/CodeGen/aggregate-assign-call.c (+2-2)
  • (modified) clang/test/CodeGen/aligned-sret.c (+1-1)
  • (modified) clang/test/CodeGen/arc/arguments.c (+4-4)
  • (modified) clang/test/CodeGen/arm-aapcs-vfp.c (+1-1)
  • (modified) clang/test/CodeGen/arm-arguments.c (+14-14)
  • (modified) clang/test/CodeGen/arm-homogenous.c (+4-4)
  • (modified) clang/test/CodeGen/arm-neon-vld.c (+72-72)
  • (modified) clang/test/CodeGen/arm-swiftcall.c (+13-13)
  • (modified) clang/test/CodeGen/arm-varargs.c (+9-9)
  • (modified) clang/test/CodeGen/arm-vector-arguments.c (+3-3)
  • (modified) clang/test/CodeGen/arm-vfp16-arguments.c (+1-1)
  • (modified) clang/test/CodeGen/arm-vfp16-arguments2.cpp (+5-5)
  • (modified) clang/test/CodeGen/arm64-arguments.c (+2-2)
  • (modified) clang/test/CodeGen/arm64-microsoft-arguments.cpp (+17-17)
  • (modified) clang/test/CodeGen/arm64_32.c (+1-1)
  • (modified) clang/test/CodeGen/armv7k-abi.c (+1-1)
  • (modified) clang/test/CodeGen/attr-noundef.cpp (+3-3)
  • (modified) clang/test/CodeGen/blocks.c (+1-1)
  • (modified) clang/test/CodeGen/c11atomics-ios.c (+2-2)
  • (modified) clang/test/CodeGen/c11atomics.c (+2-2)
  • (modified) clang/test/CodeGen/ext-int-cc.c (+62-62)
  • (modified) clang/test/CodeGen/isfpclass.c (+1-1)
  • (modified) clang/test/CodeGen/lanai-arguments.c (+2-2)
  • (modified) clang/test/CodeGen/mcu-struct-return.c (+2-2)
  • (modified) clang/test/CodeGen/mingw-long-double.c (+4-4)
  • (modified) clang/test/CodeGen/mips-vector-return.c (+3-3)
  • (modified) clang/test/CodeGen/mips-zero-sized-struct.c (+1-1)
  • (modified) clang/test/CodeGen/mips64-nontrivial-return.cpp (+1-1)
  • (modified) clang/test/CodeGen/mips64-padding-arg.c (+3-3)
  • (modified) clang/test/CodeGen/ms_abi.c (+2-2)
  • (modified) clang/test/CodeGen/paren-list-agg-init.cpp (+3-3)
  • (modified) clang/test/CodeGen/regcall2.c (+1-1)
  • (modified) clang/test/CodeGen/regparm-struct.c (+1-1)
  • (modified) clang/test/CodeGen/renderscript.c (+9-9)
  • (modified) clang/test/CodeGen/sparcv9-abi.c (+1-1)
  • (modified) clang/test/CodeGen/sret.c (+5-5)
  • (modified) clang/test/CodeGen/vectorcall.c (+2-2)
  • (modified) clang/test/CodeGen/windows-struct-abi.c (+1-1)
  • (modified) clang/test/CodeGen/windows-swiftcall.c (+2-2)
  • (modified) clang/test/CodeGenCXX/aix-alignment.cpp (+1-1)
  • (modified) clang/test/CodeGenCXX/arm-cc.cpp (+1-1)
  • (modified) clang/test/CodeGenCXX/arm-swiftcall.cpp (+1-1)
  • (modified) clang/test/CodeGenCXX/attr-musttail.cpp (+3-3)
  • (modified) clang/test/CodeGenCXX/call-with-static-chain.cpp (+2-2)
  • (modified) clang/test/CodeGenCXX/conditional-gnu-ext.cpp (+4-4)
  • (modified) clang/test/CodeGenCXX/cxx1z-copy-omission.cpp (+2-2)
  • (modified) clang/test/CodeGenCXX/cxx1z-lambda-star-this.cpp (+2-2)
  • (modified) clang/test/CodeGenCXX/exceptions.cpp (+3-3)
  • (modified) clang/test/CodeGenCXX/homogeneous-aggregates.cpp (+14-14)
  • (modified) clang/test/CodeGenCXX/lambda-expressions.cpp (+2-2)
  • (modified) clang/test/CodeGenCXX/matrix-casts.cpp (+2-2)
  • (modified) clang/test/CodeGenCXX/matrix-type-builtins.cpp (+4-4)
  • (modified) clang/test/CodeGenCXX/matrix-type.cpp (+1-1)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-byval-sret.cpp (+2-2)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-byval-thunks.cpp (+2-2)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-cdecl-method-sret.cpp (+4-4)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-eh-cleanups.cpp (+4-4)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-sret-and-byval.cpp (+44-44)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-unknown-arch.cpp (+1-1)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-vmemptr-conflicts.cpp (+1-1)
  • (modified) clang/test/CodeGenCXX/ms-thread_local.cpp (+2-2)
  • (modified) clang/test/CodeGenCXX/nrvo.cpp (+9-9)
  • (modified) clang/test/CodeGenCXX/pass-by-value-noalias.cpp (+2-2)
  • (modified) clang/test/CodeGenCXX/regcall.cpp (+4-4)
  • (modified) clang/test/CodeGenCXX/regcall4.cpp (+4-4)
  • (modified) clang/test/CodeGenCXX/stack-reuse-miscompile.cpp (+1-1)
  • (modified) clang/test/CodeGenCXX/stack-reuse.cpp (+1-1)
  • (modified) clang/test/CodeGenCXX/temporaries.cpp (+6-6)
  • (modified) clang/test/CodeGenCXX/thiscall-struct-return.cpp (+2-2)
  • (modified) clang/test/CodeGenCXX/thunk-returning-memptr.cpp (+2-2)
  • (modified) clang/test/CodeGenCXX/trivial_abi.cpp (+4-4)
  • (modified) clang/test/CodeGenCXX/unknown-anytype.cpp (+1-1)
  • (modified) clang/test/CodeGenCXX/wasm-args-returns.cpp (+9-9)
  • (modified) clang/test/CodeGenCXX/x86_32-arguments.cpp (+4-4)
  • (modified) clang/test/CodeGenCXX/x86_64-arguments.cpp (+2-2)
  • (modified) clang/test/CodeGenCoroutines/coro-await.cpp (+5-5)
  • (modified) clang/test/CodeGenCoroutines/coro-gro2.cpp (+5-5)
  • (modified) clang/test/CodeGenHLSL/sret_output.hlsl (+1-1)
  • (modified) clang/test/CodeGenObjC/arc.m (+2-2)
  • (modified) clang/test/CodeGenObjC/direct-method.m (+1-1)
  • (modified) clang/test/CodeGenObjC/nontrivial-c-struct-exception.m (+2-2)
  • (modified) clang/test/CodeGenObjC/objc-non-trivial-struct-nrvo.m (+3-3)
  • (modified) clang/test/CodeGenObjC/stret-1.m (+4-4)
  • (modified) clang/test/CodeGenObjC/stret_lookup.m (+2-2)
  • (modified) clang/test/CodeGenObjC/weak-in-c-struct.m (+1-1)
  • (modified) clang/test/CodeGenObjC/x86_64-struct-return-gc.m (+1-1)
  • (modified) clang/test/CodeGenObjCXX/objc-struct-cxx-abi.mm (+1-1)
  • (modified) clang/test/CodeGenOpenCL/addr-space-struct-arg.cl (+3-3)
  • (modified) clang/test/CodeGenOpenCL/amdgpu-abi-struct-arg-byref.cl (+2-2)
  • (modified) clang/test/CodeGenOpenCL/amdgpu-abi-struct-coerce.cl (+3-3)
  • (modified) clang/test/CodeGenOpenCLCXX/addrspace-of-this.clcpp (+2-2)
  • (modified) clang/test/Modules/templates.mm (+1-1)
  • (modified) clang/test/OpenMP/irbuilder_for_iterator.cpp (+1-1)
  • (modified) clang/test/OpenMP/irbuilder_for_rangefor.cpp (+3-3)
  • (modified) clang/test/OpenMP/master_taskloop_in_reduction_codegen.cpp (+2-2)
  • (modified) clang/test/OpenMP/master_taskloop_simd_in_reduction_codegen.cpp (+2-2)
  • (modified) clang/test/OpenMP/target_in_reduction_codegen.cpp (+2-2)
  • (modified) clang/test/OpenMP/task_in_reduction_codegen.cpp (+2-2)
  • (modified) clang/test/OpenMP/taskloop_in_reduction_codegen.cpp (+2-2)
  • (modified) clang/test/OpenMP/taskloop_simd_in_reduction_codegen.cpp (+2-2)
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 51a43b5f85b3cc..13677cf150aed2 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -2612,6 +2612,8 @@ void CodeGenModule::ConstructAttributeList(StringRef Name,
   if (IRFunctionArgs.hasSRetArg()) {
     llvm::AttrBuilder SRETAttrs(getLLVMContext());
     SRETAttrs.addStructRetAttr(getTypes().ConvertTypeForMem(RetTy));
+    SRETAttrs.addAttribute(llvm::Attribute::Writable);
+    SRETAttrs.addAttribute(llvm::Attribute::DeadOnUnwind);
     hasUsedSRet = true;
     if (RetAI.getInReg())
       SRETAttrs.addAttribute(llvm::Attribute::InReg);
diff --git a/clang/test/CodeGen/2006-05-19-SingleEltReturn.c b/clang/test/CodeGen/2006-05-19-SingleEltReturn.c
index 16eacf3ec16258..b542242606cc38 100644
--- a/clang/test/CodeGen/2006-05-19-SingleEltReturn.c
+++ b/clang/test/CodeGen/2006-05-19-SingleEltReturn.c
@@ -24,7 +24,7 @@ struct Y bar(void) {
 
 
 // X86_32: define{{.*}} void @foo(ptr noundef %P)
-// X86_32:   call void @bar(ptr sret(%struct.Y) align 4 %{{[^),]*}})
+// X86_32:   call void @bar(ptr dead_on_unwind writable sret(%struct.Y) align 4 %{{[^),]*}})
 
-// X86_32: define{{.*}} void @bar(ptr noalias sret(%struct.Y) align 4 %{{[^,)]*}})
+// X86_32: define{{.*}} void @bar(ptr dead_on_unwind noalias writable sret(%struct.Y) align 4 %{{[^,)]*}})
 // X86_32:   ret void
diff --git a/clang/test/CodeGen/64bit-swiftcall.c b/clang/test/CodeGen/64bit-swiftcall.c
index da6f18248c2af2..b1c42e3b0a657b 100644
--- a/clang/test/CodeGen/64bit-swiftcall.c
+++ b/clang/test/CodeGen/64bit-swiftcall.c
@@ -30,7 +30,7 @@ SWIFTCALL int indirect_result_2(OUT int *arg0, OUT float *arg1) {  __builtin_unr
 
 typedef struct { char array[1024]; } struct_reallybig;
 SWIFTCALL struct_reallybig indirect_result_3(OUT int *arg0, OUT float *arg1) { __builtin_unreachable(); }
-// CHECK-LABEL: define {{.*}} void @indirect_result_3(ptr noalias sret(%struct.struct_reallybig) {{.*}}, ptr noalias align 4 dereferenceable(4){{.*}}, ptr noalias align 4 dereferenceable(4){{.*}})
+// CHECK-LABEL: define {{.*}} void @indirect_result_3(ptr dead_on_unwind noalias writable sret(%struct.struct_reallybig) {{.*}}, ptr noalias align 4 dereferenceable(4){{.*}}, ptr noalias align 4 dereferenceable(4){{.*}})
 
 SWIFTCALL void context_1(CONTEXT void *self) {}
 // CHECK-LABEL: define {{.*}} void @context_1(ptr swiftself
@@ -238,7 +238,7 @@ typedef struct {
 } struct_big_1;
 TEST(struct_big_1)
 
-// CHECK-LABEL: define {{.*}} void @return_struct_big_1({{.*}} noalias sret
+// CHECK-LABEL: define {{.*}} void @return_struct_big_1(ptr dead_on_unwind noalias writable sret
 
 // Should not be byval.
 // CHECK-LABEL: define {{.*}} void @take_struct_big_1(ptr{{( %.*)?}})
@@ -522,7 +522,7 @@ typedef struct {
   double d4;
 } struct_d5;
 TEST(struct_d5)
-// CHECK: define{{.*}} swiftcc void @return_struct_d5(ptr noalias sret([[STRUCT5:.+]])
+// CHECK: define{{.*}} swiftcc void @return_struct_d5(ptr dead_on_unwind noalias writable sret([[STRUCT5:.+]])
 // CHECK: define{{.*}} swiftcc void @take_struct_d5(ptr
 
 typedef struct {
@@ -709,7 +709,7 @@ typedef struct {
   long long l4;
 } struct_l5;
 TEST(struct_l5)
-// CHECK: define{{.*}} swiftcc void @return_struct_l5(ptr noalias sret([[STRUCT5:.+]])
+// CHECK: define{{.*}} swiftcc void @return_struct_l5(ptr dead_on_unwind noalias writable sret([[STRUCT5:.+]])
 // CHECK: define{{.*}} swiftcc void @take_struct_l5(ptr
 
 typedef struct {
@@ -754,7 +754,7 @@ typedef struct {
   char16 c4;
 } struct_vc5;
 TEST(struct_vc5)
-// CHECK: define{{.*}} swiftcc void @return_struct_vc5(ptr noalias sret([[STRUCT:.+]])
+// CHECK: define{{.*}} swiftcc void @return_struct_vc5(ptr dead_on_unwind noalias writable sret([[STRUCT:.+]])
 // CHECK: define{{.*}} swiftcc void @take_struct_vc5(ptr
 
 typedef struct {
@@ -799,7 +799,7 @@ typedef struct {
   short8 c4;
 } struct_vs5;
 TEST(struct_vs5)
-// CHECK: define{{.*}} swiftcc void @return_struct_vs5(ptr noalias sret([[STRUCT:.+]])
+// CHECK: define{{.*}} swiftcc void @return_struct_vs5(ptr dead_on_unwind noalias writable sret([[STRUCT:.+]])
 // CHECK: define{{.*}} swiftcc void @take_struct_vs5(ptr
 
 typedef struct {
@@ -844,7 +844,7 @@ typedef struct {
   int4 c4;
 } struct_vi5;
 TEST(struct_vi5)
-// CHECK: define{{.*}} swiftcc void @return_struct_vi5(ptr noalias sret([[STRUCT:.+]])
+// CHECK: define{{.*}} swiftcc void @return_struct_vi5(ptr dead_on_unwind noalias writable sret([[STRUCT:.+]])
 // CHECK: define{{.*}} swiftcc void @take_struct_vi5(ptr
 
 typedef struct {
@@ -872,7 +872,7 @@ typedef struct {
   long2 c4;
 } struct_vl5;
 TEST(struct_vl5)
-// CHECK: define{{.*}} swiftcc void @return_struct_vl5(ptr noalias sret([[STRUCT:.+]])
+// CHECK: define{{.*}} swiftcc void @return_struct_vl5(ptr dead_on_unwind noalias writable sret([[STRUCT:.+]])
 // CHECK: define{{.*}} swiftcc void @take_struct_vl5(ptr
 
 typedef struct {
@@ -900,7 +900,7 @@ typedef struct {
   double2 c4;
 } struct_vd5;
 TEST(struct_vd5)
-// CHECK: define{{.*}} swiftcc void @return_struct_vd5(ptr noalias sret([[STRUCT:.+]])
+// CHECK: define{{.*}} swiftcc void @return_struct_vd5(ptr dead_on_unwind noalias writable sret([[STRUCT:.+]])
 // CHECK: define{{.*}} swiftcc void @take_struct_vd5(ptr
 
 typedef struct {
@@ -924,7 +924,7 @@ typedef struct {
   double4 c2;
 } struct_vd43;
 TEST(struct_vd43)
-// CHECK: define{{.*}} swiftcc void @return_struct_vd43(ptr noalias sret([[STRUCT:.+]])
+// CHECK: define{{.*}} swiftcc void @return_struct_vd43(ptr dead_on_unwind noalias writable sret([[STRUCT:.+]])
 // CHECK: define{{.*}} swiftcc void @take_struct_vd43(ptr
 
 typedef struct {
@@ -960,7 +960,7 @@ typedef struct {
   float4 c4;
 } struct_vf5;
 TEST(struct_vf5)
-// CHECK: define{{.*}} swiftcc void @return_struct_vf5(ptr noalias sret([[STRUCT:.+]])
+// CHECK: define{{.*}} swiftcc void @return_struct_vf5(ptr dead_on_unwind noalias writable sret([[STRUCT:.+]])
 // CHECK: define{{.*}} swiftcc void @take_struct_vf5(ptr
 
 typedef struct {
diff --git a/clang/test/CodeGen/CSKY/csky-abi.c b/clang/test/CodeGen/CSKY/csky-abi.c
index a24d4d8d64077d..2e549376ba9330 100644
--- a/clang/test/CodeGen/CSKY/csky-abi.c
+++ b/clang/test/CodeGen/CSKY/csky-abi.c
@@ -117,7 +117,7 @@ void f_agg_large(struct large x) {
 
 // The address where the struct should be written to will be the first
 // argument
-// CHECK-LABEL: define{{.*}} void @f_agg_large_ret(ptr noalias sret(%struct.large) align 4 %agg.result, i32 noundef %i, i8 noundef signext %j)
+// CHECK-LABEL: define{{.*}} void @f_agg_large_ret(ptr dead_on_unwind noalias writable sret(%struct.large) align 4 %agg.result, i32 noundef %i, i8 noundef signext %j)
 struct large f_agg_large_ret(int32_t i, int8_t j) {
   return (struct large){1, 2, 3, 4};
 }
@@ -144,7 +144,7 @@ int f_scalar_stack_1(struct tiny a, struct small b, struct small_aligned c,
 // the presence of large return values that consume a register due to the need
 // to pass a pointer.
 
-// CHECK-LABEL: define{{.*}} void @f_scalar_stack_2(ptr noalias sret(%struct.large) align 4 %agg.result, i32 noundef %a, i64 noundef %b, i64 noundef %c, double noundef %d, i8 noundef zeroext %e, i8 noundef signext %f, i8 noundef zeroext %g)
+// CHECK-LABEL: define{{.*}} void @f_scalar_stack_2(ptr dead_on_unwind noalias writable sret(%struct.large) align 4 %agg.result, i32 noundef %a, i64 noundef %b, i64 noundef %c, double noundef %d, i8 noundef zeroext %e, i8 noundef signext %f, i8 noundef zeroext %g)
 struct large f_scalar_stack_2(int32_t a, int64_t b, int64_t c, long double d,
                               uint8_t e, int8_t f, uint8_t g) {
   return (struct large){a, e, f, g};
diff --git a/clang/test/CodeGen/CSKY/csky-hard-abi.c b/clang/test/CodeGen/CSKY/csky-hard-abi.c
index 2171da8091e2b4..0bc4a5a8808e51 100644
--- a/clang/test/CodeGen/CSKY/csky-hard-abi.c
+++ b/clang/test/CodeGen/CSKY/csky-hard-abi.c
@@ -72,7 +72,7 @@ struct double_float_s {
 // CHECK: define{{.*}} void @f_double_double_s_arg([4 x i32] %a.coerce)
 void f_double_double_s_arg(struct double_double_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_double_double_s(ptr noalias sret(%struct.double_double_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_double_double_s(ptr dead_on_unwind noalias writable sret(%struct.double_double_s) align 4 %agg.result)
 struct double_double_s f_ret_double_double_s(void) {
   return (struct double_double_s){1.0, 2.0};
 }
@@ -80,7 +80,7 @@ struct double_double_s f_ret_double_double_s(void) {
 // CHECK: define{{.*}} void @f_double_float_s_arg([3 x i32] %a.coerce)
 void f_double_float_s_arg(struct double_float_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_double_float_s(ptr noalias sret(%struct.double_float_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_double_float_s(ptr dead_on_unwind noalias writable sret(%struct.double_float_s) align 4 %agg.result)
 struct double_float_s f_ret_double_float_s(void) {
   return (struct double_float_s){1.0, 2.0};
 }
@@ -118,7 +118,7 @@ struct double_int8_zbf_s {
 // CHECK: define{{.*}}  @f_double_int8_s_arg([3 x i32] %a.coerce)
 void f_double_int8_s_arg(struct double_int8_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_double_int8_s(ptr noalias sret(%struct.double_int8_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_double_int8_s(ptr dead_on_unwind noalias writable sret(%struct.double_int8_s) align 4 %agg.result)
 struct double_int8_s f_ret_double_int8_s(void) {
   return (struct double_int8_s){1.0, 2};
 }
@@ -126,7 +126,7 @@ struct double_int8_s f_ret_double_int8_s(void) {
 // CHECK: define{{.*}} void @f_double_uint8_s_arg([3 x i32] %a.coerce)
 void f_double_uint8_s_arg(struct double_uint8_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_double_uint8_s(ptr noalias sret(%struct.double_uint8_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_double_uint8_s(ptr dead_on_unwind noalias writable sret(%struct.double_uint8_s) align 4 %agg.result)
 struct double_uint8_s f_ret_double_uint8_s(void) {
   return (struct double_uint8_s){1.0, 2};
 }
@@ -134,7 +134,7 @@ struct double_uint8_s f_ret_double_uint8_s(void) {
 // CHECK: define{{.*}} void @f_double_int32_s_arg([3 x i32] %a.coerce)
 void f_double_int32_s_arg(struct double_int32_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_double_int32_s(ptr noalias sret(%struct.double_int32_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_double_int32_s(ptr dead_on_unwind noalias writable sret(%struct.double_int32_s) align 4 %agg.result)
 struct double_int32_s f_ret_double_int32_s(void) {
   return (struct double_int32_s){1.0, 2};
 }
@@ -142,7 +142,7 @@ struct double_int32_s f_ret_double_int32_s(void) {
 // CHECK: define{{.*}} void @f_double_int64_s_arg([4 x i32] %a.coerce)
 void f_double_int64_s_arg(struct double_int64_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_double_int64_s(ptr noalias sret(%struct.double_int64_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_double_int64_s(ptr dead_on_unwind noalias writable sret(%struct.double_int64_s) align 4 %agg.result)
 struct double_int64_s f_ret_double_int64_s(void) {
   return (struct double_int64_s){1.0, 2};
 }
@@ -150,7 +150,7 @@ struct double_int64_s f_ret_double_int64_s(void) {
 // CHECK: define{{.*}} void @f_double_int64bf_s_arg([3 x i32] %a.coerce)
 void f_double_int64bf_s_arg(struct double_int64bf_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_double_int64bf_s(ptr noalias sret(%struct.double_int64bf_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_double_int64bf_s(ptr dead_on_unwind noalias writable sret(%struct.double_int64bf_s) align 4 %agg.result)
 struct double_int64bf_s f_ret_double_int64bf_s(void) {
   return (struct double_int64bf_s){1.0, 2};
 }
@@ -158,7 +158,7 @@ struct double_int64bf_s f_ret_double_int64bf_s(void) {
 // CHECK: define{{.*}} void @f_double_int8_zbf_s([3 x i32] %a.coerce)
 void f_double_int8_zbf_s(struct double_int8_zbf_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_double_int8_zbf_s(ptr noalias sret(%struct.double_int8_zbf_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_double_int8_zbf_s(ptr dead_on_unwind noalias writable sret(%struct.double_int8_zbf_s) align 4 %agg.result)
 struct double_int8_zbf_s f_ret_double_int8_zbf_s(void) {
   return (struct double_int8_zbf_s){1.0, 2};
 }
@@ -179,7 +179,7 @@ void f_struct_double_int8_insufficient_fprs(float a, double b, double c, double
 // CHECK: define{{.*}} void @f_doublecomplex(double noundef %a.coerce0, double noundef %a.coerce1)
 void f_doublecomplex(double __complex__ a) {}
 
-// CHECK: define{{.*}} void @f_ret_doublecomplex(ptr noalias sret({ double, double }) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_doublecomplex(ptr dead_on_unwind noalias writable sret({ double, double }) align 4 %agg.result)
 double __complex__ f_ret_doublecomplex(void) {
   return 1.0;
 }
@@ -191,7 +191,7 @@ struct doublecomplex_s {
 // CHECK: define{{.*}} void @f_doublecomplex_s_arg([4 x i32] %a.coerce)
 void f_doublecomplex_s_arg(struct doublecomplex_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_doublecomplex_s(ptr noalias sret(%struct.doublecomplex_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_doublecomplex_s(ptr dead_on_unwind noalias writable sret(%struct.doublecomplex_s) align 4 %agg.result)
 struct doublecomplex_s f_ret_doublecomplex_s(void) {
   return (struct doublecomplex_s){1.0};
 }
@@ -218,7 +218,7 @@ struct doublearr2_s {
 // CHECK: define{{.*}} void @f_doublearr2_s_arg([4 x i32] %a.coerce)
 void f_doublearr2_s_arg(struct doublearr2_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_doublearr2_s(ptr noalias sret(%struct.doublearr2_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_doublearr2_s(ptr dead_on_unwind noalias writable sret(%struct.doublearr2_s) align 4 %agg.result)
 struct doublearr2_s f_ret_doublearr2_s(void) {
   return (struct doublearr2_s){{1.0, 2.0}};
 }
@@ -232,7 +232,7 @@ struct doublearr2_tricky1_s {
 // CHECK: define{{.*}} void @f_doublearr2_tricky1_s_arg([4 x i32] %a.coerce)
 void f_doublearr2_tricky1_s_arg(struct doublearr2_tricky1_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_doublearr2_tricky1_s(ptr noalias sret(%struct.doublearr2_tricky1_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_doublearr2_tricky1_s(ptr dead_on_unwind noalias writable sret(%struct.doublearr2_tricky1_s) align 4 %agg.result)
 struct doublearr2_tricky1_s f_ret_doublearr2_tricky1_s(void) {
   return (struct doublearr2_tricky1_s){{{{1.0}}, {{2.0}}}};
 }
@@ -247,7 +247,7 @@ struct doublearr2_tricky2_s {
 // CHECK: define{{.*}} void @f_doublearr2_tricky2_s_arg([4 x i32] %a.coerce)
 void f_doublearr2_tricky2_s_arg(struct doublearr2_tricky2_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_doublearr2_tricky2_s(ptr noalias sret(%struct.doublearr2_tricky2_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_doublearr2_tricky2_s(ptr dead_on_unwind noalias writable sret(%struct.doublearr2_tricky2_s) align 4 %agg.result)
 struct doublearr2_tricky2_s f_ret_doublearr2_tricky2_s(void) {
   return (struct doublearr2_tricky2_s){{}, {{{1.0}}, {{2.0}}}};
 }
@@ -262,7 +262,7 @@ struct doublearr2_tricky3_s {
 // CHECK: define{{.*}} void @f_doublearr2_tricky3_s_arg([4 x i32] %a.coerce)
 void f_doublearr2_tricky3_s_arg(struct doublearr2_tricky3_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_doublearr2_tricky3_s(ptr noalias sret(%struct.doublearr2_tricky3_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_doublearr2_tricky3_s(ptr dead_on_unwind noalias writable sret(%struct.doublearr2_tricky3_s) align 4 %agg.result)
 struct doublearr2_tricky3_s f_ret_doublearr2_tricky3_s(void) {
   return (struct doublearr2_tricky3_s){{}, {{{1.0}}, {{2.0}}}};
 }
@@ -278,7 +278,7 @@ struct doublearr2_tricky4_s {
 // CHECK: define{{.*}} void @f_doublearr2_tricky4_s_arg([4 x i32] %a.coerce)
 void f_doublearr2_tricky4_s_arg(struct doublearr2_tricky4_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_doublearr2_tricky4_s(ptr noalias sret(%struct.doublearr2_tricky4_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_doublearr2_tricky4_s(ptr dead_on_unwind noalias writable sret(%struct.doublearr2_tricky4_s) align 4 %agg.result)
 struct doublearr2_tricky4_s f_ret_doublearr2_tricky4_s(void) {
   return (struct doublearr2_tricky4_s){{}, {{{}, {1.0}}, {{}, {2.0}}}};
 }
@@ -292,7 +292,7 @@ struct int_double_int_s {
 // CHECK: define{{.*}} void @f_int_double_int_s_arg([4 x i32] %a.coerce)
 void f_int_double_int_s_arg(struct int_double_int_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_int_double_int_s(ptr noalias sret(%struct.int_double_int_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_int_double_int_s(ptr dead_on_unwind noalias writable sret(%struct.int_double_int_s) align 4 %agg.result)
 struct int_double_int_s f_ret_int_double_int_s(void) {
   return (struct int_double_int_s){1, 2.0, 3};
 }
@@ -305,7 +305,7 @@ struct int64_double_s {
 // CHECK: define{{.*}} void @f_int64_double_s_arg([4 x i32] %a.coerce)
 void f_int64_double_s_arg(struct int64_double_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_int64_double_s(ptr noalias sret(%struct.int64_double_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_int64_double_s(ptr dead_on_unwind noalias writable sret(%struct.int64_double_s) align 4 %agg.result)
 struct int64_double_s f_ret_int64_double_s(void) {
   return (struct int64_double_s){1, 2.0};
 }
@@ -319,7 +319,7 @@ struct char_char_double_s {
 // CHECK-LABEL: define{{.*}} void @f_char_char_double_s_arg([3 x i32] %a.coerce)
 void f_char_char_double_s_arg(struct char_char_double_s a) {}
 
-// CHECK: define{{.*}} void @f_ret_char_char_double_s(ptr noalias sret(%struct.char_char_double_s) align 4 %agg.result)
+// CHECK: define{{.*}} void @f_ret_char_char_double_s(ptr dead_on_unwind noalias writable sret(%struct.char_char_double_s) align 4 %agg.result)
 struct char_char_double_s f_ret_char_char_double_s(void) {
   return (struct char_char_double_s){1, 2, 3.0};
 }
@@ -338,19 +338,19 @@ union double_u f_ret_double_u(void) {
   return (union double_u){1.0};
 }
 
-// CHECK: define{{.*}} void @f_ret_double_int32_s_double_int32_s_just_sufficient_gprs(ptr noalias sret(%struct.double_int32_s) align 4 %agg.result, i32 noundef %a, i32 noundef %b, i32 noundef %c, i32 noundef %d, i32 noundef %e, i32 noundef %f, i32 noundef %g, [3 x i32] %h.coerce)
+// CHECK: define{{.*}} void @f_ret_double_int32_s_double_int32_s_just_sufficient_gprs(ptr dead_on_unwind noalias writable sret(%struct.double_int32_s) align 4 %agg.result, i32 noundef %a, i32 noundef %b, i32 noundef %c, i32 noundef %d, i32 noundef %e, i32 noundef %f, i32 noundef %g, [3 x i32] %h.coerce)
 struct double_int32_s f_ret_double_int32_s_double_int32_s_just_sufficient_gprs(
     int a, int b, int c, int d, int e, int f, int g, struct double_int32_s h) {
   return (struct double_int32_s){1.0, 2};
 }
 
-// CHECK: define{{.*}} void @f_ret_double_double_s_double_int32_s_just_sufficient_gprs(ptr noalias sret(%struct.double_double_s) align 4 %agg.result, i32 noundef %a, i32 noundef %b, i32 noundef %c, i32 noundef %d, i32 noundef %e, i32 noundef %f, i32 noundef %g, [3 x i32] %h.coerce)
+// CHECK: define{{.*}} void @f_ret_double_double_s_double_int32_s_just_sufficient_gprs(ptr dead_on_unwind noalias writable sret(%struct.double_double_s) align 4 %agg.result, i32 noundef %a, i32 noundef %b, i32 noundef %c, i32 noundef %d, i32 noundef %e, i32 noundef %f, i32 noundef %g, [3 x i32] %h.coerce)
 struct double_double_s f_ret_double_double_s_double_int32_s_just_sufficient_gprs(
     int a, int b, int c, int d, int e, int f, int g, struct double_int32_s h) {
   return (struct double_double_s){1.0, 2.0};
 }
 
-// CHECK: define{{.*}} void @f_ret_doublecomplex_double_int32_s_just_sufficient_gprs(ptr noalias sret({ double, double }) align 4 %agg.result, i32 noundef %a, i32 noundef %b, i32 noundef %c, i32 noundef %d, i32 noundef %e, i32 noundef %f, i32 noundef %g, [3 x i32] %h.coerce)
+// CHECK: define{{.*}} void @f_ret_doublecomplex_double_int32_s_just_sufficient_gprs(ptr dead_on_unwind noalias writable sret({ double, double }) align 4 %agg.result, i32 noundef %a, i32 noundef %b, i32 noundef %c, i32 noundef %d, i32 noundef %e, i32 noundef %f, i32 noundef %g, [3 x i32] %h.coerce)
 double __complex__ f_ret_doublecomplex_double_int32_s_just_sufficient_gprs(
     int a, int b, int c, int d, int e, int f, int g, struct double_int32_s h) {
   return 1.0;
@@ -376,7 +376,7 @@ struct large {
 // the presence of large return values that consume a register due to the need...
[truncated]

nikic (Contributor, Author) commented Jan 10, 2024

Some IR for reference: https://clang.godbolt.org/z/qEsP7vozW

I believe that on unwind, the sret temporary is either entirely unused (if no cleanup landingpad is necessary) or we will call lifetime.end on it (which is legal for dead_on_unwind). This should be independent of whether copy elision is performed or not.
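
The source-level side of this argument can be sketched in C++. This is a minimal illustration with hypothetical names (`Big`, `mayThrow`, `firstValueOrDefault`), not code from the linked IR: when the callee throws partway through building its result, the object being initialized never comes into scope in the handler, so no path reads the return slot after the unwind.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical aggregate that is typically returned through an sret pointer.
struct Big {
  int values[64];
};

// Hypothetical callee: starts filling the return value, then may throw.
Big mayThrow(bool fail) {
  Big result{};
  result.values[0] = 42;
  if (fail)
    throw std::runtime_error("failed midway");
  return result;
}

// Returns the first element on success, or -1 if the call unwound.
int firstValueOrDefault(bool fail) {
  try {
    Big b = mayThrow(fail);
    return b.values[0];
  } catch (const std::runtime_error &) {
    // `b` is not in scope here, so the partially written return slot
    // cannot be read after the unwind.
    return -1;
  }
}
```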

rjmccall (Contributor) commented

If I understand the specification of these attributes correctly, they seem fine. dead_on_unwind is definitely fine — the return value must be treated as uninitialized after a call that throws. writable has a somewhat loose specification that scares me a bit, but as long as the actual analysis is being properly conservative about possible aliases created during the call, I think it should be fine.
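
One way an alias to the return slot can be created during the call is a constructor that publishes `this`. The sketch below is an illustration under assumptions, not something taken from the patch or its tests: `Tracked`, `lastConstructed`, `makeTracked`, and `returnSlotEscapes` are hypothetical, and whether the object is returned through a hidden pointer depends on the target ABI. With copy elision (guaranteed for this pattern since C++17), the pointer stored by the constructor refers to the caller's object, so later stores through it must be treated as stores to the returned value.

```cpp
#include <cassert>

struct Tracked;
// Hypothetical global registry holding the most recently constructed object.
Tracked *lastConstructed = nullptr;

struct Tracked {
  int payload = 0;
  Tracked() { lastConstructed = this; }
  // User-provided copy constructor: makes the type non-trivial for calls, so
  // common ABIs return it through a hidden pointer (an assumption about the
  // target, not a language guarantee).
  Tracked(const Tracked &other) : payload(other.payload) {
    lastConstructed = this;
  }
};

Tracked makeTracked() {
  return Tracked(); // with copy elision, built directly in the caller's storage
}

bool returnSlotEscapes() {
  Tracked t = makeTracked();
  lastConstructed->payload = 7; // store through the alias created during the call
  return lastConstructed == &t && t.payload == 7;
}
```

The copy constructor also records `this`, so the registry names the live object even on a compiler that does not elide the copy.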

nikic merged commit 158d72d into llvm:main on Jan 11, 2024
14 checks passed