Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dont alter cold function alignment unless using Os #72387

Closed
wants to merge 2 commits into from

Conversation

FlameTop
Copy link
Contributor

This PR alters the behaviour of function alignment for functions marked as 'cold'. Currently functions marked with a ‘cold’ attribute are also set as optimize by size. Optimize by size alters the function alignment from default. This interferes with code replacement features on our targets. This PR allows cold functions to maintain their default alignment except when optimize by size is explicitly used.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:X86 clang:codegen llvm:ir labels Nov 15, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Nov 15, 2023

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-x86

Author: Phil Camp (FlameTop)

Changes

This PR alters the behaviour of function alignment for functions marked as 'cold'. Currently functions marked with a ‘cold’ attribute are also set as optimize by size. Optimize by size alters the function alignment from default. This interferes with code replacement features on our targets. This PR allows cold functions to maintain their default alignment except when optimize by size is explicitly used.


Full diff: https://github.com/llvm/llvm-project/pull/72387.diff

5 Files Affected:

  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+3)
  • (added) clang/test/CodeGen/cold-align.cpp (+32)
  • (modified) llvm/include/llvm/IR/Attributes.td (+1)
  • (modified) llvm/lib/CodeGen/MachineFunction.cpp (+2-1)
  • (added) llvm/test/CodeGen/X86/cold-align.ll (+52)
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index f1b900be74b2cdf..450729937740934 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2410,6 +2410,9 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
       if (!ShouldAddOptNone)
         B.addAttribute(llvm::Attribute::OptimizeForSize);
       B.addAttribute(llvm::Attribute::Cold);
+      // dont alter alignment if not optimizing for size
+      if (!CodeGenOpts.OptimizeSize)
+        B.addAttribute("keepalign", "true");
     }
     if (D->hasAttr<HotAttr>())
       B.addAttribute(llvm::Attribute::Hot);
diff --git a/clang/test/CodeGen/cold-align.cpp b/clang/test/CodeGen/cold-align.cpp
new file mode 100644
index 000000000000000..93564d181d65d96
--- /dev/null
+++ b/clang/test/CodeGen/cold-align.cpp
@@ -0,0 +1,32 @@
+// Dont alter function alignment if marked cold
+//
+// Cold attribute marks functions as also optimize for size. This normally collapses the
+// default function alignment. This can interfere with edit&continue effectiveness.
+//
+// RUN:     %clang -O2 -S -emit-llvm %s -o - | FileCheck %s -check-prefixes TWO
+// RUN:     %clang -Os -S -emit-llvm %s -o - | FileCheck %s -check-prefixes SIZE
+
+class Dismissed
+{
+public:
+  __attribute__((cold)) void Chilly();
+                        void Temparate();
+  __attribute__((hot))  void Sizzle();
+};
+void Dismissed::Chilly(){};
+void Dismissed::Temparate(){};
+void Dismissed::Sizzle(){};
+
+// TWO: attributes #0 = {
+// TWO: "keepalign"="true"
+// TWO: attributes #1 = {
+// TWO-NOT: "keepalign"="true"
+// TWO: attributes #2 = {
+// TWO-NOT: "keepalign"="true"
+
+// SIZE: attributes #0 = {
+// SIZE-NOT: "keepalign"="true"
+// SIZE: attributes #1 = {
+// SIZE-NOT: "keepalign"="true"
+// SIZE: attributes #2 = {
+// SIZE-NOT: "keepalign"="true"
\ No newline at end of file
diff --git a/llvm/include/llvm/IR/Attributes.td b/llvm/include/llvm/IR/Attributes.td
index fc38e68ad273b6b..30844f566786c37 100644
--- a/llvm/include/llvm/IR/Attributes.td
+++ b/llvm/include/llvm/IR/Attributes.td
@@ -332,6 +332,7 @@ def NoJumpTables : StrBoolAttr<"no-jump-tables">;
 def NoInlineLineTables : StrBoolAttr<"no-inline-line-tables">;
 def ProfileSampleAccurate : StrBoolAttr<"profile-sample-accurate">;
 def UseSampleProfile : StrBoolAttr<"use-sample-profile">;
+def KeepAlign: StrBoolAttr<"keepalign">;
 
 def DenormalFPMath : ComplexStrAttr<"denormal-fp-math", [FnAttr]>;
 def DenormalFPMathF32 : ComplexStrAttr<"denormal-fp-math-f32", [FnAttr]>;
diff --git a/llvm/lib/CodeGen/MachineFunction.cpp b/llvm/lib/CodeGen/MachineFunction.cpp
index 07eb0ba7f45c2e3..ab5392a05fc019b 100644
--- a/llvm/lib/CodeGen/MachineFunction.cpp
+++ b/llvm/lib/CodeGen/MachineFunction.cpp
@@ -214,7 +214,8 @@ void MachineFunction::init() {
 
   // FIXME: Shouldn't use pref alignment if explicit alignment is set on F.
   // FIXME: Use Function::hasOptSize().
-  if (!F.hasFnAttribute(Attribute::OptimizeForSize))
+  if ((!F.hasFnAttribute(Attribute::OptimizeForSize)) ||
+     (F.getFnAttribute("keepalign").getValueAsBool()))
     Alignment = std::max(Alignment,
                          STI->getTargetLowering()->getPrefFunctionAlignment());
 
diff --git a/llvm/test/CodeGen/X86/cold-align.ll b/llvm/test/CodeGen/X86/cold-align.ll
new file mode 100644
index 000000000000000..ab1f393dfa561ab
--- /dev/null
+++ b/llvm/test/CodeGen/X86/cold-align.ll
@@ -0,0 +1,52 @@
+;  Dont alter function alignment if marked cold
+;
+;  Cold attribute marks functions as also optimize for size. This normally collapses the
+;  default function alignment. This can interfere with edit&continue effectiveness.
+;
+;
+;  RUN:     llc -O2 <%s | FileCheck %s -check-prefixes TWO
+;
+;  TWO: .globl _ZN9Dismissed6ChillyEv
+;  TWO-NEXT: .p2align 4, 0x90
+;  TWO: .globl _ZN9Dismissed9TemparateEv
+;  TWO-NEXT: .p2align 4, 0x90
+;  TWO: .globl _ZN9Dismissed6SizzleEv
+;  TWO-NEXT: .p2align 4, 0x90
+
+; ModuleID = 'cold-align.cpp'
+source_filename = "cold-align.cpp"
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-sie-ps5"
+
+; Function Attrs: cold mustprogress nofree norecurse nosync nounwind optsize sspstrong willreturn memory(none) uwtable
+define hidden void @_ZN9Dismissed6ChillyEv(ptr nocapture noundef nonnull readnone align 1 dereferenceable(1) %this) local_unnamed_addr #0 align 2 {
+entry:
+  ret void
+}
+
+; Function Attrs: mustprogress nofree norecurse nosync nounwind sspstrong willreturn memory(none) uwtable
+define hidden void @_ZN9Dismissed9TemparateEv(ptr nocapture noundef nonnull readnone align 1 dereferenceable(1) %this) local_unnamed_addr #1 align 2 {
+entry:
+  ret void
+}
+
+; Function Attrs: hot mustprogress nofree norecurse nosync nounwind sspstrong willreturn memory(none) uwtable
+define hidden void @_ZN9Dismissed6SizzleEv(ptr nocapture noundef nonnull readnone align 1 dereferenceable(1) %this) local_unnamed_addr #2 align 2 {
+entry:
+  ret void
+}
+
+attributes #0 = { cold mustprogress nofree norecurse nosync nounwind optsize sspstrong willreturn memory(none) uwtable "denormal-fp-math"="preserve-sign,preserve-sign" "frame-pointer"="non-leaf" "keepalign"="true" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="znver2s" "target-features"="+adx,+aes,+avx,+avx2,+bmi,+bmi2,+clflushopt,+clwb,+clzero,+crc32,+cx16,+cx8,+f16c,+fma,+fsgsbase,+fxsr,+lwp,+lzcnt,+mmx,+movbe,+mwaitx,+pclmul,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves" }
+attributes #1 = { mustprogress nofree norecurse nosync nounwind sspstrong willreturn memory(none) uwtable "denormal-fp-math"="preserve-sign,preserve-sign" "frame-pointer"="non-leaf" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="znver2s" "target-features"="+adx,+aes,+avx,+avx2,+bmi,+bmi2,+clflushopt,+clwb,+clzero,+crc32,+cx16,+cx8,+f16c,+fma,+fsgsbase,+fxsr,+lwp,+lzcnt,+mmx,+movbe,+mwaitx,+pclmul,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves" }
+attributes #2 = { hot mustprogress nofree norecurse nosync nounwind sspstrong willreturn memory(none) uwtable "denormal-fp-math"="preserve-sign,preserve-sign" "frame-pointer"="non-leaf" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="znver2s" "target-features"="+adx,+aes,+avx,+avx2,+bmi,+bmi2,+clflushopt,+clwb,+clzero,+crc32,+cx16,+cx8,+f16c,+fma,+fsgsbase,+fxsr,+lwp,+lzcnt,+mmx,+movbe,+mwaitx,+pclmul,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves" }
+
+!llvm.module.flags = !{!0, !1, !2, !3, !4, !5}
+!llvm.ident = !{!6}
+
+!0 = !{i32 1, !"wchar_size", i32 2}
+!1 = !{i32 1, !"SIE:STLVersion1", i32 1}
+!2 = !{i32 8, !"PIC Level", i32 2}
+!3 = !{i32 7, !"uwtable", i32 2}
+!4 = !{i32 7, !"frame-pointer", i32 1}
+!5 = !{i32 1, !"MaxTLSAlign", i32 256}
+!6 = !{!"clang version 18.0.0"}

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 15, 2023

@llvm/pr-subscribers-clang

Author: Phil Camp (FlameTop)

Changes

This PR alters the behaviour of function alignment for functions marked as 'cold'. Currently functions marked with a ‘cold’ attribute are also set as optimize by size. Optimize by size alters the function alignment from default. This interferes with code replacement features on our targets. This PR allows cold functions to maintain their default alignment except when optimize by size is explicitly used.


Full diff: https://github.com/llvm/llvm-project/pull/72387.diff

5 Files Affected:

  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+3)
  • (added) clang/test/CodeGen/cold-align.cpp (+32)
  • (modified) llvm/include/llvm/IR/Attributes.td (+1)
  • (modified) llvm/lib/CodeGen/MachineFunction.cpp (+2-1)
  • (added) llvm/test/CodeGen/X86/cold-align.ll (+52)
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index f1b900be74b2cdf..450729937740934 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2410,6 +2410,9 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
       if (!ShouldAddOptNone)
         B.addAttribute(llvm::Attribute::OptimizeForSize);
       B.addAttribute(llvm::Attribute::Cold);
+      // dont alter alignment if not optimizing for size
+      if (!CodeGenOpts.OptimizeSize)
+        B.addAttribute("keepalign", "true");
     }
     if (D->hasAttr<HotAttr>())
       B.addAttribute(llvm::Attribute::Hot);
diff --git a/clang/test/CodeGen/cold-align.cpp b/clang/test/CodeGen/cold-align.cpp
new file mode 100644
index 000000000000000..93564d181d65d96
--- /dev/null
+++ b/clang/test/CodeGen/cold-align.cpp
@@ -0,0 +1,32 @@
+// Dont alter function alignment if marked cold
+//
+// Cold attribute marks functions as also optimize for size. This normally collapses the
+// default function alignment. This can interfere with edit&continue effectiveness.
+//
+// RUN:     %clang -O2 -S -emit-llvm %s -o - | FileCheck %s -check-prefixes TWO
+// RUN:     %clang -Os -S -emit-llvm %s -o - | FileCheck %s -check-prefixes SIZE
+
+class Dismissed
+{
+public:
+  __attribute__((cold)) void Chilly();
+                        void Temparate();
+  __attribute__((hot))  void Sizzle();
+};
+void Dismissed::Chilly(){};
+void Dismissed::Temparate(){};
+void Dismissed::Sizzle(){};
+
+// TWO: attributes #0 = {
+// TWO: "keepalign"="true"
+// TWO: attributes #1 = {
+// TWO-NOT: "keepalign"="true"
+// TWO: attributes #2 = {
+// TWO-NOT: "keepalign"="true"
+
+// SIZE: attributes #0 = {
+// SIZE-NOT: "keepalign"="true"
+// SIZE: attributes #1 = {
+// SIZE-NOT: "keepalign"="true"
+// SIZE: attributes #2 = {
+// SIZE-NOT: "keepalign"="true"
\ No newline at end of file
diff --git a/llvm/include/llvm/IR/Attributes.td b/llvm/include/llvm/IR/Attributes.td
index fc38e68ad273b6b..30844f566786c37 100644
--- a/llvm/include/llvm/IR/Attributes.td
+++ b/llvm/include/llvm/IR/Attributes.td
@@ -332,6 +332,7 @@ def NoJumpTables : StrBoolAttr<"no-jump-tables">;
 def NoInlineLineTables : StrBoolAttr<"no-inline-line-tables">;
 def ProfileSampleAccurate : StrBoolAttr<"profile-sample-accurate">;
 def UseSampleProfile : StrBoolAttr<"use-sample-profile">;
+def KeepAlign: StrBoolAttr<"keepalign">;
 
 def DenormalFPMath : ComplexStrAttr<"denormal-fp-math", [FnAttr]>;
 def DenormalFPMathF32 : ComplexStrAttr<"denormal-fp-math-f32", [FnAttr]>;
diff --git a/llvm/lib/CodeGen/MachineFunction.cpp b/llvm/lib/CodeGen/MachineFunction.cpp
index 07eb0ba7f45c2e3..ab5392a05fc019b 100644
--- a/llvm/lib/CodeGen/MachineFunction.cpp
+++ b/llvm/lib/CodeGen/MachineFunction.cpp
@@ -214,7 +214,8 @@ void MachineFunction::init() {
 
   // FIXME: Shouldn't use pref alignment if explicit alignment is set on F.
   // FIXME: Use Function::hasOptSize().
-  if (!F.hasFnAttribute(Attribute::OptimizeForSize))
+  if ((!F.hasFnAttribute(Attribute::OptimizeForSize)) ||
+     (F.getFnAttribute("keepalign").getValueAsBool()))
     Alignment = std::max(Alignment,
                          STI->getTargetLowering()->getPrefFunctionAlignment());
 
diff --git a/llvm/test/CodeGen/X86/cold-align.ll b/llvm/test/CodeGen/X86/cold-align.ll
new file mode 100644
index 000000000000000..ab1f393dfa561ab
--- /dev/null
+++ b/llvm/test/CodeGen/X86/cold-align.ll
@@ -0,0 +1,52 @@
+;  Dont alter function alignment if marked cold
+;
+;  Cold attribute marks functions as also optimize for size. This normally collapses the
+;  default function alignment. This can interfere with edit&continue effectiveness.
+;
+;
+;  RUN:     llc -O2 <%s | FileCheck %s -check-prefixes TWO
+;
+;  TWO: .globl _ZN9Dismissed6ChillyEv
+;  TWO-NEXT: .p2align 4, 0x90
+;  TWO: .globl _ZN9Dismissed9TemparateEv
+;  TWO-NEXT: .p2align 4, 0x90
+;  TWO: .globl _ZN9Dismissed6SizzleEv
+;  TWO-NEXT: .p2align 4, 0x90
+
+; ModuleID = 'cold-align.cpp'
+source_filename = "cold-align.cpp"
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-sie-ps5"
+
+; Function Attrs: cold mustprogress nofree norecurse nosync nounwind optsize sspstrong willreturn memory(none) uwtable
+define hidden void @_ZN9Dismissed6ChillyEv(ptr nocapture noundef nonnull readnone align 1 dereferenceable(1) %this) local_unnamed_addr #0 align 2 {
+entry:
+  ret void
+}
+
+; Function Attrs: mustprogress nofree norecurse nosync nounwind sspstrong willreturn memory(none) uwtable
+define hidden void @_ZN9Dismissed9TemparateEv(ptr nocapture noundef nonnull readnone align 1 dereferenceable(1) %this) local_unnamed_addr #1 align 2 {
+entry:
+  ret void
+}
+
+; Function Attrs: hot mustprogress nofree norecurse nosync nounwind sspstrong willreturn memory(none) uwtable
+define hidden void @_ZN9Dismissed6SizzleEv(ptr nocapture noundef nonnull readnone align 1 dereferenceable(1) %this) local_unnamed_addr #2 align 2 {
+entry:
+  ret void
+}
+
+attributes #0 = { cold mustprogress nofree norecurse nosync nounwind optsize sspstrong willreturn memory(none) uwtable "denormal-fp-math"="preserve-sign,preserve-sign" "frame-pointer"="non-leaf" "keepalign"="true" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="znver2s" "target-features"="+adx,+aes,+avx,+avx2,+bmi,+bmi2,+clflushopt,+clwb,+clzero,+crc32,+cx16,+cx8,+f16c,+fma,+fsgsbase,+fxsr,+lwp,+lzcnt,+mmx,+movbe,+mwaitx,+pclmul,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves" }
+attributes #1 = { mustprogress nofree norecurse nosync nounwind sspstrong willreturn memory(none) uwtable "denormal-fp-math"="preserve-sign,preserve-sign" "frame-pointer"="non-leaf" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="znver2s" "target-features"="+adx,+aes,+avx,+avx2,+bmi,+bmi2,+clflushopt,+clwb,+clzero,+crc32,+cx16,+cx8,+f16c,+fma,+fsgsbase,+fxsr,+lwp,+lzcnt,+mmx,+movbe,+mwaitx,+pclmul,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves" }
+attributes #2 = { hot mustprogress nofree norecurse nosync nounwind sspstrong willreturn memory(none) uwtable "denormal-fp-math"="preserve-sign,preserve-sign" "frame-pointer"="non-leaf" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="znver2s" "target-features"="+adx,+aes,+avx,+avx2,+bmi,+bmi2,+clflushopt,+clwb,+clzero,+crc32,+cx16,+cx8,+f16c,+fma,+fsgsbase,+fxsr,+lwp,+lzcnt,+mmx,+movbe,+mwaitx,+pclmul,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves" }
+
+!llvm.module.flags = !{!0, !1, !2, !3, !4, !5}
+!llvm.ident = !{!6}
+
+!0 = !{i32 1, !"wchar_size", i32 2}
+!1 = !{i32 1, !"SIE:STLVersion1", i32 1}
+!2 = !{i32 8, !"PIC Level", i32 2}
+!3 = !{i32 7, !"uwtable", i32 2}
+!4 = !{i32 7, !"frame-pointer", i32 1}
+!5 = !{i32 1, !"MaxTLSAlign", i32 256}
+!6 = !{!"clang version 18.0.0"}

Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 2060bfcdc7eb704647c64bf925cdceb94c1f535f effbfe02b71638fe49dfc09423663f9077c4cc41 -- clang/test/CodeGen/cold-align.cpp clang/lib/CodeGen/CodeGenModule.cpp llvm/lib/CodeGen/MachineFunction.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/CodeGen/MachineFunction.cpp b/llvm/lib/CodeGen/MachineFunction.cpp
index ab5392a05f..9303ca787e 100644
--- a/llvm/lib/CodeGen/MachineFunction.cpp
+++ b/llvm/lib/CodeGen/MachineFunction.cpp
@@ -215,7 +215,7 @@ void MachineFunction::init() {
   // FIXME: Shouldn't use pref alignment if explicit alignment is set on F.
   // FIXME: Use Function::hasOptSize().
   if ((!F.hasFnAttribute(Attribute::OptimizeForSize)) ||
-     (F.getFnAttribute("keepalign").getValueAsBool()))
+      (F.getFnAttribute("keepalign").getValueAsBool()))
     Alignment = std::max(Alignment,
                          STI->getTargetLowering()->getPrefFunctionAlignment());
 

@efriedma-quic
Copy link
Collaborator

This patch appears to do two things:

  • Add a "default align" attribute, which instructions the backend to use some unspecified "default" alignment for the function in question.
  • Tells the frontend to apply this attribute specifically to "cold" functions.

This is basically nonsense. We already have a way to set alignment: it's the "align" attribute. And there isn't any obvious reason to special-case "cold" functions.

This interferes with code replacement features on our targets

Please take a step back and consider what makes sense to express the constraints in general, not what narrowly solves the problem on the current version of your codebase.

If there's an ABI constraint that requires a particular function alignment, clang should be explicitly marking it. We already do this if you, for example, specify -falign-functions=16 on the clang command-line.

@pogo59
Copy link
Collaborator

pogo59 commented Nov 20, 2023

We already have a way to set alignment: it's the "align" attribute.

Is clang aware of the target's default function alignment? I'd thought not, in which case deferring to LLVM (as this new attribute does) is correct. Happy to be corrected on this point.

there isn't any obvious reason to special-case "cold" functions.

I'd thought that was clear. Cold implies opt-for-size, which implies align(1). We want cold not to imply align(1), directly or indirectly; but, if opt-for-size was requested explicitly, that should still imply align(1). The implied align(1) does interfere with a useful feature for PS4/PS5, and has for some time. If the tweak to the definition of "cold" should be specific to our targets, we can certainly do that.

@efriedma-quic
Copy link
Collaborator

efriedma-quic commented Nov 20, 2023

Is clang aware of the target's default function alignment?

No? On x86, the "default" is set by the line setPrefFunctionAlignment(Align(16));. Which, as the name suggests, is supposed to be a preference, not a mandatory minimum. And it's subject to change in the future if we find that a different number is better on some target CPUs, or for certain functions.

there isn't any obvious reason to special-case "cold" functions.

I'd thought that was clear. Cold implies opt-for-size, which implies align(1)

If there's a relevant ABI rule here, we want to mark it on all functions. So if, for example, some optimization decides a function is cold, we get the same result.

If the tweak to the definition of "cold" should be specific to our targets, we can certainly do that.

If there's some relevant ABI rule, clang should compute the required alignment explicitly, and explicitly mark functions with the required alignment. If that's target-specific, that's fine. Ideally, it should be controllable by a flag separate from the optimization level.

@pogo59
Copy link
Collaborator

pogo59 commented Nov 20, 2023

I guess I'm a little confused still. Are you claiming that cold has a specified ABI effect? That if some pass decides a function is cold, that decision (correctly) affects its ABI? I'm not seeing any documentation to that effect, either on the clang attribute or the IR attribute. My admittedly aging copy of the psABI doesn't mention function alignment at all, either.

We're not inventing an ABI rule that cold functions are special and must have a specific alignment; we just want cold functions to have the same alignment as non-cold functions. (And if you specify -Os then that alignment changes to 1, for cold and non-cold functions alike.)

The only thing that makes cold functions special in this implementation is that, as an implementation convenience, clang's cold attribute also sets optsize which has an effect that we are finding inconvenient. We just want cold not to have that particular effect. We played around with several different ways to make that happen, and this seemed the least intrusive.

I suppose another way to go at this would be for -Os or clang's minsize to add both optsize and align(1) to functions, and have LLVM's implementation of optsize stop implying align(1). This would make everything more orthogonal and disentangled, yes? And then clang's cold could set optsize without affecting alignment; if non-PS targets still wanted cold to mean align(1), we can set that as well. No new attributes, and that bit of code down in LLVM to compute alignment gets a wee bit cleaner. WDYT?

@efriedma-quic
Copy link
Collaborator

I'll try describing it a different way.

There are two kinds of alignment. "ABI alignment" is required for the code to function correctly (due to things like architecture constraints, reusing the low bit for member pointers, binary patching). "Preferred alignment" is additional alignment added by the backend to try to improve performance (because, for example, the CPU you're targeting fetches instructions in aligned 16-byte chunks).

On x86, the "ABI alignment" is 1, unless a function is explicitly marked with an "align" attribute, in which case the "ABI alignment" is the marked alignment.

For the sake of performance, the backend increases the alignment of most functions. But that's not contractual; it's just an implementation detail. If your code is relying on the backend to produce some particular alignment without an "align" marking, that's a bug in your code; you're just getting lucky. So the starting point is that all your code is semantically broken.

Given that as a starting point, it should be clear why you shouldn't be special-casing "cold" functions, and why you shouldn't be relying on the backend's "default" alignment.

@pogo59
Copy link
Collaborator

pogo59 commented Nov 21, 2023

The psABI doesn't say anything about function alignment, implying that align(1) is functionally correct, and anything more than that is either an optimization or functional dependency for our target. So, if we depend on anything more, we have to set alignment explicitly.

@FlameTop does that explanation sound reasonable to you? So PS4/PS5 should be explicitly setting alignment on all functions that don't have an alignment specified in source. No need for a new attribute.

@FlameTop
Copy link
Contributor Author

FlameTop commented Nov 22, 2023

@pogo59 I agree that stating the PS ABI has an explicit function alignment unless overridden does negate the need for a new attribute. I will re-work the code to remove the new attribute and instead enforce it for the PS ABI target.

Edit: Actually we still have the problem of when in LLVM not knowing if optimze-by-size got set on the cold funtion by it being cold or if Os was used.

@pogo59
Copy link
Collaborator

pogo59 commented Jan 8, 2024

Edit: Actually we still have the problem of when in LLVM not knowing if optimze-by-size got set on the cold funtion by it being cold or if Os was used.

Not sure I understand the question (sorry I missed this comment when it was made). The suggestion is that for our target, clang sets explicit alignment, so LLVM should not need to know where optsize came from. LLVM's handling of optsize will not reset an alignment that is already set. Is there a situation where the wrong thing still happens?

@FlameTop FlameTop closed this Feb 22, 2024
@FlameTop
Copy link
Contributor Author

Closing this PR. I will redesign the change and re-submit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 clang:codegen clang Clang issues not falling into any other category llvm:ir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants