Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GlobalOpt] Don't replace aliasee with alias that has weak linkage #91483

Merged
merged 4 commits into from
May 16, 2024

Conversation

DianQK
Copy link
Member

@DianQK DianQK commented May 8, 2024

Fixes #91312.

Don't perform the transform if the alias may be replaced at link time while someone is using the aliasee (e.g., multiple aliases potentially target it or someone calls it).

If necessary, for a conservative approach, I can backport only ab7be41 after the PR is approved.

@DianQK DianQK changed the title [WIP][GlobalOpt] Don't replace aliasee with alias that has weak linkage [GlobalOpt] Don't replace aliasee with alias that has weak linkage May 9, 2024
@DianQK DianQK marked this pull request as ready for review May 9, 2024 14:16
@llvmbot
Copy link
Collaborator

llvmbot commented May 9, 2024

@llvm/pr-subscribers-llvm-transforms

Author: DianQK (DianQK)

Changes

Fixes #91312.

Don't perform the transform if the alias may be replaced at link time while someone is using the aliasee (e.g., multiple aliases potentially target it or someone calls it).


Full diff: https://github.com/llvm/llvm-project/pull/91483.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/GlobalOpt.cpp (+19)
  • (added) llvm/test/Transforms/GlobalOpt/alias-weak.ll (+56)
  • (modified) llvm/test/Transforms/LowerTypeTests/cfi-nounwind-direct-call.ll (+2-2)
  • (modified) llvm/test/Transforms/LowerTypeTests/cfi-unwind-direct-call.ll (+4-4)
diff --git a/llvm/lib/Transforms/IPO/GlobalOpt.cpp b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
index c8c835115a992..53f54294c3b39 100644
--- a/llvm/lib/Transforms/IPO/GlobalOpt.cpp
+++ b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
@@ -2215,6 +2215,17 @@ static bool hasUseOtherThanLLVMUsed(GlobalAlias &GA, const LLVMUsed &U) {
   return !U.usedCount(&GA) && !U.compilerUsedCount(&GA);
 }
 
+static bool hasMoreThanOneUseOtherThanLLVMUsed(GlobalValue &V,
+                                               const LLVMUsed &U) {
+  unsigned N = 2;
+  assert((!U.usedCount(&V) || !U.compilerUsedCount(&V)) &&
+         "We should have removed the duplicated "
+         "element from llvm.compiler.used");
+  if (U.usedCount(&V) || U.compilerUsedCount(&V))
+    ++N;
+  return V.hasNUsesOrMore(N);
+}
+
 static bool mayHaveOtherReferences(GlobalValue &GV, const LLVMUsed &U) {
   if (!GV.hasLocalLinkage())
     return true;
@@ -2224,6 +2235,7 @@ static bool mayHaveOtherReferences(GlobalValue &GV, const LLVMUsed &U) {
 
 static bool hasUsesToReplace(GlobalAlias &GA, const LLVMUsed &U,
                              bool &RenameTarget) {
+
   RenameTarget = false;
   bool Ret = false;
   if (hasUseOtherThanLLVMUsed(GA, U))
@@ -2242,6 +2254,13 @@ static bool hasUsesToReplace(GlobalAlias &GA, const LLVMUsed &U,
   //   define ... @a(...)
   Constant *Aliasee = GA.getAliasee();
   GlobalValue *Target = cast<GlobalValue>(Aliasee->stripPointerCasts());
+  // Don't perform the transform if the alias may be replaced at link time while
+  // someone is using the aliasee (e.g., multiple aliases potentially target it
+  // or someone calls it).
+  if (GA.isWeakForLinker())
+    Ret = false;
+  if (hasMoreThanOneUseOtherThanLLVMUsed(*Target, U))
+    return Ret;
   if (mayHaveOtherReferences(*Target, U))
     return Ret;
 
diff --git a/llvm/test/Transforms/GlobalOpt/alias-weak.ll b/llvm/test/Transforms/GlobalOpt/alias-weak.ll
new file mode 100644
index 0000000000000..e24085f00d714
--- /dev/null
+++ b/llvm/test/Transforms/GlobalOpt/alias-weak.ll
@@ -0,0 +1,56 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals all --include-generated-funcs --version 4
+; RUN: opt < %s -passes=globalopt -S | FileCheck %s
+
+@f1_alias = linkonce_odr hidden alias void (), ptr @f1
+@f2_alias = linkonce_odr hidden alias void (), ptr @f2
+
+define void @foo() {
+  call void @f1_alias()
+  ret void
+}
+
+define void @bar() {
+  call void @f1()
+  ret void
+}
+
+define void @baz() {
+  call void @f2_alias()
+  ret void
+}
+
+; We cannot use `f1_alias` to replace `f1` because they are both in use
+; and `f1_alias` could be replaced at link time.
+define internal void @f1() {
+  ret void
+}
+
+; We can use `f2_alias` to replace `f2` because `b2` is not in use.
+define internal void @f2() {
+  ret void
+}
+;.
+; CHECK: @f1_alias = linkonce_odr hidden alias void (), ptr @f1
+;.
+; CHECK-LABEL: define void @foo() local_unnamed_addr {
+; CHECK-NEXT:    call void @f1_alias()
+; CHECK-NEXT:    ret void
+;
+;
+; CHECK-LABEL: define void @bar() local_unnamed_addr {
+; CHECK-NEXT:    call void @f1()
+; CHECK-NEXT:    ret void
+;
+;
+; CHECK-LABEL: define void @baz() local_unnamed_addr {
+; CHECK-NEXT:    call void @f2_alias()
+; CHECK-NEXT:    ret void
+;
+;
+; CHECK-LABEL: define internal void @f1() {
+; CHECK-NEXT:    ret void
+;
+;
+; CHECK-LABEL: define linkonce_odr hidden void @f2_alias() local_unnamed_addr {
+; CHECK-NEXT:    ret void
+;
diff --git a/llvm/test/Transforms/LowerTypeTests/cfi-nounwind-direct-call.ll b/llvm/test/Transforms/LowerTypeTests/cfi-nounwind-direct-call.ll
index 2795333effd76..f05befb10658b 100644
--- a/llvm/test/Transforms/LowerTypeTests/cfi-nounwind-direct-call.ll
+++ b/llvm/test/Transforms/LowerTypeTests/cfi-nounwind-direct-call.ll
@@ -121,7 +121,7 @@ attributes #6 = { noreturn nounwind }
 ; CHECK-LABEL: define dso_local noundef range(i32 0, 2) i32 @_Z10call_catchi
 ; CHECK-SAME: (i32 noundef [[NUM:%.*]]) local_unnamed_addr #[[ATTR1:[0-9]+]] !type [[META4]] !type [[META5]] !type [[META6]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    store ptr @_Z9nothrow_ei.cfi_jt, ptr @catch_ptr, align 8, !tbaa [[TBAA7:![0-9]+]]
+; CHECK-NEXT:    store ptr @.cfi.jumptable, ptr @catch_ptr, align 8, !tbaa [[TBAA7:![0-9]+]]
 ; CHECK-NEXT:    [[TOBOOL_NOT_I:%.*]] = icmp ne i32 [[NUM]], 0
 ; CHECK-NEXT:    [[DOT_I:%.*]] = zext i1 [[TOBOOL_NOT_I]] to i32
 ; CHECK-NEXT:    ret i32 [[DOT_I]]
@@ -152,7 +152,7 @@ attributes #6 = { noreturn nounwind }
 ;
 ;
 ; CHECK: Function Attrs: naked nocf_check noinline nounwind
-; CHECK-LABEL: define internal void @_Z9nothrow_ei.cfi_jt
+; CHECK-LABEL: define private void @.cfi.jumptable
 ; CHECK-SAME: () #[[ATTR5:[0-9]+]] align 8 {
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    tail call void asm sideeffect "jmp ${0:c}@plt\0Aint3\0Aint3\0Aint3\0A", "s"(ptr nonnull @_Z9nothrow_ei) #[[ATTR7:[0-9]+]]
diff --git a/llvm/test/Transforms/LowerTypeTests/cfi-unwind-direct-call.ll b/llvm/test/Transforms/LowerTypeTests/cfi-unwind-direct-call.ll
index 3e1f8b97e98b8..ca540a1cc5f97 100644
--- a/llvm/test/Transforms/LowerTypeTests/cfi-unwind-direct-call.ll
+++ b/llvm/test/Transforms/LowerTypeTests/cfi-unwind-direct-call.ll
@@ -174,8 +174,8 @@ attributes #8 = { noreturn nounwind }
 ; CHECK-LABEL: define dso_local void @_Z10call_catchi(
 ; CHECK-SAME: i32 noundef [[NUM:%.*]]) local_unnamed_addr #[[ATTR0]] personality ptr @__gxx_personality_v0 !type [[META4]] !type [[META5]] !type [[META6]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    store ptr @_Z7throw_ei.cfi_jt, ptr @catch_ptr, align 8, !tbaa [[TBAA11:![0-9]+]]
-; CHECK-NEXT:    invoke void @_Z7throw_ei.cfi_jt(i32 noundef [[NUM]]) #[[ATTR8:[0-9]+]]
+; CHECK-NEXT:    store ptr @.cfi.jumptable, ptr @catch_ptr, align 8, !tbaa [[TBAA11:![0-9]+]]
+; CHECK-NEXT:    invoke void @.cfi.jumptable(i32 noundef [[NUM]]) #[[ATTR8:[0-9]+]]
 ; CHECK-NEXT:            to label [[TRY_CONT:%.*]] unwind label [[LPAD:%.*]], !callees [[META13:![0-9]+]]
 ; CHECK:       lpad:
 ; CHECK-NEXT:    [[TMP0:%.*]] = landingpad { ptr, i32 }
@@ -220,7 +220,7 @@ attributes #8 = { noreturn nounwind }
 ;
 ;
 ; CHECK: Function Attrs: naked nocf_check noinline
-; CHECK-LABEL: define internal void @_Z7throw_ei.cfi_jt(
+; CHECK-LABEL: define private void @.cfi.jumptable(
 ; CHECK-SAME: ) #[[ATTR5:[0-9]+]] align 8 {
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    tail call void asm sideeffect "jmp ${0:c}@plt\0Aint3\0Aint3\0Aint3\0A", "s"(ptr nonnull @_Z7throw_ei) #[[ATTR6]]
@@ -236,6 +236,6 @@ attributes #8 = { noreturn nounwind }
 ; CHECK: [[META10]] = !{!"Simple C++ TBAA"}
 ; CHECK: [[TBAA11]] = !{[[META12:![0-9]+]], [[META12]], i64 0}
 ; CHECK: [[META12]] = !{!"any pointer", [[META9]], i64 0}
-; CHECK: [[META13]] = !{ptr @_Z7throw_ei.cfi_jt}
+; CHECK: [[META13]] = !{ptr @.cfi.jumptable}
 ; CHECK: [[META14]] = !{}
 ;.

@DianQK
Copy link
Member Author

DianQK commented May 9, 2024

@oskarwirga @hstk30-hw
I'm not sure if the test case updates for LowerTypeTests are correct. Could you take a look? Thanks.

@avikivity
Copy link
Contributor

Bump. Could we get this reviewed? It's a regression from clang 16 with no workaround.

@hstk30-hw
Copy link
Contributor

Sorry, I'm not familiar with those testcase. @nikic @efriedma-quic Can you review it?

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble following all the different cases interleaved together. I just spent half an hour trying to follow this, and I feel like I understand less than when I started.

Please delete the Ret variable, and replace the corresponding logic with something actually comprehensible.

// or someone calls it).
if (GA.isWeakForLinker())
Ret = false;
if (hasMoreThanOneUseOtherThanLLVMUsed(*Target, U))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there's a stripPointerCasts() earlier, do we need to check for other uses of a cast ConstantExpr, in addition to other uses of the global itself?

Or maybe we can just drop the whole pointer-cast-stripping thing... with opaque pointer types, not sure if there's any legitimate cast that can show up here. (An addressspacecast would be allowed by the verifier, but not sure it's something we can actually generate code for.)

@DianQK
Copy link
Member Author

DianQK commented May 16, 2024

I'm having trouble following all the different cases interleaved together. I just spent half an hour trying to follow this, and I feel like I understand less than when I started.

Please delete the Ret variable, and replace the corresponding logic with something actually comprehensible.

How do you feel about this PR only being submitted as ab7be41? I want to backport it to LLVM 18. I can create another PR for b901a98.

@efriedma-quic
Copy link
Collaborator

ab7be41 looks conservatively correct. Sure, just merging that to main and 18 for now seems like a plan.

@DianQK
Copy link
Member Author

DianQK commented May 16, 2024

I have reverted b901a98.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@DianQK DianQK merged commit c796900 into llvm:main May 16, 2024
3 of 4 checks passed
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request May 16, 2024
…lvm#91483)

Fixes llvm#91312.

Don't perform the transform if the alias may be replaced at link time.

(cherry picked from commit c796900)
@MaskRay
Copy link
Member

MaskRay commented May 16, 2024

How do we get @f2 = linkonce_odr hidden alias void (), ptr @f1 in the first place? This codegen seems problematic if f2 can be defined in another module as something different.

weak/linkonce stopping the replace makes sense to me but weak_odr/linkonce_odr should not.

@DianQK
Copy link
Member Author

DianQK commented May 17, 2024

How do we get @f2 = linkonce_odr hidden alias void (), ptr @f1 in the first place? This codegen seems problematic if f2 can be defined in another module as something different.

weak/linkonce stopping the replace makes sense to me but weak_odr/linkonce_odr should not.

Yes. IIUC, weak_odr/linkonce_odr should assume that the definitions of all symbols are the same.

Do you have more information? @avikivity

@efriedma-quic
Copy link
Collaborator

linkonce_odr on a function means you can replace a call to the function with the function's body (i.e. inlining). linkonce_odr on a function alias should mean the same thing, I think: a call to the alias can be replaced with the aliasee function's body. This allows a lot of useful optimizations, but it doesn't allow anything more that that. In particular, if you have an internal symbol @f1, and @f2 = linkonce_odr hidden alias void (), ptr @f1, the "odr-ness" only means that you can replace a call to @f2 with a call to @f1. It doesn't allow replacing a call to @f1 with a call to @f2, which is the transform that was causing trouble here. We sometimes refer to this as "derefinement".

Granted, the usage of linkonce_odr for aliases pointing to __tls_init stretches this definition of "equivalence" to its extreme limit; the different aliases don't do the same thing. It only works because the C++ standard explicitly allows reordering the initialization of thread-local variables.

@DianQK DianQK deleted the globalopt-linkonce_odr branch May 17, 2024 20:41
@DianQK
Copy link
Member Author

DianQK commented May 17, 2024

I don't know much about TLS, but it sounds reasonable. Though I'm not sure exactly how the IR is generated yet.

tstellar pushed a commit to llvmbot/llvm-project that referenced this pull request May 17, 2024
…lvm#91483)

Fixes llvm#91312.

Don't perform the transform if the alias may be replaced at link time.

(cherry picked from commit c796900)
@MaskRay
Copy link
Member

MaskRay commented May 17, 2024

linkonce_odr on a function means you can replace a call to the function with the function's body (i.e. inlining). linkonce_odr on a function alias should mean the same thing, I think: a call to the alias can be replaced with the aliasee function's body. This allows a lot of useful optimizations, but it doesn't allow anything more that that. In particular, if you have an internal symbol @f1, and @f2 = linkonce_odr hidden alias void (), ptr @f1, the "odr-ness" only means that you can replace a call to @f2 with a call to @f1.

Agreed.

It doesn't allow replacing a call to @f1 with a call to @f2, which is the transform that was causing trouble here. We sometimes refer to this as "derefinement".

While the transformation replaces aliases with their underlying functions, it doesn't work the other way around.
This patch prevents an optimization that would replace linkonce_odr/weak_odr aliases with their aliasees.
This is my concern.

Test output before this isWeakForlinker patch. No aliasee is replaced with its alias.

define void @foo() local_unnamed_addr {
  call void @f1_alias()
  ret void
}

define void @bar() local_unnamed_addr {
  call void @f1_alias()
  ret void
}

define void @baz() local_unnamed_addr {
  call void @f2_alias()
  ret void
}

define linkonce_odr hidden void @f1_alias() local_unnamed_addr {
  ret void
}

define linkonce_odr hidden void @f2_alias() local_unnamed_addr {
  ret void
}

Test output after this isWeakForlinker patch

@f1_alias = linkonce_odr hidden alias void (), ptr @f1
@f2_alias = linkonce_odr hidden alias void (), ptr @f2

define void @foo() local_unnamed_addr {
  call void @f1_alias()
  ret void
}

define void @bar() local_unnamed_addr {
  call void @f1()
  ret void
}

define void @baz() local_unnamed_addr {
  call void @f2_alias()
  ret void
}

define internal void @f1() {
  ret void
}

define internal void @f2() {
  ret void
}

Granted, the usage of linkonce_odr for aliases pointing to __tls_init stretches this definition of "equivalence" to its extreme limit; the different aliases don't do the same thing. It only works because the C++ standard explicitly allows reordering the initialization of thread-local variables.

Agree there is some stretch about the limit.
I want to know a minimal reproduce to better understand why __tls_init was problematic.

@efriedma-quic
Copy link
Collaborator

Test output before this isWeakForlinker patch. No aliasee is replaced with its alias.

In @bar, the call to @f1 is a no-op. It's replaced with a call to @f1_alias... which non-deterministically refers to any definition of @f1_alias, which could have arbitrary effects.

While the transformation replaces aliases with their underlying functions, it doesn't work the other way around

The bug happens in the case where the underlying function has other uses. So you have an internal function with some callers, and then those callers are effectively rewritten to call a linkonce function.


The TLS case looks like the following:

struct A { A(); };
struct B { B(); };
inline thread_local A a;
static thread_local B b;
void g(A*, B*);
void f() { g(&a, &b); }
void f2() { g(&a, &b); }
void f3() { g(&a, &b); }
void f4() { g(&a, &b); }
void f5() { g(&a, &b); }

At -O0, the initialization of "a" and "b" happens in define internal void @__tls_init(). At -Oz, this becomes linkonce_odr dso_local void @_ZTH1a()... so if a is also defined in another file, we can skip the constructor for b.

@MaskRay
Copy link
Member

MaskRay commented May 18, 2024

Ah, I misread the code.
In bar, call void @f1() calling the aliasee was replaced with the alias before (RenameTarget).
This was problematic and this patch have fixed it.

This patch and its backport to release/18.x looks good to me.

I agree that some cleanup could be made to improve readability of this code.


At -O0, the initialization of "a" and "b" happens in define internal void @__tls_init(). At -Oz, this becomes linkonce_odr dso_local void @_ZTH1a()... so if a is also defined in another file, we can skip the constructor for b.

Thanks for the example.

At -O0, define internal void @__tls_init() initializes a and b.
Previously, __tls_init was incorrectly renamed (due to RenameTarget) to _ZTH1a (like derefinement).
If the linker picks the prevailing definition of _ZTH1a from another module, we would skip the constructor of b.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

miscompiled TLS wrapper, likely coroutine related, with sanitizer
7 participants