[CGP] Permit tail call optimization on undefined return value #82419
@llvm/pr-subscribers-backend-aarch64 @llvm/pr-subscribers-backend-x86
Author: Antonio Frighetto (antoniofrighetto)
Changes
We should be able to freely allow tail call optimization on undefined values as well. Fixes: #82387.
Full diff: https://github.com/llvm/llvm-project/pull/82419.diff
2 Files Affected:
diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index 4036f18dbc6794..feefe87f406365 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -2686,8 +2686,9 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
attributesPermitTailCall(F, CI, RetI, *TLI)) {
// Either we return void or the return value must be the first
// argument of a known intrinsic or library function.
- if (!V || (isIntrinsicOrLFToBeTailCalled(TLInfo, CI) &&
- V == CI->getArgOperand(0))) {
+ if (!V || isa<UndefValue>(V) ||
+ (isIntrinsicOrLFToBeTailCalled(TLInfo, CI) &&
+ V == CI->getArgOperand(0))) {
TailCallBBs.push_back(Pred);
}
}
diff --git a/llvm/test/CodeGen/X86/tailcall-cgp-dup.ll b/llvm/test/CodeGen/X86/tailcall-cgp-dup.ll
index 401ed9f7bc5a9e..92811c87f5623f 100644
--- a/llvm/test/CodeGen/X86/tailcall-cgp-dup.ll
+++ b/llvm/test/CodeGen/X86/tailcall-cgp-dup.ll
@@ -362,8 +362,30 @@ return:
ret ptr %src
}
+@i = global i32 0, align 4
+
+define i32 @undef_tailc() nounwind {
+; CHECK-LABEL: undef_tailc:
+; CHECK: ## %bb.0:
+; CHECK-NEXT: cmpl $0, _i(%rip)
+; CHECK-NEXT: jne _qux ## TAILCALL
+; CHECK-NEXT: ## %bb.1:
+; CHECK-NEXT: retq
+ %1 = load i32, ptr @i, align 4
+ %2 = icmp eq i32 %1, 0
+ br i1 %2, label %5, label %3
+
+3:
+ %4 = tail call i32 @qux()
+ br label %5
+
+5:
+ ret i32 undef
+}
+
declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i64, i1)
declare void @llvm.memset.p0.i64(ptr nocapture writeonly, i8, i64, i1)
declare noalias ptr @malloc(i64)
declare ptr @strcpy(ptr noalias returned writeonly, ptr noalias nocapture readonly)
declare ptr @baz(ptr, ptr)
+declare i32 @qux()
@@ -2686,8 +2686,9 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
attributesPermitTailCall(F, CI, RetI, *TLI)) {
// Either we return void or the return value must be the first
// argument of a known intrinsic or library function.
- if (!V || (isIntrinsicOrLFToBeTailCalled(TLInfo, CI) &&
- V == CI->getArgOperand(0))) {
+ if (!V || isa<UndefValue>(V) ||
Could we just leave V as nullptr back at line 2602?
I think the explicit UndefValue check is fine. Especially if we also want to extend the phi case above to handle undef incoming values later.
Would we ever hit the phi case? Looking at https://clang.godbolt.org/z/eq8vTxvzM, in the first case a single non-undef incoming value seems to be enough to tail call, whereas in the second case CGP refines the function as follows:
define dso_local i32 @foo() local_unnamed_addr #0 {
  %1 = load i32, ptr @i, align 4
  switch i32 %1, label %6 [
    i32 2, label %2
    i32 5, label %4
  ]

2:                                                ; preds = %0
  %3 = tail call i32 @bar() #2
  br label %6

4:                                                ; preds = %0
  %5 = tail call i32 @qux() #2
  ret i32 %5

6:                                                ; preds = %2, %0
  ret i32 undef
}
The call to @bar in block 2 thus gets tail called as part of this change.
Confirmed on RISC-V to fix the reproducer for the issue mentioned above, thanks a lot for the fast PR!
LGTM
From CI: Failed Tests (3):
We may freely allow tail call optzs on undef values as well. Fixes: llvm#82387.
7fd2a6f to 25e7e8d
We should be able to freely allow tail call optimization on undefined values as well.
Fixes: #82387.
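
To make the effect of the changed condition concrete, here is a minimal toy model of the final disjunction in `dupRetToEnableTailCallOpts`, before and after the patch. It does not use LLVM: `ToyValue`, `ToyCall`, `permitsTailCallBefore` and `permitsTailCallAfter` are hypothetical names introduced only for illustration, and the other checks the real code performs (such as `attributesPermitTailCall`) are assumed to have already passed.

```cpp
#include <cassert>

// Toy stand-ins for LLVM's Value hierarchy. These are NOT the real LLVM
// classes; they only model "is this return value undef or not".
enum class ValueKind { Ordinary, Undef };

struct ToyValue {
  ValueKind Kind;
};

// Hypothetical summary of the call in the predecessor block.
struct ToyCall {
  bool IsKnownIntrinsicOrLibFunc; // models isIntrinsicOrLFToBeTailCalled(TLInfo, CI)
  const ToyValue *FirstArg;       // models CI->getArgOperand(0); may be null
};

// Condition before the patch: void return (V == nullptr), or the returned
// value is the first argument of a known intrinsic or library function.
bool permitsTailCallBefore(const ToyValue *V, const ToyCall &CI) {
  return !V || (CI.IsKnownIntrinsicOrLibFunc && V == CI.FirstArg);
}

// Condition after the patch: an undef return value is accepted as well.
// The Kind comparison models isa<UndefValue>(V).
bool permitsTailCallAfter(const ToyValue *V, const ToyCall &CI) {
  return !V || V->Kind == ValueKind::Undef ||
         (CI.IsKnownIntrinsicOrLibFunc && V == CI.FirstArg);
}
```

In this model, a function shaped like `undef_tailc` from the test (an ordinary call followed by `ret i32 undef`) is rejected by the old predicate and accepted by the new one, while a non-undef return value that is unrelated to the call is still rejected by both.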