-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NVPTX] Fix code generation for trap-unreachable
.
#67478
Conversation
Actually, no. Because |
https://reviews.llvm.org/D152789 added an `exit` op before each `unreachable`. This means we never get to the `trap` instruction. - When `trap-unreachable` is enabled and `no-trap-after-noreturn` is not, don't insert `exit` before each `unreachable`. - Lower ISD::TRAP to both `trap` and `exit` instead of just the former. The fix doesn't work with `no-trap-after-noreturn`, because the `unreachable`s not following a `noreturn` are lowered to `exit; trap; exit;`. An alternative approach would be to insert `trap`s in `NVPTXLowerUnreachablePass`, depending on the `trap-unreachable` and `no-trap-after-noreturn` settings. I think we would then want skip lowering ISD::TRAP, so that we don't end up with `trap; exit; trap;` sequences.
8506012
to
1ef7379
Compare
@llvm/pr-subscribers-llvm-selectiondag Changeshttps://reviews.llvm.org/D152789 added an
The fix doesn't work with An alternative approach would be to insert Full diff: https://github.com/llvm/llvm-project/pull/67478.diff 5 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index f39b62abdd87790..ba0ab3586f75825 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -3226,14 +3226,9 @@ void SelectionDAGBuilder::visitUnreachable(const UnreachableInst &I) {
// We may be able to ignore unreachable behind a noreturn call.
if (DAG.getTarget().Options.NoTrapAfterNoreturn) {
- const BasicBlock &BB = *I.getParent();
- if (&I != &BB.front()) {
- BasicBlock::const_iterator PredI =
- std::prev(BasicBlock::const_iterator(&I));
- if (const CallInst *Call = dyn_cast<CallInst>(&*PredI)) {
+ if (const CallInst *Call = dyn_cast_or_null<CallInst>(I.getPrevNode())) {
if (Call->doesNotReturn())
return;
- }
}
}
diff --git a/llvm/lib/Target/NVPTX/NVPTX.h b/llvm/lib/Target/NVPTX/NVPTX.h
index c5816b9266dfd9e..8dc68911fff0c05 100644
--- a/llvm/lib/Target/NVPTX/NVPTX.h
+++ b/llvm/lib/Target/NVPTX/NVPTX.h
@@ -47,7 +47,8 @@ MachineFunctionPass *createNVPTXReplaceImageHandlesPass();
FunctionPass *createNVPTXImageOptimizerPass();
FunctionPass *createNVPTXLowerArgsPass();
FunctionPass *createNVPTXLowerAllocaPass();
-FunctionPass *createNVPTXLowerUnreachablePass();
+FunctionPass *createNVPTXLowerUnreachablePass(bool TrapUnreachable,
+ bool NoTrapAfterNoreturn);
MachineFunctionPass *createNVPTXPeephole();
MachineFunctionPass *createNVPTXProxyRegErasurePass();
diff --git a/llvm/lib/Target/NVPTX/NVPTXLowerUnreachable.cpp b/llvm/lib/Target/NVPTX/NVPTXLowerUnreachable.cpp
index 1d312f82e6c061c..efafd909b93be37 100644
--- a/llvm/lib/Target/NVPTX/NVPTXLowerUnreachable.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXLowerUnreachable.cpp
@@ -72,6 +72,7 @@
#include "llvm/IR/Function.h"
#include "llvm/IR/InlineAsm.h"
#include "llvm/IR/Instructions.h"
+#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/Type.h"
#include "llvm/Pass.h"
@@ -83,14 +84,19 @@ void initializeNVPTXLowerUnreachablePass(PassRegistry &);
namespace {
class NVPTXLowerUnreachable : public FunctionPass {
+ StringRef getPassName() const override;
bool runOnFunction(Function &F) override;
+ bool shouldEmitTrap(const UnreachableInst &I) const;
public:
static char ID; // Pass identification, replacement for typeid
- NVPTXLowerUnreachable() : FunctionPass(ID) {}
- StringRef getPassName() const override {
- return "add an exit instruction before every unreachable";
- }
+ NVPTXLowerUnreachable(bool TrapUnreachable, bool NoTrapAfterNoreturn)
+ : FunctionPass(ID), TrapUnreachable(TrapUnreachable),
+ NoTrapAfterNoreturn(NoTrapAfterNoreturn) {}
+
+private:
+ bool TrapUnreachable;
+ bool NoTrapAfterNoreturn;
};
} // namespace
@@ -99,6 +105,24 @@ char NVPTXLowerUnreachable::ID = 1;
INITIALIZE_PASS(NVPTXLowerUnreachable, "nvptx-lower-unreachable",
"Lower Unreachable", false, false)
+StringRef NVPTXLowerUnreachable::getPassName() const {
+ return "add an exit instruction before every unreachable";
+}
+
+// =============================================================================
+// Returns whether a `trap` intrinsic should be emitted before I.
+//
+// This is a copy of the logic in SelectionDAGBuilder::visitUnreachable().
+// =============================================================================
+bool NVPTXLowerUnreachable::shouldEmitTrap(const UnreachableInst &I) const {
+ if (!TrapUnreachable)
+ return false;
+ if (!NoTrapAfterNoreturn)
+ return true;
+ const CallInst *Call = dyn_cast_or_null<CallInst>(I.getPrevNode());
+ return Call && Call->doesNotReturn();
+}
+
// =============================================================================
// Main function for this pass.
// =============================================================================
@@ -109,18 +133,25 @@ bool NVPTXLowerUnreachable::runOnFunction(Function &F) {
LLVMContext &C = F.getContext();
FunctionType *ExitFTy = FunctionType::get(Type::getVoidTy(C), false);
InlineAsm *Exit = InlineAsm::get(ExitFTy, "exit;", "", true);
+ Function *Trap = nullptr;
bool Changed = false;
for (auto &BB : F)
for (auto &I : BB) {
if (auto unreachableInst = dyn_cast<UnreachableInst>(&I)) {
- Changed = true;
+ if (shouldEmitTrap(*unreachableInst)) {
+ if (!Trap)
+ Trap = Intrinsic::getDeclaration(F.getParent(), Intrinsic::trap);
+ CallInst::Create(Trap, "", unreachableInst);
+ }
CallInst::Create(ExitFTy, Exit, "", unreachableInst);
+ Changed = true;
}
}
return Changed;
}
-FunctionPass *llvm::createNVPTXLowerUnreachablePass() {
- return new NVPTXLowerUnreachable();
+FunctionPass *llvm::createNVPTXLowerUnreachablePass(bool TrapUnreachable,
+ bool NoTrapAfterNoreturn) {
+ return new NVPTXLowerUnreachable(TrapUnreachable, NoTrapAfterNoreturn);
}
diff --git a/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp b/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
index cad97b1f14eb2b9..8d895762fbe1d9d 100644
--- a/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
@@ -63,13 +63,6 @@ static cl::opt<bool> UseShortPointersOpt(
"Use 32-bit pointers for accessing const/local/shared address spaces."),
cl::init(false), cl::Hidden);
-// FIXME: intended as a temporary debugging aid. Should be removed before it
-// makes it into the LLVM-17 release.
-static cl::opt<bool>
- ExitOnUnreachable("nvptx-exit-on-unreachable",
- cl::desc("Lower 'unreachable' as 'exit' instruction."),
- cl::init(true), cl::Hidden);
-
namespace llvm {
void initializeGenericToNVVMLegacyPassPass(PassRegistry &);
@@ -410,8 +403,9 @@ void NVPTXPassConfig::addIRPasses() {
addPass(createSROAPass());
}
- if (ExitOnUnreachable)
- addPass(createNVPTXLowerUnreachablePass());
+ const auto &Options = getNVPTXTargetMachine().Options;
+ addPass(createNVPTXLowerUnreachablePass(Options.TrapUnreachable,
+ Options.NoTrapAfterNoreturn));
}
bool NVPTXPassConfig::addInstSelector() {
diff --git a/llvm/test/CodeGen/NVPTX/unreachable.ll b/llvm/test/CodeGen/NVPTX/unreachable.ll
index 742089df1bd4533..011497c4e23401a 100644
--- a/llvm/test/CodeGen/NVPTX/unreachable.ll
+++ b/llvm/test/CodeGen/NVPTX/unreachable.ll
@@ -1,5 +1,11 @@
-; RUN: llc < %s -march=nvptx -mcpu=sm_20 -verify-machineinstrs | FileCheck %s
-; RUN: llc < %s -march=nvptx64 -mcpu=sm_20 -verify-machineinstrs | FileCheck %s
+; RUN: llc < %s -march=nvptx -mcpu=sm_20 -verify-machineinstrs \
+; RUN: | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-NOTRAP
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_20 -verify-machineinstrs \
+; RUN: | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-NOTRAP
+; RUN: llc < %s -march=nvptx -mcpu=sm_20 -verify-machineinstrs -trap-unreachable \
+; RUN: | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-TRAP
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_20 -verify-machineinstrs -trap-unreachable \
+; RUN: | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-TRAP
; RUN: %if ptxas && !ptxas-12.0 %{ llc < %s -march=nvptx -mcpu=sm_20 -verify-machineinstrs | %ptxas-verify %}
; RUN: %if ptxas %{ llc < %s -march=nvptx64 -mcpu=sm_20 -verify-machineinstrs | %ptxas-verify %}
@@ -11,7 +17,10 @@ define void @kernel_func() {
; CHECK: call.uni
; CHECK: throw,
call void @throw()
-; CHECK: exit
+; CHECK-TRAP-NOT: exit;
+; CHECK-TRAP: trap;
+; CHECK-NOTRAP-NOT: trap;
+; CHECK: exit;
unreachable
}
|
I moved the implementation to I'm wondering though if we should convey to |
That would be fine. |
@maleadt FYI. For some reason I can't add you as a reviewer on this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM in general.
It was possible for multiDimWarpId[1] to be 0 which then gets translated into a `urem 0, 0` and results in an unreachable when going through llvm, an empty kernel, and nans. chsigg is working on a fix to lower the unreachable in llvm to a trap (llvm/llvm-project#67478).
It was possible for multiDimWarpId[1] to be 0 which then gets translated into a `urem 0, 0` and results in an unreachable when going through llvm, an empty kernel, and nans. chsigg is working on a fix to lower the unreachable in llvm to a trap (llvm/llvm-project#67478).
✅ With the latest revision this PR passed the C/C++ code formatter. |
Ok, I changed it to that. It seems a little cleaner because it a) avoids |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
It didn't occur to me to preserve the trap
as in Julia we're mostly avoiding that instruction since it 'breaks' the GPU for the current process, without a way to reset.
EDIT: just to be sure, I ran our GPU testsuite on an older device where JuliaGPU/CUDA.jl#1746 is an issue, and using trap; exit
as proposed here doesn't seem to cause any issues.
It was possible for multiDimWarpId[1] to be 0 which then gets translated into a `urem 0, 0` and results in an unreachable when going through llvm, an empty kernel, and nans. This PR uses ceiling to clamp the result to be >=1. chsigg is working on a fix to lower the unreachable in llvm to a trap (llvm/llvm-project#67478).
Local branch amd-gfx 5546a04 Merged main:18461dc45483 into amd-gfx:2f63a363aacc Remote branch main 5b7a7ec [NVPTX] Fix code generation for `trap-unreachable`. (llvm#67478)
We've removed #72641 tries to disable it for some unreachable branches, which I'm not convinced we should do, but we do not seem to have an escape hatch option any more, as an alternative. |
It was possible for multiDimWarpId[1] to be 0 which then gets translated into a `urem 0, 0` and results in an unreachable when going through llvm, an empty kernel, and nans. This PR uses ceiling to clamp the result to be >=1. chsigg is working on a fix to lower the unreachable in llvm to a trap (llvm/llvm-project#67478).
https://reviews.llvm.org/D152789 added an
exit
op before eachunreachable
. This means we never get to thetrap
instruction.This change limits the insertion of
exit
instructions to the cases whereunreachable
is not lowered totrap
. Trap itself is changed to be emitted astrap; exit;
to convey toptxas
that it exits the CFG.