PowerPC: fatal error: error in backend: failed to perform tail call elimination on a call site marked musttail #56679

Closed
pkubaj opened this issue Jul 22, 2022 · 10 comments

Comments

@pkubaj
Contributor

pkubaj commented Jul 22, 2022

FreeBSD 13.1-RELEASE on powerpc / powerpc64 / powerpc64le
LLVM 14.0.6

namespace std {
template <bool, class, class _Then> using conditional_t = _Then;
void is_same_v();
template <class _Tp> using remove_cvref_t = _Tp;
template <class = void> struct coroutine_handle;
namespace experimental {
template <class R, class...> struct coroutine_traits : R {};
template <typename Promise>
struct coroutine_handle : std::coroutine_handle<Promise> {};
} // namespace experimental
template <> struct coroutine_handle<> {
  coroutine_handle(decltype(nullptr));
  void *address() noexcept;
};
template <class> struct coroutine_handle : coroutine_handle<> {
  static coroutine_handle from_address(void *) noexcept;
};
struct suspend_always {
  bool await_ready();
  void await_suspend(coroutine_handle<>);
  void await_resume();
};
inline namespace _LIBCPP_ABI_NAMESPACE {
template <long, class> struct tuple_element;
template <class...> class tuple;
template <class...> struct __tuple_types;
template <long _Ip, class... _Types>
struct tuple_element<_Ip, __tuple_types<_Types...>> {
  typedef __type_pack_element<_Ip, _Types...> type;
};
template <long _Ip, class... _Tp> struct tuple_element<_Ip, tuple<_Tp...>> {
  typedef typename tuple_element<_Ip, __tuple_types<_Tp...>>::type type;
};
template <long _Ip, class... _Tp>
using tuple_element_t = typename tuple_element<_Ip, _Tp...>::type;
} // namespace _LIBCPP_ABI_NAMESPACE
} // namespace std
namespace QCoro {
template <typename> class AsyncGenerator;
namespace detail {
class AsyncGeneratorYieldOperation;
struct AsyncGeneratorPromiseBase {
  std::suspend_always initial_suspend();
  AsyncGeneratorYieldOperation final_suspend() noexcept;
  void unhandled_exception();
  void return_void();
};
struct AsyncGeneratorYieldOperation {
  bool await_ready() noexcept;
  std::coroutine_handle<> await_suspend(std::coroutine_handle<>) noexcept;
  void await_resume() noexcept;
};
struct IteratorAwaitableBase {
  IteratorAwaitableBase(AsyncGeneratorPromiseBase, std::coroutine_handle<>);
  void await_suspend(std::coroutine_handle<>);
};
struct AsyncGeneratorPromise : AsyncGeneratorPromiseBase {
  AsyncGenerator<std::tuple<int, bool>> get_return_object();
};
} // namespace detail
template <typename> struct AsyncGenerator {
  using promise_type = detail::AsyncGeneratorPromise;
  auto begin() {
    struct BeginIteratorAwaitable : detail::IteratorAwaitableBase {
      detail::AsyncGeneratorPromise __trans_tmp_1;
      BeginIteratorAwaitable(std::coroutine_handle<> producerCoroutine)
          : IteratorAwaitableBase(__trans_tmp_1, producerCoroutine) {}
      bool await_ready();
      void await_resume();
    };
    return BeginIteratorAwaitable{nullptr};
  }
};
} // namespace QCoro
class QWebSocket;
namespace QCoro::detail {
struct QCoroWebSocket {
  AsyncGenerator<std::tuple<int>> binaryFrames();
  QWebSocket *mWebSocket;
};
namespace concepts {
template <typename>
concept QObject = requires {
  std::is_same_v;
};
} // namespace concepts
template <concepts::QObject, typename> struct QCoroSignalQueue;
} // namespace QCoro::detail
template <QCoro::detail::concepts::QObject T, typename FuncPtr>
auto qCoroSignalListener(T, FuncPtr)
    -> QCoro::AsyncGenerator<QCoro::detail::QCoroSignalQueue<T, FuncPtr>>;
struct QWebSocket {
  void binaryFrameReceived(int, bool);
};
using namespace QCoro::detail;
template <typename> struct signal_args;
template <typename T, typename R, typename... Args>
struct signal_args<R (T::*)(Args...)> {
  using types = std::tuple<std::remove_cvref_t<Args>...>;
};
template <typename T> using signal_args_t = typename signal_args<T>::types;
template <typename> struct unwrapped_signal_args;
template <typename... Args> struct unwrapped_signal_args<std::tuple<Args...>> {
  using args_tuple = std::tuple<std::remove_cvref_t<Args>...>;
  using type =
      std::conditional_t<0, std::tuple_element_t<0, args_tuple>, args_tuple>;
};
template <typename... Args>
using unwrapped_signal_args_t = typename unwrapped_signal_args<Args...>::type;
struct WebSocketSignalWatcher {
  void ready();
};
template <typename Signal>
auto watcherGenerator(QWebSocket *, Signal)
    -> QCoro::AsyncGenerator<unwrapped_signal_args_t<signal_args_t<Signal>>> {
  void watcher();
  auto signalListener =
      qCoroSignalListener(watcher, &WebSocketSignalWatcher::ready);
  co_await signalListener.begin();
}
int binaryFrames_timeout;
QCoro::AsyncGenerator<std::tuple<int>> QCoroWebSocket::binaryFrames() {
  watcherGenerator(mWebSocket, &QWebSocket::binaryFrameReceived);
}

Build with:

clang++14 -cc1 -triple powerpc-unknown-freebsd13.1 -emit-obj -std=gnu++20 qcorowebsocket-c2cb02.cpp

Output:

fatal error: error in backend: failed to perform tail call elimination on a call site marked musttail
@EugeneZelenko
Contributor

Could you please try main branch? https://godbolt.org should be helpful.

@llvmbot
Collaborator

llvmbot commented Jul 22, 2022

@llvm/issue-subscribers-backend-powerpc

@pkubaj
Contributor Author

pkubaj commented Jul 22, 2022

godbolt.org, when compiling with power64le clang (trunk), says:

fatal error: error in backend: failed to perform tail call elimination on a call site marked musttail
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /opt/compiler-explorer/clang-trunk/bin/clang++ -g -o /app/output.s -S -target powerpc64le -fcolor-diagnostics -fno-crash-diagnostics -std=gnu++20 <source>
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module '<source>'.
4.	Running pass 'PowerPC DAG->DAG Pattern Instruction Selection' on function '@_Z16watcherGeneratorIM10QWebSocketFvibEEN5QCoro14AsyncGeneratorIN21unwrapped_signal_argsIN11signal_argsIT_E5typesEE4typeEEEPS0_S7_.resume'
 #0 0x00005655011c5904 PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
 #1 0x00005655011c372c llvm::sys::CleanupOnSignal(unsigned long) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x38af72c)
 #2 0x00005655011027b8 llvm::CrashRecoveryContext::HandleExit(int) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x37ee7b8)
 #3 0x00005655011bb832 llvm::sys::Process::Exit(int, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x38a7832)
 #4 0x00005654feb280cf (/opt/compiler-explorer/clang-trunk/bin/clang+++0x12140cf)
 #5 0x000056550110904a llvm::report_fatal_error(llvm::Twine const&, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x37f504a)
 #6 0x00005655011091de (/opt/compiler-explorer/clang-trunk/bin/clang+++0x37f51de)
 #7 0x00005654ff9a2511 llvm::PPCTargetLowering::LowerCall(llvm::TargetLowering::CallLoweringInfo&, llvm::SmallVectorImpl<llvm::SDValue>&) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x208e511)
 #8 0x00005655021ef5b8 llvm::TargetLowering::LowerCallTo(llvm::TargetLowering::CallLoweringInfo&) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x48db5b8)
 #9 0x00005655021f11ba llvm::SelectionDAGBuilder::lowerInvokable(llvm::TargetLowering::CallLoweringInfo&, llvm::BasicBlock const*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x48dd1ba)
#10 0x00005655022073d9 llvm::SelectionDAGBuilder::LowerCallTo(llvm::CallBase const&, llvm::SDValue, bool, bool, llvm::BasicBlock const*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x48f33d9)
#11 0x000056550221b2b1 llvm::SelectionDAGBuilder::visitCall(llvm::CallInst const&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x49072b1)
#12 0x00005655022313e7 llvm::SelectionDAGBuilder::visit(llvm::Instruction const&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x491d3e7)
#13 0x00005655022a024e llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, false, false, void>, false, true>, llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, false, false, void>, false, true>, bool&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x498c24e)
#14 0x00005655022a26c9 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x498e6c9)
#15 0x00005655022a4118 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (.part.0) SelectionDAGISel.cpp:0:0
#16 0x00005654ff9409d8 (anonymous namespace)::PPCDAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) PPCISelDAGToDAG.cpp:0:0
#17 0x00005655004b75ec llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x2ba35ec)
#18 0x00005655009654f0 llvm::FPPassManager::runOnFunction(llvm::Function&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x30514f0)
#19 0x0000565500965669 llvm::FPPassManager::runOnModule(llvm::Module&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3051669)
#20 0x0000565500966250 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3052250)
#21 0x000056550156f57a clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3c5b57a)
#22 0x00005655023e0218 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4acc218)
#23 0x00005655035aac99 clang::ParseAST(clang::Sema&, bool, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x5c96c99)
#24 0x00005655023df815 clang::CodeGenAction::ExecuteAction() (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4acb815)
#25 0x0000565501ce8c01 clang::FrontendAction::Execute() (/opt/compiler-explorer/clang-trunk/bin/clang+++0x43d4c01)
#26 0x0000565501c70ac3 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x435cac3)
#27 0x0000565501dc723b clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x44b323b)
#28 0x00005654feb296e4 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x12156e4)
#29 0x00005654feb22c8b ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#30 0x0000565501adaa99 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0
#31 0x0000565501102637 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x37ee637)
#32 0x0000565501adb08c clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0
#33 0x0000565501aa50fe clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x41910fe)
#34 0x0000565501aa5b1d clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4191b1d)
#35 0x0000565501ab096c clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x419c96c)
#36 0x00005654feb27469 clang_main(int, char**) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x1213469)
#37 0x00007f020924a0b3 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x240b3)
#38 0x00005654feb2289e _start (/opt/compiler-explorer/clang-trunk/bin/clang+++0x120e89e)
clang-15: error: clang frontend command failed with exit code 70 (use -v to see invocation)
Compiler returned: 70

@yuanfang-chen added the coroutines (C++20 coroutines) label Jul 22, 2022
@chenzheng1030
Collaborator

chenzheng1030 commented Jul 25, 2022

This issue also exists on powerpc64le.

A narrowed-down case:

$ cat r.ll
target datalayout = "e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512"
target triple = "powerpc64le-unknown-linux-gnu"

; define internal fastcc void @foo(i8* %value) {
define internal void @foo(i8* %value) personality i8* null {
entry:
  musttail call fastcc void null(i8* null)
  ret void
}

; uselistorder directives
uselistorder i8* null, { 1, 0 }

@chenzheng1030
Collaborator

It seems that CoroSplitPass aggressively marks an indirect call with the musttail attribute.

The PPC target will not do tail call optimization for indirect calls. For an indirect call, the PPC backend needs to insert instructions after the normal call instruction to restore the TOC when the callee returns. If we apply tail call optimization to that call on PPC, there is no opportunity to restore the TOC for the caller. See:

  // All variants of 64-bit ELF ABIs without PC-Relative addressing require that
  // the caller and callee share the same TOC for TCO/SCO. If the caller and
  // callee potentially have different TOC bases then we cannot tail call since
  // we need to restore the TOC pointer after the call.
  // ref: https://bugzilla.mozilla.org/show_bug.cgi?id=973977
  // We cannot guarantee this for indirect calls or calls to external functions.
  // When PC-Relative addressing is used, the concept of the TOC is no longer
  // applicable so this check is not required.
  // Check first for indirect calls.
  if (!Subtarget.isUsingPCRelativeCalls() &&
      !isFunctionGlobalAddress(Callee) && !isa<ExternalSymbolSDNode>(Callee))
    return false;
  // Check if we share the TOC base.
  if (!Subtarget.isUsingPCRelativeCalls() &&
      !callsShareTOCBase(&Caller, Callee, getTargetMachine()))
    return false;
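
To make the two quoted checks easier to follow, here is a small standalone model of them. It is only a sketch, not LLVM code: `CallSiteModel` and `mayTailCall` are hypothetical names, and the three booleans stand in for `Subtarget.isUsingPCRelativeCalls()`, the "callee is a function global address or external symbol" test, and `callsShareTOCBase(...)`. Any other conditions the real function may check are not modeled.

```cpp
// Toy model of the two checks quoted above; all names are hypothetical.
struct CallSiteModel {
  bool usesPCRelCalls; // stands in for Subtarget.isUsingPCRelativeCalls()
  bool isDirectCall;   // callee is a function global address or external symbol
  bool sharesTOCBase;  // stands in for callsShareTOCBase(...)
};

constexpr bool mayTailCall(const CallSiteModel &CS) {
  // With PC-relative addressing there is no TOC to restore, so both
  // checks are skipped.
  if (CS.usesPCRelCalls)
    return true;
  // First check: an indirect call gives no guarantee about the callee's TOC.
  if (!CS.isDirectCall)
    return false;
  // Second check: a direct call passes only if both sides share a TOC base.
  return CS.sharesTOCBase;
}

// The situation in this issue: indirect call, no PC-relative addressing.
static_assert(!mayTailCall({false, false, true}),
              "indirect call without PC-relative addressing is rejected");
// The same indirect call passes once PC-relative addressing is in use.
static_assert(mayTailCall({true, false, false}),
              "PC-relative addressing bypasses the TOC checks");
```

In this model the musttail call from the reduced test case lands in the first `static_assert` case, which is why the backend cannot honor it.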

@bzEq
Collaborator

bzEq commented Jul 25, 2022

I'm not sure if we can relax this restriction. PPC::TAILBCTR8 is defined, but it does not appear to work.

@chenzheng1030
Collaborator

chenzheng1030 commented Jul 26, 2022

Imagine a case like:

FILE1:

void callee(); // defined in FILE2
void caller();

void caller_caller() {
  // call caller inside
  caller();
}

void caller() {
  // do musttail call to callee
  callee();
}

FILE2:

void callee() {
}

If callee() returns directly to caller_caller(), there is no chance to restore the TOC.

For the indirect call case in this issue, it is very hard to prove that the indirectly called callee() is defined in the same file as caller().
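
The scenario above can be sketched as a toy simulation. This is a deliberately simplified model under stated assumptions: a single `int` stands in for the TOC register, the two constants are made-up TOC bases for FILE1 and FILE2, and every function name is hypothetical. It only illustrates why the restore step disappears with a tail call.

```cpp
// Toy model only: one integer stands in for the TOC register.
constexpr int kTocFile1 = 1; // hypothetical TOC base of FILE1 (caller, caller_caller)
constexpr int kTocFile2 = 2; // hypothetical TOC base of FILE2 (callee)

// callee() runs with FILE2's TOC base and leaves it in the register.
constexpr void callee(int &toc) { toc = kTocFile2; }

// Normal call: caller() regains control after callee() returns,
// so it can restore its own TOC base.
constexpr void callerNormalCall(int &toc) {
  const int saved = toc;
  callee(toc);
  toc = saved; // models the restore emitted after the call instruction
}

// Tail call: callee() returns straight to caller_caller(), so caller()
// never runs again and nothing restores the TOC.
constexpr void callerTailCall(int &toc) { callee(toc); }

// Returns the TOC value caller_caller() observes after calling caller().
constexpr int tocSeenByCallerCaller(bool useTailCall) {
  int toc = kTocFile1;
  if (useTailCall)
    callerTailCall(toc);
  else
    callerNormalCall(toc);
  return toc;
}

static_assert(tocSeenByCallerCaller(false) == kTocFile1,
              "normal call: caller_caller still sees its own TOC");
static_assert(tocSeenByCallerCaller(true) == kTocFile2,
              "tail call: caller_caller is left with callee's TOC");
```

In the model, caller_caller() and caller() share a TOC base because they live in the same file, so the call between them needs no restore; only the cross-file call to callee() does.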

@ChuanqiXu9
Member

For targets that don't support musttail, you could implement something like:

bool WebAssemblyTTIImpl::supportsTailCalls() const {
  return getST()->hasTailCall();
}

@chenzheng1030
Collaborator

OK, that sounds like the right thing for the PPC part. But PPC does support tail calls, so we still need to return true from supportsTailCalls when Subtarget.isUsingPCRelativeCalls() is true, or when callsShareTOCBase() holds for direct calls.

@nemanjai
Member

nemanjai commented Aug 8, 2022

We would not want to turn off tail calls without context as suggested. The only way I see forward would be to do one of the following:

  • Add a query to the TTI that provides context (i.e. the caller and callee) so that PPC can reject indirect calls when PC-Relative addressing is not active
  • Provide some way of guaranteeing that the indirect call is to a function within the same DSO so the PPC back end does not need to reject the tail call
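
As a rough illustration of the two options above, the sketch below contrasts a context-free query with a context-aware one. Everything in it is an assumption made for illustration: `CallContext`, `PPCTargetModel`, `canTailCallHere`, and `shouldMarkMustTail` are hypothetical names, not an existing LLVM interface, and the TOC-sharing requirement for direct calls is left out for brevity.

```cpp
// Hypothetical sketch; none of these names are an existing LLVM API.
struct CallContext {
  bool isIndirect;           // the call goes through a function pointer
  bool calleeKnownInSameDSO; // second option: some guarantee about the callee
};

struct PPCTargetModel {
  bool usesPCRelCalls; // PC-relative addressing is active

  // Context-free query: PPC does support tail calls in general.
  constexpr bool supportsTailCalls() const { return true; }

  // Context-aware query (first option): reject indirect calls unless
  // PC-relative addressing is active or the callee is known to be local.
  // Direct calls would also need a shared TOC base; not modeled here.
  constexpr bool canTailCallHere(const CallContext &C) const {
    if (usesPCRelCalls)
      return true;
    return !C.isIndirect || C.calleeKnownInSameDSO;
  }
};

// What a pass could ask before marking a call musttail.
constexpr bool shouldMarkMustTail(const PPCTargetModel &T,
                                  const CallContext &C) {
  return T.supportsTailCalls() && T.canTailCallHere(C);
}

// The case from this issue: indirect call, no PC-relative addressing.
static_assert(!shouldMarkMustTail(PPCTargetModel{false},
                                  CallContext{true, false}),
              "indirect call is not marked musttail");
// With a same-DSO guarantee the same call could be marked.
static_assert(shouldMarkMustTail(PPCTargetModel{false},
                                 CallContext{true, true}),
              "a same-DSO guarantee allows the tail call");
```

The point of the split is that the context-free answer stays true for PPC, so tail calls are not turned off wholesale, while the per-call query carries the caller and callee information the backend needs.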
