clang crash for arm64 target #386

Closed
ukreator opened this issue May 7, 2017 · 13 comments
@ukreator

ukreator commented May 7, 2017

Description

Clang crashes when compiling the examples for the boost-msm library. This happens only for arm64 targets and only with optimization disabled (e.g. -O0).

Environment Details

Command line that triggers the issue:

dmitry@dmitry8:~$ ./android-ndk-r15-beta1/toolchains/llvm/prebuilt/linux-x86_64/bin/clang++ -c example4.cpp -std=c++14 -o example.o -target arm64-v8a -O0
clang++: error: unable to execute command: Segmentation fault (core dumped)
clang++: error: clang frontend command failed due to signal (use -v to see invocation)
Android clang version 3.8.275480  (based on LLVM 3.8.275480)
Target: arm64-v8a
Thread model: posix
InstalledDir: /home/dmitry/./android-ndk-r15-beta1/toolchains/llvm/prebuilt/linux-x86_64/bin
clang++: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and associated run script.
clang++: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang++: note: diagnostic msg: /tmp/example4-0ec423.cpp
clang++: note: diagnostic msg: /tmp/example4-0ec423.sh
clang++: note: diagnostic msg:

********************

Gist for the preprocessed source and run script: https://gist.github.com/ukreator/5849dd9e0c0600eea5e0ef082bc792e4

Host OS: Linux
NDK version: r15 beta1
Build system: any (used direct clang call for this particular reproducing case)

LLVM site says that no new accounts can be created for bug reporting, so I'm reporting here.

@enh
Contributor

enh commented May 7, 2017

also repros with the newer clang in r15beta2:

clang++: error: unable to execute command: Segmentation fault (core dumped)
clang++: error: clang frontend command failed due to signal (use -v to see invocation)
Android clang version 5.0.300080 (based on LLVM 5.0.300080)
Target: arm64-v8a
Thread model: posix
InstalledDir: /tmp/./android-ndk-r15-beta2-canary/toolchains/llvm/prebuilt/linux-x86_64/bin
clang++: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and associated run script.

-O1 is broken as well as -O0; -O2/-O3 work.

@stephenhines
Collaborator

stephenhines commented May 16, 2017

I only had a non-debug clang handy (but am building a debug version right now). Here is the quick backtrace for the crash with our latest toolchain:

#0  0x0000000002436f04 in ?? ()
#1  0x00000000024355c8 in llvm::RegsForValue::getCopyFromRegs(llvm::SelectionDAG&, llvm::FunctionLoweringInfo&, llvm::SDLoc const&, llvm::SDValue&, llvm::SDValue*, llvm::Value const*) const ()
#2  0x0000000002448c74 in llvm::SelectionDAGBuilder::getValueImpl(llvm::Value const*) ()
#3  0x000000000244873e in llvm::SelectionDAGBuilder::getValue(llvm::Value const*) ()
#4  0x0000000002454356 in llvm::SelectionDAGBuilder::LowerCallTo(llvm::ImmutableCallSite, llvm::SDValue, bool, llvm::BasicBlock const*) ()
#5  0x00000000024437f2 in llvm::SelectionDAGBuilder::visitCall(llvm::CallInst const&) ()
#6  0x0000000002439d47 in llvm::SelectionDAGBuilder::visit(llvm::Instruction const&) ()
#7  0x000000000247fa19 in llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void>, false, true>, llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void>, false, true>, bool&) ()
#8  0x000000000247ed2b in llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) ()
#9  0x000000000247cdb5 in llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) ()
#10 0x00000000026e9fe4 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) ()
#11 0x0000000002d551f3 in llvm::FPPassManager::runOnFunction(llvm::Function&) ()
#12 0x0000000002d553f3 in llvm::FPPassManager::runOnModule(llvm::Module&) ()
#13 0x0000000002d5585f in llvm::legacy::PassManagerImpl::run(llvm::Module&) ()
#14 0x0000000000e67035 in clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream> >) ()
#15 0x0000000000ff198a in clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) ()
#16 0x00000000010b6146 in clang::ParseAST(clang::Sema&, bool, bool) ()
#17 0x0000000000ff00b9 in clang::CodeGenAction::ExecuteAction() ()
#18 0x0000000000ad5270 in clang::FrontendAction::Execute() ()
#19 0x0000000000a98358 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) ()
#20 0x0000000000a5566c in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) ()
#21 0x0000000000a4b224 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) ()
#22 0x0000000000a52fb3 in main ()

@stephenhines
Collaborator

The issue is a type with apparently "zero" size. I'm still digging, but everything is suspect at this point. The AArch64 argument lowering code doesn't handle a zero-sized type properly. It rounds to an 8-byte boundary, but unfortunately, 0 is 8-byte aligned. The function that is breaking is "make_transition_table".

(gdb) call Ty.dump()
RecordType 0xb92d260 'struct boost::sml::v1_1_0::front::transition<struct boost::sml::v1_1_0::front::state<struct boost::sml::v1_1_0::aux::string<char, 't', 'i', 'm', 'e', 'd', ' ', 'w', 'a', 'i', 't'> >, struct boost::sml::v1_1_0::front::state<struct boost::sml::v1_1_0::aux::string<char, 'f', 'i', 'n', ' ', 'w', 'a', 'i', 't', ' ', '2'> >, struct boost::sml::v1_1_0::front::event, struct boost::sml::v1_1_0::front::always, struct boost::sml::v1_1_0::aux::zero_wrapper<class (lambda at example4.cpp:21:23), void> >'
`-ClassTemplateSpecialization 0xb92d178 'transition'
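
To illustrate the rounding problem described above, here is a minimal sketch of the arithmetic. It is not the actual LLVM lowering code; `roundUpTo` and `regsNeeded` are hypothetical names, and the 8-byte register size is the only detail taken from the comment above.

```cpp
#include <cstdint>

// Round `size` up to the next multiple of `align`.
// Assumes `align` is a power of two.
constexpr std::uint64_t roundUpTo(std::uint64_t size, std::uint64_t align) {
    return (size + align - 1) & ~(align - 1);
}

// Hypothetical helper: how many 8-byte registers a value of
// `sizeInBytes` bytes would occupy after rounding.
constexpr std::uint64_t regsNeeded(std::uint64_t sizeInBytes) {
    return roundUpTo(sizeInBytes, 8) / 8;
}

// Any non-zero size is bumped to at least one full register...
static_assert(roundUpTo(1, 8) == 8, "1 byte rounds up to 8");
static_assert(regsNeeded(1) == 1, "1 byte needs one register");

// ...but 0 is already a multiple of 8, so it stays 0 and no
// register is assigned to the value at all.
static_assert(roundUpTo(0, 8) == 0, "0 is already 8-byte aligned");
static_assert(regsNeeded(0) == 0, "a zero-sized value gets no register");
```

Code that later assumes every non-empty record occupies at least one register would then have nothing to read from, which is consistent with the crash in `getCopyFromRegs` in the backtrace.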

@stephenhines
Collaborator

Digging a bit deeper here, the size of the record is noted as zero, yet isEmptyRecord() is returning false. In the case of -O2/-O3, the assertion still fires in my build, so the build just gets lucky when it is in a higher optimization mode. I will look more closely at whether isEmptyRecord() is wrong for this kind of type. I will also continue to reduce the testcase, in case I need to get help from others upstream.

@stephenhines
Collaborator

https://bugs.llvm.org//show_bug.cgi?id=33191 is the bug I just filed upstream. I have some ideas how to fix this for AArch64, but want to confirm that there isn't something more terrible to be fixed here for all targets. I should note that I managed to get the test case down to:

struct s {
  char _[0];
};
struct t {
  struct s s1;
};

auto f() {
  return t();
}

Thanks for this very interesting bug. Sorry it has taken me a while to work through some of the crazier stuff here. Hopefully this will be fixed soon and available in the next NDK.
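
The reduced test case depends on a zero-length array member, which is a GNU extension rather than standard C++. A minimal sketch, assuming a GCC- or Clang-compatible compiler (other compilers may reject it or report different sizes), of how such a record differs from an ordinary empty struct; the type and function names here are hypothetical:

```cpp
#include <cstddef>

// An ordinary empty struct: standard C++ gives it a non-zero size
// so that distinct objects have distinct addresses.
struct Empty {};

// A struct whose only member is a zero-length array (GNU extension).
// It has a field, so it is not an empty struct in the usual sense,
// yet GCC and Clang report its size as 0.
struct ZeroLen {
    char _[0];
};

constexpr std::size_t sizeOfEmpty()   { return sizeof(Empty); }
constexpr std::size_t sizeOfZeroLen() { return sizeof(ZeroLen); }
```

With GCC or Clang, `sizeOfEmpty()` is 1 and `sizeOfZeroLen()` is 0, which matches the situation described earlier in the thread: a record whose size is zero even though it is not treated as an empty record.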

@ukreator
Copy link
Author

@stephenhines thanks for looking into this.
Do you know if it's safe enough to continue using the code that causes this with high compiler optimization levels? We can live without -O0, but it's not fine if the compiler generates invalid machine code. So far we haven't noticed any issues with the final binary.

@stephenhines
Collaborator

I think that if you were going to see problems as a result of this code, they would show at compile-time, since there is an assertion that fires (in our assertion build), or you should end up with a segfault in other cases, since Clang won't be able to pick a proper size register to hold these data types. Since these zero-byte types aren't escaping as public APIs, the optimizer mostly inlines/removes them as parameters and/or return types. If you do have one of these types remain in a public API, there might be a small chance for a runtime bug, but it seems pretty low (and I doubt that you are exporting APIs that look like the code you originally sent).

@ukreator
Author

yeah, all those boost::sml pieces are contained in cpp files with internal linkage. Thx @stephenhines

@DanAlbert DanAlbert added this to the r16 milestone May 30, 2017
@DanAlbert
Member

@stephenhines: is r16 a reasonable goal for this?

@stephenhines
Collaborator

This was already fixed in upstream LLVM with r302313. It should definitely be in r16, but we don't want to delay things further for r15, so it won't make it there.

@DanAlbert DanAlbert modified the milestones: r16, r17 Oct 3, 2017
@DanAlbert
Member

Unfortunately Android hasn't finished with the Clang update yet, and this has been holding r16 up long enough. This is going to have to wait for r17.

@codernavi18

In that case, is there a way to pick up that patch and get my AOSP build to compile?
I am also facing a similar error:

clang++.real: error: unable to execute command: Segmentation fault (core dumped)
clang++.real: error: clang frontend command failed due to signal (use -v to see invocation)
Android clang version 5.0.300080 (based on LLVM 5.0.300080)
Target: x86_64--linux-android
Thread model: posix
InstalledDir: prebuilts/clang/host/linux-x86/clang-4053586/bin

@rprichard
Collaborator

The r302313 Clang fix is in r17-beta1 (https://android.googlesource.com/toolchain/clang/+/dc4dea9dda74c9110ccd761847520a075a68b20e), and the test case doesn't segfault anymore. (I was able to reproduce the segfault in r15c and r16b.)
