Skip to content

lld:MachO crashes when linking with same symbol from different architectures' archives #56386

Open
@PRESIDENT810

Description

@PRESIDENT810

How to produce:
foo.cpp:

void foo(){
  return;
}

bar.cpp

extern void foo();

int bar(){
  foo();
  return 0;
}

main.cpp

extern int bar();

int main() {
  bar();
  return 0;
}

Build the executable with the following command:

clang -arch arm64 foo.cpp -c -o foo_arm64.o
ar rcs libfoo1.a foo_arm64.o
clang -arch x86_64 foo.cpp -c -o foo_x86_64.o
ar rcs libfoo2.a foo_x86_64.o
clang -arch x86_64 bar.cpp -c -o bar.o
ar rcs libbar.a bar.o
clang -arch x86_64 main.cpp libfoo1.a libfoo2.a libbar.a -fuse-ld=/Users/bytedance/github/llvm-project/build/bin/ld64.lld

Then ld64.lld gives the following crash log:

ld64.lld: warning: libfoo1.a(foo_arm64.o) has architecture arm64 which is incompatible with target architecture x86_64
cannot be TLV
UNREACHABLE executed at /Users/bytedance/github/llvm-project/lld/MachO/Symbols.h:68!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /Users/bytedance/github/llvm-project/build/bin/ld64.lld -demangle -lto_library /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/libLTO.dylib -no_deduplicate -dynamic -arch x86_64 -platform_version macos 12.0.0 12.3 -syslibroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -o a.out -L/usr/local/lib /var/folders/mc/j3dq2z114_l_cpcpjvnlywnr0000gp/T/main-122511.o libfoo1.a libfoo2.a libbar.a -lSystem /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/13.1.6/lib/darwin/libclang_rt.osx.a
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  lld                      0x0000000104c1127d llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 61
1  lld                      0x0000000104c117fb PrintStackTraceSignalHandler(void*) + 27
2  lld                      0x0000000104c0f616 llvm::sys::RunSignalHandlers() + 134
3  lld                      0x0000000104c12e9f SignalHandler(int) + 223
4  libsystem_platform.dylib 0x00007ff80ba66dfd _sigtramp + 29
5  libsystem_platform.dylib 0x00007ff7bb5ab4a0 _sigtramp + 18446744072362411712
6  libsystem_c.dylib        0x00007ff80b99cd24 abort + 123
7  lld                      0x0000000104a49590 llvm::install_out_of_memory_new_handler() + 0
8  lld                      0x00000001053432f4 lld::macho::Symbol::isTlv() const + 36
9  lld                      0x00000001053262bb lld::macho::validateSymbolRelocation(lld::macho::Symbol const*, lld::macho::InputSection const*, lld::macho::Reloc const&) + 123
10 lld                      0x00000001053a2ce8 (anonymous namespace)::Writer::scanRelocations() + 408
11 lld                      0x0000000105398b01 void (anonymous namespace)::Writer::run<lld::macho::LP64>() + 113
12 lld                      0x0000000105398a31 void lld::macho::writeResult<lld::macho::LP64>() + 49
13 lld                      0x0000000105257fc4 lld::macho::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) + 16596
14 lld                      0x00000001049565a9 lldMain(int, char const**, llvm::raw_ostream&, llvm::raw_ostream&, bool) + 153
15 lld                      0x0000000104956314 main + 180
16 dyld                     0x000000012480851e start + 462
clang: error: unable to execute command: Abort trap: 6
clang: error: linker command failed due to signal (use -v to see invocation)

The problem occurs because by the linking order, lld first loads libfoo1.a, which is an arm64 archive. At this time lld only reads the header of libfoo1.a, and add void foo() as a lazy archive symbol to lld's symbol table. Then lld loads libfoo2.a, which is a x86_64 archive, and it contains the void foo() symbol that we want, because we are linking a x86_64 executable. However, lld finds that its symbol table already has void foo() from the arm64 archive, so it will not add this x86_64 one to its symbol table.

When lld tries to solve the undefined symbol int bar() in main.o, it loads bar.o from libbar.a, and realizes that there is an undefined symbol void foo() in bar.o, so it tries to load void foo() from foo_arm64.o from libfoo1.a according to info from its symbol table. At this time, lld finds that foo_arm64's architecture is incompatible with our target architecture, so it stops parsing foo_arm64.o. However, void foo() still remains as a lazy archive symbol in lld's symbol table. Therefore, in the relocation stage, lld goes into lld::macho::validateSymbolRelocation:

if (relocAttrs.hasAttr(RelocAttrBits::TLV) != sym->isTlv())
    error(message(Twine("requires that symbol ") + sym->getName() + " " +
                  (sym->isTlv() ? "not " : "") + "be thread-local"));

Because lld cannot determine whether a lazy archive symbol is a TLV (and there is no method override for LazyArchive symbol), it goes to the virtual method in the parent class (Symbol), which gives us the llvm_unreachable error.

I think we can fix this by making ArchiveFile::addLazySymbols check the architecture of the archive it reads. If the architecture doesn't match our target architecture, we shouldn't add lazy archive symbols from its header.

I'm not sure if this is the right way to fix it. Need some suggestions here.

By the way, if you switch the order of libfoo1.a and libfoo2.a, the linking will succeed (lld will load the x86_64 symbol first). I think we should make it works regardless of the linking order we use, and ignore symbols from archives with incompatible architectures?

Metadata

Metadata

Assignees

No one assigned

    Labels

    crashPrefer [crash-on-valid] or [crash-on-invalid]lld:MachO

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions