lld:MachO crashes when linking with same symbol from different architectures' archives #56386
Comments
Seems reasonable; it looks like ld64 is also filtering out archive members from non-matching targets. Though from what I understand, architecture information isn't stored in the archive header, so we have to read the archive member itself to figure this out. I'm slightly concerned about the perf cost of the additional I/O this involves, but it looks like LLD-ELF also pages in archive member headers when loading archives, so I guess it might be okay. Some benchmarks would be nice though.
I can come up with a dumb workaround. Before loading an archive's symbol table, we can read the memory buffer of its first member and check that member's header to see whether its architecture matches our target. Since an archive cannot consist of object files of different architectures, checking one child's header tells us the architecture of the whole archive. I guess this only adds the overhead of reading one more object file header per archive? I'm not sure whether that cost could be serious, though it is a little ugly. Is there a more clever or elegant way to do this?
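A minimal sketch of the header check described above, written without any LLVM APIs. It assumes the first member is a thin, little-endian, 64-bit Mach-O object (no fat or bitcode members). The names `kMagic64`, `kCpuX86_64`, `kCpuArm64`, `readLE32`, and `firstMemberMatchesTarget` are hypothetical and are not lld code; the numeric values are the standard `MH_MAGIC_64`, `CPU_TYPE_X86_64`, and `CPU_TYPE_ARM64` constants.

```cpp
#include <cstddef>
#include <cstdint>

// Standard Mach-O values, renamed here to avoid clashing with system macros.
constexpr uint32_t kMagic64 = 0xfeedfacf;   // MH_MAGIC_64
constexpr uint32_t kCpuX86_64 = 0x01000007; // CPU_TYPE_X86_64
constexpr uint32_t kCpuArm64 = 0x0100000c;  // CPU_TYPE_ARM64

// Decode a little-endian 32-bit value regardless of host byte order.
inline uint32_t readLE32(const uint8_t *p) {
  return uint32_t(p[0]) | (uint32_t(p[1]) << 8) | (uint32_t(p[2]) << 16) |
         (uint32_t(p[3]) << 24);
}

// Hypothetical helper: `data` points at the start of the first archive
// member. mach_header_64 begins with `magic` followed by `cputype`, so
// the first 8 bytes are enough to decide.
bool firstMemberMatchesTarget(const uint8_t *data, size_t size,
                              uint32_t targetCpu) {
  if (size < 8)
    return false; // too short to hold magic + cputype
  return readLE32(data) == kMagic64 && readLE32(data + 4) == targetCpu;
}
```

Only the first eight bytes of one member are inspected per archive, which is the extra I/O the comment above is worried about.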
Not sure exactly what you're referring to, but it may have been part of a recent change to handle archives the same way as |
Ah yeah I was just looking at the code, not the diff that implemented it. Makes sense though. I guess implementing that is a bit out of scope for this issue though. We can fix this first, measure the perf impact, and then implement that later on if necessary to speed things up.
I was playing with this yesterday and it seems like while cctools' |
If cctools-ar disallows that and llvm-ar allows it, that sounds like an llvm-ar bug, though that's also a different issue, of course.
This fixes llvm#56386, but it relies on the assumption that an archive cannot contain object files of multiple architectures. In practice, however, llvm-ar can produce such archives, which should be considered a bug because cctools' ar would refuse to do so.
How to produce:
foo.cpp:
bar.cpp:
main.cpp:
Build the executable with the following command:
Then ld64.lld gives the following crash log:
The problem occurs because of the linking order. lld first loads libfoo1.a, which is an arm64 archive. At this point lld only reads the header of libfoo1.a and adds void foo() as a lazy archive symbol to its symbol table. Then lld loads libfoo2.a, which is an x86_64 archive and contains the void foo() symbol that we want, because we are linking an x86_64 executable. However, lld finds that its symbol table already has void foo() from the arm64 archive, so it does not add the x86_64 one.
When lld tries to resolve the undefined symbol int bar() in main.o, it loads bar.o from libbar.a and finds an undefined symbol void foo() in bar.o, so it tries to load void foo() from foo_arm64.o in libfoo1.a, as recorded in its symbol table. At this point lld finds that foo_arm64.o's architecture is incompatible with our target architecture, so it stops parsing foo_arm64.o. However, void foo() still remains as a lazy archive symbol in lld's symbol table. Therefore, in the relocation stage, lld goes into lld::macho::validateSymbolRelocation:
Because lld cannot determine whether a lazy archive symbol is a TLV (and there is no method override for the LazyArchive symbol), it falls through to the virtual method in the parent class (Symbol), which gives us the llvm_unreachable error.
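The fall-through to the base class can be illustrated with a toy model. This is a simplified sketch, not lld's actual class hierarchy: the class names mirror the ones mentioned above, `queryTlv` is a hypothetical caller, and an exception stands in for llvm_unreachable so the behavior can be observed.

```cpp
#include <stdexcept>

// Toy stand-in for the parent class. The base method has no sensible
// answer, so it fails loudly (the real code uses llvm_unreachable).
struct Symbol {
  virtual ~Symbol() = default;
  virtual bool isTlv() const {
    throw std::logic_error("cannot be a TLV");
  }
};

// A resolved symbol knows whether it is thread-local.
struct Defined : Symbol {
  bool tlv = false;
  bool isTlv() const override { return tlv; }
};

// A lazy archive symbol provides no override, so a call to isTlv()
// dispatches to Symbol::isTlv() and fails.
struct LazyArchive : Symbol {};

// Hypothetical caller standing in for the relocation validation step.
bool queryTlv(const Symbol &sym) { return sym.isTlv(); }
```

Calling `queryTlv` on a `Defined` returns its flag, while calling it on a `LazyArchive` throws, which corresponds to the crash when a stale lazy symbol survives until the relocation stage.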
I think we can fix this by making ArchiveFile::addLazySymbols check the architecture of the archive it reads. If the architecture doesn't match our target architecture, we shouldn't add lazy archive symbols from its header.
I'm not sure if this is the right way to fix it, and would appreciate suggestions.
By the way, if you switch the order of libfoo1.a and libfoo2.a, the link succeeds (lld loads the x86_64 symbol first). I think we should make it work regardless of the link order we use, and ignore symbols from archives with incompatible architectures?
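The order dependence, and the effect of filtering by architecture, can be sketched with a toy symbol table. This is an illustration under stated assumptions, not lld's implementation: `Arch`, `Archive`, and `buildLazyTable` are hypothetical, each archive is assumed to hold a single architecture, and "first registration wins" models the behavior described above.

```cpp
#include <map>
#include <string>
#include <vector>

enum class Arch { X86_64, Arm64 };

struct Archive {
  std::string name;
  Arch arch;                        // assumes one architecture per archive
  std::vector<std::string> symbols; // names from the archive symbol table
};

// Maps each lazy symbol to the archive that would be asked to provide it.
// With filterByArch set, archives that do not match the target are skipped
// before any of their symbols are registered.
std::map<std::string, std::string>
buildLazyTable(const std::vector<Archive> &order, Arch target,
               bool filterByArch) {
  std::map<std::string, std::string> table;
  for (const Archive &a : order) {
    if (filterByArch && a.arch != target)
      continue;
    for (const std::string &s : a.symbols)
      table.emplace(s, a.name); // emplace keeps an existing entry
  }
  return table;
}
```

Without filtering, loading libfoo1.a (arm64) before libfoo2.a (x86_64) leaves foo bound to libfoo1.a when targeting x86_64. With filtering, foo is bound to libfoo2.a in either order.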