Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support DT_RELR using standard -z pack-relative-relocs cmdline flag #653

Closed
rawoul opened this issue Aug 21, 2022 · 18 comments
Closed

Support DT_RELR using standard -z pack-relative-relocs cmdline flag #653

rawoul opened this issue Aug 21, 2022 · 18 comments

Comments

@rawoul
Copy link

rawoul commented Aug 21, 2022

In bfd and gold, packed dynamic relocations are toggled with the -z pack-relative-relocs and -z nopack-relative-relocs cmd line flags, while mold uses -pack-dyn-relocs=none|relr. For compatibility it would be simpler if mold aligned to the other linkers, can the standard flag also be supported, or used instead of -pack-dyn-relocs ?

@rui314
Copy link
Owner

rui314 commented Aug 21, 2022

If someone really has a trouble with these flags, I'll consider, but I don't think we should add a new flag for a hypothetical use. It's somewhat an evolutionary process; if a flag doesn't become popular enough, it won't be supported by many implementations, and its use will diminish. So I don't think we should try to support all flags beforehand.

@rawoul
Copy link
Author

rawoul commented Aug 22, 2022

When adding -z pack-relative-relocs bfd also adds a version dependency on GLIBC_ABI_DT_RELR if libc.so is in DT_NEEDED and there's a dependency on version GLIBC_2.* (meaning the object depends on glibc). Without this the glibc dynamic loader will refuse to load a shared object using DT_RELR with the following error:

DT_RELR without GLIBC_ABI_DT_RELR dependency

That seems quite complicated, so I agree with you that waiting is probably better... For context, the same issue has been raised before in the lld project: llvm/llvm-project#53775.

@rui314
Copy link
Owner

rui314 commented Aug 22, 2022

Thank you for sharing the link! I agree with you that what GNU ld does seems quite complicated.

@glandium
Copy link

glandium commented Aug 8, 2023

b365d6b added -z pack-relative-relocs without GLIBC_ABI_DT_RELR, which leads to the error mentioned in #653 (comment) ...

@rui314 rui314 closed this as completed in f467ad1 Aug 9, 2023
@maxyvisser
Copy link

I was experiencing the error mentioned in #653 (comment) in the 2.0.0 release build when building firefox 116. This issue appears to have been fixed by f467ad1, but a new error pops up:

20:10.56 mold: fatal: /tmp/lto-llvm-5abe96.o: REL-type relocation table is not supported for this target
20:10.58 clang-15: error: linker command failed with exit code 1 (use -v to see invocation)
20:10.59 make[4]: *** [/compile/makepkg/librewolf/src/librewolf-116.0-1/config/rules.mk:532: libmozsqlite3.so] Error 1
20:10.59 make[3]: *** [/compile/makepkg/librewolf/src/librewolf-116.0-1/config/recurse.mk:72: config/external/sqlite/target] Error 2

Could this possibly be related?

PS: here is a link to a report on the same bug on firefox's issue tracker: https://bugzilla.mozilla.org/show_bug.cgi?id=1847697.

@rui314
Copy link
Owner

rui314 commented Aug 10, 2023

No, that's different. What target did you get this error for? If it's for x86-64 or ARM64, it is actually odd if an object file contains a REL-type relocation section instead of RELA-type one.

@maxyvisser
Copy link

This is on x86-64, a ryzen 5 5600G to be exact. Do you want me to create a new issue, since it is unrelated?

@ambasta
Copy link

ambasta commented Aug 30, 2023

Can reproduce this issue on FF-117 as well

171:11.18 mold: fatal: /var/tmp/portage/www-client/firefox-117.0/temp/lto-llvm-715b02.o: REL-type relocation table is not supported for this target

@rui314
Copy link
Owner

rui314 commented Aug 30, 2023

It looks like LLVM generates REL-type relocations instead of RELA-type ones for LLVM. It's very likely that it's a bug in LLVM. Can you report it to LLVM?

@ambasta
Copy link

ambasta commented Aug 30, 2023

Sure, in the process of rebuilding manually (w/o ebuild) to isolate and confirm the issue, Will file one once I have confirmation

@ambasta
Copy link

ambasta commented Oct 13, 2023

Is this really a bug w/ LLVM?

Compiling w/ LLVM succeeds

clang version 17.0.2
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm/17/bin
Configuration file: /etc/clang/x86_64-pc-linux-gnu-clang.cfg
System configuration file directory: /etc/clang
Selected GCC installation: /usr/lib/gcc/x86_64-pc-linux-gnu/13
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "/usr/bin/x86_64-pc-linux-gnu-ld.bfd" --hash-style=gnu --eh-frame-hdr -m elf_x86_64 -shared -o libmozsqlite3.so /usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../lib64/crti.o /usr/lib/gcc/x86_64-pc-linux-gnu/13/crtbeginS.o -L/usr/lib/gcc/x86_64-pc-linux-gnu/13 -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/lib -L/lib -L/usr/lib -plugin /usr/lib/llvm/17/bin/../lib64/LLVMgold.so -plugin-opt=mcpu=znver1 -plugin-opt=O3 -plugin-opt=thinlto -plugin-opt=-function-sections=1 -plugin-opt=-data-sections=1 -z relro -z defs --gc-sections -h libmozsqlite3.so /var/tmp/portage/www-client/firefox-118.0.2/work/firefox_build/config/external/sqlite/libmozsqlite3_so.list -plugin-opt=-import-instr-limit=10 -plugin-opt=-import-hot-multiplier=30 -lpthread -O1 --as-needed --compress-debug-sections=zlib -rpath=/usr/lib64/firefox --enable-new-dtags -z pack-relative-relocs -z noexecstack -z text -z relro -z nocopyreloc -Bsymbolic-functions -rpath-link /var/tmp/portage/www-client/firefox-118.0.2/work/firefox_build/dist/bin -rpath-link /usr/lib --version-script libmozsqlite3.so.symbols -lm -lgcc --as-needed -lgcc_s --no-as-needed -lpthread -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-pc-linux-gnu/13/crtendS.o /usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../lib64/crtn.o

But on adding -fuse-ld=mold

clang version 17.0.2
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm/17/bin
Configuration file: /etc/clang/x86_64-pc-linux-gnu-clang.cfg
System configuration file directory: /etc/clang
Selected GCC installation: /usr/lib/gcc/x86_64-pc-linux-gnu/13
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "/usr/bin/ld.mold" --hash-style=gnu --eh-frame-hdr -m elf_x86_64 -shared -o libmozsqlite3.so /usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../lib64/crti.o /usr/lib/gcc/x86_64-pc-linux-gnu/13/crtbeginS.o -L/usr/lib/gcc/x86_64-pc-linux-gnu/13 -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/lib -L/lib -L/usr/lib -plugin /usr/lib/llvm/17/bin/../lib64/LLVMgold.so -plugin-opt=mcpu=znver1 -plugin-opt=O3 -plugin-opt=thinlto -plugin-opt=-function-sections=1 -plugin-opt=-data-sections=1 -z relro -z defs --gc-sections -h libmozsqlite3.so /var/tmp/portage/www-client/firefox-118.0.2/work/firefox_build/config/external/sqlite/libmozsqlite3_so.list -plugin-opt=-import-instr-limit=10 -plugin-opt=-import-hot-multiplier=30 -lpthread -O1 --as-needed --compress-debug-sections=zlib -rpath=/usr/lib64/firefox --enable-new-dtags -z pack-relative-relocs -z noexecstack -z text -z relro -z nocopyreloc -Bsymbolic-functions -rpath-link /var/tmp/portage/www-client/firefox-118.0.2/work/firefox_build/dist/bin -rpath-link /usr/lib --version-script libmozsqlite3.so.symbols -lm -lgcc --as-needed -lgcc_s --no-as-needed -lpthread -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-pc-linux-gnu/13/crtendS.o /usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../lib64/crtn.o
mold: fatal: /tmp/lto-llvm-be7713.o: REL-type relocation table is not supported for this target

@rui314
Copy link
Owner

rui314 commented Oct 13, 2023

Yes, it's a bug in LLVM, please report it to LLVM. x86-64 is defined to use RELA-type relocations by its psABI, so creating a REL-type relocation type is illegal even if other linkers happen to not complain.

@ambasta
Copy link

ambasta commented Oct 28, 2023

@rui314 Some additional guidance please, looking at https://github.com/llvm/llvm-project/blob/631e2911ea298bc12564df8acd16bba89790426a/lld/test/ELF/relocation-none.test#L38

## Both REL and RELA are supported. .rel.llvm.call-graph-profile uses REL even
## if the prevailing format is RELA.

This is exactly where mold is failing, and seems to be documented behavior

@rui314
Copy link
Owner

rui314 commented Oct 28, 2023

Even if it's documented, it still violates the spec. As quoted from x86-64 psABI p.72, "The AMD64 LP64 ABI architecture uses only Elf64_Rela relocation entries with explicit addends." While they have the option not to conform to the standard, doing so would lead to compatibility issues for obvious reasons.

@ambasta
Copy link

ambasta commented Oct 28, 2023

Yes, but that would be unlikely. Even the kernel has patches to ignore modpost checks on call-graph-profile

https://lore.kernel.org/lkml/CAK7LNARVi1sfBjv5a5OoQWPEeM-6bFuwPJE+i32NC=wdum-AKw@mail.gmail.com/t/#mcd82d7d511dcb4fe8939e21127fafda08f4f732e

And here's the rationale behind it

https://reviews.llvm.org/D104080

@rui314
Copy link
Owner

rui314 commented Oct 28, 2023

Alright, how about we simply ignore the REL relocation tables in psABIs where RELA is mandated? It seems that by doing so, we'll resolve the issue.

@ambasta
Copy link

ambasta commented Oct 28, 2023

I genuinely do not understand any of this enough to comment. This issue was my first attempt at even trying to go through a compiler/linker source, so I'm not sure what the correct approach is here.

@rui314
Copy link
Owner

rui314 commented Oct 28, 2023

I believe ignoring a relocation table with an incompatible type should be fine because in this case all relocations in the incompatible table is R_*_NONE type, which is a no-op relocation. mold used to silently ignore incompatible relocation tables, so it's not new.

rui314 added a commit that referenced this issue Oct 28, 2023
Specifically, .llvm.call-graph-profile is always REL even on x86-64.
We can safely ignore the relocation table because it contains only
R_NONE relocations.

Fixes #653
VitalyAnkh pushed a commit to VitalyAnkh/mold that referenced this issue Dec 23, 2023
VitalyAnkh pushed a commit to VitalyAnkh/mold that referenced this issue Dec 23, 2023
Specifically, .llvm.call-graph-profile is always REL even on x86-64.
We can safely ignore the relocation table because it contains only
R_NONE relocations.

Fixes rui314#653
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants