Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustc should specify SHT_INIT_ARRAY for its init_array section #92181

Closed
adetaylor opened this issue Dec 21, 2021 · 8 comments
Closed

rustc should specify SHT_INIT_ARRAY for its init_array section #92181

adetaylor opened this issue Dec 21, 2021 · 8 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.

Comments

@adetaylor
Copy link
Contributor

This recent commit to lld broke Rust programs which used that linker. That's because it discards sections entitled .init_array unless they are labeled with the correct section type, SHT_INIT_ARRAY.

Rust programs continued to link and run successfully, but couldn't see their command-line arguments (std::env::args would return nothing) because the Rust standard library relies on usage of .init_array to retrieve those arguments. This section was SHT_PROGBITS rather than SHT_INIT_ARRAY so was discarded.

This lld change has now been undone but my understanding from the LLVM folks (specifically @MaskRay) is that this is temporary.

To fix this to the satisfaction of the LLVM folks, rustc would need to set the section type correctly using llvm::MCContext::getELFSection. Unfortunately, that isn't currently exposed via LLVM's C API (as far as I can tell), so an LLVM change would first be needed to expose it.

After that, we'd need to decide whether to expose the section type ID via something like a new #[link_section_type] attribute or whether we'd simply hard-code the section type ID for certain well-known sections such as this.

NB I am not personally an LLVM person so forgive me if I've misunderstood the layering of LLVM APIs or similar. I just did a bit of the investigation here to understand why my command-line cupboard was bare.

Related issues:

@MaskRay
Copy link
Contributor

MaskRay commented Dec 21, 2021

If it is possible to set Elf64_Shdr::sh_type, Elf64_Shdr::sh_flags should be supported as well. Setting and clearing SHF_ALLOC can be useful.

@nikic
Copy link
Contributor

nikic commented Dec 22, 2021

I'm a bit unclear on how we're supposed to address this in practical terms. As far as I know LLVM still does not support specifying section type or section flags in IR, so we wouldn't be able to expose something like #[link_section_type]. Is the suggestion here that we should replace this symbol with inline assembly as well?

@nikic nikic added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Dec 22, 2021
@nikic
Copy link
Contributor

nikic commented Dec 22, 2021

Regarding llvm::MCContext::getELFSection(), I don't really see how it is useful here, as it works on a completely different level. For LTO purposes if nothing else this information must be embedded in bitcode, so the ability to adjust the section flags in the deep end of the backend isn't really helpful.

Should LLVM be automatically setting @init_array section type for .init_array sections, maybe?

@MaskRay
Copy link
Contributor

MaskRay commented Dec 22, 2021

LLVM IR provides llvm.global_ctors to create .init_array sections. Such sections correctly have @init_array %init_array which will get SHT_INIT_ARRAY.

In C, if you write __attribute__((section(".init_array"))) void foo() {}, Clang will translate it to LLVM IR section ".init_array". The section will be SHT_INIT_ARRAY as well. Though I'd say this is a less supported way (it somewhat happens to work).

How does the Rust component in question create .init_array?

@nikic
Copy link
Contributor

nikic commented Dec 27, 2021

@MaskRay The relevant difference is that rustc includes a priority in the section name. This can be seen with clang as well: https://c.godbolt.org/z/6PY87sW3v .init_array results in an @init_array section, while .init_array.00099 results in @progbits.

Changing https://github.com/llvm/llvm-project/blob/ba89c6d5056975c046275ce9614eb96eb7ec01f4/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp#L488 to use startswith instead of an equality comparison gives .init_array.00099 consistent treatment. Would that be a reasonable change to make?

@MaskRay
Copy link
Contributor

MaskRay commented Dec 28, 2021

For llvm.global_ctors, the code path is: AsmPrinter::emitXXStructorList -> TargetLoweringObjectFileELF::getStaticCtorSection -> getStaticStructorSection, and .init_array.00000 gets SHT_INIT_ARRAY as well.

Changing https://github.com/llvm/llvm-project/blob/ba89c6d5056975c046275ce9614eb96eb7ec01f4/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp#L488 to use startswith instead of an equality comparison gives .init_array.00099 consistent treatment. Would that be a reasonable change to make?

For C/C++ users, __attribute__((constructor(3))) should be used. I'd call __attribute__((section(".init_array.00003"))) unsupported and not guaranteed to work (say, if the compiler DCE it, I think that is totally justified).

Changing the code path may make __attribute__((section(".init_array.00003"))) use SHT_INIT_ARRAY, but it is a bit unfortunate to introduce more changes which are duplicated elsewhere (MC)... But if the supported way is difficult to use from a library's perspective, hacking .init_array.* for now seems fine.

nikic added a commit to llvm/llvm-project that referenced this issue Jan 4, 2022
Currently, the code in TargetLoweringObjectFile only assigns
@init_array section type to plain .init_array sections, but not
prioritized sections like .init_array.00001.

This is inconsistent with the interpretation in the AsmParser
(see https://github.com/llvm/llvm-project/blob/791523bae6153b13bb41ba05c9fc89e502cc4a1a/llvm/lib/MC/MCParser/ELFAsmParser.cpp#L621-L632)
and upcoming expectations in LLD
(see rust-lang/rust#92181 for context).

This patch assigns @init_array section type to all sections with an
.init_array prefix. The same is done for .fini_array and
.preinit_array as well. With that, the logic matches the AsmParser.

Differential Revision: https://reviews.llvm.org/D116528
@nikic
Copy link
Contributor

nikic commented Jan 4, 2022

This should be resolved now by llvm/llvm-project@de92a13 (temporary workaround on the LLD side) and llvm/llvm-project@4ef560e (assigning correct section type for rustc's way of declaring it), so closing this.

@nikic nikic closed this as completed Jan 4, 2022
@MaskRay
Copy link
Contributor

MaskRay commented Feb 4, 2022

The original issue has been resolved (and will allow ld.lld to remove the workaround after a while) but the state is still not great. The "SHT_INIT_ARRAY' from a special section name does not meet ELF spirits: part of the reason .ctors/.dtors were obsoleted by SHT_INIT_ARRAY/SHT_FINI_ARRAY. GNU as does use special section names like .init_array.* .note* and llvm-project has to port the behaviors.

I understand that LLVM IR does not provide a type feature beside section, that means relying on the behavior is a bit of abuse.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
Projects
None yet
Development

No branches or pull requests

3 participants