Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support #[target_feature(enable = ...)] on #[naked] functions #137720

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

folkertdev
Copy link
Contributor

fixes #136280

Instructions that are part of a target feature require a special directive on some targets. This PR adds those for the most common targets.

This is very WIP, but I'm hoping to collect some feedback on what is (not) supported and how to report that to users.

r? @ghost

cc @taiki-e @Amanieu

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 27, 2025
Comment on lines +109 to +110
// FIXME share code with `create_object_file`
fn parse_architecture(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this there some standard way of getting the architecture that I'm missing? I'd like the rest of the code to operate on an enum rather than doing a bunch of manual string matching.

@folkertdev folkertdev force-pushed the naked-function-target-feature branch from 89d44c1 to 4626bcc Compare February 27, 2025 11:54
// no action is needed, all instructions are accepted regardless of target feature
}

Architecture::Aarch64 | Architecture::Aarch64_Ilp32 | Architecture::Arm => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for Arm (AArch32), some target features like v8, neon, don't seem to work with .arch_extension: https://godbolt.org/z/GTK3vqfrh

As for v* features, they work with .arch armv*: https://godbolt.org/z/7oj9njWG6

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the docs list only three features for .arch_extension on aarch32: crc, fp16 and ras. I could see how the architectures are special, but even dotprod does not seem to work?!

also for the arch: because there is no push/pop, we can't get back to the original arch if we set it, right? I did figure out that you can .fpu neon for neon, but like arch, there is no way to reset the fpu once you set it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some further work categorizing the features that rust/llvm supports

https://godbolt.org/z/46s7zxvfo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, good news, I think there is a way forward here.

global_asm! being stateful (as mentioned here #137720 (comment)) cannot be relied on by users because of this rule in the reference

https://doc.rust-lang.org/reference/inline-assembly.html#r-asm.rules.not-successive

Therefore, we just need to make sure that at the end of a naked function, the settings are as configured via the global target features. It's less elegant that the push/pop mechanism , and for arm that will still be a bit of a mess, but it seems workable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, after looking at it a bit, I still think it's possible, but hard. I have some partial code here https://gist.github.com/folkertdev/fe99874c466e598d0fb2dadf13b91b6f but I think this really needs someone more familiar with the arm target features to do a good job on.

(or, like, can we get LLVM to add a push/pop mechanism for us?)

So, in combination with target features being unstable on arm anyway, I would prefer to skip that logic for now, keep this PR straightforward, and get target features working on targets with stable asm support first.

@folkertdev
Copy link
Contributor Author

folkertdev commented Mar 6, 2025

I think this is as far as I'm going to get on my own (well with a bunch of help from taiki too, thanks!). Questions I don't have a good answer to:

  • the conversion of the target string to an Architecture: it seems kind of ad-hoc and easy to miss (new) targets
  • what do we do with target features that we can't deal with (e.g. the simd128 for wasm): ignore silently, warn, error?
  • what do we do with targets where we are not sure we can restore to the prior state (e.g. arm8 is likely in this category, some of the other more obscure ones too): try our best? just error out?

r? @Amanieu

@folkertdev folkertdev marked this pull request as ready for review March 6, 2025 13:45
Comment on lines +191 to +197
// only enable/disable a feature if it is not already globally enabled.
// so that we don't infuence subsequent asm blocks
if !tcx.sess.unstable_target_features.contains(&feature.name) {
writeln!(begin, ".arch_extension {}", feature.name).unwrap();

writeln!(end, ".arch_extension no{}", feature.name).unwrap();
}
Copy link
Member

@taiki-e taiki-e Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't actually appear to work in some cases because due to a bug in LLVM (#113221), even if the feature is enabled on the Rust side, it is not passed to the assembler...

However, given that the bug appears conditionally, I think it is also wrong to always set both .arch_extension {feature} and .arch_extension no{feature}.

A bit odd, but setting .arch_extension {feature} always and .arch_extension no{feature} conditionally would work in both cases.

In that approach, .arch_extension {feature} will affect subsequent assemblies if the feature is enabled, but what it enables is a globally enabled feature on the Rust side, so I guess there is probably no adverse effect. EDIT: see #137720 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there is probably no adverse effec

Nah, this doesn't met stateful directive rule...

And I think the s390x implementation of this PR that attempts to restore state based on features enabled on the Rust side has the same problem...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. We now have a separate feature flag for target features on naked functions (so that naked functions themselves can be stabilized), #138568. At this point I'm sceptical that we can make the approach of this PR work in general.

So, that means that either

  • we use this approach on platforms where it is guaranteed to work
  • we just have the user handle it (and can help out with some good documentation for that platforms that we've looked at)

Copy link
Member

@taiki-e taiki-e Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I think it is safer to only support the x86/x86_64 which obviously does not require anything and architectures which support push/pop logic such as riscv, and make errors otherwise. Once the LLVM bug is fixed, we can also support aarch64 and s390x (and probably arm).

  • we just have the user handle it (and can help out with some good documentation for that platforms that we've looked at)

Considering the LLVM bug, I guess it is also difficult to use it properly on the user side. (as for aarch64 and s390x)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

#[target_feature(..)] no longer in effect in naked functions
6 participants