Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix target-cpu fpu features on Arm R/M-profile #123159

Merged
merged 1 commit into from
Apr 6, 2024

Conversation

chrisnc
Copy link
Contributor

@chrisnc chrisnc commented Mar 28, 2024

This is achieved by converting +<fpu>,-d32,{,-fp64} to +<fpu>d16{,sp}.

By using a single additive feature that captures d16 vs d32 and sp vs
dp, we prevent -<feature> from overriding -C target-cpu at build time.

Remove extraneous -fp16 from armv7r targets, as this is not included in
vfp3 anyway, but was preventing fp16 from being enabled by e.g.,
-C target-cpu=cortex-r7, which does support fp16.

@rustbot
Copy link
Collaborator

rustbot commented Mar 28, 2024

r? @fee1-dead

rustbot has assigned @fee1-dead.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 28, 2024
@rustbot
Copy link
Collaborator

rustbot commented Mar 28, 2024

These commits modify compiler targets.
(See the Target Tier Policy.)

@chrisnc
Copy link
Contributor Author

chrisnc commented Mar 28, 2024

Looks like the ability to do this dates back to LLVM 9.0.0: llvm/llvm-project@760df47b778a530e9368a4b

@fee1-dead
Copy link
Member

r? compiler

@rustbot rustbot assigned fmease and unassigned fee1-dead Mar 28, 2024
@fmease
Copy link
Member

fmease commented Mar 29, 2024

r? compiler

@rustbot rustbot assigned Nadrieril and unassigned fmease Mar 29, 2024
@Nadrieril
Copy link
Member

r? compiler 😅

@rustbot rustbot assigned fmease and unassigned Nadrieril Mar 29, 2024
@Nadrieril
Copy link
Member

Lol

r? compiler

@rustbot rustbot assigned michaelwoerister and unassigned fmease Mar 29, 2024
@michaelwoerister
Copy link
Member

Let's pull in the ARM notification group.

@rustbot ping arm

@rustbot rustbot added the O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state label Mar 29, 2024
@rustbot
Copy link
Collaborator

rustbot commented Mar 29, 2024

Hey ARM Group! This bug has been identified as a good "ARM candidate".
In case it's useful, here are some instructions for tackling these sorts of
bugs. Maybe take a look?
Thanks! <3

cc @adamgemmell @hug-dev @jacobbramley @JamieCunliffe @joaopaulocarreiro @raw-bin @Stammark

@workingjubilee
Copy link
Contributor

so the thing that is happening here is that previously, we set flags that say "disable this architectural feature", because we (and especially LLVM) allow target features to be disabled. this is a questionable choice to begin with, however, it's particularly problematic when considering that -Ctarget-cpu, which specifies a microarchitecture, comes with certain implied target features. disabling a feature takes priority over the implied features from -Ctarget-cpu, so most ways of specifying a specific CPU based on these architectures was effectively broken.

but we had to do something like this to make clear that the fpu was a slightly underpowered one relative to the full-bore Arm FPUs that they have nowadays. certainly not 32 128-bit registers! only about 16-by-32 or so? and this way fixes that by making only a positive statment.

the main question regarding this PR is whether this causes some oddity because someone was accidentally relying on us disabling those features. however, the new behavior will be more-correct in general, and we don't promise that our target feature settings for a given target remain static forever. and it's a fairly easy fix in general. in my experience, with dodgy feature settings that shouldn't work, it's far more likely they'll get a combination of effects that results in them failing instruction selection in LLVM, rather than emitting a bad binary that they can even try to run.

@chrisnc
Copy link
Contributor Author

chrisnc commented Mar 29, 2024

That's my understanding of what's happening as well. This PR will result in the same set of features being enabled for users who do not use -C target-cpu. For users who do specify -C target-cpu, they will now get the features their CPU supports (according to LLVM), which could be a breaking change for a given project if their target-cpu is incorrect (or they have a version of that CPU that is under-spec'ed compared to LLVM's understanding of it, but didn't tell the build about this). Those users could add -<feature> to their build to get the old behavior (which is what they would have to do when using clang on such a CPU), otherwise they'll get builds that contain illegal instructions for their target. I'm not sure how this would manifest as ISel issues at build time though.

@workingjubilee
Copy link
Contributor

@chrisnc you know I had been thinking of any blatant contradictions (there are a few that toggling features on/off can get you, like especially with "use soft-float but use float registers") but I think those are less likely here yeah.

@michaelwoerister
Copy link
Member

Thanks for the info! That all makes sense to me. But I'm having a hard time verifying that the new set of features is correct for the given targets, and I don't think we are actually executing any tests for these targets. @workingjubilee, feel free to r+ if you are confident in the changes. Otherwise I'll go look for another reviewer.

@chrisnc
Copy link
Contributor Author

chrisnc commented Apr 3, 2024

The combo feature names this PR uses are defined here: https://github.com/rust-lang/llvm-project/blob/rustc/18.0-2024-02-13/llvm/lib/Target/ARM/ARM.td#L63-L91
(see name#"d16sp" and name#"d16").

For arm{eb,}v7r-none-eabihf, the min feature set is of the R4F, which LLVM gives vfp3d16.

For thumbv7em-none-eabihf, the min feature set is of the M4F, which LLVM gives vfp4d16sp.

For thumbv8m.main-none-eabihf, the min feature set is of the M33, which LLVM gives fp-armv8d16sp.

For armv8r-none-eabihf, the min feature set is of the R52, which LLVM defines as ARMv8r and doesn't give a specific FP feature, but ARMv8r implies fp-armv8 and neon.
So this means the current feature list for armv8r is wrong and should be changed to just "" to match LLVM. (Also, this change as-is will have the effect of enabling d32 and fp64 by default because they come with armv8r.) The manual is pretty clear on the existence of the SP-only no-neon version of R52 though. Maybe some Arm folks have other ideas about this? Is LLVM defaulting to a superset intentional?

This is achieved by converting `+<fpu>,-d32,{,-fp64}` to `+<fpu>d16{,sp}`.

By using a single additive feature that captures `d16` vs `d32` and `sp` vs
`dp`, we prevent `-<feature>` from overriding `-C target-cpu` at build time.

Remove extraneous `-fp16` from `armv7r` targets, as this is not included in
`vfp3` anyway, but was preventing `fp16` from being enabled by e.g.,
`-C target-cpu=cortex-r7`, which does support `fp16`.
@chrisnc chrisnc force-pushed the fix-arm-rm-none-eabihf-features branch from ad18e1d to 0459e55 Compare April 5, 2024 05:18
@chrisnc
Copy link
Contributor Author

chrisnc commented Apr 5, 2024

I've removed the armv8r change as that requires more investigation due to how LLVM defines the architecture. The other changes should be good to go. I've confirmed that this allows e.g., target-cpu=cortex-m55 to imply mve, and target-cpu=cortex-r7 to imply fp16 as expected.

@workingjubilee
Copy link
Contributor

workingjubilee commented Apr 5, 2024

I am confident in the changes for Armv7 and Armv8-M targets. I do indeed prefer that we leave Armv8-R targets for later, as they are likely to be Extra Special in any case.

If any issues will arise here, I think there is only one way to truly find out:

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Apr 5, 2024

📌 Commit 0459e55 has been approved by workingjubilee

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 5, 2024
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Apr 5, 2024
…tures, r=workingjubilee

Fix target-cpu fpu features on Arm R/M-profile

This is achieved by converting `+<fpu>,-d32,{,-fp64}` to `+<fpu>d16{,sp}`.

By using a single additive feature that captures `d16` vs `d32` and `sp` vs
`dp`, we prevent `-<feature>` from overriding `-C target-cpu` at build time.

Remove extraneous `-fp16` from `armv7r` targets, as this is not included in
`vfp3` anyway, but was preventing `fp16` from being enabled by e.g.,
`-C target-cpu=cortex-r7`, which does support `fp16`.
@chrisnc
Copy link
Contributor Author

chrisnc commented Apr 5, 2024

I noticed that this landed somewhat recently for the Aarch64 v8r target in LLVM. I think we'll need to make a similar change to the Arm (32-bit) v8r in LLVM to have this make more sense.

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 5, 2024
…llaumeGomez

Rollup of 8 pull requests

Successful merges:

 - rust-lang#121419 (Add aarch64-apple-visionos and aarch64-apple-visionos-sim tier 3 targets)
 - rust-lang#123159 (Fix target-cpu fpu features on Arm R/M-profile)
 - rust-lang#123487 (CFI: Restore typeid_for_instance default behavior)
 - rust-lang#123500 (Revert removing miri jobserver workaround)
 - rust-lang#123505 (Revert "Use OS thread name by default")
 - rust-lang#123509 (Add jieyouxu to compiler review rotation and as a reviewer for `tests/run-make`, `src/tools/run-make-support` and `src/tools/compiletest`)
 - rust-lang#123514 (Fix typo in `compiler/rustc_middle/src/traits/solve/inspect.rs`)
 - rust-lang#123515 (Use `include` command to reduce code duplication)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 2658823 into rust-lang:master Apr 6, 2024
11 checks passed
@rustbot rustbot added this to the 1.79.0 milestone Apr 6, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Apr 6, 2024
Rollup merge of rust-lang#123159 - chrisnc:fix-arm-rm-none-eabihf-features, r=workingjubilee

Fix target-cpu fpu features on Arm R/M-profile

This is achieved by converting `+<fpu>,-d32,{,-fp64}` to `+<fpu>d16{,sp}`.

By using a single additive feature that captures `d16` vs `d32` and `sp` vs
`dp`, we prevent `-<feature>` from overriding `-C target-cpu` at build time.

Remove extraneous `-fp16` from `armv7r` targets, as this is not included in
`vfp3` anyway, but was preventing `fp16` from being enabled by e.g.,
`-C target-cpu=cortex-r7`, which does support `fp16`.
@chrisnc chrisnc deleted the fix-arm-rm-none-eabihf-features branch April 6, 2024 02:48
@chrisnc
Copy link
Contributor Author

chrisnc commented Apr 14, 2024

LLVM change is in review here. We can revisit rust's armv8r target features if and when this merges and rustc updates to use it.

@chrisnc
Copy link
Contributor Author

chrisnc commented May 8, 2024

The llvm fix for armv8r is merged now with the reduced set of default features. Once the rust fork picks that up we can edit rustc's target spec to eliminate the -feature flags (in armv8r's case it would be an empty list because the architecture itself requires a base level of floating point support and llvm defines it this way now, too).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants