Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rust cannot pass a vector constant directly to LLVM #118209

Open
GeorgeWort opened this issue Nov 23, 2023 · 13 comments · May be fixed by #118980
Open

Rust cannot pass a vector constant directly to LLVM #118209

GeorgeWort opened this issue Nov 23, 2023 · 13 comments · May be fixed by #118980
Labels
A-simd Area: SIMD (Single Instruction Multiple Data) C-bug Category: This is a bug.

Comments

@GeorgeWort
Copy link
Contributor

This issue is referenced in stdarch here in missing_aarch64.txt. under "LLVM select error in debug builds".

Certain intrinsics in LLVM require a vector constant to be passed directly as an argument, one of the reasons for this might be that it needs to reduce that vector down to a duplicated scalar immediate that can be encoded in the generated instruction. Rust cannot pass vector constants directly and so relies on the CSE LLVM pass to propagate the loaded constant vector to the intrinsic call. This pass is not used at -O0 and so compilation fails with LLVM ERROR: Cannot select: intrinsic %llvm.aarch64.neon.sqshlu for example.

This is an example snippet that fails to compile:

#![feature(repr_simd)]
#![allow(non_camel_case_types)]

#[cfg(target_arch = "aarch64")]
use core::arch::aarch64::*;

fn main() {
    unsafe {
        vqshlu_n_s8::<2>(vdup_n_s8(5));
    }
}

There are a number of options to resolve this issue when it comes to Neon intrinsics specifically.

  1. Allow vectors to be constants in the compiler and thus passed directly to LLVM.
    • I have attempted to do this here, though this requires bigger and more thorough changes.
    • This may be easier to implement as a const generic.
  2. Always run the relevant LLVM pass at all rust optimization levels.
    • This isn't very nice as it goes against the idea of -O0.
  3. Alter all relevant LLVM intrinsics to accept a scalar argument alternative where the vector contains only a single unique scalar.
    • This may not be possible for all intrinsics that require a vector constant.

Meta

rustc --version --verbose:

rustc 1.76.0-nightly (a6b8ae582 2023-11-22)

@GeorgeWort GeorgeWort added the C-bug Category: This is a bug. label Nov 23, 2023
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Nov 23, 2023
@RalfJung
Copy link
Member

RalfJung commented Nov 23, 2023

Cc @Amanieu @workingjubilee @oli-obk (once you're back from vacation)

Generally our policy was that we encode such arguments as const generics, which forces them to be constants. Otherwise this seems impossible to do reliably, after all the user could easily write a function like

fn myfun(x: int8x8_t) {
  vqshlu_n_s8::<2>(x);
}

What is supposed to happen for such functions? Certainly we cannot require x to be a constant?

Is there a list of intrinsics this affects? I only know of simd_shuffle_indices, which has special treatment in the compiler for this exact reason.

@RalfJung
Copy link
Member

RalfJung commented Nov 23, 2023

Looking at the stdarch code here, maybe I misunderstood the description:

pub unsafe fn vqshlu_n_s8<const N: i32>(a: int8x8_t) -> uint8x8_t {
    static_assert_uimm_bits!(N, 3);
    #[allow(improper_ctypes)]
    extern "unadjusted" {
        #[cfg_attr(target_arch = "arm", link_name = "llvm.arm.neon.vqshiftsu.v8i8")]
        fn vqshlu_n_s8_(a: int8x8_t, n: int8x8_t) -> uint8x8_t;
    }
vqshlu_n_s8_(a, int8x8_t(N as i8, N as i8, N as i8, N as i8, N as i8, N as i8, N as i8, N as i8))
}

It's not the argument to vqshlu_n_s8 that has to be a constant, is it? It's the second argument to vqshlu_n_s8_?

Does it work if we use an inline const block, like vqshlu_n_s8_(a, const { int8x8_t(N as i8, N as i8, N as i8, N as i8, N as i8, N as i8, N as i8, N as i8) })? Or alternatively we might have to use something like simd_shuffle_indices in the backend for these intrinsics.

I don't think we have to equip the entire Rust compiler with support for vector immediates.

@GeorgeWort
Copy link
Contributor Author

GeorgeWort commented Nov 23, 2023

Yes sorry, the issue is in the call to vqshlu_n_s8_. I should have explicitly pointed to that. The intrinsics that we know have this issue are listed here and here, though I suspect more are affected as this pattern also happens elsewhere for Neon intrinsics.

That inline const block only works with the patch I have written. I don't believe that a constant vector is currently created/propagated as an inline const block or as a const generic.

If there is precedent for this kind of special casing and you are happy for us to special case a decent number of them then I can do that.

@RalfJung
Copy link
Member

RalfJung commented Nov 23, 2023

I don't know if this situation is similar to shufflevector, but if it is then it seems reasonable to use the same solution.

Long-term, we want shufflevector to take a const generic instead of what looks like a regular argument but must actually be known at compile-time. So ideally we'd implement that instead of expanding the use of the existing shufflevector hack.

Either way the tricky part will be properly declaring this in the intrinsic import. The signature of vqshlu_n_s8_ lacks the required information. It should probably either be something like

    extern "unadjusted" {
        #[cfg_attr(target_arch = "arm", link_name = "llvm.arm.neon.vqshiftsu.v8i8")]
        #[rustc_intrinsic_const_vector_arg(1)] // the argument with index 1 must be a constant vector
        fn vqshlu_n_s8_(a: int8x8_t, n: int8x8_t) -> uint8x8_t;
    }

or

    extern "unadjusted" {
        #[cfg_attr(target_arch = "arm", link_name = "llvm.arm.neon.vqshiftsu.v8i8")]
        #[rustc_intrinsic_const_generic_arg(1)] // the const generic should be passed to the intrinsic as argument index 1
        fn vqshlu_n_s8_<N: int8x8_t>(a: int8x8_t) -> uint8x8_t;
    }

The latter being the more principled approach, the former closer to the current shufflevector hack.

@jacobbramley
Copy link
Contributor

One complication here is that the AArch64 Neon intrinsics are stable, and the first rustc_intrinsic_const_vector_arg example above would require a breaking change to the API.

@RalfJung
Copy link
Member

One complication here is that the AArch64 Neon intrinsics are stable, and the first rustc_intrinsic_const_vector_arg example above would require a breaking change to the API.

How that? vqshlu_n_s8 is stable but vqshlu_n_s8_ is not. vqshlu_n_s8 computes the 2nd argument of vqshlu_n_s8_ as int8x8_t(N as i8, N as i8, N as i8, N as i8, N as i8, N as i8, N as i8, N as i8) where N is a const generic. Therefore it can compute that vector as a const expression itself.

@jacobbramley
Copy link
Contributor

Oh, of course, sorry. Yes, we could do that with a few changes to the generator.

@GeorgeWort
Copy link
Contributor Author

Do you have a suggestion for how a constant vector could be constructed and passed as a generic const parameter without turning on generic_const_expr in the whole compiler and implementing support for all ExprKind::Adt and ExprKind::ConstBlocks etc?

@RalfJung
Copy link
Member

RalfJung commented Nov 30, 2023

Hm, I was imagining something like this...

#![feature(adt_const_params)]
#![feature(generic_const_exprs)]

use std::marker::ConstParamTy;

#[derive(ConstParamTy, PartialEq, Eq)]
struct int8x2_t(i8, i8);

fn vqshlu_n_s8_<const N: int8x2_t>(a: int8x2_t) -> int8x2_t { todo!() }

fn vqshlu_n_s8<const N: i32>(a: int8x2_t) -> int8x2_t {
    vqshlu_n_s8_::<{ int8x2_t(N as i8, N as i8) }>(a)
}

But that requires a bunch of features still marked as "incomplete". @lcnr what do you think is the best way to handle intrinsics that need as argument a vector that must be known at compile-time?

Given the current status of const generics, maybe extending the shufflevector hack is a more realistic plan.

@lcnr
Copy link
Contributor

lcnr commented Dec 1, 2023

I think internally depending on adt_const_params is fine, depending on generic_const_exprs is not. THis feature will still change (and break) far too much.

with that, could the following work?

fn vqshlu_n_s8<const N: i32>(a: int8x2_t) -> int8x2_t {
    actual_instrinsic::<N>(a)
}

idk how simd intrinsics are implemented, but could we convert i32 -> int8x2_t internally in the compiler before passing it to llvm or whatever?

@RalfJung
Copy link
Member

RalfJung commented Dec 1, 2023

I think internally depending on adt_const_params is fine, depending on generic_const_exprs is not. THis feature will still change (and break) far too much.

Okay, good to know. adt_const_params is also marked incomplete so I wasn't sure if this is usable yet.

Is there any way to avoid depending on generic_const_exprs by using inline const blocks or so? I tried vqshlu_n_s8_::<{ const { ... } }> but that didn't really change anything.

Sadly, even the associated-const trick seems to require generic_const_exprs:

fn vqshlu_n_s8<const N: i32>(a: int8x2_t) -> int8x2_t {
    struct S<const N: i32>;
    impl<const N: i32> S<N> {
        const B: int8x2_t = int8x2_t(N as i8, N as i8);
    }
    vqshlu_n_s8_::<{ S::<N>::B }>(a)
}

And even then it doesn't work since it says it needs a where clause. Bummer.

idk how simd intrinsics are implemented, but could we convert i32 -> int8x2_t internally in the compiler before passing it to llvm or whatever?

I don't know if all callers of vqshlu_n_s8_ will use a uniform vector that's just the same element repeated. @GeorgeWort which values are needed for that constant across all of stdarch?

@lcnr
Copy link
Contributor

lcnr commented Dec 1, 2023

adt_const_params is also marked incomplete so I wasn't sure if this is usable yet.

I think that warning can be removed by now. I am not working on const generics myself anymore and it that work has generally pretty much stalled, so I don't know the exact status there.

Is there any way to avoid depending on generic_const_exprs by using inline const blocks or so? I tried vqshlu_n_s8_::<{ const { ... } }> but that didn't really change anything.

once a constant is used in the type system, you need gce if it's not concrete of a generic parameter by itself. This means you cannot have generic expressions as generic args for functions without gce.

One hack would be the following:

#[derive(PartialEq, Eq)]
struct int8x2_t(i8, i8);

#[lang = "ToValueTrait"]
trait ToValue<T> {
    #[lang = "ToValueValue"]
    const VALUE: T;
}

extern "rust-intrinsics" {
    fn vqshlu_n_s8_<T: ToValue<int8x2_t>>(a: int8x2_t) -> int8x2_t;
}

fn vqshlu_n_s8<const N: i32>(a: int8x2_t) -> int8x2_t {
    struct Converter<const N: i32>;
    impl<const N: i32> ToValue<int8x2_t> for Converter<N> {
        const VALUE: int8x2_t = int8x2_t(N as i8, N as i8);
    } 
    vqshlu_n_s8_::<Converter<N>>(a)
}

this means we don't have to emulate the computation in the compiler, we can lookup the associated type and compute it. It doesn't need gce as the associated const is never used in the type system. Still have to decide how to deal with evaluation failures when evaluating the associated const, but 🤷

@RalfJung
Copy link
Member

RalfJung commented Dec 1, 2023

Yeah at that point I think I'd prefer building on the shufflevector hack.^^
That also defines a very concrete goal for what we hope const generics can do one day. :)

@workingjubilee workingjubilee added the A-simd Area: SIMD (Single Instruction Multiple Data) label Dec 13, 2023
GeorgeWort added a commit to GeorgeWort/rust that referenced this issue Dec 15, 2023
… as constants

This allows constant vectors using a repr(simd) type to be propagated
through to the backend by reusing the functionality used to do a similar
thing for the simd_shuffle intrinsic.

fix rust-lang#118209
GeorgeWort added a commit to GeorgeWort/rust that referenced this issue Dec 15, 2023
… as constants

This allows constant vectors using a repr(simd) type to be propagated
through to the backend by reusing the functionality used to do a similar
thing for the simd_shuffle intrinsic.

fix rust-lang#118209
GeorgeWort added a commit to GeorgeWort/rust that referenced this issue Dec 15, 2023
… as constants

This allows constant vectors using a repr(simd) type to be propagated
through to the backend by reusing the functionality used to do a similar
thing for the simd_shuffle intrinsic.

fix rust-lang#118209
GeorgeWort added a commit to GeorgeWort/rust that referenced this issue Dec 15, 2023
… as constants

This allows constant vectors using a repr(simd) type to be propagated
through to the backend by reusing the functionality used to do a similar
thing for the simd_shuffle intrinsic.

fix rust-lang#118209
GeorgeWort added a commit to GeorgeWort/rust that referenced this issue Dec 15, 2023
… as constants

This allows constant vectors using a repr(simd) type to be propagated
through to the backend by reusing the functionality used to do a similar
thing for the simd_shuffle intrinsic.

fix rust-lang#118209
GeorgeWort added a commit to GeorgeWort/rust that referenced this issue Dec 19, 2023
… as constants

This allows constant vectors using a repr(simd) type to be propagated
through to the backend by reusing the functionality used to do a similar
thing for the simd_shuffle intrinsic.

fix rust-lang#118209
@saethlin saethlin removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Feb 11, 2024
GeorgeWort added a commit to GeorgeWort/rust that referenced this issue Feb 15, 2024
… as constants

This allows constant vectors using a repr(simd) type to be propagated
through to the backend by reusing the functionality used to do a similar
thing for the simd_shuffle intrinsic.

fix rust-lang#118209
GeorgeWort added a commit to GeorgeWort/rust that referenced this issue Feb 15, 2024
… as constants

This allows constant vectors using a repr(simd) type to be propagated
through to the backend by reusing the functionality used to do a similar
thing for the simd_shuffle intrinsic.

fix rust-lang#118209
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-simd Area: SIMD (Single Instruction Multiple Data) C-bug Category: This is a bug.
Projects
None yet
7 participants