repr(simd) is unsound #44367
Comments
Two thoughts I've had in the past on how to fix this:
Given that |
It's worth noting that, when talking about LLVM features (which Rust target features currently map directly to), this also affects floats (which are stable). That is, on x86 with current safe Rust, if you compile one crate with -C target-feature=+soft-float and one without, you have an issue. This can also be solved as Alex mentions, though.
@alexcrichton I would prefer to start with a hard error and, if we need it, add a way to opt in to the shim generation (*).
The only thing that concerns me about the hard error is that we will probably need to emit it during monomorphization. I think that this is not acceptable, and we should only do this if it's either temporary or there is no other way. @eddyb pointed out that currently stable Rust has no monomorphization-time errors, so this solution might block stabilization.
@alexcrichton you mentioned that this would mean that SIMD types must then be banned from FFI because we don't know the calling convention of the caller. Could you elaborate on this? As I see it, FFI is already unsafe, so it would be up to the user to make sure that the callee is using the appropriate calling convention.
(*) I haven't thought this through, but I imagine getting a hard error for a particular call site, and then wanting to opt in to the shim generation for that particular call site only. Anyway, we don't need to think this all the way through now.
I'd personally be totally OK with a hard error, but yes, I think we'd have to do it during monomorphization. It's true that we don't have many monomorphization errors today, but I don't think we have absolutely zero, and I'd personally also think that we should at least get to a workable state and try it out to evaluate before possible stabilization. I, personally again, would most likely be fine stabilizing with monomorphization errors.
Oh right yeah! Right now we've got a lint in the compiler for "this type is unsafe in FFI", and for example it lints about bare structs that are not
@parched I don't believe we're considering a
@alexcrichton I've recently reviewed all non-ICE errors from |
While
The above code actually crashes in LLVM on the playground, because that's x86_64 where SSE2 is always present. However, on a 32 bit x86 target, |
How about we make the target spec define the ABI regardless of what extra features are enabled or disabled by attributes or the command line? That would avoid the need for shims and monomorphization errors. So for example on x86_64, which has 128-bit vectors by default:
alexcrichton added the A-simd, C-bug, I-unsound 💥, and T-compiler labels on Sep 7, 2017
@rkruppe Thanks for that example. Two small comments:
I personally wouldn't like to have to re-open this topic in the future when people start filing bugs due to random segfaults because some crate in the middle of their dependency graph decided that it was a good idea to add
@parched some Ideally, if I have an SSE dynamic library that exposes some functions on its ABI for SSE...AVX2, I would like to be able to add some new AVX3/4 functions to its interface, recompile, and produce a library that is ABI compatible with the old one, so that all my old clients can continue to work as is by linking to the new library, but newer code is able to call the new AVX3/4 functions. That is, adding those new AVX3/4 functions should not break the ABI of my dynamic library as long as my global target is still |
@gnzlbg yes 512-bit and 1024-bit vectors would have to be treated the same way but I don't believe adding more would be an issue.
For that case you would just have to make your new functions |
@parched I think I misunderstood your comment then.
Do you think that
@alexcrichton @BurntSushi I've slept on this a bit, and I think the following is a common idiom that we need to enable:
#[target_feature = "sse"]
fn foo(v: f32x8) -> f32x8 {
    // f32x8 has SSE ABI here
    let u = if std::host_feature(AVX) {
        foo_avx(v) // mismatched ABI: hard error (argument)
        // mismatched ABI: hard error (return type)
    } else {
        /* SSE code */
    };
    /* do something with u */
    u
}
#[target_feature = "avx"]
fn foo_avx(arg: f32x8) -> f32x8 { ... }
Here we have some mismatching ABIs. I am still fine with making these mismatching ABIs hard errors as long as there is an opt-in way to make this idiom work. What do you think about using
#[target_feature = "sse"]
fn foo(v: f32x8) -> f32x8 {
    // f32x8 has SSE ABI here
    let u = if std::host_feature(AVX) {
        // foo_avx(v) // ERROR: mismatched ABIs (2x arg and ret type)
        // foo_avx(v as f32x8) // ERROR: mismatched ABIs (1x ret type)
        foo_avx(v as f32x8) as f32x8 // OK
    } else {
        /* SSE code */
    };
    /* do something with u */
    u
}
That is, an
Do you think we can extend this to make function pointers work?
#[target_feature = "+sse"] fn foo(f32x8) -> f32x8;
static mut foo_ptr: fn(f32x8) -> f32x8 = foo;
unsafe {
    // foo_ptr = foo_avx; // ERROR: mismatched ABI
    foo_ptr = foo_avx as fn(f32x8) -> f32x8; // OK
}
// assert_eq!(foo_ptr, foo_avx); // ERROR: mismatched ABIs
assert_eq!(foo_ptr, foo_avx as fn(f32x8) -> f32x8); // OK
I was thinking that in this case,
I think that pursuing this would require us to track the ABI of
I think that if we can lift these errors to type-checking:
Thoughts?
EDIT: even if we never stabilize
EDIT2: That is, this issue would be resolved by making the original example fail with a type error, and adding the
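For reference, the dispatch idiom discussed above can be approximated in current Rust roughly as follows. This is only a sketch under assumptions: it uses the real is_x86_feature_detected! macro and the #[target_feature(enable = "...")] attribute in place of the hypothetical std::host_feature and the older attribute syntax used in this thread, and the names sum, sum_avx and sum_fallback are made up for illustration. The data crosses the function boundary behind a reference, so the vector-ABI mismatch this issue is about does not come up here.

```rust
/// Portable fallback (hypothetical name): plain scalar code.
fn sum_fallback(v: &[f32; 8]) -> f32 {
    v.iter().sum()
}

/// Same computation, but the body is compiled with AVX enabled.
/// The input is passed by reference, not as a vector value.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx")]
unsafe fn sum_avx(v: &[f32; 8]) -> f32 {
    v.iter().sum()
}

/// Picks an implementation at run time based on what the host CPU supports.
pub fn sum(v: &[f32; 8]) -> f32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was checked on the line above.
            return unsafe { sum_avx(v) };
        }
    }
    sum_fallback(v)
}
```

Calling sum(&[1.0; 8]) takes the AVX path on hosts that report AVX and the fallback everywhere else; both paths compute the same value.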
For the record, that's not true, one just needs a target that doesn't have SSE enabled by default (or defaults to soft-float), such as the (tier 2) I do agree that we should find a proper solution right now, especially since the "cheap fixes" that I'm aware of (monomorphization-time error, or strongarm the ABIs into being compatible by explicitly passing problematic types on the stack) permit code that probably wouldn't work unmodified under a more principled solution. Unfortunately I don't have the time to dive into solutions right now, so I can't contribute anything but nagging at the moment |
@gnzlbg specifically you think that passing arguments like I'm also not sure we can ever get function pointers to "truly work" unless we declare "one true ABI" for these types, otherwise we have no idea what the actual abi of the function pointer is. (this bit about function pointers is pushing me quite a bit into the camp of "just declare everything unsafe and document why") |
Of course it happens, see this SO question, but users get warnings and undefined behavior pretty quickly and learn to work around this (that is, "don't do that", pass a An important point is that this can only happen when you have ABI incompatible vector types. That is, if you are using from SSE to SSE4.2, then you never run into these issues, because they are introduced by AVX which is relatively recent, and by AVX512 which is very rare (EDIT: on ARM you only have NEON so this does not happen, and the new SVE completely works around this issue).
Why do we need one true ABI for these types? For example:
#[target_feature = "+sse"]
fn foo() {
    let a: fn(f32x8) -> f32x8; // has type fn(f32x8["sse"]) -> f32x8["sse"]
}
#[target_feature = "+avx"]
fn bar() {
    let a: fn(f32x8) -> f32x8; // has type fn(f32x8["avx"]) -> f32x8["avx"]
}
static a: fn(f32x8) -> f32x8; // has type fn(f32x8["CRATE"]) -> f32x8["CRATE"]
// where CRATE is replaced with whatever feature the crate is compiled with
That is, two function pointers, compiled on different crates, or functions, with different features, would just be different types and generate a type error.
Another workaround in C is to do something like this: First we need a way to merge two 128-bit registers into a 256-bit register (or a "no op" in SSE):
#[target_feature = "+sse"]
fn merge_sse(x: (f32x4, f32x4)) -> f32x8; // no op?
#[target_feature = "+avx"]
fn merge_avx(x: (f32x4, f32x4)) -> f32x8;
// ^^^^ copy 2x128-bit registers to 1x256-bit register
then we need its inverse, that is, a function that takes a 256-bit value (or two in SSE) and returns two 128-bit registers:
#[target_feature = "+sse"]
fn split_sse(f32x8) -> (f32x4, f32x4); // no op?
#[target_feature = "+avx"]
fn split_avx(f32x8) -> (f32x4, f32x4);
// ^^^^ copy the parts of a 256-bit register into 2x128-bit registers
then we add some macros to communicate
macro_rules! from_sse_to_avx { ($x:expr) => (merge_avx(split_sse($x))) }
macro_rules! from_avx_to_sse { ($x:expr) => (merge_sse(split_avx($x))) }
macro_rules! from_sse_to_avx_and_back {
    ($f:expr, $x:expr) => (from_avx_to_sse!($f(from_sse_to_avx!($x))))
}
and then we can safely write the code above as:
#[target_feature = "sse"]
fn foo(v: f32x8) -> f32x8 {
    // f32x8 has SSE ABI here
    let u = if std::host_feature(AVX) {
        // foo_avx(v) // mismatched ABI: hard error (argument)
        from_sse_to_avx_and_back!(foo_avx, v) // OK
    } else {
        /* SSE code */
    };
    /* do something with u */
    u
}
#[target_feature = "avx"]
fn foo_avx(arg: f32x8) -> f32x8 { ... }
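The split/merge idea can be illustrated on stable Rust with plain arrays standing in for the vector types. This is a minimal sketch, not real SIMD: F32x4, F32x8, split, merge, add_one_wide and add_one are hypothetical names, and the "wide" function only models an AVX-compiled kernel. What it shows is the shape of the workaround: only 128-bit halves cross the function boundary, and the 256-bit value is rebuilt on each side.

```rust
/// Stand-in for a 128-bit vector (hypothetical; a plain array, not a SIMD type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct F32x4(pub [f32; 4]);

/// Stand-in for a 256-bit vector (hypothetical; a plain array, not a SIMD type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct F32x8(pub [f32; 8]);

/// Splits a 256-bit value into its low and high 128-bit halves.
pub fn split(v: F32x8) -> (F32x4, F32x4) {
    let mut lo = [0.0f32; 4];
    let mut hi = [0.0f32; 4];
    lo.copy_from_slice(&v.0[..4]);
    hi.copy_from_slice(&v.0[4..]);
    (F32x4(lo), F32x4(hi))
}

/// Inverse of `split`: joins two 128-bit halves into one 256-bit value.
pub fn merge(halves: (F32x4, F32x4)) -> F32x8 {
    let mut out = [0.0f32; 8];
    out[..4].copy_from_slice(&(halves.0).0);
    out[4..].copy_from_slice(&(halves.1).0);
    F32x8(out)
}

/// Models the AVX-side kernel: it only accepts and returns 128-bit halves,
/// and works on the full 256-bit value internally.
pub fn add_one_wide(halves: (F32x4, F32x4)) -> (F32x4, F32x4) {
    let mut v = merge(halves);
    for x in v.0.iter_mut() {
        *x += 1.0;
    }
    split(v)
}

/// Models the SSE-side caller: split before the call, merge after it.
pub fn add_one(v: F32x8) -> F32x8 {
    merge(add_one_wide(split(v)))
}
```

Since merge(split(v)) returns v unchanged, the round trip costs only the copies, which is the overhead the comment above is weighing.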
It could, but if you did that you wouldn't be able to call that function from another without |
@gnzlbg everything you're saying seems plausible? You're thinking that passing types like |
@alexcrichton the
What exactly do you propose to call/make After exploring all these options, I'd like to propose a path forward.
@eddyb said above that stable Rust has zero monomorphization time errors. I don't think that introducing one will cut it for stabilization. To produce a type-checking error, we need to "somehow" propagate the Once we are there users that want to convert between incompatible ABIs can do so with this idiom, which can also be extended to make function pointers work. We could provide procedural macros in a crate that do this automatically, and if that becomes a pain point in practice we could re-evaluate adding language support for that (e.g. something along the lines of the (*) If I understand this correctly, we would need to require that all |
@gnzlbg what I mean is that yes, I personally think that at this point it's not worth trying to push this back into typechecking. That sounds like quite a lot of work for not necessarily a lot of gain. Additionally I'd be worried that it'd expose hidden costs and/or complexities that are very difficult to get right. For example if we had:
static mut FOO: fn(u8x32) = default;
#[target_feature = "+avx2"]
unsafe fn bar() {
    FOO = foo;
}
#[target_feature = "+avx2"]
unsafe fn foo(a: u8x32) {
}
unsafe fn default(a: u8x32) {
}
fn main() {
    bar();
    FOO(Default::default());
}
How do we rationalize that? Does the
Note that this doesn't even start to touch on trait methods. For example how do we also rationalize
These just seem like really difficult questions to answer and I'm not personally sold on there being any real benefit to going to all this effort to statically verify these things. All of SIMD is unsafe anyway now, so trying to add static guarantees on top of something that we've already declared as fundamentally unsafe would be nice but to me doesn't seem necessary.
I'll try to explain what I am proposing better because I think we are misunderstanding each other. First, we need to declare the
#[repr(simd)]
struct u8x32(u8, u8, ...);
but what I am proposing is not to make
#[repr(simd)]
struct u8x32<ABI>(u8, u8, ...);
When the user uses a type
CRATE_ABI = /* from target features of the crate */;
Now we proceed with the example. First, the user writes:
static mut FOO: fn(u8x32) = default; // OK, compiles
fn default(a: u8x32) { }
That compiles and type checks, because implicitly, the code looks like this:
static mut FOO: fn(u8x32<CRATE_ABI>) = default; // OK, compiles
fn default(a: u8x32<CRATE_ABI>) { }
So the types unify just fine. Note that
Now let's get a bit messier. The user writes:
#[target_feature = "+avx2"] unsafe fn foo(a: u8x32) {}
#[target_feature = "+avx2"]
unsafe fn bar() {
    FOO = foo;
}
but what this does is the following:
#[target_feature = "+avx2"] unsafe fn foo(a: u8x32<AVX2_ABI>) {}
#[target_feature = "+avx2"]
unsafe fn bar() {
    FOO = foo; // OK or Error?
}
So is this code OK or is it an error? From the information provided, we cannot say. It depends on what the
So at this point we are ready to move to trait methods. What should this do?
fn main() {
    FOO(Default::default());
}
Well, the same thing it does for any other type. It is just type-checking at work. If it can unify the type parameters then everything is OK, and otherwise, it does not compile. Obviously we need to nail the ABI types so that code only compiles when it is safe, and breaks otherwise.
I hope this has become clear, but just to be crystal clear: we never produce shims, either the ABI matches, or it doesn't. The users can manually write the shims if they need to by using this idiom.
I hope this has become clear.
I think I am misunderstanding what you mean here. Right now,
Also, how are you exactly proposing to make
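The implicit ABI parameter described above can be mimicked in today's Rust with a zero-sized marker type. This is a sketch under assumptions, not the proposed compiler feature: CrateAbi, Avx2Abi, U8x32 and with_abi are hypothetical names, the markers are chosen by hand instead of being derived from target features, and the payload is a plain array. It shows the point being argued: functions and function pointers with different ABI tags are simply different types, and crossing between them requires an explicit, user-written conversion.

```rust
use std::marker::PhantomData;

/// Hypothetical marker for "whatever ABI the crate was compiled with".
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct CrateAbi;

/// Hypothetical marker for the ABI in effect inside an AVX2 function.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Avx2Abi;

/// A 32-lane u8 vector tagged with an ABI marker. Writing plain `U8x32`
/// in a type position means `U8x32<CrateAbi>` via the default parameter.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct U8x32<Abi = CrateAbi> {
    lanes: [u8; 32],
    _abi: PhantomData<Abi>,
}

impl<Abi> U8x32<Abi> {
    pub fn new(lanes: [u8; 32]) -> Self {
        Self { lanes, _abi: PhantomData }
    }

    pub fn lanes(&self) -> [u8; 32] {
        self.lanes
    }

    /// Explicit, opt-in conversion between ABI tags (the user-written "shim").
    pub fn with_abi<Other>(self) -> U8x32<Other> {
        U8x32::<Other>::new(self.lanes)
    }
}

/// Takes `U8x32<CrateAbi>`.
pub fn first_lane_default(a: U8x32) -> u8 {
    a.lanes()[0]
}

/// Takes `U8x32<Avx2Abi>`, so it is a different function type.
pub fn first_lane_avx2(a: U8x32<Avx2Abi>) -> u8 {
    a.lanes()[0]
}

pub fn demo() -> (u8, u8) {
    // The function pointer type carries the ABI tag.
    let f: fn(U8x32) -> u8 = first_lane_default;
    // let g: fn(U8x32) -> u8 = first_lane_avx2; // rejected: mismatched types
    let v = U8x32::<CrateAbi>::new([7; 32]);
    let a = f(v);
    // Crossing the boundary needs the explicit conversion.
    let b = first_lane_avx2(v.with_abi::<Avx2Abi>());
    (a, b)
}
```

The marker is zero-sized, so the tagged type has the same size as the untagged array; all of the checking happens at compile time.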
@gnzlbg So if I understand correctly, you would extend the type system in a way that would be visible to users in type mismatches? Would this also apply to float types? |
To solve the -sse issue that would need to apply to floats as well.
Okay, thanks. My impression after only very little thought is that I agree with @alexcrichton that a proper automatic solution to the ABI problems seems pretty complicated. If functions tagged with Letting unsafe code do its thing would mean that a strict solution such as the one @gnzlbg proposed can't be introduced later. However, it would still be possible to start generating shims based on the caller's and callee's set of target features. |
I want to add, though: ABI mismatches are potentially very subtle and annoying bugs, so there should be a warning at least for the cases that can be easily detected statically. |
@gnzlbg I think that makes sense, yeah, but I can't really comment on whether I'd think it's feasible or not. It sounds pretty complicated and subject to subtly nasty interactions; my worry would be that we'd spend all our effort chasing a long tail of bugs to make this airtight. Is this really a common enough idiom to warrant the need to provide a static error instead of discovering this at runtime?
I haven't thought too hard about this, admittedly. If we expose APIs like
@alexcrichton Why would Edit: On second thought, since the functions that use And of course, there's the issue that |
FYI the shims are required to make
Here, since |
SIMD code tends to use |
Won't |
@glaebhoerl Whether
Adding shims adds more code, which increases compile time; the question is by how much. Most applications I know of have only a tiny fraction of explicit SIMD code, so this might not even be measurable. Also, applications that work properly today probably won't need shims (otherwise they wouldn't be working correctly), so I wouldn't worry about this initially beyond checking that compile times don't explode for the crates currently using SIMD.
Whether execution times will be affected is very hard to tell. Applications have little explicit SIMD code, but that code is typically in a hot spot. As a thought experiment, we can consider the worst case:
Beyond the worst case things get better though: if an application is just executing a single SIMD instruction in its hot loop, e.g., taking two arguments, and the shims aren't removed, then it might go from 1 cycle to 4 cycles. And if it is doing something more complex then the cost of the shims quickly becomes irrelevant. In all of these cases:
If binary size/execution speed/compile time turns out to be a problem for debug builds I'd say let's worry about that when we get there. |
This was referenced Dec 29, 2017
I've been thinking about this again recently with an eye towards hoping to push SIMD over the finish line towards stabilization. Historically I've been a proponent of adding "shims" to solve this problem at compile time. These shims would cause any ABI mismatch to get resolved by transferring arguments through memory instead of registers.
As I think more and more about the shims, however, I'm coming round to the conclusion that they're overly difficult (and I'm not sure even possible) to implement. Function pointers especially are where I feel like things get super tricky.
Along those lines I've been reconsidering another implementation strategy, which is to always pass arguments via memory instead of by value. In other words, let's say you write:
fn foo(a: u8x32) { ... }
Today we'd generate something along the lines of (LLVM-wise):
define void @foo(<32 x i8>) {
  ...
}
whereas instead what I think we should generate is:
define void @foo(<32 x i8>*) { ; note the *
  ...
}
Or in other words, SIMD values are unconditionally passed through memory between all functions. This would, I think, be much easier to implement and also jibe much more nicely with the implementation of everything else in rustc today.
I've historically been opposed to this approach thinking that it would be bad for performance, but when thinking about it I actually don't think there's going to be that much impact. In general I'm under the impression that SIMD code is primarily about optimizing hot loops, and in these sorts of situations if you have a literal function call that's already killing performance anyway. In that sense we're already inlining everything enough to remove the layer of indirection by storing values on the stack.
If that's true, I actually don't think that if we leave a
AFAIK the main trickiness around this would be that Rust functions would pass all the vector types via memory, but we'd need a way to pass them by value to various intrinsic functions in LLVM.
In general though, what do others think about an always-memory approach?
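A source-level analogue of the always-memory approach can be written by hand today, by taking vector arguments by reference. To be clear about the assumptions: the proposal above is a change inside the compiler's ABI lowering, whereas this sketch only imitates its effect in user code. U8x32 is a hypothetical 32-byte aligned array wrapper, not a real SIMD type, and add_lanes and demo are illustrative names. Because only pointers cross the call boundary, the signature looks the same to callers no matter which target features each side was compiled with.

```rust
/// Hypothetical stand-in for a 256-bit vector of u8 lanes, aligned like one.
#[repr(C, align(32))]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct U8x32(pub [u8; 32]);

/// Lane-wise wrapping addition. Inputs and output go through memory:
/// the function receives pointers, never a vector value in a register.
pub fn add_lanes(a: &U8x32, b: &U8x32, out: &mut U8x32) {
    for i in 0..32 {
        out.0[i] = a.0[i].wrapping_add(b.0[i]);
    }
}

/// Small usage example: 1 + 255 wraps around to 0 in every lane.
pub fn demo() -> U8x32 {
    let a = U8x32([1; 32]);
    let b = U8x32([255; 32]);
    let mut out = U8x32([0; 32]);
    add_lanes(&a, &b, &mut out);
    out
}
```

If add_lanes is inlined into its caller, the optimizer is free to keep the values in registers, which is the performance argument made in the comment above.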
The intrinsics don't have the I think this approach is the easiest out of all possible ones, since all you need to change is: rust/src/librustc_trans/abi.rs Lines 872 to 875 in 247835a There's two ways to do it:
gnzlbg commented Sep 6, 2017 • edited
The following should be discussed as part of an RFC for supporting portable vector types (repr(simd)) but the current behavior is unsound (playground):
Basically, those two objects of type f32x8 have a different layout, so foo and bar have a different ABI / calling convention. This can be introduced without target_feature, by compiling two crates with different --target-cpus and linking them, but target_feature was used here for simplicity.