SIMD groundwork #1199

Merged
merged 25 commits into from Sep 18, 2015

@huonw
Member

huonw commented Jul 10, 2015

@eternaleye

eternaleye commented Jul 10, 2015

X-posting my late comment from the pre-RFC:

What I was saying was a bit more than that.

  1. From my earlier post, these may not need to be intrinsics at all - one-instruction-of-asm!() functions with #[inline(always)] and proper register specifiers on the asm!() can do the job except for the quirky magic structural typing.
  2. The ergonomics of strict types, whether [u32; 4] or Simd4<u32>, really aren't that bad for low-level building blocks that will mostly live behind prettier interfaces.
  3. What problems there are with the ergonomics can be largely resolved with T: Structural<Layout=[u32; 4]> + SimdSafe, where SimdSafe is a marker trait denoting the same things as #[repr(simd)], and possibly added by it.

And yeah, I edited "alignment voodoo" into my post before you mentioned that 😛

Anyway, the result of the above is that one only really needs two changes to the compiler:

  1. #[repr(simd)]
  2. #[lang_item="simd_repr_marker"] (added by #[repr(simd)])

Structural can be done without any help from the compiler, but would benefit a lot from a #[derive]

But as far as benefits, this avoids a large mass of worryingly magical (regarding parameter types) intrinsics being added to the compiler.
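To make the proposal above concrete, here is a minimal sketch of what a library-level `Structural` trait and `SimdSafe` marker could look like. Everything here is hypothetical: the method names `into_layout`/`from_layout`, the `Pixel` type and `add_lanes` are invented for illustration, and the lane-wise loop merely stands in for whatever single instruction a real implementation would emit.

```rust
/// Hypothetical trait: exposes the underlying layout of a nominal type.
pub trait Structural {
    type Layout;
    fn into_layout(self) -> Self::Layout;
    fn from_layout(layout: Self::Layout) -> Self;
}

/// Hypothetical marker trait: the implementor promises the type is suitable
/// for SIMD use (in the proposal it would be added by #[repr(simd)]).
pub unsafe trait SimdSafe {}

/// An illustrative user-defined four-lane type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Pixel(pub [u32; 4]);

impl Structural for Pixel {
    type Layout = [u32; 4];
    fn into_layout(self) -> [u32; 4] {
        self.0
    }
    fn from_layout(layout: [u32; 4]) -> Self {
        Pixel(layout)
    }
}

unsafe impl SimdSafe for Pixel {}

/// Accepts any nominal type whose layout is [u32; 4]; the scalar loop is a
/// stand-in for a real vector add.
pub fn add_lanes<T>(a: T, b: T) -> T
where
    T: Structural<Layout = [u32; 4]> + SimdSafe,
{
    let (x, y) = (a.into_layout(), b.into_layout());
    let mut out = [0u32; 4];
    for i in 0..4 {
        out[i] = x[i].wrapping_add(y[i]);
    }
    T::from_layout(out)
}
```

With this shape, `add_lanes(Pixel([1, 2, 3, 4]), Pixel([10, 20, 30, 40]))` stays fully nominal at the call site: both arguments and the result are `Pixel`, and only the bound mentions the layout.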

@nikomatsakis nikomatsakis added the T-lang label Jul 10, 2015

@alexcrichton

Member

alexcrichton commented Jul 10, 2015

these may not need to be intrinsics at all

Unfortunately if we want to get SIMD on stable Rust this would necessitate a stable asm! macro, and we've got a much stronger story for stabilizing these intrinsics than we do that macro

What problems there are with the ergonomics can be largely resolved with T: Structural<Layout=[u32; 4]> + SimdSafe, where SimdSafe is a marker trait denoting the same things as #[repr(simd)], and possibly added by it.

I believe a core aspect of this RFC is that it's stabilizing the absolute bare minimum of what the compiler needs to support SIMD. If we add in a few language traits and other various types here and there it's more surface area that will have to be stabilized. If this stuff can be built externally in a library, that'd be great! This RFC, however, is just focused on the compiler support.

@eternaleye

eternaleye commented Jul 10, 2015

This RFC, however, is just focused on the compiler support.

Sure! I just think that the way it makes the intrinsics unify types with sufficiently similar layouts under #[repr(simd)] is neither necessary nor advisable, and wanted to lay out a library-based alternative.

Rust currently has a nominal type system rather than a structural one; some subset of intrinsics magically ignoring that feels very questionable to me.

@huonw

Member

huonw commented Jul 10, 2015

some subset of intrinsics magically ignoring that feels very questionable to me.

Hm, maybe I haven't been clear. They're not really ignoring it.

Suppose we have:

#[repr(simd)]
struct A(f64, f64);
#[repr(simd)]
struct B(f64, f64);

extern {
    fn some_simd_intrinsic(x: A);
}

It's not legal to call some_simd_intrinsic(B(0.0, 1.0)) (type error: some_simd_intrinsic was declared to take A, not B). The "structural typing" of intrinsics just means it's valid to also write the above and (if desired) extern { fn some_simd_intrinsic(x: B); } elsewhere. I see it as somewhat similar to how importing C functions works: if some C function takes a struct Foo, we don't require that there's some single canonical type (or structural generic) always passed to every import of that C function, just that the type passed has the right layout.

(That said, it's slightly different, since the compiler doesn't enforce that arguments have the right layout for C functions (has to be done by the programmer), whereas the SIMD intrinsics do have layout-enforcement.)
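The distinction drawn here (nominal at the call site, structural only at the declaration) can be sketched on stable Rust with ordinary functions. This is only an analogy under stated assumptions: `lane_sum_a` and `lane_sum_b` are hypothetical stand-ins for two separate declarations of the same intrinsic, and plain tuple structs stand in for `#[repr(simd)]` types.

```rust
// Two nominally distinct types with the same field structure.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct A(pub f64, pub f64);

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct B(pub f64, pub f64);

// Stand-in for "the intrinsic, declared to take A".
pub fn lane_sum_a(x: A) -> f64 {
    x.0 + x.1
}

// Stand-in for "the same intrinsic, declared elsewhere to take B".
pub fn lane_sum_b(x: B) -> f64 {
    x.0 + x.1
}

pub fn demo() -> (f64, f64) {
    // lane_sum_a(B(0.0, 1.0)); // rejected by the type checker: expected `A`, found `B`
    (lane_sum_a(A(0.0, 1.0)), lane_sum_b(B(2.0, 3.0)))
}
```

Each declaration is checked against its own argument type, so having two declarations never lets a `B` flow into the one that names `A`.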

@eternaleye

eternaleye commented Jul 10, 2015

Ah, that does make rather more sense, and is not the impression I had gotten.

@main--

main-- commented Jul 11, 2015

There would additionally be a small set of cross-platform operations that are either generally efficiently supported everywhere or are extremely useful. These won't necessarily map to a single instruction, but will be shimmed as efficiently as possible.

  • shuffles and extracting/inserting elements
  • comparisons

Lastly, arithmetic and conversions are supported via built-in operators.

The Motivation section mentions how this RFC aims to provide just some ground-work on top of which nice SIMD functionality could be built. While builtin arithmetic, shuffles etc for repr(simd) types is nice and convenient, providing it at this level seems questionable. I think something like this could be accomplished inside the to-be-written SIMD library as well with some operator overloading and the intrinsic functions for basic arithmetic.

The indices iN have to be compile time constants.

I have a bad feeling about this. A regular method call shouldn't require parameters to be compile time constants. Using generics to express this requirement as shown here depends on #1062, but it would be a much cleaner solution.

Out of bounds indices yield unspecified results.

Wouldn't a compile error be nicer here?
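For readers on a current toolchain, the "index as a generic parameter, out of bounds is a compile error" idea can be sketched with const generics, which did not exist when this thread was written. The names `LaneInBounds` and `extract_lane` are hypothetical, and a plain array stands in for a SIMD vector.

```rust
// Helper whose associated constant fails to evaluate when I is out of range.
struct LaneInBounds<const I: usize>;

impl<const I: usize> LaneInBounds<I> {
    const CHECK: () = assert!(I < 4, "lane index out of bounds for a 4-lane vector");
}

/// Extracts lane I, where I must be known at compile time.
pub fn extract_lane<const I: usize>(v: [u32; 4]) -> u32 {
    // Referencing the constant forces the bounds check for each concrete I.
    let () = LaneInBounds::<I>::CHECK;
    v[I]
}
```

`extract_lane::<2>([10, 20, 30, 40])` compiles and yields the third lane, while `extract_lane::<7>(..)` is rejected during compilation. Note that the rejection happens when the generic function is instantiated, so the error is reported from inside the callee rather than neatly at the call site.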


That aside, when reading this RFC I had the same thought as @eternaleye: Why implement compiler magic when these functions could be implemented in plain rust with asm!()? The absence of stable inline asm in a systems programming language is annoying and while this RFC attempts to make up for that by simply importing all C/C++ intrinsics into Rust, this set grows regularly, requiring a compiler update every time when an update of the system LLVM would be sufficient. (Well, it's mostly the maintenance overhead for rust. Why maintain a set of intrinsics when someone else is already doing it?)

Additionally, inline asm allows the programmer to influence things like instruction scheduling and register allocation (within the asm section), in case the compiler is doing a bad job in that regard.

So I'd suggest solving the "Operations" section of the RFC in a way that doesn't require any compiler changes (at least not specifically for SIMD).


I'm unsure about what the repr(simd) introduced here really does. I guess its primary purpose is signaling to the compiler that this struct can live in the SIMD registers and be subject to SIMD operations (like this builtin arithmetic).

People often argue that vectorization is best left to the compiler and as Rust uses LLVM, many simple cases can benefit from great optimizations. But unfortunately optimizers aren't perfect and simply unable to handle sufficiently complex code, so obviously an explicit way for programmers to express how they want code to be vectorized is necessary. Somewhere above, I suggested removing even those basic operations from repr(simd) types and doing them in intrinsics (or preferably inline asm) instead. (Danger ahead: I fear these ideas might require significant LLVM changes and therefore be infeasible) Then we could think further and consider removing repr(simd): The inline asm constraints (or the intrinsics) should ensure that these values stay in the SIMD registers while we're working with them. While we aren't, it's up to the compiler. As a result, the compiler could decide to hold even non-SIMD values inside the SIMD registers which is probably not a big issue as I remember the Intel optimization manual mentioning how spilling to XMM registers can be faster than spilling to memory.


So to summarize, I'm in favor of simplifying/breaking up the "types" and the "operations" sections of the RFC by building on top of much less specific compiler features and pushing as much work as possible into the to-be-written SIMD crate (and moving the "platform detection" section into its own RFC).

This feels appropriate as the RFC's intention was just basic groundwork. Or in other words: I'm arguing that "the absolute bare minimum of what the compiler needs to support SIMD" is zero. The missing parts aren't necessarily SIMD-specific.

@eternaleye

eternaleye commented Jul 11, 2015

@main--

I'm unsure about what the repr(simd) introduced here really does. I guess its primary purpose is signaling to the compiler that this struct can live in the SIMD registers and be subject to SIMD operations (like this builtin arithmetic).

  1. Can live in SIMD registers (though, not sure how it'll handle someone trying to apply it to [u64; 37] or other silliness)
  2. Interior references are forbidden
  3. Tweaks the in-memory layout to match SIMD for the platform
  4. Subject to SIMD operations
  5. Subject to SIMD alignment constraints

@huonw, did I miss anything? ISTR you mentioning it didn't validate things like that they need to be homogeneous, and that such things would be left up to impl'ing some unsafe trait

@pcwalton

Contributor

pcwalton commented Jul 11, 2015

Arithmetic identity-based optimizations apply just as well to SIMD operations as they do to non-SIMD operations. By using the intrinsics as opposed to inline asm we give LLVM the ability to do those optimizations.

@main--

main-- commented Jul 11, 2015

@pcwalton Oh, I didn't know that! Yes, that's a big advantage of the intrinsics then.

@RalfJung

Member

RalfJung commented Jul 11, 2015

Concerning the structural typing when importing the intrinsics: Please be careful that this does not end up allowing people to "peek through" private abstractions of data-types. That would be a horrible mess of a safety issue.

Essentially, such merely structural typing should only be allowed if the module had access to all the fields of the type anyway: Either because they were all public (all the way down), or because the type was defined in the same module.
What's the reason for choosing this unconventional approach to typing, rather than using tuples, or arrays, or lang-items for the Simd* types?

@Aatch

Contributor

Aatch commented Jul 13, 2015

Just a note to all the people that keep mentioning inline assembly: asm! is pretty much a black box for LLVM. You can tell it a decent amount about the contents, but at the end of the day, it's going to have to take a conservative approach to it. LLVM can reason a lot more about regular operations on vectors (arithmetic, equality, shuffling) and intrinsic functions than it can single-instruction inline asm segments.

@huonw

Member

huonw commented Jul 13, 2015

The Motivation section mentions how this RFC aims to provide just some ground-work on top of which nice SIMD functionality could be built. While builtin arithmetic, shuffles etc for repr(simd) types is nice and convenient, providing it at this level seems questionable. I think something like this could be accomplished inside the to-be-written SIMD library as well with some operator overloading and the intrinsic functions for basic arithmetic.

I agree that it isn't totally necessary to actually use the arithmetic operators: we could instead use a generic intrinsic similar to the comparison operators. However, I think it is important we do more than the platform intrinsics: LLVM (and compilers in general) knows more about its internal add instruction than arbitrary platform specific intrinsics, and so may be able to optimise it more aggressively.

For shuffles, the same optimisation argument applies: the compiler can e.g. simplify a sequence of shuffles into a single one. Also, the RFC discusses this. One point in it is the compiler synthesizing an optimal (or close to optimal) sequence of instructions for an arbitrary shuffle, instead of forcing the programmer to think about doing that themselves.
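As a small illustration of why a sequence of shuffles can collapse into one, here is a scalar model in which arrays stand in for vectors. The helpers `permute` and `compose` are hypothetical names, and this is a sketch of the underlying identity, not of what LLVM actually does internally.

```rust
/// Single-input shuffle: output lane k takes input lane idx[k].
pub fn permute<T: Copy, const N: usize>(v: [T; N], idx: [usize; N]) -> [T; N] {
    std::array::from_fn(|k| v[idx[k]])
}

/// Index array equivalent to applying `first` and then `second`.
pub fn compose<const N: usize>(first: [usize; N], second: [usize; N]) -> [usize; N] {
    // After `first`, lane j holds v[first[j]]; `second` then reads lane second[k].
    std::array::from_fn(|k| first[second[k]])
}
```

For any in-range index arrays, `permute(permute(v, first), second)` equals `permute(v, compose(first, second))`, which is the kind of rewrite that is only available when the shuffle indices are visible to the optimiser.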

I have a bad feeling about this. A regular method call shouldn't require parameters to be compile time constants. Using generics to express this requirement as shown here depends on #1062, but it would be a much cleaner solution.

This isn't a regular method call: intrinsics are special in many ways. Note that my solution on #1062 that you link to just calls the intrinsic. This is the low-level API, people generally won't be calling the intrinsics directly.

Wouldn't a compile error be nicer here?

Yes, sort of. However, using the trick mentioned in #1062 would result in very poor error messages, since the shuffle order may be passed through multiple layers of generic function calls possibly in external crates, meaning the out-of-bounds error is generated deep inside code that the programmer didn't write.

Why implement compiler magic when these functions could be implemented in plain rust with asm!()?

As others have said, asm! is a black box, and seriously inhibits optimisations.

Additionally, inline asm allows the programmer to influence things like instruction scheduling and register allocation (within the asm section), in case the compiler is doing a bad job in that regard.

Neither of these applies here: the API is essentially exposing individual CPU instructions, i.e. each asm! block is a single instruction. Hence, there's no scheduling benefit, and none of the asm! blocks would use concrete registers: they'd all be "generic", to let the compiler allocate registers as it sees fit.

These reasons apply if one were, say, writing an entire inner loop as a single asm! block, but they don't apply here.

I'm unsure about what the repr(simd) introduced here really does. I guess its primary purpose is signaling to the compiler that this struct can live in the SIMD registers and be subject to SIMD operations (like this builtin arithmetic).

Yes. repr(simd) changes how a type is represented. E.g. it changes the alignment, imposes element constraints, and even changes its ABI (for function/FFI calls).

Concerning the structural typing when importing the intrinsics: Please be careful that this does not end up allowing people to "peek through" private abstractions of data-types. That would be a horrible mess of a safety issue.

It sort-of does, but in a very very restricted way, that's already possible with transmute.

What's the reason for choosing this unconventional approach to typing, rather than using tuples, or arrays, or lang-items for the Simd* types?

Tuples and arrays don't have the right low-level details.

repr(simd) is essentially acting as a lang-item that can be defined multiple times. All of the actual lang items (i.e. #[lang = "..."]) in the compiler can only be defined once in the entire hierarchy of dependencies of a compilation target, which means we'd either have to allow multiple versions of these lang items, or just disallow linking multiple SIMD crates into a project (e.g. two different crates that define low-level SIMD interfaces, or even just versions 0.1 & 0.3 or 1.0 & 2.3 or ... of a single SIMD crate).
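One of the low-level details mentioned above, alignment, can be observed on stable Rust. Since `#[repr(simd)]` itself is unstable, the sketch below uses `#[repr(C, align(16))]` as a stand-in; the type name `F32x4Model` and the choice of 16 bytes are assumptions for illustration, and the other aspects (element constraints, calling ABI) are not modelled.

```rust
/// Stand-in for a 128-bit SIMD vector: same size as [f32; 4], stricter alignment.
#[repr(C, align(16))]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct F32x4Model(pub [f32; 4]);

/// Returns (alignment, size) for a plain array and for the aligned wrapper.
pub fn layout_report() -> ((usize, usize), (usize, usize)) {
    (
        (
            std::mem::align_of::<[f32; 4]>(),
            std::mem::size_of::<[f32; 4]>(),
        ),
        (
            std::mem::align_of::<F32x4Model>(),
            std::mem::size_of::<F32x4Model>(),
        ),
    )
}
```

Both types occupy 16 bytes, but the plain array only carries the alignment of a single `f32`, which is why an array or tuple cannot simply be reinterpreted as a vector type.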

@RalfJung

Member

RalfJung commented Jul 13, 2015

It sort-of does, but in a very very restricted way, that's already possible with transmute.

Transmute requires unsafe. It shouldn't be possible for safe code to violate abstraction boundaries. What you are proposing (if I follow your RFC correctly) is essentially that #[repr(simd)] implies that all fields are public, but the programmer doesn't have to write pub.

The restriction that the type used for the intrinsic has to be defined in the same module shouldn't be a problem for the implementations you envision (with some crate(s) taking care of providing a decent abstraction), right?

@huonw

Member

huonw commented Jul 13, 2015

Hm, I misunderstood what you were talking about. I'm unsure what the problematic situation you're envisioning could be. Is it something like:

Crate foo defines a SIMD type Simd; crate bar depends on foo and imports the Simd type. bar then declares an intrinsic via extern, like extern { fn some_simd_intrinsic(x: Simd); }... and then something bad happens? NB. the only way to call intrinsics is with unsafe.

(I'm not against the privacy restriction, I'm just trying to understand the motivation more concretely.)

@glaebhoerl

Contributor

glaebhoerl commented Jul 13, 2015

repr(simd) is essentially acting as a lang-item that can be defined multiple times.

Is a lang item the best intuition here? Wouldn't a closer analogy be repr(C)? In both cases the semantic content of the type is (mostly) unaffected and you're just specifying its underlying representation, which is mainly (or only) relevant at the ABI, rather than the API level.

The repr(simd) may not enforce that any trait bounds exists/does the right thing at the type checking level for generic repr(simd) types. As such, it will be possible to get the code-generator to error out (ala the old transmute size errors),

Would it be possible to just make this best-effort, and fall back to laying the type out normally if it doesn't meet the SIMD requirements (perhaps with a warning)? That seems cleaner than implicit requirements at the code generator level, which feels like a contract or layering violation (if it passes the typechecker, it should compile), and perhaps more in the spirit of #repr.

It is illegal to take an internal reference to the fields of a repr(simd) type, because the representation of booleans may require modification, so that booleans are bit-packed.

I wonder if the fact that the borrow checker enforces limits on observability wouldn't actually let us support the interior reference-taking, just in a less efficient way, by first copying the field out onto the stack when taking a shared & reference, and in the case of &mut, also copying it back when it goes out of scope. (Off the top of my head, the potential complication that comes to mind here is generic code - copying the value back is effectively a Drop impl for the &mut. But it does seem like it could actually be implemented in precisely that way... at least, I can't immediately think of why not.)

In this way it would be even more truly the case that #repr has no effect on the semantics, only on the representation and performance characteristics of the given type. (I had a similar idea, earlier, here.)
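The copy-out/copy-back idea can be sketched in library code with a guard whose `Drop` impl writes the value back. This is a hand-written model under stated assumptions: `Packed`, `LaneMut` and `lane_mut` are hypothetical names, and a real compiler-level version would have to do this implicitly for `&mut` borrows of fields.

```rust
use std::ops::{Deref, DerefMut};

/// Stand-in for a type whose fields cannot be borrowed in place.
pub struct Packed([u32; 4]);

/// Guard holding a stack copy of one lane; the copy is written back on drop.
pub struct LaneMut<'a> {
    owner: &'a mut Packed,
    index: usize,
    copy: u32,
}

impl Packed {
    pub fn lane_mut(&mut self, index: usize) -> LaneMut<'_> {
        let copy = self.0[index]; // copy the lane out onto the stack
        LaneMut { owner: self, index, copy }
    }

    pub fn lanes(&self) -> [u32; 4] {
        self.0
    }
}

impl Deref for LaneMut<'_> {
    type Target = u32;
    fn deref(&self) -> &u32 {
        &self.copy
    }
}

impl DerefMut for LaneMut<'_> {
    fn deref_mut(&mut self) -> &mut u32 {
        &mut self.copy
    }
}

impl Drop for LaneMut<'_> {
    fn drop(&mut self) {
        // Copy the (possibly modified) value back when the borrow ends.
        self.owner.0[self.index] = self.copy;
    }
}

pub fn demo() -> [u32; 4] {
    let mut p = Packed([1, 2, 3, 4]);
    {
        let mut lane = p.lane_mut(2);
        *lane += 10;
    } // guard dropped here, lane 2 is written back
    p.lanes()
}
```

The borrow checker guarantees that `p` cannot be observed while the guard is alive, which is what makes the temporary copy indistinguishable from an in-place reference in this model.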

Any type marked repr(simd) automatically has the +, - and * operators work. The / operator works for floating point, and the << and >> ones work for integers.

I might be more comfortable with this if you had to explicitly write derive(Add), and so on, to get the desired operations, even if that in turn just ended up calling out to appropriate compiler magic. On the one hand, it does seem logical that "why would you repr(simd) if not to get the SIMD operations", but on the more important-seeming hand, I think there should be a separation of concerns, and #repr should really only affect the representation (as far as possible).

@RalfJung

Member

RalfJung commented Jul 13, 2015

NB. the only way to call intrinsics is with unsafe.

This essentially means it's much less of an issue than I thought.

Hm, I misunderstood what you were talking about. I'm unsure what the problematic situation you're envisioning could be.

Which part is unclear - whether these rules allow code to get around the restrictions that usually apply to private fields, or whether getting access to private fields is an issue? The answer to the latter is that it violates parametricity - for now I'll just assume that as accepted. Please tell me if I should elaborate on that. Unsafe code (through transmute) can violate parametricity anyway, but I would still prefer that no additional violations be introduced.

Regarding the first part, let me try to come up with some examples. I assume "A::Simd" is some other crate's Simd type, with all fields private, and B::Simd is our own Simd.
If we have fn x86_mm_add_epi16(a: Simd8<i16>, b: Simd8<i16>) -> Simd8<i16> and the types all just match structurally, I could convert any B::Simd to an A::Simd by adding a 0, and choosing the argument types to be B::Simd and the return type to be A::Simd. Similarly, I can convert A::Simd to B::Simd. This gives me full access to all the private fields.
If we have fn simd_shuffle2<T, Elem>(v: T, w: T, i0: u32, i1: u32) -> Simd2<Elem>, I can choose T to be A::Simd and the return type to be B::Simd and convert from A::Simd to B::Simd with the appropriate options for the shuffles - and back, with a similar trick.
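The first scenario can be modelled on stable Rust as follows. This is only a sketch under stated assumptions: `mislabelled_add` is a hypothetical stand-in for an intrinsic declared with argument type `a::Simd` and return type `b::Simd`, and it uses `transmute` internally to mimic a check that looks at layout alone.

```rust
mod a {
    /// Eight i16 lanes, with the field private to this module.
    #[repr(C)]
    #[derive(Clone, Copy)]
    pub struct Simd([i16; 8]);

    impl Simd {
        pub fn secret() -> Simd {
            Simd([1, 2, 3, 4, 5, 6, 7, 8])
        }
        pub fn zero() -> Simd {
            Simd([0; 8])
        }
    }
}

mod b {
    /// Same layout, but the field is public.
    #[repr(C)]
    #[derive(Clone, Copy, Debug, PartialEq)]
    pub struct Simd(pub [i16; 8]);
}

/// Stand-in for an add intrinsic declared as (a::Simd, a::Simd) -> b::Simd.
unsafe fn mislabelled_add(x: a::Simd, y: a::Simd) -> b::Simd {
    // Layout-only view of the arguments, as a structural check would permit.
    let xs: [i16; 8] = unsafe { std::mem::transmute(x) };
    let ys: [i16; 8] = unsafe { std::mem::transmute(y) };
    let mut out = [0i16; 8];
    for i in 0..8 {
        out[i] = xs[i].wrapping_add(ys[i]);
    }
    b::Simd(out)
}

/// Adding zero turns the "add" into a conversion that exposes a's private lanes.
pub fn leak() -> [i16; 8] {
    let leaked = unsafe { mislabelled_add(a::Simd::secret(), a::Simd::zero()) };
    leaked.0
}
```

Requiring the argument and return types of a single declaration to be nominally equal rules out exactly this kind of mixed signature.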

@huonw

Member

huonw commented Jul 13, 2015

Ah, I think you might be working with the same misunderstanding as @eternaleye. It's not that the intrinsics can be called with any type that matches structurally; it's just that they can be declared with any type that matches structurally. See #1199 (comment).

(I 100% agree that being able to write a transmute via SIMD intrinsics would be unfortunate.)

@RalfJung

Member

RalfJung commented Jul 13, 2015

I had that misunderstanding at first, but the post above was written without that assumption.

The first one, regarding add, can't I declare this with fn x86_mm_add_epi16(a: A::Simd, b: A::Simd) -> B::Simd?
And the second one, with shuffle, is actually explicitly declared generically, so it can be used with any T, right? Otherwise, why the distinction between explicit generics for shuffle, and implicit for abs?

@huonw

Member

huonw commented Jul 13, 2015

Is a lang item the best intuition here? Wouldn't a closer analogy be repr(C)? In both cases the semantic content of the type is (mostly) unaffected and you're just specifying its underlying representation, which is mostly relevant at the ABI, rather than the API level.

I agree that repr(C) is probably closer, however I was responding to a comment talking about lang-items. :)

Would it be possible to just make this best-effort, and fall back to laying the type out normally if it doesn't meet the SIMD requirements (perhaps with a warning)? That seems cleaner than implicit requirements at the code generator level, which feels like a contract violation (if it passes the typechecker, it should compile), and perhaps more in the spirit of #repr.

I think we can relax this in future if we find the RFC doesn't work well in practice. (It's part of the reason I proposed a hard error.)

I wonder if the fact that the borrow checker enforces limits on observability wouldn't actually let us support the interior reference-taking, just in a less efficient way, by first copying the field out onto the stack when taking a shared & reference, and in the case of &mut, also copying it back when it goes out of scope. (Off the top of my head, the potential complication that comes to mind here is generic code - copying the value back is effectively a Drop impl for the &mut. But it does seem like it could actually be implemented in precisely that way... at least, I can't immediately think of why not.)

Interesting idea, however it seems relatively complicated, and not worth it for SIMD: efficient SIMD code won't be handling/mutating individual elements like this very much.

I might be more comfortable with this if you had to explicitly write derive(Add), and so on, to get the desired operations, even if that in turn just ended up calling out to appropriate compiler magic. On the one hand, it does seem logical that "why would you repr(simd) if not to get the SIMD operations", but on the more important-seeming hand, I think there should be a separation of concerns, and #repr should really only affect the representation (as far as possible).

Another alternative is just providing arithmetic intrinsics.
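A sketch of how a library could layer operators over an arithmetic intrinsic, so that `repr` only affects representation and the operator is an explicit opt-in. The function `simd_add_model` is a hypothetical scalar stand-in for such an intrinsic, and `F32x4` is an illustrative wrapper type, not an API from the RFC.

```rust
use std::ops::Add;

/// Illustrative four-lane vector type defined by a library.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct F32x4(pub [f32; 4]);

/// Scalar stand-in for a lane-wise add intrinsic provided by the compiler.
pub fn simd_add_model(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    [a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]]
}

/// The library, not the repr attribute, decides that `+` is available.
impl Add for F32x4 {
    type Output = F32x4;
    fn add(self, rhs: F32x4) -> F32x4 {
        F32x4(simd_add_model(self.0, rhs.0))
    }
}
```

With this split, `F32x4([1.0, 2.0, 3.0, 4.0]) + F32x4([0.5; 4])` works because of the explicit `impl Add`, and a type that omits the impl simply has no `+`.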

The first one, regarding add, can't I declare this with fn x86_mm_add_epi16(a: A::Simd, b: A::Simd) -> B::Simd?

Oh, I see. It seems sensible to disallow it. I.e. have nominal equality constraints within a definition.

And for the shuffle, there's an implicit checked-at-code-gen link between T and Elem (i.e. Elem is the actual element type of T).

Shuffle is generic because it can be used with literally any SIMD type, i.e. it's not restricted to some subset of types with the same structure.

huonw added some commits Aug 12, 2015

  • shuffles don't rely on generic types for return values.
    This has less type safety, but doesn't require generic simd types to exist:

        #[repr(simd)] struct Simd2<T>(T, T);

  • Intrinsics-for-operations is now the RFC, not an alternative.
    Also, the comparison comment no longer makes sense.
  • Only invalid to *call* intrinsics on bad platforms.
    It's valid to `extern` them, though.
  • Internal references are legal.
    Automatic crazy boolean bit-packing is crazy.

@huonw

Member

huonw commented Aug 14, 2015

Updated. Highlights:

  • removal of struct flattening and the internal-reference restriction
  • shuffles use @pnkfelix's suggestion for an array of indices (so much nicer!)
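A scalar model of a two-input shuffle driven by an array of indices may help picture the updated design. It assumes that indices select from the concatenation of the two inputs, uses arrays in place of vector types, and reports out-of-range indices as `None` purely as a choice for this sketch; `shuffle_model` is a hypothetical name, not an intrinsic from the RFC.

```rust
/// Picks N lanes from the concatenation of `v` and `w` (each M lanes wide).
pub fn shuffle_model<T: Copy, const M: usize, const N: usize>(
    v: [T; M],
    w: [T; M],
    idx: [u32; N],
) -> Option<[T; N]> {
    // Reject any index that falls outside the 2 * M available lanes.
    if idx.iter().any(|&i| i as usize >= 2 * M) {
        return None;
    }
    Some(std::array::from_fn(|k| {
        let i = idx[k] as usize;
        if i < M {
            v[i]
        } else {
            w[i - M]
        }
    }))
}
```

For example, interleaving the low halves of two four-lane inputs is `shuffle_model([1, 2, 3, 4], [5, 6, 7, 8], [0, 4, 1, 5])`. Because the indices travel as one array, the output width follows from the array length rather than from a family of `shuffle2`, `shuffle4`, ... signatures.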

bors added a commit to rust-lang/rust that referenced this pull request Aug 15, 2015

Auto merge of #27169 - huonw:simd, r=alexcrichton
This implements rust-lang/rfcs#1199 (except for doing all the platform intrinsics).

Things remaining for SIMD (not necessarily in this PR):

- [x] I (@huonw) am signed up to ensure the compiler matches the RFC, when it lands
- [x] the platform specific intrinsics aren't properly type checked at the moment (LLVM will throw a "random" assertion)
- [ ] there's a lot of useful intrinsics that are missing, including whole platforms (mips, powerpc)
- [ ] the target-feature `cfg` detection/adding is not so great at the moment
- [x] I think the platform specific intrinsics should go in their own `extern` ABI (i.e. not `"rust-intrinsic"`)

(I'm adjusting the RFC to reflect the latter.)

I think it would be very nice for this to land without requiring the RFC to land first, because of the first point, and because this is the only way for any further work to happen/be experimented with, without requiring people to build/install/multirust a compiler from a custom branch.

r? @alexcrichton

bors added a commit to rust-lang/rust that referenced this pull request Aug 17, 2015

Auto merge of #27169 - huonw:simd, r=<try>

bors added a commit to rust-lang/rust that referenced this pull request Aug 17, 2015

Auto merge of #27169 - huonw:simd, r=alexcrichton

shuffles are exposed generally: intrinsics that represent arbitrary
shuffles.
This may violate the "one instruction per instrinsic" principal

@ranma42

ranma42 Aug 18, 2015

Contributor

"principal" seems to be a typo (for "principle")

@nikomatsakis

Contributor

nikomatsakis commented Sep 4, 2015

Hear ye, hear ye. This RFC is entering final comment period.

@glaebhoerl

Contributor

glaebhoerl commented Sep 5, 2015

What's the basic plan with respect to stabilization of these features? As long as the idea is just to get these into the unstable compiler, so that the implementation PR can land and we can start experimenting more seriously with higher-level APIs, further evolving the lower-level ones, and whatever else, that's completely cool. But the number of places (there's a few) where we're currently forced to say "enforcing this in the type system is difficult-to-impossible right now, so let's punt it to the backend" still bothers me if/when it's something we'd be committing to support as a stable feature forever (or until 2.0, anyways).

@nikomatsakis

Contributor

nikomatsakis commented Sep 10, 2015

@glaebhoerl I think there is no plan to stabilize until experience has been gained, but of course the further we go down this road, the less likely we'll back up and start from another.

@glaebhoerl

Contributor

glaebhoerl commented Sep 10, 2015

Basically, I suspect the approach I'd prefer is that once we have this infrastructure, and we've used it to gain some experience and to figure out the best way to formulate higher-level "type-safe" (i.e. without checks deferred to codegen) abstractions for SIMD (various traits, etc.), we should go ahead and stabilize those.

@nikomatsakis

Contributor

nikomatsakis commented Sep 11, 2015

To be clear, I agree with you Gabor -- I'd like to gain experience in what we want to do, first, and then come back and see if we can find the most elegant way to do it. (Not that this approach is unacceptable, it's quite elegant in its own way.) I'm just acknowledging the power of incumbency. :)

@nikomatsakis

Contributor

nikomatsakis commented Sep 18, 2015

Huzzah! The language subteam has decided to accept this RFC.

@nikomatsakis nikomatsakis merged commit 67fea6e into rust-lang:master Sep 18, 2015

@nikomatsakis

Contributor

nikomatsakis commented Sep 18, 2015

Tracking issue is rust-lang/rust#27731
