SIMD groundwork part 1 #27169
rust-highfive
assigned
alexcrichton
Jul 20, 2015
eefriedman
reviewed
Jul 20, 2015
| let llet = in_memory_type_of(cx, t.simd_type(cx.tcx())); | ||
| let elem_ty = match t.simd_machine_type(cx.tcx()) { | ||
| Some(e) => e, | ||
| None => cx.sess().fatal("monomorphising SIMD type to an incompatible element") |
eefriedman
Jul 20, 2015
Contributor
This is ugly... is allowing generic SIMD types actually important in practice? It seems like you could accomplish most of what is necessary with traits and maybe a few macros.
huonw
Jul 20, 2015
Author
Member
I agree it isn't so great, but note that in practice type safety is ensured via traits, e.g. struct Simd4<T: SimdElem>(T, T, T, T);.
This is the major difference from our current #[simd] scheme. Being generic is extremely useful: without it there is extreme code duplication, and the compiler/library can't easily "synthesize" arbitrary types for helpers (which forces that duplication). For example, fn get_high<T: SimdElem>(Simd4<T>) -> Simd2<T> works with any T, but would otherwise have to be written out manually for every concrete Tx4 type. (It'd be good to keep this sort of "high level" discussion on rust-lang/rfcs#1199 .)
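To make the point about generics concrete, here is a minimal sketch in plain Rust (no #[simd] attribute, so these are ordinary tuple structs, not real vector types). The SimdElem trait here is a stand-in for the bound mentioned above, and treating lanes 2 and 3 as the "high" half is an assumption made for illustration.

```rust
/// Marker trait for types allowed as lanes (hypothetical stand-in for the
/// `SimdElem` bound discussed in the comment).
pub trait SimdElem: Copy {}
impl SimdElem for f32 {}
impl SimdElem for i32 {}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Simd2<T: SimdElem>(pub T, pub T);

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Simd4<T: SimdElem>(pub T, pub T, pub T, pub T);

/// Returns the upper two lanes (assumed here to be lanes 2 and 3).
/// Written once, it works for every lane type implementing `SimdElem`.
pub fn get_high<T: SimdElem>(v: Simd4<T>) -> Simd2<T> {
    Simd2(v.2, v.3)
}
```

With this, get_high(Simd4(1i32, 2, 3, 4)) and get_high(Simd4(1.0f32, 2.0, 3.0, 4.0)) both type-check against the single definition, whereas a non-generic scheme would need one function per concrete vector type.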
eefriedman
Jul 20, 2015
Contributor
Still not sure this is a great... but okay, I'll move the discussion.
huonw
force-pushed the
huonw:simd
branch
2 times, most recently
from
68fb467
to
88cdfbd
Jul 20, 2015
alexcrichton
reviewed
Jul 21, 2015
|
|
||
| fn features_contain(sess: &Session, s: &str) -> bool { | ||
| sess.target.target.options.features.contains(s) || | ||
| sess.opts.cg.target_feature.contains(s) |
alexcrichton
Jul 21, 2015
Member
Isn't querying LLVM the "most robust" thing to do here? I guess all of these are disabled by default, though?
huonw
Jul 30, 2015
Author
Member
They're all disabled by default, yeah; and querying LLVM doesn't seem overly possible, unfortunately. You have a better idea about LLVM's structure so maybe you can wrangle something, though.
alexcrichton
Jul 31, 2015
Member
I believe you can get ahold of a TargetMachineRef, get ahold of the MCSubtargetInfo, get the list of features, and then test for various features being available.
It's probably not super critical that this happens immediately though.
alexcrichton
Aug 14, 2015
Member
Could you add a comment to this function with the results of this investigation? Basically something along the lines of "we'd love to do X but we couldn't do it because of Y, hence we're doing the 'poor man's' version Z".
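For readers following this thread, the string-based check under discussion can be sketched in isolation as below. The FeatureStrings struct is a hypothetical stand-in for the two session fields the diff reads (the target's default features and the -C target-feature value); it is not the compiler's real type.

```rust
/// Hypothetical stand-in for the two feature strings consulted in the diff.
pub struct FeatureStrings {
    /// Features enabled by the target specification.
    pub target_default: String,
    /// Features passed on the command line.
    pub command_line: String,
}

/// Same shape as the `features_contain` helper in the diff: a plain
/// substring test on both strings, without asking LLVM anything.
pub fn features_contain(f: &FeatureStrings, s: &str) -> bool {
    f.target_default.contains(s) || f.command_line.contains(s)
}
```

Note that str::contains is a substring test, so with a command-line value of "+avx2" this sketch answers true for both "avx2" and "avx". Whether that matters for the real helper is not settled in this thread.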
alexcrichton
reviewed
Jul 21, 2015
| Float(u8), | ||
| Pointer(Box<Type>), | ||
| Vector(Box<Type>, u8), | ||
| } |
alexcrichton
Jul 21, 2015
Member
If this enum is just translated straight to a Ty below, perhaps this could just be a Ty in-memory and there could be an initialization pass for loading intrinsics into the tcx?
alexcrichton
Jul 31, 2015
Member
Just in the sense of instead of building up the custom Type here for signatures of all the functions and then pushing them into the tcx (with a translation layer), instead just push all the types into the tcx directly.
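As a rough illustration of the "custom Type plus translation layer" arrangement being questioned, the sketch below reuses the three variants visible in the diff and walks them recursively. The Integer variant, the describe function, and the textual output format are all assumptions for the example; the real code translates to the compiler's Ty, not to strings.

```rust
/// Small description language for intrinsic signatures. `Float`, `Pointer`
/// and `Vector` appear in the diff; `Integer` is assumed for illustration.
pub enum Type {
    Integer(u8),
    Float(u8),
    Pointer(Box<Type>),
    Vector(Box<Type>, u8),
}

/// Hypothetical translation layer: turns a `Type` into a readable name.
/// The actual PR maps each variant to a compiler type instead.
pub fn describe(t: &Type) -> String {
    match t {
        Type::Integer(bits) => format!("i{}", bits),
        Type::Float(bits) => format!("f{}", bits),
        Type::Pointer(inner) => format!("*const {}", describe(inner)),
        Type::Vector(elem, len) => format!("{}x{}", describe(elem), len),
    }
}
```

The suggestion in the comment amounts to skipping this intermediate enum and constructing the final types directly, so that no such walk is needed.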
alexcrichton
reviewed
Jul 21, 2015
| "vminq_f32" => p!("fmin.v4f32", (f32x4, f32x4) -> f32x4), | ||
| "vminq_f64" => p!("fmin.v2f64", (f64x2, f64x2) -> f64x2), | ||
| _ => return None, | ||
| }) |
alexcrichton
Jul 21, 2015
Member
These blocks seem pretty similar to the pre-existing "intrinsic declaration" location. Perhaps these paths could be unified?
Also, do you know if all of these intrinsics are available on all of llvm 3.5+? I guess in general do you know if LLVM is adding new intrinsics basically every day, or are they kinda frozen now?
huonw
Jul 30, 2015
Author
Member
My (weak) intention is to somehow have this autogenerated, and there will be a lot (thousands), so having it separated seems better.
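The table being discussed has roughly the following shape: a match from the Rust-side intrinsic name to an LLVM name fragment and a signature. This is a simplified sketch; the Intrinsic struct with string-typed fields is hypothetical, and only the two entries visible in the diff are included.

```rust
/// Simplified description of one platform intrinsic (hypothetical layout;
/// types are plain strings here to keep the sketch self-contained).
#[derive(Debug, PartialEq)]
pub struct Intrinsic {
    /// LLVM name fragment exactly as written in the diff.
    pub llvm_suffix: &'static str,
    pub inputs: &'static [&'static str],
    pub output: &'static str,
}

/// Looks up an intrinsic by name, returning `None` for unknown names.
pub fn find(name: &str) -> Option<Intrinsic> {
    Some(match name {
        "vminq_f32" => Intrinsic {
            llvm_suffix: "fmin.v4f32",
            inputs: &["f32x4", "f32x4"],
            output: "f32x4",
        },
        "vminq_f64" => Intrinsic {
            llvm_suffix: "fmin.v2f64",
            inputs: &["f64x2", "f64x2"],
            output: "f64x2",
        },
        _ => return None,
    })
}
```

Because each entry is pure data, a table like this lends itself to being generated by a script, which fits the stated intention to autogenerate thousands of them.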
alexcrichton
reviewed
Jul 21, 2015
| Cmp::Ge => llvm::RealOGE, | ||
| }; | ||
| SExt(bcx, FCmp(bcx, op, llargs[0], llargs[1], call_debug_location), llret_ty) | ||
| }; |
huonw
Jul 30, 2015
Author
Member
Theoretically, but compare_simd_types is probably the more appropriate variant.
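For context on the diff excerpt above, the FCmp-then-SExt sequence can be modelled lane by lane in plain Rust. This is only a scalar sketch of the semantics (an ordered floating-point comparison whose 1-bit result is sign-extended to a full-width integer lane); the function name and the fixed 4-lane f32/i32 shape are assumptions.

```rust
/// Scalar model of a lane-wise "greater than or equal" SIMD comparison.
/// Each output lane is all ones (-1) when the comparison holds, else 0.
pub fn simd_ge_mask(a: [f32; 4], b: [f32; 4]) -> [i32; 4] {
    let mut out = [0i32; 4];
    for i in 0..4 {
        // Ordered comparison: false if either operand is NaN.
        let bit = a[i] >= b[i];
        // Sign-extending a 1-bit `true` yields all ones, i.e. -1.
        out[i] = if bit { -1 } else { 0 };
    }
    out
}
```

For example, comparing [1.0, 2.0, 3.0, NaN] against [1.0, 3.0, 2.0, 0.0] gives [-1, 0, -1, 0].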
alexcrichton
reviewed
Jul 21, 2015
| name if name.starts_with("x86_") || | ||
| name.starts_with("arm_") || | ||
| name.starts_with("aarch64_") => { | ||
| // FIXME: skip checking these for now |
alexcrichton
Jul 21, 2015
Member
Would this be easier to wire up if the Ty representation was used already for the intrinsic definitions?
|
Nice work @huonw!
I agree, not having the namespace of "simd_" made me a tad uneasy.
I would personally be ok with this, but I'm curious to hear what others think as well? Also, do you think the RFC basically has "broad consensus" modulo minor details by this point? If there are still some somewhat contentious decisions here and there it may be worth waiting a little longer before landing this, but if we're all in agreement pretty much then any future minor updates can be easily reflected here (as everything is unstable anyway). |
huonw
force-pushed the
huonw:simd
branch
from
88cdfbd
to
4de6fab
Jul 29, 2015
huonw
force-pushed the
huonw:simd
branch
3 times, most recently
from
fe76451
to
16c5c37
Aug 6, 2015
|
Ok, rebased and updated to be better, including
(I'm updating the RFC to ensure it matches the things I've learned while implementing, as we speak.) |
cesarb
added a commit
to cesarb/blake2-rfc
that referenced
this pull request
Aug 7, 2015
cesarb
added a commit
to cesarb/blake2-rfc
that referenced
this pull request
Aug 8, 2015
alexcrichton
reviewed
Aug 11, 2015
| @@ -897,7 +906,41 @@ pub fn trans_intrinsic_call<'a, 'blk, 'tcx>(mut bcx: Block<'blk, 'tcx>, | |||
|
|
|||
| } | |||
|
|
|||
| (_, _) => ccx.sess().span_bug(foreign_item.span, "unknown intrinsic") | |||
| (_, _) => { | |||
| match Intrinsic::find(tcx, &name) { | |||
alexcrichton
Aug 11, 2015
Member
Could this de-indent a level via:
let intr = match ... {
None => ccx.sess().span_bug(...),
Some(intr) => intr,
};
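The suggested shape works because the None arm diverges, so the match can be bound with let and the rest of the function stays at the outer indentation level. A self-contained sketch, with a hypothetical lookup table and panic! standing in for span_bug:

```rust
/// Hypothetical lookup standing in for `Intrinsic::find`; the names and
/// ids are made up for the example.
fn lookup(name: &str) -> Option<u32> {
    match name {
        "simd_add" => Some(1),
        "simd_eq" => Some(2),
        _ => None,
    }
}

pub fn translate(name: &str) -> u32 {
    // The `None` arm never returns, so `id` has type `u32` and the code
    // below needs no extra nesting.
    let id = match lookup(name) {
        None => panic!("unknown intrinsic: {}", name),
        Some(id) => id,
    };
    // Further processing continues here, one level less indented.
    id * 10
}
```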
|
Looking good to me! Just a minor nit so far |
huonw
force-pushed the
huonw:simd
branch
from
16c5c37
to
b05b2a4
Aug 13, 2015
|
Updated (will rebase in a bit), includes addressing the nit, adding a pile of platform intrinsic definitions (gets much of x86-64 from SSE to AVX2 and ARM NEON), and adding tests. |
huonw
force-pushed the
huonw:simd
branch
from
b05b2a4
to
d3ba568
Aug 13, 2015
|
(There's still some more tests I want to add, particularly run-pass tests making sure the various generic intrinsics behave sanely.) |
cesarb
reviewed
Aug 13, 2015
| @@ -223,7 +224,14 @@ pub fn sizing_type_of<'a, 'tcx>(cx: &CrateContext<'a, 'tcx>, t: Ty<'tcx>) -> Typ | |||
|
|
|||
| ty::TyStruct(..) => { | |||
| if t.is_simd(cx.tcx()) { | |||
| let llet = type_of(cx, t.simd_type(cx.tcx())); | |||
| let e = t.simd_type(cx.tcx()); | |||
| println!("sizing_type_of: simd: {}", e); | |||
cesarb
Aug 13, 2015
Contributor
This seems to be a stray debugging println!, which is hit while compiling libcore.
cesarb
reviewed
Aug 13, 2015
| @@ -405,7 +417,14 @@ pub fn in_memory_type_of<'a, 'tcx>(cx: &CrateContext<'a, 'tcx>, t: Ty<'tcx>) -> | |||
| } | |||
| ty::TyStruct(did, ref substs) => { | |||
| if t.is_simd(cx.tcx()) { | |||
| let llet = in_memory_type_of(cx, t.simd_type(cx.tcx())); | |||
| let e = t.simd_type(cx.tcx()); | |||
| println!("in_memory_type_of: simd: {}", e); | |||
cesarb
Aug 13, 2015
Contributor
This seems to be a stray debugging println!, which is hit while compiling libcore.
huonw
added some commits
Aug 11, 2015
huonw
force-pushed the
huonw:simd
branch
from
3d5cb38
to
4c92357
Aug 17, 2015
huonw
added some commits
Aug 14, 2015
huonw
force-pushed the
huonw:simd
branch
from
4c92357
to
02e9734
Aug 17, 2015
|
@bors r=alexcrichton |
huonw commented Jul 20, 2015
This implements rust-lang/rfcs#1199 (except for doing all the platform intrinsics).
Things remaining for SIMD (not necessarily in this PR):
- cfg detection/adding is not so great at the moment
- extern ABI (i.e. not "rust-intrinsic")
(I'm adjusting the RFC to reflect the latter.)
I think it would be very nice for this to land without requiring the RFC to land first, because of the first point, and because this is the only way for any further work to happen/be experimented with, without requiring people to build/install/multirust a compiler from a custom branch.
r? @alexcrichton