Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added all x86 SIMD codes. #37068

Closed
wants to merge 5 commits into from

Conversation

Projects
None yet
6 participants
@valarauca
Copy link

valarauca commented Oct 10, 2016

original pull request, rebased and resubmitted

This patch adds all x86_86 platform intrinsic. The names are compliant with those used in the GCC, ICC, and MSVC. The x86.rs file is generated via a rust program.

A ~4 months ago @sanxiyn request I add them to this commit when I got around to a submitting.

I realize this is directly modifying a generated file. But this method allows for automated updates without hand modifying writing JSON. I can extend decode-simd to include the MIPS, Aarch, Arm, POWER8, (and even NVVC+AMDGPU) platforms if that is needed/wanted.

Other commenters cc: @rkruppe @retep998

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented Oct 10, 2016

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @arielb1 (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@valarauca

This comment has been minimized.

Copy link
Author

valarauca commented Oct 10, 2016

So the current failure is one the tests. Test re-writing will be a large part of this patch, but before I start that I'd like the community to weigh on if this solution is worth adopting.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Oct 10, 2016

Thanks for the PR! Would it be possible to update the JSON as well to then regenerate these files? (just to match what we've previously done here)

@valarauca

This comment has been minimized.

Copy link
Author

valarauca commented Oct 10, 2016

This is where the issue arises. I don't know how to bulk generate the x86.rs from the pythonscript + json files. As far as I can tell you get 1 x86.rs PER json file, then hand merge them(?). If we continue the pattern of 1 json file per feature set, this patch will require around a dozen new json files. So I'd like that clarified.

Really my main goal was to completely avoid the json files, or even replace them. They're incredibly frustrating to work with in practice. The main goal of decode-simd was to automate the generation of x86.rs from mining compiler projects. Which is easier to keep up to date then hand editing json.

@sanxiyn

This comment has been minimized.

Copy link
Member

sanxiyn commented Oct 11, 2016

I think we should give up on JSON, and that is what this PR is about.

@valarauca

This comment has been minimized.

Copy link
Author

valarauca commented Oct 12, 2016

@sanxiyn should I move forward and start extending decode SIMD to Aarch64, Arm, and MIPS?

As this isn't a total replacement for the Json/Python system I understand the hesitation with accepting it.

@alexcrichton For AArch+Arm should I attempt to persevere the current names/symbols? Or should I not be concerned if they're changed?

@alexcrichton alexcrichton assigned alexcrichton and unassigned arielb1 Oct 14, 2016

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Oct 14, 2016

@valarauca ah ok, interesting! Sorry but I was unaware about that being the point of this PR.

Replacing the JSON files sounds good to me so long as we do it in a methodic fashion. Would it be possible to jettison them entirely and replace them with your script you've got? It'd be great to prove this out with the other platforms as well!

@valarauca

This comment has been minimized.

Copy link
Author

valarauca commented Oct 15, 2016

Okay so for writing tests. What is the difference between compile-fail, run-make, and run-pass? Because I'll need to edit these files.

For a primary test to prove the SIMD functionality should the script also have a feature to create a separate crate with a build.rs directive for linking against the GCC to prove out it's functionality? This'll likely require some fiddling and QEMU support on the end tester.

@valarauca

This comment has been minimized.

Copy link
Author

valarauca commented Oct 16, 2016

I removed the deny(warnings) take. It caused a TON of headaches in testing as ARM/x86 don't share some types. And ARM would throw an error, but x86 wouldn't. Then for some types x86 would throw errors where arm wouldn't.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Oct 17, 2016

The compile-fail tests are ones that are expected to have a compile failure and have an assertion to what that error is. The run-make tests are just tests that need a Makefile to run (e.g. a kitchen sink for "this can't run in any other test suite" tests). Ideally run-make tests are avoided if possible. The run-pass test suite asserts that the tests compile and execute successfully.

It's ok to not add exhaustive testing to this though, we're likely not quite prepared for that.

valarauca added some commits Oct 18, 2016

@valarauca

This comment has been minimized.

Copy link
Author

valarauca commented Oct 18, 2016

Okay. I just had to clean up a few of the tests because they were using names/symbols that no longer exist. The u16 datatype isn't technically supported on intel intristics so I removed it, but the compile tests were using it.

@valarauca

This comment has been minimized.

Copy link
Author

valarauca commented Oct 18, 2016

Hey @alexcrichton I need your help with this one

{"message":"intrinsic argument 1 has wrong type: found vector with length 8, expected length 16","code":{"code":"E0442","explanation":"\nIntrinsic argument(s) and/or return value have the wrong type.\nErroneous code example:\n\n``compile_fail,E0442\n#![feature(repr_simd)]\n#![feature(platform_intrinsics)]\n\n#[repr(simd)]\nstruct i8x16(i8, i8, i8, i8, i8, i8, i8, i8,\n i8, i8, i8, i8, i8, i8, i8, i8);\n#[repr(simd)]\nstruct i32x4(i32, i32, i32, i32);\n#[repr(simd)]\nstruct i64x2(i64, i64);\n\nextern \"platform-intrinsic\" {\n fn x86_mm_adds_epi16(x: i8x16, y: i32x4) -> i64x2;\n // error: intrinsic arguments/return value have wrong type\n}\n``\n\nTo fix this error, please refer to the function declaration to give\nit the awaited types. Example:\n\n``\n#![feature(repr_simd)]\n#![feature(platform_intrinsics)]\n\n#[repr(simd)]\nstruct i16x8(i16, i16, i16, i16, i16, i16, i16, i16);\n\nextern \"platform-intrinsic\" {\n fn x86_mm_adds_epi16(x: i16x8, y: i16x8) -> i16x8; // ok!\n}\n``\n"},"level":"error","spans":[{"file_name":"/build/src/test/run-pass/simd-upgraded.rs","byte_start":866,"byte_end":915,"line_start":24,"line_end":24,"column_start":13,"column_end":62,"is_primary":true,"text":[{"text":" fn x86_mm_min_epi16(x: i16x8, y: i16x8) -> i16x8;","highlight_start":13,"highlight_end":62}],"label":null,"suggested_replacement":null,"expansion":null}],"children":[],"rendered":null} {"message":"intrinsic argument 2 has wrong type: found vector with length 8, expected length 16","code":{"code":"E0442","explanation":"\nIntrinsic argument(s) and/or return value have the wrong type.\nErroneous code example:\n\n``compile_fail,E0442\n#![feature(repr_simd)]\n#![feature(platform_intrinsics)]\n\n#[repr(simd)]\nstruct i8x16(i8, i8, i8, i8, i8, i8, i8, i8,\n i8, i8, i8, i8, i8, i8, i8, i8);\n#[repr(simd)]\nstruct i32x4(i32, i32, i32, i32);\n#[repr(simd)]\nstruct i64x2(i64, i64);\n\nextern \"platform-intrinsic\" {\n fn x86_mm_adds_epi16(x: i8x16, y: i32x4) -> i64x2;\n // error: intrinsic arguments/return value have wrong type\n}\n``\n\nTo fix this error, please refer to the function declaration to give\nit the awaited types. Example:\n\n``\n#![feature(repr_simd)]\n#![feature(platform_intrinsics)]\n\n#[repr(simd)]\nstruct i16x8(i16, i16, i16, i16, i16, i16, i16, i16);\n\nextern \"platform-intrinsic\" {\n fn x86_mm_adds_epi16(x: i16x8, y: i16x8) -> i16x8; // ok!\n}\n``\n"},"level":"error","spans":[{"file_name":"/build/src/test/run-pass/simd-upgraded.rs","byte_start":866,"byte_end":915,"line_start":24,"line_end":24,"column_start":13,"column_end":62,"is_primary":true,"text":[{"text":" fn x86_mm_min_epi16(x: i16x8, y: i16x8) -> i16x8;","highlight_start":13,"highlight_end":62}],"label":null,"suggested_replacement":null,"expansion":null}],"children":[],"rendered":null} {"message":"intrinsic return value has wrong type: found vector with length 8, expected length 16","code":{"code":"E0442","explanation":"\nIntrinsic argument(s) and/or return value have the wrong type.\nErroneous code example:\n\n``compile_fail,E0442\n#![feature(repr_simd)]\n#![feature(platform_intrinsics)]\n\n#[repr(simd)]\nstruct i8x16(i8, i8, i8, i8, i8, i8, i8, i8,\n i8, i8, i8, i8, i8, i8, i8, i8);\n#[repr(simd)]\nstruct i32x4(i32, i32, i32, i32);\n#[repr(simd)]\nstruct i64x2(i64, i64);\n\nextern \"platform-intrinsic\" {\n fn x86_mm_adds_epi16(x: i8x16, y: i32x4) -> i64x2;\n // error: intrinsic arguments/return value have wrong type\n}\n``\n\nTo fix this error, please refer to the function declaration to give\nit the awaited types. Example:\n\n``\n#![feature(repr_simd)]\n#![feature(platform_intrinsics)]\n\n#[repr(simd)]\nstruct i16x8(i16, i16, i16, i16, i16, i16, i16, i16);\n\nextern \"platform-intrinsic\" {\n fn x86_mm_adds_epi16(x: i16x8, y: i16x8) -> i16x8; // ok!\n}\n``\n"},"level":"error","spans":[{"file_name":"/build/src/test/run-pass/simd-upgraded.rs","byte_start":866,"byte_end":915,"line_start":24,"line_end":24,"column_start":13,"column_end":62,"is_primary":true,"text":[{"text":" fn x86_mm_min_epi16(x: i16x8, y: i16x8) -> i16x8;","highlight_start":13,"highlight_end":62}],"label":null,"suggested_replacement":null,"expansion":null}],"children":[],"rendered":null} {"message":"aborting due to 3 previous errors","code":null,"level":"error","spans":[],"children":[],"rendered":null}

This occured in /rust/src/test/run-pass/simd-upgraded.rs I have an i16x8 type defined in /rust/src/librustc_platform_intrinsics/lib.rs, but in the error message there isn't anything mentioning i16x8 as a possible size for a 16byte tuple type.

static U8: Type = Type::Integer(false, 8, 8);
static U16: Type = Type::Integer(false, 16, 16);
static U32: Type = Type::Integer(false, 32, 32);
static U64: Type = Type::Integer(false, 64, 64);

This comment has been minimized.

@alexcrichton

alexcrichton Oct 19, 2016

Member

Does this mean that all unsigned simd intrinsics were removed? Perhaps that's the cause of the error message?

This comment has been minimized.

@valarauca

valarauca Oct 19, 2016

Author

The error is happening with the type i16x8.

As I said previously

This occured in /rust/src/test/run-pass/simd-upgraded.rs I have an i16x8 type defined in /rust/src/librustc_platform_intrinsics/lib.rs, but in the error message there isn't anything mentioning i16x8 as a possible size for a 16byte tuple type.

This comment has been minimized.

@alexcrichton

alexcrichton Oct 20, 2016

Member

Why were these types removed? This is removing a number of unsigned simd intrinsics from tests below. I was just hazarding a guess that it was related to the test failure, I'm not really sure what's going on there.

This comment has been minimized.

@valarauca

valarauca Oct 22, 2016

Author

From the error message it seems to be stating there is no 16byte width intrinsic defined with a type signature of i16. While there is i8x16, i32x4, and i64x2, these fail the type check. I'm digging into it today.

The issue seems to be declaring i16x8(0,1,2,3,4,5,6,7) fails because the compiler isn't assuming the internal values are 1byte long not two. I'm looking into it. I think if I just declare them i16x8(0i16,1i16,2i16,3i16,4i16,5i16,6i16,7i16) I'll be fine. But it seems like the type checker should handle that.

This comment has been minimized.

@valarauca

valarauca Oct 22, 2016

Author

The issues seems to be fn x86_mm_min_epi16 isn't defined?!? I have it written, and exposed. I'm not sure why it isn't showing up under platform-intrinsics ?

Is there another file to edit?

This comment has been minimized.

@alexcrichton

alexcrichton Oct 25, 2016

Member

Hm so there's two questions here that I have:

  1. The tests are failing, and I'm hypothesizing it's related to this diff.
  2. Why were the unsigned types removed? Does that means that there there are no more unsigned intrinsics?

I'm hypothesizing these are related, but they may not be. I'm not sure why that intrinsic wouldn't be defined, unfortunately.

This comment has been minimized.

@valarauca

valarauca Oct 26, 2016

Author

The tests are failing, and I'm hypothesizing it's related to this diff.

I literally gutted and replaced the entire SIMD subsection so I have no doubt this is true. I'm just trying to pin down where the failure is occurring. The function in question isn't unsigned, and is defined in the x86.rs module.

Why were the unsigned types removed?

The LLVM intrinsic definitions themselves don't use unsigned values, all their type signatures are signed except for floating point values. Therefore the generation program is only generating signed intrinsics.

Does that means that there are no more unsigned intrinsics?

Depends. There is a lot to unpack here.

Does this mean there are no more unsigned intriniscs defined by the compiler?

Yes.

Does this mean developers can no longer use unsigned intrinsics?

No.

As you are aware the extern interface isn't type safe. So nothing stops a Developer, or Library from defining a pub struct u64x2( u64, u64) and passing this into a i64x2. Furthermore nothing actually stops them from passing that structure into a i8x16 type either while we're at it. At the extern interface, not after that.

The rub of it is neither ARM, AARCH, AMD64, or the LLVM care about signed-ness. Just register size. Defining the unsigned types seemed to be an easier task for a library above this, which will also handle traits, overloading, etc since it'll contain a large extern platform-intrinsic definition.

This comment has been minimized.

@rkruppe

rkruppe Oct 26, 2016

Member

The LLVM intrinsic definitions themselves don't use unsigned values, all their type signatures are signed except for floating point values. Therefore the generation program is only generating signed intrinsics.

It's not that the LLVM intrinsics have signed or unsigned integers in their signatures, LLVM integer types simply don't have a signedness, instead they put it in the operations. But that doesn't mean the Rust interface to those intrinsics should ignore Rust's richer type systems. Requiring wrapper crates to insert lots of ugly transmute would be a pointless complication IMHO, and would potentially generate worse code (in debug mode at least). To me this points to a weakness of the "generate everything from LLVM definitions" approach more than anything else.

This comment has been minimized.

@valarauca

valarauca Oct 26, 2016

Author

That doesn't mean the Rust interface to those intrinsics should ignore Rust's richer type systems. Requiring wrapper crates to insert lots of ugly transmute would be a pointless complication IMHO,

Nothing is ignoring Rust's type system. I'm also not using transmute Rust ignores Rust's type system at the extern platform-instrict layer. This is where you can inject pub struct u64x2( u64,u64); unsigned types perfectly safely then any interaction will be type safe to the u64x2 type.

For example if you have a symbol like fn x86_mm_min_epi16( i16x8, i16x8) -> i16x8); when you define it in your codes namespace on can do

pub struct u64x2(u64,u64);
extern platform-intrinisic {
      fn x86_mm_min_epi16( u64x2, i16x8) -> i16x8;
}

This is an extreme, and rather broken example. But perfectly functional and not a compiler error. There is no unsafe{ transmute() } wizardy. If this a bad approach, shouldn't a compiler error enforce this?

This comment has been minimized.

@rkruppe

rkruppe Oct 26, 2016

Member

I may have gotten the layering wrong and jumped to wrong conclusions. Please point out if any of the following is wrong:

IIRC, rustc_platform_intrinsics defines the names and signatures of platform-specific (mostly SIMD-related) intrinsics. Out-of-tree SIMD crates re-declare those intrinsics for themselves because the signatures are "duck-typed", i.e., there isn't a canonical fixed set of SIMD types but rather out-of-tree libraries can define their own #[repr(simd)] types and experiment with various APIs. However, the RFC does state:

The signatures are typechecked, but in a "duck-typed" manner: it will just ensure that the types are SIMD vectors with the appropriate length and element type, it will not enforce a specific nominal type.

Thus, purely from the RFC, the Rust types in the intrinsics' signatures do matter.


So as far as I am concerned, it is an implementation bug that the nonsensical declaration above is accepted. The compiler should and is intended to check the "shape" of the argument and return types, not just their bit size. Signedness would naturally by included in these checks, and so (in a future with fewer bugs) librustc_platform_intrinsics must know the signedness of intrinsic signatures as well, to permit out-of-tree SIMD crates to declare the right signature.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Oct 19, 2016

Thinking a bit more about this, there's also a few points about the existing system I forgot initially:

  • The python script currently has a --help flag which should help generate json definitions if they need to be imported, so in theory the syntax isn't too alien.
  • The json files are also used to generate the extern blocks in other repositories like https://github.com/huonw/simd, which allows for not only easy compiler imports but also consuming this downstream as well.

I wonder though if we checked in a Rust script if it could help both importing to the compiler and exporting to external libraries as well? (e.g. dual mode-esque)

@valarauca

This comment has been minimized.

Copy link
Author

valarauca commented Oct 19, 2016

Re-writing/patching Huonw's SIMD library was my next action item after this stabilized. Really that library should be part of the compiler at it is describing core types. You don't important a boost:: library for _m128 in C++, yet this is the current model rust/cargo uses.

But I do recognize it is easier to have a feature start out as a crate, then move that crate into the compiler. Rather then have an extremely buggy section of the compiler.

Furthermore some SIMD operations aren't easily quantifiable as integer/vector operations. AArch/Arm/Intel CRC, Intel SHA1/SHA2/AES, and the various SSE4.2 String operations.

Lastly I read the -help text. The differences between S and s is frustrating for writing by hand. The Pointer syntax is also rather annoying.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Oct 20, 2016

Note that the story around SIMD in Rust is very much still a work in progress. We've got @huonw's simd library as an excellent starting point and is where we'll likely expand from moving forward. The libs team is set to discuss SIMD in general in the near future so we should have more info on that soon as well.

In that sense I'd hold off large-scale rewrites for the time being, but fleshing out all the intrinsics in the compiler sounds great to me, so I'd still be interested in moving forward with this. To me the remaining issues here are:

  • We should check in the generation script to this repo so it can be rerun if necessary.
  • Tests need to become green

Other than that though I'm comfortable landing!

@valarauca

This comment has been minimized.

Copy link
Author

valarauca commented Oct 21, 2016

Sounds good. Where will that discussion take place? Because I'd like to chime in on it.

Either way I hope to have the tests green this weekend. Emergency work travel slowed my commits heavily :|

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Oct 25, 2016

The libs team will discuss SIMD soon (hopefully!) and we'll post all our thoughts to an internals thread afterwards.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Oct 25, 2016

That being said I think it's fine to land this in the meantime, given:

  • It's just expanding the number of exposed intrinsics
  • They're all still unstable
  • We'll still have a script to generate everything
@valarauca

This comment has been minimized.

Copy link
Author

valarauca commented Oct 27, 2016

Closing will submit JSON changes this weekend. If they pass checks you'll get a pull.

Ya'll can deal with figuring out how to fix [repr(simd)] ducktyping.

@valarauca valarauca closed this Oct 27, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.