Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upAdded all x86 SIMD codes. #37068
Conversation
rust-highfive
assigned
arielb1
Oct 10, 2016
This comment has been minimized.
This comment has been minimized.
|
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @arielb1 (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
This comment has been minimized.
This comment has been minimized.
|
So the current failure is one the tests. Test re-writing will be a large part of this patch, but before I start that I'd like the community to weigh on if this solution is worth adopting. |
This comment has been minimized.
This comment has been minimized.
|
Thanks for the PR! Would it be possible to update the JSON as well to then regenerate these files? (just to match what we've previously done here) |
This comment has been minimized.
This comment has been minimized.
|
This is where the issue arises. I don't know how to bulk generate the x86.rs from the pythonscript + json files. As far as I can tell you get 1 x86.rs PER json file, then hand merge them(?). If we continue the pattern of 1 json file per feature set, this patch will require around a dozen new json files. So I'd like that clarified. Really my main goal was to completely avoid the json files, or even replace them. They're incredibly frustrating to work with in practice. The main goal of decode-simd was to automate the generation of x86.rs from mining compiler projects. Which is easier to keep up to date then hand editing json. |
This comment has been minimized.
This comment has been minimized.
|
I think we should give up on JSON, and that is what this PR is about. |
This comment has been minimized.
This comment has been minimized.
|
@sanxiyn should I move forward and start extending decode SIMD to Aarch64, Arm, and MIPS? As this isn't a total replacement for the Json/Python system I understand the hesitation with accepting it. @alexcrichton For AArch+Arm should I attempt to persevere the current names/symbols? Or should I not be concerned if they're changed? |
alexcrichton
assigned
alexcrichton
and unassigned
arielb1
Oct 14, 2016
This comment has been minimized.
This comment has been minimized.
|
@valarauca ah ok, interesting! Sorry but I was unaware about that being the point of this PR. Replacing the JSON files sounds good to me so long as we do it in a methodic fashion. Would it be possible to jettison them entirely and replace them with your script you've got? It'd be great to prove this out with the other platforms as well! |
This comment has been minimized.
This comment has been minimized.
|
Okay so for writing tests. What is the difference between For a primary test to prove the SIMD functionality should the script also have a feature to create a separate crate with a |
This comment has been minimized.
This comment has been minimized.
|
I removed the |
This comment has been minimized.
This comment has been minimized.
|
The compile-fail tests are ones that are expected to have a compile failure and have an assertion to what that error is. The run-make tests are just tests that need a It's ok to not add exhaustive testing to this though, we're likely not quite prepared for that. |
valarauca
added some commits
Oct 18, 2016
This comment has been minimized.
This comment has been minimized.
|
Okay. I just had to clean up a few of the tests because they were using names/symbols that no longer exist. The |
This comment has been minimized.
This comment has been minimized.
|
Hey @alexcrichton I need your help with this one
This occured in |
alexcrichton
reviewed
Oct 19, 2016
| static U8: Type = Type::Integer(false, 8, 8); | ||
| static U16: Type = Type::Integer(false, 16, 16); | ||
| static U32: Type = Type::Integer(false, 32, 32); | ||
| static U64: Type = Type::Integer(false, 64, 64); |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Oct 19, 2016
Member
Does this mean that all unsigned simd intrinsics were removed? Perhaps that's the cause of the error message?
This comment has been minimized.
This comment has been minimized.
valarauca
Oct 19, 2016
Author
The error is happening with the type i16x8.
As I said previously
This occured in /rust/src/test/run-pass/simd-upgraded.rs I have an i16x8 type defined in /rust/src/librustc_platform_intrinsics/lib.rs, but in the error message there isn't anything mentioning i16x8 as a possible size for a 16byte tuple type.
This comment has been minimized.
This comment has been minimized.
alexcrichton
Oct 20, 2016
Member
Why were these types removed? This is removing a number of unsigned simd intrinsics from tests below. I was just hazarding a guess that it was related to the test failure, I'm not really sure what's going on there.
This comment has been minimized.
This comment has been minimized.
valarauca
Oct 22, 2016
•
Author
From the error message it seems to be stating there is no 16byte width intrinsic defined with a type signature of i16. While there is i8x16, i32x4, and i64x2, these fail the type check. I'm digging into it today.
The issue seems to be declaring i16x8(0,1,2,3,4,5,6,7) fails because the compiler isn't assuming the internal values are 1byte long not two. I'm looking into it. I think if I just declare them i16x8(0i16,1i16,2i16,3i16,4i16,5i16,6i16,7i16) I'll be fine. But it seems like the type checker should handle that.
This comment has been minimized.
This comment has been minimized.
valarauca
Oct 22, 2016
Author
The issues seems to be fn x86_mm_min_epi16 isn't defined?!? I have it written, and exposed. I'm not sure why it isn't showing up under platform-intrinsics ?
Is there another file to edit?
This comment has been minimized.
This comment has been minimized.
alexcrichton
Oct 25, 2016
Member
Hm so there's two questions here that I have:
- The tests are failing, and I'm hypothesizing it's related to this diff.
- Why were the unsigned types removed? Does that means that there there are no more unsigned intrinsics?
I'm hypothesizing these are related, but they may not be. I'm not sure why that intrinsic wouldn't be defined, unfortunately.
This comment has been minimized.
This comment has been minimized.
valarauca
Oct 26, 2016
•
Author
The tests are failing, and I'm hypothesizing it's related to this diff.
I literally gutted and replaced the entire SIMD subsection so I have no doubt this is true. I'm just trying to pin down where the failure is occurring. The function in question isn't unsigned, and is defined in the x86.rs module.
Why were the unsigned types removed?
The LLVM intrinsic definitions themselves don't use unsigned values, all their type signatures are signed except for floating point values. Therefore the generation program is only generating signed intrinsics.
Does that means that there are no more unsigned intrinsics?
Depends. There is a lot to unpack here.
Does this mean there are no more unsigned intriniscs defined by the compiler?
Yes.
Does this mean developers can no longer use unsigned intrinsics?
No.
As you are aware the extern interface isn't type safe. So nothing stops a Developer, or Library from defining a pub struct u64x2( u64, u64) and passing this into a i64x2. Furthermore nothing actually stops them from passing that structure into a i8x16 type either while we're at it. At the extern interface, not after that.
The rub of it is neither ARM, AARCH, AMD64, or the LLVM care about signed-ness. Just register size. Defining the unsigned types seemed to be an easier task for a library above this, which will also handle traits, overloading, etc since it'll contain a large extern platform-intrinsic definition.
This comment has been minimized.
This comment has been minimized.
rkruppe
Oct 26, 2016
Member
The LLVM intrinsic definitions themselves don't use unsigned values, all their type signatures are signed except for floating point values. Therefore the generation program is only generating signed intrinsics.
It's not that the LLVM intrinsics have signed or unsigned integers in their signatures, LLVM integer types simply don't have a signedness, instead they put it in the operations. But that doesn't mean the Rust interface to those intrinsics should ignore Rust's richer type systems. Requiring wrapper crates to insert lots of ugly transmute would be a pointless complication IMHO, and would potentially generate worse code (in debug mode at least). To me this points to a weakness of the "generate everything from LLVM definitions" approach more than anything else.
This comment has been minimized.
This comment has been minimized.
valarauca
Oct 26, 2016
•
Author
That doesn't mean the Rust interface to those intrinsics should ignore Rust's richer type systems. Requiring wrapper crates to insert lots of ugly
transmutewould be a pointless complication IMHO,
Nothing is ignoring Rust's type system. I'm also not using transmute Rust ignores Rust's type system at the extern platform-instrict layer. This is where you can inject pub struct u64x2( u64,u64); unsigned types perfectly safely then any interaction will be type safe to the u64x2 type.
For example if you have a symbol like fn x86_mm_min_epi16( i16x8, i16x8) -> i16x8); when you define it in your codes namespace on can do
pub struct u64x2(u64,u64);
extern platform-intrinisic {
fn x86_mm_min_epi16( u64x2, i16x8) -> i16x8;
}This is an extreme, and rather broken example. But perfectly functional and not a compiler error. There is no unsafe{ transmute() } wizardy. If this a bad approach, shouldn't a compiler error enforce this?
This comment has been minimized.
This comment has been minimized.
rkruppe
Oct 26, 2016
•
Member
I may have gotten the layering wrong and jumped to wrong conclusions. Please point out if any of the following is wrong:
IIRC, rustc_platform_intrinsics defines the names and signatures of platform-specific (mostly SIMD-related) intrinsics. Out-of-tree SIMD crates re-declare those intrinsics for themselves because the signatures are "duck-typed", i.e., there isn't a canonical fixed set of SIMD types but rather out-of-tree libraries can define their own #[repr(simd)] types and experiment with various APIs. However, the RFC does state:
The signatures are typechecked, but in a "duck-typed" manner: it will just ensure that the types are SIMD vectors with the appropriate length and element type, it will not enforce a specific nominal type.
Thus, purely from the RFC, the Rust types in the intrinsics' signatures do matter.
So as far as I am concerned, it is an implementation bug that the nonsensical declaration above is accepted. The compiler should and is intended to check the "shape" of the argument and return types, not just their bit size. Signedness would naturally by included in these checks, and so (in a future with fewer bugs) librustc_platform_intrinsics must know the signedness of intrinsic signatures as well, to permit out-of-tree SIMD crates to declare the right signature.
This comment has been minimized.
This comment has been minimized.
|
Thinking a bit more about this, there's also a few points about the existing system I forgot initially:
I wonder though if we checked in a Rust script if it could help both importing to the compiler and exporting to external libraries as well? (e.g. dual mode-esque) |
This comment has been minimized.
This comment has been minimized.
|
Re-writing/patching Huonw's SIMD library was my next action item after this stabilized. Really that library should be part of the compiler at it is describing core types. You don't important a But I do recognize it is easier to have a feature start out as a crate, then move that crate into the compiler. Rather then have an extremely buggy section of the compiler. Furthermore some SIMD operations aren't easily quantifiable as integer/vector operations. AArch/Arm/Intel CRC, Intel SHA1/SHA2/AES, and the various SSE4.2 String operations. Lastly I read the |
This comment has been minimized.
This comment has been minimized.
|
Note that the story around SIMD in Rust is very much still a work in progress. We've got @huonw's In that sense I'd hold off large-scale rewrites for the time being, but fleshing out all the intrinsics in the compiler sounds great to me, so I'd still be interested in moving forward with this. To me the remaining issues here are:
Other than that though I'm comfortable landing! |
This comment has been minimized.
This comment has been minimized.
|
Sounds good. Where will that discussion take place? Because I'd like to chime in on it. Either way I hope to have the tests green this weekend. Emergency work travel slowed my commits heavily :| |
This comment has been minimized.
This comment has been minimized.
|
The libs team will discuss SIMD soon (hopefully!) and we'll post all our thoughts to an internals thread afterwards. |
This comment has been minimized.
This comment has been minimized.
|
That being said I think it's fine to land this in the meantime, given:
|
This comment has been minimized.
This comment has been minimized.
|
Closing will submit JSON changes this weekend. If they pass checks you'll get a pull. Ya'll can deal with figuring out how to fix |
valarauca commentedOct 10, 2016
original pull request, rebased and resubmitted
This patch adds all x86_86 platform intrinsic. The names are compliant with those used in the GCC, ICC, and MSVC. The x86.rs file is generated via a rust program.
A ~4 months ago @sanxiyn request I add them to this commit when I got around to a submitting.
I realize this is directly modifying a generated file. But this method allows for automated updates without hand modifying writing JSON. I can extend decode-simd to include the MIPS, Aarch, Arm, POWER8, (and even NVVC+AMDGPU) platforms if that is needed/wanted.
Other commenters cc: @rkruppe @retep998