Implement ShuffleUnsafe methods #99596
Note regarding the
|
Benchmark results of my AVX2 code ( Yes, this is very much a micro-benchmark, but the results are fairly reproducible on my machine (usually within ~10%) and are probably close to reality, since the code should be quick. It obviously doesn't measure the overhead in surrounding code from higher pipeline usage, etc. |
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Vector256.cs
Thank you for looking into this again.
We're already using the so-far-internal Vector128.ShuffleUnsafe in a bunch of places. Should we be using Vector256.ShuffleUnsafe somewhere?
It seems to me that all the current uses of |
- Used when non-constant indices are given to Shuffle, and an intrinsic implementation of ShuffleUnsafe is available
- Optimises byte and sbyte cases only
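As a rough illustration of the commit note above, here is a minimal sketch of how a variable-index byte Shuffle could be layered on top of an unsafe shuffle. This is an assumption for illustration only, not the code in this PR: the names ShuffleSketch and ShuffleViaUnsafe are hypothetical, Ssse3.Shuffle stands in for "an intrinsic implementation of ShuffleUnsafe", and the fallback simply calls the existing Vector128.Shuffle.

```csharp
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper for illustration; not part of the PR.
public static class ShuffleSketch
{
    public static Vector128<byte> ShuffleViaUnsafe(Vector128<byte> vector, Vector128<byte> indices)
    {
        if (Ssse3.IsSupported)
        {
            // pshufb zeroes a lane only when bit 7 of its index is set;
            // indices 16..127 wrap around to (index & 15).
            Vector128<byte> raw = Ssse3.Shuffle(vector, indices);

            // All-ones in lanes whose index is in range (0..15), zero elsewhere.
            Vector128<byte> inRange = Vector128.LessThan(indices, Vector128.Create((byte)16));

            // Masking restores the Shuffle behaviour: out-of-range lanes become 0.
            return raw & inRange;
        }

        // No suitable intrinsic: defer to the existing API.
        return Vector128.Shuffle(vector, indices);
    }
}
```

The extra compare and mask is the cost that a caller avoids by using the unsafe variant when it already knows every index is in range.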
- Re-implement ShuffleUnsafe in JIT
- Also re-implement the Shuffle optimisations in JIT
- Add basic JIT support for mono (ShuffleUnsafe just gets the same implementation as Shuffle)
- Implement support for variable index Shuffle & ShuffleUnsafe (for bytes)
- Implement support for cross-lane shuffling in JIT (for bytes)
- Optimise Vector128 shuffle for bytes in JIT to use Avx2.Shuffle
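For readers unfamiliar with why cross-lane shuffling needs special handling: Avx2.Shuffle (vpshufb) only selects bytes within each 128-bit lane. One common way to build a full 32-byte shuffle from it is sketched below. This is an assumed approach for illustration, not necessarily what the JIT emits in this PR; the names CrossLaneSketch and ShuffleCrossLane are hypothetical, and the result for any index above 31 is left unspecified.

```csharp
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper for illustration; not part of the PR.
public static class CrossLaneSketch
{
    // Expects every index to be in 0..31.
    public static Vector256<byte> ShuffleCrossLane(Vector256<byte> vector, Vector256<byte> indices)
    {
        if (!Avx2.IsSupported)
        {
            return Vector256.Shuffle(vector, indices);
        }

        // Copy of the input with its two 128-bit lanes swapped.
        Vector256<byte> swapped = Avx2.Permute2x128(vector, vector, 0x01);

        // In-lane lookups: one against the original lanes, one against the swapped lanes.
        Vector256<byte> fromSame = Avx2.Shuffle(vector, indices);
        Vector256<byte> fromOther = Avx2.Shuffle(swapped, indices);

        // Bit 4 of an index says which source lane it refers to (0 = lower, 16 = upper).
        // Compare it with the lane each output byte lives in.
        Vector256<byte> laneBase = Vector256.Create(Vector128<byte>.Zero, Vector128.Create((byte)16));
        Vector256<byte> wantsSame = Vector256.Equals(indices & Vector256.Create((byte)0x10), laneBase);

        // Take the same-lane lookup where the lanes match, the swapped lookup otherwise.
        return Vector256.ConditionalSelect(wantsSame, fromSame, fromOther);
    }
}
```

The sketch costs one permute, two shuffles, a compare, and a blend, which is why a purely in-lane shuffle remains cheaper when the indices allow it.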
Can someone please check I won't accidentally regress mono :) |
Re: 9868e73 |
I don't think there's any issue with the runtime relying on specific behaviour. For external libraries, I think one of the following approaches makes sense:
I think the approach needs to be consistent for all of them, so I removed the Another option, which I briefly mentioned in a comment somewhere, is to expose a variant like |
I'm fine with only documenting "anything above 15 is UB". |
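To make the "anything above 15 is UB" point concrete, here is a small scalar model contrasting the zeroing behaviour of Vector128.Shuffle with the behaviour of the x86 pshufb instruction for 16 bytes. The class and method names (ShuffleSemantics, Shuffle, Pshufb) are hypothetical and the code only models the semantics; it does not call the hardware intrinsics.

```csharp
// Scalar models for illustration only; arrays are assumed to hold 16 bytes.
public static class ShuffleSemantics
{
    // Models Vector128.Shuffle for bytes: an out-of-range index yields 0.
    public static byte[] Shuffle(byte[] vector, byte[] indices)
    {
        var result = new byte[16];
        for (int i = 0; i < 16; i++)
        {
            result[i] = indices[i] < 16 ? vector[indices[i]] : (byte)0;
        }
        return result;
    }

    // Models x86 pshufb: a lane is zeroed only if bit 7 of its index is set;
    // otherwise only the low 4 bits of the index are used.
    public static byte[] Pshufb(byte[] vector, byte[] indices)
    {
        var result = new byte[16];
        for (int i = 0; i < 16; i++)
        {
            result[i] = (indices[i] & 0x80) != 0 ? (byte)0 : vector[indices[i] & 0x0F];
        }
        return result;
    }
}
```

With an index of 16, the Shuffle model produces 0 while the Pshufb model produces vector[0]; with an index of 200 both produce 0. Other platforms can differ again, which is why leaving out-of-range results unspecified keeps the unsafe variant free to map to a single instruction.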
Yes, I've been careful to not use the AVX-512 one for this method for this reason. I will add a comment at some point to explain this in the method (assuming I don't forget). |
Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it. |
I'll be able to work on this in a few weeks when I have a few days to sit down and implement all the remaining things & make sure it's all correct :) (or maybe the sporadic day here and there also before that). I will ping someone when that day comes to get it reopened (or you can just reopen it now if you want). |
- … (byte, sbyte)
- Shuffle with variable indices on coreclr (for all types)
- Shuffle on Vector256 (with signed/unsigned bytes and shorts)
- Vector256 shuffle with Avx2.Shuffle (for signed/unsigned bytes and shorts)

Todo tasks:
- VectorXXX.ShuffleUnsafe for vectors of other element types