-
Notifications
You must be signed in to change notification settings - Fork 281
Activate SSE2 test on windows #563
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7b9c5dc to
bb66d89
Compare
|
Turns out AVX2 is always enabled in the source code |
bb66d89 to
12066d5
Compare
|
thanks @cyb70289 . I now have a clean reproducer for your bug, is that okay if I cherry-pick your fix in this branch and adjust it? |
|
Sure, no problem |
|
All green. Thanks @cyb70289 for the hints! and cc @JohanMabille for the merge. |
| #if defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2) | ||
| #define XSIMD_WITH_SSE2 1 | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we only enable SSE2 here, then SSE3,SSE4 will not be supported by xsimd if built with msvc.
I think this is a problem and will still break many code.
MSVC X64 must support SSE4_2 (actually /arch:SSE2 is an invalid option for msvc x64).
What about define XSIMD_WITH_SSE4_2 if _M_X64?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the doc there's indeed no explicit flag for SSE3 and later (based on https://docs.microsoft.com/fr-fr/cpp/build/reference/arch-x86?view=msvc-160) but I wonder if it's actually supported if no arch flag is specified ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
X64 msvc only supports /arch:AVX?, https://docs.microsoft.com/en-us/cpp/build/reference/arch-x64?view=msvc-160
My understanding is /arch:XXX anables msvc to auto vectorize c++ code using at most the XXX instructions.
But it doesn't prevent us from using intrinsics directly. Compiler should supports all level simd instructions.
b4e097e to
3118d0a
Compare
MSVC does not define __SSEn__ macros.
3118d0a to
2f02ee2
Compare
cyb70289
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @serge-sans-paille !
|
@cyb70289 unfortunately, I'm not able to reproduce the windows error locally (on |
include/xsimd/arch/xsimd_ssse3.hpp
Outdated
| batch<T, A> extract_pair(batch<T, A> const& self, batch<T, A> const& other, std::size_t i, ::xsimd::detail::index_sequence<I, Is...>) { | ||
| if(i == I) { | ||
| return _mm_alignr_epi8(self, other, sizeof(T) * I); | ||
| return _mm_alignr_epi8(other, self, sizeof(T) * (I)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other and self swapped?
|
yeah, note that this makes the validation pass :-) So unless the validation
itself is not correctly written...
|
|
@cyb70289 can you confirm e.g. on your code base that current implementation is correct, i.e. that the tests are well-written? |
I use |
|
Looks to me this test code to generate expected result is wrong:
|
bde7a37 to
343f881
Compare
343f881 to
a9f3ccb
Compare
| if (n == I) | ||
| { | ||
| return vextq_u8(lhs, rhs, I); | ||
| return vextq_u8(rhs, lhs, I); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, vextq_xx is indeed different from _mm_alignr_xx 👍
|
All green! cc @JohanMabille this one costed me quite a lot of effort :-) |
|
Congratulation on this one! |
No description provided.