Fix Kokkos_SIMD with AVX2 on 64-bit architectures #6075
Does AVX2 get enabled if the user sets |
Yes, kokkos/cmake/kokkos_arch.cmake Lines 366 to 376 in c09dd1c
If this PR is about fixing a bug (`_mm256_mask{load,store}_epi64` intrinsics expecting pointers to `long long` and us passing pointers to `value_type`) then drop the unrelated "convenience" changes.
@@ -680,6 +680,8 @@ template <>
class simd<std::int64_t, simd_abi::avx2_fixed_size<4>> {
  __m256i m_value;

  static_assert(sizeof(long long) == 8);
Why did you add that assertion?
As a minimal safeguard that the `reinterpret_cast` is reasonable.
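For readers following along, here is a minimal sketch of the pattern being discussed, not the actual Kokkos code. `maskload4_like` is a hypothetical stand-in that only mimics the pointer parameter type of `_mm256_maskload_epi64` (`long long const*`) so that the example builds without AVX2; `load4` is likewise a made-up wrapper name. On platforms where `std::int64_t` is `long` rather than `long long`, passing a `std::int64_t const*` directly would not compile, hence the cast guarded by the size assertion.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in with the same pointer parameter type as the
// _mm256_maskload_epi64 intrinsic (long long const*). Lanes whose mask
// entry is false are zeroed, the others are copied from memory.
inline void maskload4_like(long long const* mem_addr, bool const (&mask)[4],
                           long long (&out)[4]) {
  auto const* bytes = reinterpret_cast<unsigned char const*>(mem_addr);
  for (int i = 0; i < 4; ++i) {
    out[i] = 0;
    if (mask[i]) {
      // memcpy reads the raw bytes, so it does not depend on the
      // declared type of the object that mem_addr really points to.
      std::memcpy(&out[i], bytes + i * sizeof(long long), sizeof(long long));
    }
  }
}

// Hypothetical wrapper written in terms of std::int64_t, the element type
// a user of the SIMD class would work with.
inline void load4(std::int64_t const* ptr, bool const (&mask)[4],
                  std::int64_t (&out)[4]) {
  // Same safeguard as in the diff: the cast below only makes sense if
  // long long has the same width as the 64-bit lanes.
  static_assert(sizeof(long long) == 8);
  long long tmp[4];
  // Without this cast the call fails to compile wherever std::int64_t is
  // an alias for long instead of long long.
  maskload4_like(reinterpret_cast<long long const*>(ptr), mask, tmp);
  for (int i = 0; i < 4; ++i) {
    out[i] = static_cast<std::int64_t>(tmp[i]);
  }
}
```

Calling `load4` with a mask of `{true, false, true, true}` copies lanes 0, 2 and 3 and zeroes lane 1, which matches the documented behavior of the masked-load intrinsic for lanes whose mask bit is not set.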
Here you go!
Why was this bug not caught before? Is there a gap in the testing (not necessarily the CI, presumably someone built on Broadwell before and it did not give a compile error)?
I have done Broadwell builds with no errors before this fix... I'm not sure if it is compiler or arch specific...
My testing shows that this would only have worked on Linux 32-bit architectures.
maybe I never did un-masked loads of |
@masterleinad meaning with the current develop you got a compile error? I am trying to determine whether we need to add unit tests (setting aside whether we need more CI/nightly builds).
You see this with current |
OK
Understood, but this must be handled elsewhere.
To elaborate on this a bit more: whether to enable based on compiler defines rather than user-provided configure options is a question that we need to examine with care, and it is beyond the scope of this PR. I am not opposed to it; it is actually in my backlog. I was considering getting rid of that macro along with others such as |
Retest this please
Fixes #6070.
For better test coverage (and convenience for the user?), this pull request also enables the `avx2` overloads if `__AVX2__` is defined (which should be the case when the user sets `Kokkos_ARCH_AVX2` but also for `Kokkos_ARCH_NATIVE=ON`).
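As a rough illustration of enabling code paths from the compiler-provided define, the sketch below keys a compile-time switch on `__AVX2__`, which GCC and Clang define when AVX2 code generation is enabled (for example with `-mavx2`, or `-march=native` on a capable CPU). The names `EXAMPLE_HAS_AVX2`, `example_has_avx2` and `example_simd_backend` are hypothetical and are not Kokkos macros or functions.

```cpp
#include <cassert>
#include <string>

// Hypothetical macro for illustration only; it is not part of Kokkos.
// It follows the compiler's own define rather than a configure option.
#if defined(__AVX2__)
#define EXAMPLE_HAS_AVX2 1
#else
#define EXAMPLE_HAS_AVX2 0
#endif

// Compile-time query usable in ordinary C++ code.
constexpr bool example_has_avx2() { return EXAMPLE_HAS_AVX2 != 0; }

// Reports which (hypothetical) implementation this translation unit
// would select.
inline std::string example_simd_backend() {
  return example_has_avx2() ? "avx2" : "scalar";
}
```

The result depends on the flags used for the translation unit, so the same source reports `"avx2"` in one build and `"scalar"` in another.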