SIMD: add generator constructors #6347

Merged
merged 5 commits into from
Aug 10, 2023

ldh4
Contributor

@ldh4 ldh4 commented Aug 9, 2023

Related to #5674

This PR adds a generator constructor for all simd types and mask types:

  • template <class G> simd(G&& gen)

Generator constructors let simd objects be constructed from an expression, without manually setting a value for each vector lane. Internally, generator constructors are also used to implement generic fallback methods for the few operators for which the available SIMD intrinsics are inadequate (e.g., arithmetic shift right on AVX2 with int64_t).
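
For readers unfamiliar with the pattern, here is a minimal sketch of how a generator constructor works and how a lane-wise fallback can be built on top of it. `toy_simd`, `shift_right`, and `toy_simd_demo` are hypothetical names made up for illustration; this is an array-backed stand-in that assumes C++17, not the code added by this PR.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical array-backed stand-in for a SIMD vector (not the Kokkos type).
template <class T, std::size_t N>
class toy_simd {
  std::array<T, N> m_lanes{};

  // Helper: expands the generator call once per lane index I.
  template <class G, std::size_t... I>
  constexpr toy_simd(G&& gen, std::index_sequence<I...>)
      : m_lanes{static_cast<T>(
            gen(std::integral_constant<std::size_t, I>()))...} {}

 public:
  constexpr toy_simd() = default;

  // Generator constructor: lane i is initialized with
  // gen(std::integral_constant<std::size_t, i>()).
  template <class G,
            class = std::enable_if_t<std::is_invocable_v<
                G, std::integral_constant<std::size_t, 0>>>>
  constexpr explicit toy_simd(G&& gen)
      : toy_simd(std::forward<G>(gen), std::make_index_sequence<N>()) {}

  constexpr T operator[](std::size_t i) const { return m_lanes[i]; }
};

// Generic fallback written with the generator constructor: apply the scalar
// operator lane by lane. Right-shifting a negative signed value is only
// guaranteed to be an arithmetic shift since C++20; mainstream compilers
// already behave that way for earlier standards.
template <class T, std::size_t N>
constexpr toy_simd<T, N> shift_right(const toy_simd<T, N>& v, int count) {
  return toy_simd<T, N>([&](auto i) { return v[i] >> count; });
}

// Small usage example.
inline void toy_simd_demo() {
  toy_simd<std::int64_t, 4> v([](auto i) { return std::int64_t(i) * 10; });
  toy_simd<std::int64_t, 4> s = shift_right(v, 1);
  assert(v[2] == 20);
  assert(s[2] == 10);
}
```

Passing the lane index as a `std::integral_constant` keeps it available as a compile-time constant inside the generator, which is the same calling convention visible in the mask constructor excerpt quoted in the review.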

@masterleinad
Contributor

Can you please elaborate a bit more on why we need this?

@ldh4
Contributor Author

ldh4 commented Aug 9, 2023

@masterleinad I just updated the PR description to say a bit more about where generator constructors are typically used, but I believe the main motivation for this task is to fill in missing features and stay as consistent as possible with P1928R6.

KOKKOS_IMPL_HOST_FORCEINLINE_FUNCTION constexpr explicit simd_mask(
G&& gen) noexcept
: m_value(_mm256_castsi256_pd(_mm256_setr_epi64x(
-std::int64_t(gen(std::integral_constant<std::size_t, 0>())),
Contributor


Why is it correct to negate the argument here?

Contributor Author

It sets all bits in the lane to 1 when the mask is true: the generator's result is converted to std::int64_t and negated, so 1 becomes -1, i.e. 0xFFFFFFFFFFFFFFFFULL. I guess this relies on the underlying assumption that a true mask value is exactly 1 rather than any non-zero value.
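
To make the negation trick concrete, here is a small scalar sketch. `lane_bits` and `lane_bits_normalized` are hypothetical helpers written for illustration and are not part of this PR; the second one shows one possible way to drop the "exactly 1" assumption by converting to `bool` first.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: expand a bool into a 64-bit lane pattern.
// true converts to 1, and -1 in two's complement has every bit set.
constexpr std::int64_t lane_bits(bool set) {
  return -std::int64_t(set);  // true -> -1 (all ones), false -> 0
}

// Hypothetical variant for generators that return arbitrary integers:
// normalizing through bool first maps any non-zero value to all ones.
template <class V>
constexpr std::int64_t lane_bits_normalized(V value) {
  return -std::int64_t(static_cast<bool>(value));
}

// Small usage example.
inline void lane_bits_demo() {
  assert(std::uint64_t(lane_bits(true)) == 0xFFFFFFFFFFFFFFFFULL);
  assert(lane_bits(false) == 0);
  // Without normalization, a non-bool result such as 2 gives -2, whose
  // lowest bit is clear, so the lane would not be all ones.
  assert(std::uint64_t(-std::int64_t(2)) == 0xFFFFFFFFFFFFFFFEULL);
  assert(lane_bits_normalized(2) == -1);
}
```

If the generator always returns `bool`, the plain negation is already sufficient, because the conversion to `std::int64_t` can only produce 0 or 1.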

@crtrott crtrott merged commit da49ee2 into kokkos:develop Aug 10, 2023
27 of 28 checks passed
@crtrott crtrott mentioned this pull request Aug 25, 2023