Ideally, you would want to use this intrinsic here:
__m128i _mm_blendv_epi8(__m128i a, __m128i b, __m128i mask)
where the mask can be created with any of the _mm_cmp*_epi8 comparisons (_mm_cmpeq_epi8, _mm_cmpgt_epi8, _mm_cmplt_epi8).
But blendv is an SSE4.1 instruction, which leads to compile headaches. However, the same selection can be done using SSE2 instructions only:
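A minimal sketch of one common SSE2-only way to do this, using the and / andnot / or idiom. This is an assumption about the intended approach rather than the snippet from the original post, and the helper name `blendv_epi8_sse2` is hypothetical:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper (name is not from the original post).
 * Bitwise select: takes bits of b where mask is 1 and bits of a where mask is 0.
 * This matches _mm_blendv_epi8 only when every mask byte is 0x00 or 0xFF,
 * which is what the _mm_cmp*_epi8 comparisons produce. */
static inline __m128i blendv_epi8_sse2(__m128i a, __m128i b, __m128i mask)
{
    return _mm_or_si128(_mm_and_si128(mask, b),      /* mask & b    */
                        _mm_andnot_si128(mask, a));  /* (~mask) & a */
}
```

One difference to keep in mind: `_mm_blendv_epi8` looks only at the top bit of each mask byte, while this version uses every bit of the mask, so a mask that is not all-ones or all-zeros per byte would give different results.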
So this might open up opportunities for vectorization, using only
#ifdef __SSE2__
compile guards.

EDIT: Of course, this would also work for data types other than epi8.
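As a rough illustration of both points (the `__SSE2__` guard and a non-epi8 data type), here is a sketch of a guarded loop over 16-bit values with a scalar fallback. The function `select_gt_i16` and its parameters are hypothetical and only serve as an example:

```c
#include <stddef.h>
#include <stdint.h>
#ifdef __SSE2__
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical example: out[i] = (x[i] > thr) ? b[i] : a[i] for int16_t data. */
static void select_gt_i16(const int16_t *x, int16_t thr,
                          const int16_t *a, const int16_t *b,
                          int16_t *out, size_t n)
{
    size_t i = 0;
#ifdef __SSE2__
    const __m128i vthr = _mm_set1_epi16(thr);
    for (; i + 8 <= n; i += 8) {  /* 8 x 16-bit lanes per iteration */
        __m128i vx = _mm_loadu_si128((const __m128i *)(x + i));
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        __m128i m  = _mm_cmpgt_epi16(vx, vthr);  /* 0xFFFF where x > thr */
        /* SSE2-only blend: (m & b) | (~m & a) */
        __m128i r  = _mm_or_si128(_mm_and_si128(m, vb),
                                  _mm_andnot_si128(m, va));
        _mm_storeu_si128((__m128i *)(out + i), r);
    }
#endif
    /* Scalar path: handles the tail, or everything when SSE2 is unavailable. */
    for (; i < n; i++)
        out[i] = (x[i] > thr) ? b[i] : a[i];
}
```

Note that `__SSE2__` is defined by GCC and Clang; MSVC does not define it, so a build that targets MSVC would need a different check.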