
How to generate more efficient SIMD code #2632

Closed Answered by nurmukhametov
zengdelang asked this question in Q&A
/* which elements are larger than the pivot */
// __m256i compared = _mm256_cmpgt_epi32(Curr_Vec, pivot_vec);
uniform uint32 Compared[8];
...
/* extract the most significant bit from each integer of the vector */
// int mm = _mm256_movemask_ps(_mm256_castsi256_ps(compared));
uniform uint32 ComparedMask[8] = {0b1, 0b10, 0b100, 0b1000, 0b10000, 0b100000, 0b1000000, 0b10000000};
....
}

I am not sure I follow all the details, but the quoted part looks like it can be expressed using packmask, like this:

export uniform int foo(uniform uint8 a[], uniform uint8 p[]) {
    return packmask(a[programIndex] > p[programIndex]);
}

compiling with the command:

$ ispc -O2 --target=avx2-i8x32 example.ispc …
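
For reference, the mask that the compare-plus-movemask step is meant to produce can be sketched in plain scalar C. This is only an illustration for checking results, not the code ispc generates: the function name `compare_mask_scalar` and the `lanes` parameter are hypothetical, and it assumes that bit i of the mask corresponds to lane i and that the comparison is unsigned (the original `_mm256_cmpgt_epi32` intrinsic compares signed values).

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar reference for the compare + movemask step.
 * Bit i of the result is set when a[i] > p[i].
 * `lanes` (hypothetical) stands in for the gang width, e.g. 8 for the
 * 8 x 32-bit AVX2 code in the question; it must not exceed 64. */
static uint64_t compare_mask_scalar(const uint32_t *a, const uint32_t *p,
                                    size_t lanes) {
    uint64_t mask = 0;
    for (size_t i = 0; i < lanes; ++i) {
        if (a[i] > p[i]) {
            mask |= (uint64_t)1 << i; /* record lane i in bit i */
        }
    }
    return mask;
}
```

Comparing this result against the value returned by the ispc function on the same inputs is one way to confirm that the packmask version matches the intent of the original intrinsics.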

Answer selected by zengdelang