Adds esl_sse_hmax_epi8(); SSE now requires SSE4.1
Starting work on H4's SSV filter, which will depend on SSE4.1 instructions so that we can work with signed epi8/int8_t values. I think we can get away with this: SSE4.1 has been available on Intel processors since 2007 (Penryn) and on AMD processors since 2011 (Bulldozer). If we have a problem, it will be on AMD; some modern AMD processors (Bobcat) do not have SSE4.1, but I think they should be disappearing.

The vector configuration code (esl_sse.m4; esl_cpu.c) now checks for compile-time SSE4.1 support in order to define our eslENABLE_SSE flag, and for run-time support in order to pass our CPU dispatch check.

Added esl_sse_hmax_epi8(), a horizontal 16-way int8_t max, using the SSE4.1 _mm_max_epi8() intrinsic.
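
For readers unfamiliar with the idea, here is a minimal sketch of a horizontal 16-way int8_t max built on _mm_max_epi8(), guarded by a run-time SSE4.1 check with a scalar fallback. It is not the code in this commit: the names hmax_epi8_sse41(), hmax_epi8_scalar(), and hmax_int8x16() are hypothetical, and the use of `__attribute__((target))` and `__builtin_cpu_supports()` assumes GCC or Clang on x86. Easel's own dispatch lives in esl_cpu.c and may work differently.

```c
#include <stdint.h>

/* Portable reference: maximum of 16 signed bytes. */
static int8_t hmax_epi8_scalar(const int8_t v[16])
{
  int8_t best = v[0];
  for (int i = 1; i < 16; i++)
    if (v[i] > best) best = v[i];
  return best;
}

#if defined(__x86_64__) || defined(__i386__)
#include <smmintrin.h>   /* SSE4.1 intrinsics, including _mm_max_epi8() */

/* Hypothetical helper, compiled for SSE4.1 regardless of global flags
 * (GCC/Clang extension). Folds the vector onto itself four times so that
 * byte 0 ends up holding the maximum of all 16 signed bytes. */
__attribute__((target("sse4.1")))
static int8_t hmax_epi8_sse41(const int8_t v[16])
{
  __m128i a = _mm_loadu_si128((const __m128i *) v);
  /* Swap the two 64-bit halves, take the bytewise max. */
  a = _mm_max_epi8(a, _mm_shuffle_epi32(a, _MM_SHUFFLE(1, 0, 3, 2)));
  /* Swap adjacent 32-bit lanes; every lane now holds 4 partial maxima. */
  a = _mm_max_epi8(a, _mm_shuffle_epi32(a, _MM_SHUFFLE(2, 3, 0, 1)));
  /* Fold bytes 2..3 onto bytes 0..1, then byte 1 onto byte 0. */
  a = _mm_max_epi8(a, _mm_srli_si128(a, 2));
  a = _mm_max_epi8(a, _mm_srli_si128(a, 1));
  return (int8_t) _mm_cvtsi128_si32(a);   /* low byte is the result */
}
#endif

/* Hypothetical dispatcher: use SSE4.1 only if the running CPU has it. */
int8_t hmax_int8x16(const int8_t v[16])
{
#if defined(__x86_64__) || defined(__i386__)
  if (__builtin_cpu_supports("sse4.1"))
    return hmax_epi8_sse41(v);
#endif
  return hmax_epi8_scalar(v);
}
```

Signed comparison is the point of requiring SSE4.1 here: SSE2 provides an unsigned byte max (_mm_max_epu8()) but no signed one, so an all-negative vector such as sixteen values in the range -16..-1 would be ranked incorrectly without _mm_max_epi8().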
Showing with 135 additions and 52 deletions.