Adds detection and support for "SSE4" vs. "SSE" vector implementations.
In ef206c4, I made our "SSE" vector implementation depend on instruction sets up through SSE4.1 (<=SSE4.1) instead of up through SSE2 (<=SSE2). I was rewriting the H4 SSV filter to use signed 8-bit ints and an esl_sse_hmax_epi8() macro that uses the SSE4.1 _mm_max_epi8() intrinsic. Presciently, I noted that if this change were going to cause trouble, it would be on AMD platforms; incorrectly, I reasoned that since AMD processors have supported SSE4.1 since AMD Bulldozer (2011), there would be no trouble.

Well, trouble: Odyssey head nodes are AMD Phenom II X4 910e processors, circa January 2010, which predate Bulldozer. I hadn't noticed a problem before because I've been using eddyfs01 as my head node, which has Intel processors.

This commit separates detection and support for "SSE" versus "SSE4" vector implementations. SSE requires <=SSE2 (HMMER3); SSE4 requires <=SSE4.1 (current HMMER4).

This commit will temporarily break H4, which will need to change the ESL_SSE() autoconf macro call to ESL_SSE4(), eslENABLE_SSE -> eslENABLE_SSE4, and suchlike.
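To illustrate the split between the two tiers, here is a minimal sketch of runtime selection between an "SSE" (SSE2-only) and an "SSE4" (SSE4.1) code path. It is not the detection code from this commit: the names vec_impl_t, choose_vec_impl(), detect_vec_impl(), and vec_impl_name() are hypothetical, and the sketch assumes GCC or Clang on x86, where the __builtin_cpu_supports() builtin is available. On other compilers or architectures it simply reports no vector support.

```c
/* Hypothetical labels for the two implementation tiers; not Easel's names. */
typedef enum { VEC_NONE = 0, VEC_SSE = 1, VEC_SSE4 = 2 } vec_impl_t;

/* Pure selection logic: the SSE4 tier needs both SSE2 and SSE4.1;
 * the SSE tier needs only SSE2; otherwise no vector tier is usable. */
vec_impl_t choose_vec_impl(int has_sse2, int has_sse41)
{
    if (has_sse2 && has_sse41) return VEC_SSE4;
    if (has_sse2)              return VEC_SSE;
    return VEC_NONE;
}

/* Query the running CPU. Assumes GCC/Clang builtins on x86;
 * anywhere else this falls back to VEC_NONE. */
vec_impl_t detect_vec_impl(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return choose_vec_impl(__builtin_cpu_supports("sse2")   != 0,
                           __builtin_cpu_supports("sse4.1") != 0);
#else
    return VEC_NONE;
#endif
}

/* Human-readable name for logging or diagnostics. */
const char *vec_impl_name(vec_impl_t v)
{
    switch (v) {
    case VEC_SSE4: return "SSE4";
    case VEC_SSE:  return "SSE";
    default:       return "none";
    }
}
```

A caller could print vec_impl_name(detect_vec_impl()) at startup. Under this sketch, a processor that has SSE2 but lacks SSE4.1 would select the "SSE" path rather than failing on an unsupported instruction.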
Showing with 212 additions and 94 deletions.