Remove incorrect _bzhi_u64() usage
kimwalisch committed Apr 17, 2024
1 parent 02a1c6a commit f2aa68c
Showing 2 changed files with 4 additions and 4 deletions.
4 changes: 2 additions & 2 deletions doc/CPP_API.md
@@ -466,7 +466,7 @@ int main()
 {
   // Sum 64-bit primes using AVX512
   for (std::size_t i = 0; i < it.size_; i += 8) {
-    __mmask8 mask = (__mmask8) _bzhi_u64(0xff, it.size_ - i);
+    __mmask8 mask = (i + 8 < it.size_) ? 0xff : 0xff >> (i + 8 - it.size_);
     __m512i primes = _mm512_maskz_loadu_epi64(mask, (__m512i*) &it.primes_[i]);
     sums = _mm512_add_epi64(sums, primes);
   }
@@ -493,7 +493,7 @@ int main()

```bash
# Unix-like OSes
-c++ -O3 -mavx512f -mbmi2 -funroll-loops primesum.cpp -o primesum -lprimesieve
+c++ -O3 -mavx512f -funroll-loops primesum.cpp -o primesum -lprimesieve
time ./primesum
```

4 changes: 2 additions & 2 deletions doc/C_API.md
@@ -532,7 +532,7 @@ int main(void)
 {
   // Sum 64-bit primes using AVX512
   for (size_t i = 0; i < it.size; i += 8) {
-    __mmask8 mask = (__mmask8) _bzhi_u64(0xff, it.size - i);
+    __mmask8 mask = (i + 8 < it.size) ? 0xff : 0xff >> (i + 8 - it.size);
     __m512i primes = _mm512_maskz_loadu_epi64(mask, (__m512i*) &it.primes[i]);
     sums = _mm512_add_epi64(sums, primes);
   }
@@ -560,7 +560,7 @@ int main(void)
```bash
# Unix-like OSes
-cc -O3 -mavx512f -mbmi2 -funroll-loops primesum.c -o primesum -lprimesieve
+cc -O3 -mavx512f -funroll-loops primesum.c -o primesum -lprimesieve
time ./primesum
```
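
The new mask expression can be checked without AVX512 hardware. Below is a minimal sketch in portable C; the helper names `tail_mask` and `bzhi_model` are hypothetical and not part of primesieve. `tail_mask` mirrors the expression added in this commit. `bzhi_model` models the BZHI semantics as documented by Intel, where only the low 8 bits of the index operand are used. The commit message does not say why the `_bzhi_u64()` usage was incorrect; one plausible reason, offered here as an assumption, is that a remaining count such as 256 wraps to index 0 and yields an empty mask.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the expression from the diff.
 * Returns a mask with the low min(8, size - i) bits set.
 * Assumes i < size, as guaranteed by the loop condition. */
static inline uint8_t tail_mask(size_t i, size_t size)
{
    return (i + 8 < size) ? 0xff : (uint8_t) (0xff >> (i + 8 - size));
}

/* Hypothetical portable model of BZHI (not the intrinsic itself).
 * Per Intel's documented semantics only index[7:0] is used:
 * bits at positions >= n are cleared when n < 64, else a is unchanged. */
static inline uint64_t bzhi_model(uint64_t a, uint64_t index)
{
    uint64_t n = index & 0xff;
    return (n < 64) ? (a & ((UINT64_C(1) << n) - 1)) : a;
}
```

For example, with 20 primes in the buffer, `tail_mask(16, 20)` gives `0x0f`, so only the last 4 lanes are loaded, while `bzhi_model(0xff, 256)` gives `0` even though 256 elements remain.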
