### Highly optimized inner loop

primesieve's inner sieving loop has been optimized using
[extreme loop unrolling](,
on average, crossing off a multiple uses just 1.375 instructions on
x64 CPUs. Below is the assembly GCC generates for primesieve's inner
sieving loop; each `andb` instruction unsets a bit (crosses off a
multiple).
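
The idea of unrolling the cross-off loop can be sketched in C++ as follows. This is a simplified illustration, not primesieve's actual code: it assumes one bit per integer in a plain byte array, unrolls only 4 times, and the names `crossOff`, `crossOffMultiples` and `simpleSieve` are hypothetical. Each `&=` in `crossOff` clears a single bit, which is the kind of operation a compiler may lower to an `andb` instruction on x64.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: clear the bit that represents the integer n.
// Assumption: one bit per integer, 8 integers per byte.
inline void crossOff(std::vector<std::uint8_t>& sieve, std::size_t n)
{
    sieve[n >> 3] &= static_cast<std::uint8_t>(~(1u << (n & 7)));
}

// Cross off the multiples of p in [start, limit].
// The main loop is manually unrolled 4 times, so the loop condition
// is checked once per 4 multiples instead of once per multiple.
void crossOffMultiples(std::vector<std::uint8_t>& sieve,
                       std::size_t p,
                       std::size_t start,
                       std::size_t limit)
{
    std::size_t m = start;

    // Unrolled part: runs while m + 3 * p is still <= limit.
    for (; limit >= 3 * p && m <= limit - 3 * p; m += 4 * p)
    {
        crossOff(sieve, m);
        crossOff(sieve, m + p);
        crossOff(sieve, m + 2 * p);
        crossOff(sieve, m + 3 * p);
    }

    // Remainder: at most 3 multiples are left.
    for (; m <= limit; m += p)
        crossOff(sieve, m);
}

// Plain sieve of Eratosthenes built on the unrolled cross-off loop.
std::vector<std::size_t> simpleSieve(std::size_t limit)
{
    std::vector<std::uint8_t> sieve(limit / 8 + 1, 0xff);
    std::vector<std::size_t> primes;

    for (std::size_t n = 2; n <= limit; ++n)
    {
        if (sieve[n >> 3] & (1u << (n & 7)))
        {
            primes.push_back(n);
            if (n <= limit / n) // n * n <= limit, written to avoid overflow
                crossOffMultiples(sieve, n, n * n, limit);
        }
    }

    return primes;
}
```

Calling `simpleSieve(30)` returns the primes up to 30. Unrolling further spreads the loop overhead over more cross-off operations, which is how the average instruction count per multiple can drop close to one.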

