
Make [u8]::reverse() 5x faster #41764

Merged
merged 3 commits on May 10, 2017

Commits on May 5, 2017

  1. Make [u8]::reverse() 5x faster

    Since LLVM doesn't vectorize the loop for us, do unaligned reads
    of a larger type and use LLVM's bswap intrinsic to do the
    reversing of the actual bytes.  cfg!-restricted to x86 and
    x86_64, as I assume it wouldn't help on things like ARMv5.
    
    Also makes [u16]::reverse() a more modest 1.5x faster by
    loading/storing u32 and swapping the u16s with ROT16.
    
    Thank you ptr::*_unaligned for making this easy :)
    scottmcm committed May 5, 2017 · e8fad32
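
    The core idea from the commit message, sketched below as a plain chunked loop (the function and constant names are illustrative, not the actual libcore code): read word-sized chunks unaligned from both ends of the slice, byte-swap each with `swap_bytes` (which lowers to LLVM's `bswap`), store them to the mirrored positions, and leave any leftover middle bytes to a plain element-wise reverse.

    ```rust
    use std::ptr;

    /// Illustrative sketch of the bswap-based byte reversal described above;
    /// not the actual libcore implementation.
    fn reverse_bytes(slice: &mut [u8]) {
        // Word size for the unaligned chunked loads/stores (assumption: the
        // real code picks the chunk size per target, cfg-gated to x86/x86_64).
        const CHUNK: usize = std::mem::size_of::<usize>();
        let len = slice.len();
        let p = slice.as_mut_ptr();
        let mut i = 0;
        // Swap word-sized chunks taken from the two ends, byte-reversing each
        // one with swap_bytes (LLVM bswap). Unaligned reads/writes keep this
        // valid for arbitrary slice offsets.
        while i + CHUNK <= len / 2 {
            unsafe {
                let front = p.add(i) as *mut usize;
                let back = p.add(len - i - CHUNK) as *mut usize;
                let f = ptr::read_unaligned(front).swap_bytes();
                let b = ptr::read_unaligned(back).swap_bytes();
                ptr::write_unaligned(front, b);
                ptr::write_unaligned(back, f);
            }
            i += CHUNK;
        }
        // Whatever remains in the middle is reversed element by element.
        slice[i..len - i].reverse();
    }
    ```

    For `[u16]`, the commit message describes the analogous trick at u32 granularity: load two adjacent u16s as one u32 and exchange them with a 16-bit rotate (e.g. `rotate_left(16)` on the u32) instead of a full byte swap.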

Commits on May 6, 2017

  1. 1f891d1
  2. Add reverse benchmarks for u128, [u8;3], and Simd<[f64;4]>

    None of these are affected by e8fad32.
    scottmcm committed May 6, 2017 · da91361
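
    For reference, a minimal sketch of what one such benchmark could look like with the unstable libtest harness (the function name, element count, and choice of u128 here are illustrative, not necessarily the PR's exact benchmark code):

    ```rust
    #![feature(test)]
    extern crate test;

    use test::{black_box, Bencher};

    // Hypothetical benchmark: reverse a Vec<u128> repeatedly, using black_box
    // to keep the optimizer from discarding the work.
    #[bench]
    fn reverse_u128(b: &mut Bencher) {
        let mut v: Vec<u128> = (0..1000u128).collect();
        b.iter(|| {
            black_box(&mut v).reverse();
        });
    }
    ```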