Commit
Make SSE code work with unaligned memory.
SSE code is 43% faster than C code on 16-byte-aligned memory.
SSE code is 29% faster than C code on unaligned memory.
On a Core 2 Duo, the unaligned-compatible SSE code is 4.4% slower than the aligned-only SSE code when run on aligned memory.
On Nehalem processors and newer, there is no speed disadvantage in using unaligned-move SSE instructions instead of aligned-move SSE instructions.
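The diff for this commit is not shown here, so the following is only a minimal sketch of the general technique the message describes, not the commit's actual code. It assumes an x86 target with SSE enabled and a C11 compiler; the functions `sum_unaligned` and `sum_aligned` are hypothetical names invented for illustration. The only difference between them is the load intrinsic: `_mm_loadu_ps` (MOVUPS) accepts any address, while `_mm_load_ps` (MOVAPS) requires a 16-byte-aligned address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h> /* SSE intrinsics */

/* Hypothetical helper: sums n floats; src may have any alignment. */
static float sum_unaligned(const float *src, size_t n)
{
    __m128 acc = _mm_setzero_ps();
    size_t i = 0;

    /* Process four floats per iteration with an unaligned load (MOVUPS). */
    for (; i + 4 <= n; i += 4)
        acc = _mm_add_ps(acc, _mm_loadu_ps(src + i));

    /* Reduce the four lanes to a scalar. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    float total = lanes[0] + lanes[1] + lanes[2] + lanes[3];

    /* Handle the remaining 0 to 3 elements. */
    for (; i < n; i++)
        total += src[i];
    return total;
}

/* Hypothetical helper: same sum, but src must be 16-byte aligned. */
static float sum_aligned(const float *src, size_t n)
{
    /* An aligned load (MOVAPS) faults on a misaligned address. */
    assert(((uintptr_t)src & 15) == 0);

    __m128 acc = _mm_setzero_ps();
    size_t i = 0;

    for (; i + 4 <= n; i += 4)
        acc = _mm_add_ps(acc, _mm_load_ps(src + i));

    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    float total = lanes[0] + lanes[1] + lanes[2] + lanes[3];

    for (; i < n; i++)
        total += src[i];
    return total;
}
```

Calling `sum_unaligned(buf + 1, n)` on a 16-byte-aligned `float` buffer `buf` is valid, whereas passing `buf + 1` to `sum_aligned` would trip the assertion. The trade-off matches the message above: the unaligned form costs a little on older cores such as the Core 2 Duo but removes the alignment requirement on callers.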
Showing 1 changed file with 27 additions and 28 deletions.