switch buffer le optimization #1701
Aligning LE buffers on generic platforms currently uses byte swaps and bytealigns, but it can also be done without swapping by reversing the direction of the bytealign (i.e. shifting towards higher bytes instead of lower):
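To illustrate the idea, here is a minimal sketch in plain C. It is not the code from this PR. The helper names (`bytealign`, `swap32`, `shift_le_with_swaps`, `shift_le_no_swaps`) are hypothetical, and `bytealign` is assumed to behave like a 64-bit concatenate-and-shift that accepts byte counts from 0 to 4, which may differ from what a given runtime's intrinsic allows.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bytealign primitive: concatenate hi:lo into
 * 64 bits, shift right by n bytes (assumed 0 <= n <= 4), keep the low word. */
static uint32_t bytealign(uint32_t hi, uint32_t lo, unsigned n)
{
    uint64_t v = ((uint64_t)hi << 32) | lo;
    return (uint32_t)(v >> (8 * n));
}

/* Reverse the byte order of a 32-bit word. */
static uint32_t swap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Swap-based variant: convert each LE word to BE, align towards lower
 * bytes, then swap the result back. Moves the byte stream `offset` bytes
 * (0..3) further into the buffer; bytes pushed past the end are dropped. */
static void shift_le_with_swaps(const uint32_t *in, uint32_t *out,
                                size_t n, unsigned offset)
{
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t cur = swap32(in[i]);
        out[i] = swap32(bytealign(prev, cur, offset));
        prev = cur;
    }
}

/* Swap-free variant: exchange the operands and align by (4 - offset),
 * i.e. towards higher bytes, working directly on the LE words. */
static void shift_le_no_swaps(const uint32_t *in, uint32_t *out,
                              size_t n, unsigned offset)
{
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t cur = in[i];
        out[i] = bytealign(cur, prev, 4 - offset);
        prev = cur;
    }
}
```

Under these assumptions both functions produce the same output for every offset from 0 to 3, while the second one needs no `swap32` calls per word. Whether that is actually faster depends on the device and runtime.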
I see some performance gain on the Intel CPU and Apple runtimes, but I don't know how well this performs elsewhere. This could use input, especially testing on older AMD GPUs.