switch buffer le optimization #1701

RAN1 opened this Issue Sep 22, 2018 · 0 comments


RAN1 commented Sep 22, 2018

Aligning LE buffers on generic platforms currently uses byte swaps plus a bytealign, but it can also be done without swapping by reversing the direction of the bytealign (i.e. shifting towards the higher bytes instead of the lower ones):
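As a rough illustration of the idea (not the actual kernel code), here is a minimal host-side C sketch that shifts a little-endian word buffer by `k` bytes in both ways. All names (`byteswap32`, `bytealign_lo`, `bytealign_hi`, `shift_le_swap`, `shift_le_direct`) are hypothetical helpers made up for this example, and `bytealign_lo` only assumes the usual "low 32 bits of `(hi:lo) >> 8*c`" bytealign semantics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit word. */
uint32_t byteswap32(uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000ff00u) | ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Assumed bytealign semantics: low 32 bits of (hi:lo) >> 8*c. */
uint32_t bytealign_lo(uint32_t hi, uint32_t lo, unsigned c)
{
  const uint64_t t = ((uint64_t) hi << 32) | lo;
  return (uint32_t) (t >> ((c & 3) * 8));
}

/* Reversed direction: high 32 bits of (hi:lo) << 8*c. */
uint32_t bytealign_hi(uint32_t hi, uint32_t lo, unsigned c)
{
  const uint64_t t = ((uint64_t) hi << 32) | lo;
  return (uint32_t) ((t << ((c & 3) * 8)) >> 32);
}

/* Swap-based variant: swap to BE, align towards lower bytes, swap back.
   Bytes shifted past the last word are dropped. */
void shift_le_swap(const uint32_t *in, uint32_t *out, size_t n, unsigned k)
{
  for (size_t i = 0; i < n; i++)
  {
    const uint32_t prev = (i > 0) ? in[i - 1] : 0;
    out[i] = byteswap32(bytealign_lo(byteswap32(prev), byteswap32(in[i]), k));
  }
}

/* Swap-free variant: align towards higher bytes directly on the LE words. */
void shift_le_direct(const uint32_t *in, uint32_t *out, size_t n, unsigned k)
{
  for (size_t i = 0; i < n; i++)
  {
    const uint32_t prev = (i > 0) ? in[i - 1] : 0;
    out[i] = bytealign_hi(in[i], prev, k);
  }
}

/* Checks that both variants agree for every byte offset 0..3. */
void check_shift_equivalence(void)
{
  const uint32_t in[4] = { 0x03020100u, 0x07060504u, 0x0b0a0908u, 0x0f0e0d0cu };
  uint32_t a[4], b[4];
  for (unsigned k = 0; k < 4; k++)
  {
    shift_le_swap(in, a, 4, k);
    shift_le_direct(in, b, 4, k);
    for (size_t i = 0; i < 4; i++) assert(a[i] == b[i]);
  }
}
```

Calling `check_shift_equivalence()` from a test harness confirms that the two variants produce the same words for this input. Whether the swap-free form is actually faster depends on how the target runtime lowers the 64-bit shift, so this sketch says nothing about performance.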


I see some performance gain on the Intel CPU and Apple runtimes, but I don't know how well this performs elsewhere. This could use input, especially testing on older AMD GPUs.
