Optimize u8x8::trailing_zeros for AArch64 #193

Open
TheIronBorn opened this issue Nov 26, 2018 · 0 comments
Labels
A-AArch64 ARM 64-bit architecture Blocked-LLVM Bugs blocked on bugfixes in LLVM Performance Something isn't fast

Comments

@TheIronBorn
Contributor

LLVM's cttz.v8i8 intrinsic is broken on AArch64 machines: #191

Our current workaround applies u8::trailing_zeros to each lane individually. With 8 lanes, that means eight scalar calls per vector, which can be quite slow.
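For illustration, here is a minimal sketch of what a per-lane workaround of this kind looks like. It uses a plain `[u8; 8]` array as a stand-in for `u8x8`, and the function name `trailing_zeros_per_lane` is hypothetical; the crate's actual implementation may differ.

```rust
/// Lane-wise trailing-zero count over eight `u8` lanes.
/// The `[u8; 8]` array is an assumed stand-in for the `u8x8` vector type.
fn trailing_zeros_per_lane(v: [u8; 8]) -> [u8; 8] {
    let mut out = [0u8; 8];
    for (o, x) in out.iter_mut().zip(v.iter()) {
        // `u8::trailing_zeros` returns a `u32` in 0..=8 (8 for a zero lane),
        // so narrowing back to `u8` never loses information.
        *o = x.trailing_zeros() as u8;
    }
    out
}
```

Each lane is handled by a separate scalar operation, so the work grows with the lane count instead of being done by a single vector instruction.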

It could be optimized by adapting LLVM's algorithm to Rust's AArch64 SIMD intrinsics. Some of those intrinsics may be missing, in which case we would have to implement them as well: rust-lang/stdarch#40.
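As a rough sketch of the kind of algorithm that could be vectorized, one well-known identity is `cttz(x) = popcount((x & -x) - 1)`. Whether this matches what LLVM actually emits for AArch64 is an assumption, not something stated in this issue. The code below applies the identity lane by lane in portable Rust on a `[u8; 8]` stand-in; the function name `trailing_zeros_via_popcount` is hypothetical. Every step (and, negate, subtract, population count) is a lane-wise operation, which is what would make it a candidate for a SIMD version.

```rust
/// Trailing-zero count per lane using `popcount((x & -x) - 1)`.
/// The `[u8; 8]` array is an assumed stand-in for the `u8x8` vector type.
fn trailing_zeros_via_popcount(v: [u8; 8]) -> [u8; 8] {
    let mut out = [0u8; 8];
    for (o, &x) in out.iter_mut().zip(v.iter()) {
        // Isolate the lowest set bit (0 when the lane is 0).
        let lowest = x & x.wrapping_neg();
        // All bits below the lowest set bit become 1.
        // For a zero lane this wraps to 0xFF, giving a count of 8.
        let mask = lowest.wrapping_sub(1);
        // The number of ones in the mask is the trailing-zero count.
        *o = mask.count_ones() as u8;
    }
    out
}
```

A vector version would replace the loop body with the corresponding AArch64 intrinsics, assuming all of them are available in `core::arch::aarch64`.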

@gnzlbg gnzlbg added Performance Something isn't fast A-AArch64 ARM 64-bit architecture Blocked-LLVM Bugs blocked on bugfixes in LLVM labels Nov 26, 2018