optimize NEON functions required for libjpeg-turbo #646
Comments
Here's a list of completed functions with their corresponding commits:
|
Thanks for the reminder! I added some more earlier today, and we'll try to get that last one done soon; I think @Glitch18 is planning to take care of it. |
Yes. Will be pushing the commit soon! |
BTW, once this is done I'd be very interested in any performance data which could point us to something we might be able to optimize in SIMDe. See https://github.com/simd-everywhere/simde/wiki/Performance-Tuning#finding-performance-problems |
Great, thanks! I'll re-run the benchmark in |
I'll re-run the benchmarks and post the results soon, feel free to close this issue. |
The first set of benchmarking/profiling results can be found here: It seems that reusing the Arm Neon intrinsics for WASM made libjpeg-turbo ~3.5x slower than its C implementation (on this benchmark). The functions with the highest tick counts (>= 10) are, ordered from high to low:
Note that libjpeg-turbo is considering a whole new SIMD implementation just for WASM, so please don't spend too much time on this. |
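For readers unfamiliar with what reusing Neon intrinsics on a non-Arm target involves, here is a minimal sketch of a scalar fallback for an 8-lane unsigned saturating add, the operation that the Neon intrinsic `vqadd_u8` performs. The type `u8x8` and the function `u8x8_qadd` are hypothetical names made up for illustration; this is not SIMDe's actual implementation, and the thread does not establish whether fallbacks of this kind are what slows the benchmark down.

```c
#include <stdint.h>

/* Hypothetical 8-lane vector of unsigned bytes (stands in for Neon's uint8x8_t). */
typedef struct { uint8_t lane[8]; } u8x8;

/* Scalar fallback for a lane-wise saturating add: each lane is summed
 * in a wider type and clamped to 255 instead of wrapping around. */
static u8x8 u8x8_qadd(u8x8 a, u8x8 b) {
    u8x8 r;
    for (int i = 0; i < 8; i++) {
        unsigned sum = (unsigned)a.lane[i] + (unsigned)b.lane[i];
        r.lane[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
    return r;
}
```

A portability layer would typically prefer a native instruction when the target has one and only drop to a loop like this otherwise, which is why per-function profiling data is useful for spotting the slow paths.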
@kleisauke is trying to get libjpeg-turbo working on WASM using SIMDe. Here is a list of functions which aren't implemented yet: