Add Sha2 functions #687
We take the fastest time measured across multiple runs. Tested
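For readers who want to reproduce the "fastest of several runs" measurement, here is a minimal sketch in C. It is not the benchmark code from this PR: `min_time_seconds` and `count_call` are hypothetical names introduced for illustration, and `clock()` (CPU time) is an assumed choice of timer.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper (not from the PR): calls fn(arg) `runs` times and
   returns the smallest CPU time, in seconds, observed for a single call. */
static double min_time_seconds(void (*fn)(void *), void *arg, size_t runs)
{
    assert(runs > 0);
    double best = -1.0; /* sentinel: no measurement recorded yet */
    for (size_t i = 0; i < runs; i++) {
        clock_t start = clock();
        fn(arg);
        clock_t end = clock();
        double elapsed = (double)(end - start) / (double)CLOCKS_PER_SEC;
        if (best < 0.0 || elapsed < best)
            best = elapsed; /* keep the fastest run seen so far */
    }
    return best;
}

/* Hypothetical stand-in workload: counts how many times it was called.
   A real benchmark would hash a fixed buffer here instead. */
static void count_call(void *arg)
{
    unsigned long *calls = arg;
    (*calls)++;
}
```

Taking the minimum rather than the mean discards runs that were slowed down by scheduling or cache noise, which is the usual reason for reporting the fastest time in a throughput test.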
Jan 14, 2018
I looked into this a little bit, and I noticed that the throughput test in zig was getting LTO (in the sense that we emit only a single LLVM module/.o file) while the C benchmark had to make function calls across .o files. So I copied the functions from sha256.c into the test .c file and made them all static. This did not change the timings. I found that your zig implementation of sha256 generates 14% faster machine code.
I looked at a comparison of the assembly and it's hard to tell exactly what is different, but it appears that the zig implementation has slightly better instruction selection.
I think I know what's going on. If you look at https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/sha-256-implementations-paper.pdf, there's a section called "Optimizations with rorx". I believe that LLVM is able to come up with these optimizations for the zig implementation, but it somehow does not discover them for the C implementation compiled with clang.
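For context, the rotations in question come from the SHA-256 round functions. The sketch below shows the two "big sigma" functions as defined in FIPS 180-4, written in portable C. The names `rotr32`, `big_sigma0`, and `big_sigma1` are illustrative and not taken from either implementation in this PR. Whether a compiler lowers these rotates to `rorx` (a BMI2 instruction) depends on the target flags and on instruction selection, so treat that part as an assumption to check in the generated assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word right by n bits (0 < n < 32).
   Compilers commonly recognize this shift/or idiom as a single rotate. */
static inline uint32_t rotr32(uint32_t x, unsigned n)
{
    assert(n > 0 && n < 32);
    return (x >> n) | (x << (32 - n));
}

/* SHA-256 Sigma0 per FIPS 180-4: ROTR2 ^ ROTR13 ^ ROTR22 */
static inline uint32_t big_sigma0(uint32_t a)
{
    return rotr32(a, 2) ^ rotr32(a, 13) ^ rotr32(a, 22);
}

/* SHA-256 Sigma1 per FIPS 180-4: ROTR6 ^ ROTR11 ^ ROTR25 */
static inline uint32_t big_sigma1(uint32_t e)
{
    return rotr32(e, 6) ^ rotr32(e, 11) ^ rotr32(e, 25);
}
```

Each sigma needs three rotations of the same input. A classic `ror` overwrites its operand, so the input has to be copied first, whereas `rorx` writes to a separate destination register and leaves the flags untouched, which is why it can save instructions in this pattern.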