Add Intel SHA extension for SHA1 #807
Comments
I am very interested in speed measurements with native instructions.
Indeed, interesting to see the output of However, 2.0.0 is in feature freeze and is likely to be released this week or next, so this would be a candidate for 2.1. The same goes for #808.
@randombit, @neverhub, @cordney, All measurements were taken on a Celeron J3455 running at 1.5 GHz (burst at 2.3 GHz). The library was configured with
Thank you! It will require some work to handle the ISA support properly, but I (or maybe someone else if interested, any takers?) can certainly handle it from here based on your patch. The timing is not right to target this for 2.0, but it is a great candidate for 2.1. How would you like your code to be attributed? Normally in Botan all copyright is held by individual authors, and we collectively distribute all of it under the BSD-2 license. Alternatively, it can be released in files (C) me, distributed under BSD, with a note explaining that it is based on public domain code by you (and also referencing the public domain code you reference, either way). I was not sure if you had a personal preference for non-copyright.
The numbers are quite impressive; that's a 5x increase. Can you say more about the microarchitectures on which the SHA extensions are available today, and on which they may be available in the future? ark.intel.com does not list these yet, even for the Celeron you say you tested on. Will this be in Kaby Lake processors?
I prefer to assign any rights to Botan. Botan can release it under any terms it prefers. (I know some countries endow copyright even if it's unwanted.) @cordney,
The SHA extensions were originally supposed to be part of Apollo Lake (IIRC). In 3Q 2016 we saw the SHA extensions surface under Goldmont (and not Apollo Lake). At this point, I am aware of six processors with the SHA extensions, and all of them are Goldmont:
I'm guessing SHA will proliferate a lot like AES-NI in Intel CPUs. Eventually it will be ubiquitous. I was never able to use ARK to identify the processors with SHA extensions. I used Wikipedia's page on Goldmont, and then worked backwards using the part numbers. When I sourced the motherboard I needed, I basically asked the same question ("does this CPU have SHA?"). A person answered and stated ARK provided the information. But like I said, I was never able to locate it in ARK.
Ok, from http://www.legitreviews.com/intel-cannonlake-added-to-llvms-clang_179210 it seems that the SHA extensions will be available in common desktop processors starting with Cannonlake later this year. Regarding testing, I did not know that Intel has a software development emulator that, among other things, emulates the AES-NI and SHA extensions. So one would not strictly need a real processor to test this.
@randombit, @cordney, @neverhub, The last piece of helpful information might be... the intrinsics are available with:
@cordney I had forgotten about this tool, but it is very useful; I used it in the past to test AES-NI and AVX2 support well before I had hardware.
This extension might dramatically increase the usability of XMSS, at least with SHA-256. Right now signatures are ... quite slow (last time I tried, generating an H16 signature took over an hour on an i7-6700K, and even H10 is on the order of several seconds).
I'm happy to benchmark it for you. Can you supply a sample program?
The times for H10 signatures are reported by
Again, it's a four-core Celeron J3455 running at 1.5 GHz (burst at 2.3 GHz). It's not as impressive as a 6th gen i5 or i7. For example, according to cpu feature flags from
A 3x improvement for a very computationally intensive problem out of the gate is nothing to sneeze at. For XMSS we can probably do much better with multithreaded and/or SIMD execution of many in-flight SHA-256 operations in order to expose more ILP to the CPU.
I looked into this a little, and the code to handle SHA extensions is already in master. I must have done this at some point. Through some oversight it is currently enabled only for x86-32, but everything needed to set the -msha flag for GCC, read the SHA cpuid bit, and so on is already there. @noloader Can you post the output of
Need to install SDE to test this. But it compiles at least. :) Based on GH #807
Yes sir:
|
Merged to master now, thank you!
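For readers following along, the "read SHA cpuid bit" step mentioned earlier in the thread can be sketched roughly as below. This is an illustration only and not Botan's actual detection code: the names `ebx_has_sha` and `cpu_has_sha` are hypothetical, and the sketch assumes GCC or Clang, whose `<cpuid.h>` header provides `__get_cpuid_count`. The one hard fact it relies on is that CPUID leaf 7, sub-leaf 0 reports the SHA extensions in EBX bit 29.

```cpp
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  // GCC/Clang wrapper around the CPUID instruction
#endif

// CPUID leaf 7, sub-leaf 0: the SHA extensions are reported in EBX bit 29.
constexpr std::uint32_t SHA_EBX_BIT = 1u << 29;

// Hypothetical helper: decode the SHA bit from a raw EBX value.
inline bool ebx_has_sha(std::uint32_t ebx) {
    return (ebx & SHA_EBX_BIT) != 0;
}

// Hypothetical runtime query. Returns false on non-x86 targets or when
// the CPU does not support CPUID leaf 7 at all.
inline bool cpu_has_sha() {
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) == 0) {
        return false;  // leaf 7 not available
    }
    return ebx_has_sha(ebx);
#else
    return false;
#endif
}
```

A library would typically call something like `cpu_has_sha()` once at startup and dispatch to the SHA-extension code path only when it returns true, falling back to the SSE2 or portable implementation otherwise.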
This patch adds SHA extension support for SHA1. It is a hack because of my lack of knowledge of Botan. I don't know how to cut in a new ISA or CPU feature, so I changed the SHA SSE2 code to use the SHA extensions for cut-in and testing. Someone more familiar with Botan needs to take it further.
Hopefully it can serve as a starting point for SHA1 using Intel SHA extensions. It would be nice to see it make it into Botan 2.0.
Credit should go to Sean Gulley of Intel. He wrote the article New Instructions Supporting the Secure Hash Algorithm on Intel® Architecture Processors. Later, I found his reference implementation at mitls | experimental | hash to fill in the missing pieces from the Intel blog. We also had to use unaligned loads and stores to avoid SIGBUS on unaligned buffers.
Be careful of the ISA name. ARMv8 has AES and SHA extensions, too. I suspect there could be a collision if not mindful. Here are the CPU feature flags from a Goldmont board running Linux. Notice they call it sha_ni.
Here is the updated sha_sse2.cpp and the diff packaged as a ZIP file.
sha1_sse2_updated.zip
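To illustrate the remark about unaligned loads and stores, here is a minimal sketch rather than code taken from the patch. The function name `copy_block16` is hypothetical. It assumes an SSE2-capable compiler target and falls back to `memcpy` elsewhere. The point is that `_mm_loadu_si128` and `_mm_storeu_si128` accept any address, whereas the aligned variants `_mm_load_si128` and `_mm_store_si128` require 16-byte alignment and fault when a caller passes an arbitrary byte buffer.

```cpp
#include <cstdint>
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Hypothetical helper: move one 16-byte block from src to dst.
// Neither pointer needs to be 16-byte aligned.
inline void copy_block16(const std::uint8_t* src, std::uint8_t* dst) {
#if defined(__SSE2__)
    // Unaligned load: safe for any address, unlike _mm_load_si128.
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src));
    // Unaligned store: safe for any address, unlike _mm_store_si128.
    _mm_storeu_si128(reinterpret_cast<__m128i*>(dst), v);
#else
    // Portable fallback for targets without SSE2.
    std::memcpy(dst, src, 16);
#endif
}
```

Hash input usually arrives as a plain byte pointer with no alignment guarantee, which is why a message-block load in this kind of code would use the unaligned form.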