Support for CPU instructions for hardware-accelerated SHA #139

Closed
GamePad64 opened this issue Feb 21, 2016 · 10 comments

Comments

@GamePad64
Contributor

I haven't found this in the issue list. Two years ago Intel announced new instructions that assist in computing SHA-1 and SHA-256. There are not many Skylake CPUs on the market yet, but all new Intel processors will support these instructions as well. Is there any way to support such hardware acceleration? (I am not an assembly programmer, so I can't make a PR.)

@DevJPM
Contributor

DevJPM commented Feb 21, 2016

Skylake doesn't support the Intel SHA-Extensions.

This article [0] and Wikipedia [1] confirm this.

On 21.02.2016 at 05:24, Alexander Shishenko wrote:

I haven't found this in issue list. Two years ago Intel announced new
assembler instructions, that could assist computing SHA-1 and SHA-256
https://software.intel.com/en-us/articles/intel-sha-extensions.
There are not much Skylake CPU's on the market, but all new Intel
processors will support these instructions as well. Is there any way
to support such hardware acceleration? (I am not an assembler guy, so
I can't make a PR)

@GamePad64
Contributor Author

Oh, I see, thanks! So this is not really urgent for now. Maybe it is worth keeping this feature on a wishlist?

@noloader
Collaborator

What are you guys seeing in the field?

I see a lot of ARM Neon in mobile, so I know there's a [relatively] strong case for AES and SHA on the devices.

I was recently talking to Wei about VIA Padlock RNG, AES and SHA instructions. I'm almost afraid to make non-trivial changes to Rijndael because it's so complicated. I'm kind of afraid of knocking something loose. The SHA files are quickly getting to the same state as Rijndael/AES.

@pavel-odintsov

What about Intel QuickAssist acceleration cards? They have SHA instructions enabled.

@noloader
Collaborator

noloader commented Apr 19, 2016

@pavel-odintsov - For the QuickAssist 8920 Adapter, I think we would need engine support. We don't have that at the moment, but it's on my radar. Even if we did have engine support, I can't really say what else we would need since I've never worked with one.

At $775 per item, we probably won't support it. I buy our testing gear out of my own pocket, the boards are kind of pricey, and there's little ROI for the project. I think the money is better spent on commodity processors or different IoT gadgets since it benefits more people. That could change if Intel shipped me a card.

@noloader
Collaborator

noloader commented Oct 28, 2016

I think Intel CPUs with SHA extensions hit the market recently. It looks like the processors that support them are based on the Goldmont microarchitecture:

  • Pentium J4205 (desktop)
  • Pentium N4200 (mobile)
  • Celeron J3455 (desktop)
  • Celeron J3355 (desktop)
  • Celeron N3450 (mobile)
  • Celeron N3350 (mobile)

I looked through offerings at Amazon for machines with the architecture or the processor numbers, but I did not find any available (yet). I believe Acer had one laptop expected to be available in December 2016 that would meet testing needs.

We added runtime CPU feature detection at Commit ac01277d93636cd7. It will be available in the release ZIP that follows 5.6.5.
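For readers unfamiliar with runtime feature detection, here is a minimal sketch of the idea. It is not the code from the commit above. It assumes a GCC or Clang toolchain on x86 (for `__get_cpuid_count` from `<cpuid.h>`) and falls back to reporting "no SHA" elsewhere. The names `HasShaBit`, `DetectSha` and `SKETCH_HAVE_CPUID` are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Assumption: GCC/Clang on x86 provide <cpuid.h>; other toolchains take the fallback path.
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID 1
#else
#define SKETCH_HAVE_CPUID 0
#endif

// CPUID leaf 7, subleaf 0 reports the SHA extensions in EBX bit 29.
constexpr std::uint32_t kShaBit = 1u << 29;

// Pure helper (hypothetical name): decode the SHA flag from a leaf-7 EBX value.
constexpr bool HasShaBit(std::uint32_t leaf7_ebx) {
    return (leaf7_ebx & kShaBit) != 0;
}

// Returns true if the running CPU advertises the SHA extensions.
// Returns false on non-x86 builds or when CPUID leaf 7 is not available.
bool DetectSha() {
#if SKETCH_HAVE_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    // __get_cpuid_count returns 0 when the requested leaf is unsupported.
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) == 0)
        return false;
    return HasShaBit(ebx);
#else
    return false;
#endif
}
```

Calling `DetectSha()` once at startup and caching the result keeps the check out of the hashing hot path.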

@noloader
Collaborator

noloader commented Dec 1, 2016

We picked up a Celeron J3455. The commits of interest are:

SHA1 was clocking around 9.5 cycles per byte (cpb) for the straight CXX implementation. Using the SHA1 extensions, it's running around 2.7 cpb. SHA256 was running around 19.5 cpb using SSE2 ASM. Using the SHA256 extensions, it's running around 3.9 cpb.
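To put those numbers in perspective, the speedup is simply the ratio of the two cpb figures: roughly 3.5x for SHA1 (9.5 / 2.7) and 5x for SHA256 (19.5 / 3.9). The small sketch below does that arithmetic and also converts cpb to throughput. The helper names are hypothetical, and any clock frequency passed in is an assumption, not a measured value from this thread.

```cpp
// Speedup is the ratio of baseline cycles per byte to accelerated cycles per byte.
constexpr double Speedup(double baseline_cpb, double accelerated_cpb) {
    return baseline_cpb / accelerated_cpb;
}

// Throughput in MiB/s for a given clock (Hz) and cost (cycles per byte).
// The clock is supplied by the caller; it is not taken from the measurements above.
constexpr double ThroughputMiBps(double clock_hz, double cpb) {
    return clock_hz / cpb / (1024.0 * 1024.0);
}
```

For example, `ThroughputMiBps(1.5e9, 2.7)` estimates SHA1 throughput at an assumed sustained 1.5 GHz clock; the real figure depends on the clock the core actually ran at during the benchmark.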

I suspect the results are skewed because the Celerons use TurboBoost. However, all the measurements were taken from the same machine, so the skewing among results should be consistent.

@noloader
Collaborator

noloader commented Feb 7, 2017

Closing the report. We now utilize SHA acceleration for the platforms we support (Intel and ARM).

If I can get my hands on some of the other boards, like the Intel one discussed earlier, then I'd be happy to revisit.

@noloader closed this as completed Feb 7, 2017

@Katharsas

Do you support AMD Zen SHA acceleration too?

@noloader
Collaborator

Do you support AMD Zen SHA acceleration too?

Yes. From cpu.cpp, line 270:

else if (IsAMD(cpuid0))
{
    CRYPTOPP_CONSTANT(RDRAND_FLAG = (1 << 30))
    CRYPTOPP_CONSTANT(RDSEED_FLAG = (1 << 18))
    CRYPTOPP_CONSTANT(   ADX_FLAG = (1 << 19))
    CRYPTOPP_CONSTANT(   SHA_FLAG = (1 << 29))

    CpuId(0x80000005, 0, cpuid2);
    g_cacheLineSize = GETBYTE(cpuid2[2], 0);
    g_hasRDRAND = (cpuid1[2] /*ECX*/ & RDRAND_FLAG) != 0;

    if (cpuid0[0] /*EAX*/ >= 7)
    {
        if (CpuId(7, 0, cpuid2))
        {
            g_hasRDSEED = (cpuid2[1] /*EBX*/ & RDSEED_FLAG) != 0;
            g_hasADX = (cpuid2[1] /*EBX*/ & ADX_FLAG) != 0;
            g_hasSHA = (cpuid2[1] /*EBX*/ & SHA_FLAG) != 0;
        }
    }
}
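A flag such as `g_hasSHA` is typically consulted to choose between a portable code path and an accelerated one. The sketch below shows that selection step only. It is not the library's dispatch code, and `CpuFeatures`, `ShaImpl`, `SelectShaImpl` and `ImplName` are hypothetical names used for illustration.

```cpp
// Which implementation a hash object should use (hypothetical enum).
enum class ShaImpl { Portable, ShaExtensions };

// Result of runtime CPU probing (hypothetical struct, filled in by detection code).
struct CpuFeatures {
    bool has_sha;
};

// Pick the accelerated path only when the CPU reports the SHA extensions.
constexpr ShaImpl SelectShaImpl(const CpuFeatures& features) {
    return features.has_sha ? ShaImpl::ShaExtensions : ShaImpl::Portable;
}

// Human-readable name, e.g. for a benchmark log line.
constexpr const char* ImplName(ShaImpl impl) {
    return impl == ShaImpl::ShaExtensions ? "SHA extensions" : "portable";
}
```

Because the choice depends only on the detected flag, the same binary can run on CPUs with and without the extensions.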
