Performance worse than sjcl lib #18

rubensayshi · 2015-08-03T13:00:00Z

@dcousens asked for a ticket ;)

performance on various engines is a lot worse compared to the 'sjcl' package.
see benchmarking test repo: https://github.com/rubensayshi/pbkdf2-benchmark

the most significant one is when using the old UIWebView on iOS (currently still the default for cordova apps without using a 3rd party plugin to upgrade to WKWebView) where it's close to a 10x difference.
but in the future WKWebView will become the default to use in cordova 4.0.0+.

for chrome (on all platforms) there's also a significant difference between the 2 libs.

for w/e reason firefox sjcl is actually slower ... but (my) firefox is scrap xD

dcousens · 2015-08-03T23:41:54Z

ping @feross, thoughts on whether this might be a Buffer issue before I dig into it deeper?

feross · 2015-08-04T01:13:15Z

You can check whether the buffer instances are based on Uint8Array with buf instanceof Uint8Array, or whether we detected typed array support via Buffer.TYPED_ARRAY_SUPPORT, as defined here. If those are false, then you're getting the fallback Object implementation.

Safari 7 and below has a bug in Object.prototype.constructor, so we're just using the Object implementation instead of adding a workaround, as described here: feross/buffer#63 Perhaps this is the issue?
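
A minimal sketch of the check described above, for anyone who wants to run it inside their own bundle. The helper name describeBuffer is hypothetical. Buffer.TYPED_ARRAY_SUPPORT is a flag of the browser buffer shim, so on Node's native Buffer it may simply be undefined.

```javascript
const { Buffer } = require('buffer')

// Hypothetical helper: reports which Buffer implementation is in use.
function describeBuffer (buf) {
  return {
    // true when the instance is backed by a real typed array
    typedArrayBacked: buf instanceof Uint8Array,
    // set by the browser `buffer` shim; may be undefined on Node's native Buffer
    typedArraySupportFlag: Buffer.TYPED_ARRAY_SUPPORT
  }
}

console.log(describeBuffer(Buffer.alloc(4)))
```

If typedArrayBacked comes back false in a given webview, the slower Object fallback is in use.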

dcousens · 2016-01-12T23:08:51Z

I'm willing to look into this soon.

feross · 2016-01-12T23:55:03Z

Safari 5-7 get the Uint8Array implementation as of the latest buffer version! So you might not even need to fix this anymore.

dcousens · 2016-01-13T00:11:14Z

@feross it actually seems related to the fact that createHmac is creating a new Buffer every time compared to SJCL which re-uses the underlying array.

I'm not sure if this is avoidable without breaking compliance in the create* modules?

dcousens · 2016-01-13T00:12:36Z

E.g https://github.com/crypto-browserify/sha.js/blob/master/sha1.js#L84-L94

feross · 2016-01-13T00:19:40Z

Not sure what you're trying to do, but if you want to create a buffer without allocating a whole new Uint8Array, you can now do this in node.js and the browser:

var buf1 = new Buffer(4)
var buf2 = new Buffer(buf1.buffer)

buf1 and buf2 now share the same underlying ArrayBuffer instance.
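
For readers on current Node.js, where new Buffer() is deprecated, the same sharing can be sketched with Buffer.alloc and Buffer.from(arrayBuffer, byteOffset, length). This is an illustration of the idea above, not code from either library.

```javascript
// Two Buffer views over one ArrayBuffer: no second allocation of the bytes.
const buf1 = Buffer.alloc(4)
const buf2 = Buffer.from(buf1.buffer, buf1.byteOffset, buf1.length)

// A write through one view is visible through the other.
buf2[0] = 0xff
console.log(buf1[0]) // 255
console.log(buf1.buffer === buf2.buffer) // true
```

Passing byteOffset and length matters when the source Buffer is a slice of a larger pooled ArrayBuffer.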

dcousens · 2016-01-13T00:33:15Z

@feross no no, my point is, look at sha1.js, we are creating a new Buffer each time we digest. There is no way to inject a .buffer in there without breaking compliance with node? No?

feross · 2016-01-13T00:35:18Z

Since iojs v3 and node v4, Buffer is a subclass of Uint8Array. So .buffer should work just fine!

dcousens · 2016-01-13T00:36:08Z

@feross I don't understand.
How is that relevant?

dcousens · 2016-01-13T00:36:23Z

I'm talking about .digest()

dcousens · 2016-01-13T00:38:08Z

In https://github.com/crypto-browserify/sha.js/blob/master/hash.js#L41,

Hash.prototype.digest = function (enc) {

Would have to become

Hash.prototype.digest = function (enc, target) {

Where target is an optional target Buffer.
This would not be compliant with node/io, AFAIK.

feross · 2016-01-13T00:38:32Z

I'm probably misunderstanding something here, just ignore me. ¯\_(ツ)_/¯

feross · 2016-01-13T00:39:23Z

Ah, I see your point. Yeah that sounds like it's an API modification.

dcousens · 2016-01-13T00:41:24Z

@rubensayshi our only option at this point would be to replace the entire underlying implementation with a different HMAC stack. I'm not really keen on that.

How are the benchmarks holding up?

fanatid · 2016-02-26T10:25:52Z

Results with improved sha.js (2.4.5), pbkdf2 (3.0.4)

Google Chrome Version 49.0.2623.63 beta (64-bit)

sjcl.misc.pbkdf2 462 = 92.4
pbkdf2.pbkdf2Sync 2599 = 519.8

Firefox 44.0.2 (64-bit)

sjcl.misc.pbkdf2 383 = 76.6
pbkdf2.pbkdf2Sync 2988 = 597.6

node.js v5.6.0 (using browser version)

sjcl.misc.pbkdf2 415 = 83
pbkdf2.pbkdf2Sync 1893 = 378.6

dcousens · 2016-02-27T12:20:05Z

Awesome.

dcousens · 2016-03-24T02:00:36Z

ping @fanatid we could also improve speed by somehow caching the ipad/opad values and their resultant base hash.
They are currently re-calculated every time.

dcousens · 2016-03-24T03:01:13Z

@fanatid thoughts?
Here is some mock up example code I did in C++ where I take advantage of this trick to basically double the speed (twice as fast as OpenSSL):

#include <cassert>
#include <cstdint>
#include <cstring>
#include <openssl/sha.h>
#include <openssl/evp.h>

#define SHA256_DIGEST_LENGTH 32

void pbkdf2_hmac_sha256 (const uint8_t* password, const size_t passwordLength, const uint8_t* salt, const size_t saltLength, const size_t iterations, uint8_t* key, size_t keyLength) {
    SHA256_CTX ipadCtx, opadCtx;

    // pre-calculate the HMAC blocks to avoid constant re-calculation
    {
        uint8_t ipad[SHA256_CBLOCK], opad[SHA256_CBLOCK];
        memset(ipad, 0x36, SHA256_CBLOCK);
        memset(opad, 0x5C, SHA256_CBLOCK);

        if (passwordLength > SHA256_CBLOCK) {
            uint8_t tmp[SHA256_DIGEST_LENGTH];

            SHA256_CTX ctx;
            SHA256_Init(&ctx);
            SHA256_Update(&ctx, password, passwordLength);
            SHA256_Final(tmp, &ctx);

            for (size_t i = 0; i < SHA256_DIGEST_LENGTH; i++) {
                ipad[i] ^= tmp[i];
                opad[i] ^= tmp[i];
            }
        } else {
            for (size_t i = 0; i < passwordLength; i++) {
                ipad[i] ^= password[i];
                opad[i] ^= password[i];
            }
        }

        SHA256_Init(&ipadCtx);
        SHA256_Init(&opadCtx);
        SHA256_Update(&ipadCtx, ipad, SHA256_CBLOCK);
        SHA256_Update(&opadCtx, opad, SHA256_CBLOCK);
    }

    // divide and round upwards
    auto l = keyLength / SHA256_DIGEST_LENGTH;
    if (keyLength % SHA256_DIGEST_LENGTH != 0) l += 1;

    const auto r = keyLength - (l - 1) * SHA256_DIGEST_LENGTH;

    uint8_t T[SHA256_DIGEST_LENGTH], U[SHA256_DIGEST_LENGTH];

    // TODO: REMOVE write32BE 8-bit shortcut
    uint8_t uint32BE[4] = {};

    for (size_t i = 1; i <= l; i++) {
        // TODO: REMOVE write32BE 8-bit shortcut
        assert(i < 256);
        uint32BE[3] = (uint8_t) i;

        SHA256_CTX ctx;

        memcpy(&ctx, &ipadCtx, sizeof(SHA256_CTX));
        SHA256_Update(&ctx, salt, saltLength);
        SHA256_Update(&ctx, &uint32BE, sizeof(uint32BE));
        SHA256_Final(U, &ctx);

        memcpy(&ctx, &opadCtx, sizeof(SHA256_CTX));
        SHA256_Update(&ctx, U, SHA256_DIGEST_LENGTH);
        SHA256_Final(U, &ctx);

        memcpy(T, U, SHA256_DIGEST_LENGTH);

        for (size_t j = 1; j < iterations; j++) {
            memcpy(&ctx, &ipadCtx, sizeof(SHA256_CTX));
            SHA256_Update(&ctx, U, SHA256_DIGEST_LENGTH);
            SHA256_Final(U, &ctx);

            memcpy(&ctx, &opadCtx, sizeof(SHA256_CTX));
            SHA256_Update(&ctx, U, SHA256_DIGEST_LENGTH);
            SHA256_Final(U, &ctx);

            for (size_t k = 0; k < SHA256_DIGEST_LENGTH; k++) T[k] ^= U[k];
        }

        const auto destPos = (i - 1) * SHA256_DIGEST_LENGTH;
        const auto len = (i == l ? r : SHA256_DIGEST_LENGTH);
        memcpy(key + destPos, T, len);
    }
}

#include <iomanip>
#include <iostream>

int main (int argc, char** argv) {
    const uint8_t k[] = "password_________________________________________________________________________________________________________________________";
    const uint8_t s[] = "salt bbbbbbbbbbbbbb";
    const int kl = sizeof(k) - 1;
    const int sl = sizeof(s) - 1;

    // ours
    {
        uint8_t buffer[10];
        pbkdf2_hmac_sha256(k, kl, s, sl, 2000, buffer, sizeof(buffer));

        for (size_t i = 0; i < sizeof(buffer); ++i) {
            std::cout << std::hex << std::setfill('0') << std::setw(2) << (unsigned) buffer[i] << ' ';
        }
        std::cout << std::endl;
    }

    // openssl
    {
        uint8_t buffer[10];
        PKCS5_PBKDF2_HMAC(
            (const char*) k, kl,
            s, sl,
            2000,
            EVP_sha256(),
            sizeof(buffer),
            buffer
        );

        for (size_t i = 0; i < sizeof(buffer); ++i) {
            std::cout << std::hex << std::setfill('0') << std::setw(2) << (unsigned) buffer[i] << ' ';
        }
        std::cout << std::endl;
    }

    return 0;
}
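
The same ipad/opad caching trick can be sketched in JavaScript on top of Node's built-in crypto module. This is only an illustration under stated assumptions: it relies on hash.copy() (Node 13.1+), which is not available through create-hash in the browser, and the function name pbkdf2Sha256Cached is hypothetical. It is not how pbkdf2 or sha.js are implemented.

```javascript
const crypto = require('crypto')

const BLOCK_SIZE = 64 // SHA-256 block size in bytes
const DIGEST_SIZE = 32 // SHA-256 digest size in bytes

// Hypothetical name; PBKDF2-HMAC-SHA256 with the two HMAC pad states cached.
function pbkdf2Sha256Cached (password, salt, iterations, keyLength) {
  // HMAC hashes keys that are longer than one block.
  let key = Buffer.from(password)
  if (key.length > BLOCK_SIZE) key = crypto.createHash('sha256').update(key).digest()

  const ipad = Buffer.alloc(BLOCK_SIZE, 0x36)
  const opad = Buffer.alloc(BLOCK_SIZE, 0x5c)
  for (let i = 0; i < key.length; i++) {
    ipad[i] ^= key[i]
    opad[i] ^= key[i]
  }

  // Absorb each pad once; every HMAC call below starts from a copy of these states.
  const innerBase = crypto.createHash('sha256').update(ipad)
  const outerBase = crypto.createHash('sha256').update(opad)
  const hmac = (...chunks) => {
    const inner = innerBase.copy()
    for (const chunk of chunks) inner.update(chunk)
    return outerBase.copy().update(inner.digest()).digest()
  }

  const out = Buffer.alloc(keyLength)
  const blockIndex = Buffer.alloc(4)
  const blocks = Math.ceil(keyLength / DIGEST_SIZE)
  for (let i = 1; i <= blocks; i++) {
    blockIndex.writeUInt32BE(i, 0)
    let u = hmac(salt, blockIndex)
    const t = Buffer.from(u) // copy of U_1, accumulates the XOR of all U_j
    for (let j = 1; j < iterations; j++) {
      u = hmac(u)
      for (let k = 0; k < DIGEST_SIZE; k++) t[k] ^= u[k]
    }
    t.copy(out, (i - 1) * DIGEST_SIZE) // copy() stops at the end of `out`
  }
  return out
}

console.log(pbkdf2Sha256Cached('password', 'salt', 1000, 32).toString('hex'))
```

The output can be compared against crypto.pbkdf2Sync with the same arguments and 'sha256' as the digest. Each iteration then costs two compression calls instead of four, which is where the roughly 2x saving mentioned above would come from.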

fanatid · 2016-03-24T05:31:02Z

@dcousens yep, we can cache it and I'm sure that it would give a significant performance improvement, but it will not be easy and could be easily broken by createHash

dcousens · 2016-03-24T06:51:59Z

@fanatid granted, but this doesn't have to rely on create-hash if the API isn't suitable.

calvinmetcalf · 2016-09-22T12:01:35Z

we could also look back into using subtle crypto in the places that support it

dcousens · 2016-09-22T23:16:43Z

@calvinmetcalf is that faster?

calvinmetcalf · 2016-09-23T11:50:49Z

it's native so yes it would be, you could test with native-crypto which (in the browser) uses that if it's available and then this if not
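
A rough sketch of what the subtle crypto route looks like, assuming an environment that exposes the Web Crypto API. Here it is reached through Node's crypto.webcrypto (Node 15+); in a browser the same calls hang off crypto.subtle. The function name pbkdf2Subtle is hypothetical and this is not the code from native-crypto.

```javascript
const { webcrypto } = require('crypto')

// Hypothetical helper: PBKDF2 via Web Crypto, returns a Promise of a Buffer.
async function pbkdf2Subtle (password, salt, iterations, keyLength, hash = 'SHA-256') {
  const enc = new TextEncoder()
  // The password is imported as raw key material usable only for derivation.
  const baseKey = await webcrypto.subtle.importKey(
    'raw', enc.encode(password), 'PBKDF2', false, ['deriveBits']
  )
  // deriveBits takes the output length in bits, not bytes.
  const bits = await webcrypto.subtle.deriveBits(
    { name: 'PBKDF2', salt: enc.encode(salt), iterations, hash },
    baseKey,
    keyLength * 8
  )
  return Buffer.from(bits)
}

pbkdf2Subtle('password', 'salt', 1000, 32).then((key) => {
  console.log(key.toString('hex'))
})
```

Note that this path is Promise-based only, so it can stand in for the async pbkdf2 call but not for pbkdf2Sync.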

dcousens · 2017-03-19T11:20:26Z

Is this still a concern?

rubensayshi · 2017-05-11T10:34:53Z

not for me, using asmcrypto.js ;)

dcousens · 2017-05-11T10:59:27Z

Can you post a bench? It might be worth dropping it in.

calvinmetcalf · 2017-05-11T12:05:32Z

with #59 we should get a speedup

dcousens · 2017-05-11T12:09:06Z

Thanks @calvinmetcalf , forgot about it.
Merged :)

Still worth comparing though :)

calvinmetcalf · 2017-05-11T12:12:43Z

oh yes I agree, I think I'll also publish all 3 of the updates we have in the wings

dcousens · 2017-05-19T06:57:10Z

Tested this today.

sjcl is 1.6x faster.
asmcrypto.js is 4x faster.

If we omit the new changes, sjcl is about 5x faster...

Win!
But haven't won the war yet.
asmcrypto.js is going to be hard to beat...

dcousens · 2017-05-19T07:33:27Z

I don't think we'd ever beat asmcrypto.js without breaking module isolation.

It inlines the ipad/opad directly... https://github.com/vibornoff/asmcrypto.js/blob/master/src/hash/sha256/sha256.asm.js#L745

calvinmetcalf · 2017-05-19T12:23:21Z

are you testing async in a browser because I bet we're faster there

dcousens · 2017-05-19T12:47:18Z

@calvinmetcalf because of webcrypto?

calvinmetcalf · 2017-05-19T14:20:52Z

yeah plus it's actually async (where supported)


rubensayshi · 2017-05-31T13:09:37Z

oh wow seems I've missed the webcrypto support addition, nice!

dcousens · 2017-12-18T00:50:40Z

Closing in favour of #75

dcousens self-assigned this Aug 3, 2015

dcousens added the bug label Aug 13, 2015

dcousens changed the title ~~performance vs sjcl lib~~ Performance worse than sjcl lib Aug 13, 2015

dcousens mentioned this issue Aug 17, 2015

pbkdf2 performance vs sjcl bitcoinjs/bip39#21

Closed

fanatid mentioned this issue Mar 2, 2016

Improve Speed cryptocoinjs/scryptsy#3

Open

dcousens added feature help wanted and removed bug labels Jun 21, 2016

dcousens closed this as completed Mar 19, 2017

dcousens removed their assignment Mar 19, 2017

dcousens reopened this Mar 19, 2017

calvinmetcalf mentioned this issue Apr 24, 2017

use native crypto when available #53

Closed

dcousens mentioned this issue Apr 28, 2017

do our own hmacing #54

Merged

jtormey mentioned this issue May 26, 2017

Make SJCL an optional dependency blockchain/My-Wallet-V3#342

Closed

dcousens closed this as completed Apr 16, 2018
