Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise OPENSSL_init_crypto using new atomics primitives #13733

Closed
wants to merge 5 commits into from

Conversation

@mattcaswell
Copy link
Member

@mattcaswell mattcaswell commented Dec 23, 2020

We introduce some new atomics primitives and use them to optimise the performance of OPENSSL_init_crypto.

We add a new check to the beginning of OPENSSL_init_crypto to end early if we have already initialised everything that we need to.

I've tested the new atomics primitives as best I can but:

  1. My solaris VM only has gcc not the solaris compiler, so I can't test that codepath
  2. I'm unsure when the Windows functions I'm using were first introduced and which versions they are available on. The MS documentation seems a little lacking in that regard, and I'm not Windows savvy enough to know.

I tested using the same methodology as outlined in #13730. Both the before and after tests had the commits from #13730, #13731 applied (as well as the already committed #13721).

Before:

sha256           41150.36k
sha256           40060.75k
sha256           39806.53k
sha256           45626.44k
sha256           38429.75k

Which gives an average of 41014.77k

After:

sha256           43953.54k
sha256           43949.66k
sha256           47193.21k
sha256           39278.57k
sha256           47358.34k

Which is an average of 44346.66k.

This represents an overall performance improvement in this test of about 8%.

This provides a partial fix for #13725 and #13578

mattcaswell added 4 commits Dec 22, 2020
We add an implementation for CRYPTO_atomic_or() and CRYPTO_atomic_load()
If everything has already been initialised we can check this with a
single test at the beginning of OPENSSL_init_crypto() and therefore
reduce the amount of time spent in this function. Since this is called
via very many codepaths this should have significant performance benefits.

Partially fixes #13725 and #13578
Also tests the older CRYPTO_atomic_add() which was without a test
@beldmit
Copy link
Member

@beldmit beldmit commented Dec 23, 2020

@mattcaswell, I don't think it's the slowest part of the code currently. I agree that OPENSSL_init_crypto is worth optimizing and the ENGINE_get_digest_engine calls should be eliminated - but is it possible to avoid the ENGINE_get_digest_engine calls for the provided digests?

@beldmit
Copy link
Member

@beldmit beldmit commented Dec 23, 2020

Nice job!

@beldmit
Copy link
Member

@beldmit beldmit commented Dec 23, 2020

The time spent in ENGINE_get_digest_engine became neglectable and even eliminating this call does not give any speedup.
Many thanks!


=item *

CRYPTO_atomic_or() atomically ors I<op> to I<*val> and returns the

This comment has been minimized.

@richsalz

richsalz Dec 23, 2020
Contributor

Is it a boolean/logical or? And not a C "any non-zero bit is true" kind of or? Like | versus ||? I'm sure it is, but it might be worth calling that out in the docs here and consider changing the name.

This comment has been minimized.

@mattcaswell

mattcaswell Dec 23, 2020
Author Member

Its a bitwise or (i.e. | not ||). I have made this clearer in the docs. I don't think its necessary to change the name - the underlying atomic operation names don't feel the need to make this distinction.

This comment has been minimized.

@richsalz

richsalz Dec 23, 2020
Contributor

made it clearer in the docs.

WFM, thanks.

@@ -86,6 +86,9 @@ int CRYPTO_THREAD_unlock(CRYPTO_RWLOCK *lock);
void CRYPTO_THREAD_lock_free(CRYPTO_RWLOCK *lock);

int CRYPTO_atomic_add(int *val, int amount, int *ret, CRYPTO_RWLOCK *lock);
int CRYPTO_atomic_or(uint64_t *val, uint64_t op, uint64_t *ret,
CRYPTO_RWLOCK *lock);
int CRYPTO_atomic_load(uint64_t *val, uint64_t *ret, CRYPTO_RWLOCK *lock);

This comment has been minimized.

@kroeckx

kroeckx Dec 23, 2020
Member

Is there a reason to make those functions public?

This comment has been minimized.

@mattcaswell

mattcaswell Dec 24, 2020
Author Member

I pondered this myself. We made the other "atomic" function public, and other similar threading type functions, so I decided that if we thought those ones were generally useful, then this one is too. OTOH, I don't have a strong preference if you'd rather them not be.

Copy link
Member

@beldmit beldmit left a comment

LGTM

@openssl-machine
Copy link

@openssl-machine openssl-machine commented Dec 30, 2020

This pull request is ready to merge

openssl-machine pushed a commit that referenced this pull request Dec 31, 2020
We add an implementation for CRYPTO_atomic_or() and CRYPTO_atomic_load()

Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from #13733)
openssl-machine pushed a commit that referenced this pull request Dec 31, 2020
If everything has already been initialised we can check this with a
single test at the beginning of OPENSSL_init_crypto() and therefore
reduce the amount of time spent in this function. Since this is called
via very many codepaths this should have significant performance benefits.

Partially fixes #13725 and #13578

Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from #13733)
openssl-machine pushed a commit that referenced this pull request Dec 31, 2020
Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from #13733)
openssl-machine pushed a commit that referenced this pull request Dec 31, 2020
Also tests the older CRYPTO_atomic_add() which was without a test

Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from #13733)
@beldmit
Copy link
Member

@beldmit beldmit commented Dec 31, 2020

Merged. Many thanks!

@beldmit beldmit closed this Dec 31, 2020
kiyolee added a commit to kiyolee/openssl that referenced this pull request Dec 31, 2020
We add an implementation for CRYPTO_atomic_or() and CRYPTO_atomic_load()

Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from openssl#13733)
kiyolee added a commit to kiyolee/openssl that referenced this pull request Dec 31, 2020
If everything has already been initialised we can check this with a
single test at the beginning of OPENSSL_init_crypto() and therefore
reduce the amount of time spent in this function. Since this is called
via very many codepaths this should have significant performance benefits.

Partially fixes openssl#13725 and openssl#13578

Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from openssl#13733)
kiyolee added a commit to kiyolee/openssl that referenced this pull request Dec 31, 2020
Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from openssl#13733)
kiyolee added a commit to kiyolee/openssl that referenced this pull request Dec 31, 2020
Also tests the older CRYPTO_atomic_add() which was without a test

Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from openssl#13733)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants