Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The performance of nginx drops a lot when using OpenSSL 3.0 #21833

Open
Tracked by #453 ...
ShuaiYuan21 opened this issue Aug 25, 2023 · 13 comments
Open
Tracked by #453 ...

The performance of nginx drops a lot when using OpenSSL 3.0 #21833

ShuaiYuan21 opened this issue Aug 25, 2023 · 13 comments
Labels
branch: master Merge to master branch triaged: feature The issue/pr requests/adds a feature triaged: performance The issue/pr reports/fixes a performance concern

Comments

@ShuaiYuan21
Copy link
Contributor

Some parameters:

Cipher suite: TLS_AES_256_GCM_SHA384
TLS version: TLSv1.3
Kx=EC
Au=EC
ECDH_curve=prime256v1

CPS test result:

OpenSSL 1.1.1 OpenSSL 3.0.9 drop
4295.38 3132.76 27.07%

Flame Graph:

nginx with OpenSSL 1.1.1:
sw_ec_ossl1_sw

nginx with OpenSSL 3.0.9
sw_ec_ossl3_sw

Root Cause

1. EVP_PKEY_derive_set_peer_ex
image

I don't know if this part of the operation is necessary.
But it can be seen that it takes up a lot of CPU resources, so in order to eliminate its impact on performance, we can disable it in the source code.

diff --git a/crypto/evp/exchange.c b/crypto/evp/exchange.c
index d7a4ad142a..58eea73b80 100644
--- a/crypto/evp/exchange.c
+++ b/crypto/evp/exchange.c
@@ -499,7 +499,7 @@ int EVP_PKEY_derive_set_peer_ex(EVP_PKEY_CTX *ctx, EVP_PKEY *peer,

 int EVP_PKEY_derive_set_peer(EVP_PKEY_CTX *ctx, EVP_PKEY *peer)
 {
-    return EVP_PKEY_derive_set_peer_ex(ctx, peer, 1);
+    return EVP_PKEY_derive_set_peer_ex(ctx, peer, 0);
 }

Performance comparison after EVP_PKEY_derive_set_peer_ex disabled

OpenSSL 1.1.1 OpenSSL 3.0.9 drop
4295.38 3747.19 12.76%

The flame graph after EVP_PKEY_derive_set_peer_ex disabled
sw_ec_ossl3_sw_no_set_peer

Now, it looks good, and the performance improved a lot.

Below are based on the flame graph that disabled the EVP_PKEY_derive_set_peer_ex

2. do_sigver_init
This function takes up ~4% of CPU resources.
image

3. EVP_xxx_fetch
These functions take up ~6.3% of CPU resources.
In OpenSSL, these functions are similar to engine_table_select, which consumes fewer CPU cycles.
image

Most of it is a lock operation.
image

4. Some other code changes

Question

  1. If EVP_PKEY_derive_set_peer_ex is not necessary, can we disable it when config the project?
  2. Will the function do_sigver_init and EVP_xxx_fetch be optimized in the future release?
@ShuaiYuan21 ShuaiYuan21 added the issue: question The issue was opened to ask a question label Aug 25, 2023
@mattcaswell
Copy link
Member

If EVP_PKEY_derive_set_peer_ex is not necessary, can we disable it when config the project?

From the migration guide:

The public key check has moved from EVP_PKEY_derive() to EVP_PKEY_derive_set_peer()

This may mean result in an error in L<EVP_PKEY_derive_set_peer(3)> rather than
during L<EVP_PKEY_derive(3)>.
To disable this check use EVP_PKEY_derive_set_peer_ex(dh, peer, 0).

https://www.openssl.org/docs/man3.0/man7/migration_guide.html

So, it is expected that the EVP_PKEY_derive_set_peer performance will drop - but only because it is now doing something that EVP_PKEY_derive was previously doing that it is no longer doing. Since both function calls are necessary to perform a derive operation, this move is neutral to performance overall (unless there are multiple calls to derive for one derive_set_peer call, in which case this is a net positive for performance).

The call is necessary.

Will the function do_sigver_init and EVP_xxx_fetch be optimized in the future release?

There are many performance improvements in the 3.1 release compared to 3.0, and further performance improvements in the forthcoming 3.2.

@t8m
Copy link
Member

t8m commented Aug 25, 2023

The problem is that without the validate_peer parameter set the public key of the peer is not validated which means it can be a bogus key not providing security against eavesdroppers or mitm.

@t8m
Copy link
Member

t8m commented Aug 25, 2023

However perhaps EVP_PKEY_public_check_quick() would be sufficient instead of the full check. @slontis ?

@t8m t8m added branch: master Merge to master branch triaged: feature The issue/pr requests/adds a feature and removed issue: question The issue was opened to ask a question labels Aug 25, 2023
@slontis
Copy link
Member

slontis commented Aug 25, 2023

The checks were mainly written to satisfy keyagreement related key validation as defined in sp800-56A, so there is no RSA partial validation.
Only dh and ec have a partial test which map to
Section 5.6.2.3.1 and 5.6.2.3.4 of SP800-56A R3.

@t8m
Copy link
Member

t8m commented Aug 25, 2023

Does that imply we could use EVP_PKEY_public_check_quick() in derive_ex()? If there is no quick check the flag as set in EVP_PKEY_public_check_quick() would be just ignored by the provider and the public check will still be performed.

@davidben
Copy link
Contributor

Looks like the "quick" check is checking that the point isn't infinity, checking coefficients are in range, and checking if the point is on a curve. Other than potentially the first one (depending on your invariants), all these checks ought to have been performed at the time you constructed the EC_KEY. I.e. OpenSSL should not allow you to construct an EC_POINT that is off the curve... I thought you all fixed that in 1.1.x or so? But it's also not that expensive to redo wastefully.

The "full" check additionally checks that point * order = infinity. This one is expensive, as you're effectively doing twice as much ECDH as you need. It is also completely pointless if you're on a prime-order curve, where there are no small subgroups, and practically all curves used in this part of the code have prime order.

@davidben
Copy link
Contributor

davidben commented Aug 25, 2023

practically all curves used in this part of the code have prime order

Though, since OpenSSL supports arbitrary curves as well as a some built-in curves with a cofactor, keep in mind that when you aren't on a prime-order curve, you may need to take some care here. (I don't know that part as well because BoringSSL just doesn't support EC_GROUP with cofactors.)

@slontis
Copy link
Member

slontis commented Aug 27, 2023

Section 5.6.2.2.2 of SP800-56Ar3 says you can do either the partial OR the full public key validation, so changing to the partial check should be sufficient.

It is also not helped currently by the code performing the point multiply twice..

@slontis
Copy link
Member

slontis commented Aug 28, 2023

Perhaps if we know that there is no way that an EC_KEY can exist without already having done most of the checks, then maybe we could remove some of the redundant checks (such as pointoncureve) and comment why they are removed (even if it was only for the partial case).

@t8m
Copy link
Member

t8m commented Aug 28, 2023

Though, since OpenSSL supports arbitrary curves as well as a some built-in curves with a cofactor, keep in mind that when you aren't on a prime-order curve, you may need to take some care here. (I don't know that part as well because BoringSSL just doesn't support EC_GROUP with cofactors.)

We could do the quick check for prime order curves, otherwise do the full check.

mattcaswell added a commit to mattcaswell/openssl that referenced this issue Aug 30, 2023
This code was added in error and is entirely redundant. It is also an
expensive operation (e.g. see openssl#21833).

Fixes openssl#21834
openssl-machine pushed a commit that referenced this issue Sep 1, 2023
This code was added in error and is entirely redundant. It is also an
expensive operation (e.g. see #21833).

Fixes #21834

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from #21902)
openssl-machine pushed a commit that referenced this issue Sep 1, 2023
This code was added in error and is entirely redundant. It is also an
expensive operation (e.g. see #21833).

Fixes #21834

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from #21902)

(cherry picked from commit 3961991)
@wbl
Copy link
Contributor

wbl commented Sep 21, 2023

We've also run into the use of the expensive test instead of quick check at Akamai, in the course of our 3.0 migration. I don't however have a patch that does the right thing with non-NIST curves (although x25519 should bypass all of this).

@wbl
Copy link
Contributor

wbl commented Sep 21, 2023

For find_sig_alg optimization the problem is that the useable hash functions are not recorded in the EVP_PKEY, and so the calls are expensive. Optimizing by saying if you use a named curve you probably support the right hash is one possibility and is what I tried, but it would be better if the EVP_PKEY could keep that information around v. need an expensive provider call every time.

@paulidale paulidale added the triaged: performance The issue/pr reports/fixes a performance concern label Nov 7, 2023
@testn
Copy link

testn commented Dec 19, 2023

any updates? It seems worrisome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch: master Merge to master branch triaged: feature The issue/pr requests/adds a feature triaged: performance The issue/pr reports/fixes a performance concern
Projects
Status: Backlog
Development

No branches or pull requests

8 participants