perf: Avoid taking too long time with high thread counts
We change the tests to perform the same number
of iterations regardless of the thread counts.
Although this raises the running time for small
number of threads, this avoids very long running
times for 100 or more threads.

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from #162)
t8m committed Jan 18, 2024
1 parent 579bcdc commit f86e03b
Showing 9 changed files with 90 additions and 69 deletions.
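
The change in work distribution described in the commit message can be sketched as follows. This is only an illustration under stated assumptions: the helper names `total_calls_old`, `calls_per_thread_new` and `total_calls_new` are hypothetical and do not appear in the commit, which uses macros such as `NUM_CALLS_PER_RUN` and a file-scope `threadcount` instead.

```c
#include <stddef.h>

/*
 * Hypothetical helpers for illustration only; the commit itself uses
 * macros and a file-scope threadcount variable.
 */

/* Old scheme: each thread does a fixed amount of work, so the total
 * amount of work grows linearly with the number of threads. */
static size_t total_calls_old(size_t calls_per_thread, size_t threadcount)
{
    return calls_per_thread * threadcount;
}

/* New scheme: a fixed total is split across the threads using integer
 * division, mirroring the "NUM_CALLS_PER_RUN / threadcount" loop bound. */
static size_t calls_per_thread_new(size_t calls_per_run, size_t threadcount)
{
    return calls_per_run / threadcount;
}

/* Calls actually performed under the new scheme. Integer division
 * truncates, so this can be slightly below calls_per_run when
 * threadcount does not divide it evenly. */
static size_t total_calls_new(size_t calls_per_run, size_t threadcount)
{
    return calls_per_thread_new(calls_per_run, threadcount) * threadcount;
}
```

With the old per-thread count of 10000 calls, 100 threads perform 1000000 calls in total, whereas the new fixed total of 100000 gives each of those threads 1000 calls. A single thread goes from 10000 to 100000 calls, which is the longer running time for small thread counts that the commit message mentions.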
56 changes: 34 additions & 22 deletions perf/README
@@ -41,52 +41,55 @@ labels.
randbytes
---------

The randbytes test repeated calls the RAND_bytes() function in blocks of 100
calls, and 100 blocks per thread. The number of threads to use is provided as
an argument and the test reports the average time take to execute a block of 100
RAND_bytes() calls.
The randbytes test does 100000 calls of the RAND_bytes() function divided
evenly among multiple threads. The number of threads to use is provided as
an argument and the test reports the average time taken to execute a block of
1000 RAND_bytes() calls.

handshake
---------

Performs a combined in-memory client and server handshake. Each thread performs
1000 such handshakes. It takes 2 arguments:
Performs a combined in-memory client and server handshake. In total 100000
handshakes are performed, divided evenly among the threads. It takes 2 arguments:

certsdir - A directory where 2 files exist (servercert.pem and serverkey.pem) for
the server certificate and key. The test/certs directory of the main OpenSSL
source repository contains such files for all supported branches.

threadcount - The number of threads to perform handshakes on in the test

The output is two values: the average time taken for a handshake in us, and the
average handshakes per second performed over the course of the test.
The output is two values: the average time taken for a single handshake in us,
and the average number of simultaneous handshakes per second performed over the
course of the test.

sslnew
------

The sslnew test repeatedly constructs a new SSL object and associates it with a
newly constructed read BIO and a newly constructed write BIO, and finally frees
them again. It does this in blocks of 100 sets of calls, and 100 blocks per
threads. The number of threads to use is provided as an argument and the test
reports the average time taken to execute a block of 100 construction/free calls.
them again. It does 100000 repetitions divided evenly among the threads.
The number of threads to use is provided as an argument and the test
reports the average time taken to execute a block of 1000 construction/free
calls.

newrawkey
---------

The newrawkey test repeatedly calls the EVP_PKEY_new_raw_public_key_ex()
function in blocks of 100 calls, and 100 blocks per thread. The number of
threads to use is provided as an argument and the test reports the average time
take to execute a block of 100 EVP_PKEY_new_raw_public_key_ex() calls.
function. It does 100000 repetitions divided evenly among the threads. The
number of threads to use is provided as an argument and the test reports the
average time taken to execute a block of 1000 EVP_PKEY_new_raw_public_key_ex()
calls.

Note that this test does not support OpenSSL 1.1.1.

rsasign
-------

The rsasign test repeatedly calls the EVP_PKEY_sign_init()/EVP_PKEY_sign()
functions in blocks of 100 calls, and 100 blocks per thread, using a 512 bit RSA
key. The number of threads to use is provided as an argument and the test
reports the average time take to execute a block of 100
functions, using a 512 bit RSA key. It does 100000 repetitions divided evenly
among the threads. The number of threads to use is provided as an argument and
the test reports the average time taken to execute a block of 1000
EVP_PKEY_sign_init()/EVP_PKEY_sign() calls.

x509storeissuer
@@ -97,12 +100,21 @@ is used in certificate chain building as part of a verify operation). The test
assumes that the default certificates directory exists but is empty. For a
default configuration this is "/usr/local/ssl/certs". The test takes the number
of threads to use as an argument and the test reports the average time taken to
execute a block of 100 X509_STORE_CTX_get1_issuer() calls.
execute a block of 1000 X509_STORE_CTX_get1_issuer() calls.

providerdoall
-------------

The providerdoall test repeatedly calls the OSSL_PROVIDER_do_all() function in
blocks of 100 calls, and 100 blocks per thread. The number of threads to use is
provided as an argument and the test reports the average time take to execute a
block of 100 OSSL_PROVIDER_do_all() calls.
The providerdoall test repeatedly calls the OSSL_PROVIDER_do_all() function.
It does 100000 repetitions divided evenly among the threads. The number of
threads to use is provided as an argument and the test reports the average time
taken to execute a block of 1000 OSSL_PROVIDER_do_all() calls.

pemread
-------

The pemread test repeatedly calls the PEM_read_bio_PrivateKey() function on
a memory BIO with a private RSA key. It does 100000 repetitions divided evenly
among the threads. The number of threads to use is provided as an argument and
the test reports the average time taken to execute a block of 1000
PEM_read_bio_PrivateKey() calls.
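
The "average time per block of 1000 calls" figure that the README text above describes can be sketched as below. This is a minimal illustration, assuming the run duration has already been converted to microseconds; the helper name `avg_block_time_us` is hypothetical, while the three macros are the ones introduced by this commit.

```c
#include <stdint.h>

#define NUM_CALLS_PER_BLOCK 1000
#define NUM_CALL_BLOCKS_PER_RUN 100
#define NUM_CALLS_PER_RUN (NUM_CALLS_PER_BLOCK * NUM_CALL_BLOCKS_PER_RUN)

/*
 * Hypothetical helper: average wall-clock time, in microseconds, per
 * block of NUM_CALLS_PER_BLOCK calls, given the duration of the whole
 * run. Because the total number of calls is fixed, the divisor no
 * longer depends on the thread count.
 */
static double avg_block_time_us(uint64_t run_duration_us)
{
    return (double)run_duration_us / NUM_CALL_BLOCKS_PER_RUN;
}
```

For example, a run that takes 2500000 us overall corresponds to 25000 us per block of 1000 calls, whatever number of threads shared the work.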
11 changes: 6 additions & 5 deletions perf/handshake.c
@@ -13,14 +13,16 @@
#include <openssl/ssl.h>
#include "perflib/perflib.h"

#define NUM_HANDSHAKES_PER_THREAD 1000
#define NUM_HANDSHAKES_PER_RUN 100000

int err = 0;

static SSL_CTX *sctx = NULL, *cctx = NULL;

OSSL_TIME *times;

static int threadcount;

static void do_handshake(size_t num)
{
SSL *clientssl = NULL, *serverssl = NULL;
@@ -30,7 +32,7 @@ static void do_handshake(size_t num)

start = ossl_time_now();

for (i = 0; i < NUM_HANDSHAKES_PER_THREAD; i++) {
for (i = 0; i < NUM_HANDSHAKES_PER_RUN / threadcount; i++) {
ret = perflib_create_ssl_objects(sctx, cctx, &serverssl, &clientssl,
NULL, NULL);
ret &= perflib_create_ssl_connection(serverssl, clientssl,
@@ -48,7 +50,6 @@ static void do_handshake(size_t num)

int main(int argc, char *argv[])
{
int threadcount;
double persec;
OSSL_TIME duration, av;
uint64_t us;
@@ -111,9 +112,9 @@ int main(int argc, char *argv[])
av = times[0];
for (i = 1; i < threadcount; i++)
av = ossl_time_add(av, times[i]);
av = ossl_time_divide(av, NUM_HANDSHAKES_PER_THREAD * threadcount);
av = ossl_time_divide(av, NUM_HANDSHAKES_PER_RUN);

persec = ((NUM_HANDSHAKES_PER_THREAD * threadcount * OSSL_TIME_SECOND)
persec = ((NUM_HANDSHAKES_PER_RUN * OSSL_TIME_SECOND)
/ (double)ossl_time2ticks(duration));

if (terse) {
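
The two handshake statistics computed in the hunk above can be sketched in isolation as follows. This is an illustration under stated assumptions, not the perflib code: plain `uint64_t` nanosecond counts stand in for `OSSL_TIME`, `NS_PER_SECOND` stands in for `OSSL_TIME_SECOND`, and the helper names `avg_handshake_ns` and `handshakes_per_second` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_HANDSHAKES_PER_RUN 100000
/* Assumed stand-in for OSSL_TIME_SECOND, with ticks in nanoseconds. */
#define NS_PER_SECOND 1000000000ULL

/*
 * Average time for a single handshake: the elapsed times recorded by
 * all threads are summed and divided by the fixed total number of
 * handshakes in the run.
 */
static uint64_t avg_handshake_ns(const uint64_t *thread_ns, size_t threadcount)
{
    uint64_t sum = 0;
    size_t i;

    for (i = 0; i < threadcount; i++)
        sum += thread_ns[i];
    return sum / NUM_HANDSHAKES_PER_RUN;
}

/*
 * Throughput: total handshakes divided by the wall-clock duration of
 * the whole run, expressed per second.
 */
static double handshakes_per_second(uint64_t wall_ns)
{
    return ((double)NUM_HANDSHAKES_PER_RUN * NS_PER_SECOND) / (double)wall_ns;
}
```

Note that the first figure is based on summed per-thread time while the second is based on wall-clock time, so with several threads running concurrently the two are not simple reciprocals of each other.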
14 changes: 8 additions & 6 deletions perf/newrawkey.c
@@ -13,9 +13,10 @@
#include <openssl/evp.h>
#include "perflib/perflib.h"

#define NUM_CALLS_PER_BLOCK 100
#define NUM_CALL_BLOCKS_PER_THREAD 100
#define NUM_CALLS_PER_THREAD (NUM_CALLS_PER_BLOCK * NUM_CALL_BLOCKS_PER_THREAD)
#define NUM_CALLS_PER_BLOCK 1000
#define NUM_CALL_BLOCKS_PER_RUN 100
#define NUM_CALLS_PER_RUN (NUM_CALLS_PER_BLOCK * NUM_CALL_BLOCKS_PER_RUN)


int err = 0;

@@ -25,12 +26,14 @@ static unsigned char buf[32] = {
0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f
};

static int threadcount;

void do_newrawkey(size_t num)
{
int i;
EVP_PKEY *pkey;

for (i = 0; i < NUM_CALLS_PER_THREAD; i++) {
for (i = 0; i < NUM_CALLS_PER_RUN / threadcount; i++) {
pkey = EVP_PKEY_new_raw_public_key_ex(NULL, "X25519", NULL, buf,
sizeof(buf));
if (pkey == NULL)
@@ -42,7 +45,6 @@ void do_newrawkey(size_t num)

int main(int argc, char *argv[])
{
int threadcount;
OSSL_TIME duration;
uint64_t us;
double avcalltime;
@@ -80,7 +82,7 @@ int main(int argc, char *argv[])

us = ossl_time2us(duration);

avcalltime = (double)us / (NUM_CALL_BLOCKS_PER_THREAD * threadcount);
avcalltime = (double)us / NUM_CALL_BLOCKS_PER_RUN;

if (terse)
printf("%lf\n", avcalltime);
11 changes: 6 additions & 5 deletions perf/pemread.c
@@ -15,9 +15,10 @@
#include <openssl/crypto.h>
#include "perflib/perflib.h"

#define NUM_CALLS_PER_BLOCK 100
#define NUM_CALL_BLOCKS_PER_THREAD 100
#define NUM_CALLS_PER_THREAD (NUM_CALLS_PER_BLOCK * NUM_CALL_BLOCKS_PER_THREAD)
#define NUM_CALLS_PER_BLOCK 1000
#define NUM_CALL_BLOCKS_PER_RUN 100
#define NUM_CALLS_PER_RUN (NUM_CALLS_PER_BLOCK * NUM_CALL_BLOCKS_PER_RUN)


int err = 0;

@@ -62,7 +63,7 @@ void do_pemread(size_t num)
* Technically this includes the EVP_PKEY_free() in the timing - but I
* think we can live with that
*/
for (i = 0; i < NUM_CALLS_PER_THREAD / threadcount; i++) {
for (i = 0; i < NUM_CALLS_PER_RUN / threadcount; i++) {
key = PEM_read_bio_PrivateKey(pem, NULL, NULL, NULL);
if (key == NULL) {
printf("Failed to create key: %d\n", i);
@@ -116,7 +117,7 @@ int main(int argc, char *argv[])

us = ossl_time2us(duration);

avcalltime = (double)us / NUM_CALL_BLOCKS_PER_THREAD;
avcalltime = (double)us / NUM_CALL_BLOCKS_PER_RUN;

if (terse)
printf("%lf\n", avcalltime);
15 changes: 8 additions & 7 deletions perf/providerdoall.c
@@ -15,9 +15,9 @@
#include <openssl/provider.h>
#include "perflib/perflib.h"

#define NUM_CALLS_PER_BLOCK 100
#define NUM_CALL_BLOCKS_PER_THREAD 100
#define NUM_CALLS_PER_THREAD (NUM_CALLS_PER_BLOCK * NUM_CALL_BLOCKS_PER_THREAD)
#define NUM_CALLS_PER_BLOCK 1000
#define NUM_CALL_BLOCKS_PER_RUN 100
#define NUM_CALLS_PER_RUN (NUM_CALLS_PER_BLOCK * NUM_CALL_BLOCKS_PER_RUN)

static int err = 0;
OSSL_TIME *times;
@@ -30,6 +30,8 @@ static int doit(OSSL_PROVIDER *provider, void *vcount)
return 1;
}

static int threadcount;

static void do_providerdoall(size_t num)
{
int i;
@@ -39,7 +41,7 @@ static void do_providerdoall(size_t num)

start = ossl_time_now();

for (i = 0; i < NUM_CALLS_PER_THREAD; i++) {
for (i = 0; i < NUM_CALLS_PER_RUN / threadcount; i++) {
count = 0;
if (!OSSL_PROVIDER_do_all(NULL, doit, &count) || count != 1) {
err = 1;
@@ -50,12 +52,12 @@ static void do_providerdoall(size_t num)
end = ossl_time_now();

times[num] = ossl_time_divide(ossl_time_subtract(end, start),
NUM_CALL_BLOCKS_PER_THREAD);
NUM_CALL_BLOCKS_PER_RUN);
}

int main(int argc, char *argv[])
{
int threadcount, i;
int i;
OSSL_TIME duration, av;
int terse = 0;
int argnext;
@@ -99,7 +101,6 @@ int main(int argc, char *argv[])
av = times[0];
for (i = 1; i < threadcount; i++)
av = ossl_time_add(av, times[i]);
av = ossl_time_divide(av, threadcount);

if (terse)
printf("%ld\n", ossl_time2us(av));
13 changes: 7 additions & 6 deletions perf/randbytes.c
@@ -14,25 +14,26 @@
#include <openssl/crypto.h>
#include "perflib/perflib.h"

#define NUM_CALLS_PER_BLOCK 100
#define NUM_CALL_BLOCKS_PER_THREAD 100
#define NUM_CALLS_PER_THREAD (NUM_CALLS_PER_BLOCK * NUM_CALL_BLOCKS_PER_THREAD)
#define NUM_CALLS_PER_BLOCK 1000
#define NUM_CALL_BLOCKS_PER_RUN 100
#define NUM_CALLS_PER_RUN (NUM_CALLS_PER_BLOCK * NUM_CALL_BLOCKS_PER_RUN)

int err = 0;

static int threadcount;

void do_randbytes(size_t num)
{
int i;
unsigned char buf[32];

for (i = 0; i < NUM_CALLS_PER_THREAD; i++)
for (i = 0; i < NUM_CALLS_PER_RUN / threadcount; i++)
if (!RAND_bytes(buf, sizeof(buf)))
err = 1;
}

int main(int argc, char *argv[])
{
int threadcount;
OSSL_TIME duration;
uint64_t us;
double avcalltime;
@@ -70,7 +71,7 @@ int main(int argc, char *argv[])

us = ossl_time2us(duration);

avcalltime = (double)us / (NUM_CALL_BLOCKS_PER_THREAD * threadcount);
avcalltime = (double)us / NUM_CALL_BLOCKS_PER_RUN;

if (terse)
printf("%lf\n", avcalltime);
13 changes: 7 additions & 6 deletions perf/rsasign.c
@@ -16,9 +16,9 @@
#include <openssl/crypto.h>
#include "perflib/perflib.h"

#define NUM_CALLS_PER_BLOCK 100
#define NUM_CALL_BLOCKS_PER_THREAD 100
#define NUM_CALLS_PER_THREAD (NUM_CALLS_PER_BLOCK * NUM_CALL_BLOCKS_PER_THREAD)
#define NUM_CALLS_PER_BLOCK 1000
#define NUM_CALL_BLOCKS_PER_RUN 100
#define NUM_CALLS_PER_RUN (NUM_CALLS_PER_BLOCK * NUM_CALL_BLOCKS_PER_RUN)

int err = 0;
EVP_PKEY *rsakey = NULL;
@@ -37,6 +37,8 @@ static const char *rsakeypem =

static const char *tbs = "0123456789abcdefghij"; /* Length of SHA1 digest */

static int threadcount;

void do_rsasign(size_t num)
{
int i;
@@ -45,7 +47,7 @@ void do_rsasign(size_t num)
EVP_PKEY_CTX *ctx = EVP_PKEY_CTX_new(rsakey, NULL);
size_t siglen = sizeof(sig);

for (i = 0; i < NUM_CALLS_PER_THREAD; i++) {
for (i = 0; i < NUM_CALLS_PER_RUN / threadcount; i++) {
if (EVP_PKEY_sign_init(ctx) <= 0
|| EVP_PKEY_sign(ctx, sig, &siglen, tbs, SHA_DIGEST_LENGTH) <= 0) {
err = 1;
@@ -57,7 +59,6 @@ void do_rsasign(size_t num)

int main(int argc, char *argv[])
{
int threadcount;
OSSL_TIME duration;
uint64_t us;
double avcalltime;
@@ -111,7 +112,7 @@ int main(int argc, char *argv[])

us = ossl_time2us(duration);

avcalltime = (double)us / (NUM_CALL_BLOCKS_PER_THREAD * threadcount);
avcalltime = (double)us / NUM_CALL_BLOCKS_PER_RUN;

if (terse)
printf("%lf\n", avcalltime);