New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Run benchmarks at microsecond level and record advanced stats #6868
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good.
In a first round of review, I found a few nits to pick and a couple bugs to fix.
The output without the new features activated is indeed as before, accepted by multi-test benchmark-wolfcrypt-intelasm-all
(which parses the output in detail).
The statistics look to be getting calculated correctly but in the next review round I can do a formal confirmation of the mathematical correctness if you'd like.
The PR rebased on current master
passed multi-test super-quick-check
, modulo one typo found by clang-tidy
.
wolfcrypt/benchmark/benchmark.c
Outdated
* BENCH_MICROSECOND | ||
* | ||
* display mean, max, min and sd of operation durations: | ||
* ADVANCED_STATS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we name this something descriptive? How about MULTI_VALUE_STATISTICS
?
wolfcrypt/benchmark/benchmark.c
Outdated
if (bench_plain == NULL) | ||
goto exit; | ||
|
||
if ((size_t)XFREAD(bench_plain, 1, bench_buf_size, file) != (size_t)bench_buf_size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line length
wolfcrypt/benchmark/benchmark.c
Outdated
if (bench_cipher == NULL) | ||
goto exit; | ||
|
||
if ((size_t)XFREAD(bench_cipher, 1, bench_buf_size, file) != (size_t)bench_buf_size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line length
wolfcrypt/benchmark/benchmark.c
Outdated
#ifndef MAX_SAMPLE_RUNS | ||
#define MAX_SAMPLE_RUNS 1000 | ||
#endif | ||
#define NEW_LINE "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we name this something descriptive? If the NEW_LINE
macro is being used as a separator, then it should be called something like STATS_CLAUSE_SEPARATOR
. Mainly, don't give macros names that are intrinsically contradictory, since that inevitably leads to confusion.
wolfcrypt/benchmark/benchmark.c
Outdated
#define SHOW_INTEL_CYCLES(b, n, s) b[XSTRLEN(b)] = '\n' | ||
#define SHOW_INTEL_CYCLES_CSV(b, n, s) b[XSTRLEN(b)] = '\n' | ||
#ifdef ADVANCED_STATS | ||
#define SHOW_INTEL_CYCLES(b, n, s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically these statement-type macros should be defined to WC_DO_NOTHING
to make them no-ops.
wolfcrypt/benchmark/benchmark.c
Outdated
milliEach = each * 1000; /* milliseconds */ | ||
|
||
#endif | ||
|
||
#ifdef BENCH_MICROSECOND |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can move this up to the declaration: turn digits
into a const int
, and conditionalize the initializer on BENCH_MICROSECOND
. Not important, but it's neater and more efficient.
wolfcrypt/benchmark/benchmark.c
Outdated
"-print ベンチマーク統計の要約を表示する\n", | ||
/* TODO: translate below */ | ||
"-hash_input <file> Input data to use for hash benchmarking\n", | ||
"-cipher_input <file> Input data to use for cipher benchmarking\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing comma at the end of the line. And nope, I didn't spot that with my bare eyes -- thank clang-tidy
bugprone-suspicious-missing-comma
for that catch.
wolfcrypt/benchmark/benchmark.c
Outdated
#endif | ||
#ifdef ADVANCED_STATS | ||
printf("max duration,min duration,mean duration,sd,\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is missing a comma before "max duration" when GENERATE_MACHINE_PARSEABLE_REPORT
. The output looks like this:
"asym",Algorithm,key size,operation,avg ms,ops/μsec,ops,μsecs,cycles,cycles/opmax duration,min duration,mean duration,sd,
Thanks for that first round of review @douzzer. Should be ready for a second round. Lmk if I missed anything |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes for the previous round all look good.
Now on to round 2! Limiting my review to just the code orthogonalization.
wolfcrypt/benchmark/benchmark.c
Outdated
#ifdef MULTI_VALUE_STATISTICS | ||
if (runs == 0) { | ||
delta = current_time(0) - start; | ||
min = delta; | ||
max = delta; | ||
} | ||
else { | ||
delta = current_time(0) - prev; | ||
if (max < delta) | ||
max = delta; | ||
else if (min > delta) | ||
min = delta; | ||
} | ||
|
||
if (runs < MAX_SAMPLE_RUNS) { | ||
deltas[runs] = delta; | ||
sum += delta; | ||
runs++; | ||
} | ||
|
||
prev = current_time(0); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This lengthy code pattern repeats something like 125 times -- you should use a macro to orthogonalize the implementation.
The declarations of the working variables should also be done with a macro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool new capability!
Jenkins Retest this please |
Jenkins Retest this please |
retest this please. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, tests good, one last opportunistic nit pick from me and then I'll call it good!
wolfcrypt/benchmark/benchmark.c
Outdated
|
||
#ifdef HAVE_AES_DECRYPT | ||
XMEMSET(dec, 0, sizeof(dec)); | ||
|
||
#ifdef MULTI_VALUE_STATISTICS | ||
prev = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reset code pattern occurs over and over, and tracks the declarations from DECLARE_MULTI_VALUE_STATS_VARS()
, so it should also be done with a macro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
Jenkins Retest this please |
1 similar comment
Jenkins Retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sanitizer is reporting overruns:
$ ./configure --quiet --disable-jobserver --enable-fips=disabled --enable-all --enable-intelasm 'CFLAGS=-ggdb -fno-omit-frame-pointer -fsanitize=address,pointer-subtract,leak,undefined,float-cast-overflow,float-divide-by-zero,bounds-strict -fsanitize-recover=all --param=max-vartrack-size=128000000 -O2 -DBENCH_MICROSECOND -DMULTI_VALUE_STATISTICS' --disable-optflags
$ wolfcrypt/benchmark/benchmark -hash_input /etc/hosts -cipher_input /etc/services -min_runs 10
------------------------------------------------------------------------------
wolfSSL version 5.6.4
------------------------------------------------------------------------------
Math: Multi-Precision: Wolf(SP) word-size=64 bits=4096 sp_int.c
wolfCrypt Benchmark (block bytes 1048576, min 1.0 sec each)
=================================================================
==21782==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x624000003e20 at pc 0x7f18bfb75a2f bp 0x7fff88218ea0 sp 0x7fff88218e90
WRITE of size 32 at 0x624000003e20 thread T0
#0 0x7f18bfb75a2e in Hash_gen wolfcrypt/src/random.c:543
#1 0x7f18bfb75a2e in Hash_DRBG_Generate wolfcrypt/src/random.c:623
#2 0x7f18bfb79047 in wc_RNG_GenerateBlock wolfcrypt/src/random.c:1831
#3 0x556e41bd6011 in bench_rng wolfcrypt/benchmark/benchmark.c:3721
#4 0x556e41c0db31 in benchmarks_do wolfcrypt/benchmark/benchmark.c:2777
#5 0x556e41c0effc in benchmark_test wolfcrypt/benchmark/benchmark.c:3681
#6 0x556e41c0effc in benchmark_test wolfcrypt/benchmark/benchmark.c:3612
#7 0x556e41c0effc in wolfcrypt_benchmark_main wolfcrypt/benchmark/benchmark.c:12652
#8 0x7f18be84bc05 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#9 0x7f18be84bcc6 in __libc_start_main_impl ../csu/libc-start.c:360
#10 0x556e41bc53b0 in _start ../sysdeps/x86_64/start.S:115
0x624000003e2e is located 0 bytes to the right of 7470-byte region [0x624000002100,0x624000003e2e)
allocated by thread T0 here:
#0 0x7f18c12bd48f in __interceptor_malloc /tmp/portage/sys-devel/gcc-12.3.1_p20230825/work/gcc-12-20230825/libsanitizer/asan/asan_malloc_linux.cpp:69
#1 0x556e41c0d89a in benchmarks_do wolfcrypt/benchmark/benchmark.c:2697
#2 0x556e41c0effc in benchmark_test wolfcrypt/benchmark/benchmark.c:3681
#3 0x556e41c0effc in benchmark_test wolfcrypt/benchmark/benchmark.c:3612
#4 0x556e41c0effc in wolfcrypt_benchmark_main wolfcrypt/benchmark/benchmark.c:12652
#5 0x7f18be84bc05 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#6 0x7f18be84bcc6 in __libc_start_main_impl ../csu/libc-start.c:360
#7 0x556e41bc53b0 in _start ../sysdeps/x86_64/start.S:115
SUMMARY: AddressSanitizer: heap-buffer-overflow wolfcrypt/src/random.c:543 in Hash_gen
Shadow bytes around the buggy address:
0x0c487fff8770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c487fff8780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c487fff8790: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c487fff87a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c487fff87b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c487fff87c0: 00 00 00 00[00]06 fa fa fa fa fa fa fa fa fa fa
0x0c487fff87d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c487fff87e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c487fff87f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c487fff8800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c487fff8810: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==21782==ABORTING
I suspect it's because the setup code for hash_input
in benchmarks_do()
isn't properly rounding the length up to the nearest 16-byte boundary (cursory read), and maybe also because it isn't properly setting the new value for bench_size
.
Jenkins Retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
caught on final check-over:
configure... real 0m16.660s user 0m8.609s sys 0m9.381s
build...wolfcrypt/benchmark/benchmark.c: In function ‘void bench_aesecb_internal(int, const byte*, word32, const char*, const char*)’:
9bece34ad0 (<lealem47@github.com> 2023-11-17 13:28:30 -0700 4371) const int benchSz = (const int)bench_size;
wolfcrypt/benchmark/benchmark.c:4371:25: error: type qualifiers ignored on cast result type [-Werror=ignored-qualifiers]
4371 | const int benchSz = (const int)bench_size;
| ^~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
make[2]: *** [Makefile:6311: wolfcrypt/benchmark/benchmark.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile:7980: all-recursive] Error 1
make: *** [Makefile:4708: all] Error 2
real 0m14.307s user 1m19.023s sys 0m6.010s
all-g++ fail_build
failed config: '--enable-all' '--enable-testcert' '--enable-sp-math-all' 'CC=g++' 'CPPFLAGS=-DNO_WOLFSSL_CIPHER_SUITE_TEST -DWOLFSSL_OLD_PRIME_CHECK'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good!
Description
First two commits: Allow benchmarks to be run on microsecond level. Not sure if using the
μ
character in the code is controversial or not.Next two commits: Record mean, max, min and sd of operation durations. Allow user input for hash and cipher plain input. Allow users to specify a minimum number of operation runs.
Testing
Tested on linux, windows, android, IOS
Checklist