[Tests] Benchmarking Framework #1650

random-zebra · 2020-05-27T20:39:31Z

Introduces the benchmarking framework, loosely based on google's micro-benchmarking library (https://github.com/google/benchmark), ported from Bitcoin, up to 0.16.
The benchmark framework is hard-coded to run each benchmark for one wall-clock second,
and then spits out .csv-format timing information to stdout.

Backported PR:

Simple benchmarking framework bitcoin/bitcoin#6733
[Trivial] Ignore bench_bitcoin binary [skip ci] bitcoin/bitcoin#6770
[Trivial] ensure minimal header conventions bitcoin/bitcoin#6892
bench: Add crypto hash benchmarks bitcoin/bitcoin#8039
bench: Added base58 encoding/decoding benchmarks bitcoin/bitcoin#8107
Avoid integer division in the benchmark inner-most loop. bitcoin/bitcoin#8115
bench: Fix subtle counting issue when rescaling iteration count bitcoin/bitcoin#9200
bench: Add support for measuring CPU cycles bitcoin/bitcoin#9202
Refactor: Remove using namespace <xxx> from bench/ & test/ sources bitcoin/bitcoin#9281
Use real number of cores for default -par, ignore virtual cores bitcoin/bitcoin#6361
Use std::thread::hardware_concurrency, instead of Boost, to determine available cores bitcoin/bitcoin#10271
Basic CCheckQueue Benchmarks bitcoin/bitcoin#9498
bench: Fix initialization order in registration bitcoin/bitcoin#9712
bench: Assert that division by zero is unreachable bitcoin/bitcoin#9547
Prevector Quick Destruct bitcoin/bitcoin#9505 (benchmark only. Rest was in [Core] Prevector #1557)
FastRandomContext improvements and switch to ChaCha20 bitcoin/bitcoin#9792 (benchmark only. Rest was in [Crypto] Use stronger rand for key generation #643)
[Tests] Prevent warning: variable 'x' is uninitialized bitcoin/bitcoin#10272
Replace boost::function with std::function (C++11) bitcoin/bitcoin#10395 (base58 only)
[bench] Restore format state of cout after printing with std::fixed/setprecision bitcoin/bitcoin#10963
Fix estimatesmartfee rounding display issue bitcoin/bitcoin#11303 (first commit)
bench: use std::chrono rather than gettimeofday bitcoin/bitcoin#11562
Require a steady clock for bench with at least micro precision bitcoin/bitcoin#11646
tests: Initialize recently introduced non-static class member lastCycles to zero in constructor bitcoin/bitcoin#11654

Current output of src/bench/bench_pivx:

#Benchmark,count,min(ns),max(ns),average(ns),min_cycles,max_cycles,average_cycles
Base58CheckEncode,131072,7697,8065,7785,20015,20971,20242
Base58Decode,294912,3305,3537,3454,8595,9198,8981
Base58Encode,180224,5498,6020,5767,14297,15652,14994
CCheckQueueSpeed,320,3159960,3535173,3352787,8216030,9191602,8717388
CCheckQueueSpeedPrevectorJob,96,9184484,11410840,10823070,23880046,29668680,28140445
FastRandom_1bit,320,3143690,4838162,3199156,8173726,12579373,8317941
FastRandom_32bit,60,17097612,17923669,17367440,44454504,46602306,45156079
PrevectorClear,3072,334741,366618,346731,870340,953224,901516
PrevectorDestructor,2816,344233,368912,357281,895022,959187,928948
RIPEMD160,288,3404503,3693917,3577774,8851850,9604334,9302363
SHA1,384,2718128,2891558,2802513,7067238,7518184,7286652
SHA256,176,6133760,6580005,6239866,15948035,17108376,16223916
SHA512,240,4251468,4358706,4313463,11054006,11332826,11215186
Sleep100ms,10,100221470,100302411,100239073,260580075,260790726,260625870

NOTE: Not all the tests have been pulled yet (as we might not have the code being tested, or it would require rewrites to work with our different code base), but the framework is updated to December 2017.

- backports bitcoin/bitcoin@535ed92 Benchmarking framework, loosely based on google's micro-benchmarking library (https://github.com/google/benchmark) Wny not use the Google Benchmark framework? Because adding Even More Dependencies isn't worth it. If we get a dozen or three benchmarks and need nanosecond-accurate timings of threaded code then switching to the full-blown Google Benchmark library should be considered. The benchmark framework is hard-coded to run each benchmark for one wall-clock second, and then spits out .csv-format timing information to stdout. It is left as an exercise for later (or maybe never) to add command-line arguments to specify which benchmark(s) to run, how long to run them for, how to format results, etc etc etc. Again, see the Google Benchmark framework for where that might end up. See src/bench/MilliSleep.cpp for a sanity-test benchmark that just benchmarks 'sleep 100 milliseconds.' To compile and run benchmarks: cd src; make bench Sample output: Benchmark,count,min,max,average Sleep100ms,10,0.101854,0.105059,0.103881

- backports bitcoin/bitcoin@7072c54 Avoid calling gettimeofday every time through the benchmarking loop, by keeping track of how long each loop takes and doubling the number of iterations done between time checks when they take less than 1/16'th of the total elapsed time.

backports bitcoin/bitcoin@214de7e - ensure header namespaces and end comments are correct - add missing header end comments - ensure minimal formatting (add newlines etc.)

- backports bitcoin/bitcoin@32114dd Add benchmarks for the cryptographic hash algorithms: - RIPEMD160 - SHA1 - SHA256 - SHA512 Continues work on bitcoin#7883.

- backports bitcoin/bitcoin@5fac1f3

- backports bitcoin/bitcoin@63ff57d Previously the benchmark code used an integer division (%) with a non-constant in the inner-loop. This is quite slow on many processors, especially ones like ARM that lack a hardware divide. Even on fairly recent x86_64 like haswell an integer division can take something like 100 cycles-- making it comparable to the runtime of siphash. This change avoids the division by using bitmasking instead. This was especially easy since the count was only increased by doubling. This change also restarts the timing when the execution time was very low this avoids mintimes of zero in cases where one execution ends up below the timer resolution. It also reduces the impact of the overhead on the final result. The formatting of the prints is changed to not use scientific notation make it more machine readable (in particular, gnuplot croaks on the non-fixedpoint, and it doesn't sort correctly). This also hoists out all the floating point divisions out of the semi-hot path because it was easy to do so. It might be prudent to break out the critical test into a macro just to guarantee that it gets inlined. It might also make sense to just save out the intermediate counts and times and get the floating point completely out of the timing loop (because e.g. on hardware without a fast hardware FPU like some ARM it will still be slow enough to distort the results). I haven't done either of these in this commit.

- backports bitcoin/bitcoin@e0a9cb2 Make sure that the count is a zero modulo the new mask before scaling, otherwise the next time until a measure triggers will take only 1/2 as long as accounted for. This caused the 'min time' to be potentially off by as much as 100%.

- backports bitcoin/bitcoin@3532818 This adds cycle min/max/avg to the statistics. Supported on x86 and x86_64 (natively through rdtsc), as well as Linux (perf syscall).

from bitcoin/bitcoin@73f4119

- backports bitcoin/bitcoin@4716267 To determine the default for `-par`, the number of script verification threads, use [boost:thread:physical_concurrency()](http://www.boost.org/doc/libs/1_58_0/doc/html/thread/thread_management.html#thread.thread_management.thread.physical_concurrency) which counts only physical cores, not virtual cores. Virtual cores are roughly a set of cached registers to avoid context switches while threading, they cannot actually perform work, so spawning a verification thread for them could even reduce efficiency and will put undue load on the system. Should fix issue dashpay#6358, as well as some other reported system overload issues, especially on Intel processors. The function was only introduced in boost 1.56, so provide a utility function `GetNumCores` to fall back for older Boost versions.

available cores - backports bitcoin/bitcoin@937bf43

- backports bitcoin/bitcoin@9f03110 - backports bitcoin/bitcoin@aad4cb5

- backports bitcoin/bitcoin@29c5328 The initialization order of global data structures in different implementation units is undefined. Making use of this is essentially gambling on what the linker does, the so-called [Static initialization order fiasco](https://isocpp.org/wiki/faq/ctors#static-init-order). In this case it apparently worked on Linux but failed on OpenBSD and FreeBSD. To create it on first use, make the registration structure local to a function. Fixes bitcoin#8910.

unreachable - backports bitcoin/bitcoin@db07f91

- backports bitcoin/bitcoin@aaa02e7

- backports bitcoin/bitcoin@663fbae

- backports bitcoin/bitcoin@5ec8836

- backports bitcoin/bitcoin@1b936f5

- backports bitcoin/bitcoin@6835cb0 Avoid static analyzer warnings regarding "Function call argument is a pointer to uninitialized value" in cases where we are intentionally using such arguments. This is achieved by using ... `f(b.begin(), b.end())` (`std::array<char, N>`) ... instead of ... `f(b, b + N)` (`char b[N]`) Rationale: * Reduce false positives by guiding static analyzers regarding our intentions. Before this commit: ``` $ clang-tidy-3.5 -checks=* src/bench/base58.cpp bench/base58.cpp:23:9: warning: Function call argument is a pointer to uninitialized value [clang-analyzer-core.CallAndMessage] EncodeBase58(b, b + 32); ^ $ clang-tidy-3.5 -checks=* src/bench/verify_script.cpp bench/verify_script.cpp:59:5: warning: Function call argument is a pointer to uninitialized value [clang-analyzer-core.CallAndMessage] key.Set(vchKey, vchKey + 32, false); ^ $ ``` After this commit: ``` $ clang-tidy-3.5 -checks=* src/bench/base58.cpp $ clang-tidy-3.5 -checks=* src/bench/verify_script.cpp $ ```

std::fixed/setprecision - backports bitcoin/bitcoin@fd05132

- backports bitcoin/bitcoin@0b1b914 We were saving a div by caching the inverse as a float, but this ended up requiring a int -> float -> int conversion, which takes almost as much time as the difference between float mul and div. There are lots of other more pressing issues with the bench framework which probably require simply removing the adaptive iteration count stuff anyway.

- backports bitcoin/bitcoin@c515d26 std::chrono removes portability issues. Rather than storing doubles, store the untouched time_points. Then convert to nanoseconds for display. This allows for maximum precision, while keeping results comparable between differing hardware/operating systems. Also, display full nanosecond counts rather than sub-second floats.

- backports bitcoin/bitcoin@24a0bdd

- backports bitcoin/bitcoin@620bae3

zero in constructor - backports bitcoin/bitcoin@069215e lastCycles was introduced in 3532818 which was merged into master yesterday. Also initialize beginCycles to zero for consistency and completeness.

Fuzzbawls

Very good base for this.

ACK 3f3edde

furszy

all good 👌 , ACK 5a1be90 and merging

Document the changes introduced in: - PIVX-Project#1549: Nuke zPIV from the GUI - PIVX-Project#1586: Minimum value for stake split threshold - PIVX-Project#1633: Bitcoin 0.12-0.14 serialization improvements - PIVX-Project#1645: Implement accurate memory accounting for mempool - PIVX-Project#1647: MemPool package tracking and limits - PIVX-Project#1650: Benchmarking Framework - PIVX-Project#1688: TopBar navigation (sync/peers)

4819ee7 [Doc] Add/Update some release notes for 4.2 (random-zebra) Pull request description: Document the changes introduced in: - #1549: Nuke zPIV from the GUI - #1586: Minimum value for stake split threshold - #1633: Bitcoin 0.12-0.14 serialization improvements - #1645: Implement accurate memory accounting for mempool - #1647: MemPool package tracking and limits - #1650: Benchmarking Framework - #1688: TopBar navigation (sync/peers) ACKs for top commit: furszy: utACK 4819ee7 Fuzzbawls: ACK 4819ee7 Tree-SHA512: 62ad949ea26a2f877ef0b40ec86616cc8105f81e1fcd380c8162cd93af04a46f1093f878c0668408654f198a0059b240798b83af3bf1d5e6c1c1d8611276a325

[Doc] Add/Update some release notes for 4.2 Document the changes introduced in: - PIVX-Project#1549: Nuke zPIV from the GUI - PIVX-Project#1586: Minimum value for stake split threshold - PIVX-Project#1633: Bitcoin 0.12-0.14 serialization improvements - PIVX-Project#1645: Implement accurate memory accounting for mempool - PIVX-Project#1647: MemPool package tracking and limits - PIVX-Project#1650: Benchmarking Framework - PIVX-Project#1688: TopBar navigation (sync/peers) Update settingsfaqwidget.ui Fix Linter

[Doc] Add/Update some release notes for 4.2 Document the changes introduced in: - PIVX-Project#1549: Nuke zPIV from the GUI - PIVX-Project#1586: Minimum value for stake split threshold - PIVX-Project#1633: Bitcoin 0.12-0.14 serialization improvements - PIVX-Project#1645: Implement accurate memory accounting for mempool - PIVX-Project#1647: MemPool package tracking and limits - PIVX-Project#1650: Benchmarking Framework - PIVX-Project#1688: TopBar navigation (sync/peers) Update settingsfaqwidget.ui Fix Linter [RPC] Register calls where they are defined

random-zebra added Upstream Tests Needs Release Notes Placeholder tag for anything needing mention in the "Notable Changes" section of release notes labels May 27, 2020

random-zebra added this to the 5.0.0 milestone May 27, 2020

random-zebra self-assigned this May 27, 2020

random-zebra force-pushed the 202005_microbench branch from 6ba1f51 to 77cbd62 Compare May 27, 2020 23:32

random-zebra added 24 commits May 28, 2020 02:12

[Trivial] ensure minimal header conventions

a5179b6

backports bitcoin/bitcoin@214de7e - ensure header namespaces and end comments are correct - add missing header end comments - ensure minimal formatting (add newlines etc.)

bench: Add crypto hash benchmarks

4442118

- backports bitcoin/bitcoin@32114dd Add benchmarks for the cryptographic hash algorithms: - RIPEMD160 - SHA1 - SHA256 - SHA512 Continues work on bitcoin#7883.

bench: Added base58 encoding/decoding benchmarks

3fa4f27

- backports bitcoin/bitcoin@5fac1f3

bench: Add support for measuring CPU cycles

5c07f67

- backports bitcoin/bitcoin@3532818 This adds cycle min/max/avg to the statistics. Supported on x86 and x86_64 (natively through rdtsc), as well as Linux (perf syscall).

[Refactoring] Removed using namespace <xxx> from bench/ sources

9162a56

from bitcoin/bitcoin@73f4119

Use std:thread:hardware_concurrency, instead of Boost, to determine

51aedbc

available cores - backports bitcoin/bitcoin@937bf43

Basic CCheckQueue Benchmarks

151c25f

- backports bitcoin/bitcoin@9f03110 - backports bitcoin/bitcoin@aad4cb5

Assert that what might look like a possible division by zero is actually

e1527ba

unreachable - backports bitcoin/bitcoin@db07f91

Add prevector destructor benchmark

d9fa0c6

- backports bitcoin/bitcoin@aaa02e7

FastRandom benchmark

7f0d4b3

- backports bitcoin/bitcoin@663fbae

Prevent warning: variable 'x' is uninitialized

98c0857

- backports bitcoin/bitcoin@5ec8836

Replace boost::function with std::function (C++11)

e85f224

- backports bitcoin/bitcoin@1b936f5

Restore default format state of cout after printing with

9e9bc22

std::fixed/setprecision - backports bitcoin/bitcoin@fd05132

bench: prefer a steady clock if the resolution is no worse

84069ce

- backports bitcoin/bitcoin@24a0bdd

Require a steady clock for bench with at least micro precision

ec60671

- backports bitcoin/bitcoin@620bae3

random-zebra added 2 commits May 28, 2020 02:13

Initialize recently introduced non-static class member lastCycles to

1bd89ac

zero in constructor - backports bitcoin/bitcoin@069215e lastCycles was introduced in 3532818 which was merged into master yesterday. Also initialize beginCycles to zero for consistency and completeness.

[Travis] Disable benchmark framework for trusty test

5a1be90

random-zebra force-pushed the 202005_microbench branch from 77cbd62 to 5a1be90 Compare May 28, 2020 00:14

[Bench] Use PIVX address in Base58Decode test

3f3edde

Fuzzbawls approved these changes Jun 2, 2020

View reviewed changes

random-zebra requested a review from furszy June 8, 2020 13:41

furszy approved these changes Jun 8, 2020

View reviewed changes

furszy merged commit 4ed15cc into PIVX-Project:master Jun 8, 2020

random-zebra mentioned this pull request Jun 24, 2020

[Doc] Add/Update some release notes for 4.2 #1705

Merged

random-zebra removed the Needs Release Notes Placeholder tag for anything needing mention in the "Notable Changes" section of release notes label Jun 29, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Tests] Benchmarking Framework #1650

[Tests] Benchmarking Framework #1650

random-zebra commented May 27, 2020 •

edited

Loading

Fuzzbawls left a comment

furszy left a comment

[Tests] Benchmarking Framework #1650

[Tests] Benchmarking Framework #1650

Conversation

random-zebra commented May 27, 2020 • edited Loading

Fuzzbawls left a comment

Choose a reason for hiding this comment

furszy left a comment

Choose a reason for hiding this comment

random-zebra commented May 27, 2020 •

edited

Loading