Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple benchmarking framework #6733

Merged
merged 2 commits into from
Oct 6, 2015
Merged

Simple benchmarking framework #6733

merged 2 commits into from
Oct 6, 2015

Conversation

gavinandresen
Copy link
Contributor

Benchmarking framework, loosely based on google's micro-benchmarking
library (https://github.com/google/benchmark)

Wny not use the Google Benchmark framework? Because adding Even More Dependencies
isn't worth it. If we get a dozen or three benchmarks and need nanosecond-accurate
timings of threaded code then switching to the full-blown Google Benchmark library
should be considered.

The benchmark framework is hard-coded to run each benchmark for one wall-clock second,
and then spits out .csv-format timing information to stdout. It is left as an
exercise for later (or maybe never) to add command-line arguments to specify which
benchmark(s) to run, how long to run them for, how to format results, etc etc etc.
Again, see the Google Benchmark framework for where that might end up.

See src/bench/Examples.cpp for a sanity-test benchmark that just benchmarks
'sleep 100 milliseconds' and a benchmark for math.h's sin() function.

To compile and run benchmarks:

  configure --enable-bench  ...other configure options
  cd src; make bench

Sample output:

Benchmark,count,min,max,average
Sleep100ms,10,0.10048,0.104789,0.103251
Trig,37748735,0,4.76837e-07,2.71684e-08

static void CODE_TO_TIME(benchmark::State& state)
{
... do any setup needed...
while (state.KeepRunning()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works well as long as 'stuff you want to time' does not run too quickly - otherwise the state.KeepRunning(), which does a gettimeofday syscall, with the implied latency of that, will dominate.
(an alternative to calling the time function in the benchmark loop that is commonly used, is to first time a fixed # of runs of the loop, then from the result compute a number of iterations so that it runs for approximately the allotted time)

@laanwj laanwj added the Tests label Sep 28, 2015
@laanwj
Copy link
Member

laanwj commented Sep 28, 2015

Concept ACK

if ENABLE_WALLET
bench_bench_bitcoin_LDADD += $(LIBBITCOIN_WALLET)
endif
bench_bench_bitcoin_LDADD += $(LIBBITCOIN_CONSENSUS) $(BDB_LIBS) $(SSL_LIBS) $(CRYPTO_LIBS) $(MINIUPNPC_LIBS) $(CURL_LIBS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm $(CURL_LIBS)? This seems to have be sneaked in from XT.

@jonasschnelli
Copy link
Contributor

Concept ACK

@gavinandresen
Copy link
Contributor Author

Removed the $(CURL_LIBS) noticed by @jonasschnelli.

And added the new libevent and libzmq dependencies.

I also decided to make it disabled by default, so you have to configure --enable-bench, so it doesn't slow down compile times for everybody.

@laanwj : RE: gettimeofday overhead and timing things that are very fast: timing how fast gettimeofday() is and then warning the developer if their benchmarking code runs into that overhead (maybe suggest that they loop 1,000 times inside the KeepRunning loop) would be another way of preventing that problem. Feel free to improve, I'm planning on using this to benchmark things that take microseconds to run, not nanoseconds.

@jgarzik
Copy link
Contributor

jgarzik commented Sep 29, 2015

Disagree on default-disabled.

Principle: Code that is not built by default bitrots rapidly, is not tested as much, receives less maintenance in general by random developers updating the code.

@gavinandresen
Copy link
Contributor Author

@jgarzik : I KNEW you were going to say that...

Would a Travis configuration that runs benchmarks be good enough to prevent code rot? Compiling and linking Yet Another Binary that approximately nobody will run (unless you're actively working on optimizing something) tickles my "features shouldn't cost anything unless you're using them" sensibility.

@paveljanik
Copy link
Contributor

This looks a bit strange in the benchmarking tool:

SelectParams(CBaseChainParams::MAIN);

@@ -0,0 +1,45 @@
bin_PROGRAMS += bench/bench_bitcoin
BENCH_SRCDIR = bench
BENCH_BINARY=bench/bench_bitcoin$(EXEEXT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spaces around =?

@paveljanik
Copy link
Contributor

I think it should stay disabled by default as it is now, because benchmarking will be done on purpose by developers only.

Concept ACK. I like it.

This is a way to benchmark parts of code. Now we should identify parts that need some love and prepare benchmarks for them specifically. With this framework, people can report speed fixes with proper benchmark code which can be tested before and after applying the fix.

@laanwj @gavinandresen I think that both ways to benchmark are useful and maybe we should add the second example benchmark there to illustrate.

@sipa
Copy link
Member

sipa commented Sep 29, 2015

Compile by default but not run, seems fine to me.

@gavinandresen
Copy link
Contributor Author

Fixed @paveljanik 's spaces nits, and removed the unneeded SelectParams inherited from the unit test code.

Data on the "compile by default or not" decision:

It takes 8 seconds on my machine to compile and link the benchmarking code, single-processor, ccache cleared.
5 seconds doing parallel make.
Under 2 seconds with a 'hot' ccache.

@gavinandresen
Copy link
Contributor Author

Damn you @laanwj, you inspired me to spend another hour implementing support for really fast benchmarks....

Picked @paveljanik 's CreateNewBlock nit (good catch, I copied this from a CreateNewBlock benchmark I haven't finished). I decided to rename MilliSleep.cpp to Examples.cpp, and added a "see how fast sin() runs" benchmark to test the really-fast-benchmark support (works nicely).

And switched back to compile-by-default, which seems to be the consensus.

I'm really and truly done tweaking this now, assuming Travis is happy.

@btcdrak
Copy link
Contributor

btcdrak commented Sep 29, 2015

Concept ACK

@jgarzik
Copy link
Contributor

jgarzik commented Sep 29, 2015

Yes, to be specific, I meant "compile by default but not run"

@dcousens
Copy link
Contributor

concept ACK

@0xC2
Copy link

0xC2 commented Sep 30, 2015

Tested.


#include "bench.h"

#include "chainparams.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do not need chainparams.h now.

@paveljanik
Copy link
Contributor

ACK

Benchmarking framework, loosely based on google's micro-benchmarking
library (https://github.com/google/benchmark)

Wny not use the Google Benchmark framework? Because adding Even More Dependencies
isn't worth it. If we get a dozen or three benchmarks and need nanosecond-accurate
timings of threaded code then switching to the full-blown Google Benchmark library
should be considered.

The benchmark framework is hard-coded to run each benchmark for one wall-clock second,
and then spits out .csv-format timing information to stdout. It is left as an
exercise for later (or maybe never) to add command-line arguments to specify which
benchmark(s) to run, how long to run them for, how to format results, etc etc etc.
Again, see the Google Benchmark framework for where that might end up.

See src/bench/MilliSleep.cpp for a sanity-test benchmark that just benchmarks
'sleep 100 milliseconds.'

To compile and run benchmarks:
  cd src; make bench

Sample output:

Benchmark,count,min,max,average
Sleep100ms,10,0.101854,0.105059,0.103881
Avoid calling gettimeofday every time through the benchmarking loop, by keeping
track of how long each loop takes and doubling the number of iterations done
between time checks when they take less than 1/16'th of the total elapsed time.
@@ -91,6 +91,11 @@ AC_ARG_ENABLE(tests,
[use_tests=$enableval],
[use_tests=yes])

AC_ARG_ENABLE(bench,
AS_HELP_STRING([--enable-bench],[compile benchmarks (default is yes)]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If enabled by default, please make it --disable-bench (see #6739).

@laanwj
Copy link
Member

laanwj commented Oct 6, 2015

ACK

@laanwj laanwj merged commit 7072c54 into bitcoin:master Oct 6, 2015
laanwj added a commit that referenced this pull request Oct 6, 2015
7072c54 Support very-fast-running benchmarks (Gavin Andresen)
535ed92 Simple benchmarking framework (Gavin Andresen)
@laanwj laanwj mentioned this pull request Apr 15, 2016
8 tasks
zkbot added a commit to zcash/zcash that referenced this pull request Jan 24, 2020
Micro-benchmarking framework part 1

Cherry-picked from the following upstream PRs:

- bitcoin/bitcoin#6733
- bitcoin/bitcoin#6770
- bitcoin/bitcoin#6892
  - Excluding changes to `src/policy/policy.h` which we don't have yet.
- bitcoin/bitcoin#7934
  - Just the benchmark, not the performance improvements.
- bitcoin/bitcoin#8039
- bitcoin/bitcoin#8107
- bitcoin/bitcoin#8115
- bitcoin/bitcoin#8914
  - Required resolving several merge conflicts in code that had been refactored upstream. The changes were simple enough that I decided it was okay to impose merge conflicts on pulling in those refactors later.
- bitcoin/bitcoin#9200
- bitcoin/bitcoin#9202
  - Adds support for measuring CPU cycles, which is later removed in an upstream PR after the refactor. I am including it to reduce future merge conflicts.
- bitcoin/bitcoin#9281
  - Only changes to `src/bench/bench.cpp`
- bitcoin/bitcoin#9498
- bitcoin/bitcoin#9712
- bitcoin/bitcoin#9547
- bitcoin/bitcoin#9505
  - Just the benchmark, not the performance improvements.
- bitcoin/bitcoin#9792
  - Just the benchmark, not the performance improvements.
- bitcoin/bitcoin#10272
- bitcoin/bitcoin#10395
  - Only changes to `src/bench/`
- bitcoin/bitcoin#10735
  - Only changes to `src/bench/base58.cpp`
- bitcoin/bitcoin#10963
- bitcoin/bitcoin#11303
  - Only the benchmark backend change.
- bitcoin/bitcoin#11562
- bitcoin/bitcoin#11646
- bitcoin/bitcoin#11654

This pulls in all changes to the micro-benchmark framework prior to December 2017, when it was rewritten. The rewrite depends on other upstream PRs we have not pulled in yet.

This does not pull in all benchmarks prior to December 2017. It leaves out benchmarks that either test code we do not have yet (except for the `FastRandomContext` refactor, which I decided to pull in), or would require rewrites to work with our changes to the codebase.
furszy added a commit to PIVX-Project/PIVX that referenced this pull request Jun 8, 2020
3f3edde [Bench] Use PIVX address in Base58Decode test (random-zebra)
5a1be90 [Travis] Disable benchmark framework for trusty test (random-zebra)
1bd89ac Initialize recently introduced non-static class member lastCycles to zero in constructor (random-zebra)
ec60671 Require a steady clock for bench with at least micro precision (random-zebra)
84069ce bench: prefer a steady clock if the resolution is no worse (random-zebra)
38367b1 bench: switch to std::chrono for time measurements (random-zebra)
a24633a Remove countMaskInv caching in bench framework (random-zebra)
9e9bc22 Restore default format state of cout after printing with std::fixed/setprecision (random-zebra)
3dd559d Avoid static analyzer warnings regarding uninitialized arguments (random-zebra)
e85f224 Replace boost::function with std::function (C++11) (random-zebra)
98c0857 Prevent warning: variable 'x' is uninitialized (random-zebra)
7f0d4b3 FastRandom benchmark (random-zebra)
d9fa0c6 Add prevector destructor benchmark (random-zebra)
e1527ba Assert that what might look like a possible division by zero is actually unreachable (random-zebra)
e94cf15 bench: Fix initialization order in registration (random-zebra)
151c25f Basic CCheckQueue Benchmarks (random-zebra)
51aedbc Use std:thread:hardware_concurrency, instead of Boost, to determine available cores (random-zebra)
d447613 Use real number of cores for default -par, ignore virtual cores (random-zebra)
9162a56 [Refactoring] Removed using namespace <xxx> from bench/ sources (random-zebra)
5c07f67 bench: Add support for measuring CPU cycles (random-zebra)
41ce1ed bench: Fix subtle counting issue when rescaling iteration count (random-zebra)
68ea794 Avoid integer division in the benchmark inner-most loop. (random-zebra)
3fa4f27 bench: Added base58 encoding/decoding benchmarks (random-zebra)
4442118 bench: Add crypto hash benchmarks (random-zebra)
a5179b6 [Trivial] ensure minimal header conventions (random-zebra)
8607d6b Support very-fast-running benchmarks (random-zebra)
4aebb60 Simple benchmarking framework (random-zebra)

Pull request description:

  Introduces the benchmarking framework, loosely based on google's micro-benchmarking library (https://github.com/google/benchmark), ported from Bitcoin, up to 0.16.
  The benchmark framework is hard-coded to run each benchmark for one wall-clock second,
  and then spits out .csv-format timing information to stdout.

  Backported PR:
  - bitcoin#6733
  - bitcoin#6770
  - bitcoin#6892
  - bitcoin#8039
  - bitcoin#8107
  - bitcoin#8115
  - bitcoin#9200
  - bitcoin#9202
  - bitcoin#9281
  - bitcoin#6361
  - bitcoin#10271
  - bitcoin#9498
  - bitcoin#9712
  - bitcoin#9547
  - bitcoin#9505 (benchmark only. Rest was in #1557)
  - bitcoin#9792 (benchmark only. Rest was in #643)
  - bitcoin#10272
  - bitcoin#10395 (base58 only)
  - bitcoin#10963
  - bitcoin#11303 (first commit)
  - bitcoin#11562
  - bitcoin#11646
  - bitcoin#11654

  Current output of `src/bench/bench_pivx`:
  ```
  #Benchmark,count,min(ns),max(ns),average(ns),min_cycles,max_cycles,average_cycles
  Base58CheckEncode,131072,7697,8065,7785,20015,20971,20242
  Base58Decode,294912,3305,3537,3454,8595,9198,8981
  Base58Encode,180224,5498,6020,5767,14297,15652,14994
  CCheckQueueSpeed,320,3159960,3535173,3352787,8216030,9191602,8717388
  CCheckQueueSpeedPrevectorJob,96,9184484,11410840,10823070,23880046,29668680,28140445
  FastRandom_1bit,320,3143690,4838162,3199156,8173726,12579373,8317941
  FastRandom_32bit,60,17097612,17923669,17367440,44454504,46602306,45156079
  PrevectorClear,3072,334741,366618,346731,870340,953224,901516
  PrevectorDestructor,2816,344233,368912,357281,895022,959187,928948
  RIPEMD160,288,3404503,3693917,3577774,8851850,9604334,9302363
  SHA1,384,2718128,2891558,2802513,7067238,7518184,7286652
  SHA256,176,6133760,6580005,6239866,15948035,17108376,16223916
  SHA512,240,4251468,4358706,4313463,11054006,11332826,11215186
  Sleep100ms,10,100221470,100302411,100239073,260580075,260790726,260625870
  ```

  NOTE: Not all the tests have been pulled yet (as we might not have the code being tested, or it  would require rewrites to work with our different code base), but the framework is updated to December 2017.

ACKs for top commit:
  Fuzzbawls:
    ACK 3f3edde

Tree-SHA512: c283311a9accf6d2feeb93b185afa08589ebef3f18b6e86980dbc3647b9845f75ac9ecce2f1b08738d25ceac36596a2c89d41e4dbf3b463502aa695611aa1f8e
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants