Require a steady clock for bench with at least micro precision #11646
Concept ACK
utACK bb3f030c5559a5053c45cbd0bfebdc3f26703033. (They have the same precision on my system, anyway)
src/bench/bench.h
static constexpr bool steady_is_high_res = std::ratio_less_equal<steady_clock::period, hi_res_clock::period>::value;
using type = std::conditional<steady_is_high_res, steady_clock, hi_res_clock>::type;
using type = std::conditional<hi_res_clock::is_steady, hi_res_clock, steady_clock>::type;
static_assert(std::ratio_less_equal<type::period, std::micro>::value, "steady_clock must have at least microsecond precision");
Thinking about it, I'm not sure I like the static assert here. Do we really prefer that the benchmarks fail to compile rather than compile with possibly lower precision? Normally a runtime warning would be enough for this kind of thing.
I was figuring that since it's (AFAIK) always the case that steady_clock has microsecond precision, it wasn't a big deal. I would use a static_warning if possible, but I guess I could do a runtime print if we prefer that.
Don't care whether it's a runtime or compile-time warning.
It's just that I already predict complaints coming from someone building on some obscure platform that has lower precision, who probably doesn't even care about the benchmarks but just that the project doesn't build.
OK, I moved it to a runtime print.
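For readers following the thread, here is a minimal sketch of what "steady clock selection plus a runtime print instead of a static_assert" could look like. The namespace `bench_sketch` and the names `bench_clock`, `has_micro_precision` and `precision_warning` are hypothetical and chosen only for illustration; this is not the code that was pushed.

```cpp
#include <chrono>
#include <ratio>
#include <type_traits>

namespace bench_sketch { // hypothetical namespace, for illustration only

using hi_res_clock = std::chrono::high_resolution_clock;
using steady_clock = std::chrono::steady_clock;

// Use high_resolution_clock only when it is steady; otherwise use steady_clock.
using bench_clock = std::conditional<hi_res_clock::is_steady, hi_res_clock, steady_clock>::type;

// Known at compile time, but deliberately not turned into a static_assert:
// true when one tick of bench_clock is one microsecond or shorter.
constexpr bool has_micro_precision =
    std::ratio_less_equal<bench_clock::period, std::micro>::value;

// Returns the text to print at startup, or nullptr when precision is sufficient.
inline const char* precision_warning()
{
    return has_micro_precision
        ? nullptr
        : "WARNING: Clock precision is worse than microsecond - benchmarks may be less accurate!\n";
}

} // namespace bench_sketch
```

With this shape the project still builds on a platform with a coarse clock, and the benchmark runner can decide where to print the message.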
bb3f030 to ad0d1e1
Sure, I went back and forth on which to prefer. In practice, I suspect they'll usually be the same anyway. As a data point, while testing libstdc++/libc++/mingw, linux and osx, the only (compile-time) difference I observed between clocks was with libc++. It was the same for osx/linux: high res precision: nanosecond; steady precision: nanosecond; system precision: microsecond.
Everywhere else was nano for all 3.
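A data point like this can be read off each clock's `period` at compile time. The following is a rough sketch of one way to do that; `ticks_per_second`, `print_precision` and `print_all` are hypothetical helper names, and the output format is an assumption rather than the format used for the figures in this comment.

```cpp
#include <chrono>
#include <iostream>
#include <ratio>

// Ticks per second implied by a std::ratio period, e.g. std::micro -> 1000000.
// Integer division: a period longer than one second yields 0.
template <typename Period>
constexpr long long ticks_per_second()
{
    return Period::den / Period::num;
}

// Print one line describing the tick rate of Clock.
template <typename Clock>
void print_precision(std::ostream& out, const char* label)
{
    out << label << " precision: "
        << ticks_per_second<typename Clock::period>() << " ticks/s\n";
}

// Report the three standard clocks compared in this thread.
inline void print_all(std::ostream& out)
{
    print_precision<std::chrono::high_resolution_clock>(out, "high res");
    print_precision<std::chrono::steady_clock>(out, "steady");
    print_precision<std::chrono::system_clock>(out, "system");
}
```

Calling `print_all(std::cout)` on each toolchain would give a comparable table. Note that `period` only describes the tick unit of the clock type, not the granularity the operating system actually delivers.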
I don't believe they can be the same. It seems like most non-Windows platforms have a non-steady high_resolution_clock, meaning you will get different results there...
…On November 9, 2017 1:13:12 PM EST, Cory Fields ***@***.***> wrote:
Sure, I went back and forth on which to prefer. In practice, I suspect
they'll usually be the same anyway.
As a data point, while testing libstdc++/libc++/mingw, linux and osx,
The only (compile-time) difference I observed between clocks was with
libc++. It was the same for osx/linux:
```
high res precision: nanosecond
steady precision: nanosecond
system precision: microsecond
```
Everywhere else was nano for all 3.
src/bench/bench.cpp
@@ -23,6 +23,9 @@ void
benchmark::BenchRunner::RunAll(benchmark::duration elapsedTimeForOne)
{
    perf_init();
    if (std::ratio_less_equal<benchmark::clock::period, std::micro>::value) {
        std::cout << "WARNING: Clock precision is worse than microsecond - benchmarks may be less accurate!\n";
Print to stderr to avoid breaking csv parsing?
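To illustrate the concern: if results go to stdout as CSV, any diagnostic written to the same stream becomes a malformed row for whoever parses it. A small sketch of keeping the two streams apart follows; `csv_row` and `report` are hypothetical names, the three-column layout is a simplification, and the sample row is only illustrative.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>
#include <sstream>
#include <string>

// Format one CSV row. Names containing a comma would break the format.
std::string csv_row(const std::string& name, std::uint64_t count, std::uint64_t avg_ns)
{
    assert(name.find(',') == std::string::npos);
    std::ostringstream row;
    row << name << ',' << count << ',' << avg_ns;
    return row.str();
}

// Results go to `out` (stdout in practice); diagnostics go to `err` (stderr).
void report(std::ostream& out, std::ostream& err, bool clock_is_coarse)
{
    if (clock_is_coarse) {
        err << "WARNING: Clock precision is worse than microsecond - benchmarks may be less accurate!\n";
    }
    out << "#Benchmark,count,average(ns)\n";
    out << csv_row("Sleep100ms", 10, 100239073) << '\n';
}
```

In a real runner this would be called as `report(std::cout, std::cerr, ...)`, so redirecting stdout to a file still yields clean CSV.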
src/bench/bench.h
// On many systems, the high_resolution_clock offers no better resolution than the steady_clock.
// If that's the case, prefer the steady_clock.
// In case high_resolution_clock is steady, prefer that, otherwise use steady_clock.
// ...but always require at least microsecond precision!
comment is stale now
ad0d1e1 to 620bae3
utACK 620bae3
utACK 620bae3
…ecision
620bae3 Require a steady clock for bench with at least micro precision (Matt Corallo)

Pull request description:

Using a non-steady high_resolution_clock by default is definitely not what we want, and in practice steady_clock has more than enough precision. Should double-check that travis passes on this one to make sure we actually have at least microsecond precision on all platforms.

Tree-SHA512: 54a4af3b6addca9897e8ab04694f9461343691b475ca3ed2368595c37520612e284969be94a8ee3d7c66d16532f7bb16b6ad80284cbc153653e8ef2d56696e9d
This is kind of amusing; it seems the person who introduced high_resolution_clock in the first place, Howard Hinnant, has had second thoughts about it:
So yes, this is a good choice. If we want a steady clock, use a steady clock.
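As a concrete illustration of "use a steady clock", a minimal timing helper built directly on `std::chrono::steady_clock` might look like the sketch below. The name `time_ns` is hypothetical, and this is not the bench framework's own measurement loop.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Run fn once and return the elapsed time in nanoseconds.
// steady_clock is monotonic, so the interval cannot be negative
// even if the system wall clock is adjusted while fn runs.
template <typename Fn>
std::int64_t time_ns(Fn&& fn)
{
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    assert(stop >= start);
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

The same code written against a non-steady clock could report a negative or wildly inflated duration after a time adjustment, which is the failure mode this PR avoids.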
@@ -23,6 +23,9 @@ void
benchmark::BenchRunner::RunAll(benchmark::duration elapsedTimeForOne)
{
    perf_init();
    if (std::ratio_less_equal<benchmark::clock::period, std::micro>::value) {
I assume you meant to invert that condition, i.e. print the warning when our period is not less_equal micro-precision.
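To spell out why the condition needs inverting: `std::ratio_less_equal<period, std::micro>` is true when one tick is a microsecond or shorter, which is the good case. A small sketch of the negated check, with a hypothetical helper name `needs_precision_warning`, makes the truth table explicit:

```cpp
#include <chrono>
#include <ratio>

// A warning is needed only when the tick period is longer than one microsecond,
// i.e. when ratio_less_equal<Period, std::micro> is false.
template <typename Period>
constexpr bool needs_precision_warning()
{
    return !std::ratio_less_equal<Period, std::micro>::value;
}

// Nanosecond and microsecond ticks are precise enough; millisecond ticks are not.
static_assert(!needs_precision_warning<std::nano>(), "nanosecond ticks need no warning");
static_assert(!needs_precision_warning<std::micro>(), "microsecond ticks need no warning");
static_assert(needs_precision_warning<std::milli>(), "millisecond ticks need a warning");
```

As written in the diff, the un-negated condition would print the warning on exactly the platforms that have sufficient precision.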
Micro-benchmarking framework part 1

Cherry-picked from the following upstream PRs:

- bitcoin/bitcoin#6733
- bitcoin/bitcoin#6770
- bitcoin/bitcoin#6892
  - Excluding changes to `src/policy/policy.h` which we don't have yet.
- bitcoin/bitcoin#7934
  - Just the benchmark, not the performance improvements.
- bitcoin/bitcoin#8039
- bitcoin/bitcoin#8107
- bitcoin/bitcoin#8115
- bitcoin/bitcoin#8914
  - Required resolving several merge conflicts in code that had been refactored upstream. The changes were simple enough that I decided it was okay to impose merge conflicts on pulling in those refactors later.
- bitcoin/bitcoin#9200
- bitcoin/bitcoin#9202
  - Adds support for measuring CPU cycles, which is later removed in an upstream PR after the refactor. I am including it to reduce future merge conflicts.
- bitcoin/bitcoin#9281
  - Only changes to `src/bench/bench.cpp`
- bitcoin/bitcoin#9498
- bitcoin/bitcoin#9712
- bitcoin/bitcoin#9547
- bitcoin/bitcoin#9505
  - Just the benchmark, not the performance improvements.
- bitcoin/bitcoin#9792
  - Just the benchmark, not the performance improvements.
- bitcoin/bitcoin#10272
- bitcoin/bitcoin#10395
  - Only changes to `src/bench/`
- bitcoin/bitcoin#10735
  - Only changes to `src/bench/base58.cpp`
- bitcoin/bitcoin#10963
- bitcoin/bitcoin#11303
  - Only the benchmark backend change.
- bitcoin/bitcoin#11562
- bitcoin/bitcoin#11646
- bitcoin/bitcoin#11654

This pulls in all changes to the micro-benchmark framework prior to December 2017, when it was rewritten. The rewrite depends on other upstream PRs we have not pulled in yet.

This does not pull in all benchmarks prior to December 2017. It leaves out benchmarks that either test code we do not have yet (except for the `FastRandomContext` refactor, which I decided to pull in), or would require rewrites to work with our changes to the codebase.
3f3edde [Bench] Use PIVX address in Base58Decode test (random-zebra)
5a1be90 [Travis] Disable benchmark framework for trusty test (random-zebra)
1bd89ac Initialize recently introduced non-static class member lastCycles to zero in constructor (random-zebra)
ec60671 Require a steady clock for bench with at least micro precision (random-zebra)
84069ce bench: prefer a steady clock if the resolution is no worse (random-zebra)
38367b1 bench: switch to std::chrono for time measurements (random-zebra)
a24633a Remove countMaskInv caching in bench framework (random-zebra)
9e9bc22 Restore default format state of cout after printing with std::fixed/setprecision (random-zebra)
3dd559d Avoid static analyzer warnings regarding uninitialized arguments (random-zebra)
e85f224 Replace boost::function with std::function (C++11) (random-zebra)
98c0857 Prevent warning: variable 'x' is uninitialized (random-zebra)
7f0d4b3 FastRandom benchmark (random-zebra)
d9fa0c6 Add prevector destructor benchmark (random-zebra)
e1527ba Assert that what might look like a possible division by zero is actually unreachable (random-zebra)
e94cf15 bench: Fix initialization order in registration (random-zebra)
151c25f Basic CCheckQueue Benchmarks (random-zebra)
51aedbc Use std:thread:hardware_concurrency, instead of Boost, to determine available cores (random-zebra)
d447613 Use real number of cores for default -par, ignore virtual cores (random-zebra)
9162a56 [Refactoring] Removed using namespace <xxx> from bench/ sources (random-zebra)
5c07f67 bench: Add support for measuring CPU cycles (random-zebra)
41ce1ed bench: Fix subtle counting issue when rescaling iteration count (random-zebra)
68ea794 Avoid integer division in the benchmark inner-most loop. (random-zebra)
3fa4f27 bench: Added base58 encoding/decoding benchmarks (random-zebra)
4442118 bench: Add crypto hash benchmarks (random-zebra)
a5179b6 [Trivial] ensure minimal header conventions (random-zebra)
8607d6b Support very-fast-running benchmarks (random-zebra)
4aebb60 Simple benchmarking framework (random-zebra)

Pull request description:

Introduces the benchmarking framework, loosely based on google's micro-benchmarking library (https://github.com/google/benchmark), ported from Bitcoin, up to 0.16. The benchmark framework is hard-coded to run each benchmark for one wall-clock second, and then spits out .csv-format timing information to stdout.

Backported PR:
- bitcoin#6733
- bitcoin#6770
- bitcoin#6892
- bitcoin#8039
- bitcoin#8107
- bitcoin#8115
- bitcoin#9200
- bitcoin#9202
- bitcoin#9281
- bitcoin#6361
- bitcoin#10271
- bitcoin#9498
- bitcoin#9712
- bitcoin#9547
- bitcoin#9505 (benchmark only. Rest was in #1557)
- bitcoin#9792 (benchmark only. Rest was in #643)
- bitcoin#10272
- bitcoin#10395 (base58 only)
- bitcoin#10963
- bitcoin#11303 (first commit)
- bitcoin#11562
- bitcoin#11646
- bitcoin#11654

Current output of `src/bench/bench_pivx`:
```
#Benchmark,count,min(ns),max(ns),average(ns),min_cycles,max_cycles,average_cycles
Base58CheckEncode,131072,7697,8065,7785,20015,20971,20242
Base58Decode,294912,3305,3537,3454,8595,9198,8981
Base58Encode,180224,5498,6020,5767,14297,15652,14994
CCheckQueueSpeed,320,3159960,3535173,3352787,8216030,9191602,8717388
CCheckQueueSpeedPrevectorJob,96,9184484,11410840,10823070,23880046,29668680,28140445
FastRandom_1bit,320,3143690,4838162,3199156,8173726,12579373,8317941
FastRandom_32bit,60,17097612,17923669,17367440,44454504,46602306,45156079
PrevectorClear,3072,334741,366618,346731,870340,953224,901516
PrevectorDestructor,2816,344233,368912,357281,895022,959187,928948
RIPEMD160,288,3404503,3693917,3577774,8851850,9604334,9302363
SHA1,384,2718128,2891558,2802513,7067238,7518184,7286652
SHA256,176,6133760,6580005,6239866,15948035,17108376,16223916
SHA512,240,4251468,4358706,4313463,11054006,11332826,11215186
Sleep100ms,10,100221470,100302411,100239073,260580075,260790726,260625870
```

NOTE: Not all the tests have been pulled yet (as we might not have the code being tested, or it would require rewrites to work with our different code base), but the framework is updated to December 2017.

ACKs for top commit:
Fuzzbawls: ACK 3f3edde

Tree-SHA512: c283311a9accf6d2feeb93b185afa08589ebef3f18b6e86980dbc3647b9845f75ac9ecce2f1b08738d25ceac36596a2c89d41e4dbf3b463502aa695611aa1f8e
Using a non-steady high_resolution_clock by default is definitely not what we want, and in practice steady_clock has more than enough precision. Should double-check that travis passes on this one to make sure we actually have at least microsecond precision on all platforms.