
Do not run outer setup part of benchmarks multiple times to fix issue 20142 #38779

Merged
merged 1 commit into rust-lang:master from the bench branch on Jan 12, 2017

Conversation

Craig-Macomber
Contributor

Fix #20142

This is my first real Rust code, so I expect the quality is quite bad. Please let me know in which ways it is horrible and I'll fix it.

Previously the whole benchmark function was rerun many times, but with this change, only the callback passed to iter is rerun. This improves performance by avoiding repeated benchmark setup: the setup used to be called a minimum of 101 times, and now it runs only once.
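
For illustration, a benchmark shaped roughly like this (the function name and data are made up, not taken from the patch) now performs its setup once, while only the closure passed to iter is rerun for timing:

#![feature(test)]
extern crate test;
use test::Bencher;

#[bench]
fn bench_sum(b: &mut Bencher) {
    // Setup outside the closure: with this change it runs once per benchmark,
    // rather than once per measurement pass.
    let data: Vec<u64> = (0..10_000).collect();

    // Only this closure is rerun to collect timing samples.
    b.iter(|| data.iter().sum::<u64>());
}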

I wasn't sure exactly what should be done for the case where iter is never called, so I left a FIXME for that: currently it does not error, and I added tests to cover that.

I have left the algorithm and statistics unchanged: I don't like how the minimum number of runs is 301 (that's bad for very slow benchmarks), but I consider such changes out of scope for this fix.

@rust-highfive
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @sfackler (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@arthurprs
Contributor

arthurprs commented Jan 4, 2017

This is very nice; it could also bring variance down a bit in some benchmarks (heap fragmentation, data crossing page/cache-line boundaries, etc.).

@sfackler
Member

sfackler commented Jan 4, 2017

This seems reasonable to me, though I don't know enough about why the benchmarking infrastructure is designed the way it is.

r? @alexcrichton

@alexcrichton
Member

Thanks for the PR @Craig-Macomber! I don't personally know why the benchmark harness was set up this way, and I doubt there's anyone around who still does, so this seems like an ok change to me. Can you elaborate on why BenchMode still exists, though? I couldn't quite follow when we'd have Auto vs. not.

@Craig-Macomber
Contributor Author

BenchMode is used to indicate whether iter should actually do its benchmarking, or just run the callback once (which happens when running as a test instead of a benchmark, via run_once called by convert_benchmarks_to_tests). This is the same behavior as before, but the information is needed in iter now, since that is where all the logic that decides which runs to do lives.
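
Roughly, the idea is the following (a self-contained sketch with illustrative names and a made-up measurement loop, not the actual libtest source):

enum BenchMode {
    Auto,   // normal benchmarking: rerun the closure and collect statistics
    Single, // running as a test (run_once via convert_benchmarks_to_tests): run once
}

struct Bencher { mode: BenchMode }

impl Bencher {
    fn iter<T, F: FnMut() -> T>(&mut self, mut inner: F) {
        match self.mode {
            // Run as a plain test: execute the body once and skip measurement.
            BenchMode::Single => { inner(); }
            // Run as a benchmark: the measurement loop now lives inside iter.
            BenchMode::Auto => { for _ in 0..100 { inner(); } }
        }
    }
}

fn main() {
    Bencher { mode: BenchMode::Single }.iter(|| 2 + 2);
    Bencher { mode: BenchMode::Auto }.iter(|| 2 + 2);
}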

@BurntSushi
Member

I'd also support this change. This has been a pain point for me as well, which I've typically solved by using lazy_static! for expensive initialization. I would also be curious to know why the benchmarking harness was designed that way...
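
For reference, that lazy_static! workaround looks roughly like this (the benchmark name and data are illustrative):

#![feature(test)]
#[macro_use]
extern crate lazy_static;
extern crate test;
use test::Bencher;

lazy_static! {
    // Built on first access, and only once, no matter how many times the
    // benchmark function itself gets called.
    static ref DATA: Vec<u64> = (0..10_000).collect();
}

#[bench]
fn bench_sum_cached(b: &mut Bencher) {
    b.iter(|| DATA.iter().sum::<u64>());
}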

@alexcrichton
Member

@Craig-Macomber oh oops sorry this fell out of my inbox, thanks for the explanation! In that case this looks good to me, thanks again for the PR!

@bors: r+

@bors
Contributor

bors commented Jan 11, 2017

📌 Commit 7cb2040 has been approved by alexcrichton

@bors
Contributor

bors commented Jan 12, 2017

⌛ Testing commit 7cb2040 with merge ac5046c...

bors added a commit that referenced this pull request Jan 12, 2017
Do not run outer setup part of benchmarks multiple times to fix issue 20142

@bors
Contributor

bors commented Jan 12, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing ac5046c to master...

@bors merged commit 7cb2040 into rust-lang:master Jan 12, 2017
@milancio42

Sorry, I know I'm late to the party, but would it be possible to make BenchMode public? I have some benchmarks which must be iterated only once to be meaningful (I accept the statistical error).
Before, it was possible by using the bench_n method, where I could specify the number of iterations. With this pull request, however, I don't see any obvious way to achieve that.
Thanks

@arthurprs
Contributor

IMHO, I'd like to see this stabilized soon, so I'd argue against exposing anything other than what's strictly needed, for the sake of quick stabilization and future-proofing.

@milancio42

Just to clarify, I was referring to the private field Bencher.mode. If the field were public, we could write something like this:

#[bench]
fn my_bench(b: &mut Bencher) {
    // Run the body exactly once instead of the normal sampling loop.
    b.mode = BenchMode::Single;
    b.iter(|| run());
}

@Craig-Macomber deleted the bench branch January 18, 2017 03:49
@Craig-Macomber
Contributor Author

With the implementation as is, the statistics won't work with a single run, so we don't even bother collecting the data.

Having one single stabilized implementation of benchmarking doesn't seem very important compared to having the tests runnable with the standard tools.

So I wonder: Is it likely that bench might get stabilized in something like its current state?

It seems to me like it might make more sense to make the test-running tools a bit more extensible, and put bench (or at least the uncommon use cases/configuration options) in a crate.

Alternatively, we could make bench very extensible. There are so many different things people may want from something like bench (e.g., for some benchmarks you want the minimum time, for others the mean, for others the mean without outliers, and some should report rates, others times, others time complexities).

If we simply made #[bench] take in any configuration from the environment (mainly whether it is running as a benchmark or a test, but maybe more in the future), and something to write progress and results to, then the main loops and statistics code could be swappable. We could stabilize one simple common case, and delegate any complex/custom policies to users/crates.
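
To make that concrete, one possible shape of the idea is sketched below; none of these names exist in libtest, this is purely illustrative:

use std::time::{Duration, Instant};

// A pluggable policy deciding how to drive the measured closure and what to report.
trait BenchPolicy {
    fn run(&self, inner: &mut dyn FnMut()) -> String;
}

// Example policy: report the minimum observed time over a fixed number of runs.
struct MinTime { runs: u32 }

impl BenchPolicy for MinTime {
    fn run(&self, inner: &mut dyn FnMut()) -> String {
        let mut best = Duration::MAX;
        for _ in 0..self.runs {
            let start = Instant::now();
            inner();
            best = best.min(start.elapsed());
        }
        format!("min: {:?}", best)
    }
}

fn main() {
    let policy = MinTime { runs: 100 };
    println!("{}", policy.run(&mut || { std::hint::black_box((0..1000u64).sum::<u64>()); }));
}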

I'm new to Rust (this change was my first Rust project), so I don't have a good sense of what direction things should head from here, but I may be able to help with an implementation.

@BurntSushi
Member

@Craig-Macomber Generally speaking, this type of discussion is better had on the rfcs repo or even in the forums. This very topic actually has a pretty recent thread: https://internals.rust-lang.org/t/pre-rfc-stabilize-bench-bencher-and-black-box/4565

So I wonder: Is it likely that bench might get stabilized in something like its current state?

It has been in wide use for years at this point. Folks acknowledge that there are shortcomings, but it Generally Works.

It seems to me like it might make more sense to make the test running tools a bit more extensible, and throw bench (or at least uncommon use cases/configuration options) in a crate.

This is already done.
