Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for confidence intervals #69

Merged
merged 14 commits into from
Aug 3, 2016
Merged

Add support for confidence intervals #69

merged 14 commits into from
Aug 3, 2016

Conversation

chrisseaton
Copy link
Contributor

Current output:

Warming up --------------------------------------
                 mul   243.455k i/100ms
                 pow   274.906k i/100ms
Calculating -------------------------------------
                 mul      5.635M (± 8.4%) i/s -     27.997M in   5.014078s
                 pow      7.674M (± 5.1%) i/s -     38.487M in   5.029545s

Comparison:
                 pow:  7674355.1 i/s
                 mul:  5635018.0 i/s - 1.36x  slower

With confidence intervals:

Warming up --------------------------------------
                 mul   244.458k i/100ms
                 pow   282.247k i/100ms
Calculating -------------------------------------
                 mul      5.996M (± 0.6%) i/s -     30.068M in   5.020666s
                 pow      8.035M (± 1.1%) i/s -     39.797M in   5.000516s
                   with 95.0% confidence

Comparison:
                 pow:  8034594.6 i/s
                 mul:  5996097.6 i/s - 1.34x  (± 0.02) slower
                   with 95.0% confidence

Why are confidence intervals good? The standard deviation isn't really actionable. If I tell you something is plus/mins X SD, what can you do with that? If I tell you something is plus/minus X and I'm 95% confident about that then you can theoretically use that in a quantitive assessment of the risk and cost of being wrong and use that to make a decision. It also isn't parametric - you can't make it smaller if you want more certainty, or larger if you are more relaxed.

Another big benefit is that we can show a confidence interval for the comparison as well! This isn't something that isn't possible at the moment.

Finally, I think the standard deviation is overly conservative, and confidence intervals are smaller in practice. From experience using benchmark-ips, the standard deviations we currently use are not useful because they're so large.

Adds an optional dependency on the kalibera gem.

@thedarkone what do you think?

@thedarkone
Copy link

@thedarkone what do you think?

Haha, you make my PR #68 look like something coming from a village idiot 😆.

This is how it handles my bench from #68:

Warming up --------------------------------------
 work(1000 + r(500))    15.000  i/100ms
 work(1000 + r(400))    16.000  i/100ms
Calculating -------------------------------------
 work(1000 + r(500))    155.906  (± 0.9%) i/s -    780.000  in   5.008610s
 work(1000 + r(400))    161.551  (± 0.7%) i/s -    816.000  in   5.053923s
                   with 95.0% confidence

Comparison:
 work(1000 + r(400)):      161.6 i/s
 work(1000 + r(500)):      155.9 i/s - 1.04x  (± 0.01) slower
                   with 95.0% confidence

Awesome!

@chrisseaton
Copy link
Contributor Author

Here's some slides from the Benchmarking '16 conference about the confidence intervals I'm using http://soft-dev.org/events/bench16/slides/Tomas_Kalibera.pdf

@chrisseaton
Copy link
Contributor Author

And this is what confidence intervals look like compared to the SD for the ERB benchmark from MRI.

erb

# Conflicts:
#	lib/benchmark/timing.rb
@chrisseaton
Copy link
Contributor Author

@evanphx it looks like this is now failing CI because master is (I just merged).

Beside that, do you have any opinions on the PR?

The key advantage is it gives you a CI for the speedup as well as the absolute measurement.

@evanphx
Copy link
Owner

evanphx commented Jul 28, 2016

Looks great! I think the keys changes can end up on disk but those are saved briefly so it shouldn't be an issue.

I'll check CI in the morning, but looks fine!

@evanphx evanphx merged commit 8188aee into evanphx:master Aug 3, 2016
@kbrock
Copy link
Contributor

kbrock commented Aug 4, 2016

Thanks @chrisseaton - this is great

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants