Add support for confidence intervals #69
Haha, you make my PR #68 look like something coming from a village idiot 😆. This is how it handles my bench from #68:
Awesome!
Here are some slides from the Benchmarking '16 conference about the confidence intervals I'm using: http://soft-dev.org/events/bench16/slides/Tomas_Kalibera.pdf
# Conflicts: # lib/benchmark/timing.rb
@evanphx it looks like this is now failing CI because master is failing (I just merged). Besides that, do you have any opinions on the PR? The key advantage is that it gives you a CI for the speedup as well as for the absolute measurement.
Looks great! I think the keys changes can end up on disk but those are saved briefly so it shouldn't be an issue. I'll check CI in the morning, but looks fine!
Thanks @chrisseaton - this is great
Current output:
With confidence intervals:
Why are confidence intervals good? The standard deviation isn't really actionable. If I tell you something is plus/minus X SD, what can you do with that? If I tell you something is plus/minus X and I'm 95% confident about that, then you can, in theory, use that in a quantitative assessment of the risk and cost of being wrong, and use that to make a decision. The standard deviation also isn't parametric: you can't make it larger if you want more certainty, or smaller if you are more relaxed.
Another big benefit is that we can show a confidence interval for the comparison as well! This isn't possible at the moment.
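To illustrate the idea of an interval on the comparison, here is a minimal sketch using a plain percentile bootstrap in pure Ruby. It is not the `kalibera` gem's algorithm and not the code in this PR. The helper names (`bootstrap_interval`, `resample`) and the sample values are made up for the example.

```ruby
# Percentile-bootstrap sketch (illustrative only; not the kalibera gem's method).

def mean(xs)
  xs.sum.to_f / xs.size
end

# Draw a sample of the same size from xs, with replacement.
def resample(xs, rng)
  Array.new(xs.size) { xs[rng.rand(xs.size)] }
end

# Evaluate the block `resamples` times, sort the results, and return the
# [lower, upper] values that bracket the central `confidence` share of them.
def bootstrap_interval(resamples: 2_000, confidence: 0.95, rng: Random.new(42))
  stats = Array.new(resamples) { yield(rng) }.sort
  tail = (1.0 - confidence) / 2.0
  lower = stats[(tail * (resamples - 1)).floor]
  upper = stats[((1.0 - tail) * (resamples - 1)).ceil]
  [lower, upper]
end

# Hypothetical iterations-per-second samples for two implementations.
fast = [1050.0, 1020.0, 1075.0, 1043.0, 1061.0, 1038.0, 1055.0, 1049.0]
slow = [510.0, 498.0, 525.0, 503.0, 517.0, 509.0, 521.0, 500.0]

# Interval for one absolute measurement.
fast_ci = bootstrap_interval { |rng| mean(resample(fast, rng)) }

# Interval for the comparison: resample both sides, then take the ratio.
speedup_ci = bootstrap_interval do |rng|
  mean(resample(fast, rng)) / mean(resample(slow, rng))
end

puts format("fast:    %.1f i/s, 95%% CI [%.1f, %.1f]", mean(fast), *fast_ci)
puts format("speedup: %.2fx, 95%% CI [%.2f, %.2f]", mean(fast) / mean(slow), *speedup_ci)
```

Passing a different `confidence:` value widens or narrows the interval, which is the tunable behaviour the standard deviation lacks.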
Finally, I think the standard deviation is overly conservative, and confidence intervals are smaller in practice. From experience using `benchmark-ips`, the standard deviations we currently use are not useful because they're so large.
Adds an optional dependency on the `kalibera` gem.
@thedarkone what do you think?