-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
log(frequency) in histograms confusing #237
Comments
Hmmm, yea - this seems reasonable. I don't have a good idea right at this moment how to make it optional, but I'll get back to this at some point. In the meantime, the shape/profile you're getting isn't as extreme as the cases which I added a log-frequency histogram for, so the heuristic could maybe do with tweaking. The current logic is:
|
Sounds good. If we are arguing a good default (of course log plots are fine in general, I just don't like them here as a part of the default) maybe gathering a few more voices would be a good idea to see if my opinion is more rule or more exception. |
I agree with @karajan9 |
I added them because in some circumstances a non-log plot can push all the other values to round to 0 on the bar graph, which gives the false impression that the executions all took that one time to execute. However, if as can happen that "one time" is say 2us, but 1/100th of the results were 1ms then that rare result which is visually buried actually can account for a large portion of the total time elapsed. This may sound overly contrived, but I have seen examples like that (though I cannot recall one the spot). |
Sure, but would having normal axes not actually help in spotting this phenomenon, and making them automatically log hide it? |
I don' quite see what you mean. Normal axis hide this phenomenon, and log axis make it visible. |
I guess that's hard to discuss without a sample distribution and a criterion for what diagnostic aspect of the benchmarking runs we are trying to isolate. |
Manual control was essentially fixed in #255 (and made more readable in #257), because you can specify In most cases, I agree that the log scaling adds confusion rather than helping. However, I have been able to generate a case which I think illustrates @tecosaur's concern: julia> ioctxfalse = IOContext(stdout, :logbins=>false);
julia> ioctxtrue = IOContext(stdout, :logbins=>true);
julia> t = [fill(1, 100); 1:100]; # fake execution times (will be scaled so the total is 5s, just to ensure consistency with the parameters)
julia> b = BenchmarkTools.Trial(BenchmarkTools.DEFAULT_PARAMETERS, t/sum(t)*1e9*BenchmarkTools.DEFAULT_PARAMETERS.seconds, zeros(length(t)), 0, 0);
julia> show(ioctxfalse, MIME("text/plain"), b)
BenchmarkTools.Trial: 200 samples with 1 evaluation.
Range (min … max): 970.874 μs … 97.087 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 970.874 μs ┊ GC (median): 0.00%
Time (mean ± σ): 25.000 ms ± 31.225 ms ┊ GC (mean ± σ): 0.00% ± 0.00%
█
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▁
971 μs Histogram: frequency by time 95.1 ms <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> show(ioctxtrue, MIME("text/plain"), b)
BenchmarkTools.Trial: 200 samples with 1 evaluation.
Range (min … max): 970.874 μs … 97.087 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 970.874 μs ┊ GC (median): 0.00%
Time (mean ± σ): 25.000 ms ± 31.225 ms ┊ GC (mean ± σ): 0.00% ± 0.00%
█
█▅▃▅▅▃▅▃▅▅▃▅▅▃▅▃▅▅▃▅▃▅▅▃▅▅▃▅▃▅▅▃▅▃▅▅▃▅▅▃▅▃▅▅▃▅▃▅▅▃▅▅▃▅▃▅▅▃▅▅ ▅
971 μs Histogram: log(frequency) by time 95.1 ms <
Memory estimate: 0 bytes, allocs estimate: 0. So I see what @tecosaur means. But to be clear, I think the log scaling is also confusing, because there's no vertical scale; I can't tell if the taller bar is 10x taller or 2x taller or 100x taller than the shorter ones. From this output I would never guess that the shortest time was obtained in just over half the runs. So if no better way of clarifying this can be found, I'd still favor dropping the log scaling. But perhaps someone has a good idea about how to fix it. Would the cumulative distribution simply be too confusing? |
I'm thinking now that there's manual control, with the demonstrated potential for confusion it would be worth dropping the "use log if..." logic and just have it off by default. Regarding the vertical scale, we could set the log basis such that each line / full height bar = 10x. |
FWIW I still think the arithmetic scale is more diagnostic than the log one in @timholy 's example |
Notice in that example that Agree on linear scale being more natural. Colouring in below a line on a log scale is pretty weird, where do you stop, why is this an area? |
Re the summary statistics, but I'm not disagreeing that we need a sense of the variability, but |
Yes I don't think I've ever learned anything from |
I think instead of Mac, what would be more interesting is the 90/95/99 percentile |
A high percentile might be neat, we should try it out. One nice thing about printing |
I'd like to add that a log(frequency) histogram doesn't really make any sense: the point of a histogram is that the area of a single bar / area of all bars is equal to the proportion in that bin. If you log-transform the vertical axis that no longer applies. It would make more sense to log-transform the horizontal axis: this would help capture long-tails better. |
While I was benchmarking two very similar functions I noticed the histograms looked very different, which surprised me. This was because the second run switched to a log display of the histogram.
![image](https://user-images.githubusercontent.com/6549509/125929852-a05911bd-8130-4e01-912e-40f1fa5618b3.png)
(I overwrote the function so they share the same name)
Having the option of a log histogram means I need to check what histogram I'm getting every time. I personally have also trouble reading a log histogram without thinking through what that actually means for my numbers. If one time dominates the histogram I would prefer to see it that way -- I assume the histogram is supposed to be a summary and a single dominant bar would be a good summary in that case. Therefore I would suggest always showing the histogram against the frequency, maybe with an option to display the log histogram.
If the behaviour should stay the same I would like to see an option to pick the histogram type so I can at least compare histograms of measurements easily.
The text was updated successfully, but these errors were encountered: