New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect multimodal distributions #429

Closed
AndreyAkinshin opened this Issue Apr 21, 2017 · 8 comments

Comments

Projects
None yet
2 participants
@AndreyAkinshin
Member

AndreyAkinshin commented Apr 21, 2017

Some benchmarks have several modes and Mean/Median don't provide useful information. It would be great to have an analyzer which detects such distributions and prints additional warnings (like "This benchmark has 3 modes: 10ms, 50ms, and 500ms").

@goldshtn

This comment has been minimized.

Show comment
Hide comment
@goldshtn

goldshtn Apr 21, 2017

Contributor

Why not a histogram? I'd say a histogram is always more useful than mean/median or any computerized guesses at mode detection.

Contributor

goldshtn commented Apr 21, 2017

Why not a histogram? I'd say a histogram is always more useful than mean/median or any computerized guesses at mode detection.

@AndreyAkinshin

This comment has been minimized.

Show comment
Hide comment
@AndreyAkinshin

AndreyAkinshin Apr 21, 2017

Member

We already have nice distribution plots in RPlotExporter. I use it all the time, and it works perfect for me.
However, people are too lazy. Even if a user knows that it's a good idea to look at the distribution each time, he will not do it. It seems that 95% of our users look only at the summary table. Thus, it would be great to print a warning like "This particular benchmark has non-trivial distribution, check the plots."

Member

AndreyAkinshin commented Apr 21, 2017

We already have nice distribution plots in RPlotExporter. I use it all the time, and it works perfect for me.
However, people are too lazy. Even if a user knows that it's a good idea to look at the distribution each time, he will not do it. It seems that 95% of our users look only at the summary table. Thus, it would be great to print a warning like "This particular benchmark has non-trivial distribution, check the plots."

@goldshtn

This comment has been minimized.

Show comment
Hide comment
@goldshtn

goldshtn Apr 21, 2017

Contributor

If I read you correctly, plots are harder for users because they have to go elsewhere outside the console. So, put the histograms in the console (in ASCII art). Fairly easy.

Contributor

goldshtn commented Apr 21, 2017

If I read you correctly, plots are harder for users because they have to go elsewhere outside the console. So, put the histograms in the console (in ASCII art). Fairly easy.

@AndreyAkinshin

This comment has been minimized.

Show comment
Hide comment
@AndreyAkinshin

AndreyAkinshin Apr 21, 2017

Member

I like the idea of ASCII histograms! However, in my opinion, it should be optional. Imagine that we have 50 benchmark methods. If we just print 50 ASCII histograms by default, nobody will look at it. A great thing about the summary: It's small and provides you an overview of benchmark results. If we add a few additional lines with warnings amount non-trivial distributions, it will be really useful; it will guide users to continue performance investigation in a specific direction. If we add 2000 extra lines with beautiful ASCII charts, users will just ignore it.

Member

AndreyAkinshin commented Apr 21, 2017

I like the idea of ASCII histograms! However, in my opinion, it should be optional. Imagine that we have 50 benchmark methods. If we just print 50 ASCII histograms by default, nobody will look at it. A great thing about the summary: It's small and provides you an overview of benchmark results. If we add a few additional lines with warnings amount non-trivial distributions, it will be really useful; it will guide users to continue performance investigation in a specific direction. If we add 2000 extra lines with beautiful ASCII charts, users will just ignore it.

@goldshtn

This comment has been minimized.

Show comment
Hide comment
@goldshtn

goldshtn Apr 21, 2017

Contributor

I'd try to think of a way to push the histogram into the summary table instead of all the statistics. It doesn't have to be more than 2-3 lines. Mockup:

Method      Mean    Distribution
MyMethod  4.32           _--_
                                 ___       -__
Contributor

goldshtn commented Apr 21, 2017

I'd try to think of a way to push the histogram into the summary table instead of all the statistics. It doesn't have to be more than 2-3 lines. Mockup:

Method      Mean    Distribution
MyMethod  4.32           _--_
                                 ___       -__
@goldshtn

This comment has been minimized.

Show comment
Hide comment
@goldshtn

goldshtn Apr 21, 2017

Contributor

Well the ascii art didn't go as well as I thought from my phone, but I think you get the gist.

Contributor

goldshtn commented Apr 21, 2017

Well the ascii art didn't go as well as I thought from my phone, but I think you get the gist.

@goldshtn

This comment has been minimized.

Show comment
Hide comment
@goldshtn
Contributor

goldshtn commented Apr 21, 2017

@AndreyAkinshin

This comment has been minimized.

Show comment
Hide comment
@AndreyAkinshin

AndreyAkinshin Apr 21, 2017

Member

Sounds interesting, will think about it. Probably it will not be easy to implement (basically because we have a small amount of character and sometimes distributions look really crazy), but I'd like to have such option in some of my benchmarks.
However, I still don't think that we should print it by default for all benchmarks. Especially, if it requires 2-3 lines per benchmark instead of one in the summary table.
Anyway, we need an analyzer which detects that something wrong with a distribution and shows a warning. It's a rare situation when we really have a multimodal distribution. It doesn't make sense to provide additional information about the distribution when a user don't need it.

Member

AndreyAkinshin commented Apr 21, 2017

Sounds interesting, will think about it. Probably it will not be easy to implement (basically because we have a small amount of character and sometimes distributions look really crazy), but I'd like to have such option in some of my benchmarks.
However, I still don't think that we should print it by default for all benchmarks. Especially, if it requires 2-3 lines per benchmark instead of one in the summary table.
Anyway, we need an analyzer which detects that something wrong with a distribution and shows a warning. It's a rare situation when we really have a multimodal distribution. It doesn't make sense to provide additional information about the distribution when a user don't need it.

@AndreyAkinshin AndreyAkinshin self-assigned this Feb 5, 2018

AndreyAkinshin added a commit that referenced this issue Feb 6, 2018

adamsitnik added a commit that referenced this issue Mar 13, 2018

Merge pull request #647 from dotnet/multimodal
Histograms and multimodal distribution detection, fixes #429

@AndreyAkinshin AndreyAkinshin added this to the v0.10.14 milestone Mar 13, 2018

alinasmirnova added a commit to alinasmirnova/BenchmarkDotNet that referenced this issue Sep 22, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment