Built in outlier detection #382

Open
PragTob opened this issue Jun 25, 2023 · 0 comments


PragTob commented Jun 25, 2023

Due to garbage collection, scheduling, and load on the system, we can get outliers in our measurements that heavily skew the average in particular. This is especially common with nanosecond-level benchmarks, where even a 1 ms gap is huge.
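To put rough numbers on that, here is a small Python sketch (not benchee code; the sample values are made up for illustration) showing how a single 1 ms pause among 999 samples of about 100 ns shifts the average while leaving the median alone:

```python
from statistics import mean, median

# Hypothetical run times in nanoseconds: 999 fast samples plus one
# 1 ms (1_000_000 ns) pause, e.g. from GC or the scheduler.
samples_ns = [100] * 999 + [1_000_000]

average = mean(samples_ns)   # pulled up by the single slow sample
middle = median(samples_ns)  # unaffected by it

print(f"average: {average} ns, median: {middle} ns")
```

With these made-up values the average lands roughly eleven times higher than the typical sample, while the median stays at 100 ns.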

I'm no expert and not 100% sure what to use, but I already wanted benchee itself to calculate more of the data for box plots (so that we run less JS, which can currently make it impossible to open). I talk about it here (I had forgotten): https://www.youtube.com/watch?v=C4hqcLwxs3A&t=1398s

But basically, as far as I understand, p75 - p25 is the interquartile range (IQR), and every value that lies more than 1.5 × IQR above p75 or below p25 is considered an outlier.
Skimming over Wikipedia, that description seems accurate: https://en.wikipedia.org/wiki/Box_plot (Elements --> Whiskers)

We could run this statistical analysis once, remove the offending values, and then rerun the statistics calculation.
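A minimal sketch of that flow, written in Python rather than Elixir and not tied to benchee's internals. The function name `remove_outliers`, the `factor` parameter, the sample values, and the choice of the "inclusive" percentile interpolation are all assumptions made for illustration; benchee's own percentile calculation may differ.

```python
from statistics import mean, quantiles

def remove_outliers(samples, factor=1.5):
    """Split samples into (kept, outliers) using the 1.5 * IQR rule.

    Hypothetical helper: the percentile interpolation method is an
    assumption and may not match what benchee computes.
    """
    p25, _p50, p75 = quantiles(samples, n=4, method="inclusive")
    iqr = p75 - p25
    lower = p25 - factor * iqr  # values below this are outliers
    upper = p75 + factor * iqr  # values above this are outliers
    kept = [s for s in samples if lower <= s <= upper]
    outliers = [s for s in samples if s < lower or s > upper]
    return kept, outliers

# Made-up run times in nanoseconds with one 1 ms pause.
samples_ns = [98, 100, 101, 102, 99, 100, 103, 97, 100, 1_000_000]

# First pass: find and drop the outliers.
kept, outliers = remove_outliers(samples_ns)

# Second pass: recompute the statistics on the remaining samples.
average_before = mean(samples_ns)
average_after = mean(kept)
print(f"removed {outliers}: average {average_before} -> {average_after}")
```

Returning the removed values alongside the kept ones would let a report show what was dropped instead of discarding it silently.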

I think this should be opt-in behavior (so default to false).

Kudos to the elixirforum post for reminding me that we didn't have an issue for this yet.
