data interpretation: start emphasizing the meaning of best-case outcome (minimum duration) #594

Open
jgehrcke opened this issue Jan 11, 2023 · 2 comments

jgehrcke commented Jan 11, 2023

In #535 we started discussing the relevance of min vs. mean in performance analysis. There is a plethora of literature out there that I think elaborates rather well on the fundamental ideas, such as

In plotting, I think we should move away from showing only mean values and instead show both min and mean (neutrally exposing both pieces of information).

In addition, I think we can safely emphasize min over mean (but that might be controversial). Most importantly, there is no one-size-fits-all solution, and we should do a great job explaining related concepts.

jgehrcke commented May 6, 2023

Made good progress towards enabling that systematically via the introduction of the 'single value summary' concept in #1172 and #1183.

jgehrcke commented Sep 27, 2023

https://conbench.ursa.dev/compare/benchmark-results/06511cc82d757d868000ce4783696dc4...06511ed95f977e968000accebf5ff277/ is a good example of a case where a 'best of N repetitions' would yield a more useful signal than the mean value.

Given

[5.366551, 5.2435, 5.283533]

vs

[5.278719, 6.832488, 5.97085]

Given that very simplistic view, I think it's pretty obvious that this alert for a potential performance regression is a false positive: the minima (5.2435 vs. 5.278719) differ by less than 1%, while the means differ by roughly 14%.
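To make the comparison concrete, here is a minimal sketch that computes the relative change between the two sets of values quoted above, once using the mean and once using the minimum. The helper name `relative_change` and the variable names are hypothetical and not part of Conbench; treating the first list as the baseline is also an assumption.

```python
from statistics import mean

# Values copied from the two results quoted in this comment.
# Assumption: the first list is the baseline, the second the contender.
first = [5.366551, 5.2435, 5.283533]
second = [5.278719, 6.832488, 5.97085]


def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a fraction of `before`."""
    return (after - before) / before


# Hypothetical illustration, not Conbench's own analysis code.
mean_change = relative_change(mean(first), mean(second))
min_change = relative_change(min(first), min(second))

print(f"change in mean: {mean_change:+.1%}")
print(f"change in min:  {min_change:+.1%}")
```

With these numbers the mean-based change is more than an order of magnitude larger than the min-based one, which is why a mean-based alert fires here while a 'best of N' comparison would not.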

One could potentially emit an 'instability warning' instead, because the extent of the volatility appears to be new.
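One possible way to sketch such a warning is to compare the relative spread of the repetitions between the two results. Everything below is an assumption made for illustration: the function names, the choice of `(max - min) / min` as the spread measure, and the `factor` and `floor` thresholds are hypothetical and not taken from Conbench.

```python
def relative_spread(samples: list[float]) -> float:
    """Spread of the samples relative to the best (minimum) value."""
    best = min(samples)
    return (max(samples) - best) / best


def is_newly_unstable(
    baseline: list[float],
    contender: list[float],
    factor: float = 3.0,  # hypothetical: how much larger the spread must be
    floor: float = 0.05,  # hypothetical: ignore spreads below 5%
) -> bool:
    """Return True if the contender is much more volatile than the baseline."""
    threshold = max(floor, factor * relative_spread(baseline))
    return relative_spread(contender) > threshold


# Values copied from the two results quoted in this comment.
first = [5.366551, 5.2435, 5.283533]
second = [5.278719, 6.832488, 5.97085]

if is_newly_unstable(first, second):
    print("instability warning: spread of repetitions grew substantially")
```

The floor keeps the check from firing when the baseline happens to be almost perfectly stable and the contender shows only a small, ordinary amount of noise.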
