-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revisit regression detection calculation #270
Comments
I've been paying attention to the regressions reports, and even though we have some false positives, I think we have been able to detect quite a few legitimate regressions.
There are some false positives, yes, but if the conditions we were more relaxed we would have missed the legitimate regressions reported above. However, I think we need to have a better system to be able to discern quickly if a regression is legitimate or if it is a false positive.
With this information at hand I believe that we would be able to discard false positives faster. |
One thing which would be nice to have: I think the criterion for regression detection should be different in PRs vs the scheduled runs. For the scheduled runs, we look at the last three to be sure that there is a consistent measurement (and not some weird network hiccup or EC2 wobble). But for PRs that's not really relevant, since we just have one measurement to compare to the time series. So I'd suggest that we instead look at just the most recent run in PRs (maybe with a 2stdev threshold, maybe not) and compare that to the timeseries. |
@ian-r-rose This makes sense. I'll take a look at how to pass this to the script. |
In #226 we added automatic regression detection. We've already identified a legitimate performance regression, which has been great to see. On the other hand, there have also been some false positives (xref #264 (comment), #247 (comment)). We should revisit our existing regression detection calculation to see how we might be able to reduce the rate of false positives
The text was updated successfully, but these errors were encountered: