
Revisit regression detection calculation #270

Closed
jrbourbeau opened this issue Aug 19, 2022 · 3 comments · Fixed by #283
Labels: enhancement (New feature or request), infrastructure (Work related to infrastructure)

Comments

@jrbourbeau (Member)

In #226 we added automatic regression detection. We've already identified a legitimate performance regression, which has been great to see. On the other hand, there have also been some false positives (xref #264 (comment), #247 (comment)). We should revisit our existing regression detection calculation to see how we might be able to reduce the rate of false positives.
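
For context, a minimal sketch of the kind of threshold check being discussed (the function name and threshold are illustrative, not the actual coiled-runtime code):

```python
import numpy as np

def looks_like_regression(history, latest, n_std=2.0):
    """Flag `latest` as a possible regression if it exceeds the
    historical mean by more than `n_std` standard deviations."""
    history = np.asarray(history, dtype=float)
    return latest > history.mean() + n_std * history.std()

# Example: four past runs around 10s, a new run at 12.5s
print(looks_like_regression([10.1, 9.8, 10.3, 10.0], 12.5))  # True
```

A fixed standard-deviation threshold like this is sensitive to noisy time series, which is one plausible source of the false positives mentioned above.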

@jrbourbeau added the enhancement and infrastructure labels on Aug 19, 2022
@ncclementi (Contributor)

I've been paying attention to the regression reports, and even though we have some false positives, I think we have been able to detect quite a few legitimate regressions.

There are some false positives, yes, but if the conditions were more relaxed, we would have missed the legitimate regressions reported above.

However, I think we need a better system for quickly discerning whether a regression is legitimate or a false positive.

With that kind of information at hand, I believe we would be able to discard false positives faster.

@ian-r-rose (Contributor)

One thing that would be nice to have: I think the criterion for regression detection should differ between PRs and the scheduled runs. For the scheduled runs, we look at the last three measurements to be sure there is a consistent signal (and not some weird network hiccup or EC2 wobble). But for PRs that's not really relevant, since we have just one measurement to compare to the time series. So I'd suggest that for PRs we instead look at just the most recent run (maybe with a two-standard-deviation threshold, maybe not) and compare that to the time series.
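
A rough sketch of how the two criteria could diverge (all names here are hypothetical, not the existing script's API):

```python
import numpy as np

def detect_regression(history, recent, is_pr, n_std=2.0):
    """Compare recent measurements against a historical time series.

    Scheduled runs: require the last three measurements to all exceed
    the threshold, filtering out one-off network hiccups or EC2 wobble.
    PR runs: there is only one measurement, so compare it directly.
    """
    history = np.asarray(history, dtype=float)
    threshold = history.mean() + n_std * history.std()
    if is_pr:
        # Single data point from the PR run vs. the time series.
        return recent[-1] > threshold
    # Require three consecutive elevated measurements before reporting.
    return all(m > threshold for m in recent[-3:])
```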

@ncclementi (Contributor)

@ian-r-rose This makes sense. I'll take a look at how to pass this to the script.
I'm thinking of creating an env variable on the action that indicates whether it's a PR or not and, based on that, taking different routes.
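
One possible way to branch, for what it's worth: GitHub Actions already sets GITHUB_EVENT_NAME on every run, so the script could read it directly rather than defining a custom variable (the branch bodies below are placeholders):

```python
import os

# GitHub Actions sets GITHUB_EVENT_NAME to "pull_request" for PR-triggered
# workflows and "schedule" for cron-triggered ones.
IS_PR = os.environ.get("GITHUB_EVENT_NAME") == "pull_request"

if IS_PR:
    # Single measurement: compare the PR run against the time series.
    ...
else:
    # Scheduled run: require several consecutive elevated measurements.
    ...
```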
