Explore regression/anomaly detection #7
Comments
Eh2406 commented May 15, 2018
I was thinking of experimenting with this after watching your talk. Without #4, I will have to brain-dump without a connection to the actual data.
anp commented
This sounds great! Here's the data I've been working with so far: https://github.com/anp/lolbench-analysis There's also a Jupyter notebook (Python 3) that I've been using to inspect the results.
Eh2406 commented May 15, 2018
Thanks, I will look into it! Just from a quick glance, can I strongly recommend pandas.
anp commented
Yes, you can definitely recommend pandas, but I have been avoiding learning it for longer than I've known Rust :P. At least they're dictionaries of numpy arrays! EDIT: In all seriousness, yeah, I probably should have been using pandas :P.
Eh2406 commented May 17, 2018
I could not get your notebook to work locally, so I am working from my own code. So far it seems to find anomalies really well. Too well: most of the time they are blips.
anp commented
Exciting, those are looking super cool! If the blips it's finding are small enough, do you think that sorting by probability would effectively hide the red herrings?
Eh2406 commented May 18, 2018
New version (gist): the thought is that if we are in new behavior then skipping a day won't matter, whereas if we are in a blip it will. So, to work through the examples from the graphs:
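A minimal sketch of that skip-a-day heuristic, assuming timings arrive as a NumPy array; the function name, window sizes, and threshold are all illustrative, not from lolbench:

```python
import numpy as np

def is_sustained_regression(timings, candidate, baseline_window=7, skip=1,
                            threshold=1.1):
    # Compare a baseline before the candidate point against measurements
    # taken *after* skipping a day: a blip recovers, a real shift persists.
    baseline = np.median(timings[max(0, candidate - baseline_window):candidate])
    after = timings[candidate + skip + 1:candidate + skip + 3]
    if len(after) == 0:
        return False  # not enough data yet to confirm either way
    return np.median(after) > baseline * threshold

# Toy data: a one-day blip at index 5, a sustained slowdown from index 10.
series = np.array([1.0] * 5 + [2.0] + [1.0] * 4 + [2.0] * 5)
print(is_sustained_regression(series, 5))   # False: the blip recovers
print(is_sustained_regression(series, 10))  # True: the slowdown persists
```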
anp commented
I've implemented something based on the gist you shared above, and it should be live at https://blog.anp.lol/lolbench-data soon! There's definitely more work to do, but for a first pass it's producing some pretty good results!
anp commented Apr 26, 2018
The problem: we have a lot of benchmark functions, with more to come. Until we've been running for a while we don't want to wantonly disable benchmarks. So, we need to find a way to surface individual benchmarks in human-readable numbers when we detect performance regressions.
My current thinking is that we shouldn't frame the regression detection as a time series analysis (which is what I tried in 2016, without getting a good signal-to-noise ratio). For one thing, I don't think we should expect seasonality or other kinds of cyclic signal, despite the fact that this kind of data is most obviously graphed as a time series.
So for right now, I'm thinking that I'm going to try doing some clustering analysis as a slightly braindead anomaly detection. We don't really care about time here; we mostly care about surfacing nightlies when they deviate from the previously-observed best possible performance.
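One sketch of that time-agnostic framing, assuming Python with NumPy and SciPy: estimate the density of all observed timings (the kernel density estimation linked below) and flag measurements that sit in low-density regions. The function name and threshold are illustrative:

```python
import numpy as np
from scipy.stats import gaussian_kde

def flag_low_density(timings, rel_threshold=0.1):
    # Ignore time entirely: fit a kernel density estimate over all observed
    # timings and flag points whose density falls below a fraction of the peak.
    kde = gaussian_kde(timings)
    density = kde(timings)
    return density < rel_threshold * density.max()

# Toy data: 95 normal runs near 1.0, 5 regressed runs near 1.5.
rng = np.random.default_rng(0)
timings = np.concatenate([rng.normal(1.0, 0.02, 95),
                          rng.normal(1.5, 0.02, 5)])
print(np.flatnonzero(flag_low_density(timings)))  # indices of the slow cluster
```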
Questions:
- Most clustering algorithms (e.g. k-means) need a predetermined k, while we might in practice find different numbers of clusters for different benchmarks. Ideally we'd have an unsupervised clustering mechanism.
- If we can reliably cluster the data points, then I think we can define:
Some links for consideration:
https://en.wikipedia.org/wiki/K-means_clustering
https://en.wikipedia.org/wiki/Cluster_analysis
https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test
https://en.wikipedia.org/wiki/Jenks_natural_breaks_optimization
https://en.wikipedia.org/wiki/Kernel_density_estimation
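On the question of choosing k above, the linked Jenks natural breaks idea suggests one option: test whether the timings even split into more than one mode before committing to a cluster count. A minimal two-class sketch, with illustrative names and threshold:

```python
import numpy as np

def best_break(values):
    # Two-class case of Jenks natural breaks: find the split of the sorted
    # 1-D data that minimizes the total within-class variance.
    v = np.sort(values)
    best_i, best_w = None, np.inf
    for i in range(1, len(v)):
        w = v[:i].var() * i + v[i:].var() * (len(v) - i)
        if w < best_w:
            best_i, best_w = i, w
    return v[best_i - 1], best_w

def looks_bimodal(values, min_gvf=0.8):
    # Goodness of variance fit: fraction of total variance explained by the
    # best two-class split. `min_gvf` is an illustrative cutoff.
    total = values.var() * len(values)
    _, within = best_break(values)
    return 1 - within / total >= min_gvf

fast = np.full(20, 1.0) + np.linspace(0, 0.01, 20)
slow = np.full(5, 1.5) + np.linspace(0, 0.01, 5)
print(looks_bimodal(np.concatenate([fast, slow])))  # True: two timing modes
print(looks_bimodal(fast))                          # False: a single mode
```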