-
Notifications
You must be signed in to change notification settings - Fork 6
Description
I wrote a benchmark to test the performance of CKMS and GK.
Findings: CKMS error=0.0001 delivers better and faster results than the error=0.001 suggested in its doc-comment. However, CKMS doesn't have any "sweet spot" - over the entire range where CKMS is feasible, it's slower and more space-intensive than Gk and even than just blindly storing every single value. This is at odds with what I expected from the paper, and also with the claimed memory bounds, so I wonder if there's an implementation bug? (Also, if we can live with just P99, then "store the top 1% of values in a priority queue" is competitive up to 10M values!!)
Method: The benchmark does ckms./gk.insert(value) a number of times then obtains quantiles. I measured wall-time using std::time::Instant::now() / .elapsed(), and I measured heap memory with stats_alloc::Region::new(&GLOBAL) / .change().bytes_allocated - bytes_deallocated + bytes_reallocated. I ran it with cargo run --release on my Macbook. I tried with a normal distribution in the range -0.5 to 1.5, and a pareto distribution in the range 5.0 to 20.0. As a baseline, I added another algorithm "ALL" which keeps every single value in memory - this tells me "perfect" expected values of min/P50/P99/max to judge how accurate GK/CKMS are, and there's no justification in taking more memory than this!
-
VARYING "ERROR" PARAMETER... (1M values)
- error=0.1: ALL 4mb/0.01s, GK 1k/0.1s, CKMS 390mb/6.5s <- ckms and gk are inaccurate
- error=0.01: ALL 4mb/0.01s, GK 11k/0.1s, CKMS 1.1tb/5s <- ckms and gk are inaccurate
- error=0.001: ALL 4mb/0.01s, GK 95k/0.3s, CKMS 750mb/2s <- ckms p99 weak and max inaccurate
- error=0.0001: ALL 4mb/0.01s, GK 770k/3s, CKMS 240mb/2s <- ckms max inaccurate
- error=0.000_01: ALL 4mb/0.01s, GK 12mb/65s, CKMS 66mb/23s
- error=0.000_001: ALL 4mb/0.01s, GK too slow, CKMS 94mb/230s <-- gk too slow
-
VARYING NUMBER OF VALUES... (error_gk=0.001, error_ckms=0.0001)
- count=10k: ALL 40k/0s, GK 95k/0.006s, CKMS 748k/0.02s
- count=100k: ALL 400k/0.001s, GK 95k/0.04s, CKMS 8mb/0.2s
- count=1M: ALL 4mb/0.01s, GK 95k/0.3s, CKMS 240mb/2.5s
- count=10M: ALL 40mb/0.1s, GK 95k/3s, CKMS 8tb/40s
- count=100M: ALL 400mb/1s, GK 95k/30s, CKMS too slow <-- ckms too slow