benchmark: CKMS slow + excessive memory #32

Open
ljw1004 opened this issue Nov 10, 2021 · 2 comments

ljw1004 commented Nov 10, 2021

I wrote a benchmark to test the performance of CKMS and GK.

Findings: CKMS with error=0.0001 delivers better and faster results than the error=0.001 suggested in its doc-comment. However, CKMS doesn't have any "sweet spot": over the entire range where CKMS is feasible, it's slower and more space-intensive than GK, and even than blindly storing every single value. This is at odds with what I expected from the paper, and also with the claimed memory bounds, so I wonder if there's an implementation bug? (Also, if we can live with just P99, then "store the top 1% of values in a priority queue" is competitive up to 10M values!!)
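
To make the "top 1% in a priority queue" idea concrete, here is a minimal std-only sketch. The names `Total` and `TopOnePercent` are hypothetical, and it assumes the total number of values is known up front (as it is in the benchmark). It orders floats with `f64::total_cmp` rather than the `ordered_float` crate, so no `unsafe` is needed.

```rust
use std::cmp::{Ordering, Reverse};
use std::collections::BinaryHeap;

/// f64 wrapper ordered by `f64::total_cmp`, so it can be stored in a BinaryHeap.
#[derive(Clone, Copy, Debug)]
struct Total(f64);

impl PartialEq for Total {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl Eq for Total {}
impl PartialOrd for Total {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for Total {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.total_cmp(&other.0)
    }
}

/// Keeps only the largest `capacity` values seen so far (hypothetical helper).
struct TopOnePercent {
    capacity: usize,
    /// Min-heap via `Reverse`: the smallest retained value sits on top.
    heap: BinaryHeap<Reverse<Total>>,
}

impl TopOnePercent {
    /// `expected_count` is an assumption: the stream length must be known in advance.
    fn new(expected_count: usize) -> Self {
        let capacity = (expected_count / 100).max(1);
        Self { capacity, heap: BinaryHeap::with_capacity(capacity) }
    }

    fn insert(&mut self, value: f64) {
        if self.heap.len() < self.capacity {
            self.heap.push(Reverse(Total(value)));
        } else if let Some(mut smallest) = self.heap.peek_mut() {
            // Replace the smallest retained value only if the new one is larger.
            if Total(value) > smallest.0 {
                *smallest = Reverse(Total(value));
            }
        }
    }

    /// Smallest retained value, which approximates P99 once the stream is complete.
    fn p99(&self) -> Option<f64> {
        self.heap.peek().map(|r| (r.0).0)
    }

    /// Largest retained value.
    fn max(&self) -> Option<f64> {
        self.heap.iter().map(|r| r.0).max().map(|t| t.0)
    }
}
```

Memory is bounded by 1% of the expected count, which is why this approach stays cheap until the stream gets very long.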

Method: The benchmark calls ckms.insert(value) or gk.insert(value) a number of times, then obtains quantiles. I measured wall-time using std::time::Instant::now() / .elapsed(), and I measured heap memory with stats_alloc::Region::new(&GLOBAL) / .change(), computed as bytes_allocated - bytes_deallocated + bytes_reallocated. I ran it with cargo run --release on my MacBook. I tried a normal distribution (mean 0.5, standard deviation 0.2; samples fell roughly in the range -0.5 to 1.5) and a Pareto distribution (scale 5, shape 10; samples fell roughly in the range 5.0 to 20.0). As a baseline, I added another algorithm "ALL" which keeps every single value in memory. This gives the "perfect" expected values of min/P50/P99/max against which to judge how accurate GK/CKMS are, and there's no justification for taking more memory than this!
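
The measurement approach can be sketched without any external crate. The following is an illustrative stand-in for `stats_alloc`, not its implementation: `CountingAlloc`, `LIVE_BYTES`, and `measure` are hypothetical names, and it tracks only the net change in live heap bytes around a closure, together with wall-clock time.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicIsize, Ordering};
use std::time::Instant;

/// Net bytes currently allocated through the global allocator.
static LIVE_BYTES: AtomicIsize = AtomicIsize::new(0);

/// Hypothetical counting allocator: forwards to the system allocator
/// and keeps a running total of live bytes.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            LIVE_BYTES.fetch_add(layout.size() as isize, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        LIVE_BYTES.fetch_sub(layout.size() as isize, Ordering::Relaxed);
    }

    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        let new_ptr = unsafe { System.realloc(ptr, layout, new_size) };
        if !new_ptr.is_null() {
            // Only the size difference changes the live total.
            LIVE_BYTES.fetch_add(new_size as isize - layout.size() as isize, Ordering::Relaxed);
        }
        new_ptr
    }
}

#[global_allocator]
static ALLOC: CountingAlloc = CountingAlloc;

/// Runs `work` and returns (result, elapsed seconds, net change in live heap bytes).
/// The result is returned to the caller, so memory it still owns is counted.
fn measure<T>(work: impl FnOnce() -> T) -> (T, f64, isize) {
    let before = LIVE_BYTES.load(Ordering::Relaxed);
    let start = Instant::now();
    let result = work();
    let secs = start.elapsed().as_secs_f64();
    let after = LIVE_BYTES.load(Ordering::Relaxed);
    (result, secs, after - before)
}
```

Calling `measure(|| vec![0u32; 1_000_000])` should report a net change of about 4 MB, which matches the per-value cost of the "ALL" baseline when it stores values as 4-byte floats.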

  • VARYING "ERROR" PARAMETER... (1M values)

    • error=0.1: ALL 4mb/0.01s, GK 1k/0.1s, CKMS 390mb/6.5s <- ckms and gk are inaccurate
    • error=0.01: ALL 4mb/0.01s, GK 11k/0.1s, CKMS 1.1gb/5s <- ckms and gk are inaccurate
    • error=0.001: ALL 4mb/0.01s, GK 95k/0.3s, CKMS 750mb/2s <- ckms p99 weak and max inaccurate
    • error=0.0001: ALL 4mb/0.01s, GK 770k/3s, CKMS 240mb/2s <- ckms max inaccurate
    • error=0.000_01: ALL 4mb/0.01s, GK 12mb/65s, CKMS 66mb/23s
    • error=0.000_001: ALL 4mb/0.01s, GK too slow, CKMS 94mb/230s <-- gk too slow
  • VARYING NUMBER OF VALUES... (error_gk=0.001, error_ckms=0.0001)

    • count=10k: ALL 40k/0s, GK 95k/0.006s, CKMS 748k/0.02s
    • count=100k: ALL 400k/0.001s, GK 95k/0.04s, CKMS 8mb/0.2s
    • count=1M: ALL 4mb/0.01s, GK 95k/0.3s, CKMS 240mb/2.5s
    • count=10M: ALL 40mb/0.1s, GK 95k/3s, CKMS 8gb/40s
    • count=100M: ALL 400mb/1s, GK 95k/30s, CKMS too slow <-- ckms too slow

ljw1004 commented Nov 10, 2021

Here's the benchmark source code: the Cargo.toml manifest, followed by the Rust source.

[package]
name = "rf"
version = "0.1.0"
edition = "2018"

[dependencies]
quantiles = "0.7.1"
rand = "0.8.4"
rand_distr = "0.4.2"
ordered-float = "2.8.0"
rand_pcg = "0.3.1"
stats_alloc = "0.1.8"
tdigest = "0.2.2"
thousands = "0.2.0"

use thousands::Separable; // provides separate_with_underscores()

fn main() {
    test_counts();
    println!("\n\nHERE'S HOW WE SETTLED ON PARAMETERS");
    test_gk_and_cksm_params();
    test_digest_params();
}

#[allow(dead_code)]
fn test_counts() {
    let counts = vec![10_000, 100_000, 1_000_000, 10_000_000, 100_000_000];
    let ckms_error = 0.0001;
    let gk_error = 0.001;
    let tdigest_batch = 20_000;
    let tdigest_max_size = 200;
    let dn = rand_distr::Normal::new(0.5f64, 0.2f64).unwrap();
    let dp = rand_distr::Pareto::new(5f64, 10f64).unwrap();
    for count in counts {
        let mut a0 = NoAggregate::new();
        let mut am = MeanAggregate::new();
        let mut av = AllValues::new(count);
        let mut at = TopValues::new(count);
        let mut aq = QuantilesCKMS::new(ckms_error);
        let mut ag = QuantilesGK::new(gk_error);
        let mut ad = TDigestAg::new(tdigest_batch, tdigest_max_size);

        println!("\nCOUNT={}, GK_ERROR={}, CKMS_ERROR={}, TDIGEST_BATCH={}, TDIGEST_MAX_SIZE={}", count.separate_with_underscores(), gk_error, ckms_error, tdigest_batch.separate_with_underscores(), tdigest_max_size);
        println!("    NORMAL DISTRIBITION");
        test(count, &mut a0, dn);
        test(count, &mut am, dn);
        test(count, &mut av, dn);
        test(count, &mut at, dn);
        test(count, &mut ag, dn);
        if count < 50_000_000 {test(count, &mut aq, dn);}
        test(count, &mut ad, dn);
        println!("    PARETO DISTRIBUTION");
        test(count, &mut a0, dp);
        test(count, &mut am, dp);
        test(count, &mut av, dp);
        test(count, &mut at, dp);
        test(count, &mut ag, dp);
        if count < 50_000_000 {test(count, &mut aq, dp);}
        test(count, &mut ad, dp);
    }
}

#[allow(dead_code)]
fn test_digest_params() {
    let count = 10_000_000;
    let dp = rand_distr::Pareto::new(5f64, 10f64).unwrap();
    let dn = rand_distr::Normal::new(0.5f64, 0.2f64).unwrap();
    for max_size in [10, 100, 500, 1000, 5000] {
        let batch = 20_000;
        let mut av = AllValues::new(count);
        let mut at = TDigestAg::new(batch, max_size);
        println!("\nMAX_SIZE={}, BATCH={}, COUNT={}", max_size, batch.separate_with_underscores(), count.separate_with_underscores());
        println!("    NORMAL DISTRIBITION");
        test(count, &mut av, dn);
        test(count, &mut at, dn);
        println!("    PARETO DISTRIBUTION");
        test(count, &mut av, dp);
        test(count, &mut at, dp);
    }
    println!("");
    for batch in [100, 1000, 5000, 10_000, 20_000, 50_000, 100_000] {
        let max_size = 200;
        let mut av = AllValues::new(count);
        let mut at = TDigestAg::new(batch, max_size);
        println!("\nBATCH={}, MAX_SIZE={}, COUNT={}", batch.separate_with_underscores(), max_size, count.separate_with_underscores());
        println!("    NORMAL DISTRIBITION");
        test(count, &mut av, dn);
        test(count, &mut at, dn);
        println!("    PARETO DISTRIBUTION");
        test(count, &mut av, dp);
        test(count, &mut at, dp);
    }
}

#[allow(dead_code)]
fn test_gk_and_cksm_params() {
    let count = 1_000_000;
    let dp = rand_distr::Pareto::new(5f64, 10f64).unwrap();
    let dn = rand_distr::Normal::new(0.5f64, 0.2f64).unwrap();
    for error in [0.1, 0.01, 0.001, 0.0001, 0.000_01, 0.000_001] {
        let mut av = AllValues::new(count);
        let mut ag = QuantilesGK::new(error);
        let mut aq = QuantilesCKMS::new(error);
        println!("\nERROR={}, COUNT={}", error, count.separate_with_underscores());
        println!("    NORMAL DISTRIBITION");
        test(count, &mut av, dn);
        if error > 0.000005 {test(count, &mut ag, dn);}
        test(count, &mut aq, dn);
        println!("    PARETO DISTRIBUTION");
        test(count, &mut av, dp);
        if error > 0.000005 {test(count, &mut ag, dp);}
        test(count, &mut aq, dp);
    }
}

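/// Common interface so every algorithm runs through the same test harness.
trait Aggregate {
    /// Creates a fresh, empty instance with the same parameters as `self`.
    fn anew(&self) -> Self;
    /// Feeds one sample into the aggregate.
    fn insert(&mut self, value: f64);
    /// Summarizes the aggregate (min, quantiles, max) as a printable string.
    fn render(&mut self) -> String;
}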

// INSTRUMENTED_SYSTEM is an instrumented instance of the system allocator
#[global_allocator]
static GLOBAL: &stats_alloc::StatsAlloc<std::alloc::System> = &stats_alloc::INSTRUMENTED_SYSTEM;

fn test<A: Aggregate, D: rand::distributions::Distribution<f64>>(
    count: usize,
    aggregate: &mut A,
    distribution: D,
) {
    let mut rng = rand_pcg::Pcg64::new(0xcafef00dd15ea5e5, 0xa02bdbf7bb3c0a7ac28fa16a64abf96);
    let start = std::time::Instant::now();
    let startmem = stats_alloc::Region::new(&GLOBAL);
    let mut aggregate = aggregate.anew();
    for _ in 0..count {
        let value = distribution.sample(&mut rng);
        aggregate.insert(value);
    }
    let insert_elapsed = start.elapsed().as_secs_f64();
    let start = std::time::Instant::now();
    let fmt = aggregate.render();
    let fmt_elapsed = start.elapsed().as_secs_f64();
    let mem = startmem.change();
    let bytes_change = mem.bytes_allocated as isize - mem.bytes_deallocated as isize + mem.bytes_reallocated;
    println!(
        "        {:.3}s+{:.3}s, {}k heap, {}k stack, {}",
        insert_elapsed,
        fmt_elapsed,
        bytes_change / 1024,
        std::mem::size_of_val(&aggregate) / 1024,
        fmt,
    );
}

struct NoAggregate {}
impl NoAggregate {
    fn new() -> Self {
        Self {}
    }
}
impl Aggregate for NoAggregate {
    fn insert(&mut self, _value: f64) {}
    fn anew(&self) -> NoAggregate { Self::new() }
    fn render(&mut self) -> String {
        format!("NoAggregate")
    }
}

#[derive(Default)]
struct MeanAggregate {
    min: Option<f64>,
    max: Option<f64>,
    mean: f64,
    variance_sum: f64,
    count: usize,
}
impl MeanAggregate {
    fn new() -> Self {
        std::default::Default::default()
    }
}
impl Aggregate for MeanAggregate {
    fn render(&mut self) -> String {
        format!(
            "MeanAggregate, min={:.4}, mean={:.4} (stdev {:.4}),  max={:.4}",
            self.min.unwrap(),
            self.mean,
            (self.variance_sum / self.count as f64).sqrt(),
            self.max.unwrap(),
        )
    }

    fn anew(&self) -> Self { MeanAggregate::new() }
    fn insert(&mut self, value: f64) {
        match self.min {
            None => self.min = Some(value),
            Some(min) if value < min => self.min = Some(value),
            _ => {}
        }
        match self.max {
            None => self.max = Some(value),
            Some(max) if value > max => self.max = Some(value),
            _ => {}
        }
        self.count += 1;
        let new_mean = self.mean + (value - self.mean) / self.count as f64;
        self.variance_sum += (value - self.mean) * (value - new_mean);
        self.mean = new_mean;
    }
}

struct AllValues {
    values: Vec<f32>,
}
impl AllValues {
    fn new(count: usize) -> Self {
        let values = Vec::with_capacity(count);
        AllValues { values }
    }
}
impl Aggregate for AllValues {
    fn render(&mut self) -> String {
        self.values.sort_by(|a, b| a.partial_cmp(b).unwrap());
        let len = self.values.len();
        format!(
            "AllValues, min={:.4}, P50={:.4}, P99={:.4}, max={:.4}",
            self.values[0],
            self.values[len / 2],
            self.values[len * 99 / 100],
            self.values[len - 1],
        )
    }

    fn anew(&self) -> Self { Self::new(self.values.capacity()) }
    fn insert(&mut self, value: f64) {
        self.values.push(value as f32);
    }
}

struct TopValues {
    count: usize,
    values: std::collections::BinaryHeap<std::cmp::Reverse<ordered_float::NotNan<f32>>>,
}
impl TopValues {
    fn new(count: usize) -> Self {
        let capacity = std::cmp::max(count / 100, 1);
        let values = std::collections::BinaryHeap::with_capacity(capacity);
        TopValues { count, values }
    }
}
impl Aggregate for TopValues {
    fn render(&mut self) -> String {
        let p99 = self.values.peek().unwrap().0;
        let max = self.values.drain().min().unwrap().0;
        format!("TopValues, p99={:.4}, max={:.4}", p99, max)
    }
    fn anew(&self) -> Self { Self::new(self.count) }
    fn insert(&mut self, value: f64) {
        let value = value as f32;
        let value = std::cmp::Reverse(unsafe { ordered_float::NotNan::new_unchecked(value) });

        if self.values.len() < self.values.capacity() {
            self.values.push(value);
        } else if self.values.peek().unwrap().0 < value.0 {
            self.values.pop();
            self.values.push(value);
        }
    }
}

struct QuantilesCKMS {
    error: f64,
    q: quantiles::ckms::CKMS<f64>,
}
impl QuantilesCKMS {
    fn new(error: f64) -> Self {
        let q = quantiles::ckms::CKMS::new(error);
        QuantilesCKMS { error, q }
    }
}
impl Aggregate for QuantilesCKMS {
    fn render(&mut self) -> String {
        format!(
            "QuantilesCKMS, min={:.4}, mean={:.4}, p50={:.4}, P99={:.4}, max={:.4}",
            self.q.query(0.0).unwrap().1,
            self.q.cma().unwrap(),
            self.q.query(0.5).unwrap().1,
            self.q.query(0.99).unwrap().1,
            self.q.query(1.0).unwrap().1,
        )
    }
    fn anew(&self) -> Self {Self::new(self.error) }
    fn insert(&mut self, value: f64) {
        self.q.insert(value);
    }
}

struct QuantilesGK {
    error: f64,
    q: quantiles::greenwald_khanna::Stream<ordered_float::NotNan<f64>>,
}
impl QuantilesGK {
    fn new(error: f64) -> Self {
        let q = quantiles::greenwald_khanna::Stream::new(error);
        QuantilesGK { q, error }
    }
}
impl Aggregate for QuantilesGK {
    fn render(&mut self) -> String {
        format!(
            "QuantilesGK, min={:.4}, p50={:.4}, P99={:.4}, max={:.4}",
            self.q.quantile(0.0),
            self.q.quantile(0.5),
            self.q.quantile(0.99),
            self.q.quantile(1.0),
        )
    }
    fn anew(&self) -> Self {
        Self::new(self.error)
    }
    fn insert(&mut self, value: f64) {
        let value = unsafe { ordered_float::NotNan::new_unchecked(value) };
        self.q.insert(value);
    }
}

struct TDigestAg {
    batch: Vec<f64>,
    t : tdigest::TDigest,
}
impl TDigestAg {
    fn new(batch: usize, max_size: usize) -> Self { 
        let batch = Vec::with_capacity(batch);
        let t = tdigest::TDigest::new_with_size(max_size);
        TDigestAg { batch, t}
    }
    fn merge(&mut self) {
        let capacity = self.batch.capacity();
        let prev = std::mem::replace(&mut self.batch, Vec::with_capacity(capacity));
        self.t = self.t.merge_unsorted(prev);
        self.batch.clear();
    }
}
impl Aggregate for TDigestAg {
    fn render(&mut self) -> String {
        self.merge();
        format!(
            "TDigest, min={:.4}, mean={:.4}, P50={:.4}, P99={:.4}, max={:.4}",
            self.t.min(),
            self.t.mean(),
            self.t.estimate_quantile(0.5),
            self.t.estimate_quantile(0.99),
            self.t.max(),
        )
    }
    fn anew(&self) -> Self { Self::new(self.batch.capacity(), self.t.max_size()) }
    fn insert(&mut self, value: f64) {
        if self.batch.len() == self.batch.capacity() {
            self.merge();
        }
        self.batch.push(value);
    }
}


ljw1004 commented Nov 10, 2021

Here are the raw results from the benchmark on my MacBook.

How do the algorithms scale with number of values?

  • COUNT=10_000, GK_ERROR=0.001, CKMS_ERROR=0.0001, TDIGEST_BATCH=20_000, TDIGEST_MAX_SIZE=200

    • NORMAL DISTRIBITION
      • 0.000s+0.000s, 0k heap, 0k stack, NoAggregate
      • 0.000s+0.000s, 0k heap, 0k stack, MeanAggregate, min=-0.2296, mean=0.5025 (stdev 0.2014), max=1.1950
      • 0.000s+0.001s, 39k heap, 0k stack, AllValues, min=-0.2296, P50=0.5025, P99=0.9703, max=1.1950
      • 0.000s+0.000s, 0k heap, 0k stack, TopValues, p99=0.9703, max=1.1950
      • 0.004s+0.000s, 95k heap, 0k stack, QuantilesGK, min=-0.2148, p50=0.5028, P99=0.9684, max=1.1950
      • 0.017s+0.000s, 748k heap, 0k stack, QuantilesCKMS, min=-0.2296, mean=0.5025, p50=0.5025, P99=0.9692, max=1.1950
      • 0.000s+0.001s, 159k heap, 0k stack, TDigest, min=-0.2296, mean=0.5025, P50=0.5023, P99=0.9690, max=1.1950
    • PARETO DISTRIBUTION
      • 0.000s+0.000s, 0k heap, 0k stack, NoAggregate
      • 0.000s+0.000s, 0k heap, 0k stack, MeanAggregate, min=5.0001, mean=5.5513 (stdev 0.6114), max=12.6260
      • 0.000s+0.001s, 39k heap, 0k stack, AllValues, min=5.0001, P50=5.3559, P99=7.9406, max=12.6260
      • 0.000s+0.000s, 0k heap, 0k stack, TopValues, p99=7.9406, max=12.6260
      • 0.005s+0.000s, 95k heap, 0k stack, QuantilesGK, min=5.0005, p50=5.3568, P99=7.9528, max=12.6260
      • 0.019s+0.000s, 771k heap, 0k stack, QuantilesCKMS, min=5.0001, mean=5.5513, p50=5.3558, P99=7.9353, max=12.6260
      • 0.000s+0.001s, 159k heap, 0k stack, TDigest, min=5.0001, mean=5.5513, P50=5.3560, P99=7.9415, max=12.6260
  • COUNT=100_000, GK_ERROR=0.001, CKMS_ERROR=0.0001, TDIGEST_BATCH=20_000, TDIGEST_MAX_SIZE=200

    • NORMAL DISTRIBITION
      • 0.001s+0.000s, 0k heap, 0k stack, NoAggregate
      • 0.001s+0.000s, 0k heap, 0k stack, MeanAggregate, min=-0.4538, mean=0.5010 (stdev 0.1999), max=1.3948
      • 0.001s+0.010s, 390k heap, 0k stack, AllValues, min=-0.4538, P50=0.5011, P99=0.9645, max=1.3948
      • 0.001s+0.000s, 3k heap, 0k stack, TopValues, p99=0.9645, max=1.3948
      • 0.039s+0.000s, 95k heap, 0k stack, QuantilesGK, min=-0.4538, p50=0.5009, P99=0.9684, max=1.3948
      • 0.247s+0.002s, 8520k heap, 0k stack, QuantilesCKMS, min=-0.4538, mean=0.5010, p50=0.5011, P99=0.9639, max=1.3948
      • 0.006s+0.001s, 160k heap, 0k stack, TDigest, min=-0.4538, mean=0.5010, P50=0.5012, P99=0.9631, max=1.3948
    • PARETO DISTRIBUTION
      • 0.000s+0.000s, 0k heap, 0k stack, NoAggregate
      • 0.003s+0.000s, 0k heap, 0k stack, MeanAggregate, min=5.0000, mean=5.5536 (stdev 0.6211), max=15.3146
      • 0.002s+0.009s, 390k heap, 0k stack, AllValues, min=5.0000, P50=5.3564, P99=7.9297, max=15.3146
      • 0.004s+0.000s, 3k heap, 0k stack, TopValues, p99=7.9297, max=15.3146
      • 0.043s+0.000s, 95k heap, 0k stack, QuantilesGK, min=5.0005, p50=5.3568, P99=7.9528, max=15.3146
      • 0.256s+0.002s, 7967k heap, 0k stack, QuantilesCKMS, min=5.0000, mean=5.5536, p50=5.3565, P99=7.9353, max=15.3146
      • 0.007s+0.001s, 160k heap, 0k stack, TDigest, min=5.0000, mean=5.5536, P50=5.3563, P99=7.9369, max=15.3146
  • COUNT=1_000_000, GK_ERROR=0.001, CKMS_ERROR=0.0001, TDIGEST_BATCH=20_000, TDIGEST_MAX_SIZE=200

    • NORMAL DISTRIBITION
      • 0.006s+0.000s, 0k heap, 0k stack, NoAggregate
      • 0.009s+0.000s, 0k heap, 0k stack, MeanAggregate, min=-0.4963, mean=0.5002 (stdev 0.2000), max=1.4952
      • 0.008s+0.111s, 3906k heap, 0k stack, AllValues, min=-0.4963, P50=0.5001, P99=0.9653, max=1.4952
      • 0.012s+0.000s, 39k heap, 0k stack, TopValues, p99=0.9653, max=1.4952
      • 0.334s+0.000s, 95k heap, 0k stack, QuantilesGK, min=-0.4963, p50=0.5004, P99=0.9684, max=1.4952
      • 2.469s+0.049s, 240590k heap, 0k stack, QuantilesCKMS, min=-0.4963, mean=0.5002, p50=0.5001, P99=0.9651, max=1.2733
      • 0.063s+0.001s, 168k heap, 0k stack, TDigest, min=-0.4963, mean=0.5002, P50=0.5002, P99=0.9645, max=1.4952
    • PARETO DISTRIBUTION
      • 0.000s+0.000s, 0k heap, 0k stack, NoAggregate
      • 0.025s+0.000s, 0k heap, 0k stack, MeanAggregate, min=5.0000, mean=5.5552 (stdev 0.6208), max=18.0523
      • 0.024s+0.104s, 3906k heap, 0k stack, AllValues, min=5.0000, P50=5.3585, P99=7.9284, max=18.0523
      • 0.030s+0.000s, 39k heap, 0k stack, TopValues, p99=7.9284, max=18.0523
      • 0.336s+0.000s, 95k heap, 0k stack, QuantilesGK, min=5.0005, p50=5.3592, P99=7.9528, max=18.0523
      • 2.654s+0.055s, 251135k heap, 0k stack, QuantilesCKMS, min=5.0000, mean=5.5552, p50=5.3585, P99=7.9254, max=13.0034
      • 0.080s+0.001s, 168k heap, 0k stack, TDigest, min=5.0000, mean=5.5552, P50=5.3584, P99=7.9292, max=18.0523
  • COUNT=10_000_000, GK_ERROR=0.001, CKMS_ERROR=0.0001, TDIGEST_BATCH=20_000, TDIGEST_MAX_SIZE=200

    • NORMAL DISTRIBITION
      • 0.050s+0.000s, 0k heap, 0k stack, NoAggregate
      • 0.083s+0.000s, 0k heap, 0k stack, MeanAggregate, min=-0.6524, mean=0.5000 (stdev 0.2000), max=1.4952
      • 0.078s+1.301s, 39063k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9655, max=1.4952
      • 0.132s+0.000s, 390k heap, 0k stack, TopValues, p99=0.9655, max=1.4952
      • 3.117s+0.000s, 95k heap, 0k stack, QuantilesGK, min=-0.6524, p50=0.5004, P99=0.9684, max=1.4952
      • 43.448s+1.681s, 8418141k heap, 0k stack, QuantilesCKMS, min=-0.6524, mean=0.5000, p50=0.5000, P99=0.9649, max=1.2744
      • 0.616s+0.001s, 253k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.4997, P99=0.9655, max=1.4952
    • PARETO DISTRIBUTION
      • 0.000s+0.000s, 0k heap, 0k stack, NoAggregate
      • 0.261s+0.000s, 0k heap, 0k stack, MeanAggregate, min=5.0000, mean=5.5554 (stdev 0.6207), max=26.5894
      • 0.237s+1.287s, 39063k heap, 0k stack, AllValues, min=5.0000, P50=5.3588, P99=7.9237, max=26.5894
      • 0.378s+0.000s, 390k heap, 0k stack, TopValues, p99=7.9237, max=26.5894
      • 3.604s+0.000s, 95k heap, 0k stack, QuantilesGK, min=5.0005, p50=5.3592, P99=7.9528, max=26.5894
      • 43.045s+1.739s, 8170118k heap, 0k stack, QuantilesCKMS, min=5.0000, mean=5.5554, p50=5.3588, P99=7.9187, max=12.9397
      • 0.771s+0.001s, 253k heap, 0k stack, TDigest, min=5.0000, mean=5.5554, P50=5.3582, P99=7.9199, max=26.5894
  • COUNT=100_000_000, GK_ERROR=0.001, CKMS_ERROR=0.0001, TDIGEST_BATCH=20_000, TDIGEST_MAX_SIZE=200

    • NORMAL DISTRIBITION
      • 0.461s+0.000s, 0k heap, 0k stack, NoAggregate
      • 0.785s+0.000s, 0k heap, 0k stack, MeanAggregate, min=-0.6524, mean=0.5000 (stdev 0.2000), max=1.6590
      • 0.668s+13.481s, 390625k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9653, max=1.6590
      • 1.540s+0.002s, 3906k heap, 0k stack, TopValues, p99=0.9653, max=1.6590
      • 30.659s+0.000s, 95k heap, 0k stack, QuantilesGK, min=-0.6524, p50=0.5004, P99=0.9684, max=1.6590
      • 6.526s+0.001s, 1096k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.4997, P99=0.9642, max=1.6590
    • PARETO DISTRIBUTION
      • 0.000s+0.000s, 0k heap, 0k stack, NoAggregate
      • 2.462s+0.000s, 0k heap, 0k stack, MeanAggregate, min=5.0000, mean=5.5556 (stdev 0.6212), max=28.6652
      • 2.075s+13.934s, 390625k heap, 0k stack, AllValues, min=5.0000, P50=5.3589, P99=7.9248, max=28.6652
      • 3.160s+0.001s, 3906k heap, 0k stack, TopValues, p99=7.9248, max=28.6652
      • 33.288s+0.000s, 95k heap, 0k stack, QuantilesGK, min=5.0005, p50=5.3592, P99=7.9528, max=28.6652
      • 7.380s+0.001s, 1096k heap, 0k stack, TDigest, min=5.0000, mean=5.5556, P50=5.3596, P99=7.9236, max=28.6652
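
One way to read the scaling results is to convert each "k heap" figure into bytes per inserted value. The sketch below does that for the CKMS normal-distribution rows above; the helper names are hypothetical, and it assumes "k" means 1024 bytes, since the benchmark prints bytes_change / 1024.

```rust
/// Heap bytes per inserted value, given a "{n}k heap" figure from the output.
fn bytes_per_value(heap_kib: u64, count: u64) -> f64 {
    (heap_kib * 1024) as f64 / count as f64
}

/// (count, CKMS heap in KiB), copied from the normal-distribution runs above.
const CKMS_NORMAL: [(u64, u64); 4] = [
    (10_000, 748),
    (100_000, 8_520),
    (1_000_000, 240_590),
    (10_000_000, 8_418_141),
];

/// Returns (count, bytes per value) for each CKMS row.
fn ckms_bytes_per_value() -> Vec<(u64, f64)> {
    CKMS_NORMAL
        .iter()
        .map(|&(count, kib)| (count, bytes_per_value(kib, count)))
        .collect()
}
```

This works out to roughly 77, 87, 246, and 862 bytes per value as the count grows from 10k to 10M, so the per-value cost rises with the input size, whereas AllValues stays at about 4 bytes per value and GK stays at 95k in total.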

How did we settle on the "error" parameter for GK/CKMS?

  • ERROR=0.1, COUNT=1_000_000

    • NORMAL DISTRIBITION
      • 0.006s+0.106s, 3906k heap, 0k stack, AllValues, min=-0.4963, P50=0.5001, P99=0.9653, max=1.4952
      • 0.118s+0.000s, 1k heap, 0k stack, QuantilesGK, min=0.2029, p50=0.5300, P99=1.4952, max=1.4952
      • 7.334s+0.000s, 390787k heap, 0k stack, QuantilesCKMS, min=-0.4963, mean=0.5002, p50=0.4784, P99=0.7855, max=0.9930
    • PARETO DISTRIBUTION
      • 0.023s+0.098s, 3906k heap, 0k stack, AllValues, min=5.0000, P50=5.3585, P99=7.9284, max=18.0523
      • 0.128s+0.000s, 1k heap, 0k stack, QuantilesGK, min=5.0000, p50=5.3925, P99=18.0523, max=18.0523
      • 6.658s+0.000s, 318529k heap, 0k stack, QuantilesCKMS, min=5.0000, mean=5.5552, p50=5.3134, P99=6.6236, max=6.6236
  • ERROR=0.01, COUNT=1_000_000

    • NORMAL DISTRIBITION
      • 0.007s+0.103s, 3906k heap, 0k stack, AllValues, min=-0.4963, P50=0.5001, P99=0.9653, max=1.4952
      • 0.148s+0.000s, 11k heap, 0k stack, QuantilesGK, min=-0.4963, p50=0.4996, P99=1.4952, max=1.4952
      • 5.554s+0.006s, 1152568k heap, 0k stack, QuantilesCKMS, min=-0.4963, mean=0.5002, p50=0.4979, P99=0.9232, max=0.9689
    • PARETO DISTRIBUTION
      • 0.025s+0.106s, 3906k heap, 0k stack, AllValues, min=5.0000, P50=5.3585, P99=7.9284, max=18.0523
      • 0.179s+0.000s, 11k heap, 0k stack, QuantilesGK, min=5.0000, p50=5.3610, P99=18.0523, max=18.0523
      • 5.661s+0.005s, 1229721k heap, 0k stack, QuantilesCKMS, min=5.0000, mean=5.5552, p50=5.3564, P99=7.4251, max=7.9711
  • ERROR=0.001, COUNT=1_000_000

    • NORMAL DISTRIBITION
      • 0.008s+0.106s, 3906k heap, 0k stack, AllValues, min=-0.4963, P50=0.5001, P99=0.9653, max=1.4952
      • 0.310s+0.000s, 95k heap, 0k stack, QuantilesGK, min=-0.4963, p50=0.5004, P99=0.9684, max=1.4952
      • 2.133s+0.018s, 798287k heap, 0k stack, QuantilesCKMS, min=-0.4963, mean=0.5002, p50=0.5001, P99=0.9599, max=1.1207
    • PARETO DISTRIBUTION
      • 0.023s+0.099s, 3906k heap, 0k stack, AllValues, min=5.0000, P50=5.3585, P99=7.9284, max=18.0523
      • 0.327s+0.000s, 95k heap, 0k stack, QuantilesGK, min=5.0005, p50=5.3592, P99=7.9528, max=18.0523
      • 2.314s+0.020s, 719869k heap, 0k stack, QuantilesCKMS, min=5.0000, mean=5.5552, p50=5.3586, P99=7.8787, max=10.7186
  • ERROR=0.0001, COUNT=1_000_000

    • NORMAL DISTRIBITION
      • 0.007s+0.103s, 3906k heap, 0k stack, AllValues, min=-0.4963, P50=0.5001, P99=0.9653, max=1.4952
      • 3.497s+0.000s, 767k heap, 0k stack, QuantilesGK, min=-0.2888, p50=0.5002, P99=0.9656, max=1.4952
      • 2.586s+0.052s, 240590k heap, 0k stack, QuantilesCKMS, min=-0.4963, mean=0.5002, p50=0.5001, P99=0.9651, max=1.2733
    • PARETO DISTRIBUTION
      • 0.025s+0.107s, 3906k heap, 0k stack, AllValues, min=5.0000, P50=5.3585, P99=7.9284, max=18.0523
      • 3.548s+0.000s, 767k heap, 0k stack, QuantilesGK, min=5.0000, p50=5.3585, P99=7.9353, max=18.0523
      • 2.680s+0.050s, 251135k heap, 0k stack, QuantilesCKMS, min=5.0000, mean=5.5552, p50=5.3585, P99=7.9254, max=13.0034
  • ERROR=0.00001, COUNT=1_000_000

    • NORMAL DISTRIBITION
      • 0.007s+0.102s, 3906k heap, 0k stack, AllValues, min=-0.4963, P50=0.5001, P99=0.9653, max=1.4952
      • 65.390s+0.000s, 12287k heap, 0k stack, QuantilesGK, min=-0.4963, p50=0.5001, P99=0.9653, max=1.4952
      • 23.766s+0.173s, 66786k heap, 0k stack, QuantilesCKMS, min=-0.4963, mean=0.5002, p50=0.5001, P99=0.9652, max=1.4952
    • PARETO DISTRIBUTION
      • 0.025s+0.111s, 3906k heap, 0k stack, AllValues, min=5.0000, P50=5.3585, P99=7.9284, max=18.0523
      • 65.391s+0.000s, 12287k heap, 0k stack, QuantilesGK, min=5.0000, p50=5.3585, P99=7.9293, max=18.0523
      • 24.265s+0.188s, 67604k heap, 0k stack, QuantilesCKMS, min=5.0000, mean=5.5552, p50=5.3585, P99=7.9293, max=18.0523
  • ERROR=0.000001, COUNT=1_000_000

    • NORMAL DISTRIBITION
      • 0.007s+0.104s, 3906k heap, 0k stack, AllValues, min=-0.4963, P50=0.5001, P99=0.9653, max=1.4952
      • 252.283s+0.566s, 94079k heap, 0k stack, QuantilesCKMS, min=-0.4963, mean=0.5002, p50=0.5001, P99=0.9653, max=1.4952
    • PARETO DISTRIBUTION
      • 0.023s+0.105s, 3906k heap, 0k stack, AllValues, min=5.0000, P50=5.3585, P99=7.9284, max=18.0523
      • 254.217s+0.551s, 94016k heap, 0k stack, QuantilesCKMS, min=5.0000, mean=5.5552, p50=5.3585, P99=7.9284, max=18.0523
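
The accuracy judgments above compare values, but the error parameter of a quantile summary is usually stated in terms of rank: in the Greenwald-Khanna paper's definition, an ε-approximate answer has a rank within ε·n of the target. Assuming the exact sorted values are available (as they are for AllValues), a hypothetical helper to measure that rank error could look like this:

```rust
/// Fraction of `sorted` that is <= `estimate`, i.e. the quantile the estimate
/// actually sits at. Assumes `sorted` is non-empty and in ascending order.
fn achieved_quantile(sorted: &[f64], estimate: f64) -> f64 {
    let rank = sorted.partition_point(|&v| v <= estimate);
    rank as f64 / sorted.len() as f64
}

/// Absolute rank error, as a fraction of n, of `estimate` for target quantile `q`.
fn rank_error(sorted: &[f64], estimate: f64, q: f64) -> f64 {
    (achieved_quantile(sorted, estimate) - q).abs()
}
```

Comparing `rank_error(&sorted, estimate, 0.99)` against the configured error would show whether a result such as an off-target P99 is within the promised bound or a sign of a bug.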

How did we settle on the "batch" and "max-size" parameters for TDigest?

  • MAX_SIZE=10, BATCH=20_000, COUNT=10_000_000

    • NORMAL DISTRIBITION
      • 0.060s+1.390s, 39063k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9655, max=1.4952
      • 0.652s+0.001s, 250k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.5004, P99=0.9844, max=1.4952
    • PARETO DISTRIBUTION
      • 0.242s+1.294s, 39063k heap, 0k stack, AllValues, min=5.0000, P50=5.3588, P99=7.9237, max=26.5894
      • 0.786s+0.001s, 250k heap, 0k stack, TDigest, min=5.0000, mean=5.5554, P50=5.3515, P99=8.2127, max=26.5894
  • MAX_SIZE=100, BATCH=20_000, COUNT=10_000_000

    • NORMAL DISTRIBITION
      • 0.063s+1.322s, 39063k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9655, max=1.4952
      • 0.689s+0.001s, 251k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.5000, P99=0.9649, max=1.4952
    • PARETO DISTRIBUTION
      • 0.225s+1.191s, 39063k heap, 0k stack, AllValues, min=5.0000, P50=5.3588, P99=7.9237, max=26.5894
      • 0.721s+0.001s, 251k heap, 0k stack, TDigest, min=5.0000, mean=5.5554, P50=5.3588, P99=7.9328, max=26.5894
  • MAX_SIZE=500, BATCH=20_000, COUNT=10_000_000

    • NORMAL DISTRIBITION
      • 0.057s+1.229s, 39063k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9655, max=1.4952
      • 0.616s+0.001s, 257k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.5000, P99=0.9657, max=1.4952
    • PARETO DISTRIBUTION
      • 0.204s+1.179s, 39063k heap, 0k stack, AllValues, min=5.0000, P50=5.3588, P99=7.9237, max=26.5894
      • 0.732s+0.001s, 257k heap, 0k stack, TDigest, min=5.0000, mean=5.5554, P50=5.3589, P99=7.9221, max=26.5894
  • MAX_SIZE=1000, BATCH=20_000, COUNT=10_000_000

    • NORMAL DISTRIBITION
      • 0.057s+1.208s, 39063k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9655, max=1.4952
      • 0.601s+0.002s, 265k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.5000, P99=0.9655, max=1.4952
    • PARETO DISTRIBUTION
      • 0.210s+1.181s, 39063k heap, 0k stack, AllValues, min=5.0000, P50=5.3588, P99=7.9237, max=26.5894
      • 0.751s+0.001s, 265k heap, 0k stack, TDigest, min=5.0000, mean=5.5554, P50=5.3588, P99=7.9230, max=26.5894
  • MAX_SIZE=5000, BATCH=20_000, COUNT=10_000_000

    • NORMAL DISTRIBITION
      • 0.062s+1.201s, 39063k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9655, max=1.4952
      • 0.646s+0.001s, 314k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.5000, P99=0.9656, max=1.4952
    • PARETO DISTRIBUTION
      • 0.209s+1.214s, 39063k heap, 0k stack, AllValues, min=5.0000, P50=5.3588, P99=7.9237, max=26.5894
      • 0.828s+0.001s, 315k heap, 0k stack, TDigest, min=5.0000, mean=5.5554, P50=5.3588, P99=7.9235, max=26.5894
  • BATCH=100, MAX_SIZE=200, COUNT=10_000_000

    • NORMAL DISTRIBITION
      • 0.062s+1.204s, 39063k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9655, max=1.4952
      • 0.638s+0.000s, 0k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.4996, P99=0.9657, max=1.4952
    • PARETO DISTRIBUTION
      • 0.213s+1.199s, 39063k heap, 0k stack, AllValues, min=5.0000, P50=5.3588, P99=7.9237, max=26.5894
      • 0.776s+0.000s, 0k heap, 0k stack, TDigest, min=5.0000, mean=5.5554, P50=5.3588, P99=7.9290, max=26.5894
  • BATCH=1_000, MAX_SIZE=200, COUNT=10_000_000

    • NORMAL DISTRIBITION
      • 0.057s+1.315s, 39063k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9655, max=1.4952
      • 0.509s+0.000s, 635k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.5003, P99=0.9658, max=1.4952
    • PARETO DISTRIBUTION
      • 0.214s+1.272s, 39063k heap, 0k stack, AllValues, min=5.0000, P50=5.3588, P99=7.9237, max=26.5894
      • 0.652s+0.000s, 635k heap, 0k stack, TDigest, min=5.0000, mean=5.5554, P50=5.3583, P99=7.9290, max=26.5894
  • BATCH=5_000, MAX_SIZE=200, COUNT=10_000_000

    • NORMAL DISTRIBITION
      • 0.059s+1.145s, 39063k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9655, max=1.4952
      • 0.515s+0.000s, 417k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.5000, P99=0.9646, max=1.4952
    • PARETO DISTRIBUTION
      • 0.261s+1.167s, 39063k heap, 0k stack, AllValues, min=5.0000, P50=5.3588, P99=7.9237, max=26.5894
      • 0.671s+0.000s, 417k heap, 0k stack, TDigest, min=5.0000, mean=5.5554, P50=5.3588, P99=7.9242, max=26.5894
  • BATCH=10_000, MAX_SIZE=200, COUNT=10_000_000

    • NORMAL DISTRIBITION
      • 0.061s+1.166s, 39063k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9655, max=1.4952
      • 0.602s+0.001s, 268k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.5001, P99=0.9647, max=1.4952
    • PARETO DISTRIBUTION
      • 0.231s+1.297s, 39063k heap, 0k stack, AllValues, min=5.0000, P50=5.3588, P99=7.9237, max=26.5894
      • 0.810s+0.001s, 268k heap, 0k stack, TDigest, min=5.0000, mean=5.5554, P50=5.3591, P99=7.9114, max=26.5894
  • BATCH=20_000, MAX_SIZE=200, COUNT=10_000_000

    • NORMAL DISTRIBITION
      • 0.063s+1.334s, 39063k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9655, max=1.4952
      • 0.683s+0.001s, 253k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.4997, P99=0.9655, max=1.4952
    • PARETO DISTRIBUTION
      • 0.218s+1.243s, 39063k heap, 0k stack, AllValues, min=5.0000, P50=5.3588, P99=7.9237, max=26.5894
      • 0.804s+0.001s, 253k heap, 0k stack, TDigest, min=5.0000, mean=5.5554, P50=5.3582, P99=7.9199, max=26.5894
  • BATCH=50_000, MAX_SIZE=200, COUNT=10_000_000

    • NORMAL DISTRIBITION
      • 0.062s+1.191s, 39063k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9655, max=1.4952
      • 0.630s+0.003s, 431k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.4998, P99=0.9659, max=1.4952
    • PARETO DISTRIBUTION
      • 0.202s+1.141s, 39063k heap, 0k stack, AllValues, min=5.0000, P50=5.3588, P99=7.9237, max=26.5894
      • 0.772s+0.003s, 431k heap, 0k stack, TDigest, min=5.0000, mean=5.5554, P50=5.3578, P99=7.9251, max=26.5894
  • BATCH=100_000, MAX_SIZE=200, COUNT=10_000_000

    • NORMAL DISTRIBITION
      • 0.057s+1.153s, 39063k heap, 0k stack, AllValues, min=-0.6524, P50=0.5000, P99=0.9655, max=1.4952
      • 0.667s+0.006s, 803k heap, 0k stack, TDigest, min=-0.6524, mean=0.5000, P50=0.5000, P99=0.9655, max=1.4952
    • PARETO DISTRIBUTION
      • 0.202s+1.148s, 39063k heap, 0k stack, AllValues, min=5.0000, P50=5.3588, P99=7.9237, max=26.5894
      • 0.810s+0.006s, 803k heap, 0k stack, TDigest, min=5.0000, mean=5.5554, P50=5.3588, P99=7.8829, max=26.5894
