
Non-uniform histogram #81

Open
arthurprs opened this issue May 19, 2017 · 19 comments

@arthurprs (Author) commented May 19, 2017

The current histogram is a great uniform histogram, but it's a poor metrics histogram most of the time, because the first data point carries the same weight as the most recent one.

This lib should provide some sort of decaying histogram as well.

@posix4e (Owner) commented May 19, 2017

@arthurprs Stupid question: is there any reason we wouldn't just use two histograms, each half the size, one for the short term and one for the long term? Can you provide some examples?

@posix4e (Owner) commented May 19, 2017

@brayniac might have some opinions here

@brayniac (Contributor)

@arthurprs - I'm curious if you know of a stats library that handles histograms that way. What I've seen is that histograms are latched over some integration window - e.g. record for 60s, extract the data, clear the histogram, and record again. This gives you access to the percentiles and other derived metrics for that minute. This is how I did it in my time-interval stats library (https://github.com/brayniac/tic).
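For concreteness, here is a minimal sketch of the latched scheme described above: record into a histogram for one integration window, summarize it, then clear it and start the next window. The type and method names are made up for this illustration (not tic's actual API), and it stores raw samples rather than buckets only to keep the sketch short.

```rust
use std::time::{Duration, Instant};

/// Illustrative latched recorder (made-up names, not tic's API): values
/// accumulate for one integration window, then the window is summarized
/// and cleared.
struct LatchedHistogram {
    window: Duration,
    window_start: Instant,
    samples: Vec<u64>,
}

impl LatchedHistogram {
    fn new(window: Duration) -> Self {
        LatchedHistogram {
            window,
            window_start: Instant::now(),
            samples: Vec::new(),
        }
    }

    /// Record a value; if the integration window has elapsed, summarize the
    /// finished window first and start a fresh one.
    fn record(&mut self, value: u64) {
        if self.window_start.elapsed() >= self.window {
            self.latch();
        }
        self.samples.push(value);
    }

    /// Summarize the current window, then clear it.
    fn latch(&mut self) {
        if !self.samples.is_empty() {
            self.samples.sort_unstable();
            let p = |q: f64| self.samples[((self.samples.len() - 1) as f64 * q).round() as usize];
            println!("count={} p50={} p99={}", self.samples.len(), p(0.50), p(0.99));
        }
        self.samples.clear();
        self.window_start = Instant::now();
    }
}

fn main() {
    let mut h = LatchedHistogram::new(Duration::from_secs(60));
    for v in 0..1000 {
        h.record(v);
    }
    h.latch(); // flush the partial window at the end of the run
}
```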

@arthurprs (Author)

In the Prometheus-style libraries, the "decay sampling" histograms are called Summaries instead, sort of.

@brayniac (Contributor)

@arthurprs - cool! Thanks for the links. I need to do some reading before I have more to say. I'm curious to learn when decaying might be a better fit than latched.

@sfackler (Contributor) commented May 24, 2017

I ended up implementing a port of Java Metrics' ExponentiallyDecayingReservoir to swap out the HDR histogram that metrics uses by default - snapshots are about 100x faster, in addition to having time decay. I can push it upstream if people are interested.
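For readers who don't know the Metrics reservoir: the core idea is forward-decaying priority sampling. Each update gets a priority of w / u, where w = exp(alpha * (t - landmark)) and u is uniform in (0, 1], and only the N highest-priority samples are kept, so newer values progressively displace older ones. The sketch below illustrates that idea under those assumptions; it is not sfackler's gist, and it omits the periodic rescaling a real implementation needs to keep the weights from overflowing.

```rust
use rand::Rng; // rand = "0.8" (version assumed for this sketch)
use std::time::Instant;

/// Simplified forward-decaying reservoir: each update gets priority
/// w / u with w = exp(alpha * seconds since the landmark) and u uniform
/// in (0, 1], and only the `size` highest-priority samples are kept.
struct DecayingReservoir {
    alpha: f64,               // decay rate per second
    size: usize,              // number of samples to retain
    landmark: Instant,        // reference time for the weights
    samples: Vec<(f64, u64)>, // (priority, value)
}

impl DecayingReservoir {
    fn new(size: usize, alpha: f64) -> Self {
        DecayingReservoir {
            alpha,
            size,
            landmark: Instant::now(),
            samples: Vec::new(),
        }
    }

    fn update(&mut self, value: u64) {
        let elapsed = self.landmark.elapsed().as_secs_f64();
        let weight = (self.alpha * elapsed).exp();
        // 1.0 - gen::<f64>() lies in (0, 1], so the division is safe.
        let u = 1.0 - rand::thread_rng().gen::<f64>();
        let priority = weight / u;

        if self.samples.len() < self.size {
            self.samples.push((priority, value));
        } else if let Some(lowest) = self
            .samples
            .iter_mut()
            .min_by(|a, b| a.0.partial_cmp(&b.0).unwrap())
        {
            // Evict the lowest-priority sample if the new one outranks it.
            if priority > lowest.0 {
                *lowest = (priority, value);
            }
        }
    }

    /// The retained values; percentiles are computed from this sample.
    fn values(&self) -> Vec<u64> {
        self.samples.iter().map(|&(_, v)| v).collect()
    }
}
```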

@arthurprs (Author)

Please do; if you put it on GitHub I can cherry-pick. I'm not sure about @posix4e's position, since he marked the crate deprecated.

@brayniac (Contributor)

I wonder if this is something we should support in tic. Would appreciate benchmarks/links.

@arthurprs (Author)

I posted links up thread.

@brayniac (Contributor)

@arthurprs - sorry, the links/benches request was for @sfackler - I was replying from mobile =)

@sfackler (Contributor)

Here's the histogram implementation: https://gist.github.com/sfackler/2ea5c5b2ee0cc1447b9ae415292adaba

I don't have the exact code I used for the benchmark comparison, but IIRC it was just loading up an HDR histogram and the histogram in the gist with default settings, filling them with 0..1000, and then generating a snapshot (count, min, max, mean, stddev, p50, p75, p95, p98, p99, p999) in the Bencher::iter closure. The average times were 3,544,088 ns for HDR vs 36,804 ns for the new one.
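The snapshot gap is plausible because a reservoir snapshot only has to process a small retained sample (~1028 values by default in the Java implementation) rather than scan the full bucket range of an HDR histogram. Roughly, the per-snapshot work looks like the sketch below (std-only, names illustrative, not the code from the gist):

```rust
/// Derived statistics computed from a reservoir's retained sample.
struct Snapshot {
    count: usize,
    min: u64,
    max: u64,
    mean: f64,
    stddev: f64,
    p50: u64,
    p99: u64,
    p999: u64,
}

/// Sort the bounded sample once, then read percentiles by index and
/// compute the moments in a single pass over at most ~1k values.
fn snapshot(mut sample: Vec<u64>) -> Snapshot {
    assert!(!sample.is_empty(), "snapshot of an empty sample");
    sample.sort_unstable();

    let count = sample.len();
    let mean = sample.iter().map(|&v| v as f64).sum::<f64>() / count as f64;
    let variance = sample
        .iter()
        .map(|&v| (v as f64 - mean).powi(2))
        .sum::<f64>()
        / count as f64;
    let q = |p: f64| sample[((count - 1) as f64 * p).round() as usize];

    Snapshot {
        count,
        min: sample[0],
        max: sample[count - 1],
        mean,
        stddev: variance.sqrt(),
        p50: q(0.50),
        p99: q(0.99),
        p999: q(0.999),
    }
}
```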

@posix4e (Owner) commented May 25, 2017 via email

@sfackler (Contributor)

Update times are a bit slower since it has to pull the current time, but it's still decently fast - maybe 100 ns? I don't have hard numbers for that written down anywhere.

@sfackler (Contributor)

Cool, I'll push one up today.

@brayniac (Contributor) commented May 25, 2017

@sfackler - thanks. I hacked together a quick benchmark of the update operation for the proposed decaying histogram: https://github.com/brayniac/histobench

TLDR:

```
     Running target/release/deps/bench_decaying-c16fb48d5628f0a5

running 1 test
test decaying_increment ... bench:         119 ns/iter (+/- 9)

test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured; 0 filtered out

     Running target/release/deps/bench_histogram-984d88d3e7dfe1ad

running 1 test
test brayniac_increment ... bench:           7 ns/iter (+/- 0)

test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured; 0 filtered out
```

Percentiles in my histogram crate should be fairly cheap - and ideally we can handle those kinds of things without blocking the threads we're trying to get metrics from. The cost per increment here might be too high for something like tic - 119ns limits us to no more than 10M increments/s - staying closer to 10ns allows for up to 100M/s.

EDIT: it looks like calculating p90 and stddev is a lot cheaper with the proposed decaying histogram. The histobench repo has been updated with additional coverage.
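For the arithmetic behind those ceilings: a single recording thread can do at most 1e9 / (ns per increment) updates per second, so 119 ns/op works out to about 8.4M/s (within the "no more than 10M/s" ceiling above) and 10 ns/op to 100M/s. A trivial helper making that explicit:

```rust
/// Single-threaded throughput ceiling implied by a per-increment cost.
fn max_increments_per_sec(ns_per_op: f64) -> f64 {
    1.0e9 / ns_per_op
}

fn main() {
    // 119 ns/op -> ~8.4M/s, 10 ns/op -> 100M/s, the measured 7 ns/op -> ~143M/s.
    for ns in [119.0, 10.0, 7.0] {
        println!("{:>5.0} ns/op -> {:.1}M increments/s", ns, max_increments_per_sec(ns) / 1e6);
    }
}
```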

@sfackler (Contributor)

Oh yeah, if you have very update-heavy workloads, then exponential time decay is not going to work well.

@sfackler (Contributor)

I published this at https://crates.io/crates/exponential-decay-histogram
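A minimal usage sketch, assuming the crate's API roughly mirrors the Java reservoir it ports (mutable updates plus an immutable snapshot that exposes quantiles); the method names and value type here are assumptions, so check the crate docs before relying on them:

```rust
// Hypothetical usage sketch; the method names and the i64 value type are
// assumed from the Java Metrics reservoir this crate ports and may not
// match the published API exactly - check the crate docs.
use exponential_decay_histogram::ExponentialDecayHistogram;

fn main() {
    let mut histogram = ExponentialDecayHistogram::new();

    for latency_us in [120i64, 340, 95, 2_100, 180] {
        histogram.update(latency_us); // recent updates carry more weight
    }

    let snapshot = histogram.snapshot(); // cheap: only the retained sample
    println!(
        "p50={} p99={} max={}",
        snapshot.value(0.50),
        snapshot.value(0.99),
        snapshot.max()
    );
}
```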

@posix4e (Owner) commented May 30, 2017 via email
