Support gathering "top" percentile statistics #200

Merged
merged 1 commit into from

3 participants

@zwily

Adds support for collecting statistics on top percentiles, instead of
the default bottom percentiles. You specify a top percentile by
using a negative number - so -10 will collect the top 10% of data. It
will emit: mean_top10, lower_top10, and sum_top10.

Using a negative number may seem hacky, but it's convenient and there
is a precedent - referencing an array from the end in some languages
can be done with negative indexes.
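
As an illustration of the naming scheme described above, here is a minimal sketch that lists the stat names emitted for each threshold. The helper `timerStatNames` is hypothetical and not part of statsd; it only mirrors the name cleanup visible in this pull request's diff of lib/process_metrics.js.

```javascript
// Hypothetical helper, not part of statsd: returns the timer stat names
// produced for one percentThreshold entry, using the same string cleanup
// as the patched lib/process_metrics.js in this pull request.
function timerStatNames(pct) {
  // '.' becomes '_' and a leading '-' becomes 'top':
  // -10 -> "top10", 99.9 -> "99_9"
  var cleanPct = ('' + pct).replace('.', '_').replace('-', 'top');
  // Bottom percentiles report an upper boundary, top percentiles a lower one.
  var boundary = pct > 0 ? 'upper_' : 'lower_';
  return ['mean_' + cleanPct, boundary + cleanPct, 'sum_' + cleanPct];
}

// Example: one bottom threshold and one top threshold configured together.
var percentThreshold = [90, -10];
percentThreshold.forEach(function (pct) {
  console.log(pct + ': ' + timerStatNames(pct).join(', '));
});
```

With `percentThreshold: [90, -10]`, this prints the familiar `mean_90`, `upper_90`, `sum_90` names alongside `mean_top10`, `lower_top10`, `sum_top10`.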

@mrtazz
Owner

What's the use case for this? It seems like you would only get outliers of your data then, which doesn't really seem useful?

@zwily

Sometimes you want the outliers. I want to know what the worst experience people are getting is like, as well as what's typical. Werner Vogels talked about this at re:Invent last year - he said at Amazon they religiously look at the upper 99.9% performance as one of their key metrics, and obviously suggested everyone do the same.

I think it makes sense.

@abh

:+1: for this; I have a similar need for my application (DNS servers for http://www.ntppool.org/ ).

The outliers are really extreme compared to the regular load but very regular and I have to do my capacity planning based on those outliers.

@mrtazz
Owner

makes sense yeah. @zwily would you be up to basing this onto the newest master to make it easier to merge?

@zwily zwily Support gathering "top" percentile statistics
f369dfa
@zwily

@mrtazz - Yep. Rebased on master.

@mrtazz
Owner

perfect, thanks for contributing this!

@mrtazz mrtazz merged commit 045b1de into from
Commits on Mar 16, 2013
  1. @zwily

    Support gathering "top" percentile statistics

    zwily authored
1  exampleConfig.js
@@ -29,6 +29,7 @@ Optional Variables:
flushInterval: interval (in ms) to flush to Graphite
percentThreshold: for time information, calculate the Nth percentile(s)
(can be a single value or list of floating-point values)
+ negative values mean to use "top" Nth percentile(s) values
[%, default: 90]
keyFlush: log the most frequently sent keys [object, default: undefined]
interval: how often to log frequent keys [ms, default: 0]
20 lib/process_metrics.js
@@ -34,24 +34,32 @@ var process_metrics = function (metrics, flushInterval, ts, flushCallback) {
var sum = min;
var mean = min;
- var maxAtThreshold = max;
+ var thresholdBoundary = max;
var key2;
for (key2 in pctThreshold) {
var pct = pctThreshold[key2];
if (count > 1) {
- var numInThreshold = Math.round(pct / 100 * count);
+ var numInThreshold = Math.round(Math.abs(pct) / 100 * count);
+ if (numInThreshold === 0) {
+ continue;
+ }
- maxAtThreshold = values[numInThreshold - 1];
- sum = cumulativeValues[numInThreshold - 1];
+ if (pct > 0) {
+ thresholdBoundary = values[numInThreshold - 1];
+ sum = cumulativeValues[numInThreshold - 1];
+ } else {
+ thresholdBoundary = values[count - numInThreshold];
+ sum = cumulativeValues[count - 1] - cumulativeValues[count - numInThreshold - 1];
+ }
mean = sum / numInThreshold;
}
var clean_pct = '' + pct;
- clean_pct = clean_pct.replace('.', '_');
+ clean_pct = clean_pct.replace('.', '_').replace('-', 'top');
current_timer_data["mean_" + clean_pct] = mean;
- current_timer_data["upper_" + clean_pct] = maxAtThreshold;
+ current_timer_data[(pct > 0 ? "upper_" : "lower_") + clean_pct] = thresholdBoundary;
current_timer_data["sum_" + clean_pct] = sum;
}
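
The top-percentile branch in the hunk above can be read in isolation as the following sketch. The function name `topPercentileStats` and its return shape are assumptions made for illustration, not statsd's API, and it omits the `count > 1` check of the original loop. It also adds one guard that the diff as shown does not have: when the threshold covers every value (for example -100), `count - numInThreshold - 1` is -1, `cumulativeValues[-1]` is `undefined` in JavaScript, and the subtraction appears to yield `NaN`.

```javascript
// Illustrative sketch only; topPercentileStats is a hypothetical name.
// Computes lower boundary, sum, and mean of the top |pct| percent of timings.
function topPercentileStats(timings, pct) {
  // Sort ascending on a copy so the caller's array is left untouched.
  var values = timings.slice().sort(function (a, b) { return a - b; });
  var count = values.length;

  // cumulativeValues[i] holds the sum of values[0..i].
  var cumulativeValues = [];
  var running = 0;
  for (var i = 0; i < count; i++) {
    running += values[i];
    cumulativeValues.push(running);
  }

  var numInThreshold = Math.round(Math.abs(pct) / 100 * count);
  if (numInThreshold === 0) {
    return null; // nothing falls inside the threshold
  }

  var firstIndex = count - numInThreshold;
  // Assumption beyond the diff: treat the sum "below" the slice as 0 when
  // the slice starts at index 0, instead of reading cumulativeValues[-1].
  var below = firstIndex > 0 ? cumulativeValues[firstIndex - 1] : 0;
  var sum = cumulativeValues[count - 1] - below;

  return {
    lower: values[firstIndex],
    sum: sum,
    mean: sum / numInThreshold
  };
}

// Same data as the timers_multiple_times_single_top_percentile test.
console.log(topPercentileStats([10, 10, 10, 10, 10, 10, 10, 10, 100, 200], -20));
```

For the test data used in this pull request, the top 20% is the pair 100 and 200, so the sketch reports a lower boundary of 100, a sum of 300, and a mean of 150, matching the expectations in test/process_metrics_tests.js.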
22 test/process_metrics_tests.js
@@ -184,6 +184,28 @@ module.exports = {
test.done();
},
+ timers_single_time_single_top_percentile: function(test) {
+ test.expect(3);
+ this.metrics.timers['a'] = [100];
+ this.metrics.pctThreshold = [-10];
+ pm.process_metrics(this.metrics, 100, this.time_stamp, function(){});
+ timer_data = this.metrics.timer_data['a'];
+ test.equal(100, timer_data.mean_top10);
+ test.equal(100, timer_data.lower_top10);
+ test.equal(100, timer_data.sum_top10);
+ test.done();
+ },
+ timers_multiple_times_single_top_percentile: function(test) {
+ test.expect(3);
+ this.metrics.timers['a'] = [10, 10, 10, 10, 10, 10, 10, 10, 100, 200];
+ this.metrics.pctThreshold = [-20];
+ pm.process_metrics(this.metrics, 100, this.time_stamp, function(){});
+ timer_data = this.metrics.timer_data['a'];
+ test.equal(150, timer_data.mean_top20);
+ test.equal(100, timer_data.lower_top20);
+ test.equal(300, timer_data.sum_top20);
+ test.done();
+ },
statsd_metrics_exist: function(test) {
test.expect(1);
pm.process_metrics(this.metrics, 100, this.time_stamp, function(){});