support histograms #162

Merged
merged 17 commits into from Feb 19, 2013
Changes from 7 commits
28 changes: 28 additions & 0 deletions README.md
@@ -61,6 +61,34 @@ generate the following list of stats for each threshold:
Where `$KEY` is the stats key you specify when sending to statsd, and `$PCT` is
the percentile threshold.

Use the `config.histogram` setting to instruct statsd to maintain histograms
over time. Specify which metrics to match and a corresponding list of
ordered non-inclusive upper limits of bins (class intervals).
(use `inf` to denote infinity; a lower limit of 0 is assumed)
Each `flushInterval`, statsd will store how many values (absolute frequency)
fall within each bin (class interval), for all matching metrics.
Examples:

* no histograms for any timer (default): `[]`
* histogram to only track render durations,
  with unequal class intervals and catchall for outliers:

        [ { metric: 'render', bins: [ 0.01, 0.1, 1, 10, 'inf'] } ]

* histogram for all timers except 'foo' related,
  with equal class interval and catchall for outliers:

        [ { metric: 'foo', bins: [] },
          { metric: '', bins: [ 50, 100, 150, 200, 'inf'] } ]

Note:

* first match for a metric wins.
* bin upper limits may contain decimals.
* this is actually more powerful than what's strictly considered
histograms, as you can make each bin arbitrarily wide,
i.e. class intervals of different sizes.
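
The matching rule described above (first match wins, and an empty `metric` string matches every timer) can be sketched as follows. This is an illustration only: `findBins` and `exampleConfig` are hypothetical names, and the substring test mirrors the `key.indexOf(...) > -1` check in the `lib/process_metrics.js` part of this diff.

```javascript
// Sketch only: `findBins` is a hypothetical helper, not a function in statsd.
// It returns the bins of the first config entry whose `metric` string occurs
// anywhere in the timer key, or an empty list when nothing matches.
function findBins(histogramConfig, key) {
  var conf = histogramConfig || [];
  for (var i = 0; i < conf.length; i++) {
    // indexOf('') is 0 for any string, so an empty `metric` matches every key
    if (key.indexOf(conf[i].metric) > -1) {
      return conf[i].bins;
    }
  }
  return [];
}

var exampleConfig = [
  { metric: 'foo', bins: [] },
  { metric: '', bins: [50, 100, 150, 200, 'inf'] }
];

// The 'foo' entry comes first, so foo-related timers get no bins at all.
console.log(findBins(exampleConfig, 'app.foo.render'));
// Every other timer falls through to the catch-all '' entry.
console.log(findBins(exampleConfig, 'app.bar.render'));
```

Because the match is a plain substring test, the order of entries matters: a more specific entry has to be listed before a more general one that would also match.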

Gauges
------
StatsD now also supports gauges, arbitrary values, which can be recorded.
22 changes: 19 additions & 3 deletions exampleConfig.js
@@ -27,9 +27,6 @@ Optional Variables:
debugInterval: interval to print debug information [ms, default: 10000]
dumpMessages: log all incoming messages
flushInterval: interval (in ms) to flush to Graphite
percentThreshold: for time information, calculate the Nth percentile(s)
(can be a single value or list of floating-point values)
[%, default: 90]
keyFlush: log the most frequently sent keys [object, default: undefined]
interval: how often to log frequent keys [ms, default: 0]
percent: percentage of frequent keys to log [%, default: 100]
@@ -59,6 +56,25 @@ Optional Variables:
e.g. [ { host: '10.10.10.10', port: 8125 },
{ host: 'observer', port: 88125 } ]

timer:
percentThreshold: calculate the Nth percentile(s)
(can be a single value or list of floating-point values)
[%, default: 90]
histogram: an array of mappings of strings (to match metrics) and
corresponding ordered non-inclusive upper limits of bins.
For all matching metrics, histograms are maintained over
time by writing the frequencies for all bins.
'inf' means infinity. A lower limit of 0 is assumed.
default: [], meaning no histograms for any timer.
First match wins. examples:
* histogram to only track render durations, with unequal
class intervals and catchall for outliers:
[ { metric: 'render', bins: [ 0.01, 0.1, 1, 10, 'inf'] } ]
* histogram for all timers except 'foo' related,
equal class interval and catchall for outliers:
[ { metric: 'foo', bins: [] },
{ metric: '', bins: [ 50, 100, 150, 200, 'inf'] } ]

repeaterProtocol: whether to use udp4 or udp6 for repeaters.
["udp4" or "udp6", default: "udp4"]
*/
23 changes: 23 additions & 0 deletions lib/process_metrics.js
@@ -7,6 +7,7 @@ var process_metrics = function (metrics, flushInterval, ts, flushCallback) {
var counters = metrics.counters;
var timers = metrics.timers;
var pctThreshold = metrics.pctThreshold;
var histogram = metrics.histogram;

for (key in counters) {
var value = counters[key];
@@ -72,6 +73,28 @@ var process_metrics = function (metrics, flushInterval, ts, flushCallback) {
current_timer_data["sum"] = sum;
current_timer_data["mean"] = mean;

// note: values bigger than the upper limit of the last bin are ignored, by design
conf = histogram || [];
bins = [];
for (var i = 0; i < conf.length; i++) {
if (key.indexOf(conf[i].metric) > -1) {
bins = conf[i].bins;
break;
}
}
// the outer loop iterates over the bins, the inner loop over the timer values;
// each run of the inner loop only consumes the timer values that fall within the current bin.
// because the values are already sorted, the index i carries over from one bin to the next,
// so in total we make only one full pass over the values
var i = 0;
for (var bin_i = 0; bin_i < bins.length; bin_i++) {
var freq = 0;
for (; i < count && (bins[bin_i] == 'inf' || values[i] < bins[bin_i]); i++) {
freq += 1;
}
bin_name = ('bin_' + bins[bin_i]).replace('.','_');
Member:

we generally want to push key name sanitization into the backends, since a `.` separator could be completely valid in one backend while `_` is not. So I think we should remove this here.

Contributor Author:

that makes sense. would you be willing to add that commit? it's probably quicker than having a conversation about how exactly you want the sanitization to be done.

Member:

just remove the replace() part. The rest is ok I think.

current_timer_data[bin_name] = freq;
Member:

maybe it makes sense to namespace this under histogram? So that it gets added to something like current_timer_data["histogram"][bin_name], which would make the namespace a bit more hierarchical.

Contributor Author:

good idea

}

timer_data[key] = current_timer_data;

}
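
For reference, the single-pass binning loop in this hunk can be sketched as a standalone function. This is a sketch under stated assumptions, not the merged implementation: `countBins` is a hypothetical name, the input must already be sorted ascending (as the code comment in the hunk says the timer values are), and the bin names are left unsanitized as the review comments suggest.

```javascript
// Sketch only: `countBins` is a hypothetical standalone version of the loop
// in this diff. `values` must be sorted ascending.
function countBins(values, bins) {
  var histogram = {};
  var i = 0; // index into values, shared across bins so each value is visited once
  for (var b = 0; b < bins.length; b++) {
    var freq = 0;
    // upper limits are non-inclusive; 'inf' accepts every remaining value
    while (i < values.length && (bins[b] === 'inf' || values[i] < bins[b])) {
      freq += 1;
      i++;
    }
    // no '.' replacement here: key sanitization is left to the backend
    histogram['bin_' + bins[b]] = freq;
  }
  return histogram;
}

// Same data as the 'abc' timer in the test below:
// one value under 1, none in [1, 2.21), four in [2.21, inf)
console.log(countBins([0.1234, 2.89, 4, 6, 8], [1, 2.21, 'inf']));

// Without an 'inf' bin, values at or above the last upper limit are dropped
console.log(countBins([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [1, 2]));
```

Returning a separate object makes it straightforward to store the result under `current_timer_data["histogram"]`, which is the namespacing proposed in the review thread above.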
3 changes: 2 additions & 1 deletion stats.js
@@ -54,7 +54,8 @@ function flushMetrics() {
sets: sets,
counter_rates: counter_rates,
timer_data: timer_data,
pctThreshold: pctThreshold
pctThreshold: pctThreshold,
histogram: config.histogram
Member:

this doesn't work. It has to be conf here. I was also wondering: since the example config puts it under the timer key, this should probably be conf.timer.histogram. pctThreshold is also in the wrong spot there and I need to fix it.

}

// After all listeners, reset the stats
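
The review comment in this hunk suggests reading the setting from the `timer` key of the config. A minimal sketch of such a lookup follows. It assumes the nested layout documented in `exampleConfig.js` in this diff; `getTimerSettings` is a hypothetical helper, not something that exists in statsd, and the defaults are the ones documented there (90 and `[]`).

```javascript
// Sketch only: `getTimerSettings` is a hypothetical helper. It assumes timer
// settings are nested under `conf.timer`, as the example config documents.
function getTimerSettings(conf) {
  var timer = (conf && conf.timer) || {};
  return {
    // documented default: 90
    pctThreshold: timer.percentThreshold !== undefined ? timer.percentThreshold : 90,
    // documented default: [], meaning no histograms for any timer
    histogram: timer.histogram || []
  };
}

var settings = getTimerSettings({
  timer: { histogram: [{ metric: 'render', bins: [0.01, 0.1, 1, 10, 'inf'] }] }
});
console.log(settings.pctThreshold, settings.histogram.length);
```

Falling back to an empty list keeps the flush path working for configs that do not define a `timer` section at all.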
53 changes: 53 additions & 0 deletions test/process_metrics_tests.js
@@ -115,6 +115,59 @@ module.exports = {
test.equal(150, timer_data.mean_80);
test.equal(200, timer_data.upper_80);
test.equal(300, timer_data.sum_80);
test.done();
},
// check that the correct settings are being applied, as well as the actual counts
timers_histogram: function (test) {
test.expect(45);
this.metrics.timers['a'] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
this.metrics.timers['abc'] = [0.1234, 2.89, 4, 6, 8];
this.metrics.timers['foo'] = [0, 2, 4, 6, 8];
this.metrics.timers['barbazfoobar'] = [0, 2, 4, 6, 8];
this.metrics.timers['bar.bazfoobar.abc'] = [0, 2, 4, 6, 8];
this.metrics.timers['xyz'] = [0, 2, 4, 6, 8];
this.metrics.histogram = [ { metric: 'foo', bins: [] },
{ metric: 'abcd', bins: [ 1, 5, 'inf'] },
{ metric: 'abc', bins: [ 1, 2.21, 'inf'] },
{ metric: 'a', bins: [ 1, 2] } ];
pm.process_metrics(this.metrics, 100, this.time_stamp, function(){});
timer_data = this.metrics.timer_data;
// nothing matches the 'abcd' config, so nothing has bin_5
test.equal(undefined, timer_data['a']['bin_5']);
test.equal(undefined, timer_data['abc']['bin_5']);
test.equal(undefined, timer_data['foo']['bin_5']);
test.equal(undefined, timer_data['barbazfoobar']['bin_5']);
test.equal(undefined, timer_data['bar.bazfoobar.abc']['bin_5']);
test.equal(undefined, timer_data['xyz']['bin_5']);

// check that 'a' got the right config and numbers
test.equal(0, timer_data['a']['bin_1']);
test.equal(1, timer_data['a']['bin_2']);
test.equal(undefined, timer_data['a']['bin_inf']);

// only 'abc' should have a bin_inf; also check all its counts,
// and make sure it has no other bins
// amount of non-bin_ keys: std, upper, lower, count, sum, mean -> 6
test.equal(1, timer_data['abc']['bin_1']);
test.equal(0, timer_data['abc']['bin_2_21']);
test.equal(4, timer_data['abc']['bin_inf']);
for (key in timer_data['abc']) {
test.ok(key.indexOf('bin_') < 0 || key == 'bin_1' || key == 'bin_2_21' || key == 'bin_inf');
}

// 'foo', 'barbazfoobar', 'bar.bazfoobar.abc' and 'xyz' should not have any bin
for (key in timer_data['foo']) {
test.ok(key.indexOf('bin_') < 0);
}
for (key in timer_data['barbazfoobar']) {
test.ok(key.indexOf('bin_') < 0);
}
for (key in timer_data['bar.bazfoobar.abc']) {
test.ok(key.indexOf('bin_') < 0);
}
for (key in timer_data['xyz']) {
test.ok(key.indexOf('bin_') < 0);
}

test.done();
},
statsd_metrics_exist: function(test) {