Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support histograms #162

Merged
merged 17 commits into from
Feb 19, 2013
Merged

support histograms #162

merged 17 commits into from
Feb 19, 2013

Conversation

Dieterbe
Copy link
Contributor

@Dieterbe Dieterbe commented Oct 8, 2012

in accordance with existing codebase, implement this in graphite
backend

@Dieterbe
Copy link
Contributor Author

Dieterbe commented Oct 8, 2012

note that unlike traditional histograms which must have equally sized class intervals, here you can choose arbitrarily wide bins that even span to infinity, without any negative downside, should you want to. so this "bin categorisation" is actually more powerful and flexible as compared to being restricted to class intervals, but you can easily use it as a standard histogram too.

Dieter Plaetinck added 2 commits October 8, 2012 15:28
in accordance with existing codebase, implement this in graphite
backend
@Dieterbe
Copy link
Contributor Author

thoughts @mrtazz ?

@mrtazz
Copy link
Member

mrtazz commented Oct 27, 2012

If I understand it correctly, this will add a metric for timers for every bin with the corresponding frequency into graphite. However, since bin sizes are configured statically in the config, they will only fit for a subset of the data. I'm not sure if this is really helpful?

@Dieterbe
Copy link
Contributor Author

well, the only other alternative I can see is some rules system where we configure the bins by matching against metric paths (similar to the method in graphites' storage-schemas.conf). I would be fine adding this (and actually considered it), but it was my understanding we didn't want statsd that configurable.

@mrtazz
Copy link
Member

mrtazz commented Nov 3, 2012

Yeah I think if we want to have histograms in StatsD, we need a way to define buckets based on the metrics path. I don't think they make sense otherwise.

@Dieterbe
Copy link
Contributor Author

Dieterbe commented Nov 5, 2012

so, I just implemented that. I'm happily running this code and find it quite useful.

pro tip it's super easy to get a relative frequency expressed as % across all bins of a given metric, like so:

target=scale(divideSeries(stats.timers.render_time.bin_*,stats.timers.render_time.count),100)

@@ -152,7 +173,8 @@ var backend_status = function graphite_status(writeCb) {
}
};

exports.init = function graphite_init(startup_time, config, events) {
exports.init = function graphite_init(startup_time, conf, events) {
config = conf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you keep the pattern of pulling the config data in here like histogram = config.histogram or is there a need to change this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered that, but IMHO a variable name like 'histogram' isn't a good name for "configuration data pertaining to histograms".

I also realized, as we add configurable behavior, it doesn't make much sense to split out bits of the config object into new objects, it also make things more complicated.

So I suggest that each backend just has a ref. to the full config object. that way new configurable logic can use the config data without needing to add more of these subsection = config.subsection things

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense.

@Dieterbe
Copy link
Contributor Author

Dieterbe commented Dec 9, 2012

i updated this to be on par with master again. also I added a bunch of tests.
note: we're running the previous version in production, this version I haven't tested in production yet, but the tests are pretty extensive...

freq += 1;
}
bin_name = ('bin_' + bins[bin_i]).replace('.','_');
current_timer_data[bin_name] = freq;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it makes sense to namespace this under histogram? So that it get's added to something like current_timer_data["histogram"][bin_name], would make the namespace a bit more hierarchical.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea

@@ -54,7 +54,8 @@ function flushMetrics() {
sets: sets,
counter_rates: counter_rates,
timer_data: timer_data,
pctThreshold: pctThreshold
pctThreshold: pctThreshold,
histogram: config.histogram
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't work. It has to be conf here. I was also wondering, since the example config puts it under the timer key this should probably be conf.timer.histogram. pctThreshold is also in the wrong spot there and I need to fix it.

@mrtazz
Copy link
Member

mrtazz commented Jan 14, 2013

Thanks for all the integration work! Unfortunately the histogram sub-hierarchy (though it's good and we should keep it) breaks graphite sending, since there is now an Object and the backend sends e.g. stats.timers.foo.histogram [object Object] 1358164063. Could you fix this?

@Dieterbe
Copy link
Contributor Author

  • removed the .replace('.','_') , graphite backend probably needs to implement that now.
  • as for the configuration I believe this can be simplified a lot. I would suggest that the "merging with defaults, listifying" and such all gets applied to the config object (once, at startup) and then that object gets passed around to wherever the configuration is needed. there's no need to put separate config params into different vars which must then be passed separately, etc. also config should not go into the metrics variable, cause config != metrics. see the last commit in https://github.com/Dieterbe/statsd/tree/histogram-config-refactor (note that this is unfinished, breaks tests. but should show what I mean)
  • graphite sending stats.timers.foo.histogram [object Object] -> this is because it assumes no further deeper hierarchy when composing the statString for timer_data. i don't really have time to fix that right now. stay tuned (or feel free to fix yourself :))

@Dieterbe
Copy link
Contributor Author

@mrtazz does that last commit in histogram-config-refactor look allright? it demonstrates how i think the config should be passed around. then i'll add it to the histogram branch

Dieterbe and others added 3 commits February 11, 2013 18:23
also slight optimisation in the metric setting loop for timers
…-integration

Conflicts:
	backends/graphite.js
	exampleConfig.js
@mrtazz
Copy link
Member

mrtazz commented Feb 19, 2013

Awesome work! Thanks for taking the time to do this!

mrtazz added a commit that referenced this pull request Feb 19, 2013
@mrtazz mrtazz merged commit 817738f into statsd:master Feb 19, 2013
@Dieterbe
Copy link
Contributor Author

Dieterbe commented Apr 4, 2013

@mrtazz it seems the "key name sanitation" never went in the graphite backend. so the decimal in the bin causes graphite to create a new dir.

/var/lib/carbon/whisper/stats/timers/render_time/histogram/bin_0
/var/lib/carbon/whisper/stats/timers/render_time/histogram/bin_1.wsp
/var/lib/carbon/whisper/stats/timers/render_time/histogram/bin_10.wsp
/var/lib/carbon/whisper/stats/timers/render_time/histogram/bin_5.wsp
/var/lib/carbon/whisper/stats/timers/render_time/histogram/bin_50.wsp
/var/lib/carbon/whisper/stats/timers/render_time/histogram/bin_inf.wsp
/var/lib/carbon/whisper/stats/timers/render_time/histogram/bin_0/01.wsp
/var/lib/carbon/whisper/stats/timers/render_time/histogram/bin_0/05.wsp
/var/lib/carbon/whisper/stats/timers/render_time/histogram/bin_0/1.wsp
/var/lib/carbon/whisper/stats/timers/render_time/histogram/bin_0/5.wsp

hbouvier pushed a commit to hbouvier/statsd that referenced this pull request May 25, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants