Batched send, reliability fixes, multiple graphing backends, package.json #56

Closed
wants to merge 4 commits into
from

Projects

None yet

6 participants

@wickedchicken
Contributor

This is a rather large pull request, one we've split into multiple commits to help ease facilitation into the mainline Etsy codebase. The test infrastructure we've previously developed was intended to ensure these sets of changes don't include regressions.

Note that these patches are rebased and distilled code from the librato-statsd fork. Some of the changes were authored by @adelcambre and @till and integrated into our history. The process of rebasing for creation of Etsy-friendly patchsets has erased that authorship information, and it's not certain if it's possible to edit the history to say so.

The first two commits are cleanup patches from @till which clean up the main directory and remove spurious warnings respectively. The 3rd commit performs a restructure of statsd's sending logic. Specifically it enables syslog support if 'node-syslog' is installed and working, splits statsd's Graphite code into an 'output pipeline' which allows for batched sends, and ensures that new pipeline can submit to several graphing backends at once. Finally, the last patch inserts Librato metrics as a new graphing backend module.

The first two patches should be self-explanatory, but the 3rd should warrant some discussions of the engineering decisions behind its design. The current statsd code has a specific target in mind: a medium number of metrics being delivered to Graphite. Our graphing infrastructure requires handling tens of thousands of metrics simultaneously. As our API is JSON based, we needed a way to split large submissions into smaller, easier-to-handle chunks automatically. This also provides a layer of fault-tolerance and modularity for both Graphite and our API; smaller chunks can be processed and tracked independently of each other both on the network and in the daemon.

With respect to modularity, stats now go through a pipeline: initial data collection, batching, and postprocessing. Postprocessing at the innermost level allows graphing backend-specific code to be localized while the outer maps are common, keeping the codebase clean.

Finally, the last patch exploits the modularity of the 3rd patch to add Librato Metrics as a configurable graphing backend. Note that each graphing backend works independently of each other and can work simultaneously. We have included a test which verifies that this is in fact true. It is our hope that other graphing backends can be added and tested in a similar manner.

@josephruscio

Any thoughts on this? Counting the previous submission that we attempted to improve upon here, it's been almost 5 months now. We're definitely willing to help out in anyway we can. We were thinking about breaking this up into 4 different pull requests if you'd prefer that? Just would like to know that you're interested in merging this upstream before putting more time into it.

@mrtazz
Member
mrtazz commented Mar 29, 2012

Thanks a lot for the Pull Request and sorry for the long delay. It looks really interesting, especially the support for librato metrics. One thing though is that we want to keep the core StatsD as simple as possible as this is definitely one of its strengths. So instead of the switch/case, pluggable graphing backends (maybe with modules?) would be the way to go here. Would you be up to taking a stab at this? It might make sense to split up the Pull Requests then and at least have a distinct one for the pluggable backend.

@josephruscio

@mrtazz thanks for getting back to us :-). That sounds like a workable plan, we'll get those separate requests set up and then close this request with links to them.

@mheffner
Contributor
mheffner commented Apr 2, 2012

@mrtazz

Pluggable backend support has been moved to #69, with support for graphite as a backend. I appreciate any feedback you can provide on that.

@mrtazz
Member
mrtazz commented Jun 8, 2012

Any interest in splitting up the remaining changes in separate Pull Requests so that they are easier to test and integrate?

@mheffner
Contributor
mheffner commented Jun 8, 2012

@mrtazz Yeah, I'll take a look at extracting the remaining parts and will push them up as separate PRs.

@timbunce
Contributor
timbunce commented Dec 7, 2012

@mheffner any news on extracting the remaining parts and pushing them up as separate PRs?

@mheffner
Contributor

I think this can be closed, as the parts have merged in separate pull requests.

@draco2003
Contributor

Thanks!

@draco2003 draco2003 closed this May 7, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment