A metrics aggregation daemon like statsd, written in Go, with a bunch of extra features. (Based on code from Bitly's statsdaemon.)
For a given input, this implementation yields the exact same metrics as Etsy's statsd (with legacy namespace and deleteIdleStats enabled), so it can act as a drop-in replacement. It supports the following metric types (example input lines follow the list):
- Timing (with optional percentiles, sampling supported)
- Counters (sampling supported)
- Gauges
- No histograms or sets yet, but they should be easy to add if you want them
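Input arrives in the familiar statsd wire format, `<key>:<value>|<type>[|@<sample rate>]`. A few made-up example lines (a timer sampled at 10%, a counter, and a gauge):

```
render_time:320|ms|@0.1
page_views:1|c
queue_depth:42|g
```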
metrics 2.0 is a format for structured, self-describing, standardized metrics.
Metrics that flow through statsdaemon and are detected to be in the metrics 2.0 format undergo the same operations and aggregations, but how this is reflected in the resulting metric identifier is different:
- traditional/legacy metrics get prefixes and suffixes like the original statsd
- metrics in the 2.0 format get the appropriate adjustments to their tags instead: statsdaemon ensures that tags such as unit, target_type, stat, etc. reflect the operation that was performed, according to the specification (see the sketch below). This allows users and advanced tools such as Graph-Explorer to truly understand metrics and leverage them.
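As a rough, hypothetical sketch of the difference (the actual tag handling follows the metrics 2.0 specification; the metric names and tags here are made up), a counter flushed as a per-second rate might end up named like this:

```
# legacy format: the key gets a prefix
web.hits              ->  stats.web.hits

# metrics 2.0 format: tags are adjusted instead of the key being prefixed
site=web.unit=Req     ->  site=web.unit=Req/s
```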
TBA.
This is perhaps debatable and a matter of personal opinion, but people seem to agree that Go is more robust, easier to deploy, and more elegant than node.js. In terms of performance, I didn't do extensive or scientific benchmarking, but switching from statsd to statsdaemon, with the same input load and the same things being calculated, noticeably reduced our CPU usage and calculation time.
The admin interface listens on TCP (admin_addr, default port 8126) and supports the following commands:

```
help                      show this menu

sample_rate <metric key>  for the given metric, show:
                          <key> <ideal sample rate> <Pckt/s sent (estim)>

metric_stats              in the past 10s interval, for every metric show:
                          <key> <Pckt/s sent (estim)> <Pckt/s received>

peek_valid                stream all valid lines seen in real time,
                          until you disconnect or can't keep up.

peek_invalid              stream all invalid lines seen in real time,
                          until you disconnect or can't keep up.

wait_flush                after the next flush, writes 'flush' and closes the connection.
                          this is convenient to restart statsdaemon
                          with minimal loss of data, like so:
                          nc localhost 8126 <<< wait_flush && /sbin/restart statsdaemon
```
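For example, assuming the default admin port of 8126, you can print this menu from a shell with:

```
echo help | nc localhost 8126
```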
Statsdaemon submits a bunch of internal performance metrics through its own pipeline. Note that these metrics are in the metrics 2.0 format; they may look a bit unusual, but you can treat them as regular graphite metrics if you want to. With carbon-tagger and Graph-Explorer, however, they become much more useful.
Install with:

```
go get github.com/Vimeo/statsdaemon
```
```
Usage of ./statsdaemon:
  -config_file="/etc/statsdaemon.ini": config file location
  -cpuprofile="": write cpu profile to file
  -debug=false: print statistics sent to graphite
  -memprofile="": write memory profile to this file
  -version=false: print version string
```
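For example, to run with a custom config file and print what gets sent to graphite, using only the flags shown above (the path is a placeholder):

```
./statsdaemon -config_file=/path/to/statsdaemon.ini -debug=true
```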
An example config file (by default read from /etc/statsdaemon.ini, see the config_file flag above):

```
listen_addr = ":8125"
admin_addr = ":8126"
graphite_addr = "127.0.0.1:2003"
flush_interval = 60

prefix_rates = "stats."
prefix_timers = "stats.timers."
prefix_gauges = "stats.gauges."

percentile_thresholds = "90,75"
max_timers_per_s = 1000
```
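As a hypothetical illustration of how these settings shape the traditional/legacy output: with `percentile_thresholds = "90,75"` and the prefixes above, a timer key such as `render_time` (a made-up name) would, each flush interval, produce graphite series along these lines:

```
stats.timers.render_time.mean
stats.timers.render_time.upper
stats.timers.render_time.upper_90
stats.timers.render_time.upper_75
```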