Ruby statsd


A network daemon for aggregating statistics (counters and timers), rolling them up, then sending them to Graphite or MongoDB.


gem install statsd


Create config.yml to your liking. There are 2 flush protocols: graphite and mongo. The former simply sends to carbon every flush interval. The latter flushes to MongoDB capped collections for 10s and 1min intervals.

Example config.yml

port: 8125

# Flush interval should be your finest retention in seconds
flush_interval: 10        

# Graphite
graphite_host: localhost
graphite_port: 2003

# Mongo
mongo_host: localhost
mongo_database: statsdb

# If you change these, you need to delete the capped collections yourself!
# Average mongo record size is 152 bytes
# 10s and 1min data is transient so we'll use MongoDB's capped collections. These collections are fixed in size.
# 5min and 1d data is interesting to preserve long-term. These collections are not capped.
retentions:
    - name: stats_per_10s
      seconds: 10
      capped: true
      cap_bytes: 268_435_456 # 2**28
    - name: stats_per_1min
      seconds: 60
      capped: true
      cap_bytes: 1_073_741_824 # 2**30
    - name: stats_per_5min
      seconds: 600
      cap_bytes: 0 
      capped: false
    - name: stats_per_day
      seconds: 86400
      cap_bytes: 0 
      capped: false


Run the server:

Flush to Graphite (default): statsd -c config.yml

Flush and aggregate to MongoDB: statsd -c config.yml -m


In your client code:

require 'rubygems'
require 'statsd'
STATSD = Statsd::Client.new('localhost', 8125)

STATSD.increment('some_counter') # basic incrementing
STATSD.increment('system.nested_counter', 0.1) # incrementing with sampling (10%)

STATSD.decrement(:some_other_counter) # basic decrementing using a symbol
STATSD.decrement('system.nested_counter', 0.1) # decrementing with sampling (10%)

STATSD.timing('some_job_time', 20) # reporting job that took 20ms
STATSD.timing('some_job_time', 20, 0.05) # reporting job that took 20ms with sampling (5%)


  • buckets: Each stat is in its own "bucket". Buckets are not predefined anywhere. A bucket can be named anything that will translate to Graphite (periods make folders, etc.).

  • values: Each stat has a value. How it is interpreted depends on modifiers.

  • flush: After the flush interval timeout (default 10 seconds), stats are munged and sent over to Graphite.
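The bucket and flush ideas above can be sketched in a few lines of Ruby. This is only an illustration, not the gem's actual implementation: the BucketStore class and its method names are hypothetical, and flush here returns the collected values instead of sending them to Graphite.

```ruby
# Hypothetical in-memory store; not the statsd gem's real class.
class BucketStore
  def initialize
    @counters = Hash.new(0)                     # bucket name => running total
    @timers   = Hash.new { |h, k| h[k] = [] }   # bucket name => list of samples
  end

  # Buckets are created on first use; nothing is predefined.
  def count(bucket, value = 1)
    @counters[bucket.to_s] += value
  end

  def time(bucket, ms)
    @timers[bucket.to_s] << ms
  end

  # Return everything collected since the last flush and start over.
  def flush
    snapshot = { counters: @counters.dup, timers: @timers.dup }
    @counters.clear
    @timers.clear
    snapshot
  end
end

store = BucketStore.new
store.count('gorets')
store.count(:gorets)          # symbols and strings name the same bucket
store.time('glork', 320)
p store.flush
```

A real daemon would call flush from a timer every flush_interval seconds and forward the snapshot to its backend.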



Sending gorets:1|c is a simple counter: it adds 1 to the "gorets" bucket. The count stays in memory until the flush interval.



Sending glork:320|ms reports that the glork took 320ms to complete this time. StatsD figures out the 90th percentile, average (mean), and lower and upper bounds for the flush interval.
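As a rough sketch of what that summary involves, the following Ruby computes the four figures for one bucket's samples. The method name summarize_timings is hypothetical, and the nearest-rank percentile used here is an assumption; the daemon may pick the 90th percentile differently.

```ruby
# Summarize one flush interval's worth of timings for a single bucket.
# The percentile uses the nearest-rank method, which is an assumption here.
def summarize_timings(samples, percentile = 90)
  return nil if samples.empty?

  sorted = samples.sort
  rank = (percentile * sorted.size / 100.0).ceil   # 1-based nearest rank
  {
    lower: sorted.first,
    upper: sorted.last,
    mean: sorted.sum.to_f / sorted.size,
    percentile: sorted[rank - 1]
  }
end

p summarize_timings([320, 100, 250, 400, 180, 90, 310, 220, 150, 500])
```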



Appending a sample rate, as in gorets:1|c|@0.1, tells StatsD that this counter is sampled: it is sent only 1/10th of the time.
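Sampling has two halves: the client drops most events, and the server scales what it does receive back up. The sketch below shows both under stated assumptions; the method names are hypothetical, and dividing by the sample rate is the usual StatsD convention rather than something confirmed from this gem's source.

```ruby
# Client side: send only a fraction of events, tagging the datagram with the rate.
# Returns nil when this event is dropped.
def sampled_datagram(bucket, sample_rate, rng = Random.new)
  return nil unless rng.rand < sample_rate

  "#{bucket}:1|c|@#{sample_rate}"
end

# Server side: scale a sampled count back up to an estimate of the true total.
def scaled_count(value, sample_rate)
  value * (1.0 / sample_rate)
end

sent = Array.new(10_000) { sampled_datagram('gorets', 0.1) }.compact
puts "sent #{sent.size} of 10000 events"
puts "estimated total: #{scaled_count(sent.size, 0.1)}"
```

The estimate varies from run to run, since it depends on which events happened to be sent.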



StatsD now also supports gauges: arbitrary values that can be recorded as-is.
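For clients that cannot use the gem, the datagrams can be built and sent by hand. This is a sketch under assumptions: the helper names statsd_line and send_stat are hypothetical, the gaugor bucket name is made up, and the g type code for gauges follows the common StatsD convention rather than anything stated in this README.

```ruby
require 'socket'

# Build one StatsD datagram: "<bucket>:<value>|<type>" with an optional "|@<rate>".
def statsd_line(bucket, value, type, sample_rate = nil)
  line = "#{bucket}:#{value}|#{type}"
  line += "|@#{sample_rate}" if sample_rate
  line
end

# Fire-and-forget UDP send; host and port default to the example config.
def send_stat(line, host = 'localhost', port = 8125)
  socket = UDPSocket.new
  socket.send(line, 0, host, port)
ensure
  socket.close if socket
end

puts statsd_line('gorets', 1, 'c')        # counter
puts statsd_line('glork', 320, 'ms')      # timer
puts statsd_line('gorets', 1, 'c', 0.1)   # sampled counter
puts statsd_line('gaugor', 333, 'g')      # gauge (type code assumed)
```

UDP is connectionless, so send_stat does not report whether a daemon was listening.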



Graphite uses "schemas" to define the different round robin datasets it houses (analogous to RRAs in rrdtool):

priority = 110 
pattern = ^stats\..*
retentions = 10:2160,60:10080,600:262974

That translates to:

  • 6 hours of 10 second data (what we consider "near-realtime")
  • 1 week of 1 minute data
  • 5 years of 10 minute data

This has been a good tradeoff so far between size-of-file (round robin databases are fixed size) and data we care about. Each "stats" database is about 3.2 megs with these retentions.
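The arithmetic behind those retentions can be checked with a short script. Each entry is seconds-per-point followed by number-of-points, so the time span is their product. The size estimate assumes Whisper's 12 bytes per point (a 4-byte timestamp plus an 8-byte value) and ignores the small file header, so treat it as approximate.

```ruby
# Each retention is "seconds_per_point:number_of_points".
RETENTIONS = '10:2160,60:10080,600:262974'

ARCHIVES = RETENTIONS.split(',').map do |spec|
  seconds, points = spec.split(':').map(&:to_i)
  { seconds: seconds, points: points, span_seconds: seconds * points }
end

ARCHIVES.each do |a|
  days = a[:span_seconds] / 86_400.0
  puts format('%4ds x %6d points = %.2f days', a[:seconds], a[:points], days)
end

# Assumed Whisper layout: 12 bytes per stored point, header not counted.
POINT_BYTES = 12
TOTAL_BYTES = ARCHIVES.sum { |a| a[:points] } * POINT_BYTES
puts format('approx. %.2f MiB per metric', TOTAL_BYTES / (1024.0 * 1024))
```

The three spans come out to 0.25 days (6 hours), 7 days, and about 1826 days (5 years), and the size estimate lands a little over 3 megabytes, in line with the figure quoted above.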


Statsd::Mongo will flush and aggregate data to MongoDB. The average record size is 152 bytes. We use capped collections for the transient data and regular collections for long-term storage.
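The 152-byte average makes it easy to estimate how much history a capped collection holds. The sketch below is a back-of-the-envelope calculation only: the method names are hypothetical, it assumes one record per stat per interval, and it ignores any per-document overhead MongoDB adds.

```ruby
AVG_RECORD_BYTES = 152   # average record size quoted above

# How many average-sized records fit in a capped collection of cap_bytes.
def capped_capacity(cap_bytes, record_bytes = AVG_RECORD_BYTES)
  cap_bytes / record_bytes
end

# Roughly how many seconds of history fit before old records are overwritten,
# assuming (hypothetically) one record per stat per interval.
def capped_window_seconds(cap_bytes, interval_seconds, stats_per_flush)
  capped_capacity(cap_bytes) / stats_per_flush * interval_seconds
end

puts capped_capacity(2**28)                     # stats_per_10s
puts capped_capacity(2**30)                     # stats_per_1min
puts capped_window_seconds(2**28, 10, 1000)     # with 1000 distinct stats
```

If the window is shorter than you need for the number of stats you track, raise cap_bytes and recreate the capped collection.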


Etsy's blog post.

StatsD was inspired (heavily) by the project (of the same name) at Flickr. Here's a post where Cal Henderson described it in depth: Counting and timing. Cal re-released the code recently: Perl StatsD