imeyer edited this page Oct 17, 2014 · 9 revisions

chef-handler-graphite

Simple handler to send data to Graphite about your node's Chef runs, including elapsed time, total number of resources, number of resources updated, and success or failure.
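To make the list above concrete, here is a rough sketch of how one run's numbers could be laid out in Graphite's plaintext protocol, which expects one `<path> <value> <unix timestamp>` line per metric. The `RunSummary` struct, the `graphite_lines` helper, and the metric names are hypothetical stand-ins for illustration; they are not taken from the handler's source, and the names it actually sends may differ.

```ruby
# Hypothetical stand-in for the run data a handler sees; not Chef's own class.
RunSummary = Struct.new(:elapsed_time, :total_resources, :updated_resources, :success)

# Format one run as Graphite plaintext-protocol lines:
#   "<path> <value> <unix timestamp>\n"
# The metric names below are assumptions made for this sketch.
def graphite_lines(prefix, run, timestamp)
  metrics = {
    'elapsed_time'      => run.elapsed_time,
    'all_resources'     => run.total_resources,
    'updated_resources' => run.updated_resources,
    'success'           => run.success ? 1 : 0,
    'fail'              => run.success ? 0 : 1
  }
  metrics.map { |name, value| "#{prefix}.#{name} #{value} #{timestamp}\n" }.join
end

run = RunSummary.new(42.5, 120, 7, true)
payload = graphite_lines('imeyer.chef.web01', run, 1413504000)
puts payload
```

Each line becomes one data point under the given prefix, which is the role metric_key plays in the configuration further down.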

Requirements

I have not tested this extensively on Chef >= 11; please let me know if you see issues.

Installing

gem install chef-handler-graphite

Using

There are two ways to go about using this handler.

The quick, easy, not so flexible way.

Edit your chef/client.rb file to look something like this:

require 'chef-handler-graphite'

# Configure the handler
graphite_handler = GraphiteReporting.new

# metric_key is a string that prepends every metric sent to Graphite
graphite_handler.metric_key = "imeyer.chef.#{Chef::Config.node_name}"

# Hostname and port of your Graphite server
graphite_handler.graphite_host = "graphite.server.hostname"
graphite_handler.graphite_port = "2003"

# Add your handler
report_handlers << graphite_handler
exception_handlers << graphite_handler
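One thing to keep in mind when building metric_key: Graphite treats every dot in a metric path as a hierarchy separator, so a fully qualified node name such as web01.example.com ends up spread over several levels of the tree. If you would rather keep each node at a single level, a small helper along these lines might do. `graphite_safe` is a hypothetical name used only for this sketch and is not part of the gem.

```ruby
# Replace anything outside letters, digits, underscore and hyphen with "_",
# so dots in the node name no longer split the Graphite path.
def graphite_safe(name)
  name.to_s.gsub(/[^A-Za-z0-9_-]/, '_')
end

node_name = 'web01.example.com' # example value; in client.rb this would come from Chef::Config
metric_key = "imeyer.chef.#{graphite_safe(node_name)}"
puts metric_key
```

Whether flattening is desirable depends on how you want to browse and wildcard your metrics; leaving the dots in place is just as valid.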

The cooler, more flexible way.

Download and install the chef_handler cookbook.

Create a recipe named graphite in the chef_handler cookbook and add the following block of code:

chef_gem "chef-handler-graphite"

argument_array = [
  :metric_key => "imeyer.chef.#{node['hostname']}",
  :graphite_host => "graphite.hostname",
  :graphite_port => 2003
]

chef_handler "GraphiteReporting" do
  source "#{Gem::Specification.find_by_name('chef-handler-graphite').lib_dirs_glob}/chef-handler-graphite.rb"
  arguments argument_array
  action :nothing
end.run_action(:enable)

Upload the cookbook, add it as the very first recipe in your run_list and away you go.

After a few runs, you should be able to create a graph somewhat like this one, which shows the number of nodes, their average elapsed time, the average number of resources updated, and failures drawn as vertical dashed lines.

Graphite graph

For the success and fail metrics, it is best to use the drawAsInfinite function, and keepLastValue for the others; otherwise the data will just look like scattered spots. Once you have a lot of hosts and include functions like average or highestMax, the graph should be pretty informative.
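As a rough illustration of combining those functions, the sketch below builds a Graphite render URL that uses drawAsInfinite for failures and keepLastValue (wrapped in averageSeries) for elapsed time. The `render_url` helper and the `fail` and `elapsed_time` metric names are assumptions made for this example; adjust them to whatever appears in your Graphite tree.

```ruby
require 'uri'

# Build a Graphite render URL from a list of target expressions.
def render_url(base, targets, from: '-24h')
  query = URI.encode_www_form([['from', from]] + targets.map { |t| ['target', t] })
  "#{base}/render?#{query}"
end

prefix = 'imeyer.chef.*' # wildcard over all nodes under the metric_key prefix
targets = [
  # Failures are sparse events, so draw each one as a full-height vertical line.
  "drawAsInfinite(#{prefix}.fail)",
  # Fill the gaps between runs, then average across nodes.
  "averageSeries(keepLastValue(#{prefix}.elapsed_time))"
]

url = render_url('http://graphite.server.hostname', targets)
puts url
```

Opening the printed URL in a browser should render the graph, provided the metric paths match what your nodes actually report.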

Further reading

Exception and Report Handlers from the Opscode wiki