IMPORTANT: Status - Deprecated
Since I originally wrote this, PagerDuty has released a more complete API and a lot of the approach here is no longer needed. I'd strongly suggest not using this project and instead relying on their API directly.
Tools to work around limitations in the PagerDuty API. As an example use, these scripts can set a Campfire room's topic to the current on-call rotation, and then report on the incidents and alerts from the previous rotation.
Ruby 1.9 is required. Run:
$ gem install pagerduty_tools
The scripts provided by the gem will be in the gem executable directory. If this
isn't already in your path, run
$ gem environment | grep "EXECUTABLE DIRECTORY"
and put the result into your path.
The scripts log into the PagerDuty site when first run. Your email address
will be used to find associated PagerDuty accounts, and you can choose the
account you want to report on. After the first run, a login cookie is kept in
~/.pagerduty-cookies to allow future runs to be automatic (e.g., from cron).
If you would like to have PagerDuty reports sent to your
Campfire room, create a "PagerDuty" user in your
Campfire account, and then add a configuration file at
~/.pagerduty-campfire.yaml containing the following:
site: https://example.campfirenow.com
room: 99999
token: abababababababababababababababababababab
with the values changed to match your configuration. I'd recommend running:
$ chmod 0600 ~/.pagerduty-campfire.yaml
after creating the file. See the documentation for each script for how to send output to Campfire.
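
As a rough illustration of how a script might read this file, here is a minimal sketch that parses the three settings and fails early if one is missing. The `parse_campfire_config` helper and `REQUIRED_KEYS` constant are hypothetical names, not part of pagerduty_tools, and the sketch assumes a current Ruby where `YAML.safe_load` is available.

```ruby
require 'yaml'

# Keys that appear in the sample configuration file above.
REQUIRED_KEYS = %w[site room token].freeze

# Parse Campfire settings from YAML text and fail early if a key is missing.
# (Hypothetical helper; not part of pagerduty_tools.)
def parse_campfire_config(yaml_text)
  config = YAML.safe_load(yaml_text)
  raise ArgumentError, 'configuration must be a YAML mapping' unless config.is_a?(Hash)

  missing = REQUIRED_KEYS.reject { |key| config.key?(key) }
  raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?

  config
end

sample = [
  'site: https://example.campfirenow.com',
  'room: 99999',
  'token: abababababababababababababababababababab'
].join("\n")

config = parse_campfire_config(sample)
puts "Posting to room #{config['room']} at #{config['site']}"
```

To check a real file, the same helper could be given `File.read(File.expand_path('~/.pagerduty-campfire.yaml'))`.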
Tip: you can use PagerDuty's Twitter icon as a profile icon for your Campfire PagerDuty account. This isn't necessary, but it makes the PagerDuty message more recognizable and nicer.
- The rotation-report.rb script works well for weekly rotations with no
exceptions set. It might work well for daily rotations (comparing to the
same day one week ago), but hasn't been tested for that; and it fails
completely if any of the weeks compared have an exception set. If you set
an exception, you can work around this limitation using the
--end-time option to explicitly set the report date range.
- Login and other errors from PagerDuty's site are not parsed or reported.
The oncall.rb script reports who is currently on call for your PagerDuty
account. Invoked with no arguments, it will list all on-call levels (1..n). If
one or more levels are given as arguments, it will only list those levels.
If the on-call level has an associated on-call rotation, the name of that
rotation is used in the output. Otherwise, a generic
Level <#> format is used.
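
The naming and filtering rules described here can be sketched roughly as follows. This is not the script's actual code: the `OnCall` struct, `label_for`, and `oncall_summary` are hypothetical names, and the data is hard-coded, whereas the real script reads it from PagerDuty's site.

```ruby
# Each entry: escalation level, the person on call, and an optional rotation name.
# (Hypothetical data shape, for illustration only.)
OnCall = Struct.new(:level, :person, :rotation)

# Use the rotation name when there is one, otherwise fall back to "Level <#>".
def label_for(entry)
  name = entry.rotation
  name.nil? || name.empty? ? "Level #{entry.level}" : name
end

# Build the summary line, optionally restricted to the requested levels.
def oncall_summary(entries, levels = [])
  selected = levels.empty? ? entries : entries.select { |e| levels.include?(e.level) }
  selected.sort_by(&:level).map { |e| "#{label_for(e)}: #{e.person}" }.join(', ')
end

entries = [
  OnCall.new(1, 'John Henry', 'Hotseat'),
  OnCall.new(2, 'Lisa Limon', 'Hotseat Backup'),
  OnCall.new(3, 'Steven Sanders', nil)
]

puts oncall_summary(entries)          # all levels
puts oncall_summary(entries, [1, 2])  # only levels 1 and 2
```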
You can invoke oncall.rb with a
--campfire-topic option, and the
output of the script will be set as the topic for the configured room (see
Campfire Support, above). We do this out of cron right after the rotation
turns over to a new assignment.
oncall.rb defaults to showing the first escalation policy, but if you have
multiple policies and want to show a specific one, you can invoke it with
--policy to specify which one to use.
Calling the script with
--help will display some help.
$ ./oncall.rb
Hotseat: John Henry, Hotseat Backup: Lisa Limon, Level 3: Steven Sanders
$ ./oncall.rb 1 2
Hotseat: John Henry, Hotseat Backup: Lisa Limon
$ ./oncall.rb --campfire-topic 1 2
[No shell output, but the configured Campfire room's topic becomes: "Hotseat: John Henry, Hotseat Backup: Lisa Limon"]
The rotation-report.rb script generates an automatic "end of shift" report
to show what happened over the course of a rotation. It measures how many
incidents occurred, shows who resolved them, and shows how many alerts people
got (including a breakout of after-midnight alerts, which we all must strive
to eradicate!). Also, it lists the top five causes for alerts during the
rotation, and compares the counts to the same period one week earlier.
Here's an example:
Rotation report for February 23 - March 02: 19 incidents (-9% vs. last week)
Resolutions: John Henry: 8, George Harrison: 4, Scott Brinkley: 4, Jason Neeson: 2, [Automatic]: 1
SMS/Phone Alerts (62 total, +77% vs. last week; 6 after midnight, -53% vs. last week): John Henry: 44, George Harrison: 10, Jason Neeson: 4, Scott Brinkley: 4
Top triggers:
6 'Pingdom: DOWN alert: example-health (www.example.com) is DOWN' (-14% vs. last week)
5 'Pingdom: DOWN alert: sg-health (sg.example.com) is DOWN' (no occurrences last week)
4 'Nagios: vip-api - check_api_lag' (+300% vs. last week)
1 'Nagios: vip-redisapi - check_live_redis_lag' (-66% vs. last week)
1 'Pingdom: DOWN alert: client-nike (www.nike.com) is DOWN' (no occurrences last week)
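
To make the "top triggers" and "vs. last week" figures concrete, here is a small sketch of how such numbers could be computed. It is an illustration only: `change_vs_last_week` and `top_triggers` are hypothetical helpers, the trigger names are shortened sample data, and truncating the percentage toward zero is an assumption, since the script's actual rounding is not documented here.

```ruby
# Describe how a count compares with the same period one week earlier.
# Truncating toward zero is an assumption, not the script's documented behavior.
def change_vs_last_week(current, previous)
  return 'no occurrences last week' if previous.zero?

  percent = ((current - previous) * 100.0 / previous).truncate
  format('%+d%% vs. last week', percent)
end

# Count alerts per trigger and return the most frequent ones, highest first.
# Ties are broken alphabetically so the order is stable.
def top_triggers(triggers, limit = 5)
  counts = Hash.new(0)
  triggers.each { |trigger| counts[trigger] += 1 }
  counts.sort_by { |trigger, count| [-count, trigger] }.first(limit)
end

# Shortened sample data: one entry per alert, named by its trigger.
this_week = ['check_api_lag'] * 4 + ['check_live_redis_lag']
last_week = ['check_api_lag'] + ['check_live_redis_lag'] * 3

previous_counts = Hash.new(0)
last_week.each { |trigger| previous_counts[trigger] += 1 }

top_triggers(this_week).each do |trigger, count|
  puts "#{count} '#{trigger}' (#{change_vs_last_week(count, previous_counts[trigger])})"
end
```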
By default the script will report on the most recently completed rotation.
However, you can use the
--rotations-ago COUNT option to specify how
far back in history you want to go. Or, you can use
--end-time DATETIME (giving the date in ISO 8601
date and time format, e.g. "2011-03-02T14:00:00-05:00") to set a specific range
for the report.
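
As a hedged sketch of how these two options might be handled, the following uses Ruby's standard `optparse` and `time` libraries to read `--rotations-ago` and `--end-time` and to derive a one-week window ending at the given time. The `parse_report_options` helper, the default of one rotation ago, and the fixed one-week length are assumptions made for illustration, not the script's actual implementation.

```ruby
require 'optparse'
require 'time'

ONE_WEEK = 7 * 24 * 60 * 60 # seconds; assumes weekly rotations

# Parse the two range options named above from an argument list.
# (Hypothetical helper; the default of 1 rotation ago is an assumption.)
def parse_report_options(argv)
  options = { rotations_ago: 1, end_time: nil }
  OptionParser.new do |opts|
    opts.on('--rotations-ago COUNT', Integer) { |count| options[:rotations_ago] = count }
    opts.on('--end-time DATETIME') { |text| options[:end_time] = Time.iso8601(text) }
  end.parse(argv)
  options
end

options = parse_report_options(['--end-time', '2011-03-02T14:00:00-05:00'])
range_end = options[:end_time]
range_start = range_end - ONE_WEEK # one week before the requested end time
puts "#{range_start.iso8601} to #{range_end.iso8601}"
```

`Time.iso8601` raises `ArgumentError` on text that is not in ISO 8601 format, so a malformed `--end-time` value fails before any report is attempted.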
Calling rotation-report.rb with a
--campfire-message argument will
cause the rotation report to be pasted into the configured Campfire room. (See
Campfire Support, above, for information about setting this up.)
Calling the script with
--help will display some help.
This script is being revised and doesn't work with the rest of the package yet.
Pull requests welcome. There are no tests or specs yet, so hey, contributing couldn't be easier.
Thanks to the following people for contributions!
Copyright 2011 Marc Hedlund. Distributed under the Apache License, version 2.0.