Commits on Jul 6, 2011
  1. @precipice

    Merge pull request #7 from jamesc/master

    Add support for -p (policy) option in oncall-email.rb
    precipice committed Jul 6, 2011
  2. @jamesc

    Added simple HTML wrapping

     - precursor to full mailing support
    jamesc committed Jul 6, 2011
  3. @jamesc
Commits on Jun 25, 2011
  1. @precipice

    Merge pull request #5 from railsmachine/master

    Update oncall.rb to be able to specify an escalation policy, and scrape more info
    precipice committed Jun 24, 2011
Commits on Jun 24, 2011
  1. @technicalpickles
Commits on Jun 1, 2011
  1. @precipice

    Document the workaround for schedule exceptions messing up rotation-report.rb.

    This is in response to issue #4.
    precipice committed Jun 1, 2011
  2. @precipice

    Make the trigger description independent of the service name; use trigger_type instead.

    This was another place where I was assuming that the Nagios service was named 'Nagios' (and Pingdom named 'Pingdom'). Fixed to switch on the trigger type instead.
    precipice committed Jun 1, 2011
  3. @precipice

    Don't assume that Nagios services are named "Nagios."

    This recognizes automatic (e.g., API) incident resolutions that come from services named something other than "Nagios." Bad assumption....
    
    Fixes #4.
    precipice committed Jun 1, 2011
Commits on Apr 27, 2011
  1. @thrillcall @precipice

    Prevent undefined local variable exception.

    /thrillcall/pagerduty-tools/lib/pagerduty.rb:90:in `find_domain': undefined local variable or method `account_form' for #<PagerDuty::Agent:0x8a68f1c> (NameError)
    thrillcall committed with precipice Apr 27, 2011
Commits on Mar 23, 2011
  1. @precipice

    Address my review comments from initial pull request, and add more.

    This mostly takes out the organization-specific options in pull request commits, and then makes things a little more consistent with other scripts.
    precipice committed Mar 23, 2011
  2. @precipice

    Rename 'duty-email.rb' to 'oncall-email.rb'.

    Just trying to keep the terminology simple, here.
    precipice committed Mar 23, 2011
  3. @precipice

    allow smtp server configuration

    Jeffrey Wescott committed with precipice Mar 8, 2011
  4. @precipice

    oops -- uncommenting the block that does the work

    Jeffrey Wescott committed with precipice Mar 8, 2011
  5. @precipice

    added ability to get user emails from PagerDuty system, along with a script to send an email notification to the on-duty person

    Jeffrey Wescott committed with precipice Mar 8, 2011
  6. @precipice
Commits on Mar 12, 2011
  1. @precipice

    Fix a bug that would screw up explicit start and end times.

    There's a little too much going on with date ranges now. Maybe just choose a start date? Or a start and a period?
    precipice committed Mar 11, 2011
Commits on Mar 9, 2011
  1. @precipice

    Fixes for Campfire integration.

    Nokogiri XML generation wasn't working correctly. Partly that was just an error, but to make a tag from a reserved word like 'type', you need to add a trailing underscore, 'type_'.
    precipice committed Mar 9, 2011
  2. @precipice
  3. @precipice

    Update the README and add an example image.

    Also, added a .gitignore, finally.
    precipice committed Mar 9, 2011
  4. @precipice

    Changes based on review by Brian Donovan and Brad Greenlee.

    Thanks, guys!
    
    Also, added a few more comments about things to fix later.
    precipice committed Mar 9, 2011
  5. @precipice

    Change the default report period to the last rotation.

    Instead of reporting on the currently-in-progress rotation (the old default), this changes the default to cover the last-completed rotation.  Seems like a better default since you usually want an apples-to-apples comparison of one whole week versus another.
    
    `rotation-report.rb -a 0` will give you the old behavior.
    precipice committed Mar 9, 2011
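
The relationship between `-a` and the report window described above can be sketched roughly as follows. This is only an illustration that assumes weekly rotations and a default equivalent to `-a 1`; `report_period`, `WEEK`, and the example start time are hypothetical and not taken from rotation-report.rb.

```ruby
require 'time'

WEEK = 7 * 24 * 60 * 60 # seconds in a rotation (assumption: weekly rotations only)

# Hypothetical helper: returns [start, finish] for the rotation that began
# `rotations_ago` rotations before the one currently in progress.
def report_period(current_rotation_start, rotations_ago = 1)
  start = current_rotation_start - rotations_ago * WEEK
  [start, start + WEEK]
end

# Example start of the in-progress rotation (made-up value).
current_start = Time.iso8601('2011-03-07T10:00:00-08:00')

last_completed = report_period(current_start)    # assumed new default
in_progress    = report_period(current_start, 0) # like `-a 0`, the old behavior
```

With a default of one rotation back, both the reported week and the week it is compared against are complete, which is the apples-to-apples comparison the commit message is after.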
  6. @precipice

    Add start and end time options for the report period.

    These options, --start-time and --end-time, take ISO 8601 date/times (e.g., '2011-03-02T14:00:00-05:00'), and can be used to set any arbitrary reporting period you want.  The "previous" period will be the same length, one week earlier.  (Note all the labels say "vs. last week" for the percentage change values, but that should be roughly accurate for most uses.)
    
    I added this because I noticed that using the current rotation period only works if no irregular exceptions are set. If you have a weekly rotation and someone sets a two-day exception, the report will only cover the two-day period versus the same two days a week ago.  Not as useful.  So, this works around that problem for now.
    precipice committed Mar 9, 2011
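
As a rough sketch of how explicit bounds might translate into the two periods being compared: the helper below parses ISO 8601 strings with Ruby's standard `Time.iso8601` and shifts both bounds back one week. The name `periods_from` and the returned hash layout are assumptions for illustration, not the script's actual code.

```ruby
require 'time'

WEEK_SECONDS = 7 * 24 * 60 * 60

# Hypothetical helper: parse ISO 8601 bounds and derive the comparison
# ("previous") period, which has the same length and starts one week earlier.
def periods_from(start_str, end_str)
  start_time = Time.iso8601(start_str)
  end_time   = Time.iso8601(end_str)
  raise ArgumentError, 'end time must be after start time' unless end_time > start_time

  {
    current:  [start_time, end_time],
    previous: [start_time - WEEK_SECONDS, end_time - WEEK_SECONDS]
  }
end

# A two-day window, like the exception case mentioned in the commit message.
periods = periods_from('2011-03-02T14:00:00-05:00', '2011-03-04T14:00:00-05:00')
puts periods[:previous].map(&:iso8601).join(' to ')
```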
  7. @precipice
  8. @precipice

    Add command-line options, and support historical reporting and Campfire.

    Using the `-a`|`--rotations-ago COUNT` option, you can create rotation reports for rotations that have already elapsed. The `-c`|`--campfire-message` option will paste the generated report into the configured Campfire room.
    
    This still only supports weekly rotations for now.
    precipice committed Mar 9, 2011
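
For readers unfamiliar with how such flags are usually wired up in Ruby, here is a minimal sketch using the standard library's `OptionParser`. Only the flag names come from the commit message; the `parse_options` helper, the hash keys, the help strings, and the default of 0 (the current rotation) are assumptions.

```ruby
require 'optparse'

# Hypothetical option parsing; flag names are from the commit message,
# everything else (defaults, hash keys, help text) is assumed.
def parse_options(argv)
  options = { rotations_ago: 0, campfire_message: false }
  OptionParser.new do |opts|
    opts.banner = 'Usage: rotation-report.rb [options]'
    opts.on('-a', '--rotations-ago COUNT', Integer,
            'Report on the rotation COUNT rotations back') do |count|
      options[:rotations_ago] = count
    end
    opts.on('-c', '--campfire-message',
            'Paste the report into the configured Campfire room') do
      options[:campfire_message] = true
    end
  end.parse!(argv.dup) # parse a copy so the caller's array is left untouched
  options
end

p parse_options(%w[-a 2 --campfire-message])
```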
  9. @precipice

    Correct alert count bugs and capture alerts across months.

    Assuming that the alerts in a given report (current and previous period) only span two consecutive months, this will now capture all the needed alerts.
    
    Also works around a bug in Chronic; should send the fix to the maintainer.
    
    At this point the rotation report is pretty useful for reporting on the current rotation. Need to add options for output control and for choosing which rotation period to report on.
    precipice committed Mar 8, 2011
  10. @precipice

    Loop over the incident requests to get all data needed.

    Previously the incident call only pulled in 100 incidents. Hopefully (!) that would be enough, but just in case, this will make repeated calls (up to 10, for 1000 incidents total) until it finds an incident that is prior to the report period (current rotation and the previous rotation).
    
    The loop sleeps one second between calls for API politeness, so this slows things down if you have a lot of alerts in a report period.
    precipice committed Mar 8, 2011
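
The paging loop described above might look something like the sketch below. It does not call the real PagerDuty API: `fetch_page` stands in for whatever performs the request, and the `:created_on` key, the newest-first ordering, and the `pause:` keyword are assumptions made for the example.

```ruby
PAGE_SIZE = 100 # incidents per call, per the commit message
MAX_CALLS = 10  # up to 1000 incidents in total

# Hypothetical sketch: `fetch_page` is any callable taking (offset, limit) and
# returning an array of hashes with a :created_on Time, newest first.
def collect_incidents(fetch_page, period_start, pause: 1)
  incidents = []
  MAX_CALLS.times do |call|
    page = fetch_page.call(call * PAGE_SIZE, PAGE_SIZE)
    incidents.concat(page)
    # Stop once the data runs out or reaches back past the report period.
    break if page.size < PAGE_SIZE || page.any? { |i| i[:created_on] < period_start }
    sleep pause # be polite to the API between calls
  end
  # Drop anything fetched from before the report period.
  incidents.select { |i| i[:created_on] >= period_start }
end
```

Passing the fetcher in as a callable keeps the loop easy to exercise with canned data, and `pause: 0` avoids the one-second wait when doing so.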
Commits on Mar 8, 2011
  1. @precipice

    fixing broken upstream

    Jeffrey Wescott committed with precipice Mar 9, 2011
  2. @precipice

    Bigger refactor of rotation-report.rb.

    Built out a report class and pulled out common operations to there. Worked up some of the data container objects to be more useful.
    precipice committed Mar 8, 2011
  3. @precipice
  4. @precipice

    Code cleanup and refactoring.

    * Deduped Campfire message-sending code.
    * Created a superclass for reportable events and pulled the time-related code into it.
    * Added some constants.
    * More Rubyismization.
    precipice committed Mar 8, 2011
  5. @precipice

    Use a constant.

    precipice committed Mar 8, 2011
  6. @precipice

    Derive start and end dates automatically from escalation.

    Also, some code cleanup.
    precipice committed Mar 8, 2011
Commits on Mar 7, 2011
  1. @precipice

    Show the volume of alerts in the rotation report.

    I'm very sensitive to people getting interrupted in their work, and would like to know if those interruptions are on their way up or down.  I also want to track how often people are being woken up later at night to deal with alerts.
    
    The report now shows the overall volume of SMS/Phone alerts and the volume of late-night (10p to 8a) alerts, each compared to the last rotation.
    
    Also started in on some code cleanups thanks to the Hack Arts crew.
    precipice committed Mar 7, 2011
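
A small sketch of the late-night classification and the rotation-over-rotation comparison described above. The 10p to 8a window is from the commit message; the helper names, the use of each alert's own clock hour (no time zone conversion), and the rounding are assumptions.

```ruby
LATE_NIGHT_START = 22 # 10 p.m.
LATE_NIGHT_END   = 8  # 8 a.m.

# True when the alert time falls in the overnight window, which wraps midnight.
def late_night?(time)
  time.hour >= LATE_NIGHT_START || time.hour < LATE_NIGHT_END
end

# Percentage change versus the previous rotation; nil when there is no baseline.
def percent_change(current, previous)
  return nil if previous.zero?
  ((current - previous) * 100.0 / previous).round(1)
end

# Made-up alert times for illustration.
alerts = [Time.new(2011, 3, 7, 23, 15), Time.new(2011, 3, 7, 14, 0), Time.new(2011, 3, 8, 3, 30)]
late_count = alerts.count { |t| late_night?(t) }
```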
  2. @precipice

    Make note of unresolved incidents.

    Might make sense to add a "handoff" section later -- here are links to unresolved incidents.
    precipice committed Mar 7, 2011
  3. @precipice

    Only count resolvers for incidents that are resolved.

    Could also take the approach of only asking for resolved incidents in the API call, but it seems better to report the full alert load and instead add a count of how many are resolved.
    precipice committed Mar 7, 2011
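
The approach in the commit message (fetch everything, then count resolvers only for resolved incidents) can be illustrated as below. The incident hashes and their `:status` and `:resolved_by` keys are hypothetical stand-ins for whatever the script actually parses.

```ruby
# Hypothetical sketch: report the full alert load, but attribute resolutions
# only for incidents whose status is 'resolved'.
def resolver_counts(incidents)
  resolved = incidents.select { |i| i[:status] == 'resolved' }
  by_resolver = Hash.new(0)
  resolved.each { |i| by_resolver[i[:resolved_by]] += 1 }
  { total: incidents.size, resolved: resolved.size, by_resolver: by_resolver }
end

# Made-up sample data.
sample = [
  { status: 'resolved',     resolved_by: 'alice' },
  { status: 'resolved',     resolved_by: 'bob' },
  { status: 'acknowledged', resolved_by: nil },
  { status: 'resolved',     resolved_by: 'alice' }
]
summary = resolver_counts(sample)
```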