Commits on Jul 6, 2011
  1. Merge pull request #7 from jamesc/master

    precipice committed Jul 6, 2011
    Add support for -p (policy) option in oncall-email.rb
  2. Added simple HTML wrapping

    jamesc committed Jul 6, 2011
     - precursor to full mailing support
Commits on Jun 25, 2011
  1. Merge pull request #5 from railsmachine/master

    precipice committed Jun 25, 2011
    Update oncall.rb to be able to specify an escalation policy, and scrape more info
Commits on Jun 24, 2011
Commits on Jun 1, 2011
  1. Document the workaround for schedule exceptions messing up rotation-r…

    precipice committed Jun 1, 2011
    This is in response to issue #4.
  2. Make the trigger description independent of the service name; use trigger_type instead.

    precipice committed Jun 1, 2011
    This was another place where I was assuming that the Nagios service was named 'Nagios' (and Pingdom named 'Pingdom'). Fixed to switch on the trigger type instead.
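A minimal sketch of switching on the trigger type rather than the service name. The `trigger_type` strings, the labels, and the incident hash shape are all hypothetical, chosen only for illustration; the real values in pagerduty-tools may differ.

```ruby
# Hypothetical trigger_type values and labels (assumptions, not taken
# from the PagerDuty API or from this repository).
TRIGGER_LABELS = {
  'nagios_trigger'  => 'Nagios alert',
  'pingdom_trigger' => 'Pingdom alert',
  'email_trigger'   => 'Email alert'
}.freeze

# Describe an incident by how it was triggered, ignoring whatever the
# service happens to be called.
def trigger_description(incident)
  TRIGGER_LABELS.fetch(incident[:trigger_type], 'Unknown trigger')
end

incident = { service_name: 'Production monitoring', trigger_type: 'nagios_trigger' }
puts trigger_description(incident)
```

Keying on the type means a Nagios service can carry any name without changing the description.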
  3. Don't assume that Nagios services are named "Nagios."

    precipice committed Jun 1, 2011
    This recognizes automatic (e.g., API) incident resolutions that come from services named something other than "Nagios." Bad assumption....
    Fixes #4.
Commits on Apr 27, 2011
  1. Prevent undefined local variable exception.

    thrillcall committed with precipice Apr 26, 2011
    /thrillcall/pagerduty-tools/lib/pagerduty.rb:90:in `find_domain': undefined local variable or method `account_form' for #<PagerDuty::Agent:0x8a68f1c> (NameError)
Commits on Mar 23, 2011
  1. Address my review comments from initial pull request, and add more.

    precipice committed Mar 23, 2011
    This mostly takes out the organization-specific options in pull request commits, and then makes things a little more consistent with other scripts.
  2. Rename 'duty-email.rb' to 'oncall-email.rb'.

    precipice committed Mar 23, 2011
    Just trying to keep the terminology simple, here.
  3. allow smtp server configuration

    Jeffrey Wescott committed with precipice Mar 9, 2011
  4. oops -- uncommenting the block that does the work

    Jeffrey Wescott committed with precipice Mar 9, 2011
  5. added ability to get user emails from PagerDuty system, along with a script to send an email notification to the on-duty person

    Jeffrey Wescott committed with precipice Mar 9, 2011
Commits on Mar 12, 2011
  1. Fix a bug that would screw up explicit start and end times.

    precipice committed Mar 12, 2011
    There's a little too much going on with date ranges now. Maybe just choose a start date, or a start date and a period?
Commits on Mar 9, 2011
  1. Fixes for Campfire integration.

    precipice committed Mar 9, 2011
    Nokogiri XML generation wasn't working correctly. Partly that was just an error, but to make a tag from a reserved word like 'type', you need to add a trailing underscore, 'type_'.
  2. Update the README and add an example image.

    precipice committed Mar 9, 2011
    Also, added a .gitignore, finally.
  3. Changes based on review by Brian Donovan and Brad Greenlee.

    precipice committed Mar 9, 2011
    Thanks, guys!
    Also, added a few more comments about things to fix later.
  4. Change the default report period to the last rotation.

    precipice committed Mar 9, 2011
    Instead of reporting on the currently-in-progress rotation (the old default), this changes the default to cover the last-completed rotation.  Seems like a better default since you usually want an apples-to-apples comparison of one whole week versus another.
    `rotation-report.rb -a 0` will give you the old behavior.
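A rough sketch of how a "rotations ago" count could map to a report window, assuming weekly rotations and a known start time for the current rotation. The helper name `rotation_window` and its arguments are hypothetical and are not taken from `rotation-report.rb`.

```ruby
WEEK_SECONDS = 7 * 24 * 60 * 60

# Return [start, end] for the rotation that began `rotations_ago` weeks
# before the current one. 0 is the in-progress rotation; 1 (the default
# here) is the last completed rotation.
# Assumes fixed-length weeks, so DST shifts in local time are ignored.
def rotation_window(current_rotation_start, rotations_ago = 1)
  window_start = current_rotation_start - rotations_ago * WEEK_SECONDS
  [window_start, window_start + WEEK_SECONDS]
end

current_start = Time.utc(2011, 3, 7, 9, 0, 0)
last_start, last_end = rotation_window(current_start)
puts "#{last_start} to #{last_end}"
```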
  5. Add start and end time options for the report period.

    precipice committed Mar 9, 2011
    These options, --start-time and --end-time, take ISO 8601 date/times (e.g., '2011-03-02T14:00:00-05:00'), and can be used to set any arbitrary reporting period you want.  The "previous" period will be the same length, one week earlier.  (Note all the labels say "vs. last week" for the percentage change values, but that should be roughly accurate for most uses.)
    I added this because I noticed that using the current rotation period only works if no irregular exceptions are set. If you have a weekly rotation and someone sets a two-day exception, the report will only cover the two-day period versus the same two days a week ago.  Not as useful.  So, this works around that problem for now.
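To illustrate the explicit reporting period, here is a small sketch that parses two ISO 8601 timestamps and derives the comparison period (same length, one week earlier). The method name `report_periods` and the returned hash layout are assumptions for illustration, not the script's actual internals.

```ruby
require 'time'

ONE_WEEK = 7 * 24 * 60 * 60 # seconds

# Parse an explicit reporting period and derive the "previous" period:
# the same length, shifted back exactly one week.
def report_periods(start_str, end_str)
  current_start = Time.iso8601(start_str)
  current_end   = Time.iso8601(end_str)
  raise ArgumentError, 'end time must be after start time' unless current_end > current_start

  {
    current:  [current_start, current_end],
    previous: [current_start - ONE_WEEK, current_end - ONE_WEEK]
  }
end

periods = report_periods('2011-03-02T14:00:00-05:00', '2011-03-04T14:00:00-05:00')
puts periods[:previous].map(&:iso8601).join(' to ')
```

With a two-day period like this one, the comparison covers the same two days of the prior week, which is why the "vs. last week" labels are only roughly accurate.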
  6. Add command-line options, and support historical reporting and Campfire.

    precipice committed Mar 9, 2011
    Using the `-a`|`--rotations-ago COUNT` option, you can create rotation reports for rotations that have already elapsed. The `-c`|`--campfire-message` option will paste the generated report into the configured Campfire room.
    This still only supports weekly rotations for now.
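The two options could be declared with Ruby's standard `OptionParser` roughly as follows. This is a sketch, not the script's actual option handling: the `parse_options` helper, the option hash keys, the help strings, and the default of 1 for `--rotations-ago` are assumptions.

```ruby
require 'optparse'

# Parse the -a/--rotations-ago and -c/--campfire-message options.
# The default values below are assumptions for this sketch.
def parse_options(argv)
  options = { rotations_ago: 1, campfire_message: false }

  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: rotation-report.rb [options]'

    opts.on('-a', '--rotations-ago COUNT', Integer,
            'Report on the rotation COUNT rotations back (0 = current)') do |count|
      options[:rotations_ago] = count
    end

    opts.on('-c', '--campfire-message',
            'Paste the report into the configured Campfire room') do
      options[:campfire_message] = true
    end
  end

  # Work on a copy so the caller's argument list is left untouched.
  parser.parse!(argv.dup)
  options
end

p parse_options(%w[-a 2 -c])
```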
  7. Correct alert count bugs and capture alerts across months.

    precipice committed Mar 8, 2011
    Assuming that the alerts in a given report (current and previous period) only span two consecutive months, this will now capture all the needed alerts.
    Also works around a bug in Chronic; should send the fix to the maintainer.
    At this point the rotation report is pretty useful for reporting on the current rotation. Need to add options for output control and for choosing which rotation period to report on.
  8. Loop over the incident requests to get all data needed.

    precipice committed Mar 8, 2011
    Previously the incident call only pulled in 100 incidents. Hopefully (!) that would be enough, but just in case, this will make repeated calls (up to 10, for 1000 incidents total) until it finds an incident that is prior to the report period (current rotation and the previous rotation).
    The loop sleeps one second between calls for API politeness, so this slows things down if you have a lot of alerts in a report period.
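The paging loop described above might look something like this sketch. `fetch_page` is a hypothetical callable standing in for the real incident request, and the incident hashes (newest first, each with a `:created_at` time) are an assumed shape.

```ruby
PAGE_SIZE = 100 # incidents returned per call
MAX_CALLS = 10  # hard cap: 1000 incidents total

# Collect incidents from newest to oldest until one predates the report
# period, the data runs out, or MAX_CALLS is reached.
# fetch_page is a hypothetical callable: offset -> array of incident hashes.
def collect_incidents(period_start, fetch_page, pause: 1)
  incidents = []

  MAX_CALLS.times do |call|
    sleep(pause) if call > 0 # API politeness between calls
    page = fetch_page.call(call * PAGE_SIZE)
    incidents.concat(page)

    break if page.size < PAGE_SIZE                            # no more data
    break if page.any? { |i| i[:created_at] < period_start }  # reached older data
  end

  # Drop anything older than the period we are reporting on.
  incidents.select { |i| i[:created_at] >= period_start }
end
```

A caller would pass the start of the previous rotation as `period_start`, since the report covers both the current and the previous rotation.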
Commits on Mar 8, 2011
  1. fixing broken upstream

    Jeffrey Wescott committed with precipice Mar 8, 2011
  2. Bigger refactor of rotation-report.rb.

    precipice committed Mar 8, 2011
    Built out a report class and pulled out common operations to there. Worked up some of the data container objects to be more useful.
  3. Code cleanup and refactoring.

    precipice committed Mar 8, 2011
    * Deduped Campfire message-sending code.
    * Created a superclass for reportable events and pulled the time-related code into it.
    * Added some constants.
    * More Rubyismization.
  4. Use a constant.

    precipice committed Mar 8, 2011
  5. Derive start and end dates automatically from escalation.

    precipice committed Mar 8, 2011
    Also, some code cleanup.
Commits on Mar 7, 2011
  1. Show the volume of alerts in the rotation report.

    precipice committed Mar 7, 2011
    I'm very sensitive to people getting interrupted in their work, and would like to know if those interruptions are on their way up or down.  I also want to track how often people are being woken up later at night to deal with alerts.
    The report now shows the volume of SMS/Phone alerts overall and compared to the last rotation, and also the volume of late-night (10p to 8a) alerts, also compared to the last rotation.
    Also started in on some code cleanups thanks to the Hack Arts crew.
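As an illustration of the late-night count and the comparison to the last rotation, here is a small sketch. The helper names are hypothetical, and it assumes each alert time is already expressed in the on-call person's local time zone.

```ruby
LATE_NIGHT_START_HOUR = 22 # 10p
LATE_NIGHT_END_HOUR   = 8  # 8a

# True when the alert fell in the 10p-to-8a window, which wraps midnight.
def late_night?(time)
  time.hour >= LATE_NIGHT_START_HOUR || time.hour < LATE_NIGHT_END_HOUR
end

# Percentage change versus the previous rotation, or nil when there is
# no previous volume to compare against.
def percent_change(current, previous)
  return nil if previous.zero?

  ((current - previous) * 100.0 / previous).round
end

alerts = [Time.utc(2011, 3, 7, 23, 15), Time.utc(2011, 3, 7, 14, 0), Time.utc(2011, 3, 8, 3, 30)]
puts "Late-night alerts: #{alerts.count { |t| late_night?(t) }}"
```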
  2. Make note of unresolved incidents.

    precipice committed Mar 7, 2011
    Might make sense to add a "handoff" section later -- here are links to unresolved incidents.
  3. Only count resolvers for incidents that are resolved.

    precipice committed Mar 7, 2011
    Could also take the approach of only asking for resolved incidents in the API call, but it seems better to report the full alert load and instead add a count of how many are resolved.
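One way to express this approach: keep every incident in the total, but tally resolvers only for incidents that are resolved. The `:status` and `:resolved_by` keys and the `resolver_counts` helper are assumptions for illustration, not the script's real data model.

```ruby
# Summarize the full alert load while crediting resolvers only for
# incidents that were actually resolved.
def resolver_counts(incidents)
  resolved = incidents.select { |i| i[:status] == 'resolved' }

  by_resolver = Hash.new(0)
  resolved.each { |i| by_resolver[i[:resolved_by]] += 1 }

  { total: incidents.size, resolved: resolved.size, by_resolver: by_resolver }
end

incidents = [
  { status: 'resolved',     resolved_by: 'alice' },
  { status: 'resolved',     resolved_by: 'bob' },
  { status: 'acknowledged', resolved_by: nil },
  { status: 'resolved',     resolved_by: 'alice' }
]
p resolver_counts(incidents)
```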