Make PagerDuty output idempotent; adds support for responder requests #935

Merged: 15 commits into master from dw--pd-idempotency, May 7, 2019

@Ryxias Ryxias commented May 3, 2019

to: @ryandeivert or @chunyong-lin
cc: @airbnb/streamalert-maintainers

Background

The pagerduty-incident output used to have a serious bug: because it sequenced multiple write calls non-atomically, a PagerDuty alert could be sent and the output could still fail afterwards. The failed output would then retry, create another PagerDuty alert, and fail again, repeating ad infinitum.

Changes

This PR introduces three changes:

Reduced number of write calls

PagerDuty's code used to do the following:

  • POST /incidents (Create an incident container)
  • POST /events/enqueue (Create an alert)
  • GET /incidents?dedup_key=??? (Find the alert-incident)
  • PUT /incidents/#/merge?incident_id=??? (Merge the alert-incident into the incident container)

I found that this was redundant; the workflow is optimized now:

  • POST /events/enqueue (Create an alert)
  • GET /incidents?dedup_key=??? (Find the alert's incident)
  • PUT /incidents (Modify the alert's incident)

Fewer API calls means the code is cleaner and easier to understand. The new workflow also makes only one POST call instead of two, which reduces the number of non-idempotent write API calls.
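
The reduced workflow can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `FakePagerDuty`, `send_incident`, and their method names are hypothetical stand-ins for the three calls listed above, and an in-memory object takes the place of real HTTP requests.

```python
class FakePagerDuty:
    """In-memory stand-in for the PagerDuty API (hypothetical, illustration only)."""

    def __init__(self):
        self.calls = []      # ordered log of (method, path) pairs
        self.incidents = {}  # dedup_key -> incident dict

    def post_event(self, dedup_key, summary):
        # Stands in for: POST /events/enqueue (create an alert)
        self.calls.append(('POST', '/events/enqueue'))
        new_id = 'INC-{}'.format(len(self.incidents) + 1)
        self.incidents.setdefault(dedup_key, {'id': new_id, 'summary': summary})

    def find_incident(self, dedup_key):
        # Stands in for: GET /incidents?dedup_key=??? (find the alert's incident)
        self.calls.append(('GET', '/incidents'))
        return self.incidents.get(dedup_key)

    def update_incident(self, incident_id, **fields):
        # Stands in for: PUT /incidents (modify the alert's incident)
        self.calls.append(('PUT', '/incidents'))
        for incident in self.incidents.values():
            if incident['id'] == incident_id:
                incident.update(fields)
                return incident
        return None


def send_incident(client, alert_id, summary, **fields):
    """Run the three-step workflow: one POST, one GET, one PUT."""
    client.post_event(alert_id, summary)
    incident = client.find_incident(alert_id)
    if incident is None:
        return None
    return client.update_incident(incident['id'], **fields)
```

With this shape, the only call that creates anything is the single POST; the GET is read-only and the PUT modifies an incident that already exists.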

Idempotency

PagerDuty's pagerduty-incident output is now idempotent. The POST /events/enqueue API call now leverages a dedup_key that is equal to the Alert ID.

The usage of a dedup_key ensures that any unique StreamAlert alert will only ever create a single PagerDuty alert. Subsequent retries will cause PagerDuty to return the previously created alert.
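
To illustrate why the dedup_key makes retries safe, here is a small simulation. It assumes a service that keeps at most one alert per dedup_key, as described above; `DedupingEventStore` and `dispatch_with_retries` are hypothetical names for this sketch and are not part of StreamAlert or PagerDuty.

```python
class DedupingEventStore:
    """Hypothetical model of an event endpoint that deduplicates by key."""

    def __init__(self):
        self._alerts = {}  # dedup_key -> alert dict

    def enqueue(self, dedup_key, summary):
        # Create the alert only the first time this key is seen;
        # later calls return the previously created alert unchanged.
        if dedup_key not in self._alerts:
            self._alerts[dedup_key] = {'dedup_key': dedup_key, 'summary': summary}
        return self._alerts[dedup_key]

    def __len__(self):
        return len(self._alerts)


def dispatch_with_retries(store, alert_id, summary, attempts=3):
    """Simulate an output that is retried several times for the same alert."""
    result = None
    for _ in range(attempts):
        # The dedup_key is the alert ID, so every retry targets the same alert.
        result = store.enqueue(dedup_key=alert_id, summary=summary)
        # Imagine a later step failing here, which forces the next retry.
    return result
```

However many times the same alert ID is retried, the store ends up holding one alert for it; a different alert ID produces a second, separate alert.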

Responder Requests

I added support for a PagerDuty feature called Responder Requests. This is a neat feature that lets you invite people other than the assignee to join the PagerDuty incident.

It shows up like this on the PD UI.

[image: responder requests as shown in the PagerDuty UI]

All members of the response team get all notifications (push notification, SMS, etc.) related to the Alert except escalation notices, which stay with the assignee.

The awesome benefit here is that we can have PagerDuty alerts stick to a single escalation policy, but attach more watchers.

You can add responder requests to an alert using context:

context={
  'responders': ['derek1@email.com', 'derek2@email.com', ...],
  'responder_message': 'This message shows up on their request',
}
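
A sketch of how such a context might be read, assuming the two keys shown above. The list normalization mirrors the `isinstance` check quoted in the review further down; the function name `parse_responders` and the empty-string default for the message are assumptions made for this example only.

```python
def parse_responders(context):
    """Return (responders, message) extracted from a rule context dict.

    Responders are optional: a missing or empty value yields an empty list,
    and a single address given as a string is wrapped in a list.
    """
    responders = context.get('responders')
    if responders and not isinstance(responders, list):
        responders = [responders]

    # Assumption: fall back to an empty message when none is provided.
    message = context.get('responder_message', '')
    return responders or [], message
```

For example, `parse_responders({'responders': 'derek1@email.com'})` gives a one-element list and an empty message, while an empty context gives an empty list, so alerts without responders behave as before.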

Testing

CI, Plus I tried
it on Stage A lot. I think
it bothered people

coveralls commented May 3, 2019

Coverage Status

Coverage increased (+0.2%) to 96.835% when pulling 55c0712 on dw--pd-idempotency into 4baf713 on master.

errors.append(error)

# Add a note to the incident
note = self._add_incident_note(incident_id, publication, rule_context)
Contributor

where do we add this note to?

Contributor Author

This note is a small string of text that is added to the PagerDuty incident. It shows up in the PagerDuty UI.

if responders and not isinstance(responders, list):
    responders = [responders]

if responders:
Contributor

are responders optional? meaning should we return False if not responder or is that okay?

Contributor Author

Responders are optional; it's a new feature, so I need to maintain backward compatibility.

@ryandeivert ryandeivert left a comment

great work on this and thanks a ton for the thorough documentation on the changes

@Ryxias Ryxias merged commit f1cf85c into master May 7, 2019
@Ryxias Ryxias deleted the dw--pd-idempotency branch May 7, 2019 17:23
@ryandeivert ryandeivert added this to the 2.2.1 milestone May 8, 2019
@ryandeivert ryandeivert mentioned this pull request Aug 19, 2019