chore: add incident response checklist
afeld committed Feb 28, 2023
1 parent 726ad18 commit a444f4e
Showing 1 changed file with 54 additions and 0 deletions.
54 changes: 54 additions & 0 deletions .github/ISSUE_TEMPLATE/incident-response.md
@@ -0,0 +1,54 @@
---
name: Incident response
about: Track the steps to take during an incident, such as a data breach or unexpected downtime, even if it is only suspected.
---

_**Do not put sensitive information in this issue or in Slack.** Use an access-restricted file on Google Drive instead._

**Severity:** _High/Medium/Low_

## Initiate

1. [x] Create an issue from this template.
1. [ ] Declare an incident in the relevant Slack channel, such as [#benefits-general][benefits-general].
    - Include brief details about the concern.
1. [ ] Start a video call.
1. [ ] Share a link to the video call in Slack, asking relevant parties to join.
1. [ ] Delegate subsequent tasks.

## Assess

- [ ] Determine the impact.
- [ ] Assign the severity above:
    - **High:** Possible or confirmed breach of sensitive information, such as production system credentials or personally identifiable information (PII)
    - **Medium:** Full Benefits downtime for more than 30 minutes
    - **Low:** Partial service degradation
- [ ] For Medium/High incidents, notify [#benefits-general][benefits-general].

## Remediate

- [ ] Take notes in the Slack thread.
- [ ] Check the [troubleshooting documentation](https://docs.calitp.org/benefits/deployment/troubleshooting/) for relevant information.
- [ ] Post in the Slack thread when the incident has been resolved.
- [ ] Retain any relevant materials.

### Medium/High incidents

- [ ] Provide updates to [#benefits-general][benefits-general] every 30 minutes.
- [ ] [Release hotfixes](https://docs.calitp.org/benefits/deployment/release/) as necessary.
- [ ] Notify [#benefits-general][benefits-general] when the incident has been resolved.

## Follow-up

- [ ] For Medium/High incidents, write an incident report. [Past examples.](https://drive.google.com/drive/search?q=parent:1f_UhA3958lrRQ7IVf0mGSpt7A9rSoUQm%20title:incident)
    1. [ ] Write a draft.
        - Link to relevant Slack messages, etc.
    1. [ ] Get thumbs-up from those involved in the incident.
    1. [ ] Share with relevant stakeholders.
- [ ] Create issues for follow-up tasks, such as:
    - Adding monitoring
    - Updating documentation
    - Having the system fail more gracefully
    - Scheduling a retrospective/post-mortem

[benefits-general]: https://cal-itp.slack.com/archives/c013w8ruamu
