Runbook Update #353

Closed
6 tasks
lexicondevil opened this issue Jun 5, 2023 · 1 comment

Comments

@lexicondevil
Contributor

Note: This issue is part of the Service Transfer Project. The goal is to ensure project documentation is up to date and to help the receiving team understand what the service does and how to maintain and operate it. The previous team is primarily responsible for doing this work; the receiving team is the stakeholder on this issue and has final approval.

These are a set of guidelines, not a rigid set of requirements. If the receiving team already has expertise on this service and is comfortable operating it, they may complete whatever subset of the tasks they find appropriate and close this issue.

The assignees on this issue are intended to be "manager of previous team" and "manager of new team", based on what's in the Service Ownership Spreadsheet. If these are incorrect, please update the assignees on this issue and update the spreadsheet to match.

Runbook Update

Make sure there’s an operations runbook for the service that meets these criteria:

  • Service runbooks should be created under the “Specific Services” section of the Infra Runbooks page in Notion. (You don't need to move existing runbooks here, but if you're creating a new runbook this would be a good place to put it.)
  • Link to the runbook from the first section of the service README
  • Links to dashboards
  • Instructions on how to deploy, how to hotfix, how to roll back a deploy
  • Descriptions of common issues and how to triage whether each one is occurring
  • Some monitors may need a specific response from the on-call engineer. In that case there should be a dedicated section in the runbook for that process, and the monitor should link to that section.

Link to the service's runbook here so the receiving team can review.

Examples:
Envelope runbook
Functions Origin Runbook

Further reading:
https://www.pagerduty.com/resources/learn/what-is-a-runbook/
https://www.transposit.com/devops-blog/itsm/what-makes-a-good-runbook/

@rybit
Member

rybit commented Jun 5, 2023

This is a public repo. I'm going to close this for now.

@rybit rybit closed this as completed Jun 5, 2023

4 participants