StackPulse Playbooks

StackPulse automates and orchestrates incident response and management, empowering SREs and developers to reduce toil, fix issues faster and deliver reliable software services. This repository contains a set of ready-to-use resources, created by StackPulse together with our partners and the community of our users, that will help you get started with managing and improving the reliability of your services.

To learn more about StackPulse, please refer to our platform description or to our product documentation. For your convenience, the playbooks in this repository are arranged by use case.

Alert Enrichment and System Diagnostics

Playbooks in this section enrich, analyze and triage alerts in real time. They highlight the important data to be used for remediation by the on-call engineers. Using them in the incident response routine improves MTTR (Mean Time to Resolve) across all teams and personnel, and helps apply best diagnostics and troubleshooting practices to each system component regardless of the on-call engineer's expertise level.

Kubernetes Pod Restarting

This playbook helps investigate recurring pod restart events in a Kubernetes cluster. It gets the most recently started pods in the namespace provided either by the alert or by the user, then gets the current and, if they exist, previous logs of the relevant container.
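The selection step described above can be sketched in Python. This is a minimal illustration, not the playbook's actual implementation: it assumes pod objects shaped like the items returned by `kubectl get pods -o json`, and the function names `latest_started_pods` and `log_commands` are hypothetical. It only builds the `kubectl logs` command lines and does not run them.

```python
from datetime import datetime


def latest_started_pods(pods, limit=3):
    """Return the `limit` most recently started pods, newest first."""
    # Pods that were never scheduled have no status.startTime; skip them.
    started = [p for p in pods if p.get("status", {}).get("startTime")]
    return sorted(
        started,
        key=lambda p: datetime.strptime(
            p["status"]["startTime"], "%Y-%m-%dT%H:%M:%SZ"
        ),
        reverse=True,
    )[:limit]


def log_commands(namespace, pod):
    """Build kubectl commands for current and, after a restart, previous logs."""
    name = pod["metadata"]["name"]
    commands = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        base = ["kubectl", "logs", name, "-n", namespace, "-c", status["name"]]
        commands.append(base)
        # Previous logs only exist if the container restarted at least once.
        if status.get("restartCount", 0) > 0:
            commands.append(base + ["--previous"])
    return commands
```

Each returned command is a list of arguments, so it can be passed to `subprocess.run` without shell quoting.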

Kubernetes Job Failed

This playbook extracts logs from a failed Kubernetes job and optionally deletes the job.

Postgres Long Running Sessions

This playbook collects all non-idle long-running sessions from a PostgreSQL instance and sends them to Slack recipients.
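As a rough illustration of what "non-idle long-running sessions" can mean, the sketch below pairs a query against PostgreSQL's built-in `pg_stat_activity` view with an equivalent pure-Python filter. The five-minute threshold, the `LONG_RUNNING_SQL` constant, and the `long_running` function are assumptions made for this example; the playbook's real query and threshold may differ.

```python
from datetime import datetime, timedelta

# Assumed threshold of 5 minutes; adjust to your own definition of "long".
LONG_RUNNING_SQL = """
SELECT pid, usename, state, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND now() - query_start > interval '5 minutes'
ORDER BY duration DESC;
"""


def long_running(sessions, now, threshold=timedelta(minutes=5)):
    """Filter session rows the same way the SQL above does.

    Each session is a dict with 'pid', 'state' and 'query_start' keys.
    Rows with no state (background processes) are skipped, matching the
    SQL comparison, which is never true for NULL.
    """
    matches = [
        s for s in sessions
        if s.get("state") not in (None, "idle")
        and now - s["query_start"] > threshold
    ]
    # Oldest query_start first, i.e. longest-running session first.
    return sorted(matches, key=lambda s: s["query_start"])
```

Note that sessions in the `idle in transaction` state count as non-idle here, which is usually what you want when hunting for sessions that hold locks.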

RabbitMQ Queues Overview

This playbook collects an overview of a RabbitMQ instance, ranks its top-consuming queues by messages, unacknowledged messages, message bytes and memory, and sends the result to Slack recipients.
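The ranking described above might look like the following sketch. It assumes queue objects shaped like those returned by the RabbitMQ management HTTP API (`GET /api/queues`), which reports `messages`, `messages_unacknowledged`, `message_bytes` and `memory` per queue. The helper names `top_queues` and `overview` are hypothetical, and whether the playbook uses this API is an assumption.

```python
# Metric names as reported per queue by the management API (assumed source).
METRICS = ("messages", "messages_unacknowledged", "message_bytes", "memory")


def top_queues(queues, metric, limit=5):
    """Return (queue name, value) pairs for the top `limit` queues by `metric`."""
    if metric not in METRICS:
        raise ValueError(f"unsupported metric: {metric}")
    # Queues that do not report the metric are treated as zero.
    ranked = sorted(queues, key=lambda q: q.get(metric, 0), reverse=True)
    return [(q["name"], q.get(metric, 0)) for q in ranked[:limit]]


def overview(queues, limit=5):
    """Build one ranking per metric, keyed by metric name."""
    return {metric: top_queues(queues, metric, limit) for metric in METRICS}
```

The resulting dictionary has one short ranking per metric, which is easy to render as a Slack message.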

Elasticsearch Get Stats

This playbook collects info, stats and metrics from an Elasticsearch cluster and sends them to Slack.

Redis Get Big Keys

This playbook queries a Redis host and retrieves the current big keys. It then sends that output to Slack recipients of your choice as a snippet.
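One way to approximate a "big keys" report is sketched below. It is not necessarily how the playbook works: it ranks keys by memory usage, whereas `redis-cli --bigkeys` reports the largest key per data type by element count or string length. The function `biggest_keys` is hypothetical and expects any client object exposing `scan_iter()` and `memory_usage(key)`, as the redis-py client does.

```python
import heapq


def biggest_keys(client, limit=10):
    """Return the `limit` largest keys as (key, bytes) pairs, largest first.

    `client` is assumed to expose scan_iter() and memory_usage(key),
    for example a redis.Redis instance from the redis-py package.
    """
    # memory_usage may return None if the key expired between scan and lookup.
    sizes = ((client.memory_usage(key) or 0, key) for key in client.scan_iter())
    largest = heapq.nlargest(limit, sizes, key=lambda item: item[0])
    return [(key, size) for size, key in largest]
```

Because the keyspace is walked incrementally with `scan_iter` and only the current top `limit` entries are kept, the sketch avoids loading every key into memory at once, though it still issues one `MEMORY USAGE` call per key.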

Redis Diagnostics

This playbook collects Redis cluster diagnostics that focus on common causes of high memory consumption and performance issues. It then sends that output via Slack.

Linux Diagnostics

This playbook queries utilization of CPU, memory and storage for a given host and sends the output to Slack recipients of choice.
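For a sense of what such a utilization snapshot involves, here is a minimal standard-library sketch. It assumes a Linux host with `/proc/meminfo` available and reports load average rather than CPU percentage; the function names are hypothetical, and the playbook itself may gather these figures with different tools.

```python
import os
import shutil


def parse_meminfo(text):
    """Parse /proc/meminfo content into {field: value}, values in kB."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])
    return values


def memory_used_percent(meminfo):
    """Share of memory in use, based on MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    return round(100 * (total - meminfo["MemAvailable"]) / total, 1)


def disk_used_percent(path="/"):
    """Share of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return round(100 * usage.used / usage.total, 1)


def snapshot(path="/"):
    """Collect load average, memory and disk utilization for this host."""
    with open("/proc/meminfo") as handle:
        meminfo = parse_meminfo(handle.read())
    load_1m, load_5m, load_15m = os.getloadavg()
    return {
        "load_1m": load_1m,
        "memory_used_percent": memory_used_percent(meminfo),
        "disk_used_percent": disk_used_percent(path),
    }
```

Calling `snapshot()` on the target host returns a small dictionary that can be formatted into a message for the chosen Slack recipients.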

Incident Management and Orchestration

Playbooks in this section help automate incident management and communication flows across the organization or for specific teams and services.

Create Incident War-room (Slack)

This playbook creates a Slack Incident War Room and invites participants to it.
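When creating a war room channel programmatically (for example through Slack's `conversations.create` and `conversations.invite` Web API methods), the channel name must follow Slack's naming rules: lowercase letters, numbers, hyphens and underscores only, at most 80 characters. The sketch below derives such a name from incident details. The `inc-` prefix and the function `war_room_channel` are illustrative assumptions, not the naming scheme the playbook uses.

```python
import re


def war_room_channel(incident_id, title, prefix="inc"):
    """Build a Slack-compatible channel name from incident details."""
    raw = f"{prefix}-{incident_id}-{title}".lower()
    # Replace every run of disallowed characters with a single hyphen.
    slug = re.sub(r"[^a-z0-9_-]+", "-", raw)
    slug = re.sub(r"-{2,}", "-", slug).strip("-")
    # Slack limits channel names to 80 characters.
    return slug[:80].rstrip("-")
```

For instance, `war_room_channel(42, "API latency: p99 > 2s!")` yields `inc-42-api-latency-p99-2s`.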

Create Incident War-room (Slack, Zoom, PagerDuty)

This playbook creates an Incident War Room in Slack (and/or video conferencing software) and invites the relevant participants based on incident details (and/or on-call rotation schedules).

Create Incident War-room (Slack, Zoom, OpsGenie)

This playbook creates an Incident War Room in Slack (and/or video conferencing software) and invites the relevant participants based on incident details (and/or on-call rotation schedules).

Archive Incident War-room (Slack)

This playbook runs upon incident resolution and asks the incident commander whether to archive the Slack War Room that belonged to the incident.

